Enhancing Model Agnostic Meta-Learning via Gradient Similarity Loss

Authors:

Tak Jae-Ho 1, Hong Byung-Woo 1

Affiliation:

1. Department of Artificial Intelligence, Chung-Ang University, Seoul 156-756, Republic of Korea

Abstract

Artificial intelligence (AI) has advanced to the point of performing tasks once believed to be exclusive to skilled humans. Unlike humans, however, who can acquire a skill from relatively little data, AI models typically require large amounts of data to emulate human cognitive abilities in a specific domain. When adequate pre-training data is unavailable, meta-learning becomes a crucial method for improving generalization. The Model-Agnostic Meta-Learning (MAML) algorithm, which uses second-order derivative calculations to fine-tune the initial parameters toward a better starting point, plays a pivotal role in this area; however, its computational demand can be prohibitive for modern models with large numbers of parameters. In this context, the concept of the Approximate Hessian Effect is introduced to examine how second-order derivatives help identify initial parameters that generalize well. Within this framework, the study proposes a loss function based on cosine similarity and squared error (L2 loss) that modifies the gradient weights to obtain more generalizable model parameters. An algorithm relying only on first-order computations is then presented, designed to reach performance comparable to MAML. The approach was tested against conventional MAML on the MiniImagenet dataset and a modified MNIST dataset, and the results were analyzed to evaluate its efficiency. Compared with previous studies that achieved good performance using only first derivatives, this approach is more efficient because it does not require additional iterative loops for the auxiliary loss to converge, and there remains room for further improvement through hyperparameter tuning.
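The abstract describes the method only at a high level, so the following is a minimal sketch, not the authors' implementation, of how a first-order MAML-style update could be combined with a gradient-similarity term built from cosine similarity and squared error (L2): the query gradient is re-weighted by a similarity score computed between the support-set and query-set gradients. Names such as grad_similarity and lambda_sim, the toy linear model, and the exact way the score scales the update are illustrative assumptions rather than the paper's formulation.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy 5-way linear classifier on 10-dimensional inputs; the weight matrix is the
# only meta-parameter, so the forward pass can stay purely functional.
W = torch.randn(5, 10, requires_grad=True)
inner_lr, meta_lr, lambda_sim = 0.01, 0.001, 0.1  # lambda_sim is an assumed weighting factor


def forward(weights, x):
    return x @ weights.t()


def grad_similarity(g_a, g_b):
    # Scalar combining (1 - cosine similarity) with squared error between two gradients.
    a, b = g_a.reshape(-1), g_b.reshape(-1)
    return (1.0 - F.cosine_similarity(a, b, dim=0)) + F.mse_loss(a, b)


def meta_step(tasks):
    # One first-order meta-update over a batch of (support, query) tasks.
    meta_grad = torch.zeros_like(W)
    for (x_s, y_s), (x_q, y_q) in tasks:
        # Inner adaptation on the support set; the adapted weights are detached,
        # so no second-order terms are ever built.
        g_s = torch.autograd.grad(F.cross_entropy(forward(W, x_s), y_s), W)[0]
        W_fast = (W - inner_lr * g_s).detach().requires_grad_(True)

        # Query gradient at the adapted parameters.
        g_q = torch.autograd.grad(F.cross_entropy(forward(W_fast, x_q), y_q), W_fast)[0]

        # Re-weight the query gradient by the support/query gradient-similarity score
        # (one possible reading of "modifying gradient weights" in the abstract).
        sim = grad_similarity(g_s, g_q)
        meta_grad += (1.0 + lambda_sim * sim) * g_q

    with torch.no_grad():
        W -= meta_lr * meta_grad / len(tasks)


# Dummy tasks, just to exercise the sketch end to end.
def make_task(n=5):
    xs, ys = torch.randn(n, 10), torch.randint(0, 5, (n,))
    xq, yq = torch.randn(n, 10), torch.randint(0, 5, (n,))
    return (xs, ys), (xq, yq)


meta_step([make_task() for _ in range(4)])

Because the adapted weights are detached before the query pass, no Hessian-vector products are ever formed; the extra cost over plain first-order MAML is a single cosine-similarity and squared-error evaluation per task.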

Funder

Chung-Ang University Research Scholarship Grants in 2022

Korea Government

National Research Foundation of Korea

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

