Self-Attention based Prototype Enhancement Network with Prediction-Driven CutMix for Few-Shot Learning
- Alternative Title
- 퓨샷 러닝을 위한 예측 주도 컷믹스와 결합한 자기주의집중 기반 프로토타입 향상 네트워크 연구
- Abstract
- Few-shot learning is a sub-field of machine learning that explores the problem of recognizing new visual concepts by learning from only a few labeled samples. Among the methods for solving few-shot learning problems, the Prototypical Network (ProtoNet) is well known for its strong generalization and simplicity. Nevertheless, ProtoNet has been shown to be prone to learning redundant features under few-shot settings.
In this dissertation, to alleviate this issue of ProtoNet, a Self-Attention based Prototype Enhancement Network (SAPENet) is proposed, and a modified data augmentation method is employed to further increase the classification performance of SAPENet. More specifically, the dissertation makes the following contributions. First, an extensive and in-depth literature review is provided, illustrating the underlying reasons for the rise of few-shot learning. Second, SAPENet is proposed, the first framework that uses a multi-head attention mechanism to exploit intra-class feature information, with the aim of learning more representative prototypes in the few-shot setting. Third, an efficient data augmentation method, Prediction-Driven CutMix (PDCM), is developed, which alleviates the occurrence of noisy labels and generates more varied training samples to increase the classification accuracy of SAPENet. Finally, SAPENet combined with PDCM establishes new state-of-the-art performance on various few-shot learning benchmark datasets (e.g., miniImageNet and CUB-200-2011), demonstrating the effectiveness of the proposed method under few-shot settings.
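For readers less familiar with the ProtoNet baseline the abstract builds on, the sketch below illustrates its standard classification step (class prototypes as mean support embeddings, queries scored by Euclidean distance). It is a minimal illustration of the published baseline, not code from the dissertation; all names and shapes are assumptions.

```python
import torch

def prototypical_classify(support, support_labels, query, n_way):
    """Standard ProtoNet classification step (minimal sketch, not the
    dissertation's implementation).

    support:        (n_way * k_shot, d) embedded support features
    support_labels: (n_way * k_shot,)   integer labels in [0, n_way)
    query:          (n_query, d)        embedded query features
    Returns class probabilities of shape (n_query, n_way).
    """
    # One prototype per class: the mean of that class's support embeddings.
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)]
    )                                            # (n_way, d)
    # Queries are scored by negative squared Euclidean distance to prototypes.
    dists = torch.cdist(query, prototypes) ** 2  # (n_query, n_way)
    return (-dists).softmax(dim=-1)
```

SAPENet replaces the plain class-mean prototype with one refined by multi-head self-attention over intra-class support features, as detailed in Chapter 4.2.2.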
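PDCM is described in the abstract only at a high level, as a prediction-driven variant of CutMix. For context, the sketch below shows the vanilla CutMix operation that such a variant modifies; how predictions drive the choice of patches and labels is specific to Chapter 4.2.3 and is not reproduced here. The function name and Beta-sampling convention are assumptions.

```python
import torch

def cutmix(x, y, alpha=1.0):
    """Vanilla CutMix (Yun et al., 2019): paste a random box from a shuffled
    copy of the batch and weight the two labels by the pasted area.
    x: (B, C, H, W) images, y: (B,) integer labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(x.size(0))          # random pairing of source images
    H, W = x.shape[-2:]
    # Box whose area ratio is (1 - lam), centred at a random position.
    cut_h, cut_w = int(H * (1 - lam) ** 0.5), int(W * (1 - lam) ** 0.5)
    cy, cx = torch.randint(H, (1,)).item(), torch.randint(W, (1,)).item()
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, H)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, W)
    x_mixed = x.clone()
    x_mixed[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (H * W)  # re-weight for the clipped box
    return x_mixed, y, y[index], lam           # loss = lam*CE(p, y) + (1-lam)*CE(p, y[index])
```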
- Author(s)
- HUANG XILANG
- Issued Date
- 2022
- Awarded Date
- 2022. 8
- Type
- Dissertation
- Keyword
- Few-shot learning; image classification; multi-head self-attention mechanism; data augmentation
- Publisher
- 부경대학교
- URI
- https://repository.pknu.ac.kr:8443/handle/2021.oak/32716
http://pknu.dcollection.net/common/orgView/200000635847
- Alternative Author(s)
- 황시랑
- Affiliation
- 부경대학교 대학원
- Department
- 대학원 인공지능융합학과
- Advisor
- 최필주
- Table Of Contents
- Chapter 1. Introduction
1.1. Limitations of deep learning
1.2. Learning from limited labeled samples
1.3. Motivation
1.4. Contributions
1.5. Outline of the dissertation
Chapter 2. Background knowledge
2.1. Convolutional neural network
2.2. Multi-head self-attention
2.3. Few-shot learning
2.4. Transfer learning
Chapter 3. Related work
3.1. Early work on few-shot learning
3.2. Meta-learning for few-shot learning
3.2.1. Optimization-based methods
3.2.2. Model-based methods
3.2.3. Data generation-based methods
3.2.4. Metric-based methods
Chapter 4. Methodology
4.1. Prototypical Network (ProtoNet)
4.2. Proposed method
4.2.1. Overview
4.2.2. Self-Attention based Prototype Enhancement Network (SAPENet)
4.2.2.1. Self-attention block
4.2.2.2. Intra-class attention block
4.2.2.3. Metric module
4.2.3. Prediction-Driven CutMix (PDCM)
Chapter 5. Experiments
5.1. Datasets
5.2. Implementation details
5.3. Few-shot classification results of SAPENet
5.3.1. Few-shot classification on miniImageNet
5.3.2. Few-shot classification on tieredImageNet
5.3.3. Few-shot classification on CUB-200-2011
5.3.4. Few-shot classification on CIFAR-FS
5.4. Ablation experiments of SAPENet
5.4.1. Key component analysis
5.4.2. Effect of attention head and scaling factor
5.4.3. Analysis of metric module
5.4.4. Analysis of different numbers of training shots
5.4.5. Visualization of activation maps
5.5. Combination of SAPENet and PDCM
5.5.1. Hyperparameter settings
5.5.2. Classification results on benchmark datasets
Chapter 6. Conclusions and future perspectives
6.1. Conclusions
6.2. Future perspectives
References
Publications
1. Journal papers
2. Conference papers
Acknowledgement
- Degree
- Doctor
- Appears in Collections:
- 대학원 > 인공지능융합학과