Development of an ASR System for Robust Speech Recognition in High-Noise Environments (높은 소음 환경에서 강건한 음성 인식을 위한 ASR 시스템 개발)
- Alternative Title
- Enhancing Robust Speech Recognition in High-Noise Environments with ASR System Development
- Abstract
- Speech recognition is the task of taking speech as input and producing text as output. It is already used in real-world products such as AI speakers and voice memos, and there is still room for improvement, so many deep-learning-based studies are under way. However, these models are typically trained on relatively quiet recordings rather than on real-world conditions with strong ambient noise, which leads to poor recognition rates in noisy environments. We improved the performance of the Whisper model by using a pre-trained model that removes noise through a compress-and-restore process. Training on noisy speech combined with its denoised counterpart performed better than training on noisy speech alone. This result indicates that the proposed method is effective for DNN-based speech models, and we propose a framework for building noise-robust speech recognition models.
- Author(s)
- 김민성
- Issued Date
- 2024
- Awarded Date
- 2024-02
- Type
- Dissertation
- Keyword
- ASR
- Publisher
- Graduate School, Pukyong National University
- URI
- https://repository.pknu.ac.kr:8443/handle/2021.oak/33864
http://pknu.dcollection.net/common/orgView/200000743998
- Alternative Author(s)
- Minsung Kim
- Affiliation
- Graduate School, Pukyong National University
- Department
- Department of Industrial and Data Engineering
- Advisor
- 최성철
- Table Of Contents
- Ⅰ. Introduction
1. Research Background
2. Research Objectives and Scope
Ⅱ. Related Work
1. Speech Enhancement
1.1 Denoiser
1.2 Observation Adding
2. Speech Recognition
2.1 Whisper
Ⅲ. Noise Removal and Fine-tuning Process
1. Overview of the Research Method
2. Data Description
2.1 Extreme-Noise Speech Recognition Data
2.2 Data Preprocessing
3. Training Details
Ⅳ. Experimental Results
4.1 Evaluation Metrics: WER, CER
4.2 Zero-Shot
4.3 Fine-tuning on Original Dataset
4.4 Fine-tuning on Denoised Dataset
4.5 Fine-tuning on Mixed Dataset
Ⅴ. Conclusion
Ⅵ. Appendix
References
- Degree
- Master
- Appears in Collections
- Graduate School > Department of Industrial and Data Engineering
- Authorize & License
- Authorize: Public
- Embargo: 2024-02-16