PUKYONG

Estimation of Chlorophyll-a Concentration in the Nakdong River Using Sentinel-2 MSI and Algal Bloom Influence Factors Data Fusion

Alternative Title
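Estimating Chlorophyll-a Concentration in the Nakdong River Using a Fusion of Sentinel-2 MSI and Algal Bloom Influence Factors: Comparison and Evaluation of Machine Learning Model Performance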
Abstract
Global warming is increasing the frequency and extent of algal blooms worldwide. In Korea, the Nakdong River suffers severe algal blooms every year. Because algal blooms cause ecological, economic, and aesthetic damage, periodic monitoring is essential for preemptive management and rapid response. Chlorophyll-a (chl-a) concentration is used as an indicator of algal bloom occurrence, and satellite data enable the detection of algal blooms over extensive areas. Many recent studies have used machine learning techniques to estimate chl-a concentrations more accurately. Various factors affect algal bloom occurrence, and identifying its causes remains challenging. It is therefore essential to use diverse input data so that these factors are considered comprehensively. This study fused Sentinel-2 satellite data with water quality, meteorological, and hydrological data to estimate chl-a concentrations at eight weirs along the Nakdong River over the last five years. AutoML selected six models (CatBoost, Extra Trees, Gradient Boosting, LightGBM, Random Forest, and XGBoost), and the SHAP method identified a total of 27 fused input variables. CatBoost (R² = 0.862, RMSE = 5.560 mg/m³, MAE = 4.120 mg/m³) performed best, and all six models achieved strong results with R² above 0.8. SHAP was used to analyze feature importance, and Suspended Solids (SS) emerged as the most important factor in all six models. Although the ranking of variable importance varied by model, water quality variables such as Dissolved Oxygen (DO), pH, Total Organic Carbon (TOC), and Biochemical Oxygen Demand (BOD), together with satellite variables based on band combinations of the red-edge and red bands, were common top-ranking variables.
The feasibility of chl-a estimation was assessed by mapping the spatial patterns of the chl-a values estimated with CatBoost. This study confirmed the applicability of fused data for estimating chl-a concentrations, and the approach is expected to be used for national and global chl-a monitoring in the future.
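The three accuracy measures quoted in the abstract (R², RMSE, MAE) can be illustrated with a minimal sketch. It assumes that R² denotes the usual coefficient of determination and that observed and estimated chl-a values are paired per sample. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the dissertation.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values.

    Assumption: R2 is the coefficient of determination,
    1 - SS_res / SS_tot, computed against the mean of the observations.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")

    rmse = math.sqrt(ss_res / n)            # root mean square error
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    r2 = 1.0 - ss_res / ss_tot
    return r2, rmse, mae


# Hypothetical chl-a values in mg/m3, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  MAE={mae:.3f}")
```

RMSE and MAE carry the unit of the target (mg/m³ here), while R² is unitless, which is why the abstract reports units only for the first two.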
Author(s)
박소련
Issued Date
2024
Awarded Date
2024-02
Type
Dissertation
Keyword
algal bloom, chlorophyll-a, Sentinel-2, machine learning
Publisher
Pukyong National University Graduate School
URI
https://repository.pknu.ac.kr:8443/handle/2021.oak/33588
http://pknu.dcollection.net/common/orgView/200000743390
Alternative Author(s)
So Ryeon Park
Affiliation
Pukyong National University Graduate School
Department
Graduate School, Division of Earth Environmental System Science, Major of Spatial Information System Engineering
Advisor
김진수
Table Of Contents
1. Introduction 1
1.1. Background 1
1.2. Literature review 4
1.3. Objectives 10
2. Study Area and Data 12
2.1. Study Area 12
2.2. Data 15
2.2.1. Satellite data 15
2.2.2. Water quality data 19
2.2.3. Meteorology data 19
2.2.4. Hydrology data 20
3. Methodology 22
3.1. AutoML 22
3.2. Machine Learning Method 23
3.2.1. Bagging algorithm 24
3.2.2. Boosting algorithm 25
3.3. Model Accuracy Assessment 28
3.4. SHAP 29
3.5. Model train and test 31
4. Results and Discussion 32
4.1. Model performance 32
4.2. Model interpretation with SHAP 40
4.3. Spatial distribution map of Chl-a 53
5. Conclusions 56
6. References 58
Degree
Master
Appears in Collections:
Graduate School > Division of Earth Environmental System Science - Major of Spatial Information System Engineering
Authorize & License
  • Authorize: Public
  • Embargo: 2024-02-16
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.