Development of the sensitive stress and/or disease markers in rainbow trout (Oncorhynchus mykiss) based on multi-omics analyses
- Alternative Title
- A study on the development of highly sensitive biomarkers for diagnosing stress and disease in rainbow trout based on multi-omics
- Abstract
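- Over the past decade the aquaculture industry in Korea has grown rapidly, but technology for diagnosing and controlling disease remains insufficient and causes heavy economic losses. Responding quickly to this problem requires, above all, a system for the early diagnosis of fish health. The aims of this study were therefore 1) to identify useful biomarkers by analyzing the genetic and serological factors that change in rainbow trout under stressful environments or disease, 2) to set normal and abnormal criteria for the selected biomarkers through machine learning and big-data analysis, and 3) to develop a method for evaluating fish health using the finally selected biomarkers. To select candidate biomarkers, multi-omics analyses were performed on rainbow trout exposed to high water temperature stress or infected with Ichthyophthirius multifiliis, both of which are frequent problems in the field. Changes in the candidate biomarkers were also analyzed in fish exposed to hypoxia or overfeeding stress, or infected with A. hydrophila. As a result, hemopexin (Wap65-1 and Wap65-2) and HSP70 were selected as biomarkers that can be used to judge the health of rainbow trout. The 23 members of the HSP70 family in rainbow trout could be divided into seven groups by phylogenetic classification, and among these 23 genes HSP70a was confirmed to distinguish normal from abnormal fish best and was finally selected. Accordingly, a multiplex real-time PCR method that detects the gene expression of Wap65-1, Wap65-2, and HSP70a simultaneously was developed.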
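Reference ranges were then estimated from the values of the genetic factors (Wap65-1, Wap65-2, HSP70a) and widely used serological factors (GOT, GPT, ALP, GLU, TCHO, TP, LDH, Ca) in rainbow trout that were infected with pathogens, showed clinical or pathological abnormalities, or were healthy, using data from laboratory experiments and four years of field sampling. To this end, multiple linear regression was first used to test whether each biomarker was associated with water temperature and body weight, and where the effect was significant a correction formula was applied using the estimated coefficients of the explanatory variables. In addition, ROC curves were drawn for normal and abnormal individuals in the range where each biomarker showed significant expression, and the value that maximized the sum of sensitivity and specificity was taken as the breakpoint defining the normal range. Whether combining several biomarkers could give higher accuracy was further analyzed and validated using several regression analyses and machine learning techniques (decision tree, random forest, boosting, artificial neural network). As a result, the estimated normal ranges of the serological factors and the new biomarkers were confirmed to give high accuracy for certain types of abnormal fish, and in general the new biomarkers (Wap65-1, Wap65-2, HSP70a) showed higher accuracy than the serological factors for infectious and stress-related disease. Combining the new biomarkers with the random forest method was also confirmed to be an appropriate way to maximize diagnostic accuracy.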
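In this study, highly sensitive biomarkers for evaluating the health of rainbow trout were derived through multi-omics analysis, and a diagnostic method that detects several biomarkers simultaneously was developed. Big-data analysis was also used to derive normal ranges for several serological and genetic biomarkers and a method for maximizing accuracy. These detection techniques and criteria are expected to be of great help in the early diagnosis of rainbow trout health, and the approach is forward-looking in that further training on accumulated data can yield higher accuracy.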
Although the aquaculture industry in Korea has grown dramatically over recent decades, insufficient technology for disease diagnosis and control still causes tremendous economic losses. The aims of this study were therefore 1) to identify promising biomarkers by analyzing changes in serological factors and gene expression in rainbow trout under environmental stress or infected with microbial pathogens, 2) to determine cut-off values that separate normal from abnormal fish for the selected parameters (that is, to obtain reference ranges) based on machine learning and big-data analyses, and 3) to develop diagnostic methods for evaluating fish health. Candidate biomarkers were obtained from multi-omics data on fish under thermal stress or Ichthyophthirius multifiliis infection, both of which are major problems in trout farms. The expression levels of candidate genes and serological factors were then monitored in fish exposed to hypoxic or overfeeding stress, or infected with A. hydrophila. Hemopexin, present as two paralogs (Wap65-1 and Wap65-2), and heat shock protein 70 (HSP70 family) were selected for further analysis. Twenty-three HSP70-related genes were classified into seven groups, and HSP70a was finally selected because it separated normal fish from abnormal ones well. A multiplex real-time PCR method for quantifying Wap65-1, Wap65-2, HSP70a, and Ef-1α mRNA was then developed and evaluated using samples obtained from fish farms.
In this study, reference ranges were determined using the values of 8 serological factors (GOT, GPT, ALP, GLU, TCHO, TP, LDH, and Ca) and the expression levels of three genes (Wap65-1, Wap65-2, and HSP70a) obtained from lab-scale experiments and from the field over 4 years. The association of water temperature and body weight with each biomarker was analyzed using multiple linear regression (MLR), and where an association was found an adjustment formula was applied using the estimated coefficients of the explanatory variable(s). The reference ranges were then calculated from the breakpoint of the ROC curve between normal and abnormal groups, that is, the value that maximized the sum of sensitivity and specificity. Moreover, the optimal combination of multiple biomarkers for increasing diagnostic accuracy was estimated using regression analyses and machine learning techniques (decision tree, random forest, boosting, and artificial neural network). As a result, the estimated reference ranges of the serological factors and the developed biomarkers (Wap65-1, Wap65-2, and HSP70a) had high accuracy for certain types of abnormal trout, and the expression of Wap65-1, Wap65-2, and HSP70a generally had higher diagnostic accuracy than the serological factors for diseased and stressed trout. In addition, combining Wap65-1, Wap65-2, and HSP70a using random forests was the most suitable method for maximizing diagnostic accuracy in this study.
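The breakpoint rule described above (choose the cut-off that maximizes sensitivity plus specificity, equivalent to maximizing Youden's J = sensitivity + specificity - 1) can be sketched as follows. This is a minimal illustration only, not the dissertation's code: the function name `youden_cutoff`, the assumption that higher values indicate abnormal fish, and the sample values are all hypothetical.

```python
from typing import List, Tuple


def youden_cutoff(values: List[float], is_abnormal: List[bool]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    Assumption: a fish is called abnormal when its value is >= cutoff.
    For a marker that decreases in abnormal fish, negate the values first.
    """
    n_pos = sum(is_abnormal)
    n_neg = len(is_abnormal) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one normal and one abnormal sample")

    best_score, best_cutoff, best_sens, best_spec = float("-inf"), 0.0, 0.0, 0.0
    # Every observed value is a candidate cut-off (the points of the ROC curve).
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, a in zip(values, is_abnormal) if a and v >= cutoff)
        tn = sum(1 for v, a in zip(values, is_abnormal) if not a and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        if sens + spec > best_score:  # ties keep the lowest cut-off
            best_score, best_cutoff, best_sens, best_spec = sens + spec, cutoff, sens, spec
    return best_cutoff, best_sens, best_spec


# Made-up relative expression values, for illustration only.
expression = [0.8, 1.0, 1.1, 1.3, 1.4, 1.2, 2.4, 2.9, 3.5, 4.0]
abnormal = [False] * 5 + [True] * 5

cutoff, sens, spec = youden_cutoff(expression, abnormal)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With real data, a library routine such as scikit-learn's `roc_curve` would normally replace the explicit loop; the loop is kept here so the sketch has no dependencies.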
In conclusion, this study identified highly sensitive biomarkers using multi-omics approaches and developed a quad-color multiplex real-time PCR method to quantify the new biomarkers simultaneously. In addition, reference ranges for serological factors and genetic factors (Wap65-1, Wap65-2, and HSP70a), as well as an appropriate combination of multiple biomarkers, were derived from big-data analyses. These diagnostic methods and reference ranges are expected to allow prompt and accurate evaluation of trout health, and their biggest advantage is that diagnostic accuracy should improve as more data are accumulated.
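The abstract states that biomarker values were corrected for water temperature and body weight using coefficients estimated by multiple linear regression, but this record does not give the formula. One common form of such a correction is sketched below purely as an assumption; the symbols y_i, T_i, W_i and the reference values T_ref and W_ref are hypothetical and do not come from the dissertation.

```latex
% Assumed regression model for biomarker value y_i of fish i,
% with water temperature T_i and body weight W_i as explanatory variables:
y_i = \beta_0 + \beta_T T_i + \beta_W W_i + \varepsilon_i

% Assumed correction: remove the estimated covariate effects relative to
% chosen reference conditions (T_ref, W_ref); a term is dropped when its
% coefficient is not significant.
y_i^{\mathrm{adj}} = y_i - \hat{\beta}_T \,(T_i - T_{\mathrm{ref}}) - \hat{\beta}_W \,(W_i - W_{\mathrm{ref}})
```

Under this reading, the ROC breakpoint would be computed on the adjusted values whenever a covariate effect was significant.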
- Author(s)
- 노형진
- Issued Date
- 2020
- Awarded Date
- 2020. 2
- Type
- Dissertation
- Publisher
- Pukyong National University
- URI
- https://repository.pknu.ac.kr:8443/handle/2021.oak/23705
http://pknu.dcollection.net/common/orgView/200000294589
- Affiliation
- Graduate School, Pukyong National University
- Department
- Graduate School, Department of Aquatic Life Medicine
- Advisor
- 김도형
- Table Of Contents
- General Introduction
Chapter I. Multi-omics (Transcriptomics and Proteomics) analysis under the acute thermal stress
1.1. Introduction
1.2. Materials and methods
1.2.1. Acute thermal stress
1.2.2. RNA extraction and RNA-seq analysis
1.2.3. Genome guided assembly and annotation
1.2.4. Selecting differentially expressed gene and functional annotation
1.2.5. Validation of RNAseq results using qPCR
1.2.6. Analyzing biochemical factors in plasma
1.2.7. Two-dimensional electrophoresis (2-DE)
1.2.7.1. Trichloroacetic acid/ Acetone preparation
1.2.7.2. Rehydration
1.2.7.3. Equilibration and Running two-dimensional gel electrophoresis
1.2.7.4. Gel staining and image scanning
1.2.8. Identifying the differential expressed proteins (DEP) by LC-MS/MS
1.2.9. Database Searching
1.3. Results
1.3.1. Genome-guided transcriptome assembly
1.3.2. Gene-set enrichment analysis (GSEA)
1.3.3. Differential expressed genes (DEGs) analysis and RNA-seq validation
1.3.4. Serological analysis
1.3.5. Analysis of plasma proteome and differently expressed proteins
1.4. Discussion
1.4.1. Transcriptomic analysis in head-kidney
1.4.2. Multi-omics approach through data integration of transcriptomics and proteomics
1.4.3. Glucose, lipid metabolisms and activating glycolysis after acute thermal stress
1.4.4. Activating hemolysis mechanism after acute thermal stress
1.4.5. Removing hemoglobin subunit derived from RBC lysis and process of met-hemoglobin generation
1.4.6. Occurring thrombosis after acute thermal stress, and mechanism of modulating the vasoconstriction/ vasodilation.
Chapter II. Profiling systemic changes by Ichthyophthirius multifiliis infection through multi-organ transcriptomic analyses
2.1. Introduction
2.2. Materials and methods
2.2.1. Fish and sample
2.2.2. RNA extraction and RNA-seq analysis
2.2.3. Mapping, assembly and selection of differential expressed genes
2.2.4. Pathway analysis using DEGs
2.2.5. Profiling featured genes in both organs using machine learning interface
2.2.6. Real-time PCR
2.3. Results
2.3.1. Hematology and serology
2.3.2. Sequencing and genome-guided assembly
2.3.3. Selection of differentially expressed genes (DEGs)
2.3.4. Pathway analysis during I. multifiliis infection and recovery
2.3.5. Integrating pathway expression of head-kidney and liver under I. multifiliis infection and recovery
2.3.6. Integrating pathway expression of head-kidney and liver under I. multifiliis infection and recovery
2.3.7. RNA-seq validation using qPCR
2.4. Discussion
Chapter III. Differences of hemato-serological and immunological characteristics under the abnormal condition (Progressive hypoxia, overfeeding and A. hydrophila infection)
3.1. Introduction
3.2. Materials and methods
3.2.1. Hypoxic stress
3.2.1.1. Fish
3.2.1.2. Hypoxia stress
3.2.1.3. Blood sample analysis
3.2.1.4. Flow cytometry
3.2.1.5. Histopathology
3.2.1.6. RNA extraction and cDNA synthesis
3.2.1.7. Real-time PCR (Quantitative PCR)
3.2.1.8. Statistical analysis
3.2.2. Providing excessive feed and overfeeding stress
3.2.2.1. Fish
3.2.2.2. Overfeeding-induced obesity
3.2.2.3. Hematology and serology
3.2.2.4. Histopathology of the head-kidney and liver
3.2.2.5. RNA extraction, cDNA synthesis and real-time PCR (Quantitative PCR)
3.2.2.6. Statistical analysis
3.2.3. Aeromonas hydrophila infection
3.2.3.1. Strain and pathogenicity
3.2.3.2. Fish and challenge
3.2.3.3. Flow cytometry
3.2.3.4. Gene expression
3.2.3.5. Statistical analysis
3.3. Results and discussion
3.3.1. Comparison of changes in hemato-serological and immunological factors in rainbow trout under sub-hypoxia and lethal hypoxia
3.3.1.1. Hypoxic stress in teleost
3.3.1.2. Hematology
3.3.1.3. Leukocyte population in the head kidney
3.3.1.4. Gene expression level
3.3.2. Immuno-physiological disorders caused by intaking excessive feed in trout
3.3.2.1. Feed intake and hematology
3.3.2.2. Liver histopathology
3.3.2.3. Cellular responses in overfed trout
3.3.3. Hematological, serological and immunological changes after A. hydrophila infection.
3.3.3.1. Pathogenicity and characteristics of isolated A. hydrophila.
3.3.3.2. Hematological and serological results
3.3.3.3. Gene expression in head-kidney
3.3.3.4. Histopathological hematopoietic activity in head-kidney.
Chapter IV. Understanding the characteristics of new biomarkers, and developing the most appropriate method to detect them
4.1. Introduction
4.2. Materials and methods
4.2.1. Warm temperature acclimation protein
4.2.1.1. Identification of hemopexin like proteins and phylogenetic analysis
4.2.1.2. Primer and Wap65-2 expression by A. hydrophila, I. multifiliis infection and thermal stress
4.2.1.3. Polyclonal antibody
4.2.1.4. SDS-PAGE and western blot
4.2.1.5. Immunohistochemistry (IHC)
4.2.1.6. Flow cytometry analysis for leukocytes Wap65-2 expression under the LPS and OxLDL stimulation
4.2.2. Heat shock protein 70 family
4.2.2.1. Identification of HSP70 in trout and taxonomic classification
4.2.2.2. DNA motif prediction of HSP70 family
4.2.2.3. Transcription factor binding protein prediction
4.2.2.4. HSP70a isoform prediction
4.2.2.5. Different exons expression level in HSP70a under hypoxia, thermal stress, and pathogen-related stress (A. hydrophila and I. multifiliis)
4.2.2.6. Polyclonal antibody
4.2.2.7. SDS-PAGE and western blot
4.2.3. Haptoglobin
4.2.3.1. Phylogenetic analysis of two haptoglobins in trout
4.2.3.2. Profiling two haptoglobin expression in multiple organs after bacterial infection
4.2.3.3. Western blot and Immunohistochemistry
4.2.3.4. Bacterial infection and western
4.2.3.5. Statistical analysis
4.2.4. Quad color multiplex real-time PCR
4.3. Results and discussion
4.3.1. Warm temperature acclimation protein 65
4.3.1.1. Re-identification of hemopexin (like protein) in rainbow trout
4.3.1.2. Wap65-2 expression
4.3.1.3. Wap65-2 polyclonal antibody
4.3.1.4. Effect of LPS and OxLDL on the expression of Wap65-2 in macrophage and lymphocyte
4.3.1.5. Summary of Wap65-2 expression with various teleost
4.3.2. Heat shock protein 70 family
4.3.2.1. Identification of HSP70 in trout and taxonomic classification
4.3.2.2. DNA motif analysis of HSP70 family
4.3.2.3. Transcription binding factor protein composition of HSP70s
4.3.2.4. Isoforms prediction of HSP70a
4.3.2.5. Exon518 and 588 expressions in HSP70a induced by stresses and infection in rainbow trout.
4.3.2.6. HSP70a isoforms expression in plasma
4.3.3. Haptoglobins
4.3.3.1. Phylogenetic analysis of different species of teleost
4.3.3.2. Two Haptoglobins expression in rainbow trout
4.3.4. The efficiency of quad-color multiplex PCR
Chapter V. Estimating the standard ranges for serological factors and new-biomarkers, and suggesting models for different types of abnormal condition based on big-data analysis.
5.1. Introduction
5.2. Materials and methods
5.2.1. Data collection for serological factors
5.2.2. Sample classification for serological factors
5.2.2.1. Classification of each group
5.2.2.2. Multiple linear regression analysis
5.2.2.3. Receiver operation characteristics (ROC) curve analysis
5.2.2.4. Evaluation of a standardized range of each factor
5.2.3. Excavating optimal model under different type of abnormal condition based on serological factor
5.2.3.1. Multiple linear regression model
5.2.3.2. Logistic and LASSO regression model
5.2.3.3. Models based on machine learning analysis (Decision tree, Random forest, boosting and artificial neural network)
5.2.4. Sample classification for Wap65-1, Wap65-2 and HSP70a expression
5.2.4.1. Classification of each group
5.2.4.2. Gene expression of Wap65-1, Wap65-2 and HSP70a-518 expression
5.2.4.3. Multiple linear regression analysis for gene expression
5.2.4.4. Receiver operation characteristics (ROC) curve analysis
5.2.5. Excavating optimal model in accordance with different type of abnormal condition based on Wap65-1, Wap65-2 and HSP70a expression
5.2.5.1. Multiple linear regression model
5.2.5.2. Logistic and LASSO regression model
5.2.5.3. Models based on machine learning analysis (Decision tree, Random forest, boosting and artificial neural network)
5.3. Results and discussion.
5.3.1. Effect of temperature and weight on each serological factor
5.3.2. Estimating standardized ranges in each serological factor
5.3.3. Accuracy (Specificity and sensitivity) verification of derived standardized range for each serological factor
5.3.3.1. The standard level for GOT (Aspartate transaminase; AST)
5.3.3.2. Standard level for GPT (Alanine transaminase; ALT)
5.3.3.3. Standard level for ALP (Alkaline phosphatase)
5.3.3.4. The standard level for GLU (Glucose)
5.3.3.5. The standard level for TCHO (Total cholesterol)
5.3.3.6. The standard level for LDH (Lactate dehydrogenase)
5.3.3.7. The standard level for TP (Total protein) and Ca (Calcium)
5.3.4. Establish multiple models and select an optimized model based on serological factors
5.3.4.1. Multiple linear regression model
5.3.4.2. Logistic regression analysis
5.3.4.3. LASSO regression analysis
5.3.4.4. Decision tree analysis
5.3.4.5. Random forest
5.3.4.6. Boosting model
5.3.4.7. Artificial neural network
5.3.4.8. Evaluating an optimized model for infectious diseases, stress environment and excessive feed intake.
5.3.5. Effect of temperature and weight on Wap65-1, Wap65-2 and HSP70a
5.3.6. Estimating standardized ranges in Wap65-1, Wap65-2 and HSP70a
5.3.7. Accuracy (Specificity and sensitivity) verification of derived standardized range for Wap65-1, Wap65-2 and HSP70a
5.3.7.1. The standard level for Wap65-1 and Ad_Wap65-1
5.3.7.2. The standard level for Wap65-2 and Ad_Wap65-2
5.3.7.3. The standard level for HSP70a
5.3.8. Establish multiple models and select an optimized model based on Wap65-1, Wap65-2 and HSP70a expression.
5.3.8.1. Multiple linear regression model
5.3.8.2. Logistic and LASSO regression model
5.3.8.3. Decision tree analysis
5.3.8.4. Random forest analysis
5.3.8.5. Boosting analysis
5.3.8.6. Artificial neural network analysis
5.3.8.7. Evaluating optimized model for infectious diseases and stress environment based on Wap65-1, Wap65-2 and HSP70 expression.
Conclusion
References
Acknowledgement
- Degree
- Doctor
- Appears in Collections:
- Graduate School > Department of Aquatic Life Medicine