About this item:

Author Notes:

Correspondence: Xiangqin Cui, xiangqin.cui@emory.edu

Author contributions: Huan Zhong analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft. Soyeon Kim analyzed the data, authored or reviewed drafts of the paper, approved the final draft. Degui Zhi and Xiangqin Cui conceived and designed the experiments, authored or reviewed drafts of the paper, approved the final draft.

Disclosures: Degui Zhi and Xiangqin Cui are Academic Editors for PeerJ. The authors declare there are no competing interests.

Research Funding:

Degui Zhi was partially supported by NIH Grant R01 HG008115; Xiangqin Cui was partially supported by NIH 2P60AR048095.

Huan Zhong was supported by Hong Kong Baptist University’s strategic development fund SDF15-1012-P04 to Yiji Xia.

There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Keywords:

  • Science & Technology
  • Multidisciplinary Sciences
  • Science & Technology - Other Topics
  • DNA methylation
  • Methylation microarray
  • Transcriptome
  • Lasso
  • Integrative analysis
  • Body methylation
  • Adipose tissue
  • Start sites
  • Selection
  • Transcription
  • Epigenetics
  • Regression
  • Variants
  • Islands

Predicting gene expression using DNA methylation in three human populations

Journal Title:

PeerJ: Bioinformatics and Genomics

Volume:

Volume 7, Number 5

Pages:

e6757-e6757

Type of Work:

Article | Final Publisher PDF

Abstract:

Background: DNA methylation, an important epigenetic mark, is well known for its regulatory role in gene expression, especially its negative correlation with expression in the promoter region. However, its correlation with gene expression across the genome at the human population level has not been well studied. In particular, it is unclear whether the genome-wide DNA methylation profile of an individual can predict his or her gene expression profile. Previous studies were mostly limited to association analyses between single CpG site methylation and gene expression. It is not known whether DNA methylation of a gene has enough prediction power to serve as a surrogate for gene expression in existing human study cohorts that have DNA samples rather than RNA samples.
Results: We examined DNA methylation in the gene region for predicting gene expression across individuals in non-cancer tissues from three human population datasets: adipose tissue from the Multiple Tissue Human Expression Resource Projects (MuTHER), peripheral blood mononuclear cells (PBMC) from asthma and normal control study participants, and lymphoblastoid cell lines (LCL) from healthy individuals. Three prediction models were investigated: single linear regression, multiple linear regression, and least absolute shrinkage and selection operator (LASSO) penalized regression. Our results showed that LASSO regression performed best among these methods. However, the prediction power is generally low and varies across datasets. Only 30 and 42 genes were found to have cross-validation R² greater than 0.3 in the PBMC and adipose datasets, respectively. A substantially larger number of genes (258) were identified in the LCL dataset, which was generated from a more homogeneous cell line sample source. We also demonstrated that prediction power is better when no CpG probe is excluded because of cross-hybridization or SNP effects.
Conclusion: In our analyses of three populations, DNA methylation of CpG sites in the gene region has limited prediction power for gene expression across individuals with linear regression models. The prediction power potentially varies depending on tissue, cell type, and data source. In our analyses, the combination of LASSO regression and all probes on the methylation array, without excluding any probe, provides the best prediction of gene expression.
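
As a rough illustration of the three model types named in the abstract (single linear regression, multiple linear regression, and LASSO), the sketch below computes a cross-validated R² for one simulated gene. It is a minimal plain-Python sketch and not the authors' pipeline: the simulated methylation values, the penalty value `lam`, the interleaved fold assignment, the function names, and the definition of cross-validation R² as 1 - SS_res / SS_tot over held-out predictions are all assumptions made for illustration.

```python
import random

def soft_threshold(z, g):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # centered CpG values
    resid = [yi - y_mean for yi in y]
    beta = [0.0] * p
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # covariance of column j with the partial residual (column j added back)
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, means))
    return intercept, beta

def predict(model, X):
    b0, beta = model
    return [b0 + sum(b * x for b, x in zip(beta, row)) for row in X]

def cv_r2(X, y, lam, k=5):
    """Out-of-fold R^2 (assumed definition): 1 - SS_res / SS_tot."""
    n = len(y)
    preds = [0.0] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        model = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
        for i, yhat in zip(test, predict(model, [X[i] for i in test])):
            preds[i] = yhat
    y_mean = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Simulated data (hypothetical): 100 individuals, 20 CpG sites for one gene,
# with expression driven by only two of the sites plus noise.
rng = random.Random(0)
n, p = 100, 20
X = [[rng.random() for _ in range(p)] for _ in range(n)]
y = [5.0 - 3.0 * row[0] + 2.0 * row[3] + rng.gauss(0.0, 0.5) for row in X]

r2_single = cv_r2([[row[0]] for row in X], y, lam=0.0)  # one CpG site, no penalty
r2_multiple = cv_r2(X, y, lam=0.0)                      # all sites, no penalty
r2_lasso = cv_r2(X, y, lam=0.02)                        # all sites, L1 penalty
passes_threshold = r2_lasso > 0.3                       # the abstract's 0.3 cutoff
```

Setting `lam` to zero makes the same routine approximate an unpenalized least-squares fit, which is why it stands in for the single and multiple linear regression models here. In practice the penalty would be tuned (for example by an inner cross-validation) and an established statistics package would be used instead of a hand-written solver.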

Copyright information:

© PeerJ, Inc. 2019.

This is an Open Access work distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).