Journal Article
Sci Rep. 2020 Mar; 10(1):3920.
doi: 10.1038/s41598-020-60845-2.

Collective effects of long-range DNA methylations predict gene expressions and estimate phenotypes in cancer

Soyeon Kim, Hyun Jung Park, Xiangqin Cui, Degui Zhi
  • PMID: 32127627


DNA methylation of various genomic regions has been found to be associated with gene expression in diverse biological contexts. However, most genome-wide studies have focused on the effect on gene expression of (1) methylation in cis rather than in trans and (2) a single CpG rather than the collective effects of multiple CpGs. In this study, we developed a statistical machine learning model, geneEXPLORE (gene expression prediction by long-range epigenetics), that quantifies the collective effects of both cis- and trans-methylation on gene expression. By applying geneEXPLORE to The Cancer Genome Atlas (TCGA) data for breast cancer and 10 other cancer types, we found that most genes are associated with methylation at sites as far as 10 Mb or more from their promoters, and that this long-range methylation explains 50% of the variation in gene expression on average, far more than cis-methylation does. geneEXPLORE outperforms competing methods such as BioMethyl and MethylXcan. Further, the predicted gene expression could predict clinical phenotypes such as breast tumor status and estrogen receptor status (AUC = 0.999 and 0.94, respectively) as accurately as measured gene expression levels. These results suggest that geneEXPLORE provides a means for accurate imputation of gene expression, which can in turn be used to predict clinical phenotypes.
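The abstract reports two kinds of summary numbers: the share of expression variation explained by methylation, and the AUC for phenotype prediction. The sketch below shows one common way such quantities are computed. It is not the authors' code: the function names `r_squared` and `auc` are hypothetical, the toy values are made up for illustration, and the abstract does not state exactly how either metric was calculated in the study.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison.

    Counts how often a positive sample scores higher than a negative
    one; ties count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up values (assumption): measured vs. methylation-predicted
# expression of one gene in four samples, plus a binary phenotype label.
measured_expr = [2.0, 4.0, 6.0, 8.0]
predicted_expr = [3.0, 3.0, 7.0, 7.0]
tumor_label = [0, 0, 1, 1]

print(r_squared(measured_expr, predicted_expr))  # variance explained
print(auc(tumor_label, predicted_expr))          # phenotype separation
```

In this toy case the predicted values recover most of the variance in the measured values and separate the two label groups perfectly; real data would of course be evaluated on held-out samples.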
