Journal Article
Sci Rep. 2016 Jul;6:29915.
doi: 10.1038/srep29915.

Relational Network for Knowledge Discovery through Heterogeneous Biomedical and Clinical Features

Huaidong Chen 1, Wei Chen 2, Chenglin Liu 2, Le Zhang 1, Jing Su 2, Xiaobo Zhou 2
PMID: 27427091


Biomedical big data as a whole covers numerous features, but each dataset delineates only part of them. "Full feature spectrum" knowledge discovery across heterogeneous data sources remains a major challenge. We developed a method called bootstrapping for unified feature association measurement (BUFAM) for pairwise association analysis, and relational dependency network (RDN) modeling for global module detection on features across breast cancer cohorts. Discovered knowledge was cross-validated using data from Wake Forest Baptist Medical Center's electronic medical records and annotated with BioCarta signaling signatures. The clinical potential of the discovered modules was demonstrated by stratifying patients for drug responses. A series of discovered associations provided new insights into breast cancer, such as the effects of patients' cultural background on preferences for surgical procedure. We also discovered two groups of highly associated features, the HER2 and the ER modules, each of which described how phenotypes were associated with molecular signatures, diagnostic features, and clinical decisions. The discovered "ER module", which was dominated by cancer immunity, was used as an example for patient stratification and prediction of drug responses to tamoxifen and chemotherapy. BUFAM-derived RDN modeling demonstrated a unique ability to discover clinically meaningful and actionable knowledge across highly heterogeneous biomedical big data sets.
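The general idea of a bootstrapped pairwise association measure can be illustrated with a minimal sketch. This is not the authors' BUFAM implementation, whose details are not given in the abstract. The function name `bootstrap_association`, the use of absolute Pearson correlation as the association statistic, the resample count, and the restriction to rows where both features are observed are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_association(x, y, n_boot=500, seed=0):
    """Hypothetical helper: bootstrap mean and 95% percentile interval of |Pearson r|.

    Only rows where both features are observed (non-NaN) are used, so two
    features drawn from partially overlapping datasets can still be compared.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]
    n = len(x)
    if n < 3:
        raise ValueError("need at least 3 jointly observed samples")

    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        xs, ys = x[idx], y[idx]
        if xs.std() == 0 or ys.std() == 0:
            continue  # correlation is undefined for a constant resample
        stats.append(abs(np.corrcoef(xs, ys)[0, 1]))
    if not stats:
        raise ValueError("all resamples were constant")

    stats = np.array(stats)
    lower, upper = np.percentile(stats, [2.5, 97.5])
    return stats.mean(), lower, upper
```

A pair of features might then be treated as associated when the lower bound of the interval exceeds a chosen threshold, and the resulting pairs could serve as candidate edges for a network-level analysis. Both the threshold and that use of the output are assumptions here, not details taken from the paper.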
