Journal Article
2020 Dec; 12(12).
doi: 10.3390/cancers12123506.

An Improved, Assay Platform Agnostic, Absolute Single Sample Breast Cancer Subtype Classifier

Mi-Kyoung Seo, Soonmyung Paik, Sangwoo Kim
  • PMID: 33255759


Intrinsic molecular subtypes provide an important biological classification of breast cancer, but the subtype assigned to an individual sample is influenced by assay technology and by the composition of the study cohort. We sought to develop a platform-independent, absolute single-sample subtype classifier based on a minimal number of genes. Pairwise ratios of subtype-specific differentially expressed genes, computed from un-normalized expression data for 432 breast cancer (BC) samples from The Cancer Genome Atlas (TCGA), were used as inputs for machine learning. The classifier with the fewest genes and the greatest classification power was selected during cross-validation. The final model was evaluated on 5816 samples from 10 independent studies profiled on four different assay platforms. In cross-validation within the TCGA cohort, a random forest classifier with 11 genes (MiniABS) achieved the best accuracy, 88.2%. Applied to five validation sets of RNA-seq and microarray data, MiniABS showed an average accuracy of 85.15% (vs. 77.72% for Absolute Intrinsic Molecular Subtype (AIMS)). Only MiniABS could be applied to the five low-throughput datasets, where it showed an average accuracy of 87.93%. MiniABS can assign an absolute BC subtype from the raw expression levels of only 11 genes, regardless of assay platform, with higher accuracy than existing methods.

Keywords: breast cancer; classifier; machine learning; optimization; subtyping.
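The pairwise-ratio inputs described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The gene names (`GENE_A`, `GENE_B`, `GENE_C`), the expression values, the log2 transform, and the function name `pairwise_log_ratios` are all assumptions made for illustration. The sketch shows only one property that makes within-sample ratios attractive for single-sample classification: multiplying every value in a sample by the same factor leaves the ratios unchanged.

```python
import math
from itertools import combinations


def pairwise_log_ratios(expression):
    """Return log2 ratios for every unordered gene pair within one sample.

    `expression` maps a gene name to its raw (un-normalized) expression
    value. Values must be positive; how zeros are handled in the original
    study is not stated, so this sketch simply rejects them.
    """
    if any(value <= 0 for value in expression.values()):
        raise ValueError("expression values must be positive")
    genes = sorted(expression)  # fixed order so feature names are stable
    return {
        f"{a}/{b}": math.log2(expression[a] / expression[b])
        for a, b in combinations(genes, 2)
    }


# Hypothetical raw expression values for one sample.
sample = {"GENE_A": 8.0, "GENE_B": 2.0, "GENE_C": 4.0}
features = pairwise_log_ratios(sample)
print(features)
# {'GENE_A/GENE_B': 2.0, 'GENE_A/GENE_C': 1.0, 'GENE_B/GENE_C': -1.0}

# The same sample measured on a hypothetical platform whose signal is
# 10 times larger yields the same features.
scaled = {gene: 10.0 * value for gene, value in sample.items()}
scaled_features = pairwise_log_ratios(scaled)
print(all(math.isclose(features[k], scaled_features[k]) for k in features))
# True
```

Under this construction, 11 genes give at most 55 unordered pairs. Whether MiniABS uses all of them or a selected subset is not stated in the abstract. The resulting feature vector would then be passed to a classifier such as the random forest the abstract mentions.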
