Journal Article
2013 Jul; 20(6):1091-8.
doi: 10.1136/amiajnl-2012-001469.

Cancer Digital Slide Archive: an informatics resource to support integrated in silico analysis of TCGA pathology data

David A Gutman, Jake Cobb, Dhananjaya Somanna, Yuna Park, Fusheng Wang, Tahsin Kurc, Joel H Saltz, Daniel J Brat, Lee A D Cooper
  • PMID: 23893318
  • 21 references
  • 45 citations


Background: The integration and visualization of multimodal datasets is a common challenge in biomedical informatics. Several recent studies of The Cancer Genome Atlas (TCGA) data have illustrated important relationships between morphology observed in whole-slide images, outcome, and genetic events. The pairing of genomics and rich clinical descriptions with whole-slide imaging provided by TCGA presents a unique opportunity to perform these correlative studies. However, better tools are needed to integrate the vast and disparate data types.

Objective: To build an integrated web-based platform supporting whole-slide pathology image visualization and data integration.

Materials and Methods: All images and genomic data were obtained directly from the TCGA and National Cancer Institute (NCI) websites.

Results: The resulting Cancer Digital Slide Archive (CDSA) is publicly accessible and currently hosts more than 20,000 whole-slide images from 22 cancer types.

Discussion: The capabilities of CDSA are demonstrated using TCGA datasets to integrate pathology imaging with associated clinical, genomic and MRI measurements in glioblastomas and can be extended to other tumor types. CDSA also allows URL-based sharing of whole-slide images, and has preliminary support for directly sharing regions of interest and other annotations. Images can also be selected on the basis of other metadata, such as mutational profile, patient age, and other relevant characteristics.
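The metadata-based selection and URL-based sharing described above can be illustrated with a minimal sketch. This is not the CDSA implementation or its API: the `Slide` record, the `select_slides` and `share_url` helpers, the query parameter names, the example URL, and the sample records are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Slide:
    """Hypothetical metadata record for one whole-slide image."""
    slide_id: str
    tumor_type: str
    patient_age: int
    mutations: frozenset


def select_slides(slides, mutation=None, min_age=None, max_age=None):
    """Return the slides that satisfy every filter that was supplied."""
    selected = []
    for slide in slides:
        if mutation is not None and mutation not in slide.mutations:
            continue
        if min_age is not None and slide.patient_age < min_age:
            continue
        if max_age is not None and slide.patient_age > max_age:
            continue
        selected.append(slide)
    return selected


def share_url(base_url, slide_id, roi=None):
    """Build a shareable link; roi is an optional (x, y, width, height) tuple.

    The parameter names are assumptions, not the archive's real URL scheme.
    """
    params = {"slide": slide_id}
    if roi is not None:
        x, y, w, h = roi
        params.update({"x": x, "y": y, "w": w, "h": h})
    return f"{base_url}?{urlencode(params)}"


# Made-up records for illustration only; not real TCGA data.
SLIDES = [
    Slide("SLIDE-001", "GBM", 45, frozenset({"IDH1", "TP53"})),
    Slide("SLIDE-002", "GBM", 67, frozenset({"EGFR"})),
    Slide("SLIDE-003", "GBM", 38, frozenset({"IDH1"})),
]

if __name__ == "__main__":
    for slide in select_slides(SLIDES, mutation="IDH1", max_age=40):
        print(share_url("https://example.org/viewer", slide.slide_id,
                        roi=(100, 200, 512, 512)))
```

Encoding the slide identifier and region of interest in the query string keeps the link self-describing, so a recipient could open the same view without any shared session state.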

Conclusions: With the increasing availability of whole-slide scanners, analysis of digitized pathology images will become increasingly important in linking morphologic observations with genomic and clinical endpoints.

Keywords: Cancer; Cell Morphology; Computer-Assisted Image Analysis; Digital Pathology; Image Cytometry; TCGA.
