
Review article: Biomedical intelligence

Vol. 145 No. 3738 (2015)

Data mining The Cancer Genome Atlas in the era of precision cancer medicine

  • Phil F. Cheng
  • Reinhard Dummer
  • Mitchell P. Levesque
Cite this as:
Swiss Med Wkly. 2015;145:w14183


The Cancer Genome Atlas (TCGA) has given researchers and clinicians unprecedented access to data on many different cancers, generated on multiple platforms that include exome sequencing, comparative genomic hybridisation (CGH) arrays, DNA methylation arrays, RNA sequencing and reverse phase protein arrays (RPPA), together with clinical features. Most data are available to the public in their raw and processed forms; however, analysis and interpretation of these data require specialised training and software. To address this problem, various groups have developed online tools such as cBioPortal, canEvolve, GDAC Firehose, PROGgeneV2 and the UCSC Cancer Genomics Browser, which let users explore and analyse the datasets and present the results in a form that basic researchers and clinicians can easily understand. In this mini-review, we give an overview of the datasets available from TCGA and of the public tools available for integrative analysis of survival with the genomic and transcriptomic datasets, and we introduce a tool being developed by our group to analyse the datasets within TCGA.


