Integration of somatic mutation, expression and functional data reveals potential driver genes predictive of breast cancer survival

Chen Suo, Olga Hrydziuszko, Donghwan Lee, Setia Pramana, Dhany Saputra, Himanshu Joshi, Stefano Calza, Yudi Pawitan

Research output: Contribution to journal › Article › peer-review

42 Scopus citations


Motivation: Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures.

Results: An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores (P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer.

Availability and implementation: The documented pipeline including annotated sample scripts can be found in
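The four criteria that a gene must meet to contribute to the DGscore can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not state how the score is computed, so the sketch simply counts, per patient, the mutated genes that pass all four filters. The names `GeneEvidence`, `is_candidate_driver` and `toy_dg_score`, the gene labels, and every threshold value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneEvidence:
    """Per-patient evidence for one mutated gene (hypothetical structure)."""
    gene: str
    mutation_frequency: float  # fraction of cohort patients carrying a mutation in this gene
    impact: str                # predicted protein-level impact: "high", "moderate" or "low"
    expression_z: float        # tumor expression of the gene as a z-score
    de_neighbors: int          # differentially expressed neighbors in the gene network


def is_candidate_driver(ev, min_frequency=0.05, min_abs_z=2.0, min_de_neighbors=5):
    """Return True if the gene passes all four filters.

    All thresholds are assumptions; the abstract gives no numeric cut-offs.
    """
    return (
        ev.mutation_frequency >= min_frequency      # frequently mutated
        and ev.impact in {"high", "moderate"}       # high or moderate impact
        and abs(ev.expression_z) >= min_abs_z       # extreme expression (either tail)
        and ev.de_neighbors >= min_de_neighbors     # many DE network neighbors
    )


def toy_dg_score(mutated_genes, **thresholds):
    """Illustrative score: number of the patient's mutated genes passing every filter."""
    return sum(1 for ev in mutated_genes if is_candidate_driver(ev, **thresholds))


# Made-up evidence for a single patient.
patient = [
    GeneEvidence("GENE_A", 0.30, "high", 3.1, 12),
    GeneEvidence("GENE_B", 0.02, "high", 2.5, 9),       # fails: rarely mutated
    GeneEvidence("GENE_C", 0.10, "low", -2.8, 7),       # fails: low impact
    GeneEvidence("GENE_D", 0.08, "moderate", -2.2, 6),
]
score = toy_dg_score(patient)
```

With these made-up inputs, only `GENE_A` and `GENE_D` pass all four filters, so `score` is 2. In a survival analysis, patients would then be grouped by such a score (for example, high versus low) before comparing outcomes.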

Original language: English
Pages (from-to): 2607-2613
Number of pages: 7
Issue number: 16
State: Published - 19 Jan 2015
Externally published: Yes

