Millions of somatic mutations in human cancers have been identified by sequencing. Identifying cancer driver genes and distinguishing them among the millions of candidate mutations remains a major challenge. Accurate identification of driver genes and mutations is essential for progress in cancer research and for personalizing treatment based on accurate stratification of patients. Because of inter-tumor genetic heterogeneity, numerous driver mutations within a gene can be found at low frequencies, which makes them difficult to differentiate from non-driver mutations. Inspired by these challenges, we devised a novel way of identifying cancer driver genes. Our approach uses multiple complementary types of information as features, specifically cellular phenotypes, cellular locations, functions, and whole-body physiological phenotypes. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their roles in different types of cancer. In addition to identifying known driver genes, we identify several novel candidate driver genes. We provide an external evaluation of the predicted genes using a dataset of 26 nasopharyngeal cancer samples that underwent whole-exome sequencing. We find that the predicted driver genes have a significantly higher rate of mutation than non-driver genes, both in publicly available data and in the nasopharyngeal cancer samples we use for validation. Additionally, we characterize sub-networks of genes that are jointly involved in specific tumors.
|Date of Award||Nov 8 2018|
- Computer, Electrical and Mathematical Science and Engineering
|Supervisor||Robert Hoehndorf (Supervisor)|