Publications [3]: Predicting gene and protein function
Inferring microbial gene function:
- The evolutionary signal in metagenome phyletic profiles predicts many gene functions. V Vidulin, T Smuc, S Dzeroski, F Supek (2018) Microbiome.
The increasing availability of microbiome DNA sequencing data provides an opportunity to infer gene function systematically // Metagenome phyletic profiles (MPPs) can accurately predict 826 Gene Ontology functional categories // MPPs derived from diverse environments infer distinct, non-overlapping sets of gene functions
- Inferring gene function from evolutionary change in signatures of translation efficiency. A Krisko, T Copic, T Gabaldón, B Lehner, F Supek (2014) Genome Biology.
Changes in codon adaptation within orthologous gene families can systematically predict the function of many genes, with machine learning used to rule out confounding variables. We experimentally validated novel roles in adaptation to environmental stressors (oxygen, heat, salinity) for tens of E. coli genes.
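The phyletic-profile idea in the first entry above can be illustrated with a toy: a gene family is described by its presence or absence across a set of samples, and a family with no annotation borrows the function of the annotated family whose profile is most similar. This is a minimal nearest-neighbour sketch, not the method of the cited papers. The family names, profiles, function labels and helper names (`profile_similarity`, `nearest_annotated`) are all hypothetical.

```python
def profile_similarity(p, q):
    """Fraction of samples in which two presence/absence profiles agree."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same samples")
    matches = sum(1 for a, b in zip(p, q) if a == b)
    return matches / len(p)


def nearest_annotated(query, annotated):
    """Name of the annotated family whose profile is most similar to `query`."""
    return max(
        annotated,
        key=lambda name: profile_similarity(query, annotated[name][0]),
    )


# Hypothetical toy data: 1 = family detected in the sample, 0 = not detected.
annotated = {
    "fam_A": ([1, 1, 0, 0, 1, 0], "hypothetical function X"),
    "fam_B": ([0, 0, 1, 1, 0, 1], "hypothetical function Y"),
}
query_profile = [1, 1, 0, 1, 1, 0]

best = nearest_annotated(query_profile, annotated)
predicted_function = annotated[best][1]
print(best, predicted_function)
```

In practice a trained classifier would replace the nearest-neighbour step. The toy only shows why co-occurrence across samples carries a functional signal.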
Predicting oncogenes:
- Systematic discovery of germline cancer predisposition genes through the identification of somatic second hits. S Park, F Supek, B Lehner (2018) Nature Communications.
A statistical method, ALFRED, tests Knudson’s two-hit hypothesis to systematically identify inherited cancer predisposing genes // We identify novel genes, such as the chromatin modifier NSD1, which cause cancer through germline variants and somatic loss-of-heterozygosity // 1 in 50 tumors is associated with novel ALFRED genes
Methods for gene function prediction:
- Extensive complementarity between gene function prediction methods. V Vidulin, T Smuc, F Supek (2016) Bioinformatics.
We analyzed 5 million genes from 2071 genomes to evaluate established methodologies for automated function prediction (AFP). While >1000 functions yielded reliable predictions, the majority of these were accessible to only one or two of the methods. Different methods tend to assign a function to non-overlapping sets of genes. Genomic AFP methods display a striking complementarity, both gene-wise and function-wise.
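One way to quantify function-wise complementarity is to count, for each function, how many methods predict it reliably, and to measure the overlap between pairs of methods. The sketch below uses made-up placeholder data. The method names, function identifiers and helper names are hypothetical, and the paper's own evaluation procedure is not reproduced here.

```python
from collections import Counter


def methods_per_function(reliable):
    """Count how many methods reliably predict each function."""
    counts = Counter()
    for functions in reliable.values():
        counts.update(functions)
    return counts


def jaccard(a, b):
    """Jaccard overlap of two sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: functions each method predicts reliably.
reliable = {
    "method_1": {"F1", "F2", "F3"},
    "method_2": {"F3", "F4"},
    "method_3": {"F3", "F5"},
}

counts = methods_per_function(reliable)
# Share of functions accessible to at most two methods.
few = sum(1 for n in counts.values() if n <= 2) / len(counts)
overlap_12 = jaccard(reliable["method_1"], reliable["method_2"])
print(few, overlap_12)
```

A high share of functions reachable by only one or two methods, together with low pairwise Jaccard values, is the pattern the summary describes as complementarity. The same counting can be applied gene-wise by replacing function sets with sets of genes.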
"Prediction is very difficult, especially about the future." -- Niels Bohr.