The primary motivation of this work is to substantiate the centrality-lethality hypothesis, which states that highly central nodes in a network are much more likely to be essential, meaning that their removal is lethal.
Our network features include some common measures of centrality and some novel recursive measures recently proposed in the social network literature.
Our network features capture the local, global and neighbourhood properties of the network and are hence effective for the prediction of essential genes across diverse organisms, even in the absence of other complex biological knowledge.
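As a rough illustration of what a centrality feature looks like, the sketch below computes two common measures, degree centrality and closeness centrality, on a toy network. The gene names and the adjacency structure are hypothetical, and these two measures are only examples; this is not the feature set or the code used to build the database.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


def closeness_centrality(adj):
    """Number of reachable nodes divided by the summed shortest-path distance to them."""
    scores = {}
    for source in adj:
        # Breadth-first search gives shortest-path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in adj[current]:
                if neighbour not in dist:
                    dist[neighbour] = dist[current] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical star-shaped network: geneA interacts with every other gene.
toy_network = {
    "geneA": {"geneB", "geneC", "geneD"},
    "geneB": {"geneA"},
    "geneC": {"geneA"},
    "geneD": {"geneA"},
}

degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)
print(degree["geneA"], closeness["geneB"])
```

In this toy network the hub gene scores highest on both measures, which is the kind of signal the centrality-lethality hypothesis relies on.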
We extracted several sets of network-based features from protein–protein association networks available from the STRING database, and we built a machine learning model that classifies genes based on their essentiality. In total, we trained on the datasets using the One-Organism-Out method and built models for 2711 bacterial organisms. The essential genes predicted by our model have been separated by species and are hosted on this website.
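The general idea behind a one-organism-out scheme can be sketched as follows: each organism is held out in turn, and a model is fit on the genes of all remaining organisms. The record layout, organism names, and function name below are assumptions made for illustration, and the classifier itself is left out because it is not described here.

```python
def leave_one_organism_out(samples):
    """Yield (held_out, train, test) splits, holding out one organism at a time.

    `samples` is a list of dicts with at least an "organism" key
    (a hypothetical layout chosen for this sketch).
    """
    organisms = sorted({sample["organism"] for sample in samples})
    for held_out in organisms:
        train = [s for s in samples if s["organism"] != held_out]
        test = [s for s in samples if s["organism"] == held_out]
        yield held_out, train, test


# Hypothetical labelled genes from three organisms.
toy_samples = [
    {"organism": "org1", "gene": "g1", "essential": 1},
    {"organism": "org1", "gene": "g2", "essential": 0},
    {"organism": "org2", "gene": "g3", "essential": 1},
    {"organism": "org3", "gene": "g4", "essential": 0},
]

splits = list(leave_one_organism_out(toy_samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

Splitting by organism rather than by gene keeps genes from the held-out organism entirely out of training, so the evaluation reflects how well the features transfer to an unseen species.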
Senthamizhan, V., Ravindran, B., & Raman, K. (2021).
NetGenes: A Database of Essential Genes Predicted Using Features From Interaction Networks.
Front. Genet. 12:722198. DOI
Each gene on a species page is linked to its gene network in the STRING database. The NCBI taxonomy page for each species can also be accessed from its species page.
The function of each gene (or its corresponding protein) is listed on the species page. These annotations were collected using the API offered by the STRING database.
Each species page lists an essentiality score for every gene. The score is the model's estimated probability that the gene is essential, and it ranges from 0.51 to 1.0.
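A score range starting at 0.51 is consistent with listing only genes whose predicted probability exceeds 0.5, although the exact cutoff is not stated here. The sketch below shows that kind of filtering under this assumption; the function name, the threshold default, and the scores are hypothetical.

```python
def select_essential(scores, threshold=0.5):
    """Return (gene, probability) pairs above the threshold, highest score first.

    The 0.5 threshold is an assumption inferred from the reported score range.
    """
    kept = [(gene, prob) for gene, prob in scores.items() if prob > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical model outputs for four genes.
toy_scores = {"geneA": 0.93, "geneB": 0.42, "geneC": 0.51, "geneD": 0.50}

predicted_essential = select_essential(toy_scores)
print(predicted_essential)
```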