Unsupervised Network Medicine for Longitudinal Omics Data
(FAU Funds) Term: since 15 January 2022
In recent years, large amounts of molecular profiling data (also called “omics data”) have become available. This has raised hopes of identifying so-called disease modules, i.e., sets of functionally related molecules constituting candidate disease mechanisms. However, omics data tend to be underdetermined (far more molecules are measured than samples are available) and noisy, and modules identified via purely statistical means are therefore often unstable and functionally uninformative. Network-based disease module mining methods (DMMMs) address this by projecting omics data onto biological networks such as protein-protein interaction (PPI) networks, gene regulatory networks (GRNs), or microbial interaction networks (MINs). Subsequently, network algorithms are used to identify disease modules consisting of small subnetworks. This dramatically decreases the size of the search space and prioritizes disease modules consisting of functionally related molecules, positively affecting both stability and functional relevance of the discovered modules.
However, to the best of our knowledge, all existing DMMMs are subject to at least one of the following two limitations: Firstly, existing DMMMs are typically supervised, in the sense that they try to find subnetworks explaining differences in the omics data between predefined case and control patients or predefined disease subtypes. This is potentially problematic, because it implies that existing DMMMs are biased by our current disease ontologies, which are mostly symptom- or organ-based and therefore often too coarse-grained. For instance, around 95 % of all patients with hypertension are diagnosed with so-called “essential hypertension” (code BA00.Z in the ICD-11 disease ontology), meaning that the cause of the hypertension is unknown. In fact, there are probably several disjoint molecular mechanisms causing “essential hypertension”, and the same holds true for many other complex diseases such as Alzheimer’s disease, multiple sclerosis, and Crohn’s disease. Supervised DMMMs that take existing disease definitions for granted hence risk overlooking the molecular mechanisms underlying mechanistically distinct subtypes.
Secondly, most existing DMMMs are designed for static omics data and do not support longitudinal data where the patients’ molecular profiles are observed over time. Existing analysis frameworks for longitudinal omics data largely use purely statistical means. Consequently, network medicine approaches for time series data are needed.
To the best of our knowledge, there are only three DMMMs which, in part, overcome these limitations: BiCoN and GrandForest allow unsupervised disease module mining but do not support longitudinal omics data. TiCoNE supports longitudinal data but requires predefined case vs. control or subtype annotations as input. There is hence an unmet need for unsupervised DMMMs for longitudinal omics data. Developing such methods is the main objective of the proposed project.