David B. Blumenthal
Prof. Dr. David B. Blumenthal
- Since June 2021
W1 professor and head of the Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany.
- October 2019 – May 2021
Postdoctoral fellow at the Chair of Experimental Bioinformatics, Technical University of Munich, Freising, Germany.
- March 2019 – September 2019
Postdoctoral fellow at the Faculty of Computer Science, Free University of Bozen-Bolzano, Bolzano, Italy.
- November 2015 – February 2019
PhD candidate at the Faculty of Computer Science, Free University of Bozen-Bolzano, Bolzano, Italy.
- February & November 2018
Visiting PhD student at the GREYC Research Lab in Digital Science, Caen, France.
- October 2009 – June 2015
BSc & MSc Mathematics, Free University of Berlin & Technical University of Berlin, Berlin, Germany.
- October 2007 – November 2013
BA & MA Philosophy, Free University of Berlin, Berlin, Germany.
- September 2010 – June 2011
Erasmus exchange, University of Edinburgh, Edinburgh, United Kingdom.
- June 2007
Abitur, Ottheinrich-Gymnasium Wiesloch, Wiesloch, Germany.
2024
Guiding questions to avoid data leakage in biological machine learning applications
In: Nature Methods 21 (2024), p. 1444-1453
ISSN: 1548-7091
DOI: 10.1038/s41592-024-02362-y
Cracking the black box of deep sequence-based protein-protein interaction prediction
In: Briefings in Bioinformatics 25 (2024), Article No.: bbae076
ISSN: 1467-5463
DOI: 10.1093/bib/bbae076
Inference of differential gene regulatory networks using boosted differential trees
In: Bioinformatics Advances 4 (2024), Article No.: vbae034
ISSN: 2635-0041
DOI: 10.1093/bioadv/vbae034
Federated singular value decomposition for high-dimensional data
In: Data Mining and Knowledge Discovery 38 (2024), p. 938-975
ISSN: 1573-756X
DOI: 10.1007/s10618-023-00983-z
Network medicine-based epistasis detection in complex diseases: Ready for quantum computing
In: Nucleic Acids Research 52 (2024), p. 10144-10160
ISSN: 0305-1048
DOI: 10.1093/nar/gkae697
Efficiently Labeling and Retrieving Temporal Anomalies in Relational Databases
In: Information Systems Frontiers (2024)
ISSN: 1387-3326
DOI: 10.1007/s10796-024-10495-w
Drugst.One - a plug-and-play solution for online systems medicine and network-based drug repurposing
In: Nucleic Acids Research 52 (2024), p. W481-W488
ISSN: 0305-1048
DOI: 10.1093/nar/gkae388
ZEB1-mediated fibroblast polarization controls inflammation and sensitivity to immunotherapy in colorectal cancer
In: EMBO Reports (2024)
ISSN: 1469-221X
DOI: 10.1038/s44319-024-00186-7
Correction to: The specific DNA methylation landscape in focal cortical dysplasia ILAE type 3D (Acta Neuropathologica Communications, (2023), 11, 1, (129), 10.1186/s40478-023-01618-6)
In: Acta Neuropathologica Communications 12 (2024), Article No.: 49
ISSN: 2051-5960
DOI: 10.1186/s40478-024-01752-9
2023
The edge-preservation similarity for comparing rooted, unordered, node-labeled trees
In: Pattern Recognition Letters 167 (2023), p. 189-195
ISSN: 0167-8655
DOI: 10.1016/j.patrec.2023.02.017
On the role of network topology in German-Jewish recommendation letter networks in the early twentieth century
In: Applied Network Science 8 (2023), Article No.: 24
ISSN: 2364-8228
DOI: 10.1007/s41109-023-00550-x
TF-Prioritizer: a Java pipeline to prioritize condition-specific transcription factors
In: GigaScience 12 (2023), Article No.: giad026
ISSN: 2047-217X
DOI: 10.1093/gigascience/giad026
Demographic confounders distort inference of gene regulatory and gene co-expression networks in cancer
In: Briefings in Bioinformatics 24 (2023), Article No.: bbad413
ISSN: 1467-5463
DOI: 10.1093/bib/bbad413
Lacking mechanistic disease definitions and corresponding association data hamper progress in network medicine and beyond
In: Nature Communications 14 (2023), Article No.: 1662
ISSN: 2041-1723
DOI: 10.1038/s41467-023-37349-4
Online bias-aware disease module mining with ROBUST-Web
In: Bioinformatics 39 (2023), Article No.: btad345
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btad345
The specific DNA methylation landscape in focal cortical dysplasia ILAE type 3D
In: Acta Neuropathologica Communications 11 (2023), Article No.: 129
ISSN: 2051-5960
DOI: 10.1186/s40478-023-01618-6
2022
Online in silico validation of disease and gene sets, clusterings or subnetworks with DIGEST
In: Briefings in Bioinformatics (2022)
ISSN: 1467-5463
DOI: 10.1093/bib/bbac247
URL: https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbac247/6618231?guestAccessKey=e1b9455f-ac28-4d46-af6d-c51b3b98cc8d
Robust disease module mining via enumeration of diverse prize-collecting Steiner trees
In: Bioinformatics 38 (2022), p. 1600-1606
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btab876
URL: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btab876/6497106?guestAccessKey=27c2a19b-11b5-4c3c-88d5-f1131e3809a9
Enumerating dissimilar minimum cost perfect and error-correcting bipartite matchings for robust data matching
In: Information Sciences 596 (2022), p. 202-221
ISSN: 0020-0255
DOI: 10.1016/j.ins.2022.03.017
Querying Temporal Anomalies in Healthcare Information Systems and Beyond
26th European Conference on Advances in Databases and Information Systems (ADBIS 2022) (Turin, 05/09/2022 - 08/09/2022)
In: Chiusano S, Cerquitelli T, Wrembel R (ed.): Advances in Databases and Information Systems (ADBIS 2022), Cham: 2022
DOI: 10.1007/978-3-031-15740-0_16
Privacy-Preserving Artificial Intelligence Techniques in Biomedicine
In: Methods of Information in Medicine (2022)
ISSN: 0026-1270
DOI: 10.1055/s-0041-1740630
2021
Metric Indexing for Graph Similarity Search
14th International Conference on Similarity Search and Applications (SISAP 2021) (Dortmund, 29/09/2021 - 01/10/2021)
In: Reyes N, Connor R, Kriege N, Kazempour D, Bartolini I, Schubert E, Chen J (ed.): Proceedings of the 14th International Conference on Similarity Search and Applications (SISAP 2021), Cham: 2021
DOI: 10.1007/978-3-030-89657-7_24
Scalable generalized median graph estimation and its manifold use in bioinformatics, clustering, classification, and indexing
In: Information Systems 100 (2021), Article No.: 101766
ISSN: 0306-4379
DOI: 10.1016/j.is.2021.101766
Upper Bounding Graph Edit Distance Based on Rings and Machine Learning
In: International Journal of Pattern Recognition and Artificial Intelligence (2021), Article No.: 2151008
ISSN: 0218-0014
DOI: 10.1142/S0218001421510083
The Minimum Edit Arborescence Problem and Its Use in Compressing Graph Collections
14th International Conference on Similarity Search and Applications (SISAP 2021) (Dortmund, 29/09/2021 - 01/10/2021)
In: Reyes N, Connor R, Kriege N, Kazempour D, Bartolini I, Schubert E, Chen J (ed.): Proceedings of the 14th International Conference on Similarity Search and Applications (SISAP 2021), Cham: 2021
DOI: 10.1007/978-3-030-89657-7_25
Federated Principal Component Analysis for Genome-Wide Association Studies
21st IEEE International Conference on Data Mining (ICDM) (Auckland, New Zealand, 07/12/2021 - 10/12/2021)
In: 21st IEEE International Conference on Data Mining (ICDM) 2021
DOI: 10.1109/ICDM51629.2021.00127
On the limits of active module identification
In: Briefings in Bioinformatics 22 (2021), Article No.: bbab066
ISSN: 1467-5463
DOI: 10.1093/bib/bbab066
URL: https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbab066/6189770?guestAccessKey=1be406a0-6b26-4e15-b32b-505c1ea2b279
The AIMe registry for artificial intelligence in biomedical research
In: Nature Methods 18 (2021), p. 1128-1131
ISSN: 1548-7091
DOI: 10.1038/s41592-021-01241-0
URL: https://rdcu.be/cv5H7
On the Privacy of Federated Pipelines
The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21) (Virtual Event, 11/07/2021 - 15/07/2021)
In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '21), New York, NY, USA: 2021
DOI: 10.1145/3404835.3462996
Network medicine for disease module identification and drug repurposing with the NeDRex platform
In: Nature Communications 12 (2021), Article No.: 6848
ISSN: 2041-1723
DOI: 10.1038/s41467-021-27138-2
Flimma: a federated and privacy-aware tool for differential gene expression analysis
In: Genome Biology 22 (2021), Article No.: 338
ISSN: 1474-760X
DOI: 10.1186/s13059-021-02553-2
2020
A framework for modeling epistatic interaction
In: Bioinformatics 37 (2020), p. 1708-1716
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btaa990
URL: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa990/6012351?guestAccessKey=4683fb7f-7afc-4b9c-b0d5-3d0085904eb4
Comparing heuristics for graph edit distance computation
In: VLDB Journal 29 (2020), p. 419-458
ISSN: 1066-8888
DOI: 10.1007/s00778-019-00544-1
On the exact computation of the graph edit distance
In: Pattern Recognition Letters 134 (2020), p. 46-57
ISSN: 0167-8655
DOI: 10.1016/j.patrec.2018.05.002
EpiGEN: An epistasis simulation pipeline
In: Bioinformatics 36 (2020), p. 4957-4959
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btaa245
URL: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa245/5820008?guestAccessKey=2385c5f1-6587-4065-b2f2-1039765fa5b6
Improved local search for graph edit distance
In: Pattern Recognition Letters 129 (2020), p. 19-25
ISSN: 0167-8655
DOI: 10.1016/j.patrec.2019.10.028
Fast linear sum assignment with error-correction and no cost constraints
In: Pattern Recognition Letters 134 (2020), p. 37-45
ISSN: 0167-8655
DOI: 10.1016/j.patrec.2018.03.032
Finding k-shortest paths with limited overlap
In: VLDB Journal (2020)
ISSN: 1066-8888
DOI: 10.1007/s00778-020-00604-x
What is meaningful research and how should we measure it?
In: Scientometrics 125 (2020), p. 153-169
ISSN: 0138-9130
DOI: 10.1007/s11192-020-03649-5
BiCoN: Network-constrained biclustering of patients and omics data
In: Bioinformatics 37 (2020), p. 2398-2404
ISSN: 1367-4803
DOI: 10.1093/bioinformatics/btaa1076
URL: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btaa1076/6050718?guestAccessKey=e2e9cb2b-ac3d-44a6-abc7-2b92ae15d34c
Individuating Possibly Repurposable Drugs and Drug Targets for COVID-19 Treatment through Hypothesis-Driven Systems Medicine Using CoVex
In: Assay and Drug Development Technologies 18 (2020), p. 348-355
ISSN: 1540-658X
DOI: 10.1089/adt.2020.1010
Exploring the SARS-CoV-2 virus-host-drug interactome for drug repurposing
In: Nature Communications 11 (2020), Article No.: 3518
ISSN: 2041-1723
DOI: 10.1038/s41467-020-17189-2
2019
GEDLIB: A C++ Library for Graph Edit Distance Computation
12th IAPR-TC15 Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2019 (Tours, 19/06/2019 - 21/06/2019)
In: Donatello Conte, Jean-Yves Ramel, Pasquale Foggia (ed.): Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2019
DOI: 10.1007/978-3-030-20081-7_2
2018
Ring based approximation of graph edit distance
Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition, SSPR 2018 and Statistical Techniques in Pattern Recognition, SPR 2018 (Beijing, 17/08/2018 - 19/08/2018)
In: Edwin R. Hancock, Tin Kam Ho, Battista Biggio, Richard C. Wilson, Antonio Robles-Kelly, Xiao Bai (ed.): Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2018
DOI: 10.1007/978-3-319-97785-0_28
Quasimetric Graph Edit Distance as a Compact Quadratic Assignment Problem
24th International Conference on Pattern Recognition, ICPR 2018 (Beijing, 20/08/2018 - 24/08/2018)
In: Proceedings - International Conference on Pattern Recognition 2018
DOI: 10.1109/ICPR.2018.8546055
Improved lower bounds for graph edit distance
In: IEEE Transactions on Knowledge and Data Engineering 30 (2018), p. 503-516
ISSN: 1041-4347
DOI: 10.1109/TKDE.2017.2772243
Finding k-dissimilar paths with minimum collective length
26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2018 (Seattle, WA, 06/11/2018 - 09/11/2018)
In: Li Xiong, Roberto Tamassia, Kashani Farnoush Banaei, Ralf Hartmut Guting, Erik Hoel (ed.): GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems 2018
DOI: 10.1145/3274895.3274903
2017
Correcting and speeding-up bounds for non-uniform graph edit distance
33rd IEEE International Conference on Data Engineering, ICDE 2017 (San Diego, CA, 19/04/2017 - 22/04/2017)
In: Proceedings - International Conference on Data Engineering 2017
DOI: 10.1109/ICDE.2017.57
Exact computation of graph edit distance for uniform and non-uniform metric edit costs
11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2017 (Anacapri, ITA, 16/05/2017 - 18/05/2017)
In: Pasquale Foggia, Mario Vento, Cheng-Lin Liu (ed.): Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2017
DOI: 10.1007/978-3-319-58961-9_19
Experts for reviewing scientific journals
- Proceedings of the National Academy of Sciences of the United States of America
since 20/03/2023
- Nature Reviews Nephrology
since 03/01/2023
- Briefings in Bioinformatics
since 31/10/2022
- Genome Research
since 17/08/2022
- Computational and Structural Biotechnology Journal
since 22/06/2022
- Bioinformatics (Oxford, England)
since 05/11/2021
- Frontiers in Medicine
since 03/09/2021
- Heliyon
since 01/01/2021
- Assay and Drug Development Technologies
since 01/01/2021
- Nature Communications
since 01/01/2021
- Information Sciences
since 01/01/2020
- European Journal of Operational Research
since 01/01/2020
- Pattern Analysis and Applications
since 01/01/2020
- BioData Mining
since 01/01/2020
- Information Systems Frontiers
since 01/01/2019
- Pattern Recognition Letters
since 01/01/2019
Experts for funding organisations
- Dutch Research Council (NWO)
since 03/10/2022
- Ministerium für Kultur und Wissenschaft des Landes Nordrhein-Westfalen
since 01/10/2021
- TUM Global Postdoc Fellowship
since 01/01/2021
- The Research Council of Norway
since 01/01/2021
Other expert activities (FAU-external)
- Programme committee member of the workshop on Network Science and Artificial Intelligence for Biomedicine & Health Informatics at the International Conference on Bioinformatics and Biomedicine (IEEE BIBM)
since 15/09/2022, URL: https://sites.google.com/unicz.it/nefico/home-page
- Programme committee member of the network biology (NetBio) track at the Conference on Intelligent Systems for Molecular Biology (ISMB)
since 05/05/2021, URL: http://cosi.iscb.org/wiki/NetBio:Home
Other activities (FAU-external)
- 2022: Best Poster Award (German Conference on Bioinformatics 2022)
- 2022: Best Paper Award (26th European Conference on Advances in Databases and Information Systems)
- 2021: Best PhD Student Award (Faculty of Computer Science, Free University of Bozen-Bolzano)
- 2021: Best Talk Award (29th Conference on Intelligent Systems for Molecular Biology, Network Biology Community of Special Interest)
- 2017: Best Paper Award (11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition)
- 2015: MSc Award (Institute of Mathematics, Technical University of Berlin)
- 2015: PhD Grant (Faculty of Computer Science, Free University of Bozen-Bolzano)
- 2007: Student Scholarship (German Academic Scholarship Foundation)
2025
-
BZKF Translationsgruppe - Determination of residual disease in AML using AI-supported analysis of flow cytometry data
(Third Party Funds Single)
Term: 01/01/2025 - 31/12/2026
Funding source: other funding organisation
2024
-
Federated network medicine for laboratory data in paediatric oncology
(Third Party Funds Group – Overall project)
Term: 01/11/2024 - 31/10/2026
Funding source: BMBF / collaborative project
In FLabNet, we will harness the potential of algorithmic network biology and distributed machine learning to address two exemplary unmet needs in paediatric oncology: prediction of chemotherapy side effects like neutropenic fever and early-stage detection of rare malignant diseases such as myeloproliferative neoplasms. Based on >54 million laboratory test results from >500,000 patients from the Core Dataset of the German Medical Informatics Initiative (MII), we will create personalised networks, where nodes represent individual laboratory measurements and edges encode patient-specific relationships. We hypothesise the emerging personal graph representations to capture the unique spectra and dependencies of the individual patients’ health and disease characteristics. The networks will be used as signatures for label-efficient graph-based predictors such as graph kernels; and we will provide privacy-preserving federated implementations of our predictors that are fully interoperable with MII standards. To achieve its objectives, our consortium combines expertise in algorithmic systems biology (FAU), paediatric oncology (UKER), quantitative analysis of laboratory data (UKER), federated learning for biomedicine (Bitspark GmbH & FAU), and professional software development (Bitspark GmbH). These synergistic skill sets will enable us to combine laboratory diagnostics, computational systems medicine, and privacy-preserving machine learning, advancing the state of the art in quantitative analysis of laboratory data for precision medicine in paediatric oncology and beyond.
-
High-resolution protein-protein interaction networks for biomedical research
(Third Party Funds Group – Overall project)
2023
-
AI4MDD: AI-Powered Prognosis of Treatment Response in Major Depression Disorder
(Third Party Funds Single)
Term: 01/07/2023 - 30/09/2026
Funding source: Industry
-
Detection of ALS-Specific Protein Profiles in Multi-Antigen Analysis Imaging Data
(Third Party Funds Single)
Term: 01/03/2023 - 31/08/2023
Funding source: Industry
-
Dimensionality reduction for molecular data based on explanatory power of differential regulatory networks
(Third Party Funds Group – Overall project)
Term: 01/03/2023 - 28/02/2026
Funding source: Bundesministerium für Bildung und Forschung (BMBF)
URL: https://www.netmap.ai/
Rapid advances in single-cell RNA sequencing (scRNA-seq) technology are leading to ever-increasing dimensions of the generated molecular data, which complicates data analyses. In NetMap, new scalable and robust dimensionality reduction approaches for scRNA-seq data will be developed. To this end, dimensionality reduction will be integrated into a central task of the systems medicine analysis of scRNA-seq data: inference of gene regulatory networks (GRNs) and driver transcription factors based on cell expression profiles. Each resulting dimension will correspond to a driver GRN, and the coordinate of a cell in this low-dimensional representation will quantify the extent to which the particular driver GRN explains the cell's gene expression profile. These new methods will be implemented as a user-friendly software platform for exploratory expert-in-the-loop analysis and in silico prediction of drug repurposing candidates.
As a case study, we will investigate CD4 helper T cell exhaustion, a potential limiting factor in immunotherapy. NetMap's strategy consists of (1) analyzing phenotypic heterogeneity of exhausted CD4 T cells, (2) identifying transcriptional mechanisms that control this heterogeneity, (3) amplifying/eliminating specific subsets and testing their functional impact. This will allow the development of an atlas of the gene regulatory landscape of exhausted CD4 T cells, while the in vivo testing of key regulatory transcription factors will help demonstrate the power of the developed methods and allow evaluation and improvement of predictions.
-
A Platform for Dynamic Exploration of the Cooperative Health Research in South Tyrol Study Data via Multi-Level Network Medicine
(Third Party Funds Single)
Term: 01/12/2023 - 30/11/2026
Funding source: Deutsche Forschungsgemeinschaft (DFG)
URL: https://www.dyhealthnet.ai/
The Cooperative Health Research in South Tyrol (CHRIS) study offers a comprehensive overview of the health state of >13,000 adults in the middle and upper Val Venosta. It is the largest population-based molecular study in Italy with a longitudinal lookout to investigate the genetic and molecular basis of age-related common chronic conditions and their interaction with lifestyle and environment in the general population. In CHRIS, the combination of molecular profiling data, such as genomics and metabolomics, together with important baseline clinical and lifestyle data offers vast opportunities for understanding physiological changes that could lead to clinical complications or indicate the prevalence or early onset of diseases together with their molecular underpinnings.
Where disease-focused studies often have a clear hypothesis that dictates the necessary statistical analyses, population-based cohorts such as CHRIS are more versatile and allow both testing existing hypotheses as well as generating new hypotheses that arise from statistically significant associations of the available data. Ideally, this type of explorative analysis is open to biomedical researchers that do not necessarily have experience with data analysis or machine learning. Network-based approaches are ideally suited for studying heterogeneous biomedical data, giving rise to the field of network medicine. However, network medicine techniques have so far mainly been used in the context of studies focusing on individual diseases. Network-based platforms for the explorative analysis of population-based cohort data do not exist.
In DyHealthNet, we will close this gap and develop a network-based data analysis platform, which will integrate heterogeneous data and support explorative data analytics over dynamically generated subsets of the CHRIS study data. To fully leverage the potential of the available multi-level data, the DyHealthNet platform combines (1) data integration using standardized medical information models (HL7 FHIR), (2) innovative index structures for scalable dynamic analysis, (3) machine learning, and (4) visual analytics. DyHealthNet will render the CHRIS population cohort data accessible for state-of-the-art privacy-preserving, network-based data analysis. DyHealthNet will hence enable mining of context-specific pathomechanisms for precision medicine, and will serve as a blueprint for dynamic explorative analysis of multi-level cohort data worldwide. To achieve these objectives, the DyHealthNet consortium combines expertise in population-based cohort studies (Fuchsberger) and in the development of complex algorithms for the analysis of molecular networks (Blumenthal), applied biomedical AI and software systems (List), and customized index structures for scalable data management (Gamper).
2022
-
Unsupervised Network Medicine for Longitudinal Omics Data
(FAU Funds)
Term: since 15/01/2022
Over the last years, large amounts of molecular profiling data (also called “omics data”) have become available. This has raised hopes to identify so-called disease modules, i.e., sets of functionally related molecules constituting candidate disease mechanisms. However, omics data tend to be overdetermined and noisy; and modules identified via purely statistical means are hence often unstable and functionally uninformative. Hence, network-based disease module mining methods (DMMMs) project omics data onto biological networks such as protein-protein interaction (PPI) networks, gene regulatory networks (GRNs), or microbial interaction networks (MINs). Subsequently, network algorithms are used to identify disease modules consisting of small subnetworks. This dramatically decreases the size of the search space and prioritizes disease modules consisting of functionally related molecules, positively affecting both stability and functional relevance of the discovered modules.
However, to the best of our knowledge, all existing DMMMs are subject to at least one of the following two limitations: Firstly, existing DMMMs are typically supervised, in the sense that they try to find subnetworks explaining differences in the omics data between predefined case and control patients or predefined disease subtypes. This is potentially problematic, because it implies that existing DMMMs are biased by our current disease ontologies, which are mostly symptom- or organ-based and therefore often too coarse-grained. For instance, around 95 % of all patients with hypertension are diagnosed with so-called “essential hypertension” (code BA00.Z in the ICD-11 disease ontology), meaning that the cause of the hypertension is unknown. In fact, there are probably several disjoint molecular mechanisms causing “essential hypertension”, and the same holds true for many other complex diseases such as Alzheimer’s disease, multiple sclerosis, and Crohn’s disease. Supervised DMMMs which take existing disease definitions for granted hence risk overlooking the molecular mechanisms causing mechanistically distinct subtypes.
Secondly, most existing DMMMs are designed for static omics data and do not support longitudinal data where the patients’ molecular profiles are observed over time. Existing analysis frameworks for longitudinal omics data largely use purely statistical means. Consequently, network medicine approaches for time series data are needed.
To the best of our knowledge, there are only three DMMMs which, in part, overcome these limitations: BiCoN and GrandForest allow unsupervised disease module mining but do not support longitudinal omics data. TiCoNE supports longitudinal data but requires predefined case vs. control or subtype annotations as input. There is hence an unmet need for unsupervised DMMMs for longitudinal omics data. Developing such methods is the main objective of the proposed project.