Mapping of molecular networks across diseases, cell states and species. The Ideker Lab has long promoted strategies to experimentally map and analyze the molecular networks that encode biological function. This interest began during Dr. Ideker’s PhD, when Leroy Hood and Dr. Ideker outlined the “Systems Biology” approach to the study of biological systems by “perturbing them systematically; monitoring the global response at multiple levels (gene, protein, metabolite); and formulating network models that describe the structure of the system and its response to perturbation”. Network mapping has continually driven the research agenda of the laboratory through our interest in complementary approaches, especially mapping protein interaction networks by affinity purification mass spectrometry or yeast two-hybrid, and mapping epistatic genetic interactions by combinatorial gene knockout. Recently, we have participated in large team efforts to map the physical interaction landscapes of multiple cancer types and SARS-CoV-2. We are also making exciting progress in integrating protein interaction networks with protein immunofluorescence images to reconstruct most human cell components and chart new ones.
Using molecular knowledge networks to interpret genetic alterations in cancer. Complex genetic diseases like cancer often have multiple subtypes with distinct causes and clinical outcomes. Genome sequences provide a rich source of data for recognizing and classifying subtypes in a patient population, but genomes have proven difficult to compare because two patients rarely share the same mutations. To address this challenge, in 2007 the Ideker Lab developed the concept of network biomarkers, by which heterogeneous coding and non-coding alterations are integrated by common signaling and transcriptional networks. The key idea is that complex diseases are hard to analyze because they can involve different genetic causes in different patients, but these causes often converge at levels of organization higher than individual genes, as captured by molecular networks. The network biomarkers approach was initially applied to classify patient subtypes in chronic lymphocytic leukemia and was translated to the clinic in 2012. This success stimulated a progression of work from the lab expanding the set of tools for identifying biomarkers as networks rather than as individual genes or proteins. Notable later work in this area includes network-based stratification (NBS), a method that stratifies a tumor population into informative subtypes by clustering together patients with mutations in common network regions. Collectively, these works have led to further studies by us and many other research groups, who have advanced the methods or networks underlying network-based biomarkers for a variety of diseases.
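As a rough illustration of why mutations in a common network region can group patients who share no mutated gene, the following Python sketch smooths each patient's mutation profile over a toy network and then compares patients. It is a minimal sketch under stated assumptions, not the published NBS implementation: the six-gene network, the restart parameter `alpha`, the function names, and the use of a simple cosine comparison in place of a full clustering step are all hypothetical choices made for illustration.

```python
import numpy as np

def propagate(mutations, adjacency, alpha=0.7, n_iter=50):
    """Spread each patient's mutation signal over the network.

    Random walk with restart (an assumed smoothing scheme): at each step a
    fraction `alpha` of the signal moves to neighboring genes and the rest
    is reset to the patient's original mutation profile.
    """
    degree = adjacency.sum(axis=1, keepdims=True)
    walk = adjacency / degree              # row-normalized transition matrix
    smoothed = mutations.astype(float)
    for _ in range(n_iter):
        smoothed = alpha * smoothed @ walk + (1 - alpha) * mutations
    return smoothed

def cosine(u, v):
    """Cosine similarity between two profiles."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy network (hypothetical): genes 0-2 form one region, genes 3-5 another,
# joined by a single bridging edge between genes 2 and 3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

# Three patients, each with a single mutated gene; no two share a mutation.
mutations = np.array([
    [1, 0, 0, 0, 0, 0],   # patient 0: gene 0
    [0, 1, 0, 0, 0, 0],   # patient 1: gene 1 (same region as patient 0)
    [0, 0, 0, 0, 0, 1],   # patient 2: gene 5 (other region)
])

smoothed = propagate(mutations, adjacency)

raw_sim = cosine(mutations[0], mutations[1])          # no shared mutation
same_region = cosine(smoothed[0], smoothed[1])        # after smoothing
other_region = cosine(smoothed[0], smoothed[2])       # after smoothing
print(raw_sim, same_region, other_region)
```

On the raw profiles all three patients look equally dissimilar; after smoothing, patients 0 and 1 become more similar to each other than either is to patient 2, which is the property a downstream clustering step would exploit.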
Visible Machine Learning. We are developing “visible” machine learning approaches to model the flow of genetic information, in which predictive models learn not only to translate genotype to phenotype but also to identify the molecular functions and mechanisms by which these predictions are made. The central concept is to couple the structure of a machine learning model to the structure and function of a target biological system, creating what we have called visible neural networks (VNNs). Whereas a typical predictive model is not readily interpretable because of its many hidden variables and states (a “black box”), VNNs can be more directly inspected to reveal the molecular and cellular events responsible for each prediction. They provide a framework for interpretable deep learning, combining the generality, scale and power of neural networks with a biomechanistic understanding. These concepts led to DCell, a deep neural network that models approximately 3,500 subsystems of a budding yeast cell and accurately translates genotypes to growth phenotypes. Using DCell as a foundation, we created deep neural networks of cancer that predict response to therapy based on the tumor genotype (i.e., its profile of genetic markers) and the drug formula. Visible deep learning models open up the possibility of addressing a host of biomedical questions that have so far been recalcitrant to machine learning, with significant applications in cancer genomics.
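The coupling of model structure to biological structure described above can be sketched in Python as a layer whose connections are masked by gene-to-subsystem membership. This is only an illustrative toy, not the DCell architecture: the four genes, the two subsystems, the membership matrix, the random weights, the `tanh` activation, and the convention that 1 marks a disrupted gene are all assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

genes = ["g1", "g2", "g3", "g4"]          # hypothetical gene names
subsystems = ["S_a", "S_b"]               # hypothetical subsystem names

# membership[s, g] = 1 if gene g belongs to subsystem s (assumed hierarchy).
membership = np.array([
    [1, 1, 0, 0],    # S_a contains g1, g2
    [0, 0, 1, 1],    # S_b contains g3, g4
])

# Gene-to-subsystem weights exist only where membership allows them, so each
# hidden unit corresponds to a named subsystem instead of an anonymous neuron.
weights = rng.normal(size=membership.shape) * membership
root_weights = rng.normal(size=len(subsystems))   # subsystems -> phenotype

def forward(genotype):
    """Return (subsystem states, predicted phenotype) for a genotype vector."""
    states = np.tanh(weights @ genotype)
    phenotype = float(root_weights @ states)
    return states, phenotype

# Genotype encoding assumed here: 1 = gene disrupted, 0 = wild type.
wild_type = np.zeros(len(genes))
knockout_g1 = np.array([1.0, 0.0, 0.0, 0.0])

wt_states, wt_phenotype = forward(wild_type)
ko_states, ko_phenotype = forward(knockout_g1)

for name, state in zip(subsystems, ko_states):
    print(name, state)
```

Because of the mask, disrupting `g1` can change only the state of `S_a`; the state of `S_b` stays at its wild-type value, so the predicted phenotype change can be traced back to a specific subsystem by inspecting the hidden states.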
Epigenetic aging. In 2012 we showed that large parts of the methylome are remodeled with age, a process that is accelerated by disease and slowed in certain genotypes and in women versus men. These findings led to the “epigenetic clock” model for predicting the rate of biological aging. We have since reported that these changes are accelerated by viral infection and slowed by anti-aging treatments such as caloric restriction and rapamycin. Most recently, we used epigenetic profiles to translate age between humans and dogs. Comparison of Labrador retriever and human methylomes revealed a nonlinear relationship between dog and human aging that did not follow the conventional wisdom that 1 dog year = 7 human years, leading to a story that was popularized by many news outlets.
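To make the contrast with the 7-to-1 rule concrete, the short Python sketch below compares that linear rule with a logarithmic conversion. The formula used, human age ≈ 16 ln(dog age) + 31, is the fit widely quoted from the published dog methylome study; it is not stated in this text, so the coefficients and the function names should be treated as assumptions, and the fit was derived from Labrador retrievers and may not transfer to other breeds.

```python
import math

def dog_to_human_years(dog_age):
    """Logarithmic conversion (assumed coefficients: 16 * ln(age) + 31)."""
    if dog_age <= 0:
        raise ValueError("dog_age must be positive")
    return 16.0 * math.log(dog_age) + 31.0

def seven_year_rule(dog_age):
    """Conventional linear rule: 1 dog year = 7 human years."""
    return 7.0 * dog_age

# Compare the two conversions at a few dog ages (in years).
for age in (1, 2, 4, 8, 12):
    print(age, round(dog_to_human_years(age), 1), seven_year_rule(age))
```

Under the logarithmic form, the converted age rises steeply in the first years of a dog's life and then flattens, whereas the linear rule adds the same seven years at every age.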
Widely used bioinformatics software. The Ideker Lab is involved in the development of bioinformatic resources that are widely used in biomedical research. The most visible of these is the network analysis platform Cytoscape (www.cytoscape.org). It is a principal tool used by researchers to create and visualize models of molecular interaction networks, with approximately 20,000 downloads per month and >40,000 citations to the original Cytoscape marker paper. More than half of these citations have been added in the past five years, underscoring the continued relevance of the software and making the marker paper the most highly cited work in the journal Genome Research. The platform includes an App Store for third-party Cytoscape analysis tools, with >350 such plugins or “Apps”, and a cloud-based storage system for networks. Finally, we have been working on significant new Cytoscape functionality for detecting communities of proteins (in protein-protein interaction networks) or cells (in single-cell RNA sequencing data).