With the advent of high-throughput
technologies, data in molecular biology is accumulating at a
staggering rate. While this flood of data holds much promise, the
computational methodology needed to analyze such data remains a
challenging bottleneck.
My objective is to understand the architecture of molecular
networks and elucidate a global view of how the molecular network
processes combinations of signals to compute and execute a concerted
cellular decision and response. I approach this goal with a
computational toolbox and develop methods that integrate and explain
data from diverse high-throughput technologies to uncover novel insights
into the workings of living cells and organisms.
Understanding “cellular computation” requires knowledge of the network
architecture and the influences among its components. To address
this challenge, I take a hierarchical modeling approach. At the
finest resolution, I develop models that infer the detailed connectivity
and influences between network components. In this talk, I will
demonstrate how we applied Bayesian networks to the automated derivation
of causal influences in signaling networks. This relied on
state-of-the-art technology that simultaneously measures the levels of
multiple signaling components in thousands of individual human cells.
Our method automatically discovered, de novo, most of the traditionally
established influences between the measured signaling components, as
well as novel inter-pathway crosstalk, which we verified
experimentally. A key distinction of our approach is the
use of single-cell measurements, thus avoiding population averaging,
which often masks true activities.
However, molecular networks are complex, consisting of a web of
thousands of components interacting via a wide range of molecular
mechanisms. To address this challenge, I develop probabilistic
models that integrate heterogeneous data and exploit biological
principles such as modularity for obtaining efficient representations
and learning algorithms. I will present a probabilistic method
that integrates genotype and gene expression data for discovering
regulatory modules: sets of genes that are controlled by the same
regulatory program. Our method automatically identifies the genes
comprising each module and the regulatory program controlling their
behavior. Additionally, our method provides fine-grained
mechanistic explanations for the effect of individual genetic variation
on regulatory interactions. By considering the system as a whole,
our analysis offers some insights into how genetic variation adapts the
regulatory network to different environments.