A Causal Network Approach to Understanding Transcription and Methylation in Breast Cancer

Project Team: Audrey Fu, Md. Bahadur Badsha, Evan Martin

Complex diseases often involve changes in DNA sequence, in transcription, and in DNA methylation, an epigenetic process that can both regulate and be regulated by gene expression. These changes can produce a wide range of symptoms or multiple subtypes of the same disease. In breast cancer, for example, different patterns of gene expression and DNA methylation characterize subtypes that vary in tumor progression and treatment. To develop more effective treatments for different subtypes, it is necessary to understand the genes and processes (i.e., transcription and methylation) that drive the differences between subtypes. It is therefore of great interest to understand how genetic variation influences disease through gene regulatory networks. Unfortunately, identification of the genes and processes that are key to a disease is often compromised because inference is based on correlation, not causation.

Our long-term goal is to develop computational methods to infer gene regulatory networks that are potentially causal for multiple clinical phenotypes using genomic and clinical data of complex diseases. In this project, we will develop new statistical approaches based on the principle of Mendelian randomization to systematically identify regulatory networks involving both transcription and methylation that are potentially causal for disease subtype. We will use breast cancer as the disease model and apply our methods to genomic data. The principle of Mendelian randomization assumes that the alleles of a genetic variant are randomly assigned to individuals in a population, analogous to a natural randomization experiment. This principle has gained increasing attention in genomics, given its power to separate correlation due to causation from correlation not due to causation.
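The principle can be illustrated with a minimal sketch on synthetic data. This is not the project's method: it uses the simple Wald ratio estimator from instrumental-variable analysis, and every name and number in it (`simulate`, `wald_ratio`, the allele frequency, the effect sizes) is a hypothetical choice made for illustration. A genotype acts as the randomized instrument, an unobserved confounder drives both the exposure (for example, expression of a gene) and the outcome, and the ratio of the two genotype covariances recovers the causal effect that a naive regression overstates.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n, true_effect, seed=0):
    """Generate synthetic genotype, exposure, and outcome values.

    All coefficients here are illustrative assumptions, not estimates
    from any real dataset.
    """
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        # Allele count (0, 1, or 2), assigned at random as Mendelian
        # randomization assumes.
        g = sum(rng.random() < 0.3 for _ in range(2))
        u = rng.gauss(0, 1)  # unobserved confounder
        x = 0.8 * g + u + rng.gauss(0, 1)  # exposure
        y = true_effect * x + 1.5 * u + rng.gauss(0, 1)  # outcome
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    return genotype, exposure, outcome


def naive_slope(exposure, outcome):
    """Regression slope of outcome on exposure; biased by the confounder."""
    return covariance(exposure, outcome) / covariance(exposure, exposure)


def wald_ratio(genotype, exposure, outcome):
    """Instrumental-variable estimate: cov(G, Y) / cov(G, X)."""
    return covariance(genotype, outcome) / covariance(genotype, exposure)


if __name__ == "__main__":
    g, x, y = simulate(20000, true_effect=0.5)
    print("naive slope:", naive_slope(x, y))
    print("Wald ratio: ", wald_ratio(g, x, y))
```

In this toy setup the naive slope absorbs the confounder's contribution, while the Wald ratio stays close to the simulated effect because the genotype is independent of the confounder.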

The models and algorithms developed here will allow us to make causal statements about the two processes at the single gene level and account for confounding variables, which similar studies have not examined. These methods will help to identify key genes for specific breast cancer subtypes and elucidate the roles of transcription and methylation when many genes are involved, offering insights into genes and processes that could better inform subtype classification, cancer diagnosis and development of novel drug targets. These methods are not limited to breast cancer but are applicable to complex diseases in general.

Mountain West Mine Tailings, Watersheds and Adverse Human Health Outcomes

Project Team: Alan Kolok, Lucas Sheneman, Chantal Vella

The long-term goal of this program is to model the associations among metal contamination (as a consequence of mining), watershed geography and adverse human health impacts across the Rocky Mountains. In this pilot project, we will focus on generating a predictive classifier model that includes data from Oregon, Washington, Idaho and Western Montana. Our central hypothesis is that geospatial models that incorporate the occurrence of metal contamination in large watersheds can be predictive of adverse health outcomes, including birth defects, pediatric cancers and cardiovascular disease. To test this hypothesis, the following aims will be addressed.

Aim 1: derive a collection of interoperable digital map layers of the northwestern United States that effectively integrate adverse health outcomes and hydrologic units (watersheds).
Aim 2: apply supervised machine learning methods to the data layers derived in Aim 1 to build and train a spatially explicit classifier model that discretely categorizes mountain west hydrologic regions by estimated relative health risk, correlating adverse health outcomes with identified hydrologic units.

We will comprehensively evaluate the data available from Public Health Departments in Idaho, Oregon, Washington and Montana. We will also acquire data on premature mortality from the National Vital Statistics System via the publicly available CDC WONDER database. Data on the prevalence of pediatric cancer and birth defects will be gathered from state registries, where available. Spatially nested hydrologic unit (watershed) maps at varying scales (regions, sub-regions, accounting units, cataloging units, etc.) will be harvested from the publicly available USGS Watershed Boundary Dataset (WBD). All combined source watershed and public health data will be centrally stored, catalogued, transformed, and managed in collaboration with the Northwest Knowledge Network (NKN) at UI.

A discrete classifier system will be produced in Esri ArcGIS and/or R using the spatially transformed health data from Aim 1. The trained classifier will assign a relative health-risk label, on a gradient ranging from low to high, to uniquely identified hydrologic units at multiple scales (using USGS HUC naming conventions). The end result of applying an effective trained classifier across the full input dataset will be an efficiently derived geospatial data layer that estimates and discretely labels overall relative human health risk within identified watershed boundaries.
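The labeling step can be sketched as follows. This is only an illustration under stated assumptions: the project names Esri ArcGIS and/or R as its tools and does not specify a classifier, so the sketch swaps in a nearest-centroid classifier written in plain Python. The two features per hydrologic unit, the unit identifiers, and all values are hypothetical placeholders rather than real HUC codes or health data.

```python
import math
from collections import defaultdict


def fit_centroids(features, labels):
    """Compute the mean feature vector (centroid) for each risk label."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + v for s, v in zip(sums[label], row)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))


def label_units(centroids, units):
    """Assign a risk label to every hydrologic unit, keyed by unit id."""
    return {unit_id: predict(centroids, row) for unit_id, row in units.items()}


if __name__ == "__main__":
    # Hypothetical training data: two scaled features per unit, for example
    # a contamination index and an adverse-outcome rate (both assumed).
    train_features = [[0.1, 0.2], [0.2, 0.1], [0.5, 0.5],
                      [0.6, 0.4], [0.9, 0.8], [0.8, 0.9]]
    train_labels = ["low", "low", "medium", "medium", "high", "high"]
    centroids = fit_centroids(train_features, train_labels)

    # Placeholder unit identifiers, not real USGS HUC codes.
    unlabeled = {"unit-A": [0.05, 0.10], "unit-B": [0.95, 0.90]}
    print(label_units(centroids, unlabeled))
```

The output is a mapping from unit identifier to a discrete label, which is the shape of result a labeled geospatial layer would need regardless of which classifier is ultimately used.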

Modeling and Evaluation of Physical Therapy Movements Using Machine Learning

Project Team: Aleksandar Vakanski, Stephen Lee, David Paul, Russell Baker, Hyung-Pil Jun

The long-term goal of this project is to develop a commercial system for evaluation of physical therapy and rehabilitation exercises performed by patients in a home-based setting. The system will employ a machine learning approach for evaluation of patient movements, based on input data related to the trajectories of the body parts of a patient captured with a Kinect vision sensor.

The objective is to develop the necessary methodology for mathematical modeling and evaluation of physical therapy movements. The three specific aims of this project are:

  1. Create the first comprehensive dataset of human movements related to physical therapy;
  2. Develop a novel methodology for mathematical modeling of therapy movements based on deep neural networks; and
  3. Define a set of metrics for therapy performance evaluation.

The academic novelty of the research is related to utilizing deep artificial neural networks for mathematical modeling of therapy movement at multiple levels of abstraction. This will be achieved by employing a network architecture with multiple layers of convolutional computational units for encoding data patterns at different levels of abstraction, combined with layers of recurrent computational units for encoding the temporal correlations of the extracted data features. Such mathematical models furnish a potential for robust algorithmic evaluation of movement sequences performed by patients with respect to a prescribed model of the movements. The project team will also create a comprehensive dataset of therapy movements recorded with multiple sensory systems, which will be made available to the public, and can serve as a benchmark for similar future research.
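The layer arrangement described above can be sketched as a forward pass in plain Python. This is an untrained toy, not the project's network: the layer sizes, the random weights, and the function names (`conv1d_relu`, `recurrent_encode`, `encode_movement`) are assumptions made for illustration. A temporal convolution extracts local patterns from a sequence of joint-position frames, and a simple Elman-style recurrent layer then summarizes the resulting feature sequence over time.

```python
import math
import random


def conv1d_relu(sequence, kernels, biases):
    """Temporal convolution (no padding) followed by ReLU.

    sequence: list of T frames, each a list of D features.
    kernels:  list of C filters, each a list of K rows with D weights.
    Returns a list of T - K + 1 frames, each with C features.
    """
    width = len(kernels[0])
    output = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        frame = []
        for kernel, bias in zip(kernels, biases):
            total = bias
            for k_row, x_row in zip(kernel, window):
                total += sum(w * x for w, x in zip(k_row, x_row))
            frame.append(max(0.0, total))
        output.append(frame)
    return output


def recurrent_encode(sequence, w_in, w_rec, bias):
    """Elman-style recurrent layer; returns the final hidden state."""
    hidden = [0.0] * len(bias)
    for frame in sequence:
        hidden = [
            math.tanh(
                sum(w * x for w, x in zip(w_in[j], frame))
                + sum(w * h for w, h in zip(w_rec[j], hidden))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden


def random_matrix(rows, cols, rng, scale=0.1):
    """Matrix of small random weights (stand-in for trained parameters)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def encode_movement(sequence, channels=4, width=3, hidden_size=5, seed=0):
    """Convolutional layer followed by a recurrent layer, random weights."""
    rng = random.Random(seed)
    dims = len(sequence[0])
    kernels = [random_matrix(width, dims, rng) for _ in range(channels)]
    conv_out = conv1d_relu(sequence, kernels, [0.0] * channels)
    w_in = random_matrix(hidden_size, channels, rng)
    w_rec = random_matrix(hidden_size, hidden_size, rng)
    return recurrent_encode(conv_out, w_in, w_rec, [0.0] * hidden_size)
```

A call such as `encode_movement(frames)`, where `frames` is a list of per-frame joint coordinates, returns a fixed-length vector regardless of sequence length. In a trained model, vectors like this could be compared between a patient's movement and a prescribed reference movement.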

The significance of the project is in proposing a cost-effective tool to benefit millions of patients undertaking physical therapy by continuously monitoring their progress toward recovery, encouraging their treatment plan compliance, and stimulating their engagement in the rehabilitation effort.

Modeling Variability in Persistence Induced From Within by a Toxic Metabolite

Project Team: Christopher Marx (PI), Andreas Vasdekis (Co-PI), Chris Remien, Siavash Riazi, Denis Liyu

Multidrug antibiotic persistence, which allows some cells that lack genetic resistance to survive antibiotic stresses by becoming dormant, is a major public health concern. This exploratory project will use data from state-of-the-art image cytometry and single-cell analysis in combination with mechanistic mathematical modeling to study the formaldehyde-sensing network that was recently discovered in Methylobacterium by the Marx Lab. This network shares many characteristics with antibiotic persistence, but has two advantages: the factors governing the transition from growth to stasis can be manipulated externally, and all the cells in a population go dormant. The research team will develop mathematical models in combination with relevant experimentation 1) to study the ability of the biochemical network to allow for distinct cell fate outcomes as a function of key parameters such as protein levels of EfgA, and 2) to analyze how stochasticity in the form of spontaneous fluctuations in protein levels, which can lead to a potentially toxic pulse of formaldehyde, influences cell transitions between phenotypes such as growth, death, or persistence. This pilot grant will position the researchers to explore fundamental processes associated with antimicrobial resistance, which if eventually manipulated could prevent disease and promote health.
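Spontaneous fluctuations in protein levels of the kind mentioned above are commonly explored with stochastic simulation. The sketch below is a generic illustration, not the project's model: it applies the Gillespie algorithm to a simple birth-death process in which a protein is produced at a constant rate and each copy degrades independently. The rates, time units, and function names are hypothetical, and the actual formaldehyde-sensing network would involve more species and reactions.

```python
import random


def simulate_protein_counts(production_rate, degradation_rate, duration,
                            initial_count=0, seed=0):
    """Gillespie simulation of a birth-death process.

    Returns a list of (time, count) pairs, one per reaction event,
    starting with the initial state at time 0.
    """
    rng = random.Random(seed)
    time, count = 0.0, initial_count
    trajectory = [(time, count)]
    while True:
        birth = production_rate
        death = degradation_rate * count
        total = birth + death
        if total == 0:
            break  # no reaction can fire
        # Waiting time to the next event is exponentially distributed.
        time += rng.expovariate(total)
        if time > duration:
            break
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total < birth:
            count += 1
        else:
            count -= 1
        trajectory.append((time, count))
    return trajectory


def time_averaged_count(trajectory, duration):
    """Average copy number over [0, duration], weighted by time spent."""
    total = 0.0
    for (t0, n0), (t1, _) in zip(trajectory, trajectory[1:]):
        total += n0 * (t1 - t0)
    last_time, last_count = trajectory[-1]
    total += last_count * (duration - last_time)
    return total / duration
```

For this process the long-run mean copy number equals the production rate divided by the degradation rate, while individual trajectories wander around that mean. Excursions of this kind are what a threshold-based cell-fate model would respond to.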

Multi-Scale Model of Interactions Between Lung and Pulmonary Ventilation

Project Team: Tao Xing (PI), Gordon Murdoch (Co-PI), Michelle Wiest, Loel Fenwick, Rabijit Dutta

Without adequate respiration, life ceases in as little as three minutes. The failure of effective spontaneous respiration requires immediate intervention to preserve life. However, lungs are far more complicated than simple bags at the ends of tubes. Life support or therapeutic treatment of injured or diseased lungs is frequently needed. Yet the current understanding of the most effective, reliable, and safe pulmonary ventilation methodology is sorely lacking, for both conventional mechanical (closed-circuit) ventilation and the more recent flow (open-circuit) ventilation by Percussionaire Corp. The goal of this project is to develop and validate a multi-scale model for understanding and optimizing the interaction between lungs and pulmonary ventilation, including the key mechanisms impacting the effectiveness of pulmonary ventilation at the organ, tissue, cellular, and molecular scales.

Additionally, the incorporation and integration of hardware can reliably yield empirical experimental data throughout the conducting components of the respiratory anatomy (buccal cavity, trachea, bronchi, bronchioles) as well as the respiratory anatomy responsible for effective gas exchange (inferior bronchioles and alveoli). The model, once validated using the experimental data, will provide a reliable simulation-based design toolbox to evaluate comparative profiles of various methods of rescue/supplemental ventilation, which are critical for extrapolating the efficacy of biomedical instrumentation that is designed to reduce mortality associated with respiratory diseases and/or damaged or physiologically compromised lungs. It will provide a virtual three-dimensional laboratory for in silico study of various lung diseases, facilitating personalized flow ventilation technologies. It will fundamentally change the way researchers design respiration devices, with simulation-based modeling applied toward a next-generation expert system for the optimization of pulmonary ventilation.

If successful, the proposed innovation would help save the lives of those with acute respiratory distress and extend and improve the lives of those with chronic pulmonary conditions. Moreover, it will generate scientific data that can be used to provide patient- and condition-specific therapy.

Tackling Critical Issues in the Ebola Epidemic through Modeling Viral Evolution