Collaborative Research: Deep-Sequencing Analysis of Edited Metabolic Pathways to Uncover, Model, and Overcome the Epistatic Constraints Upon Optimization

Project Team: Christopher J. Marx, Ankur Dalia (Indiana University), Sergey Stolyar, Jeremy Draghi (Brooklyn College), Norma Martinez-Gomez (Michigan State University)

Epistasis, the non-linear interaction between genotypic changes in their effects upon phenotypes, represents a critical challenge to the optimization of biological systems, whether by evolution or by engineering via synthetic biology. When mutational effects upon growth or product generation depend on the genetic background, assessing performance across the entire parameter space of any system of realistic size quickly becomes impossible. This is especially problematic when there is sign epistasis (mutations that change from beneficial to deleterious depending upon the alleles at other loci), as this creates ridges and peaks on the fitness landscape that can restrict stepwise optimization via either synthetic biological changes or beneficial mutations. Kinetic computational models of metabolism can provide guidance, but unfortunately these models are dogged by numerous free parameters. There is an immediate need for two linked developments: empirical techniques that can rapidly generate and assess rational, combinatorial variants, and modeling techniques that incorporate these data and predict where in parameter space further rounds of generation and assessment of variation would be most effective. The test-bed for our novel approach is to optimize the function of the high-efficiency ribulose monophosphate (RuMP) pathway that the team has successfully introduced into the model methanol-consuming organism, Methylobacterium extorquens. First, in vivo gene editing of a plasmid-encoded suite of enzymes will be performed, and deep sequencing will be used to rapidly assess the fitnesses of a quarter-million genotypes with combinatorial variation in nine dimensions of expression. Second, this massive volume of data about epistasis, combined with direct measurement of intracellular metabolite concentrations for select combinations, will be used to infer the numerous parameter values in our kinetic model. Third, the model will be used to predict which regions of parameter space are more or less evolvable, and these regions will be targeted and compared in a second round of editing, fitness assays, and experimental evolution.
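To make the notion of sign epistasis concrete, the following is a minimal sketch for a two-locus system, not the project's analysis pipeline. It assumes fitness is measured on an additive scale (for example, log relative fitness); the function names and the numerical values are hypothetical and chosen only for illustration.

```python
def epistasis(w00, w10, w01, w11):
    """Pairwise epistasis on an additive scale (e.g. log fitness).

    w00: fitness of the reference genotype (assumed scale: log relative fitness)
    w10, w01: fitness of each single mutant
    w11: fitness of the double mutant
    Zero means the two mutations combine additively.
    """
    return w11 - w10 - w01 + w00


def has_sign_epistasis(w00, w10, w01, w11):
    """Return True if either mutation's effect changes sign with the background."""
    # Effect of mutation A without, then with, mutation B present.
    a_alone, a_with_b = w10 - w00, w11 - w01
    # Effect of mutation B without, then with, mutation A present.
    b_alone, b_with_a = w01 - w00, w11 - w10
    return a_alone * a_with_b < 0 or b_alone * b_with_a < 0


if __name__ == "__main__":
    # Hypothetical values: A is beneficial alone but deleterious once B is present.
    print(has_sign_epistasis(0.0, 0.1, 0.2, 0.15))
    # Hypothetical additive case: each effect keeps its sign in both backgrounds.
    print(has_sign_epistasis(0.0, 0.1, 0.2, 0.3))
```

In the first hypothetical case a stepwise path that fixes B first can no longer gain from A, which is the kind of constraint on stepwise optimization described above.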

Small Area Estimation of Obesity-Related Indicators

Project Team: Helen Brown (PI), Christopher Murphy, Chantal Vella, Marco Mesa-Frias, Michelle Wiest

This project aims to develop a new model, or adapt an existing one, to generate small area estimates for Idaho counties of the factors that place individuals at highest risk for obesity (e.g. sedentary behaviors, food insecurity, sugar-sweetened beverage consumption). Knowledge of the local environment is often critical in public health planning and development. In the model, selected demographic characteristics, health conditions, health behaviors, and health status will be estimated to provide a precise geographic picture of the relevant local population. The model will attempt to synthesize a geographically relevant study population in its local context using various datasets. The American Community Survey (e.g. census) will be used to provide the geographic and demographic context at the local level, and it will be linked to regional datasets such as the Behavioral Risk Factor Surveillance System (BRFSS) to provide domain-relevant information, mainly in the context of health-related lifestyles. Prevalence of obesity indicators and other risk factors will be estimated for Idaho counties and local authorities in 2013/2014 to identify the “hot spots”, that is, the counties with the highest risk of obesity.
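One common way to link regional survey data to local demographic counts is a synthetic (post-stratified) estimate: group-level prevalence from the survey is weighted by each county's population in those groups. The sketch below illustrates that idea only; the abstract does not state that the project's model takes this form, and the function name, age groups, and all numbers are hypothetical.

```python
def synthetic_estimate(group_prevalence, county_counts):
    """Post-stratified prevalence estimate for one county.

    group_prevalence: dict mapping a demographic group to the prevalence of a
        risk factor in that group (assumed to come from a regional survey).
    county_counts: dict mapping the same groups to the county's population
        count in each group (assumed to come from census-type data).
    """
    total = sum(county_counts.values())
    if total == 0:
        raise ValueError("county has no population in the listed groups")
    # Expected number of people with the risk factor, summed over groups.
    expected_cases = sum(group_prevalence[g] * n for g, n in county_counts.items())
    return expected_cases / total


if __name__ == "__main__":
    # Hypothetical survey prevalence of a risk factor by age group.
    prevalence = {"18-44": 0.20, "45-64": 0.30, "65+": 0.25}
    # Hypothetical population counts for a single county.
    counts = {"18-44": 500, "45-64": 300, "65+": 200}
    print(synthetic_estimate(prevalence, counts))
```

Counties could then be ranked by the resulting estimates to flag candidate “hot spots”. A full small area model would typically also account for sampling error and county-level effects, which this sketch omits.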

Mathematical Modeling of Nutritional Factors and Bacterial Communities of the Maternal-Infant Dyad

Project Director: Mark McGuire

Project Team: Janet Williams, Sarah Brooker, Chris Remien, Ben Ridenhour, JT Van Leuven

Myriad microbial communities within and on the human body interact constantly with each other and with environmental factors. Although much work has focused on characterizing these community structures (especially in adults), almost nothing is known about how those of mothers and infants interact. Understanding this crosstalk is likely critical to understanding the establishment of the gastrointestinal (GI) microbiome in early life and how it influences risk for diseases and conditions such as necrotizing enterocolitis, diarrhea, obesity, and Crohn’s disease. However, development of the GI microbial community in infancy has been relatively poorly studied. Its complexity is nonetheless known to be substantial, and tracking individual species may not provide the knowledge needed to impact human health.

Here, we propose to model the development of breastfed infants’ GI tract in connection with various sites within the mother-infant dyad and examine potential effect modifiers, such as maternal/infant nutrition, mode/location of delivery, and antibiotic use. For instance, studies have demonstrated that the bacterial community in milk may influence the development of the infant’s GI tract. Other studies have shown that maternal consumption of targeted probiotics can alter the bacteria in the milk she produces. Thus, it is possible that the bacterial community in a woman’s GI tract can also alter that of her infant. Other sites on or within a woman, such as the skin and saliva, may contribute to milk and infant fecal microbiomes as well. For these reasons, we have collected samples from mothers (milk, breast skin, feces, and saliva) and infants (feces and saliva) from birth through 6 months postpartum to examine these and other relationships. In this project, we will leverage this complex dataset (which includes both quantitative and qualitative empirical data) to model never-before-evaluated relationships among environmental/behavioral factors and multiple microbial communities in mothers and their infants over time. Because of the intimate and emerging connections between microbial communities and health, we anticipate that these findings will lay a solid foundation for additional collaborative studies on the manipulation of early-life microbial communities for optimal acute and chronic health.

Mathematical Modeling of Human Motions Using Recurrent Neural Networks

Project Team: Alex Vakanski (PI), Stephen Lee, Jake Ferguson

Current methods in the literature represent human motions at a single level of abstraction, either a low level (i.e., the trajectory level) or a high level (i.e., the symbolic level). The proposed project will exploit the latest advances in recurrent neural networks to model human motions at multiple hierarchical levels of abstraction. The ultimate aim is to allow patients to perform rehabilitation exercises at home using a sensory system that captures their motions. An algorithm will retrieve the trajectories of the patient’s exercises, analyze them by comparing the performed motions to a model of the desired motions, and send the results to the patient’s physician with recommendations for improvement.

Evolution of Tandemly-Replicated Opsin Genes: Molecular Models That Predict Spectral Shift

Project Team: Deborah Stenkamp (PI), Diana Mitchell, Robert Mackin, Jagdish Patel

Gene replication is an established mechanism for the generation of raw genetic material upon which evolution may act. For example, tandem replication of genes to generate arrays of paralogs underlies functional diversification in vertebrate sensory systems. Tandem replication of opsin (visual pigment) genes and subsequent neofunctionalization provide selective advantages for the exploitation of novel visual environments, food sources, and mate selection. Despite their importance, the mechanisms underlying the subsequent “acts” of neofunctionalization of new genetic material are not clear. In this Modeling Access Grant proposal, we use the tandemly-replicated cone opsin genes of teleosts and primates to address this significant knowledge gap. These genes are ideal for this study because many independent tandem replications have occurred very recently, and because experimental protein structures are available to inform molecular models. Early in vertebrate evolution, one-rod opsin (RH1) and four cone opsin.