By creating tools to enable the automated analysis of data and by creating unique data storage solutions, we hope to enable other researchers to accomplish their research goals more efficiently and effectively.
How are plant chromosomes arranged? Is it possible to relate the genetic and cytological maps to an assembled genome sequence? Are there sequences present at centromeres that signal the cell to construct kinetochores, the machines that ensure proper chromosome segregation, at the correct site?
As the genomes of more plants are sequenced, complex questions like these can be translated into testable hypotheses. Eventually the content of plant genomes can be related to broad function, both within the cell and at the level of the organism as a whole.
Convergence of traditional biological investigation with genome content and organization is the focus of much of the work carried out in this group. We explore this area of research using maize, Arabidopsis, and other plants.
Phenotype Prediction for Basic Research
The ability to compare phenotypes, both within and across species, enables predictive biology. Though descriptions of myriad aspects of phenotype are readily available, representation of morphology, development, and other traits using computable formats is in its infancy. It has been shown in vertebrate and other (primarily animal) systems that biological equivalencies can be predicted across broad diversity based on reasoning across phenotype ontology markup of experimentally well-characterized genes and pathways. Within plants, maize, rice, soybean, Medicago, Arabidopsis, and tomato have sufficient gene function information (as inferred from mutational screens) to develop such systems. Given current data and existing algorithms that reason across annotations, it is now possible to assert biologically relevant phenolog relationships associated with genes, genomic regions, molecular pathways, and gene function data for plants.
This sort of work enables:
(1) Prediction of the biology that underlies phenotype in non-model systems, including crops that do not have well-characterized genomes (e.g., blueberry, strawberry, apple, peach, etc.). Using this method, the phenotype of a non-model plant can be used to query model species' genes, molecular markers, pathways, etc. directly to bootstrap testable hypotheses.
(2) Identification of non-obvious model systems to study conserved processes across broad taxa.
(3) Creation of phenotypic data systems that interoperate.
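One common way to score a candidate phenolog is to ask whether the genes annotated to a phenotype in one species share more orthologs with the genes annotated to a phenotype in another species than chance would predict. The sketch below illustrates that idea with a hypergeometric test. It is a minimal illustration under stated assumptions, not this group's pipeline: the ortholog group identifiers (OG1 to OG20), the two gene sets, and the function names are all hypothetical, and genes are assumed to have been mapped to shared ortholog groups beforehand.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric variable X.

    population: number of ortholog groups shared by the two species
    successes:  groups annotated to the phenotype in species A
    draws:      groups annotated to the phenotype in species B
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def phenolog_score(groups_a, groups_b, shared_groups):
    """Score the ortholog overlap between two phenotype gene sets.

    Only ortholog groups present in both species are considered.
    Returns the overlapping groups and the probability of an overlap
    at least that large arising by chance.
    """
    a = set(groups_a) & shared_groups
    b = set(groups_b) & shared_groups
    overlap = a & b
    p = hypergeom_tail(len(overlap), len(shared_groups), len(a), len(b))
    return overlap, p


# Hypothetical data: 20 ortholog groups shared by a model species and a crop.
shared = {f"OG{i}" for i in range(1, 21)}
model_phenotype = {"OG1", "OG2", "OG3", "OG4"}  # e.g. genes from a mutant screen
crop_phenotype = {"OG2", "OG3", "OG4", "OG9"}   # e.g. genes near trait markers

overlap, p_value = phenolog_score(model_phenotype, crop_phenotype, shared)
print(sorted(overlap), p_value)
```

A small probability suggests that the two phenotypes may rest on a conserved set of genes, which makes the pairing a candidate for follow-up. In practice many phenotype pairs are compared at once, so a correction for multiple testing would be needed before asserting any relationship.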
Crop Improvement: Phenotype = Genotype x Environment
Codifying and integrating genotypes with phenotypes and precise environmental conditions enables the discovery of basic biological mechanisms and can revolutionize plant breeding. Currently deployed high-throughput phenotype data collection and analysis systems cannot be leveraged across multiple groups' datasets due to the complete absence of guidelines. The development and use of standards and best practices will allow researchers to tease out biologically relevant environmental conditions and molecular mechanisms from large-scale datasets to enable targeted crop improvement. Satisfying this basic need for data sharing is necessary to effect a change of scale in basic biology that leads to agricultural advancement. The need is critical: production must double by 2050, in the face of climate change, to meet projected worldwide needs.