About Me.

I am a Principal Researcher at Microsoft Research New England. I also maintain a faculty position in the School of Public Health as an Associate Professor of Biostatistics with an affiliation in the Center for Computational Molecular Biology at Brown University. The central aim of my research program is to build machine learning algorithms and statistical tools that aid in understanding how nonlinear interactions between genetic features affect the architecture of complex traits and contribute to disease etiology. An overarching theme of the research done in the Crawford Lab group is to take modern computational approaches and develop theory that enables their interpretations to be related back to classical genomic principles. Some of my most recent work has landed me a place on the Forbes 30 Under 30 list and recognition as a member of The Root 100 Most Influential African Americans in 2019. I have also been fortunate enough to be awarded an Alfred P. Sloan Research Fellowship, a David & Lucile Packard Foundation Fellowship for Science and Engineering, and a COPSS Emerging Leader Award.

Prior to joining both MSR and Brown, I received my PhD from the Department of Statistical Science at Duke University, where I was co-advised by Sayan Mukherjee and Kris C. Wood. As a Duke Dean’s Graduate Fellow and NSF Graduate Research Fellow, I completed my PhD dissertation entitled "Bayesian Kernel Models for Statistical Genetics and Cancer Genomics," which was awarded a Leonard J. Savage Award in Applied Methodology. I also received my Bachelor of Science degree in Mathematics from Clark Atlanta University.

Research Themes

Interpretability in Machine Learning Methods

Machine learning algorithms are frequently used in genomic studies because they typically exhibit high predictive accuracy. Recently, however, these same algorithms have been criticized as “black box” techniques. We look to build methods that overcome this challenge.

Dissecting Genetic Architecture of Complex Traits

The explosion of large-scale genomic datasets has provided a unique opportunity to move beyond the traditional linear mixed model (LMM) framework within genome-wide association studies (GWAS). We build novel ML methods that exhibit power for complex traits that are driven by non-additive genetic variation (e.g., gene-by-gene interactions).

Modeling 3D Variation with Topological Summaries

It has been a longstanding challenge to implement an analogue of variable selection with 3D shapes as the covariates in a regression model. Here, we develop novel statistical and topological data analytic (TDA) pipelines for sub-image selection where the goal is to identify the physical features of 3D shapes that best explain the variation between two phenotypic classes.

Statistical Methods for Cancer Pharmacology

Targeted therapies that aim to inhibit oncogenic signaling within many cancer subtypes have been shown to produce high initial clinical responses, but relapse in these patients is almost inevitable. To better understand this phenomenon, we develop algorithms that define rigorous transcriptional signatures of cancer recurrence and therapeutic resistance.

Publications

Key: * co-first authors; co-senior authors; # corresponding author(s); advisee

Preprint(s):

2024:

2023:

2022:

2021:

2020:

2019:

2018:

2017:

2016:

2014-2015:

Stay Connected