About Me

I am a Principal Researcher at Microsoft Research. My research program focuses on developing interpretable machine learning and AI algorithms to study how genetic effects and gene-by-environment interactions influence complex traits and disease progression. As part of this work, I co-lead Project Ex Vivo, a collaborative effort between Microsoft and the Broad Institute focused on defining, engineering, and targeting cell states in cancer. I have been featured on the Forbes 30 Under 30 and The Root 100 Most Influential African Americans lists. I also received an Alfred P. Sloan Research Fellowship, a Packard Foundation Fellowship for Science and Engineering, and a COPSS Emerging Leader Award.

I earned my PhD from the Department of Statistical Science at Duke University, where I was co-advised by Sayan Mukherjee and Kris C. Wood. As a Duke Dean’s Graduate Fellow and NSF Graduate Research Fellow, I completed my PhD dissertation, entitled "Bayesian Kernel Models for Statistical Genetics and Cancer Genomics," which was awarded a Leonard J. Savage Award in Applied Methodology. I also received my Bachelor of Science degree in Mathematics from Clark Atlanta University.

Research Themes

Interpretability in Machine Learning Methods

Machine learning algorithms are now widely used in genomic studies because they typically exhibit high predictive accuracy. However, these same algorithms have recently been criticized as “black box” techniques. We look to build methods that overcome this challenge.

Dissecting Genetic Architecture of Complex Traits

The explosion of large-scale genomic datasets has provided a unique opportunity to move beyond the traditional linear mixed model (LMM) framework in genome-wide association studies (GWAS). We build novel ML methods that have statistical power for complex traits driven by non-additive genetic variation (e.g., gene-by-gene interactions).

Modeling 3D Variation with Topological Summaries

It has been a longstanding challenge to implement an analogue of variable selection with 3D shapes as the covariates in a regression model. Here, we develop novel statistical and topological data analytic (TDA) pipelines for sub-image selection where the goal is to identify the physical features of 3D shapes that best explain the variation between two phenotypic classes.

Machine Learning for Cancer Pharmacology

Targeted therapies that aim to inhibit oncogenic signaling in many cancer subtypes have been shown to produce high initial clinical responses, but relapse in these patients is almost inevitable. To better understand this phenomenon, we develop algorithms that define rigorous transcriptional signatures of cancer recurrence and therapeutic resistance.

Publications

Key: * co-first authors; co-senior authors; # corresponding author(s); advisee

Preprint(s):

2025:

2024:

2023:

2022:

2021:

2020:

2019:

2018:

2017:

2016:

2014-2015:

Stay Connected