About Me

I am a Principal Researcher at Microsoft Research. I also maintain a faculty position as a Distinguished Senior Fellow in Biostatistics at Brown University. The central aim of my research program is to build machine learning algorithms and statistical tools that aid in understanding how genetic effects and gene-by-environment interactions contribute to the architecture of complex traits and disease progression. An overarching theme of my work is to take modern computational approaches and develop theory that enables their interpretations to be related back to classical genomic principles. Some of my work has landed me a place on the Forbes 30 Under 30 list and recognition as a member of The Root 100 Most Influential African Americans. I have also been fortunate enough to be awarded an Alfred P. Sloan Research Fellowship, a David & Lucile Packard Foundation Fellowship for Science and Engineering, and a COPSS Emerging Leader Award.

Prior to joining MSR and Brown, I received my PhD from the Department of Statistical Science at Duke University, where I was co-advised by Sayan Mukherjee and Kris C. Wood. As a Duke Dean’s Graduate Fellow and NSF Graduate Research Fellow, I completed my PhD dissertation, entitled "Bayesian Kernel Models for Statistical Genetics and Cancer Genomics," which was awarded a Leonard J. Savage Award in Applied Methodology. I also received my Bachelor of Science degree in Mathematics from Clark Atlanta University.

Research Themes

Interpretability in Machine Learning Methods

Machine learning algorithms are now frequently used in genomic studies because they typically exhibit high predictive accuracy. Recently, however, these same algorithms have also been criticized as “black box” techniques. We look to build methods that overcome this challenge.

Dissecting Genetic Architecture of Complex Traits

The explosion of large-scale genomic datasets has provided a unique opportunity to move beyond the traditional linear mixed model (LMM) framework within genome-wide association studies (GWAS). We build novel ML methods that exhibit power for complex traits that are driven by non-additive genetic variation (e.g., gene-by-gene interactions).

Modeling 3D Variation with Topological Summaries

It has been a longstanding challenge to implement an analogue of variable selection with 3D shapes as the covariates in a regression model. Here, we develop novel statistical and topological data analytic (TDA) pipelines for sub-image selection where the goal is to identify the physical features of 3D shapes that best explain the variation between two phenotypic classes.

Machine Learning for Cancer Pharmacology

Targeted therapies that aim to inhibit oncogenic signaling in many cancer subtypes have been shown to produce high initial clinical responses, but relapse in these patients is almost inevitable. To better understand this phenomenon, we develop algorithms that define rigorous transcriptional signatures of cancer recurrence and therapeutic resistance.

Publications

Key: * co-first authors; co-senior authors; # corresponding author(s); advisee

Preprint(s):

2024:

2023:

2022:

2021:

2020:

2019:

2018:

2017:

2016:

2014-2015:

Stay Connected