I'm a second-year Computer Science PhD candidate at Stanford, where I'm co-advised by Kayvon Fatahalian and Chris Ré, affiliated with the Stanford AI Lab, the Statistical Machine Learning Group, DAWN, and the Stanford Computer Graphics Lab. In my research, I'm interested in building systems for machine learning, computer vision, and graphics, and I'm currently working on systems for rapidly creating machine learning models. I'm very fortunate to be supported by a Department of Defense NDSEG fellowship and a Magic Grant from the Brown Institute.
In 2018, I graduated from Harvard with an AB and an SM in Computer Science, cum laude with highest thesis honors. When I'm not working on school work or other projects, I spend most of my free time ballroom dancing.
|06/30/20||Epoxy, our new work on using weak supervision + pre-trained embeddings without fine-tuning, is now available - paper, code, and a short video online!|
|06/29/20||A pre-recording of our FlyingSquid talk for ICML 2020 is available on YouTube!|
|06/01/20||FlyingSquid paper Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods accepted to ICML 2020!|
|02/28/20||Excited to announce FlyingSquid! Blog post, paper, and code now available!|
|02/28/20||New blog post on Software 2.0 and Data Programming: Lessons Learned and What's Next for systems and machine learning!|
|12/12/19||Presented Multi-Resolution Weak Supervision for Sequential Data at NeurIPS 2019 in Vancouver!|
|11/13/19||Gave a talk on Rekall at Intel's Autonomous Driving Community of Practice Event!|
|10/27/19||Presented Video Event Specification using Programmatic Composition at AI Systems @ SOSP 2019 - poster and oral presentation!|
|10/21/19||Preprint of our paper on Multi-Resolution Weak Supervision for Sequential Data now available on arXiv!|
|10/09/19||A new blog post about Rekall up on the DAWN blog - Why Train What You Can Code? Rekall: A Compositional Approach to Video Analysis!|
|10/03/19||Our poster on Video Event Specification using Programmatic Composition accepted to AI Systems @ SOSP 2019!|
|09/03/19||Our paper on Multi-Resolution Weak Supervision for Sequential Data accepted to NeurIPS 2019!|
|05/15/19||I was awarded a Department of Defense NDSEG fellowship!|
|05/03/19||James Hong and I won a Magic Grant from the Brown Institute for Media Innovation for Public Analysis of TV News!|
We take a first step towards building models that
can be iterated on at programmatically-interactive speeds (seconds
instead of hours or days).
We build on work in weak supervision and pre-trained embeddings to
create models that approach the quality of trained deep networks,
but without the cost of fine-tuning feature representations.
arXiv / code / video
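The core idea can be roughly sketched as follows: where a supervision source abstains on a point, a pre-trained embedding can lend it the vote of a nearby non-abstaining point. This is a minimal illustration of that intuition, not the paper's exact method; the function name and threshold are made up:

```python
import numpy as np

def extend_votes(embeddings, votes, threshold=0.8):
    """Extend labeling-function votes to abstained points (vote == 0)
    by copying the vote of the most similar non-abstained point in
    embedding space, if its cosine similarity clears `threshold`."""
    # Normalize rows so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -1.0)  # never match a point to itself

    extended = votes.copy()
    labeled = np.flatnonzero(votes != 0)
    for i in np.flatnonzero(votes == 0):
        if labeled.size == 0:
            break
        j = labeled[np.argmax(sims[i, labeled])]
        if sims[i, j] >= threshold:
            extended[i] = votes[j]
    return extended

# Toy example: one abstain (vote 0) sitting very close to a +1 point.
emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.95, 0.05]])
votes = np.array([1, -1, 0])
print(extend_votes(emb, votes))  # the abstain inherits the nearby +1 vote
```

Because no gradients are taken through the embedding, the whole step is a few matrix operations - which is what makes iteration fast.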
We present FlyingSquid, a new weak supervision framework that runs
orders of magnitude faster than previous work.
Our speedups come from a closed-form solution to latent variable
estimation, which enables weakly-supervised video analysis and online
learning.
arXiv / blog post / code / talk
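For three binary supervision sources that are conditionally independent given the true label, accuracies can be recovered in closed form from pairwise agreement rates - if E[&lambda;_i &lambda;_j] factors as E[&lambda;_i Y] E[&lambda;_j Y], then each accuracy is a square root of a ratio of observed moments. A simplified sketch of this triplet idea (not the library's actual API):

```python
import numpy as np

def triplet_accuracies(L):
    """Estimate E[lambda_i * Y] for three binary labeling functions
    (votes in {-1, +1}), assuming conditional independence given Y.
    Uses E[l_i l_j] = E[l_i Y] E[l_j Y], so each accuracy falls out
    in closed form from the three pairwise agreement rates."""
    m_ab = np.mean(L[:, 0] * L[:, 1])
    m_ac = np.mean(L[:, 0] * L[:, 2])
    m_bc = np.mean(L[:, 1] * L[:, 2])
    a = np.sqrt(m_ab * m_ac / m_bc)
    b = np.sqrt(m_ab * m_bc / m_ac)
    c = np.sqrt(m_ac * m_bc / m_ab)
    return a, b, c  # signs taken positive (better-than-random sources)

# Synthetic check: generate Y, then noisy votes with known accuracies.
rng = np.random.default_rng(0)
Y = rng.choice([-1, 1], size=200_000)
def noisy(acc):  # P(vote == Y) = (1 + acc) / 2, so E[vote * Y] = acc
    agree = rng.random(Y.size) < (1 + acc) / 2
    return np.where(agree, Y, -Y)
L = np.stack([noisy(0.8), noisy(0.6), noisy(0.4)], axis=1)
print(triplet_accuracies(L))  # approximately (0.8, 0.6, 0.4)
```

No iterative optimization is involved, which is the source of the orders-of-magnitude speedup over SGD-based parameter estimation.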
We present a framework for applying weak supervision to
multi-resolution data such as videos and time series, handling
sequential correlations among supervision sources.
We experimentally validate our system over population-level video
datasets and gait sensor data.
We present Rekall, a data model and programming model for detecting
new events in video collections by composing the outputs of
pretrained models.
We demonstrate the use of Rekall in analyzing video from cable TV
news broadcasts, films, static-camera vehicular video streams, and
commercial autonomous vehicle logs.
arXiv / blog post / code / demo videos
Research in progress on analyzing large collections of video data, from the past 10 years of cable TV news to 600 films spanning the past century. Stay tuned for more details!
We study the problem of how to control flocking behavior in
multi-agent systems using a small number of influencing agents.
We find that some strategies that have worked well in high-density
conditions are often less effective in lower-density environments,
and propose new strategies for low-density environments.
In an undergraduate thesis,
we also explored using genetic algorithms to evolve influencing agent
behaviors.
AAMAS paper / undergraduate thesis / code
The ASC project seeks to create a powerful, practical automatic
parallelization runtime by using machine learning to predict future
states of a program and running speculative executions from those
predicted states.
If the main program hits any of the predicted states, ASC can speed
up execution by skipping ahead to the end of the corresponding
speculative execution.
Prior work has shown that this approach is promising, but did not
actually manage to achieve speedup on native hardware,
since it ran the main program on an emulator.
We have been working on making this approach work on native hardware.
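A heavily simplified sketch of the speculation mechanism, using a toy deterministic one-step-at-a-time "program" (all names here are hypothetical, and the real system operates on machine state, not Python integers):

```python
def step(state):
    # A deterministic toy "program": one unit of work per step.
    return state + 1

def run_with_speculation(start, end, predicted):
    """ASC-style speculation sketch: precompute the run from a
    predicted future state to the end; if the main execution ever
    reaches that state, skip straight to the cached result.
    Assumes `predicted` lies on the path from `start` to `end`."""
    # "Speculative execution": compute the final state from the prediction.
    spec_state = predicted
    while spec_state != end:
        spec_state = step(spec_state)
    cache = {predicted: spec_state}  # predicted state -> final state

    state, steps = start, 0
    while state != end:
        if state in cache:  # prediction hit: jump to the cached end state
            return cache[state], steps
        state = step(state)
        steps += 1
    return state, steps

print(run_with_speculation(0, 100, predicted=10))  # (100, 10): 90 steps skipped
```

The real difficulty, of course, is doing this for actual machine state on native hardware rather than in a toy loop.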
We develop mathematical models to understand the complex response
of body temperature to methamphetamine and use our models to separate
out the individual components of the response.
We further analyze the ways that this response changes when
orexinergic neurotransmission is inhibited.
We analyze a family of delay differential equations used to model
genetic oscillatory networks, and use our findings to develop two new
mathematical methods for analyzing such models.
pdf / talk / coverage
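For illustration, a canonical member of this family of models (not necessarily the exact equations analyzed in the paper) is delayed negative autoregulation, where a gene product represses its own transcription after a delay $\tau$:

```latex
\frac{dx}{dt} = \frac{\alpha}{1 + \left( x(t-\tau)/K \right)^{n}} - \beta\, x(t)
```

Sufficiently large delays $\tau$ and Hill coefficients $n$ make such systems oscillate rather than settle to a fixed point.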
In Summer 2017, I conducted a machine learning/data science
project for the Google Flights team to try to predict flight arrival
times using a mix of historical data and live positional data.
I built a few models (including two unique ML models) and trained
and evaluated them.
Some of these models could outperform official airline
estimates in many cases.
This exploratory work was converted into production after my
internship ended.
For the final project of an AI class I took in fall 2016, I helped
build a ballroom matchmaker program to intelligently form
partnerships in a rookie class.
We treated the task as a local search problem with a complex cost
function.
We are hopeful that this tool will help make the lives of future
HBDT rookie captains much easier.
pdf / code
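The local-search approach can be sketched in a few lines: propose a swap of partners between two pairs, keep it if the total cost drops, revert otherwise. This is a minimal illustration with a made-up single-factor cost; the real cost function combined many compatibility factors:

```python
import random

def improve(pairs, cost, iters=2000, seed=0):
    """Hill-climbing local search over partnerships: repeatedly swap
    partners between two pairs, keeping swaps that lower total cost."""
    rng = random.Random(seed)
    pairs = [list(p) for p in pairs]  # copy so the input is untouched
    best = sum(cost(a, b) for a, b in pairs)
    for _ in range(iters):
        i, j = rng.sample(range(len(pairs)), 2)
        pairs[i][1], pairs[j][1] = pairs[j][1], pairs[i][1]  # swap partners
        new = sum(cost(a, b) for a, b in pairs)
        if new < best:
            best = new
        else:  # revert the swap
            pairs[i][1], pairs[j][1] = pairs[j][1], pairs[i][1]
    return pairs, best

# Toy cost: prefer partners with similar heights.
heights = {"A": 180, "B": 162, "C": 170, "D": 178}
cost = lambda a, b: abs(heights[a] - heights[b])
start = [("A", "B"), ("C", "D")]          # total cost 18 + 8 = 26
print(improve(start, cost))               # ([['A', 'D'], ['C', 'B']], 10)
```

Random restarts and more elaborate moves help escape local minima, but even this simple loop beats pairing by hand.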
When I was the Captain of the Harvard Ballroom team, I wrote a
dead-simple Python script to help me run competitive rounds.
The only requirement I had was that it had to run on my computer, so
it has hard dependencies on afplay (to play songs from the command
line), sox (to add fadeouts to songs), and say (a weaker
dependency, but I liked to have the script yell at me when it
was time to start a new round).
As a result, it is OS X only, but I thought I'd put it online in case
anyone else found it useful.
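The script essentially chains three shell commands per round, roughly like this sketch (the helper names and exact sox arguments here are hypothetical, not the script's actual code):

```python
import subprocess

def round_commands(song, length=90, fade_out=5):
    """Build the macOS commands for one round: `say` announces it,
    `sox` trims the track to `length` seconds with a `fade_out`-second
    fadeout, and `afplay` plays the result."""
    return [
        ["say", "Next round"],
        ["sox", song, "faded.mp3",
         "trim", "0", str(length),
         "fade", "t", "0", str(length), str(fade_out)],
        ["afplay", "faded.mp3"],
    ]

def run_round(song):
    # macOS only: afplay and say ship with OS X, sox via Homebrew.
    for cmd in round_commands(song):
        subprocess.run(cmd, check=True)
```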
I had two main projects when I was working with the Google My Business Android team at Google in 2016: a research project to extend and evaluate an image classifier to automatically categorize uploaded images, and a development project to integrate a newsfeed to drive increases in daily active users. While neither project made it to production, the work I did helped inform major product decisions (including the decision to move away from classifying images in the first place).
When I was at Tamr, I worked on two main projects - augmenting the main product to handle a new use case for a client, and working on Tamr on Google Cloud Platform prior to its launch. My first project eventually led to a major deal with the client, and my second project helped the product launch successfully, leading to sales leads for both Tamr and GCP.
At Interactive Intelligence, I built the Interaction Speech Tuner, a full-stack web application to tune Interactive Intelligence's automated speech recognition system.
When I was at DyKnow, I helped build an analytics feature in the DyKnow web app to allow teachers to track student participation and understanding over time. I also integrated the new web app with Google Drive and Dropbox to make it easier for teachers to upload lesson plans.
When I was a freshman in high school, I built a sports app for
students to report live game results to the whole school.
In an age before schools had learned what to do with social media, it
was very popular among parents who didn't want to wait to find out
how their kids had played.