I'm a second-year Computer Science PhD candidate at Stanford, where I'm co-advised by Kayvon Fatahalian and Chris Ré. In my research, I'm interested in building systems for graphics and computer vision, and I'm currently working on systems for analyzing large amounts of video data. In the past, I've also done some research on influencing agents for flock formation. I'm supported by an NDSEG fellowship and a Magic Grant from the Brown Institute for Media Innovation.
In 2018, I graduated from Harvard with an AB and an SM in Computer Science, cum laude with highest thesis honors. When I'm not working on school work or other projects, I spend most of my free time ballroom dancing.
Many real-world video analysis applications require the ability to identify domain-specific events in video, such as interviews and commercials in TV news broadcasts, or action sequences in film. Unfortunately, pre-trained models to detect all the events of interest in video may not exist, and training new models from scratch can be costly and labor-intensive. We explore the utility of specifying new events in video in a more traditional manner: by writing queries that compose outputs of existing, pre-trained models. To write these queries, we have developed Rekall, a library that exposes a data model and programming model for compositional video event specification. Rekall represents video annotations from different sources (object detectors, transcripts, etc.) as spatiotemporal labels associated with continuous volumes of spacetime in a video, and provides operators for composing labels into queries that model new video events. We demonstrate the use of Rekall in analyzing video from cable TV news broadcasts, films, static-camera vehicular video streams, and commercial autonomous vehicle logs. In these efforts, domain experts were able to quickly (in a few hours to a day) author queries that enabled the accurate detection of new events (on par with, and in some cases much more accurate than, learned approaches) and to rapidly retrieve video clips for human-in-the-loop tasks such as video content curation and training data curation. In a user study, novice users of Rekall were able to author queries to retrieve new events in video given just one hour of query development time. Code on GitHub.
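The core idea — representing detector outputs as labeled intervals and composing them with operators — can be illustrated without the library itself. Below is a minimal, self-contained Python sketch (this is not Rekall's actual API, and the detector outputs are hypothetical):

```python
# A minimal sketch of compositional event specification over time
# intervals (illustrative only; not Rekall's actual API).

def overlaps(a, b):
    """True if intervals a and b share any span of time."""
    return a[0] < b[1] and b[0] < a[1]

def join(xs, ys, predicate, merge):
    """Pair up labels from two streams that satisfy a predicate."""
    return [merge(x, y) for x in xs for y in ys if predicate(x, y)]

# Hypothetical detector outputs: (start, end) times in seconds.
faces = [(10, 40), (100, 130)]     # a face is on screen
captions = [(12, 20), (25, 35)]    # caption text is displayed

# "Interview-like" moments: a face co-occurring with captions,
# clipped to the overlapping region.
interviews = join(faces, captions, overlaps,
                  lambda a, b: (max(a[0], b[0]), min(a[1], b[1])))
print(interviews)  # → [(12, 20), (25, 35)]
```

The real library generalizes this pattern to spatiotemporal volumes (time plus bounding boxes) and a richer set of composition operators.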
Research in progress on analyzing large collections of video data, from the past 10 years of cable TV news to 600 films spanning the past century. Stay tuned for more details!
In flocking, local sensing by individual agents gives rise to collective behavior that appears coordinated. In the interest of learning how to control flocking behavior, recent work in the multiagent systems literature has explored the use of influencing agents to guide flocking agents to face a certain direction. Existing work has largely focused on simulation settings of small areas with toroidal shapes; in such settings, agent density is high, so interactions are common and flock formation occurs easily. In this project, we study flocking environments with lower agent density, where interactions are rarer. We find that behaviors that work well in high-density conditions are often less effective in lower-density environments. We use these insights to propose new influencing agent behaviors, including one we dub “follow-then-influence”: agents act like normal members of the flock to reach positions that allow for control, and then exert their influence. We presented this work as a full paper at AAMAS 2018. In an undergraduate thesis, we also explored using genetic algorithms to evolve influencing agent behaviors. The code for all the experiments can be found on GitHub.
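A toy version of the two-phase idea fits in a few lines. This sketch uses a deliberately simplified alignment model (all-to-all averaging of headings, made-up parameters — not the paper's actual simulation): the influencer first behaves like an ordinary flock member, then locks onto the target heading and pulls the flock toward it.

```python
# Toy "follow-then-influence" sketch (hypothetical dynamics, not the
# AAMAS paper's model): agents align headings by averaging; the
# influencer can instead broadcast a fixed target heading.

def step(headings, influencer_idx, target, influencing):
    new = []
    for i in range(len(headings)):
        if i == influencer_idx and influencing:
            new.append(target)  # phase 2: broadcast the desired heading
        else:
            new.append(sum(headings) / len(headings))  # align with flock
    return new

headings = [0.0, 1.0, 2.0, 3.0]  # three flock agents + influencer (index 3)

# Phase 1 ("follow"): the influencer behaves like a normal flock member.
for _ in range(3):
    headings = step(headings, 3, target=0.5, influencing=False)

# Phase 2 ("influence"): the influencer holds the target heading, and the
# flock's average heading converges toward it.
for _ in range(50):
    headings = step(headings, 3, target=0.5, influencing=True)

print([round(h, 2) for h in headings])  # → [0.5, 0.5, 0.5, 0.5]
```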
The ASC project seeks to create a powerful, practical automatic parallelization runtime by using machine learning to predict future states of a program and running speculative executions from those states. If the main program hits any of the predicted states, ASC can speed up execution by skipping the main program ahead to the end of the speculative execution. Prior work has shown that this approach is promising, but did not actually achieve speedup on native hardware, since it ran the main program on an emulator. We have been working on making this approach work on native hardware. This work was submitted to SOSP '17 and ASPLOS '18.
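The predict-speculate-skip loop can be sketched abstractly. Here the "program" is a hypothetical deterministic state machine, the predictor is assumed to be correct, and the bookkeeping is greatly simplified relative to a real runtime:

```python
# Toy sketch of state prediction + speculative execution (illustrative;
# not ASC's actual mechanism). The "program" is a state machine with a
# hypothetical transition function.

def run(state, steps):
    for _ in range(steps):
        state = (state * 3 + 1) % 97
    return state

start = 5
predicted = run(start, 10)              # predictor guesses a future state
speculative_result = run(predicted, 90) # speculate onward from that state

# Main execution: if it ever reaches the predicted state, reuse the
# speculation's work and skip ahead instead of re-executing those steps.
state, executed = start, 0
for _ in range(100):
    if state == predicted:
        state = speculative_result
        executed += 90
        break
    state = run(state, 1)
    executed += 1

print(state == run(start, 100))  # → True
```

In the real system the hard parts are exactly what this sketch assumes away: predicting states accurately, snapshotting and validating them cheaply, and doing it all on native hardware.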
HotCRP is a conference submission and review system with complex information flow policies and an expressive search capability. As a result, optimizing the search process is technically difficult and can result in information leaks if the optimization process returns either more or fewer papers than the unoptimized process. In particular, optimizations that transfer query burden across a sanitization pass can be especially problematic. In this project, we tackle this problem using formal verification. We develop a formal model of information flow in HotCRP and use it to model different information flow policies and optimizations in HotCRP. We ultimately use our framework to prove that the optimizations do not leak information.
In this project, we speed up neural-network-based image classification by using compression to reduce the input dimension of images. We use the discrete cosine transform (DCT), the same technique that underlies the JPEG standard, to compress images to a fraction of their size, and feed the compressed images to simple image-classification networks. We find that we can achieve major speedups (ranging from 2x to 10x) in exchange for a modest hit in accuracy.
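The dimension reduction itself is simple to demonstrate: take the 2D DCT of an image and keep only the low-frequency coefficients. This sketch uses a hypothetical 32x32 input and an 8x8 truncation (the project's actual datasets, sizes, and networks differed):

```python
import numpy as np

# DCT-based input compression sketch (hypothetical sizes; the project
# applied JPEG-style DCT compression to real image datasets).

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

rng = np.random.default_rng(0)
image = rng.random((32, 32))   # stand-in for a 32x32 grayscale image

C = dct_matrix(32)
coeffs = C @ image @ C.T       # 2D DCT
compressed = coeffs[:8, :8]    # keep low frequencies: 1024 -> 64 dims

# compressed.ravel() (64 values) becomes the network input instead of
# the 1024 raw pixels -- a 16x reduction in input dimension.
print(compressed.shape)        # → (8, 8)
```

Smaller inputs mean smaller first layers, which is where the speedup comes from.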
How often do vector clocks falsely report events as mis-ordered and unnecessarily force a distributed system to slow down? Can we reduce the false positive rate by introducing more granular schemas? In this research project, we explore the relationship between the granularity of vector clocks and the percentage of time that distributed systems spend waiting for their various components to catch up with each other.
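For concreteness, here is the standard vector-clock comparison, plus an illustrative (hypothetical) example of the granularity question: a coarse schema with one entry per machine can report an ordering that a finer schema, with one entry per process, would reveal as concurrency.

```python
# Standard vector-clock comparison; the granularity question is which
# components get their own clock entry.

def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    le = all(x <= y for x, y in zip(a, b))
    ge = all(x >= y for x, y in zip(a, b))
    if le and ge:
        return "equal"
    if le:
        return "before"
    if ge:
        return "after"
    return "concurrent"

# Coarse schema (one entry per machine) reports an ordering...
print(compare([2, 1], [2, 2]))               # → before
# ...while a finer schema (one entry per process) for the same pair of
# events can expose them as concurrent, avoiding an unnecessary wait.
print(compare([2, 0, 1, 1], [2, 1, 0, 2]))   # → concurrent
```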
In this research project, we demonstrate that a rogue application can evade dynamic information flow analysis and leak sensitive information using an extremely simple timing channel implemented in roughly 10 lines of Java code.
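The project's channel was written in Java; the same idea fits in a few lines of Python. The sender leaks a bit by modulating how long a benign operation takes, and the receiver recovers it with a stopwatch — no tainted value ever flows to the receiver, which is why dynamic taint tracking misses it (timings and threshold here are illustrative, not the project's):

```python
import time

# Timing-channel sketch: bit 1 makes the "benign" work take measurably
# longer; the receiver decodes by timing it.

def send_bit(bit):
    if bit:
        time.sleep(0.1)

secret = [1, 0, 1, 1]
leaked = []
for bit in secret:
    start = time.perf_counter()
    send_bit(bit)                # receiver only observes elapsed time
    elapsed = time.perf_counter() - start
    leaked.append(1 if elapsed > 0.05 else 0)

print(leaked)  # → [1, 0, 1, 1]
```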
We develop mathematical models to understand the complex response of body temperature to methamphetamine and use our models to separate out the individual components of the response. We further analyze the ways that this response changes when orexinergic neurotransmission is inhibited. Published May 2015; 2013 Siemens Competition Semifinalist project; presented at CNS 2013 Paris.
We analyze a family of delay differential equations used to model genetic oscillatory networks, and use our findings to develop two new mathematical methods for analyzing such models. Published March 2014; 2012 Siemens Competition 2nd Place project.
In Summer 2017, I conducted a machine learning/data science project for the Google Flights team, predicting flight arrival times using a mix of historical data and live positional data. I built, trained, and evaluated several models (including two novel ML models), some of which outperformed official airline estimates in many cases.
For the final project of an AI class I took in Fall 2016, I helped build a ballroom matchmaker program to intelligently form partnerships in a rookie class. We treated the task as a local search problem with a complex cost function. We are hopeful that this tool will make the lives of future HBDT rookie captains much easier. A writeup with the details of our approach and some interesting findings is available here.
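The local-search framing looks roughly like this sketch: start from an arbitrary pairing and hill-climb by swapping partners whenever a swap doesn't worsen the cost. The cost function below (leads slightly taller than follows) is a made-up stand-in — the real one was considerably more complex:

```python
import random

# Local-search partner matching sketch (hypothetical cost term; the
# project's actual cost function was much more involved).

leads = [("A", 180), ("B", 170), ("C", 165)]    # (name, height in cm)
follows = [("X", 178), ("Y", 160), ("Z", 168)]

def cost(pairing):
    # Toy objective: prefer leads about 5 cm taller than their follows.
    return sum(abs((l[1] - f[1]) - 5) for l, f in pairing)

def local_search(leads, follows, iters=200, seed=0):
    rng = random.Random(seed)
    order = follows[:]
    best = list(zip(leads, order))
    for _ in range(iters):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]       # try swapping
        candidate = list(zip(leads, order))
        if cost(candidate) <= cost(best):
            best = candidate                          # keep the swap
        else:
            order[i], order[j] = order[j], order[i]   # undo a bad swap
    return best

matches = local_search(leads, follows)
print([(l[0], f[0]) for l, f in matches])
```

Random-restart or simulated-annealing variants handle the local optima that plain hill climbing can get stuck in.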
When I was the Captain of the Harvard Ballroom team, I wrote a dead-simple Python script to help me run competitive rounds. The only requirement I had was that it had to run on my computer, so it has hard dependencies on afplay (to play songs from the command line), sox (to program in fadeouts in songs), and say (this is a weaker dependency, but I liked to have the script yell at me when it was time to start a new round). As a result, it is OS X only, but I thought I'd put it online in case anyone else found it useful.
I had two main projects when I was working with the Google My Business Android team at Google in 2016: a research project to extend and evaluate an image classifier to automatically categorize uploaded images, and a development project to integrate a newsfeed to drive increases in daily active users. While neither project made it to production, the work I did helped inform major product decisions (including the decision to move away from classifying images in the first place).
When I was at Tamr, I worked on two main projects: augmenting the main product to handle a new use case for a client, and preparing Tamr on Google Cloud Platform prior to its launch. My first project eventually led to a major deal with the client, and my second helped the product launch successfully, generating sales leads for both Tamr and GCP.
At Interactive Intelligence, I built the Interaction Speech Tuner, a full-stack web application to tune Interactive Intelligence's automated speech recognition system.
When I was at DyKnow, I helped build an analytics feature in the DyKnow web app to allow teachers to track student participation and understanding over time. I also integrated the new web app with Google Drive and Dropbox to make it easier for teachers to upload lesson plans.
When I was a freshman in high school, I built a sports app for students to report live game results to the whole school. In an age before schools had learned what to do with social media, it was very popular among parents who didn't want to wait to find out how their kids had played.