Dan Fu

CS PhD Candidate/Researcher at Stanford.
Systems for machine learning, vision, and graphics.

About Me

I'm a second-year Computer Science PhD candidate at Stanford, where I'm co-advised by Kayvon Fatahalian and Chris Ré, affiliated with the Stanford AI Lab, the Statistical Machine Learning Group, DAWN, and the Stanford Computer Graphics Lab. In my research, I'm interested in building systems for machine learning, computer vision, and graphics, and I'm currently working on systems for rapidly creating machine learning models. I'm very fortunate to be supported by a Department of Defense NDSEG fellowship and a Magic Grant from the Brown Institute.

In 2018, I graduated from Harvard with an AB and an SM in Computer Science, cum laude with highest thesis honors. When I'm not working on school work or other projects, I spend most of my free time ballroom dancing.

Latest News

06/30/20 Epoxy, our new work on using weak supervision + pre-trained embeddings without fine-tuning, is now available: paper, code, and a short video are online!
06/29/20 A pre-recording of our FlyingSquid talk for ICML 2020 is available on YouTube!
06/01/20 FlyingSquid paper "Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods" accepted to ICML 2020!
02/28/20 Excited to announce FlyingSquid! Blog post, paper, and code now available!
02/28/20 New blog post on Software 2.0 and Data Programming: Lessons Learned and What's Next for systems and machine learning!


Epoxy: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

We take a first step towards building models that can be iterated on at programmatically-interactive speeds (seconds instead of hours or days). We build on work in weak supervision and pre-trained embeddings to create models that can approach the quality of training deep networks, but without the cost of training to fine-tune feature representations.
arXiv / code / video

FlyingSquid: Faster and More Interactive Weak Supervision
To appear at ICML 2020

We present FlyingSquid, a new weak supervision framework that runs orders of magnitude faster than previous work. Our speedups come from a closed-form solution to latent variable estimation, which enables weakly-supervised video analysis and online learning applications.
arXiv / blog post / code / talk

Multi-Resolution Weak Supervision for Sequential Data
NeurIPS 2019

We present a framework to apply weak supervision to multi-resolution data like videos and time-series data that can handle sequential correlations among supervision sources. We experimentally validate our system over population-level video datasets and gait sensor data.

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels
AI Systems @ SOSP 2019

We present Rekall, a data model and programming model for detecting new events in video collections by composing the outputs of pre-trained models. We demonstrate the use of Rekall in analyzing video from cable TV news broadcasts, films, static-camera vehicular video streams, and commercial autonomous vehicle logs.
arXiv / blog post / code / demo videos

Esper: Query, Analysis, and Visualization of Large Video Collections

Research in progress on analyzing large collections of video data, from the past 10 years of cable TV news to 600 films spanning the past century. Stay tuned for more details!

Influencing Agents for Flock Formation in Low-Density Settings, 2017-2018
AAMAS 2018

We study the problem of how to control flocking behavior in low-density environments. We find that some strategies that have worked well in high-density conditions are often less effective in lower-density environments, and propose new strategies for low-density environments. In an undergraduate thesis, we also explored using genetic algorithms to evolve influencing agent behaviors.
AAMAS paper / undergraduate thesis / code

Automatically Scalable Computation

The ASC project seeks to create a powerful, practical automatic parallelization runtime by using machine learning to predict future states of a program and running speculative executions from those states. If the main program hits any of the predicted states, ASC can speed up execution by skipping the main program to the end of the speculative execution. Prior work has shown that this approach is promising, but did not actually manage to achieve speedup on native hardware, since it ran the main program on an emulator. We have been working on making this approach work on native hardware.

Modeling Effects of Meth on Temperature
PLOS One 2015, CNS 2013 Paris, Siemens Research Competition 2013 semifinalist

We develop mathematical models to understand the complex response of body temperature to methamphetamine and use our models to separate out the individual components of the response. We further analyze the ways that this response changes when orexinergic neurotransmission is inhibited.

Dynamics of Genetic Oscillatory Networks
PLOS One 2014, Siemens Research Competition 2012 team runner-up

We analyze a family of delay differential equations used to model genetic oscillatory networks, and use our findings to develop two new mathematical methods for analyzing such models.
pdf / talk / coverage


Mayee F. Chen*, Daniel Y. Fu*, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré
Paper on arXiv, June 2020
Daniel Y. Fu*, Mayee F. Chen*, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré
To appear at ICML 2020
Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James R. Priest, Christopher Ré
NeurIPS 2019
Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
Paper on arXiv, Oct 2019
Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
AI Systems @ SOSP 2019, Oral Presentation
Daniel Y. Fu, Emily S. Wang, Peter M. Krafft, Barbara J. Grosz
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2018
Peter Kraft, Amos Waterland, Daniel Y. Fu, Anitha Gollamudi, Shai Szulanski, Margo Seltzer
Paper on arXiv, Jul 2018
Daniel Y. Fu
Undergraduate Senior Thesis (Harvard University), 2018
Older Papers
Abolhassan Behrouzvaziri, Daniel Fu, Patrick Tan, Yeonjoo Yoo, Maria V. Zaretskaia, Daniel E. Rusyniak, Yaroslav I. Molkov, Dmitry V. Zaretsky
Published in PLOS ONE on May 20, 2015
Daniel Fu, Patrick Tan, Alexey Kuznetsov, Yaroslav I. Molkov
Published in PLOS ONE on March 25, 2014
Yaroslav Molkov, Daniel Fu, Patrick Tan, Maria Zaretskaia, Dmitry Zaretsky
Poster at Computational Neurosciences (CNS) 2013

Software Projects

Google Internship (Summer 2017)

In Summer 2017, I conducted a machine learning/data science project for the Google Flights team to try to predict flight arrival times using a mix of historical data and live positional data. I built a few models (including two unique ML models) and trained and evaluated them. Some of these models could outperform official airline estimates in many cases. This exploratory work was converted into production after my internship.

Ballroom Matchmaker

For the final project of an AI class I took in fall 2016, I helped build a ballroom matchmaker program to intelligently form partnerships in a rookie class. We treated the task as a local search problem with a complex cost function. We are hopeful that this tool will help make the lives of future HBDT rookie captains much easier.
pdf / code

Ballroom Runthroughs Script

When I was the Captain of the Harvard Ballroom team, I wrote a dead-simple Python script to help me run competitive rounds. The only requirement I had was that it had to run on my computer, so it has hard dependencies on afplay (to play songs from the command line), sox (to program in fadeouts in songs), and say (this is a weaker dependency, but I liked to have the script yell at me when it was time to start a new round). As a result, it is OS X only, but I thought I'd put it online in case anyone else found it useful.

Google Internship (Summer 2016)

I had two main projects when I was working with the Google My Business Android team at Google in 2016: a research project to extend and evaluate an image classifier to automatically categorize uploaded images, and a development project to integrate a newsfeed to drive increases in daily active users. While neither project made it to production, the work I did helped inform major product decisions (including the decision to move away from classifying images in the first place).

Tamr Internship (Summer 2015)

When I was at Tamr, I worked on two main projects: augmenting the main product to handle a new use case for a client, and working on Tamr on Google Cloud Platform prior to its launch. My first project eventually led to a major deal with the client, and my second project helped the product launch successfully, leading to sales leads for both Tamr and GCP.

Interactive Intelligence Internship (Summer 2014)

At Interactive Intelligence, I built the Interaction Speech Tuner, a full-stack web application to tune Interactive Intelligence's automated speech recognition system.

DyKnow Internship (Summer 2013)

When I was at DyKnow, I helped build an analytics feature in the DyKnow web app to allow teachers to track student participation and understanding over time. I also integrated the new web app with Google Drive and Dropbox to make it easier for teachers to upload lesson plans.


When I was a freshman in high school, I built a sports app for students to report live game results to the whole school. In an age before schools had learned what to do with social media, it was very popular among parents who didn't want to wait to find out how their kids had played.


CS 152: Programming Languages - Spring 2018
Teaching Fellow

course website

CS 61: Systems Programming and Machine Organization - Fall 2015, 2016, 2017
Teaching Fellow

2015 / 2016 / 2017