Dan Fu

CS PhD Candidate/Researcher at Stanford.
Systems for machine learning, vision, and graphics.

About Me

I'm a third-year Computer Science PhD candidate at Stanford, where I'm co-advised by Kayvon Fatahalian and Chris Ré, affiliated with the Stanford AI Lab, the Statistical Machine Learning Group, DAWN, and the Stanford Computer Graphics Lab. In my research, I'm interested in building systems for machine learning, computer vision, and graphics, and I'm currently working on systems for rapidly creating machine learning models. I'm very fortunate to be supported by a Department of Defense NDSEG fellowship and a Magic Grant from the Brown Institute.

Since fall 2020, I've been co-organizing the Stanford MLSys Seminar Series - talks every Thursday, livestreamed on YouTube! Check out our website, and subscribe to our channel!

In 2018, I graduated from Harvard with an AB and an SM in Computer Science, cum laude with highest thesis honors. When I'm not working on school work or other projects, I spend most of my free time ballroom dancing.

Latest News

06/07/21 Our TV news analysis paper accepted to KDD!
03/18/21 New blog post about three lessons I've learned about applying the design process to my ML research.
08/17/20 Preprint of our work on analyzing a decade of US cable TV news is now available on arXiv!
06/30/20 Epoxy, our new work on using weak supervision + pre-trained embeddings without fine-tuning is now available - paper, code, and a short video online!
06/29/20 A pre-recording of our FlyingSquid talk for ICML 2020 is available on YouTube!

Analyzing Who and What Appears in a Decade of US Cable TV News
To appear, KDD 2021

Cable TV news reaches millions of U.S. households each day, and decisions about who appears on the news, and what stories get talked about, can profoundly influence public opinion and discourse. We use computational techniques to analyze a data set of nearly 24/7 video, audio, and text captions from three major U.S. cable TV networks (CNN, FOX News, and MSNBC) from the last decade.
arXiv / stanford cable tv news analyzer

Epoxy: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

We take a first step towards building models that can be iterated on at programmatically-interactive speeds (seconds instead of hours or days). We build on work in weak supervision and pre-trained embeddings to create models that can approach the quality of training deep networks, but without the cost of training to fine-tune feature representations.
arXiv / code / video

FlyingSquid: Faster and More Interactive Weak Supervision
ICML 2020

We present FlyingSquid, a new weak supervision framework that runs orders of magnitude faster than previous work. Our speedups come from a closed-form solution to latent variable estimation, which enables weakly-supervised video analysis and online learning applications.
arXiv / blog post / code / talk

Multi-Resolution Weak Supervision for Sequential Data
NeurIPS 2019

We present a framework to apply weak supervision to multi-resolution data like videos and time-series data that can handle sequential correlations among supervision sources. We experimentally validate our system over population-level video datasets and gait sensor data.

Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels
AI Systems @ SOSP 2019

We present Rekall, a data model and programming model for detecting new events in video collections by composing the outputs of pre-trained models. We demonstrate the use of Rekall in analyzing video from cable TV news broadcasts, films, static-camera vehicular video streams, and commercial autonomous vehicle logs.
arXiv / blog post / code / demo videos

Influencing Agents for Flock Formation in Low-Density Settings, 2017-2018
AAMAS 2018

We study the problem of how to control flocking behavior in low-density environments. We find that some strategies that have worked well in high-density conditions are often less effective in lower-density environments, and propose new strategies for low-density environments. In an undergraduate thesis, we also explored using genetic algorithms to evolve influencing agent behaviors.
aamas paper / undergraduate thesis / code

Automatically Scalable Computation

The ASC project seeks to create a powerful, practical automatic parallelization runtime by using machine learning to predict future states of a program and running speculative executions from those states. If the main program hits any of the predicted states, ASC can speed up execution by skipping the main program to the end of the speculative execution. Prior work has shown that this approach is promising, but did not actually manage to achieve speedup on native hardware, since it ran the main program on an emulator. We have been working on making this approach work on native hardware.
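
As a rough illustration of the skip-ahead idea only (not the actual ASC runtime), here is a toy Python sketch. Everything in it is an assumption made for illustration: the `step` function stands in for a deterministic program, the "predictor" is a hand-picked state rather than a learned model, and the speculative run happens up front rather than in parallel.

```python
def step(state):
    """One step of a toy deterministic program (stand-in for real execution)."""
    return (state * 3 + 1) % 1000


def run(state, n_steps):
    """Run the toy program for n_steps from the given state."""
    for _ in range(n_steps):
        state = step(state)
    return state


def speculate(predicted_states, horizon):
    """Run ahead `horizon` steps from each predicted state and cache the result."""
    return {s: run(s, horizon) for s in predicted_states}


def run_with_skips(state, n_steps, cache, horizon):
    """Run the main program, fast-forwarding whenever it reaches a cached state."""
    done = 0
    skipped = 0
    while done < n_steps:
        if state in cache and done + horizon <= n_steps:
            # The main program hit a predicted state: jump to the end of
            # the speculative execution instead of recomputing it.
            state = cache[state]
            done += horizon
            skipped += horizon
        else:
            state = step(state)
            done += 1
    return state, skipped


# Hypothetical "prediction": we cheat and pick a state the program will reach.
HORIZON = 20
cache = speculate([run(7, 10)], HORIZON)
final_state, skipped_steps = run_with_skips(7, 50, cache, HORIZON)
```

Because the toy program is deterministic, the fast-forwarded run ends in the same state as a plain run; the only difference is how many steps the main loop had to execute itself.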

Modeling Effects of Meth on Temperature
PLOS One 2015, CNS 2013 Paris, Siemens Research Competition 2013 semifinalist

We develop mathematical models to understand the complex response of body temperature to methamphetamine and use our models to separate out the individual components of the response. We further analyze the ways that this response changes when orexinergic neurotransmission is inhibited.

Dynamics of Genetic Oscillatory Networks
PLOS One 2014, Siemens Research Competition 2012 team runner-up

We analyze a family of delay differential equations used to model genetic oscillatory networks, and use our findings to develop two new mathematical methods for analyzing such models.
pdf / talk / coverage


Trenton Chang, Daniel Y. Fu, Sharon Yixuan Li, Christopher Ré
ECCV 2020 Workshop on Adversarial Robustness in the Real World
James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, Kayvon Fatahalian
Paper on arXiv, August 2020
Mayee F. Chen*, Daniel Y. Fu*, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré
Paper on arXiv, June 2020
Daniel Y. Fu*, Mayee F. Chen*, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré
ICML 2020
Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James R. Priest, Christopher Ré
NeurIPS 2019
Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
Paper on arXiv, Oct 2019
Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
AI Systems @ SOSP 2019, Oral Presentation
Daniel Y. Fu, Emily S. Wang, Peter M. Krafft, Barbara J. Grosz
International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2018
Peter Kraft, Amos Waterland, Daniel Y. Fu, Anitha Gollamudi, Shai Szulanski, Margo Seltzer
Paper on arXiv, Jul 2018
Daniel Y. Fu
Undergraduate Senior Thesis (Harvard University), 2018
Older Papers
Abolhassan Behrouzvaziri, Daniel Fu, Patrick Tan, Yeonjoo Yoo, Maria V. Zaretskaia, Daniel E. Rusyniak, Yaroslav I. Molkov, Dmitry V. Zaretsky
Published in PLOS ONE on May 20, 2015
Daniel Fu, Patrick Tan, Alexey Kuznetsov, Yaroslav I. Molkov
Published in PLOS ONE on March 25, 2014
Yaroslav Molkov, Daniel Fu, Patrick Tan, Maria Zaretskaia, Dmitry Zaretsky
Poster at Computational Neurosciences (CNS) 2013

Software Projects

Google Internships (Summer 2017, 2016)

In Summer 2017, I conducted a machine learning/data science project for the Google Flights team to try to predict flight arrival times using a mix of historical data and live positional data. I built a few models (including two unique ML models) and trained and evaluated them. Some of these models could outperform official airline estimates in many cases. This exploratory work was moved into production after my internship. In Summer 2016, I did some work on image classification with the Google My Business team (part of Google Maps).

Tamr Internship (Summer 2015)

I was very fortunate to intern at Tamr in summer 2015. I worked on two main projects - augmenting the main product to handle a new use case for a client, and working on Tamr on Google Cloud Platform prior to its launch. I've been told my internship work eventually helped land a client and launch a new product--but what I really took away from the internship was an early love for MLSys!

Ballroom Dance Projects

I did a lot of ballroom dancing in undergrad, and put together a few apps to help me run the team. One of the common problems is matching up new first-year dancers based on their interests (and how much time they want to practice). For the final project of an AI class I took in fall 2016, I helped build a ballroom matchmaker program to intelligently form partnerships in a rookie class. We treated the task as a local search problem with a complex cost function, and we are hopeful that this tool will help make the lives of future HBDT rookie captains much easier. I also wrote a simple Python script to run competitive dance rounds from my laptop. Check out both projects below!
ballroom matchmaker / dance rounds
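
To give a flavor of what local search over partnerships can look like, here is a minimal Python sketch. It is not the matchmaker's actual code: the dancer records, the cost function (practice-hour gap plus a penalty for no shared interests), and the swap-based hill climbing are all simplified assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical dancer records: (name, weekly practice hours, set of interests).
DANCERS = [
    ("A", 2, {"latin"}),
    ("B", 8, {"standard"}),
    ("C", 8, {"latin"}),
    ("D", 2, {"standard"}),
]


def pair_cost(a, b):
    """Illustrative cost: practice-hour gap, plus a penalty if no interests overlap."""
    hours_gap = abs(a[1] - b[1])
    no_overlap_penalty = 0 if a[2] & b[2] else 5
    return hours_gap + no_overlap_penalty


def total_cost(pairs):
    """Sum of the costs of all partnerships."""
    return sum(pair_cost(a, b) for a, b in pairs)


def local_search(dancers):
    """Hill climbing: swap partners between two pairs while that lowers the cost."""
    # Start from an arbitrary pairing (an odd dancer out is simply left unpaired).
    pairs = [(dancers[i], dancers[i + 1]) for i in range(0, len(dancers) - 1, 2)]
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(pairs)), 2):
            (a, b), (c, d) = pairs[i], pairs[j]
            old = pair_cost(a, b) + pair_cost(c, d)
            # The two other ways to pair up these four dancers.
            for candidate in (((a, c), (b, d)), ((a, d), (b, c))):
                new = pair_cost(*candidate[0]) + pair_cost(*candidate[1])
                if new < old:
                    pairs[i], pairs[j] = candidate
                    improved = True
                    break
            if improved:
                break  # restart the scan from the updated pairing
    return pairs


matches = local_search(DANCERS)
```

With this toy cost function the search ends up pairing dancers who want the same practice schedule. A more realistic cost function would weigh many more preferences, and plain hill climbing like this can get stuck in local minima.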


When I was a freshman in high school, I built a sports app for students to report live game results with a writeup. A surprisingly fun project looking back--my high school's first brush with social media!


CS 152: Programming Languages - Spring 2018
Teaching Fellow

course website

CS 61: Systems Programming and Machine Organization - Fall 2015, 2016, 2017
Teaching Fellow

2015 / 2016 / 2017