I’m a PhD student at UC San Diego dedicated to building AI systems capable of robust decision making and human-like perception. Natural selection had to work within a number of constraints to design the human brain, which may or may not be necessary to include as priors in our machine learning models. I believe a principled yet intuitive approach is the way forward: determine which inductive biases from human cognition to bring into today’s best model architectures. Relying on scale alone will not be sufficient to solve issues with robustness, interpretability, and truthfulness, especially in settings where models interact with humans. I’m currently advised by Prof. Zhiting Hu, and collaborating with Prof. Yilun Du at Harvard. Previously I worked with Prof. Bruno Olshausen at UC Berkeley’s Redwood Center for Theoretical Neuroscience, with Prof. Kurt Keutzer at BAIR, and as a founding research scientist at New Theory AI.

Some current fascinations:

  • Using world models for decision making: The ability to predict what will happen in the future implies understanding the world itself. I believe this capability is necessary for making complex decisions in a complex world. How can we build models that learn these world models from data, and use them for decision making?
  • Modeling uncertainty about the future: Instead of drawing a single sample of what the future could look like and conditioning generation on it, can generative models explicitly represent their inherent uncertainty about the future? A simple example: I think my opponent will move left with 75% probability, and right with 25% probability, but not forward or backward. Compared to conditioning only on the most likely sample of moving left, my entire distribution contains more information that can help me decide what to do next, i.e. it can help construct a better world model.
  • Memory-conditioned generation: Today’s RL models can work quite well in a static environment where they are only required to perform one task with a clear goal. Generative models for tasks like video generation suffer from inconsistencies in their time-series outputs. I believe both of these are symptoms of models’ lack of memory. The right latent space needs to be filled with semantically useful data related to the task at hand in order to condition the generation of actions or other output modalities. Specifically, how can a model learn a representation space optimized for storing only what is necessary to remember, and learn to fill its own memory?
  • Compositional generative modeling: Humans understand the world’s factored components and can recombine them to solve new problems never seen before. From a raw dataset, how can these factors be discovered and then composed to solve out-of-distribution generalization problems?
  • Other topics: Emergent hierarchy, visual representations, subjective reality in ML models, evolution, history, geopolitics.
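The opponent-move example above can be made concrete with a minimal sketch: conditioning on only the most likely move discards the uncertainty that the full distribution carries, which we can quantify with its entropy. (The move names and probabilities are just the hypothetical numbers from the example.)

```python
import math

# Hypothetical opponent-move distribution from the example:
# left 75%, right 25%, and zero mass on forward/backward.
dist = {"left": 0.75, "right": 0.25, "forward": 0.0, "backward": 0.0}

# Conditioning on the single most likely sample throws away the rest
# of the distribution:
most_likely = max(dist, key=dist.get)  # "left"

# The full distribution retains extra information a planner can use;
# e.g. its entropy (in bits) measures the remaining uncertainty.
entropy = -sum(p * math.log2(p) for p in dist.values() if p > 0)
# ≈ 0.811 bits here, versus 0 bits for the collapsed point estimate.
```

A world model conditioned on `dist` rather than `most_likely` can, for instance, hedge against the 25% chance of a rightward move instead of planning as if leftward were certain.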

A selection of my favorite books:

A selection of my favorite textbooks: