
Andrej Karpathy

I like to train deep neural nets on large datasets 🧠🤖💥

github

github blog

medium

bear blog

history

2024 —

Founder · Eureka Labs

I am the founder of Eureka Labs. I recently elaborated on its vision on the Dwarkesh podcast. While work on Eureka continues, I create educational videos on AI on my YouTube channel. There are two tracks.

General audience track:

Deep Dive into LLMs like ChatGPT — on under-the-hood fundamentals of LLMs.
How I use LLMs — a more practical guide, with examples of how I use them in my own life.
Intro to Large Language Models — a third, parallel, video from a longer time ago.

Technical track: Follow the Zero to Hero playlist.

For all the latest, I spend most of my time on 𝕏/Twitter or GitHub.

2023 — 2024

Returned to OpenAI

I came back to OpenAI where I built a new team working on midtraining and synthetic data generation.

2017 — 2022

Director of AI · Tesla

I was the Director of AI at Tesla, where I led the computer vision team of Tesla Autopilot and (very briefly) Tesla Optimus. My team handled all in-house data labeling, neural network training and deployment on Tesla's custom inference chip. Today, the Autopilot increases the safety and convenience of driving, but the team's goal is to make Full Self-Driving a reality at scale. See Aug 2021 Tesla AI Day for more.

2015 — 2017

Founding member · OpenAI

I was a research scientist and a founding member at OpenAI.

2011 — 2015

PhD · Stanford

My PhD was focused on convolutional/recurrent neural networks and their applications in computer vision, natural language processing and their intersection. My adviser was Fei-Fei Li at the Stanford Vision Lab and I also had the pleasure of working with Daphne Koller, Andrew Ng, Sebastian Thrun and Vladlen Koltun along the way during the first-year rotation program.

I designed and was the primary instructor for the first deep learning class at Stanford, CS 231n: Convolutional Neural Networks for Visual Recognition. The class became one of the largest at Stanford, growing from 150 students enrolled in 2015 to 330 in 2016 and 750 in 2017.

Along the way I squeezed in 3 internships: at (baby) Google Brain in 2011, working on large-scale unsupervised learning from videos; at Google Research in 2013, working on large-scale supervised learning on YouTube videos; and finally at DeepMind in 2015, working on the deep reinforcement learning team with Koray Kavukcuoglu and Vlad Mnih.

2009 — 2011

MSc · UBC

MSc at the University of British Columbia where I worked with Michiel van de Panne on learning controllers for physically-simulated figures (i.e., machine-learning for agile robotics but in a physical simulation).

2005 — 2009

BSc · University of Toronto

BSc at the University of Toronto with a double major in computer science and physics and a minor in math. This is where I first got into deep learning, attending Geoff Hinton's class and reading groups.

bio

Andrej Karpathy is an AI researcher and founder of Eureka Labs, focused on modernizing education in the age of AI. He previously served as the Director of AI at Tesla and was a founding member of OpenAI. During his PhD at Stanford, he was the architect and lead instructor of the first deep learning course at Stanford (CS231n), which has become one of its most popular classes.

featured talks

Dwarkesh podcast 2025

YC AI Startup School 2025

GPU Mode 2024

No Priors podcast 2024

UC Berkeley AI Hackathon 2024

State of GPT @ Microsoft Build 2023 slides →

Lex Fridman podcast 2022

Robot Brains podcast with Pieter Abbeel 2021

Tesla AI Day 2021

AI for Full Self-Driving @ CVPR 2021

AI for Full Self-Driving @ ScaledML 2020

Tesla Autonomy Day 2019

Multi-Task Learning in the Wilderness @ ICML 2019

PyTorch at Tesla @ PyTorch DevCon 2019

Building the Software 2.0 stack @ Spark-AI 2018

2017 RE•WORK Summit with Nathan Benaich

2017 "Heroes of Deep Learning" with Andrew Ng

2017 Deep RL Bootcamp with Pieter Abbeel et al

2016 Bay Area Deep Learning School: CNNs

Deep Learning Workshop @ CVPR 2016

RE•WORK Deep Learning Summit 2016

NVIDIA GTC Keynote 2015 with Jensen Huang

teaching

I have a YouTube channel, where I post lectures on LLMs and AI more generally.

In 2015 I designed and was the primary instructor for the first deep learning class at Stanford, CS 231n: Convolutional Neural Networks for Visual Recognition ❤️. The class became one of the largest at Stanford, growing from 150 students enrolled in 2015 to 330 in 2016 and 750 in 2017.

featured writing

I have three blogs 🤦‍♂️. This GitHub blog is my oldest one. I then briefly and sadly switched to my second blog on Medium. I now have a Bear blog. Here is the collection of posts across all three:

Feb 2026 microgpt github
Dec 2025 2025 LLM Year in Review bear
Dec 2025 Chemical hygiene bear
Dec 2025 Auto-grading decade-old Hacker News discussions with hindsight bear
Nov 2025 The space of minds bear
Nov 2025 Verifiability bear
Oct 2025 Animals vs Ghosts bear
Apr 2025 Vibe coding MenuGen bear
Apr 2025 Power to the people: How LLMs flip the script on technology diffusion bear
Mar 2025 Finding the Best Sleep Tracker bear
Mar 2025 The append-and-review note bear
Mar 2025 Digital hygiene bear
Sep 2024 I love calculator bear
Mar 2022 Deep Neural Nets: 33 years ago and 33 years from now github
Jun 2021 A from-scratch tour of Bitcoin in Python github
Mar 2021 Short Story on AI: Forward Pass github
Jun 2020 Biohacking Lite github
Apr 2019 A Recipe for Training Neural Networks github
Nov 2017 Software 2.0 medium
May 2017 AlphaGo, in context medium
May 2017 ICML accepted papers institution stats medium
Apr 2017 A Peek at Trends in Machine Learning medium
Mar 2017 ICLR 2017 vs arxiv-sanity medium
Jan 2017 Virtual Reality: still not quite there, again. medium
Dec 2016 Yes you should understand backprop medium
Sep 2016 A Survival Guide to a PhD github
Nov 2015 Short Story on AI: A Cognitive Discontinuity github
Nov 2015 CS183c Assignment #3 medium
May 2015 The Unreasonable Effectiveness of Recurrent Neural Networks github
Sep 2014 What I learned from competing against a ConvNet on ImageNet github
Oct 2012 The state of Computer Vision and AI: we are really, really far away github

pet projects

This list is a bit outdated; see my GitHub for my up-to-date projects.

./micrograd

micrograd is a tiny scalar-valued autograd engine (with a bite! :)). It implements backpropagation (reverse-mode autodiff) over a dynamically built DAG and a small neural networks library on top of it with a PyTorch-like API.
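
As a rough illustration of the idea described above (a minimal sketch written for this page, not micrograd's actual source), reverse-mode autodiff over a dynamically built DAG can look something like the following in Python. The `Value` class, its attribute names, and the small set of supported operations are assumptions made for the example.

```python
import math


class Value:
    """A scalar that records how it was computed (illustrative sketch only)."""

    def __init__(self, data, _children=(), _op=""):
        self.data = float(data)
        self.grad = 0.0                    # d(output)/d(self), filled in by backward()
        self._prev = set(_children)        # parent nodes in the computation DAG
        self._op = _op                     # label of the op that produced this node
        self._backward = lambda: None      # local chain-rule step; no-op for leaves

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), "+")

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), "*")

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    # Allow plain numbers on the left-hand side, e.g. 2 * x or 1 + x.
    __radd__ = __add__
    __rmul__ = __mul__

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,), "tanh")

        def _backward():
            # d tanh(a)/da = 1 - tanh(a)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the DAG so each node is processed only after
        # every node that depends on it has passed its gradient back.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for parent in v._prev:
                    build(parent)
                topo.append(v)

        build(self)
        self.grad = 1.0                    # seed: d(self)/d(self) = 1
        for v in reversed(topo):
            v._backward()


x, w = Value(2.0), Value(-3.0)
z = x * w + x          # x is used twice, so its gradient accumulates
z.backward()
print(z.data, x.grad, w.grad)  # prints -4.0 -2.0 2.0
```

Gradients are accumulated with `+=` rather than assigned, so that a value feeding into several downstream nodes (like `x` here) receives the sum of all its contributions.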

./char-rnn

char-rnn was a Torch character-level language model built out of LSTMs/GRUs/RNNs. Related to this also see the Unreasonable Effectiveness of Recurrent Neural Networks blog post, or the minimal RNN gist.

./arxiv-sanity

arxiv-sanity tames the overwhelming flood of papers on arXiv. It allows researchers to discover relevant papers, search/sort by similarity, see recent/popular papers, and get recommendations. Deployed live at arxiv-sanity.com. My obsession with meta research involved many more projects over the years, e.g. see pretty NIPS 2020 papers, research lei, scholaroctopus, and biomed-sanity. My most recent from-scratch rewrite, arxiv-sanity-lite, is much better.

./neuraltalk2

neuraltalk2 was an early image captioning project in (lua)Torch. Also see our later extension with Justin Johnson to dense captioning.

./imagenet-ref

I am sometimes jokingly referred to as the reference human for ImageNet because I competed against an early ConvNet on categorizing images into 1,000 classes. This required a bunch of custom tooling and a lot of learning about dog breeds. See the blog post "What I learned from competing against a ConvNet on ImageNet". Also a Wired article.

./convnetjs

ConvNetJS is a deep learning library written from scratch entirely in JavaScript. This enables nice web-based demos that train convolutional neural networks (or ordinary ones) entirely in the browser. Many web demos included. I did an interview with Data Science Weekly about the library and some of its back story. Also see my later followups such as tSNEJS, REINFORCEjs, recurrentjs, and GANs in JS.

./ulogme

How productive were you today? How much code have you written? Where did your time go? For a while I was really into tracking my productivity, and since I didn't like that RescueTime uploads your (very private) computer usage statistics to a cloud, I wrote my own privacy-first tracker: ulogme! That was fun.

./misc

I built a lot of other random stuff over time. Rubik's cube color extractor, predator-prey neuroevolutionary multiagent simulations, more of those, sketcher bots, games for computer game competitions #1, #2, #3, random computer graphics things, Tetris AI, multiplayer co-op Tetris, etc.

publications

2017

World of Bits: An Open-Domain Platform for Web-Based Agents Tianlin (Tim) Shi, Andrej Karpathy, Linxi (Jim) Fan, Jonathan Hernandez, Percy Liang

ICML 2017

2017

PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma, Yaroslav Bulatov

ICLR 2017

2016

Connecting Images and Natural Language (PhD thesis) Andrej Karpathy

Thesis

2016

DenseCap: Fully Convolutional Localization Networks for Dense Captioning Justin Johnson*, Andrej Karpathy*, Li Fei-Fei

CVPR 2016 · oral

2016

Visualizing and Understanding Recurrent Networks Andrej Karpathy*, Justin Johnson*, Li Fei-Fei

ICLR 2016 workshop

2015

Deep Visual-Semantic Alignments for Generating Image Descriptions Andrej Karpathy, Li Fei-Fei

CVPR 2015 · oral

2015

ImageNet Large Scale Visual Recognition Challenge Russakovsky, Deng, Su, Krause, Satheesh, Ma, Huang, Karpathy, Khosla, Bernstein, Berg, Fei-Fei

IJCV 2015

2014

Deep Fragment Embeddings for Bidirectional Image-Sentence Mapping Andrej Karpathy, Armand Joulin, Li Fei-Fei

NIPS 2014

2014

Large-Scale Video Classification with Convolutional Neural Networks Karpathy, Toderici, Shetty, Leung, Sukthankar, Fei-Fei

CVPR 2014 · oral

2013

Grounded Compositional Semantics for Finding and Describing Images with Sentences Richard Socher, Andrej Karpathy, Quoc V. Le, Christopher D. Manning, Andrew Y. Ng

TACL 2013

2013

Object Discovery in 3D scenes via Shape Analysis Andrej Karpathy, Stephen Miller, Li Fei-Fei

ICRA 2013

2012

Emergence of Object-Selective Features in Unsupervised Feature Learning Adam Coates, Andrej Karpathy, Andrew Ng

NIPS 2012

2012

Curriculum Learning for Motor Skills Andrej Karpathy, Michiel van de Panne

AI 2012

2011

Locomotion Skills for Simulated Quadrupeds Stelian Coros, Andrej Karpathy, Benjamin Jones, Lionel Reveret, Michiel van de Panne

SIGGRAPH 2011

Also on Google Scholar

misc unsorted

Neural Networks: Zero To Hero lecture series
My first blog, my second blog and my current blog.
I like sci-fi. I enumerated and sorted sci-fi books I've read here.
Justin Johnson and I held a reading group on Clubhouse. See it on YouTube or as a podcast.
Loss function Tumblr :D! My collection of funny loss functions.
Some advice for undergrads and advice for those considering or pursuing a PhD.
New York Times article covering my PhD image captioning work.
t-SNE visualization of CNN codes for ImageNet, pretty!
A long time ago I was really into Rubik's Cubes. I learned to solve them in about 17 seconds and then, frustrated by the lack of learning resources, created YouTube videos explaining the speedcubing methods. There's also my long-dead cubing page. And a video of me at a Rubik's cube competition :)
0 frameworks were used to make this simple responsive website because I am becoming seriously allergic to 500-pound websites. This one is pure HTML and CSS in two static files and that's it.