S3/E1: One Good AI Model and Everyone Forgets How Panic Works

brotherskeleton
Jan 30, 2025

DeepSeek dropped R1 on January 20th, having trained it for a reported six million dollars, and by the 27th had wiped nearly six hundred billion dollars off Nvidia's market cap in a single day. Both reactions, the gaah 'America is finished' crowd and the gaah 'it's a Chinese spy app' crowd, are dead wrong in interesting ways. Today we're looking at the model itself: what it does, how mixture-of-experts architecture works, why the cost claims are real but also complicated, and what it means when a smaller team produces a competitive result on constrained hardware. DeepSeek is a genuine engineering achievement. It is not Skynet. Calm down.

Recent Posts

S3/E7: The Rayners Lane Disappearances Aren't Supernatural.

The police are finally acknowledging that several women have disappeared near Rayners Lane. Within forty-eight hours, the tunnel had a ghost. Within seventy-two hours, the ghost had a Victorian backst

S3/E6: Your Brain Wants That to Be a Portal. It's Lens Flare.

A video of what appears to be a shimmering oval aperture above a car park in Ohio has forty million views. I've watched it eleven times. It's lens flare off a wet surface interacting with a telephoto

S3/E5: God Didn't Make Miracles. A Diffusion Model Did.

TikTok this summer has been wall-to-wall angels. I want to be careful here because I'm not interested in arguing about faith, and that's not what this episode is. This episode is about image artefacts
