AI Engineer World’s Fair 2026 — Session Index

Timestamped index of the main-stage livestreams. Each row links to where the speaker is introduced. Timestamps & summaries derived from the video transcripts.

Full video · 30 segments

🌅 Morning Keynotes ≈ 9:00–10:35 AM PT

Time | Speaker | Session | Watch | Summary
0:00:00 | | Stream open / walk-in | ▶ 0:00:00 | Pre-show walk-in music before the program begins.
0:01:08 | Host (Allie Howe, Keycard) | Welcome & opening | ▶ 0:01:08 | The main-stage MC opens Day 1 of AI Engineer World's Fair and welcomes the biggest-ever gathering of AI engineers before handing to swyx.
0:01:27 | swyx (AI Engineer, co-founder) | The Highest Loop | ▶ 0:01:27 | swyx frames the entire day around “loops”: his Loopcraft idea that AI engineering is about choosing which loop you work in, going up a level for scale and down a level for reliability. This sets up the day's “software factories” theme.
0:11:20 | Pablo Castro (Microsoft) | On AI and Knowledge (Foundry) | ▶ 0:11:20 | Castro breaks knowledge into intrinsic (in the model), extrinsic (retrieved) and learned, tracing the productivity curve from IntelliSense to Copilot to agents. He demos Microsoft Foundry's “agent optimizer,” which builds a real learning loop from an agent's own traces to capture each organization's differentiated knowledge.
0:28:37 | Romain Huet & Alexander Embiricos (OpenAI) | The Golden Age of AI Engineering | ▶ 0:28:37 | OpenAI argues AI didn't end engineering but returned it to its problem-solving roots: “AI engineers are eating the world.” They note the accelerating release cadence (from ~15 months to ~6 weeks between models) and, instead of a live demo, bring on a special guest.
0:48:07 | Peter Steinberger (OpenClaw → OpenAI) | ↳ Guest keynote (within OpenAI's slot) | ▶ 0:48:07 | The OpenClaw “ClawFather,” now at OpenAI, describes going from babysitting 10 terminal windows to managing one long-running “manager” agent that delegates to a team, enabled by server-side compaction, coordination and triggers. His theme: the bottleneck keeps moving, from tokens to compute to human attention.
0:57:16 | Zixuan Li (Z.ai, remote) | GLM-5.2: Frontier Intelligence, Open Weights | ▶ 0:57:16 | Joining remotely, Li introduces GLM-5.2 and explains that “GLM” (General Language Model) dates to a 2021 paper, making Z.ai one of the earliest large-model labs. He positions GLM as a top open-weights model that's strong well beyond coding.
1:11:05 | Thom Wolf (Hugging Face) × Olive Song (MiniMax) | Keynote conversation | ▶ 1:11:05 | Hugging Face's Thom Wolf interviews MiniMax's Olive Song about M3, a ~400B-parameter (20B active) open model that also understands vision. They dig into its 1M-token context and new MiniMax Sparse Attention (MSA) architecture, framing MiniMax among China's top open “AI dragons.”
1:33:18 | Randall Diggs (Snyk) | Security Track intro | ▶ 1:33:18 | Diggs gives a brief intro to the conference's first Security Track and the shift from traditional app-sec to agentic security. He frames three obstacles (insecure AI-generated code, safely deploying autonomous agents, and the geopolitics of model access) and points people to the track. (The printed schedule listed Manoj Nair for this slot.)

🏭 Software Factories — late-morning sessions ≈ 10:36 AM–12:25 PM PT · interleaved with expo/demo talks

Time | Speaker | Session | Watch | Summary
1:37:42 | Tisha & Sushin | Reproducing agent failures in production | ▶ 1:37:42 | A two-person talk on the one thing you lose when an agent misbehaves in production: reproducibility. Using a broker-API example where an agent sells 1,000 shares instead of $1,000 worth of shares, they show why turning temperature to zero doesn't make a broken reasoning path debuggable.
1:45:33 | Kushan | Browser agents | ▶ 1:45:33 | Kushan (ex-founding engineer at Sohm) argues browser agents underperform because the infra around the model is poor, not the model itself. He demos a compressed page representation that lets a cheaper model plan long action sequences and recover from failures far faster. (This window also carries other short demos.)
2:10:44 | Tereza Tížková (Factory) | Rise of the Software Factory | ▶ 2:10:44 | Tížková defines the “software factory” as the whole autonomous software lifecycle (collecting signals, prioritizing, orchestrating, validating and continuously improving), not just code generation. She argues writing code is the easy part, a swarm of coding agents isn't a factory, and organizations must rebuild from the ground up rather than bolt one on.
2:37:01 | Burak (Mutagent) | The Agentic AI Engineer | ▶ 2:37:01 | Mutagent applies the build-loop idea to building agents themselves: an offline loop (iterate, test, evaluate, improve) and an online loop (monitor production traces, diagnose, feed back). Their thesis: doing this loop by hand doesn't scale to hundreds of agents, so the loop itself should run agentically.
2:47:00 | Charlie Holtz | Orchestras, not Factories (Conductor) | ▶ 2:47:00 | Holtz argues for conducting an orchestra of agents rather than running an assembly line. Using his Conductor tooling he covers a central “feed the beast” database of all company context (Slack, Discord, meetings) exposed via a SQL tool, “free-range” sandboxed agents, and carefully-authored AGENTS.md / CLAUDE.md files.
3:05:10 | Daksh Gupta (Greptile) | What we learned analyzing 1M+ AI-generated PRs | ▶ 3:05:10 | As reviewer (not author) of code across tens of thousands of teams, Gupta shares what he found in over a million AI-generated pull requests. He covers how hard it is to even identify fully “vibe-coded” PRs and argues the future is agents simulating users to validate code, rather than humans reviewing endless slop.
3:17:17 | Amole (Nori Agentic) | Coding agents for slides, docs & video | ▶ 3:17:17 | Nori's CEO argues coding agents can do far more than write code, including producing visual artifacts. His insight: don't hand agents human tools like PowerPoint or Figma; give them the right medium (HTML), so a model that looks bad at spatial tasks (Simon Willison's pelican-SVG test) can build good decks end-to-end.
3:24:11 | Zion | 10X: Reimagining the mobile dev workflow | ▶ 3:24:11 | A mobile engineer with 14 years of experience asks why the promised 10x from AI agents hasn't materialized. Using the factory-electrification analogy (real gains came only when factories were redesigned around small distributed motors, not by simply swapping out the steam engine), he argues we must redesign the whole workflow, not bolt agents onto the old one.

🍽 Lunch / interstitial programming ≈ 12:33–1:30 PM PT

Time | Speaker | Session | Watch | Summary
3:34:11 | Gergely Orosz × Simon Eskildsen (Turbopuffer) | Technical fireside chat | ▶ 3:34:11 | Pragmatic Engineer author Gergely Orosz hosts a deep-technical fireside with Turbopuffer founder/CEO Simon Eskildsen, from his origin story (PowerPoint, FrontPage, WoW-fueled English) to Turbopuffer's CPU-first architecture. Includes a comedic Jensen Huang “do you vape?” anecdote and why CPUs are surprisingly scarce at the hyperscalers.

🔧 Afternoon sessions ≈ 1:30–4:05 PM PT · interleaved with sponsor lightning talks

Time | Speaker | Session | Watch | Summary
4:31:51 | Kevin Hou (Google Antigravity) | Get Out of the Model's Way | ▶ 4:31:51 | Hou, who leads engineering on Google's Antigravity coding product, argues you should “get out of the model's way,” like giving Messi the ball. He walks through Antigravity 2.0 decoupling the IDE from a standalone agent manager (subagents, worktrees, scheduled tasks, voice) and the principle of “scaling with intelligence.”
4:55:23 | Zach Lloyd (Warp) | Self-Improving Software Factories | ▶ 4:55:23 | Warp's founder, an engineer of 20 years who hasn't written code in six months, argues software engineering is becoming “factory engineering,” moving from chat/autocomplete to interactive agents to full automation. He covers open-sourcing Warp (60k+ stars, 800k+ developers) and building self-improving factories.
5:17:25 | Gabe (OpenGov) | OG Assist | ▶ 5:17:25 | OpenGov engineer Gabe demos “OG Assist,” an AI assistant embedded across OpenGov's government ERP products (budgeting, procurement, permitting). Agents make tool calls against product data and can read/act on the current screen, backed by automated evals in CI and deterministic human-in-the-loop approval gates.
5:26:30 | Ido Salomon | We're the Bottleneck (But We Don't Have To Be) | ▶ 5:26:30 | Salomon (creator of AgentCraft and MCP-UI) argues humans, not models, are now the bottleneck, because steering and reviewing many agents is exhausting. His fix borrows from gaming: AgentCraft is an RTS/Sims-style orchestrator that represents each agent as a unit you can spawn and supervise.
5:50:16 | Sarah Sachs (Notion) | Token Town | ▶ 5:50:16 | Notion's AI engineering lead talks “Token Town”: building AI-native products sustainably so you don't go from “AI-pilled to AI-poor.” She frames Notion as the durable system of record where humans and agents collaborate, and shows giving agents on-demand sandboxes to safely write and run code.
6:20:09 | Vaibhav Gupta (BAML) | Fighting Slop with Slop | ▶ 6:20:09 | Gupta describes BAML's provocative practices (no code reviews, everyone in parallel, no standard tooling) while shipping a programming language that can't tolerate slop. His answer: a tiny, stable architecture.md (not CLAUDE.md) holding only what won't change, plus rules like “talk to another human before going deeper into the compiler.”
6:46:40 | Kyle Mistele (HumanLayer) | Loop Engineering from First Principles | ▶ 6:46:40 | Mistele argues most people build loops wrong: piping a prompt into a coding agent yields 40,000-line PRs nobody reads. Riffing on Geoffrey Huntley's “Ralph” and Peter Steinberger's loop philosophy, he lays out loop engineering for real-world teams with real customers, regulatory obligations and SLAs.

🌆 Afternoon Keynotes ≈ 4:30–5:30 PM PT

Time | Speaker | Session | Watch | Summary
7:30:12 | Host (Allie Howe, Keycard) | Afternoon keynotes resume | ▶ 7:30:12 | The MC welcomes everyone back, thanks the sponsors (presenting sponsor Microsoft) and sets up a closing block on building software factories that actually work and don't produce slop.
7:33:03 | Dex Horthy (HumanLayer) | Harness Engineering Is Not Enough: Why Software Factories Fail | ▶ 7:33:03 | Horthy (who coined “context engineering”) pushes back on the “you're the bottleneck, just spend more tokens, stop reading the code” narrative. Citing rising incidents, falling PR-review quality and more bugs since teams adopted AI coding tools, he argues no amount of harness engineering alone can fix a fundamentally different problem.
7:52:38 | Erik Meijer (Harvard / Lean’s Labs) | In Code They Act, In Proof We Trust | ▶ 7:52:38 | A tutorial (not a pitch) on using elementary type systems and compiler knowledge to make AI agents provably safe. Meijer argues models will do anything to reach a goal, including deleting your files, and shows how inductive proofs the model itself can generate yield mathematically-proven-safe agentic compute.
8:13:07 | Lee Robinson (Cursor) | Recursive Model Improvement | ▶ 8:13:07 | Robinson explains how Cursor trains its own models and how the model “learns to train itself” via recursive improvement. He details the outer/inner training loops (feedback → better data/evals → more compute → new model), citing Composer 2.5 as Cursor's most popular model and teasing a notable new one soon.
8:33:00 | Host (Allie Howe, Keycard) | Closing | ▶ 8:33:00 | Allie Howe closes Day 1: the raw materials for software factories now exist (bigger context, better memory, vision, verification, agent security), but teams need the discipline to wield them. Echoing Dex's talk, she adds that engineering as a practice is not dead.