Sunday, May 10, 2026 · social stream

Media Live

daily roll-up

Summary

A quiet day with one real signal cluster. @MillionInt's amplification of Jiayi Weng's "Learning Beyond Gradients" essay carried both substantive slots, first as a "bearish on RL" framing in the morning, then sharpened in the afternoon with a follow-up arguing that where a simple heuristic already solves the task, RL finds and overfits to that heuristic instead of generalizing. The underlying result is the standout: Codex iterates a pure NumPy and cv2 closed-loop policy for VizDoom D3 Battle from raw pixels and public game variables, with no neural net, no map, and no seed-specific routes, and it wins. Tesla contributed three notes (photon-count FSD reconstruction and crash-data-driven airbag timing in the afternoon, plus the end of Model S and X production at Fremont across both slots). The first two are interesting as deployed-AI engineering but adjacent to wiki priorities; the third has no AI content. Everything else (Grok Imagine promo, Altman-firing musical, architecture aside, physics-of-companies aphorism) is noise. No @bayesiansapien retweets, no Gmail-grade signal.

Posts

Learning Beyond Gradients: heuristics as the next paradigm after RLVR? (@MillionInt · blog) [morning + afternoon] (cluster of 2). Jiayi Weng had Codex iterate a pure NumPy and cv2 heuristic for VizDoom D3 Battle from screen pixels and public game variables. The argument: coding agents flatten the maintenance curve on hand-written rules enough that programmatic policies become continual-learning vehicles without weight updates. @MillionInt amplified it as a bearish take on RL. See Jiayi Weng Deep Dive and the daily digest.
Follow-up on where RL actually fails to generalize (@MillionInt) [afternoon]. Clarifies the bearish post: if a heuristic simple enough for Codex solves the game, RL will find and overfit to that heuristic and fail to generalize. Reframes the VizDoom result less as "RL is dead" and more as "this benchmark was always heuristic-shaped."
Tesla photon-count image reconstruction for FSD (@Tesla) [afternoon]. RGB next to a photon-count reconstruction, pitched as why FSD sees through glare and at night. Interesting as low-light sensor-fusion engineering, marketing-shaped with no method specifics.
Tesla crash-data-driven restraint deployment timing (@Tesla) [afternoon]. Wes Morrill describes replaying fleet crash data in simulation to sweep airbag and pretension timing, finding earlier deployment improves occupant kinematics. Mature data flywheel turning fleet telemetry into a control policy.
End of Model S and Model X production at Fremont (@Tesla) [morning + afternoon] (cluster of 2). Industrial milestone, no AI content. Skip.
Laws-of-physics-of-companies aphorism (@MillionInt) [afternoon]. One-line take on outsized returns requiring tech-driven bending of industry "laws". Skip.
Sam-Altman-firing texts as a musical (@MillionInt) [afternoon]. Repost of @dgrreen turning the Altman/Murati 2023 firing texts into a Hamilton-style number. Curio. Skip.
Grok Imagine weekend-plans promo (@imagine) [afternoon]. Generated-content showcase. Skip.
Cathedral of Learning aside (@magicsilicon) [afternoon]. Off-topic architecture musing. Skip.

slot detail

Evening

scraped 2026-05-10 22:00 IST · 1 tweet

Summary

Sparse evening slot with a single signal. Tesla announced a coast-to-coast FSD Supervised demonstration: 2,833 miles New York City to Los Angeles in 49:55:57 with zero human interventions on FSD v14.3.2, claiming an ~8.5 hour improvement over the previous cannonball record. No curated reposts, no research feed activity. Treat this as a marketing milestone with engineering substance underneath, but the only public artifact is a tweet, so the technical claim is unverifiable from this slot alone. Worth noting against the broader thread of end-to-end driving stacks scaling without intervention, but nothing to act on tonight.

Posts

Tesla FSD v14.3.2 NYC to LA cannonball, zero interventions (@Tesla · driver thread). 2,833 miles in 49:55:57 with zero disengagements, claimed as a new FSD cannonball record beating the prior by ~8.5 hours. Headline number is the zero-intervention claim, not the time. Useful as a deployment data point on end-to-end driving policies if the route logs ever become public.
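
The headline numbers above imply an average speed and a prior record time that the post does not state directly. A minimal arithmetic check, assuming the elapsed time is wall-clock (so any stops are included) and taking the "~8.5 hour" margin at face value; the helper name `hms_to_hours` is illustrative, not from the source:

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'HH:MM:SS' string to fractional hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


MILES = 2833                      # claimed NYC to LA distance
ELAPSED_H = hms_to_hours("49:55:57")  # claimed elapsed time
CLAIMED_MARGIN_H = 8.5            # "~8.5 hours" per the post, approximate

# Average speed over the whole run, assuming wall-clock time.
avg_mph = MILES / ELAPSED_H

# Prior record implied by the claimed margin (derived, not stated in the post).
implied_prior_h = ELAPSED_H + CLAIMED_MARGIN_H

print(f"elapsed: {ELAPSED_H:.2f} h")
print(f"average: {avg_mph:.1f} mph")
print(f"implied prior record: about {implied_prior_h:.1f} h")
```

The average works out to roughly 57 mph, which is why the zero-intervention claim, not the time, is the number worth tracking.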

Afternoon

scraped 2026-05-10 16:40 IST · 0 tweets

Summary

Quiet slot dominated by one substantive thread: Jiayi Weng's "Learning Beyond Gradients" essay, surfaced by @MillionInt with a bearish framing on RL. A coding-agent-iterated NumPy+cv2 heuristic policy solves VizDoom D3 Battle without a neural network, prompting the claim that coding agents may turn maintainable heuristics into a serious post-RLVR paradigm. @MillionInt's follow-up sharpens the point: where a simple heuristic already gets you most of the way, RL tends to find and overfit to that heuristic rather than generalize, so the heuristic-policy result is more an indictment of the benchmark than a refutation of RL. Tesla supplies three operational notes (final Model S/X off the Fremont line, photon-count image reconstruction for FSD night vision, crash-data-driven airbag deployment timing). The latter two are interesting as deployed-AI engineering but adjacent to wiki priorities; the first has no AI content. Grok Imagine and an architecture aside round out the noise.

Posts

Learning Beyond Gradients: heuristics as the next paradigm after RLVR? (@MillionInt · blog). Jiayi Weng had Codex iterate a closed-loop pure NumPy+cv2 heuristic for VizDoom D3 Battle using only screen pixels and public game variables. No neural network, no map, no seed-specific routes. It works. The argument: hand-written rules were never useless, they were too expensive to maintain, and coding agents flatten that maintenance curve enough that programmatic policies become continual-learning vehicles without weight updates. @MillionInt frames this as bearish for RL.
Follow-up: where RL actually fails to generalize (@MillionInt). Clarifies the bearish post: if a heuristic simple enough for Codex solves the game, RL will find and overfit to that heuristic, and won't generalize. The bottleneck is environments where simple heuristics get you far. Useful caveat. It reframes the VizDoom result less as "RL is dead" and more as "this benchmark was always heuristic-shaped."
Tesla photon-count image reconstruction for FSD (@Tesla). Tesla shows the human-perceived RGB next to its photon-count reconstruction, claimed to explain why FSD sees through extreme glare and at night. Interesting as a sensor-fusion / low-light vision pipeline detail, though the post is marketing-shaped and gives no method specifics.
Tesla crash-data-driven restraint deployment timing (@Tesla). Wes Morrill describes replaying real fleet crash data in simulation, sweeping airbag and seat-belt-pretension timing, and finding that earlier deployment improves occupant kinematics. Strong example of a mature data flywheel turning fleet telemetry into a control-policy improvement, even if not directly an AI-modeling advance.
End of Model S and Model X production at Fremont (@Tesla). Industrial milestone. No AI content.
Laws-of-physics-of-companies aphorism (@MillionInt). One-line take that outsized returns require bending industry-specific "laws" via tech innovation. Skip.
Sam-Altman-firing texts as a musical (@MillionInt). Repost of @dgrreen turning the Altman/Murati 2023 firing texts (now Musk v. OpenAI trial evidence) into a Hamilton-style number. Curio, not signal. Skip.
Grok Imagine weekend-plans promo (@imagine). Generated-content showcase. Skip.
Cathedral of Learning aside (@magicsilicon). Off-topic architecture musing. Skip.
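
For the Learning Beyond Gradients item above, it may help to see what a "closed-loop heuristic policy from pixels and game variables" looks like in shape. The sketch below is not Weng's policy: the color thresholds, the action names, the `ammo` variable, and every function name are assumptions made up for illustration, and cv2 is left out to keep the example dependency-light. It only shows the pattern of hand-written, editable rules mapping each frame to an action with no learned weights.

```python
import numpy as np


def find_target_column(frame: np.ndarray, min_pixels: int = 20):
    """Return the mean column of strongly red pixels, or None if too few.

    'Strongly red' is a made-up stand-in for whatever cue a real policy
    would use to spot an enemy on screen.
    """
    r = frame[..., 0].astype(int)
    g = frame[..., 1].astype(int)
    b = frame[..., 2].astype(int)
    mask = (r > 150) & (r - g > 60) & (r - b > 60)
    if mask.sum() < min_pixels:
        return None
    cols = np.nonzero(mask)[1]  # column index of every masked pixel
    return float(cols.mean())


def choose_action(frame: np.ndarray, ammo: int, dead_zone: float = 0.05) -> str:
    """Map one observation (pixels plus one game variable) to an action."""
    if ammo <= 0:
        return "move_forward"   # hypothetical rule: go look for pickups
    col = find_target_column(frame)
    if col is None:
        return "turn_right"     # nothing visible: keep scanning
    width = frame.shape[1]
    offset = (col - (width - 1) / 2) / width  # negative means left of center
    if offset < -dead_zone:
        return "turn_left"
    if offset > dead_zone:
        return "turn_right"
    return "attack"             # target roughly centered


def rollout(frames, ammo: int):
    """Closed loop over a sequence of frames: observe, decide, repeat."""
    return [choose_action(frame, ammo) for frame in frames]


def blank_frame_with_blob(col_start: int) -> np.ndarray:
    """Build a 120x160 black RGB frame with a 10x10 red square (test helper)."""
    frame = np.zeros((120, 160, 3), dtype=np.uint8)
    frame[50:60, col_start:col_start + 10, 0] = 255
    return frame


frames = [blank_frame_with_blob(20), blank_frame_with_blob(75),
          np.zeros((120, 160, 3), dtype=np.uint8)]
print(rollout(frames, ammo=10))
```

The point of the pattern is that every rule is a few readable lines, so a coding agent can revise thresholds or add branches between episodes, which is the maintenance-cost argument the essay makes.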

Morning

scraped 2026-05-10 09:08 IST · 2 tweets

Summary

A near-empty morning slot. Two tweets total, no @bayesiansapien retweets, no Gmail-grade signal. The single substantive item is @MillionInt amplifying Jiayi Weng's blog post "Learning Beyond Gradients", framed as "mostly a bearish take on RL", in which Codex iterates a pure NumPy + cv2 heuristic policy for VizDoom D3 Battle with no neural net training. Today's digest treats this as a Tier 2 Deep Dive against the RL-substrate thread. The other tweet is Tesla announcing the end of Model S and Model X production at Fremont, off-topic for AI.

Posts

Jiayi Weng: Learning Beyond Gradients (@MillionInt reposting · blog). Codex iterates a NumPy + cv2 closed-loop heuristic policy for VizDoom D3 Battle from raw pixels and public game variables, with no neural network training, no map, no object coordinates, and no seed-specific routes. Weng frames iterated heuristic learning as a candidate next paradigm after pretraining and RL/RLVR. The amplifier framed it as a bearish take on RL. Today's Jiayi Weng Deep Dive and digest Connecting the Dots cover the substrate question alongside the Gowers + ChatGPT 5.5 Pro result.
End of Model S and Model X production (@Tesla). Production farewell post, no AI content. Skip.