2026-05-10 — cere-bro

Summary

A quiet day with one real signal cluster. @MillionInt's amplification of Jiayi Weng's "Learning Beyond Gradients" essay carried both substantive slots, first as a "bearish on RL" framing in the morning, then sharpened in the afternoon with a follow-up arguing that RL only appears to lose where simple heuristics already solve the task. The underlying result is the standout: Codex iterates a pure NumPy and cv2 closed-loop policy for VizDoom D3 Battle from raw pixels, no neural net, no map, no seed-specific routes, and wins. Tesla contributed three deployed-AI notes in the afternoon (photon-count FSD reconstruction, crash-data-driven airbag timing, end of Model S and X at Fremont) that are interesting as engineering but adjacent to wiki priorities. Everything else (Grok Imagine promo, Altman-firing musical, architecture aside, physics-of-companies aphorism) is noise. No @bayesiansapien retweets, no Gmail-grade signal.

Posts

Learning Beyond Gradients: heuristics as the next paradigm after RLVR? (@MillionInt · blog) [morning + afternoon] (cluster of 2). Jiayi Weng had Codex iterate a pure NumPy and cv2 heuristic for VizDoom D3 Battle from screen pixels alone. The argument: coding agents flatten the maintenance curve on hand-written rules enough that programmatic policies become continual-learning vehicles without weight updates. @MillionInt amplified it as a bearish take on RL. See Jiayi Weng Deep Dive and the daily digest.
Follow-up on where RL actually fails to generalize (@MillionInt) [afternoon]. Clarifies the bearish post: if a heuristic simple enough for Codex solves the game, RL will find and overfit to that heuristic and fail to generalize. Reframes the VizDoom result less as "RL is dead" and more as "this benchmark was always heuristic-shaped."
Tesla photon-count image reconstruction for FSD (@Tesla) [afternoon]. RGB next to a photon-count reconstruction, pitched as why FSD sees through glare and at night. Interesting as low-light sensor-fusion engineering, but marketing-shaped, with no method specifics.
Tesla crash-data-driven restraint deployment timing (@Tesla) [afternoon]. Wes Morrill describes replaying fleet crash data in simulation to sweep airbag and seatbelt-pretensioner timing, finding that earlier deployment improves occupant kinematics. A mature data flywheel: fleet telemetry turned into a control policy.
End of Model S and Model X production at Fremont (@Tesla) [morning + afternoon] (cluster of 2). Industrial milestone, no AI content. Skip.
Laws-of-physics-of-companies aphorism (@MillionInt) [afternoon]. One-line take on outsized returns requiring tech-driven bending of industry "laws". Skip.
Sam-Altman-firing texts as a musical (@MillionInt) [afternoon]. Repost of @dgrreen turning the Altman/Murati 2023 firing texts into a Hamilton-style number. Curio. Skip.
Grok Imagine weekend-plans promo (@imagine) [afternoon]. Generated-content showcase. Skip.
Cathedral of Learning aside (@magicsilicon) [afternoon]. Off-topic architecture musing. Skip.
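
Sketch for the Learning Beyond Gradients item above: a rough picture of what a "pure NumPy, closed-loop, pixels-in, action-out" policy can look like. This is not Jiayi Weng's or Codex's code. The red-dominance threshold, the action names, and every parameter are hypothetical, and cv2 is left out so the block stays self-contained. The point is only that the whole policy is a readable function a coding agent can edit between runs, with no weights to update.

```python
import numpy as np

# Hypothetical action names; a real VizDoom setup defines its own buttons.
ACTIONS = ("turn_left", "turn_right", "shoot", "explore")


def red_mask(frame, min_red=150, margin=60):
    """Boolean mask of pixels whose red channel clearly dominates green and blue.

    Assumption for illustration: targets show up as strongly red pixels.
    """
    rgb = frame.astype(np.int16)  # avoid uint8 wrap-around on subtraction
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r >= min_red) & (r - g >= margin) & (r - b >= margin)


def heuristic_policy(frame, min_pixels=20, aim_tolerance=0.1):
    """Map one RGB frame of shape (H, W, 3) to an action name.

    Called once per frame, so the loop closes through the game itself:
    act, observe the next frame, act again.
    """
    mask = red_mask(frame)
    if mask.sum() < min_pixels:
        return "explore"  # nothing target-like in view
    cols = np.nonzero(mask)[1]  # column index of every matching pixel
    width = frame.shape[1]
    # Horizontal offset of the blob centroid from screen centre, as a fraction of width.
    offset = (cols.mean() - (width - 1) / 2) / width
    if offset < -aim_tolerance:
        return "turn_left"
    if offset > aim_tolerance:
        return "turn_right"
    return "shoot"
```

Under this framing, "learning" means the agent rewrites thresholds and branches after watching episodes, which is the maintenance work the essay argues coding agents make cheap.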