2026-05-14-morning

Summary

The morning slot is dominated by a memory and long-context cluster: δ-mem (a lightweight 8x8 associative memory for frozen backbones), Microsoft's KV-Cache compression paper (reposted twice), and Lighthouse Attention (Nous Research). Three more curated reposts land on the agentic-process-failure axis: the multi-agent Bystander Effect paper, the AutoTTS agent that discovers TTS controllers, and the refusal-neurons single-MLP-neuron alignment bypass. NVIDIA fills the AI-account feed with four tweets, including the SAP OpenShell agent runtime and the Snap GPU petabyte pipeline. Standalone signals worth surfacing: Anthropic's Glasswing/Mythos cyber-range results from @bcherny, and a Nous Research Token Superposition Training claim of a 2-3x pretraining speedup with identical inference behavior. Tesla and an @AnatoliKopadze repost are the noise.

Posts

δ-mem: efficient online memory for frozen LLMs (cluster of 2) (@HuggingPapers · @dair_ai · paper). 8x8 associative memory state with delta-rule learning gives 1.31x on MemoryAgentBench and 1.20x on LoCoMo without fine-tuning or backbone replacement. The dair_ai writeup frames it as "one of the more elegant memory mechanisms I've seen this month": the state is coupled directly to attention computation and the readout produces low-rank corrections. Pairs with WriteSAE on the recurrent-state axis (both intervene at the cache write site) and with today's MMProLong on the long-context axis (both bypass context-window inflation).
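For readers who want the delta rule from the δ-mem entry spelled out, here is a toy sketch. Only the 8x8 state size comes from the post. The update form (the classic delta rule, S <- S + lr * (v - S k) k^T), the learning rate, and every function name are assumptions for illustration, not the paper's implementation, and the coupling to attention is not modeled.

```python
# Toy delta-rule associative memory. Illustrative only: the 8x8 size matches
# the post; the update form, learning rate, and all names are assumptions.

def read(state, key):
    """Return state @ key for a square list-of-lists state."""
    return [sum(w * k for w, k in zip(row, key)) for row in state]

def delta_write(state, key, value, lr=1.0):
    """Delta rule: S <- S + lr * (value - S @ key) outer key."""
    pred = read(state, key)                       # what the memory currently recalls
    err = [v - p for v, p in zip(value, pred)]    # prediction error drives the write
    return [[w + lr * e * k for w, k in zip(row, key)]
            for row, e in zip(state, err)]

DIM = 8
state = [[0.0] * DIM for _ in range(DIM)]
key = [1.0] + [0.0] * (DIM - 1)                   # unit-norm key
value = [float(i) for i in range(DIM)]

state = delta_write(state, key, value)
recalled = read(state, key)                       # equals `value` for a unit-norm key
```

Because the write is driven by the error term, writing a new value under the same key overwrites the old association instead of adding to it, which is the property that distinguishes the delta rule from a plain Hebbian outer-product update.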
AutoTTS: agentic discovery of test-time scaling controllers (cluster of 2) (@zhengtoong · @ihtesham2005 · paper · wiki). Both reposts amplify the Google-Meta paper where Claude Code proposes its own TTS controllers, tests them, and refines them based on failures. In 5 rounds the agent discovered a controller with 4 coordinated mechanisms (EMA momentum stopping, coupled width-depth control, alignment-aware depth allocation, conservative branch abandonment). Total discovery cost: $39.9. Cross-references today's DAgger paper and MAP on the long-horizon agent thread.
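The reposts name "EMA momentum stopping" without describing it. One plausible reading, sketched below, is to stop spending test-time compute once an exponential moving average of per-round score gains drops below a threshold. The function name, `alpha`, `min_gain`, and the score sequence are all hypothetical, and this is not the controller from the paper.

```python
# Hypothetical EMA-based stopping rule for test-time scaling rounds.
# Assumption: "momentum" means an EMA over per-round score improvements.

def ema_stop_round(scores, alpha=0.5, min_gain=0.03):
    """Return the 1-based round at which to stop, or len(scores) if never triggered."""
    ema = None
    for i in range(1, len(scores)):
        gain = scores[i] - scores[i - 1]
        # Smooth the gain so one flat round does not trigger a stop on its own.
        ema = gain if ema is None else alpha * gain + (1 - alpha) * ema
        if ema < min_gain:
            return i + 1
    return len(scores)

plateauing = [0.50, 0.60, 0.65, 0.66, 0.66, 0.66]
stop_at = ema_stop_round(plateauing)   # stops once the smoothed gain falls under min_gain
```

The smoothing is the design point: a raw per-round threshold would stop at the first small gain, while the EMA keeps going for a round or two while recent progress still carries weight.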
Multi-agent Bystander Effect / Sovereignty Gap (@dair_ai · paper). 22,500 deterministic trajectories across GAIA, SWE-bench, and Multi-Challenge with three frontier models. Agents frequently compute the correct answer internally, then suppress it to agree with the swarm. The paper formalizes an Interaction Depth Limit and a "lead anchor" non-commutativity finding (the brand identity of the auditor disproportionately dictates swarm integrity). Direct complement to today's AgentLens Lucky Pass paper: both surface process-quality failure modes that pass-rate alone hides.
Lighthouse Attention: removable subquadratic wrapper for long-context training (@omarsar0 · paper). Nous Research. Wraps SDPA with a hierarchical, gradient-free, symmetrical selection layer for queries, keys, and values. The wrapper is removable at the end of training, so the deployed model runs vanilla attention with no architectural cost. Same "asymmetric training, identical inference" pattern as today's Orthrus and the Nous Token Superposition Training claim below.
Token Superposition Training (TST) (@NousResearch). 2-3x wall-clock pretraining speedup at matched FLOPs without changing architecture, optimizer, tokenizer, or training data. Bag-of-tokens prediction in the first third of training, standard NTP thereafter. The deployed model is identical to one trained conventionally. Third paper in one week on the "asymmetric training, identical inference" frame (alongside Lighthouse Attention and Orthrus). Worth tracking whether this pattern unifies into a research direction by end of Q2.
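The two-phase schedule in the TST entry can be sketched as follows. The one-third split comes from the post. The reading of "bag-of-tokens" as an unordered multiset of the next few tokens, the `window` parameter, and all names are assumptions, since the tweet gives no formulation.

```python
from collections import Counter

# Sketch of the two-phase objective schedule described in the post.
# Assumption: the bag-of-tokens target is the unordered multiset of the
# next `window` tokens; the real TST target may be defined differently.

def objective_for_step(step, total_steps):
    """Bag-of-tokens objective for the first third of training, NTP afterwards."""
    bag_phase_steps = total_steps // 3
    return "bag_of_tokens" if step < bag_phase_steps else "next_token"

def bag_target(tokens, position, window=4):
    """Order-free target: counts of the `window` tokens that follow `position`."""
    return Counter(tokens[position + 1 : position + 1 + window])

phase_early = objective_for_step(0, 900)     # inside the first third
phase_late = objective_for_step(300, 900)    # first step after the switch
target = bag_target(["a", "b", "a", "c", "d"], 0, window=3)
```

Under this reading, only the training loss changes between phases. Nothing about the model or tokenizer differs, which matches the post's claim that the deployed model is identical to a conventionally trained one.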
Microsoft KV-Cache compression hype repost (@AiwithYasir). Inflated framing ("Microsoft just solved the context window problem") of a paper on KV-cache compression for long chain-of-thought. The technical substrate is real; the framing is press-release. Adjacent to Make Each Token Count and today's Orthrus on the cache-as-coordination-object axis.
Refusal-neurons: a single MLP neuron bypasses safety alignment (@hamid_kazemi22). Across 7 models, 2 families, and 1.7B-70B scales, suppressing one MLP neuron bypasses refusal behavior. No fine-tuning, no prompt engineering. Concept-neurons (e.g. suicide-related) are identified as a proof of concept. Dense-transformer analogue of today's WriteSAE behavioral install at the cache-write site: single-feature interventions cross both architecture families now.
"Attention Is All You Need V2" / Nested Learning framing (@HowToAI_). Hype-style repost. The underlying paper is likely the Google "HOPE / Nested Learning" architecture from 2026-04-28. The framing significantly overstates the "end of the Transformer era". Click through to the original paper if interested; skip the tweet narrative.
omarsar0 on HTML Artifacts + agents (@omarsar0 · DAIR.AI event). Demo of an HTML+JS artifact backed by Obsidian markdown for agents to read and modify. Practitioner-side signal about where agent UI is heading. Adjacent to Ken Huang's "Agentic AI Harness Pattern" catalog.
Anthropic Mythos / Glasswing cyber results (@bcherny · XBOW evaluation · AISI report). UK AISI confirms Mythos Preview is the first model to solve both of their cyber ranges end-to-end, including "Cooling Tower", which no model had solved before. XBOW's evaluation calls Mythos "a major advance" at finding vulnerability candidates when source is available. AISI's wider report says autonomous AI cyber-task length has been doubling every few months and that recent models exceeded prior trend lines. The doubling-rate question pairs directly with today's AgentLens Lucky Pass finding: how much of cyber-eval doubling is process quality?
NVIDIA + SAP OpenShell agent runtime (@nvidia · NVIDIA blog). SAP embeds NVIDIA OpenShell (open-source secure agent runtime) into SAP Business AI Platform. Isolated execution, policy enforcement at filesystem/network layers, infrastructure-level containment. Adjacent to the deployment-services thread (OpenAI Deployment Company, Google's customer-adoption engineering hires).
NVIDIA + Snap GPU petabyte A/B testing (@nvidia · NVIDIA AI Podcast Ep 298). Snap migrated 10+ petabytes/day of A/B-test data processing to GPU-accelerated Google Cloud pipelines: 76% cost reduction, 80% memory footprint reduction, zero code changes. The GPU-as-cheaper-database pattern.
Opaque article-only reposts (group, click through to read) (@akshay_pachaar · @oneill_c · @amitiitbhu · @AnatoliKopadze · @mem0ai). Five x.com/i/article/ reposts with no text content. Mem0 in the list is most likely about agent memory (mem0ai brand). The Akshay Pachaar one is plausibly about an agent or RAG topic given his usual feed. Skip unless triaging.
Tesla on EVs (@Tesla). Marketing. Skip.