Wiki Pages

Concept pages first, then summaries newest-to-oldest.

concept page
Agent Evaluation & Benchmarks
concept page
Agent Memory
concept page
GUI Agents
concept page
Multi-Agent Systems
concept page
Tool Use & Function Calling
2026-05-19 · Tier 2
AI for Auto-Research: Roadmap & User Guide
2026-05-18 · Tier 2
Look Before You Leap: Autonomous Exploration for LLM Agents
2026-05-18 · Tier 2
MMSkills: Multimodal Skills for General Visual Agents
2026-05-18 · Tier 2
PAGER: Bridging the Semantic-Execution Gap in Point-Precise Geometric GUI Control
2026-05-18 · Tier 2
Solvita: Agentic Evolution for Competitive Programming
2026-05-17 · Tier 2
LIFE Survey: Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems
2026-05-16 · Tier 2
FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale
2026-05-16 · Tier 2
SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks
2026-05-15 · Tier 2
Agent Memory Cluster: STALE + Preping + EvolveMem + MemEye + MemLens + BOOKMARKS
2026-05-15 · Tier 2
EvoEnv: Self-Evolving Reasoning RL via Verifiable Environment Synthesis
2026-05-15 · Tier 2
Orchard: Open-Source Agentic Modeling Framework — 67.5% SWE-bench Verified at 30B
2026-05-15 · Tier 2
SDAR: Self-Distilled Agentic Reinforcement Learning
2026-05-15 · Tier 2
WildClawBench: Native-Runtime Long-Horizon Agent Benchmark — Claude Opus 4.7 Tops Out at 62.2%
2026-05-14 · Tier 2
AgentLens: the Lucky Pass problem in SWE-agent evaluation
2026-05-14 · Tier 2
Context Training with Active Information Seeking
2026-05-14 · Tier 2
Revisiting DAgger in the era of LLM agents
2026-05-14 · Tier 2
MAP: a Map-then-Act paradigm for long-horizon interactive agents
2026-05-13 · Tier 2
Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks
2026-05-13 · Tier 2
LLM Agents Already Know When to Call Tools — Even Without Reasoning (Probe&Prefill)
2026-05-13 · Tier 2
LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues
2026-05-13 · Tier 2
Useful Memories Become Faulty When Continuously Updated by LLMs
2026-05-12 · Tier 2
X-OmniClaw: Unified Mobile Agent for Multimodal Understanding and Interaction
2026-05-11 · Tier 2
AutoTTS: Agentic Discovery for Test-Time Scaling
2026-05-10 · Tier 2
Jiayi Weng: Learning Beyond Gradients
2026-05-10 · Tier 3
OncoAgent: Dual-tier multi-agent framework for privacy-preserving oncology decision support
2026-05-09 · Tier 2
AI Co-Mathematician
2026-05-09 · Tier 2
Auto Research with Specialist Agents Develops Effective and Non-Trivial Training Recipes
2026-05-09 · Tier 2
Beyond Semantic Similarity: Direct Corpus Interaction (DCI)
2026-05-09 · Tier 2
Skill Curation Cluster: StraTA, Skill1, SkillOS
2026-05-07 · Tier 2
BRIGHT-Pro and RTriever-4B: Reasoning-Intensive Retrieval for Agentic Search
2026-05-07 · Tier 2
MedSkillAudit: Domain-Specific Audit Framework for Medical Research Agent Skills
2026-05-07 · Tier 2
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents
2026-05-05 · Tier 2
AcademiClaw: When Students Set Challenges for AI Agents
2026-05-05 · Tier 2
Ctx2Skill: From Context to Skills — Self-Evolving Multi-Agent Skill Extraction
2026-05-05 · Tier 2
PhysicianBench: Evaluating LLM Agents in Real-World EHR Environments
2026-05-05 · Tier 2
T^2PO: Token- and Turn-Level Policy Optimization for Stable Multi-Turn Agentic RL
2026-05-04 · Tier 1
Why Your Agentic AI Pentester Is Probably Just a Fancy Scanner — Ken Huang
2026-05-04 · Tier 3
LWD — Learning While Deploying: Fleet-Scale RL for Generalist Robot Policies
2026-05-02 · Tier 1
Ken Huang Ch 15 — Structured Output and Schema-Constrained Generation
2026-05-01 · Tier 2
Ara: Agent-Native Research Artifacts
2026-05-01 · Tier 2
Claw-Eval-Live: Live Agent Benchmark for Evolving Real-World Workflows
2026-05-01 · Tier 2
Eywa: Heterogeneous Scientific Foundation Model Collaboration
2026-05-01 · Tier 2
InteractWeb-Bench: Multimodal Agents under Non-Expert User Instructions
2026-05-01 · Tier 2
Intern-Atlas: A Methodological Evolution Graph
2026-05-01 · Tier 2
MCP Integration: Claude Code vs Hermes (Ken Huang Ch 13)
2026-05-01 · Tier 2
Synthetic Computers at Scale: Long-Horizon Productivity Simulation
2026-04-30 · Tier 2
ClawGym: A Scalable Framework for Building Effective Claw Agents
2026-04-25 · Tier 2
Claude Code Memory Systems: Chapter 8 Analysis
2026-04-23 · Tier 2
Claude Code vs. Hermes Agent: Permission System Architectures
2026-04-23 · Tier 2
Persistent Agent Infrastructure: Kimi K2.6, OpenAI Agent Studio, Anthropic Conway
2026-04-22 · Tier 2
AgentSPEX: An Agent Specification and Execution Language
2026-04-22 · Tier 2
SimpleTES: Evaluation-Driven Scaling for Scientific Discovery
2026-04-22 · Tier 2
HuggingFace ml-intern: Open-Source Agentic Post-Training Loop
2026-04-21 · Tier 2
Precise Debugging Benchmark: Models Regenerate, They Don't Debug
2026-04-21 · Tier 2
Reward-Free Self-Evolution: Agents That Learn Without Being Told What to Learn
2026-04-20
GTA-2: Benchmarking General Tool Agents from Atomic Use to Open-Ended Workflows
2026-04-20
PRL-Bench: LLMs on Frontier Physics Research
2026-04-20
Chapter 3: The Query/Agent Loop — Claude Code vs. Hermes Agent
2026-04-19 · Tier 2
Claude Code Architecture: A Deep Reading
2026-04-19 · Tier 2
UniDoc-RL: RL-Based Visual RAG with Hierarchical Actions
2026-04-18 · Tier 2
Corpus2Skill: Don't Retrieve, Navigate
2026-04-18 · Tier 2
DR3-Eval: Realistic Benchmark for Deep Research Agents
2026-04-17 · Tier 2
Dive into Claude Code: Architecture Analysis
2026-04-17 · Tier 2
SuperLocalMemory V3.3: Biologically-Inspired Agent Memory
2026-04-16 · Tier 2
DefenseClaw, MAESTRO, and the Security Boundary Agentic AI Has Been Missing
2026-04-16 · Tier 2
Do AI Coding Agents Log Like Humans? An Empirical Study
2026-04-16 · Tier 2
Exploration and Exploitation Errors Are Measurable for Language Model Agents
2026-04-16 · Tier 2
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models
2026-04-16 · Tier 2
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
2026-04-16 · Tier 2
UI-Copilot: Long-Horizon GUI Automation via Tool-Integrated Policy Optimization
2026-04-16 · Tier 2
Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents
2026-05-17 · Tier 2
Open Artifacts #21: The May 2026 Open-Model Wave and the CAISI / ECI Gap
2026-05-13 · Tier 2
Anthropic overtakes OpenAI in B2B adoption for the first time (Ramp data)
2026-05-13 · Tier 2
Recursive emerges from stealth with $650M for self-improving AI
2026-05-10 · Tier 3
Broadcom won't build OpenAI's custom chip without Microsoft buying 40 percent
2026-05-08 · Tier 1
Anthropic ↔ Colossus 1 Deal: Capacity Crunch + Brand Risk
2026-05-08 · Tier 1
GitHub Reliability Crisis: AI Load Breaks the Platform
2026-05-08 · Tier 2
Lambert: Notes from inside China's AI labs
2026-05-04 · Tier 3
Anthropic + OpenAI Both Build Services Companies Around Their AI
2026-05-03 · Tier 3
Microsoft VS Code Auto-Inserts "Co-Authored-by Copilot" Even With AI Off
2026-05-03 · Tier 1
Xiaomi MiMo-V2.5-Pro — Open-Weight Long-Horizon Coding at 40-60% Fewer Tokens
2026-05-02 · Tier 3
ChatGPT Tracks Free Users for Ads by Default
2026-05-01 · Tier 3
AISN #72 — CAIS AI-Wellbeing Research, Public-Sentiment Decline, OpenAI Releases
2026-05-01 · Tier 3
Anthropic Launches Claude Security — Defensive Cyber Productization
2026-05-01 · Tier 3
Chinese AI Startups Onshoring: Moonshot, StepFun Dissolving Offshore Structures
2026-05-01 · Tier 3
UK AISI: GPT-5.5 Matches Claude Mythos on Full Network Attack Simulation
2026-05-01 · Tier 2
Marcus: "The Greatest Capital Misallocation in History?"
2026-05-01 · Tier 3
Pentagon Signs Eight Tech Giants for AI-First Fighting Force; Anthropic Excluded
2026-05-01 · Tier 2
Pragmatic Engineer: AI Load Breaks GitHub; Anthropic's Trust Speedrun
2026-05-01 · Tier 1
SemiAnalysis: AI Value Capture — The Shift to Model Labs
2026-04-30 · Tier 3
Zig's Anti-LLM Contribution Policy and the "Contributor Poker" Argument
2026-04-29 · Tier 3
Claude for Creative Work: MCP Connectors for Blender, Adobe, Ableton, Autodesk
2026-04-29 · Tier 3
UCP Wins the Agentic Commerce Governance Layer
2026-04-22 · Tier 3
Amazon-Anthropic $33B Deal and AI Capital Concentration Week
2026-04-18 · Tier 3
OpenAI Executive Departures and Product Restructuring (April 2026)
2026-04-17 · Tier 3
Anthropic's Mythos Model: Government Access and the Trust Debate
2026-04-17 · Tier 3
Claude's Explosive Market Share Surge (April 2026)
concept page
Knowledge Distillation
concept page
KV Cache
concept page
Speculative Decoding
2026-05-19 · Tier 1
CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection
2026-05-19 · Tier 1
EndPrompt: Efficient Long-Context Extension via Terminal Anchoring
2026-05-19 · Tier 1
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation
2026-05-19 · Tier 2
Measuring Maximum Activations in Open Large Language Models
2026-05-19 · Tier 1
PUMA: Semantic-Preserving Early Exit for Reasoning Models
2026-05-19 · Tier 1
SNLP: Layer-Parallel Inference via Structured Newton Corrections
2026-05-19 · Tier 1
ZEDA: Post-Trained MoE Can Skip Half Experts via Self-Distillation
2026-05-18 · Tier 1
FashionChameleon: Training-Free KV Cache Rescheduling for Interactive Video Customization
2026-05-18 · Tier 1
HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts
2026-05-17 · Tier 1
MTP support merged into llama.cpp: Strix Halo benchmarks confirm a 2x decode speedup at 27B, mixed result at 35B
2026-05-17 · Tier 1
Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention (Raschka)
2026-05-16 · Tier 1
ATESD: Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning
2026-05-16 · Tier 1
Lighthouse Attention: Long-Context Pre-Training as a Detachable Wrapper
2026-05-15 · Tier 1
Asynchronous Continuous Batching: CPU-GPU Overlap via Dual Buffer Slots
2026-05-15 · Tier 1
Forcing-KV: Hybrid KV Cache Compression for Autoregressive Video Diffusion
2026-05-14 · Tier 1
The Extrapolation Cliff: a closed-form clip-safety threshold for on-policy distillation
2026-05-14 · Tier 1
MinT: managed infrastructure for million-scale LoRA training and serving
2026-05-14 · Tier 2
MMProLong: training long-context vision-language models with generalization beyond 128K
2026-05-14 · Tier 1
Orthrus: dual-view diffusion + autoregressive on a shared KV cache
2026-05-13 · Tier 1
δ-mem: Efficient Online Memory for Large Language Models
2026-05-13 · Tier 1
FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning
2026-05-13 · Tier 1
The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes
2026-05-13 · Tier 1
Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle
2026-05-13 · Tier 1
Token Superposition Training (TST): Efficient Pre-Training with Token Superposition
2026-05-12 · Tier 1
Make Each Token Count: Improving Long-Context Performance with Learned KV Eviction
2026-05-11 · Tier 1
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention
2026-05-11 · Tier 1
MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference
2026-05-11 · Tier 1
UniPrefill: Universal Long-Context Prefill Acceleration via Block-wise Dynamic Sparsification
2026-05-09 · Tier 1
EMO: Pretraining Mixture of Experts for Emergent Modularity
2026-05-09 · Tier 1
MiA-Signature: Approximating Global Activation for Long-Context Understanding
2026-05-09 · Tier 1
UniPool: A Globally Shared Expert Pool for Mixture-of-Experts
2026-05-07 · Tier 1
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models
2026-05-07 · Tier 1
LIVEditor: Lightning Unified Video Editing via In-Context Sparse Attention (ISA)
2026-05-07 · Tier 1
Stream-R1: Reliability-Perplexity Aware Reward Distillation for Streaming Video Generation
2026-05-07 · Tier 1
Stream-T1: Test-Time Scaling for Streaming Video Generation
2026-05-05 · Tier 1
MotionCache: Motion-Aware Caching for Efficient Autoregressive Video Generation
2026-05-04 · Tier 1
The Distillation Panic — Nathan Lambert (Interconnects AI)
2026-05-02 · Tier 1
FlashRT: Efficient Red-Teaming for Prompt Injection and Knowledge Corruption
2026-05-02 · Tier 1
Nemotron 3 Nano Omni: Efficient Open Multimodal Intelligence
2026-05-01 · Tier 1
LenVM: Token-Level Length Value Model
2026-05-01 · Tier 1
RoundPipe: Efficient Training on Multiple Consumer GPUs
2026-04-30 · Tier 1
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
2026-04-30 · Tier 1
Tide: Cross-Architecture Distillation for Diffusion Large Language Models
2026-04-22 · Tier 1
PrfaaS: Prefill-as-a-Service via Cross-Datacenter KV Cache Transfer
2026-04-22 · Tier 1
SDVG: Speculative Decoding for Autoregressive Video Generation
2026-04-22 · Tier 1
ShadowPEFT: Centralized Layer-Space Parameter-Efficient Fine-Tuning
2026-04-22 · Tier 1
TurboQuant: Online Vector Quantization for KV Cache Compression
2026-04-21 · Tier 1
Nemotron 3 Super: Hybrid Mamba-Attention MoE at NVFP4
2026-04-20
1D Ordered Tokens Enable Efficient Test-Time Search
2026-04-20
AccelOpt: Self-Improving LLM Agent for AI Accelerator Kernel Optimization
2026-04-20
AVR: Adaptive Visual Reasoning for Efficient VRMs
2026-04-20
Maximal Brain Damage: Disrupting Neural Networks via Sign-Bit Flips
2026-04-20
STOP: Super Token for Path Pruning in Parallel Reasoning
2026-04-20
W-RAC: Web Retrieval-Aware Chunking for Cost-Efficient RAG
2026-04-18 · Tier 1
LongAct: Harnessing Intrinsic Activation Patterns for Long-Context RL
2026-04-18 · Tier 1
Switch-KD: Visual-Switch Knowledge Distillation for VLMs
2026-04-18 · Tier 1
TESSY: Teacher-Student Cooperation Framework for SFT Data Synthesis
2026-04-17 · Tier 1
Cross-Tokenizer LLM Distillation via Byte-Level Interface
2026-04-17 · Tier 1
KV Packet: Recomputation-Free Context-Independent KV Caching
2026-04-17 · Tier 1
Model Capability Dominates: Lessons from AIMO 3 Inference-Time Optimization
2026-04-16 · Tier 1
TIP: Token Importance in On-Policy Distillation
concept page
Reinforcement Learning for LLMs
2026-05-19 · Tier 2
DiHAL: Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement
2026-05-19 · Tier 2
MixSD: Mixed Contextual Self-Distillation for Knowledge Injection
2026-05-19 · Tier 2
NGM: A Plug-and-Play Training-Free Memory Module for LLMs
2026-05-18 · Tier 2
AIRA-Compose and AIRA-Design: Agentic Discovery of Neural Architectures
2026-05-18 · Tier 2
CIPO: Correction-Oriented Policy Optimization with Verifiable Rewards
2026-05-18 · Tier 2
NudgeRL: Strategy-Guided Exploration for RLVR
2026-05-15 · Tier 2
Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Reasoning
2026-05-15 · Tier 2
SU-01: Gold-Medal Olympiad Reasoning at 30B via Simple and Unified Scaling
2026-05-14 · Tier 2
Many-Shot CoT-ICL: long context as structured curriculum, not retrieval buffer
2026-05-13 · Tier 2
Reward Hacking in Rubric-Based Reinforcement Learning
2026-05-12 · Tier 2
G-Zero: Self-Play for Open-Ended Generation from Zero Data
2026-05-12 · Tier 2
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training
2026-05-12 · Tier 2
Model Merging Scaling Laws in Large Language Models
2026-05-12 · Tier 2
Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR
2026-05-12 · Tier 2
Soohak: Mathematician-Curated Research-Level Math Benchmark
2026-05-10 · Tier 2
Gowers + ChatGPT 5.5 Pro: PhD-level math research in under two hours
2026-05-09 · Tier 2
Balanced Aggregation: Understanding and Fixing Aggregation Bias in GRPO
2026-05-09 · Tier 2
Prescriptive Scaling Laws for Data Constrained Training
2026-05-09 · Tier 1
TIDE: Every Layer Knows the Token Beneath the Context
2026-05-08 · Tier 2
ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning
2026-05-08 · Tier 2
When to Think, When to Speak: Learning Disclosure Policies for LLM Reasoning
2026-05-04 · Tier 1
Import AI 455: AI Systems Are About to Start Building Themselves — Jack Clark
2026-05-04 · Tier 2
Themis — Robust Multilingual Code Reward Models for Multi-Criteria Scoring
2026-05-03 · Tier 1
Ken Huang — World Models, Architectures, and the Next Phase of AI
2026-05-03 · Tier 2
Marcus — Have LLMs Improved Patient Outcomes?
2026-05-03 · Tier 2
MIT Study — Superposition Explains Why Scaling Language Models Works So Reliably
2026-05-03 · Tier 2
Philosophy-Bench — Frontier Models Diverge on 100 Everyday Ethical Scenarios
2026-05-02 · Tier 2
ARC-AGI-3 — Three Systematic Reasoning Errors in Frontier Models
2026-05-02 · Tier 2
Compliance vs Sensibility: Reasoning Controllability in LLMs
2026-05-02 · Tier 1
The Defense Trilemma + NP-Hardness of Reward Hacking Detection
2026-05-02 · Tier 2
Safety Drift After Fine-Tuning: Evidence from High-Stakes Domains
2026-05-01 · Tier 2
CoPD: Co-Evolving Policy Distillation
2026-04-30 · Tier 2
A Survey on LLM-Based Conversational User Simulation
2026-04-28 · Tier 2
Hope Architecture: Nested Learning and Continuously Adapting LLMs
2026-04-24 · Tier 2
DeepSeek V4: Architecture and Industry Impact
2026-04-24 · Tier 2
GPT-5.5: Launch Analysis and System Card Deep Dive
2026-04-22 · Tier 2
Chain-of-Thought Degrades Visual Spatial Reasoning
2026-04-22 · Tier 2
Target-Oriented Pretraining via Neuron-Activated Graph (NAG)
2026-04-22 · Tier 2
TEMPO: Scaling Test-Time Training for Large Reasoning Models
2026-04-22 · Tier 2
Weight Disentanglement and Task Arithmetic: OrthoReg
2026-04-21 · Tier 2
Geometric Canary: Steerability and Drift Detection from Representational Geometry
2026-04-21 · Tier 2
GFT: SFT is Degenerate Policy Gradient — and Group Fine-Tuning Fixes It
2026-04-21 · Tier 2
When Does RLVR Generalize? Reward Saturation and Reasoning Faithfulness
2026-04-19 · Tier 2
ASGuard: Mechanistic Defense Against Targeted Jailbreaking
2026-04-19 · Tier 2
Value Gradient Flow: RL as Optimal Transport
2026-04-18 · Tier 2
C2: Cooperative-Critical Rubric-Augmented Reward Modeling
2026-04-16 · Tier 2
InfiniteScienceGym: Procedurally-Generated Benchmark for Scientific Analysis
2026-04-16 · Tier 2
My Bets on Open Models, Mid-2026
2026-04-16 · Tier 2
From P(y|x) to P(y): Reinforcement Learning in Pre-train Space
2026-05-17 · Tier 2
CurveBench: Hierarchical Topological Reasoning from Visual Input
2026-05-12 · Tier 2
Auto-Rubric as Reward (ARR): From Implicit Preferences to Explicit Multimodal Generative Criteria
2026-05-12 · Tier 2
DeltaRubric: Generative Multimodal Reward Modeling via Joint Planning and Verification
2026-05-12 · Tier 2
ROMA: Reinforcing Multimodal Reasoning Against Visual Degradation
2026-05-07 · Tier 3
APEX: Aesthetic-Informed Popularity Prediction for AI-Generated Music
2026-05-07 · Tier 3
HERMES++: Unified Driving World Model for 3D Scene Understanding and Generation
2026-05-07 · Tier 3
JoyAI-Image: Awaking Spatial Intelligence in Unified Multimodal Understanding and Generation
2026-05-07 · Tier 3
Parameter-Efficient Multi-View Proficiency Estimation
2026-05-07 · Tier 3
PhysForge: Physics-Grounded 3D Asset Generation
2026-05-07 · Tier 3
RLDX-1: VLA Robotic Policy for Dexterous Humanoid Manipulation
2026-05-04 · Tier 4
AnalogRetriever — Cross-Modal Representations for Analog Circuit Retrieval
2026-05-04 · Tier 3
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer
2026-05-04 · Tier 3
GenLIP — Generative Language-Image Pre-training for ViTs
2026-05-04 · Tier 4
Map2World — Segment Map Conditioned Text-to-3D World Generation
2026-05-04 · Tier 3
UniVidX — Unified Multimodal Framework for Versatile Video Generation
2026-05-02 · Tier 3
Nemotron 3 Nano Omni — Efficient Open Multimodal Intelligence (NVIDIA)
2026-05-02 · Tier 3
Semi-DPO: Learning from Noisy Preferences via Semi-Supervised DPO
2026-05-02 · Tier 3
ViPO: Visual Preference Optimization at Scale
2026-05-01 · Tier 3
Edit-R1: Verifier-Based RL for Image Editing
2026-05-01 · Tier 3
FD-loss: Representation Fréchet Loss for Visual Generation
2026-05-01 · Tier 3
PhyCo: Controllable Physical Priors for Generative Motion
2026-05-01 · Tier 3
Visual Generation in the New Era: Atomic to Agentic World Modeling
2026-04-30 · Tier 3
Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
2026-04-30 · Tier 3
FASH-iCNN: Editorial Fashion Identity via Multimodal CNN Probing
2026-04-30 · Tier 3
GLM-5V-Turbo: Native Foundation Model for Multimodal Agents
2026-04-30 · Tier 3
X-WAM: Unified 4D World Action Modeling with Asynchronous Denoising
2026-04-20
Qwen3.5-Omni Technical Report
2026-04-16 · Tier 3
GameWorld: Standardized and Verifiable Evaluation of Multimodal Game Agents
2026-04-16 · Tier 3
MERRIN: Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments
2026-04-16 · Tier 3
RationalRewards: Reasoning Rewards Scale Visual Generation at Training and Test Time
2026-04-16 · Tier 3
Seedance 2.0: Advancing Video Generation for World Complexity