llms-foundation-models · 2026-05-03 · Tier 2

Philosophy-Bench — Frontier Models Diverge on 100 Everyday Ethical Scenarios

Source: The Decoder
Raw: raw/rss/2026-05-03-the-decoder-same-prompt-different-morals-how-frontier-ai-models-div.md
URL: https://the-decoder.com/same-prompt-different-morals-how-frontier-ai-models-diverge-on-ethical-dilemmas/
Date: 2026-05-03
Tier: 2 (alignment evaluation, deployment risk)

TL;DR

A new benchmark runs leading language models through 100 everyday ethical scenarios (data misuse in sales, protocol violations in oncology, etc.). Given the same prompt, the models reach different decisions. The article frames the deeper question as: who decides what an AI is allowed to do, and whose ethics does it follow?

Why this matters

Pairs structurally with Safety Drift After Fine-Tuning (05-02): that paper showed safety is a vector across benchmarks; this benchmark shows ethics is a vector across providers. Both reinforce the no-free-lunch result of the Defense Trilemma (05-02): there is no canonical "safer" or "more ethical" model, only a profile, and that profile depends on the benchmark and the prompt.
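
One way to make the "profile, not a single score" idea concrete is a pairwise disagreement table over per-scenario verdicts. The sketch below is illustrative only: the model names, the "comply"/"refuse" labels, and every verdict are made-up placeholders, not results from the benchmark, and the benchmark's actual scoring method is not assumed here.

```python
from itertools import combinations

# Hypothetical toy data: one verdict per scenario per model.
# These values are invented for illustration, not taken from the benchmark.
verdicts = {
    "model_a": ["comply", "refuse", "comply", "refuse"],
    "model_b": ["comply", "comply", "comply", "refuse"],
    "model_c": ["refuse", "refuse", "comply", "comply"],
}


def disagreement_rate(a, b):
    """Fraction of scenarios on which two models give different verdicts."""
    if len(a) != len(b):
        raise ValueError("both models must be scored on the same scenarios")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_disagreement(verdicts):
    """Map each unordered model pair to its disagreement rate."""
    return {
        (m1, m2): disagreement_rate(verdicts[m1], verdicts[m2])
        for m1, m2 in combinations(sorted(verdicts), 2)
    }


rates = pairwise_disagreement(verdicts)
for (m1, m2), rate in rates.items():
    print(f"{m1} vs {m2}: {rate:.2f}")
```

The output is a table of pairwise distances rather than a ranking: no single number orders the three models, which is the sense in which the note calls ethics a vector across providers.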

Connections

  • Safety Drift After Fine-Tuning (05-02) — same vector-not-scalar framing for safety.
  • Defense Trilemma + NP-hardness (05-02) — formal underpinning for why there is no single right answer.
  • AISN #72 (05-01) — public sentiment on AI ethics; this benchmark is the empirical companion.