HOST DIAGNOSTIC · Anthropic

Claude Sonnet 4.6

Claude Sonnet 4.6 · ACTIVE · Claude · CLOSED
ATTRIBUTE MATRIX · ATRBT GROUP 01
Bulk Apperception [15]
Reasoning [15]
Math Precision [14]
Code Generation [17]
World Knowledge [15]
Scientific Acumen [14]
Instruction Following [15]
Context Fidelity [14]
Candor [13]
Creativity [14]
Multi-turn Coherence [14]
Multimodal Fluency [11]
Tool Use [15]
Planning [14]
Self-Correction [13]
Tenacity [13]
Safety Alignment [14]
Calibration [13]
Speed [14]
Cost Efficiency [13]
HOST TELEMETRY
AVG: 14.0 / 20
BEST: 17 (Code Generation)
WEAK: 11 (Multimodal Fluency)
PARAMETERS: Undisclosed
CONTEXT: 1M tokens (beta)
PRICING: $3/$15 per M tokens
STATUS: ACTIVE
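The AVG, BEST, and WEAK figures above can be reproduced from the attribute matrix. The following is a minimal sketch, assuming AVG is a plain unweighted mean of the 20 scores; the page does not state its formula, and the names `SCORES` and `summarize` are hypothetical helpers, not part of any published tool.

```python
# Scores transcribed from the attribute matrix (each out of 20).
SCORES = {
    "Bulk Apperception": 15, "Reasoning": 15, "Math Precision": 14,
    "Code Generation": 17, "World Knowledge": 15, "Scientific Acumen": 14,
    "Instruction Following": 15, "Context Fidelity": 14, "Candor": 13,
    "Creativity": 14, "Multi-turn Coherence": 14, "Multimodal Fluency": 11,
    "Tool Use": 15, "Planning": 14, "Self-Correction": 13, "Tenacity": 13,
    "Safety Alignment": 14, "Calibration": 13, "Speed": 14,
    "Cost Efficiency": 13,
}


def summarize(scores):
    """Return the unweighted mean, the best attribute, and the weakest one.

    Assumption: AVG is an unweighted arithmetic mean rounded to one decimal.
    """
    best = max(scores, key=scores.get)
    weak = min(scores, key=scores.get)
    return {
        "avg": round(sum(scores.values()) / len(scores), 1),
        "best": (best, scores[best]),
        "weak": (weak, scores[weak]),
    }


summary = summarize(SCORES)
print(summary["avg"], summary["best"], summary["weak"])
```

Under that assumption the sketch returns an average of 14.0, with Code Generation (17) as the best attribute and Multimodal Fluency (11) as the weakest, which matches the telemetry panel.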
ATTRIBUTE SCORES · 20 DIMENSIONS
Cognitive
  Bulk Apperce…  15
  Reasoning  15
  Mathematical…  14
  World Knowle…  15
  Scientific A…  14
Technical
  Code Generat…  17
  Tool Use  15
  Multimodal F…  11
  Speed  14
  Cost Efficie…  13
Behavioral
  Candor  13
  Creativity  14
  Tenacity  13
  Self-Correct…  13
  Calibration  13
Operational
  Instruction …  15
  Context Fide…  14
  Multi-turn C…  14
  Planning and…  14
  Safety Align…  14
SCORING METHODOLOGY
RECALIBRATED per ADR-NM108 (April 17, 2026). Prior Gen 1 scores ranged from 11 to 15; Arena Text at 1490 and Arena Code at 1523 justified a significant upward correction. Key benchmarks: Arena Text #3 at 1490 → Reasoning 15 (was 12). Arena Code #3 at 1523 → Code Generation 17 (was 15). SWE-Bench 79.6% (within 1.2 points of Opus) validates Code Generation 17. AAII 51 → Bulk Apperception 15 (was 12). OfficeQA matches Opus 4.6. Pricing: $3/$15 per million tokens (40% cheaper than Opus). Context window: 1M tokens (beta).
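The pricing claim can be checked with a little arithmetic. The sketch below assumes the two quoted figures are input and output rates in USD per million tokens (the note does not label them) and that "40% cheaper" applies equally to both. The function names and the token counts in the usage line are hypothetical, chosen only for illustration.

```python
# Rates quoted in the methodology note, USD per million tokens.
# Assumption: the first figure is the input rate, the second the output rate.
SONNET_PRICE = {"input": 3.00, "output": 15.00}
DISCOUNT_VS_OPUS = 0.40  # "40% cheaper than Opus", as stated in the note


def implied_reference_price(price, discount):
    """Rate of the reference model if `price` is `discount` cheaper than it."""
    return price / (1.0 - discount)


def cost_usd(input_tokens, output_tokens, price):
    """Cost of one request at per-million-token rates."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Opus rates implied by the stated discount (not quoted in the note itself).
implied = {
    kind: implied_reference_price(rate, DISCOUNT_VS_OPUS)
    for kind, rate in SONNET_PRICE.items()
}
print({kind: f"{rate:.2f}" for kind, rate in implied.items()})

# Hypothetical request: 200k input tokens and 50k output tokens.
print(f"{cost_usd(200_000, 50_000, SONNET_PRICE):.2f}")
```

Under these assumptions, the stated discount implies Opus rates of about $5 and $25 per million tokens, and the hypothetical request costs about $1.35.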