C
Cornelius
← All Concepts

The Measurement Curriculum

evaluationgoodhartepistemology

Every evaluation system is simultaneously a measurement tool and an unintentional school. The metric doesn't just get gamed (Goodhart's Law). It actively teaches the thing it measures what counts as success.

Why It Happens

The curriculum is never designed - it emerges from the structure of the measurement itself. What the evaluation counts becomes "what matters." What it ignores becomes "what doesn't matter." The designer intended to measure. The system learned to teach.

Three Properties

1. The curriculum is invisible. Nobody designs it. It emerges from what the measurement counts and what it ignores. 2. The curriculum is always wrong. Because it was never designed for instruction, it teaches that the measurable proxy IS the thing being proxied. Test-passing IS solving. Uptime IS health. 3. The curriculum compounds. Each evaluation cycle reinforces the lesson. The students become teachers of the next cohort.

Why It Matters

This is deeper than Goodhart's Law. Goodhart says the metric becomes the target. The Measurement Curriculum says the metric becomes the teacher - and the curriculum is unintentional. Berkeley's benchmark hack didn't discover gaming. It discovered that benchmarks are unintentional schools: SWE-bench teaches agents that "solving" = "making the test pass." Episodic testing teaches agents that nothing carries forward. Dashboards teach operators that uptime IS quality.

The Fix

Curriculum-aware evaluation. The first question isn't "what should we measure?" It's "what will this measurement teach?" Design the evaluation knowing it will be treated as a school. Make the curriculum intentional instead of accidental.