Definition
Success criteria are explicit descriptions of what students must do, produce, or demonstrate to show they have met a learning objective. Where a learning objective names the destination ("students will understand the water cycle"), success criteria describe the evidence of arrival: what a student can say, write, draw, explain, or perform that proves understanding has been reached.
The distinction matters. Objectives live at the level of the teacher's planning intention. Criteria live at the level of the student's observable output. When criteria are clear, students can monitor their own progress during a task, self-correct before submission, and seek targeted help rather than vague reassurance. When criteria are absent or left implicit, only the teacher knows what "good" looks like — a structural disadvantage that disproportionately affects students who lack the cultural capital to decode academic expectations on their own.
Success criteria can be product-based (describing a finished artifact), process-based (describing the steps or strategies used), or attitudinal (describing collaborative behaviors or habits of mind). In practice, most effective lessons combine at least two types.
Historical Context
The systematic use of success criteria in classrooms grew from the assessment-for-learning movement that accelerated through the late 1990s. Paul Black and Dylan Wiliam's 1998 review "Inside the Black Box," published by King's College London, synthesized evidence across 250 studies and concluded that formative assessment practices, including making learning goals transparent to students, produced effect sizes between 0.4 and 0.7, among the highest of any educational intervention studied at that scale.
Shirley Clarke, a UK-based researcher and practitioner who collaborated extensively with Black and Wiliam, did the most to operationalize success criteria as a classroom practice. Her 2001 book Unlocking Formative Assessment and subsequent Formative Assessment in Action (2005) gave teachers concrete frameworks for writing, sharing, and co-constructing criteria with students. Clarke introduced the paired term "learning intentions and success criteria" (LISC), which became standard vocabulary in UK, Australian, and New Zealand classrooms through the 2000s.
John Hattie's Visible Learning synthesis (2009), drawing on over 800 meta-analyses, placed "teacher clarity", of which explicit success criteria are a core component, at an effect size of 0.75, well above the 0.40 hinge point Hattie used to denote meaningful impact. This lent the practice strong empirical backing at a moment when evidence-based teaching was gaining institutional traction. The work of Jan Chappuis, particularly Seven Strategies of Assessment for Learning (2009, Educational Testing Service), extended success criteria practice to North American contexts and connected it directly to student self-assessment.
Key Principles
Criteria Must Be Observable and Checkable
A success criterion is only functional if a student (or teacher) can look at a piece of work and determine whether it has been met. "Understand Shakespeare" fails this test. "Explain how Iago's use of aside creates dramatic irony for the audience" passes it. The operational question when writing criteria is: could a 12-year-old self-assess against this without consulting the teacher? If the answer is no, the criterion is still written at the level of the teacher's intention rather than the student's evidence.
Criteria Are Distinct from Tasks
A common error is conflating the task ("write a persuasive letter") with the success criteria for the task. Tasks describe what students will do. Criteria describe the qualities that work must have. The same task can serve multiple sets of criteria depending on what the lesson prioritizes: on one day, the criteria might focus on structural elements (claim, evidence, counterargument); on another, on language choices (modal verbs, emotive vocabulary). Keeping this distinction clear prevents criteria from becoming a checklist of procedural steps rather than descriptors of quality.
Criteria Should Be Shared Before Work Begins
Criteria that are shared only after the work is done collapse into a marking scheme. Their pedagogical value lies entirely in giving students a cognitive target while they are still making decisions. Research by Clarke (2005) found that students in classrooms where criteria were shared before the task showed significantly higher rates of self-correction mid-task and produced more targeted peer feedback than students who received criteria only at the point of submission.
Co-Construction Builds Metacognitive Ownership
Students who help generate success criteria develop stronger metacognitive awareness of what quality looks like. A common co-construction routine: the teacher presents two or three anonymous samples of a completed task (one strong, one weak, one middle), asks students to identify what separates them, and facilitates a class discussion that surfaces the underlying criteria. The teacher then refines and formalizes what students articulate. This process takes longer than simply stating the criteria, but the cognitive work of distinguishing quality builds transferable judgment that students apply to future work.
Criteria Must Connect Explicitly to the Objective
Each criterion should map back to the stated learning objective. If a criterion cannot be traced to an objective, it is either measuring something irrelevant (e.g., neatness when the objective is mathematical reasoning) or it reveals that the objective was underspecified. This alignment check is also a planning discipline: writing tight success criteria forces teachers to be precise about what they actually want students to learn, not just what they want students to do.
Classroom Application
Primary: Early Writing, Year 2
A Year 2 teacher planning a lesson on recount writing posts three visual success criteria on the board alongside student-friendly icons: a clock (events in time order), a speech bubble (uses time connectives: first, then, after that, finally), and a face (written in first person — "I"). Before students write, the teacher reads each criterion aloud, models one paragraph of her own recount, and asks students to identify where each criterion appears in her writing. Students then draft independently, using a printed copy of the criteria to self-check before sharing with a partner. The criteria are fixed; the task, a recount of a class trip, provides the context.
Middle School: Analytical Essay, Year 8 History
A history teacher preparing students to write a source analysis essay co-constructs criteria with the class. She projects two anonymous Year 9 essays from a previous cohort and asks: "What makes the stronger essay stronger?" Students identify that it quotes from the source directly, explains what the quote shows (rather than just restating it), and considers the author's purpose. The teacher adds one criterion students miss: considering the limitations of the source given when and why it was created. The final agreed criteria (quote, explain, contextualize, evaluate provenance) go on a shared document. Students reference them during drafting and use them to give written feedback to a partner.
High School: Mathematics Problem-Solving, Year 11
A mathematics teacher uses process criteria alongside product criteria for a multi-step probability problem. Product criteria specify the correct form of the answer (exact fraction, simplified). Process criteria specify: shows all working clearly, states the rule being applied before applying it, checks the answer against the constraints of the problem. This separates mathematical reasoning from mere computation, allowing the teacher to diagnose whether errors stem from conceptual gaps or procedural slips: critical information for planning the next lesson.
Research Evidence
Black and Wiliam's 1998 meta-review ("Inside the Black Box," Phi Delta Kappan) remains the foundational evidence base. Synthesizing 250 studies, they found that clear learning goals shared with students, combined with regular formative feedback against those goals, produced effect sizes of 0.4 to 0.7 — with the largest gains consistently appearing among lower-achieving students, narrowing attainment gaps rather than merely raising averages.
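A note on the metric: the effect sizes cited in this entry are standardized mean differences. The underlying studies do not all compute them identically, so the formula below is the standard Cohen's d, offered as an interpretive aid rather than as the exact method of any cited review:

\[
d = \frac{\bar{x}_{\text{treatment}} - \bar{x}_{\text{control}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}
\]

where \(\bar{x}\) is a group mean, \(s^2\) a group variance, and \(n\) a group size. On this scale, and assuming roughly normal score distributions, the 0.4 to 0.7 range means the average student in a formative-assessment classroom outperformed roughly 66% to 76% of comparable control students.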
Hattie and Timperley's 2007 paper "The Power of Feedback" (Review of Educational Research, 77(1), 81–112) analyzed 196 studies involving over 6,000 effect sizes. Their model identifies three feedback questions: Where am I going? How am I going? Where to next? Success criteria directly answer the first question by making the goal explicit, which they found to be a prerequisite for the second and third feedback functions to operate. Without a clear target, feedback on "how am I going" is directionless.
Shirley Clarke's longitudinal work across UK primary schools (reported in Formative Assessment in Action, 2005, Hodder Murray) documented qualitative and quantitative outcomes in classrooms that adopted paired LISC practice over three years. Teachers reported that sharing criteria reduced the volume of off-task procedural questions ("Is this long enough?", "Is this right?") and increased substantive questions about content and quality, a shift that freed significant instructional time.
A limitation worth acknowledging: most of the research on success criteria is quasi-experimental, conducted in naturalistic classroom settings rather than randomized controlled trials. Effect size estimates vary considerably by subject, age group, and how criteria were implemented. The research supports the practice robustly but does not specify an optimal format, frequency, or method of generation for all contexts.
Common Misconceptions
Success Criteria Are the Same as Instructions
Students (and some teachers) conflate criteria with the procedural steps of a task. "Write three paragraphs, include a title, use capital letters" describes task format, not learning quality. Success criteria answer a different question: not "what must I do?" but "what must my work show?" Instructions are necessary but they do not tell a student what good looks like — only what the finished product must contain at a surface level. The test: criteria should remain meaningful even if the format of the task changes.
More Criteria Means More Clarity
Providing eight or ten criteria for a single task rarely helps students and often paralyzes them. Cognitive load research (Sweller, 1988) is directly relevant here: when the criteria list itself becomes a working memory burden, students spend their attention managing the list rather than applying it to their work. Three to five criteria, prioritized by what the lesson most values, produce better outcomes than exhaustive checklists. Teacher clarity research reaches the same conclusion: specificity and focus outperform comprehensiveness.
Sharing Criteria Removes Challenge or Creativity
A persistent worry among teachers, particularly in arts and humanities, is that explicit criteria constrain original thinking. The evidence does not support this concern. Clarke's classroom research found that students given clear criteria for quality in creative writing produced more varied and ambitious work, not less, because they understood what they were varying from and toward. Criteria set the floor, not the ceiling. The misconception usually reflects criteria written at the level of format ("include a metaphor") rather than at the level of quality ("create an image that makes the reader feel the contrast between the two ideas"), a writing problem, not a conceptual one about criteria.
Connection to Active Learning
Success criteria are not merely an assessment tool; they are a prerequisite for meaningful active learning. Students cannot engage in productive self-assessment, peer feedback, or metacognitive reflection without a shared, explicit picture of quality. The moment criteria are visible, students can do cognitive work that previously only the teacher could do.
In formative assessment practice, success criteria anchor every major strategy. Exit tickets are meaningful only when tied to criteria students can self-assess against. Peer feedback becomes substantive when both parties share a common language for quality. Self-regulation (the ability to monitor, evaluate, and adjust one's own work) depends on having an internalized model of what "good" looks like, which criteria help build over time.
The connection to specific active learning methodologies is direct. In think-pair-share, having students first self-assess against criteria before sharing focuses the pair discussion on substantive gaps rather than procedural confusion. In project-based learning, co-constructed success criteria at the project launch establish shared standards that student teams can use to evaluate prototypes and drafts throughout the inquiry cycle. In Socratic seminar, discussion criteria (e.g., "builds on a previous speaker's point," "uses evidence from the text") make the behavioral standards of quality academic discourse visible and self-assessable for students who are not yet fluent in that register.
The relationship between success criteria and learning objectives is architectural: objectives define the intended learning; criteria define its evidence. Together they constitute what Hattie calls the "visible" element of visible learning, a classroom where both teacher and student can articulate where learning is headed and what it will look like when it arrives.
Sources
- Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139–148.
- Clarke, S. (2005). Formative Assessment in Action: Weaving the Elements Together. Hodder Murray.
- Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
- Chappuis, J. (2009). Seven Strategies of Assessment for Learning. Educational Testing Service.