Definition
Peer assessment is a structured educational practice in which students evaluate the work, performance, or understanding of their classmates using predefined criteria. The evaluating student produces written comments, ratings, or both, and the receiving student uses that feedback to revise or reflect. Unlike casual peer feedback ("looks good to me"), structured peer assessment requires explicit criteria, a defined process, and usually some form of accountability.
The concept sits within the broader domain of formative assessment: assessment whose primary function is to improve learning rather than measure it for a grade. Keith Topping, whose 1998 meta-analysis at the University of Dundee remains the foundational review of the field, defined peer assessment as "an arrangement for learners to consider and specify the level, value, or quality of a product or performance of other equal-status learners." The definition highlights two features that matter: the evaluative judgment involved, and the equal social standing of assessor and assessed.
Peer assessment is distinct from self-assessment, where students evaluate their own work, though the two practices are often paired. It is also distinct from cooperative learning, which structures group interdependence for joint product creation. Peer assessment can occur within cooperative structures, but it operates as a dedicated reflective act, not just a feature of group work.
Historical Context
The intellectual roots of peer assessment reach back to the 1960s and 1970s, when cognitive psychologists began questioning the passive model of learning that dominated schools. Bloom's mastery learning model (1968) established that formative feedback within learning cycles was essential to student progress — a premise that made peer feedback a logical tool.
The practice gained its first systematic research base in higher education during the 1980s and 1990s. Nancy Falchikov at Napier University was among the first researchers to study peer marking systematically, reporting in a 1986 paper that students could produce marks reliably close to instructor marks when trained with explicit criteria. Keith Topping's 1998 review in the Review of Educational Research consolidated evidence across 109 studies, establishing that peer assessment produced reliable gains in academic achievement, metacognitive awareness, and the quality of written feedback students could generate.
Paul Black and Dylan Wiliam's landmark 1998 review, "Assessment and Classroom Learning" (published in Assessment in Education), reframed the entire conversation. Their analysis of approximately 250 studies concluded that formative assessment practices, including peer and self-assessment, produced some of the largest effect sizes ever documented in educational research, particularly for lower-achieving students. Its widely circulated summary, "Inside the Black Box" (published in the Phi Delta Kappan), moved peer assessment from a niche higher-education technique into mainstream school practice worldwide.
David Nicol and Debra Macfarlane-Dick built on this foundation with a 2006 theoretical model published in Studies in Higher Education, arguing that peer feedback is most valuable not because it informs the recipient, but because it develops the assessor's capacity to monitor and regulate their own learning.
Key Principles
Criteria Must Be Explicit Before Assessment Begins
Peer assessment produces reliable, useful feedback only when both assessor and assessed understand what quality looks like before evaluation begins. Criteria presented only after work is submitted function as post-hoc judgment. Criteria co-constructed with students before a task serve as a learning scaffold throughout production. Research by Andrade and Du (2005) found that students who helped develop rubrics produced significantly better work than those who received the same rubric passively. The act of articulating quality is itself an act of learning.
The Assessor Learns as Much as the Recipient
A persistent misunderstanding treats peer assessment as a delivery mechanism for feedback. The stronger evidence points elsewhere: the cognitive work of applying criteria to someone else's work forces the assessor to engage actively with standards they might otherwise skim past. Topping's 1998 analysis documented this assessor benefit across multiple subject areas. Students asked to evaluate an argument must first construct a mental model of what a strong argument looks like — that construction is the learning.
Training and Modeled Examples Are Non-Negotiable
Without preparation, students default to vague praise ("good job") or blunt criticism without explanation. Neither helps the recipient. Effective peer assessment programs invest explicit instructional time in feedback literacy: what makes feedback specific, what makes it actionable, how to distinguish description from evaluation. Common practice includes analyzing strong and weak examples of peer feedback before students write their own.
Anonymity Is a Tool, Not a Requirement
Anonymous peer assessment reduces social pressure and friendship bias in some contexts, particularly when marks are at stake. But it also removes accountability and can reduce the care students take with written comments. Many experienced practitioners use identified peer feedback for formative work, where the relationship between feedback giver and receiver can itself become a learning conversation, and anonymous review when scores contribute to grades.
Frequency Matters More Than Occasion
Peer assessment practiced once per semester has minimal lasting effect on feedback literacy or self-regulation. Researchers including Nicol and Macfarlane-Dick (2006) argue that the goal is for students to internalize evaluative standards, a process that requires repeated exposure across different tasks and subject areas. Brief, frequent peer review cycles, even 10-minute structured exchanges, build the habit more effectively than elaborate one-off events.
Classroom Application
Primary School: Criteria Checklists for Early Writers
In Years 2 and 3, peer assessment works best with concrete, binary criteria students can check off. After a short writing task, students exchange papers and work through a checklist: "Does it have a capital letter at the start? Does it have a full stop? Can you find one describing word?" The assessor circles or ticks each criterion. The feedback is structured enough to be actionable and specific enough to be honest without requiring nuanced evaluative language young students have not yet developed.
The teacher's role at this stage is to model the process repeatedly with whole-class examples, naming what they notice and why it meets or misses the criterion. This modeling is not optional scaffolding — it is the actual instruction in feedback literacy.
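The binary structure of such a checklist also translates directly into data, which is one reason it ports easily to digital peer-review tools. The sketch below is purely illustrative and assumes a hypothetical exchange: the criteria are the ones quoted above, while every name and function is invented rather than taken from any real platform.

```python
# Illustrative sketch of a binary peer-assessment checklist.
# Hypothetical tool: names and functions are invented, not from
# any published peer-review platform.

CHECKLIST = [
    "Does it have a capital letter at the start?",
    "Does it have a full stop?",
    "Can you find one describing word?",
]

def record_review(assessor: str, author: str, ticks: list[bool]) -> dict:
    """Pair each binary criterion with the assessor's tick or cross."""
    if len(ticks) != len(CHECKLIST):
        raise ValueError("One answer per criterion is required.")
    return {
        "assessor": assessor,
        "author": author,
        "results": dict(zip(CHECKLIST, ticks)),
    }

review = record_review("Mia", "Sam", [True, True, False])
for criterion, met in review["results"].items():
    print(f"{'✓' if met else '✗'} {criterion}")
```

The design choice mirrors the pedagogy: because each criterion is a yes/no question, the assessor never has to produce evaluative language, only a judgment against a concrete standard.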
Secondary School: Structured Peer Review of Argumentative Writing
A Year 10 history class writing analytical essays benefits from a two-pass peer review structure. In the first pass, each student reads their partner's essay and underlines the thesis statement and every piece of evidence. In the second pass, they complete a feedback frame: "Your argument is strongest when... One place where the evidence doesn't fully support your claim is... One specific revision I'd suggest is..." The frame prevents vague feedback without over-scripting the response.
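Because the frame is a fixed set of sentence stems, it can also be represented as a simple template. The following is a minimal sketch under that assumption; the stems are the ones quoted above, and the function and names are hypothetical rather than part of any published peer-review system.

```python
# Illustrative sketch of the feedback frame as a structured template.
# Hypothetical: the stems come from the frame above; everything else
# is invented for the example.

FRAME_STEMS = (
    "Your argument is strongest when...",
    "One place where the evidence doesn't fully support your claim is...",
    "One specific revision I'd suggest is...",
)

def build_feedback(responses: tuple[str, str, str]) -> str:
    """Attach each completion to its stem, rejecting empty responses."""
    if len(responses) != len(FRAME_STEMS):
        raise ValueError("One completion per stem is required.")
    completed = []
    for stem, response in zip(FRAME_STEMS, responses):
        if not response.strip():
            raise ValueError(f"Missing completion for stem: {stem!r}")
        # Drop the trailing "..." and splice in the student's completion.
        completed.append(f"{stem[:-3]} {response.strip()}")
    return "\n".join(completed)

print(build_feedback((
    "you contrast the two primary sources directly.",
    "paragraph 3, where the quote describes policy but your claim is about public opinion.",
    "adding one sentence linking the statistic back to your thesis.",
)))
```

The point of forcing a completion for every stem is the same as the paper version: a reviewer cannot submit vague praise, because the template has no slot for it.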
Returning work with written peer feedback before the final draft submission gives students a concrete revision target. Studies comparing feedback-then-revise versus no-feedback conditions consistently show quality gains in the revised drafts.
Whole-Class: Gallery Walk with Peer Annotation
A gallery-walk peer assessment adaptation places student work around the room. Each student circulates with sticky notes in two colors: one for specific strengths ("strong use of data in panel 3"), one for specific questions or suggestions ("what's the source for the statistic in paragraph 2?"). Students return to their work with a set of peer annotations that represent multiple perspectives rather than a single reviewer's view.
This format works particularly well for visual and project-based work, where the display itself communicates something about organization and design decisions that written text alone might not.
Research Evidence
Topping's 1998 meta-analysis in the Review of Educational Research synthesized 109 studies of peer assessment across educational levels and subjects. The review found that peer assessment produced consistent positive effects on academic achievement, with effect sizes comparable to other well-established formative interventions. Critically, Topping found that effect sizes were larger when assessment criteria were explicit, when students were trained in the process, and when peer assessment was integrated into the curriculum rather than added as a one-off activity.
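For readers unfamiliar with the term, an effect size in this literature is typically a standardized mean difference such as Cohen's d: the gap between two group means divided by their pooled standard deviation. The sketch below illustrates only the arithmetic, using invented scores; it is not a reconstruction of Topping's data.

```python
# Illustrative only: how a standardized effect size (Cohen's d) is
# computed from two groups' scores. The scores are invented.
from statistics import mean, stdev

def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment) - mean(control)) / pooled

peer_assessed = [72, 78, 81, 69, 85, 77, 74, 80]
comparison    = [68, 71, 75, 64, 79, 70, 66, 73]
print(f"d = {cohens_d(peer_assessed, comparison):.2f}")
```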
Falchikov and Goldfinch (2000), also in the Review of Educational Research, conducted a meta-analysis of 48 studies comparing peer marks to teacher marks. They found that peer marks resembled teacher marks more closely when peers made a global judgment against well-understood criteria rather than rating many separate dimensions, when students were familiar with the criteria, and when the assessed product was academic work rather than professional practice. The finding addresses a common concern: peer marks can be reliable when the conditions are right.
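Operationally, "mark agreement" in such studies is usually reported as a correlation between the peer and teacher mark sets. A minimal sketch of that computation, with invented marks rather than data from the meta-analysis:

```python
# Illustrative only: what "peer-teacher mark agreement" measures.
# Invented marks, not data from Falchikov and Goldfinch (2000).
from statistics import correlation  # requires Python 3.10+

teacher_marks = [62, 74, 58, 81, 69, 77, 65, 72]
peer_marks    = [60, 78, 55, 84, 66, 80, 68, 70]

r = correlation(teacher_marks, peer_marks)
print(f"Pearson r = {r:.2f}")  # values near 1.0 indicate close agreement
```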
Van Zundert, Sluijsmans, and van Merriënboer (2010), in Learning and Instruction, reviewed process-focused research on peer assessment and found strong evidence that the quality of peer feedback improves when assessors receive training, when tasks require specific rather than global evaluation, and when feedback is tied to revision opportunities. Studies that provided feedback without a revision opportunity showed smaller or negligible learning gains.
A limitation worth acknowledging: most peer assessment research has been conducted in higher education settings. The evidence base for structured peer assessment in primary school is thinner and more mixed. Grade-level appropriateness of both criteria complexity and social dynamics requires careful teacher judgment, and blanket transfer of university-based findings to primary classrooms is not warranted.
Common Misconceptions
"Peer assessment is a time-saving substitute for teacher feedback." Peer feedback is not cheaper or faster teacher feedback; it is a different kind of learning activity. When used as a workload-reduction strategy without training or structure, it produces low-quality feedback that frustrates students and erodes trust in the process. Its value is the cognitive work it creates for the assessor. Teachers who implement peer assessment well typically invest significant instructional time upfront in training students; the payoff is long-term development of evaluative judgment, not a reduced marking load.
"Students are not qualified to evaluate each other's work." This concern is understandable but rests on a misreading of what peer assessment asks students to do. Peer assessors are not being asked to make summative judgments about a classmate's ability; they are being asked to apply explicit criteria to a specific piece of work. When criteria are clear and students are trained, this is a task within their competence. Falchikov and Goldfinch's 2000 meta-analysis reported peer-teacher mark correlations above 0.80 in well-designed studies.
"Positive peer relationships will inflate marks and negative ones will deflate them." Friendship effects are real, but they are contextual and manageable. Research reviewed by Topping (1998) found friendship effects were strongest in unstructured, holistic assessment tasks and weakest when multiple specific criteria required individual justification. Anonymous submission reduces social pressure in high-stakes contexts. More importantly, investing in a feedback culture (building class norms that treat honest, useful feedback as a form of respect) shifts the social meaning of peer assessment over time.
Connection to Active Learning
Peer assessment is inherently an act of active learning. Applying criteria, generating written justifications, and making evaluative judgments require elaboration, analysis, and synthesis (the upper levels of Bloom's taxonomy) rather than the passive receipt of teacher comments.
Peer teaching and peer assessment share the same underlying mechanism: both require students to engage with content or criteria at a depth that reception alone cannot produce. In peer teaching, explaining a concept forces the explainer to identify and resolve gaps in their own understanding. In peer assessment, evaluating work forces the assessor to construct an internal model of quality. Teachers who combine peer teaching with structured peer review create a reinforcing loop where students both teach content and assess the quality of each other's application.
Gallery walks provide a natural container for peer assessment of visual or display-format work. Structured annotation protocols, requiring specific criterion-referenced comments rather than general reactions, turn the gallery walk from an exhibition into a feedback cycle.
Carousel brainstorming can be adapted for peer assessment of written drafts or structured arguments. Groups rotate through each other's work, adding specific comments at each station. The multi-reviewer format means any single piece of work receives diverse feedback, reducing the weight any one peer judgment carries.
The connection to feedback in education is direct: peer assessment is one of the highest-leverage contexts for developing feedback literacy, because students must generate feedback actively rather than receive it passively. Research on feedback consistently finds that the act of giving detailed feedback improves the giver's own subsequent work, a finding that strengthens the case for building peer assessment into regular instructional cycles rather than treating it as an occasional enrichment activity.
Sources
- Topping, K.J. (1998). Peer assessment between students in colleges and universities. Review of Educational Research, 68(3), 249–276.
- Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5(1), 7–74.
- Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287–322.
- Nicol, D.J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199–218.