I had dinner with Arlene Russell of UCLA, the creator of Calibrated Peer Review (CPR). With Calibrated Peer Review, students evaluate the written assignments submitted by several other students (their peers), and simultaneously gain a deeper understanding of the assignment's requirements. For several years I've wanted to use this tool in my classes, but until now there have been logistical and legal obstacles - we can't send private student info out of the country, and UBC's instructional technology people haven't been able to figure out how to run CPR here.
Each student first submits their own answer to the written assignment. They are then given a grading rubric and three 'calibration' submissions to review. These submissions were prepared by the instructor; the first two have carefully chosen errors typical of those on which grading is to be based, and the third is a fully correct example. The student evaluates each of these calibration submissions according to the points specified by the rubric, providing brief explanations for their decisions. They then assign each submission a grade out of 10.
The students are then given feedback on their evaluations. If the evaluations were poorly done they're given a second try (the instructor specifies how closely the student's evaluation must match the instructor's expectations). The quality of the evaluations is taken into account in the next step, the evaluation of real submissions from other students, with a poor calibration decreasing the impact of the grades they give to their peers' submissions.
Once the deadline for calibration evaluations has passed, each student is given submissions by three other students. They use the rubric to evaluate them, again providing comments to justify their evaluation and giving a grade. After they've done all three they are also asked to evaluate their own submission.
Once all the reviews are done, each student gets their grade (the mean of the four grades given by the three peers and themself). Students also get to see the reviews submitted by the two other reviewers of the submissions they reviewed, giving them a better sense of how good their evaluations were.
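To make the arithmetic concrete, here is a minimal Python sketch of how the four grades might be combined when poorly calibrated reviewers count for less. The function name `weighted_grade` and the idea of a per-reviewer weight between 0 and 1 are my own assumptions for illustration; I don't know the formula CPR actually uses.

```python
def weighted_grade(grades, weights):
    """Combine grades (each out of 10) into one grade.

    `weights` is a hypothetical per-reviewer calibration weight:
    1.0 for a well-calibrated reviewer, lower for a poorly
    calibrated one. With equal weights this is the plain mean.
    """
    if len(grades) != len(weights):
        raise ValueError("need exactly one weight per grade")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one reviewer must have a nonzero weight")
    return sum(g * w for g, w in zip(grades, weights)) / total_weight


# Three peer grades plus the self-assessment, all reviewers equally trusted.
print(weighted_grade([8, 6, 10, 8], [1, 1, 1, 1]))
```

With equal weights this reduces to the simple mean of the four grades; lowering one reviewer's weight pulls the result toward the grades given by the others.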
The grading of the whole project is critical to its success. Arlene recommends that only 20% of the total assignment+review activity be allocated to the actual assignment grade. 30% is given for the student's performance in the calibration activity (how well their assessments matched those specified by the instructor), and 30% for their performance in assessing their peers' work (how well each of their assessments matched those of the other two reviewers). The final 20% is for their assessment of their own submission - if they gave a grade too different from those of the three other reviewers, they get zero. This is to prevent students from unfairly inflating their own grade.
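The 20/30/30/20 split can be written out as a short Python sketch. Only the four weights come from Arlene's recommendation; the function name `activity_score`, the 0-to-1 scaling of the component scores, and the `tolerance` of one grade point for the self-assessment are assumptions of mine for illustration.

```python
from statistics import mean


def activity_score(text_score, calibration_score, review_score,
                   self_grade, peer_grades, tolerance=1.0):
    """Return the overall activity score as a percentage.

    text_score, calibration_score and review_score are fractions
    between 0 and 1. self_grade and peer_grades are out of 10.
    `tolerance` (hypothetical) is how far the self-assigned grade may
    sit from the mean peer grade before the self-assessment
    component drops to zero.
    """
    # Self-assessment is all-or-nothing: too far from the peers earns zero.
    self_ok = abs(self_grade - mean(peer_grades)) <= tolerance
    self_score = 1.0 if self_ok else 0.0
    return 100 * (0.20 * text_score          # the written assignment itself
                  + 0.30 * calibration_score  # match with instructor's grades
                  + 0.30 * review_score       # match with the other reviewers
                  + 0.20 * self_score)        # honest self-assessment


# Realistic self-grade of 8 against peer grades of 7, 8 and 9.
print(activity_score(0.8, 0.9, 0.7, self_grade=8, peer_grades=[7, 8, 9]))
# The same student claiming a 10 loses the whole self-assessment component.
print(activity_score(0.8, 0.9, 0.7, self_grade=10, peer_grades=[7, 8, 9]))
```

Under these assumptions, inflating the self-grade costs the student a full 20 points of the activity, which is far more than the inflated grade could add to the 20% assignment component.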
What does the instructor need to do? Basically, design the assignment and create the calibration submissions and the grading rubric. A number of premade assignments are available to be used or modified, or just used as guides for creation of a new one. The instructor also needs to deal with problems that arise, especially defaulting students and inconsistent grading.
What would I use CPR for? The letter-to-the-editor assignment. The students would submit their draft letters for review, and then improve their draft based on both the feedback they've gotten from their peer reviewers and the experience they've gained by evaluating other submissions. I could allow lots of time for this, maybe having the initial submissions due just before the Reading Week break, and the calibration and peer-reviews done in the two weeks after the break. This would leave the students a week or two for their revisions, with the final submissions due at least two weeks before the end of term. Ideally the students would get their graded letters back before the end of term, and would then be encouraged to submit them to the editors or producers responsible for the error.
After more conversation, over breakfast: The 'calibration submissions' need to be carefully designed to allow students to learn to identify the errors. For example, if a biology submission contains both biology errors and writing errors, the student will have a hard time disentangling these. Instead we might provide one calibration submission that is biologically correct but contains a few small writing errors (this would have a grade of 8-10), one that is well written but contains significant biological errors (grade of 6-8) and one that is well written but contains more biology errors (grade of 4-7). Distribute the important errors across the three submissions, rather than combining them in one really bad example.