Grading doesn’t have to suck—for teachers or for students. Two books on grading have inspired me to adopt a reinvigorated mindset towards grading and feedback: Specifications Grading by Linda Nilson and Grading for Equity by Joe Feldman. In this post, I want to summarize key ideas from both books and share my thought process for course preparation that incorporates the essential parts of both frameworks.

Why change grading systems? The beast that is traditional grading

Until recently, I used a traditional points-based grading system in my courses: points for homework, exams, and projects, all under some weighting scheme to determine the overall course grade. Grading always brought about negative emotions for me at every stage of the process—the anticipation of it, the act of actually doing it, and reflecting on grades that I had assigned. This negativity generally stemmed from five sources:

  • Grading always took up so much time.
  • Disputes with students about points were always disheartening.
  • I was disappointed that grades sometimes reflected students’ understanding poorly—both overestimating and underestimating their understanding.
  • I was disappointed at seeing the same mistakes over and over again.
  • I was worried about the stress that my students were feeling.

Out of habit I dismissed these woes as necessary evils of my job, but after reading Specifications Grading and Grading for Equity, I’ve felt inspired to reach for something better. In the next two sections, I’ll summarize the specifications and equity grading frameworks. I’ll end with some course preparation thoughts that combine both frameworks.

Specifications Grading

Two key components: specs and rubrics

A specification (or spec for short) is a requirement that an instructor sets for a piece of student work. In a specifications grading system, all assessments of course learning objectives have a set of specs that are each graded on a pass/fail scale. For a given assessment, all specs must be passed in order for the assignment as a whole to receive a passing grade. With this pass/fail evaluation, it is essential that instructors provide a very clear rubric of what it means to earn a pass for a given spec. For example, I might ask students to create a captioned data visualization that shows the relationship between two variables in a dataset. The specs and rubrics for a passing grade could be as follows:

  • Spec 1: Visualization must be correct and well-labeled.
    • Rubric: Visualization must…
      • Be the appropriate type for the given variables
      • Have axis labels with units and full words (as opposed to coded variable names)
  • Spec 2: Caption must correctly describe the visualization with appropriate numerical summary measures.
    • Rubric: Caption must…
      • Discuss the strength and direction of the relationship
      • Use appropriate numerical summaries
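The all-or-nothing logic of specs and rubrics can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not something taken from Nilson’s book: the names `SpecResult` and `assessment_passes` and the grader’s judgments below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpecResult:
    """One spec and the rubric items a grader checked (True = met)."""
    name: str
    rubric_items: dict[str, bool]

    def passed(self) -> bool:
        # A spec passes only if every rubric item is met.
        return all(self.rubric_items.values())


def assessment_passes(specs: list[SpecResult]) -> bool:
    # The assessment passes only if every spec passes; no partial credit.
    return all(spec.passed() for spec in specs)


# Hypothetical judgments for one student's captioned visualization.
submission = [
    SpecResult("Spec 1: Visualization", {
        "appropriate type for the variables": True,
        "axis labels with units and full words": True,
    }),
    SpecResult("Spec 2: Caption", {
        "discusses strength and direction": True,
        "uses appropriate numerical summaries": False,
    }),
]

unmet = [spec.name for spec in submission if not spec.passed()]
result = assessment_passes(submission)
print(result, unmet)
```

Listing the unmet specs alongside the overall result gives the student a concrete target for a revision.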

Mapping assessment grades to course grades

To determine final course grades, instructors can link letter grades to desired combinations of passed assessments. For example, to earn an A, students must pass 9 out of the 10 course assignments or must pass a specific set of the 10 assignments. This approach to determining a final course grade is called the bundling approach: bundles of passed assignments translate to a letter grade. (Another approach that assigns points to passed assessments is discussed in Chapter 6 of Specifications Grading. However, in alignment with the equity grading framework discussed later, I’m not an advocate of this point system approach.) The bundling approach has transparency advantages for both instructors and students. Instructors can look at a letter grade and know exactly what students understand. Students know exactly what they must do to earn a given letter grade.
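As a rough sketch, a count-based bundle can be written as a lookup from the number of passed assignments to a letter grade. Only the 9-out-of-10 requirement for an A comes from the example above; the other thresholds and the function name `letter_grade` are assumptions made for illustration.

```python
def letter_grade(num_passed: int, thresholds: dict[str, int]) -> str:
    """Return the letter for the most demanding bundle the student has met."""
    # Check bundles from the highest requirement down to the lowest.
    ranked = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for letter, required in ranked:
        if num_passed >= required:
            return letter
    return "F"


# Hypothetical thresholds out of 10 assignments (only the A row is from the text).
THRESHOLDS = {"A": 9, "B": 8, "C": 7, "D": 6}

for passed in (10, 8, 5):
    print(passed, letter_grade(passed, THRESHOLDS))
```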

Pros and cons of specs grading

The pass/fail grading of specifications and assessments (with sufficiently clear rubrics) is intended to produce the following benefits:

  • Saving time for teachers: The lack of partial credit avoids time-consuming deliberations over points.
  • Increasing rigor: Setting high standards for a pass encourages higher quality work than a points-based system with partial credit.
  • Increasing consistency of grades between students: Rubrics for pass/fail grades should be resistant to biases that plague finer scales.
  • High clarity and transparency:
    • For instructors: With bundling, a letter indicates exactly what students understand.
    • For students: Rubrics and grade bundles tell students exactly what they need to do to succeed, which can motivate them to aim higher than they would have normally.

Some potential downsides of specs grading include:

  • Writing sufficiently clear rubrics can be very time-consuming the first time. However, this initial time investment should translate to time savings during the course and its future offerings.
  • Students may be resistant to a new grading system without partial credit. Instructors will need to devote time to communicating why they are using a specs grading system.

In my opinion, the benefits of specs grading are well worth its downsides. As we’ll see next, many of its benefits align with an equity-oriented grading framework.

Grading for Equity

Pillars of the framework

An equity grading system is rooted in three pillars:

  • Grades must be mathematically accurate: The final number or letter grade given to students must accurately reflect their understanding of course concepts by the end of the course.
  • Grades must be bias-resistant: As much as possible, grades must be free from the influence of the instructor’s implicit biases and the student’s life circumstances.
  • Grades must be motivational: Grades should inspire students to learn and excel.

In Grading for Equity, Joe Feldman proposes several concrete course policies that support these pillars. Throughout, he builds from the central tenet that grades should accurately reflect students’ understanding of course content and nothing else. In the rest of this section, I’ll summarize his proposed policies and how they connect to these three pillars.

Avoid assigning zeroes by using a minimum grading policy

In points-based grading systems, many instructors assign zero points to assessments that are late, incomplete, or fraudulent as a means of deterring and punishing undesirable behavior. A more equitable points-based system avoids assigning zeroes. One policy that Feldman proposes for this is minimum grading, which is a policy that assigns a standard nonzero grade (say, 40 out of 100 points) where the instructor would normally give a zero.

  • Pillar 1 (Mathematical accuracy): Particularly on 100 point scales, zeroes have a disproportionate effect on point averages that are commonly used to determine final grades. In this way, assigning zeroes substantially underestimates student understanding. Further, a zero is inherently dubious as it signals that a student has absolutely no knowledge of a topic, which is highly unlikely. Minimum grading can help rectify the sensitivity of grades to outlier zeroes.
  • Pillar 3 (Motivation): Seeing a zero can be utterly demotivating to a student. Often, recovering from one or more zeroes requires an extraordinary amount of effort (if it is possible at all), and this can extinguish any flicker of a growth mindset. Minimum grading can help sustain student optimism.
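A small calculation shows how much a single zero moves an average. The scores below are made up, and `apply_minimum` is a hypothetical helper that swaps in the 40-point minimum only where a zero was given, as the policy is described above.

```python
from statistics import mean


def apply_minimum(scores: list[int], minimum: int = 40) -> list[int]:
    # Replace each zero with the minimum grade; leave other scores unchanged.
    return [minimum if score == 0 else score for score in scores]


# Hypothetical scores on a 100-point scale, with one missing assignment.
scores = [85, 90, 0, 88]

raw_average = mean(scores)                     # 263 / 4
adjusted_average = mean(apply_minimum(scores)) # 303 / 4

print(raw_average, adjusted_average)
```

With these made-up numbers, the single zero pulls a student whose completed work averages in the high 80s down to 65.75, while the 40-point minimum yields 75.75.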

Since thinking about specifications grading frameworks, I’m not a fan of points-based grading systems, so I don’t intend to use the no-zero or minimum grading policies. However, I think they are worth considering for instructors who prefer points-based systems.

Use a 0-4 scale

Points-based grading systems often use 100-point or similarly fine-grained scales to evaluate student work. A coarser (e.g., 0-4) scale can help make grades more equitable.

  • Pillar 1 (Mathematical accuracy): Fine scales are subject to considerable variability. Even with a rubric, an instructor could easily reread the same work and assign slightly different scores each time. A coarse 0-4 scale promotes accuracy and consistency of grades.
  • Pillar 2 (Resistance to bias): Because scores given on fine scales have more room to vary, they are more subject to instructors’ implicit biases. Coarser scales allow instructors to follow rubrics with greater consistency.

The 0-4 scale policy very closely aligns with the specifications grading approach. The specs grading approach takes coarse scales to the extreme by essentially proposing a 0-1 scale.

Weigh recent performance more than early performance

Typically, instructors include all student attempts to show understanding of a concept in the course grade. Instead, they should weigh students’ most recent performance more heavily. At the extreme, only the most recent assessments of a skill count toward the grade.

  • Pillar 1 (Mathematical accuracy): Grades should reflect student understanding of material by the end of the course—not how quickly students reached that level of understanding. Thus, it doesn’t make sense to include grades for the early stage of the learning process. By weighing recent performance more heavily, grades more accurately reflect student understanding by the end of the course.
  • Pillar 2 (Resistance to bias): Students who have more difficult life circumstances might take longer to grasp concepts and would have lower grades under a traditional system that evaluates during the learning process. Weighing recent performance more heavily allows these struggling students the extra time they need to show the same level of understanding as their more privileged peers.
  • Pillar 3 (Motivation): Giving less weight to (and perhaps not even counting) early performance can inspire a growth mindset in students—particularly those who initially struggle. Without early struggles pulling the grade down, students can optimistically work toward genuine improvement.
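To make the idea concrete, here is a sketch comparing a plain average with two recency-aware alternatives. The linearly increasing weights are my own assumption (Feldman does not prescribe a formula here), and the scores are hypothetical attempts at one skill on a 0-4 scale, listed from earliest to latest.

```python
def plain_average(scores: list[float]) -> float:
    return sum(scores) / len(scores)


def recency_weighted(scores: list[float]) -> float:
    # Assumed scheme: the k-th attempt gets weight k, so later attempts count more.
    weights = range(1, len(scores) + 1)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def most_recent(scores: list[float]) -> float:
    # The extreme version: only the latest attempt counts.
    return scores[-1]


# Hypothetical attempts at one skill, earliest first.
attempts = [1, 2, 4, 4]

print(plain_average(attempts))     # early struggles pull the grade down
print(recency_weighted(attempts))  # later attempts dominate
print(most_recent(attempts))       # only the final attempt counts
```

For this student, the plain average is 2.75, the weighted version is 3.3, and the most-recent rule gives 4, which matches the understanding shown by the end of the course.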

This policy relates closely to a retakes and redos policy described later.

Avoid assigning grades for the product of group work

Instead of grading the product of group work, instructors should evaluate students individually after the group work is complete.

  • Pillar 1 (Mathematical accuracy): Group work tends to reflect the work of the strongest group members. Applying the group grade to individual students will overestimate the understanding of struggling students. With this policy, students can still benefit from collaboration, but the grades assigned will actually reflect individual understanding.
  • Pillar 3 (Motivation): When students must demonstrate their learning individually, they are more motivated to do the necessary work than when falling back on group members is possible.

If group work is solely meant to provide enriching learning experiences for students, then the framework of complex instruction is worth thinking about. I’ll say a bit more about complex instruction at the end of the post.

Avoid incorporating extra credit into the grade

Instructors award extra credit for a wide variety of activities. These activities can be course-related, but often they are not. Extra credit violates all three pillars of equitable grading.

  • Pillar 1 (Mathematical accuracy): In the worst case that extra credit has nothing to do with course content (e.g., bringing in materials for the class), grades become a nonsensical mix of content knowledge and participation in these extracurricular activities.
  • Pillar 2 (Resistance to bias): Instructors tend to offer extra credit for activities that require an appreciable amount of time outside of class, expose students to new ideas, or offer a good challenge. Struggling students and students whose life circumstances preclude extra time to devote to schoolwork will tend to not take extra credit opportunities. As a result, extra credit tends to go to exactly the students who need it least.
  • Pillar 3 (Motivation): Students who tend to be unable to participate in extra credit opportunities can feel demotivated knowing about these missed opportunities. Further, whether related to course content or not, extra credit demotivates true learning by incentivizing the accumulation of points. If students view any earned points equally (whether they come from extra credit or the assessments of course content), then including extra credit derails learning of the primary material that instructors truly care about.

Use alternative (non-grade) consequences for late work and cheating

It is common to deduct points for late work and for academic integrity violations (on top of school-mandated punishments for integrity violations). These point loss policies embody a punishment system focused on deterrence of and retribution for bad behaviors. Feldman advocates instead for a rehabilitation-centered system: students have a chance to turn in late work without a grade penalty and to repeat the work on which they cheated, as opposed to losing points.

  • Pillar 1 (Mathematical accuracy): Deducting points for late work mixes the timing of student performance into a grade that should only reflect the quality of student performance. Deducting points for cheating results in a number that does not reflect a student’s knowledge. By definition, work that is copied does not represent a student’s understanding and deserves a missing or NA grade. This further underscores the merits of rehabilitative punishment—having students repeat the work on which they cheated is the only way to observe their true understanding and fill in the missing grade with an accurate one.
  • Pillar 2 (Resistance to bias): Grade consequences for late work tend to affect more vulnerable students. These students might have significant time commitments outside of school that make it difficult to hand in work on time. These difficult personal circumstances can also explain why students cheat: with a grading system that harshly punishes lateness, they feel that there is no other way to complete or succeed on the assignment.
  • Pillar 3 (Motivation): Grade consequences for late work and cheating can result in a vicious cycle: the demotivation resulting from the grade punishments results in further late work and cheating in desperate attempts to catch up.

Exclude “participation” and “effort” from the grade

“Participation” and “effort” categories within the course grade broadly encompass desirable behaviors that the instructor wants to encourage (e.g., speaking in class discussions, contributing to group work, asking questions, attempting assignments). Including such categories results in inequitable grades.

  • Pillar 1 (Mathematical accuracy): Grades should reflect students’ understanding of course content. “Participation” and “effort” indicate nothing about this understanding.
  • Pillar 2 (Resistance to bias): The components of “participation” and “effort” that instructors decide to include in the grade arise completely from their values. These values generally reflect a very narrow view of what it takes to be academically successful. Students who do not fit this narrow mold end up suffering despite their understanding of the course material. Further, there may be cultural reasons (school culture, classroom culture, etc.) underlying lack of participation from some students. Grade penalties for non-participation can perpetuate cycles of negative outcomes for certain groups.
  • Pillar 3 (Motivation): The same problems arising from Pillar 2 can result in demotivation and a loss of faith in the learning system.

Use only summative assessments in the grade, not formative assessments

This is effectively a “No homework in the grade” policy. This may seem controversial to instructors and perhaps terrifying to students (“100% of the grade is exams!?”), but coupled with a policy of offering retakes (described below), this can be a part of an equitable grading framework.

  • Pillar 1 (Mathematical accuracy): Formative assessments, like homework, are meant for students to practice their understanding. They should be a source of feedback but not a source of evaluation. To include formative assessment scores in the final grade would result in an inaccurate representation of students’ ultimate understanding because these assessments are part of students’ learning journey.
  • Pillar 2 (Resistance to bias): Grading homework for correctness can have disproportionately negative effects on the most vulnerable students who may lack the time to complete homework due to weaker understanding and/or external commitments. For fear of losing points, they may not attempt the homework or resort to copying. Both acts prevent them from practicing the material, which was the main goal of homework in the first place.

Renaming grades

Instead of using a 0-4 or A-F scale, instructors can use short descriptors. Example: Exceeding Standards, Meeting Standards, Approaching Standards, Not Yet Met Standards, Insufficient Evidence.

  • Pillar 3 (Motivation): Renaming grades in this way can clarify expectations for each grade level, dispel the tendency for students to over-judge themselves based on grades, and prompt students to think about grades guiding their learning as part of a growth mindset.

Retakes and redos

To truly allow students to learn from their mistakes, instructors should give students the opportunity to retake portions of summative assessments that indicate room for improvement.

  • Pillar 3 (Motivation): Having the opportunity to try again can alleviate some students’ testing anxiety and show them that their instructors truly care about their growth. This can motivate students to aim higher than they would have normally.

Blending the two frameworks in course preparation

In this section, I’ll discuss policies that I’ve adopted and questions that I ask myself when preparing a course that uses a specs-equity grading system.

Iterate between writing learning objectives and assessments

I have never been organized enough to write all of my assessments before the start of the course, but in implementing a specs-equity grading system, I think it would be useful to draft assessments in parallel with writing learning objectives for a few reasons:

  • This can help me identify “implicit” skills that I have not expressed as an explicit learning objective. A common example in my courses is the ability to recognize which of many concepts or tools is most useful for a given problem. By writing learning objectives and assessments in tandem, I could more easily recognize that “Identify relevant tools” should be its own learning objective.
  • Drafting assessments early also prompts me to draft rubrics early. Although this front-loads a lot of work, this should result in a more manageable workload during the semester.
  • Having early assessment drafts helps me shape my course schedule, which in turn guides my thinking on the type and timing of metacognitive activities for my students. This helps me decide if I want dedicated learning objectives for metacognition.

Bundling to determine the course grade

In a bundling approach, students must pass a particular set of assessments to earn a particular final course grade. If each assessment is crafted to target one learning objective, each course grade is linked to mastery of a specific set of concepts. In my opinion, bundling (rather than assigning points to passed assessments) is the best way to use specs grading to determine course grades, for a couple of reasons:

  • For one, the transparency of knowing exactly what concepts need to be understood to earn each letter grade is beneficial for both me and my students. I like being able to communicate to colleagues and potential future employers exactly what a student understood from my course. Students like having clear expectations.
  • Further, bundling can serve as a great signal of the most important ideas in the course. If the objectives that need to be mastered for a D are a subset of those required for a C (and so forth), students can clearly see that the objectives required for a D are crucial concepts.
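The nested structure described above (the D objectives are a subset of the C objectives, and so on) can be sketched with sets. The objective names, the tiers, and the helpers `build_bundles` and `course_grade` are all hypothetical and stand in for whatever a real course would use.

```python
# Hypothetical objectives newly required at each grade level, lowest first.
ADDED_AT = [
    ("D", {"choose plot type", "label axes"}),
    ("C", {"summarize numerically"}),
    ("B", {"interpret in context"}),
    ("A", {"identify relevant tools"}),
]


def build_bundles(added_at):
    """Accumulate objectives so that each bundle contains the one below it."""
    bundles, required = {}, set()
    for letter, new_objectives in added_at:
        required = required | new_objectives
        bundles[letter] = required
    return bundles


def course_grade(mastered, bundles):
    """Return the highest letter whose full bundle has been mastered."""
    grade = "F"
    for letter, required in bundles.items():  # dicts keep insertion order: D to A
        if required <= mastered:
            grade = letter
        else:
            break
    return grade


BUNDLES = build_bundles(ADDED_AT)
student = {"choose plot type", "label axes", "summarize numerically"}
print(course_grade(student, BUNDLES))
```

Because the bundles are nested, a student who has mastered a higher-level objective but is missing a D-level one still cannot earn a D, which is one way the structure signals which concepts are crucial.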

Renaming pass/fail scores

To encourage a growth mindset, I prefer to rename pass/fail scores to Meets Standards (MS) and Not Yet Meeting Standards (NY).

What assessments to use?

  • Quizzes and exams
    • In some courses, there are concepts that are so fundamental that they need to be top of mind (e.g., knowing the appropriate types of data visualizations to make for the variables of interest). Are there enough of these concepts to warrant regular, timed, in-class exams?
    • I prefer more frequent quizzes to a couple of large exams because frequent quizzes provide numerous retake opportunities, which can reduce testing anxiety. Later quizzes can include questions on earlier learning objectives that only need to be completed by students who have not yet mastered the concept. To manage time constraints, quiz questions on current content can be designed to only take, say, half of the class period. The hope is that the other half of the class period is sufficient for students pursuing retakes to finish the associated parts of the quiz.
    • In Grading for Equity, Joe Feldman notes that retakes are only equitable if they are mandatory—the most vulnerable students might be reluctant to choose to pursue a retake opportunity for a variety of reasons. My takeaway from this is that instructors should strongly consider any potentially beneficial activity to be mandatory in order to promote equity.
  • Homework
    • What is the role of homework? If using quizzes and/or exams, is the purpose of homework primarily to practice for these summative assessments? If so, I’m in favor of Joe Feldman’s “no formative assessments in the grade” policy. Providing full solutions and offering formative feedback for the most crucial exercises should be enough to help students learn and practice, and excluding homework from the grade can relieve students’ time pressures and anxieties.
    • Aside from traditional problem sets, I have also tried writing-intensive assignments that required students to explain statistical concepts in their own words. I was comfortable including this type of homework in the grade because the pieces made up a semester-long assignment that had flexible deadlines and constant opportunities for revision.
  • Projects
    • If students begin projects towards the end of the course, grading them on a pass/fail scale can cause stress for students because of the shorter amount of time available for feedback. Clear specifications for good work are always necessary, but in this case a finer scale than a pass/fail scale might be necessary. This can still form a part of a bundling approach to determining the course grade: each letter grade is associated with a minimum project grade—in addition to a required set of learning objectives to have mastered by the end of the course.
    • If the project spans the entire course and students are able to start early, a full specs grading approach is completely feasible. Project milestones can be graded on a pass/fail scale with the opportunity to revise non-passing milestones throughout the course.

What is the role of group work in the course?

Active learning in groups is a major part of class time in my courses. In place of a lecture, students watch videos and/or complete readings before class. During class, they work on exercises in groups to practice the ideas from the pre-class material. Overall, my students have appreciated this opportunity to practice and ask questions while I am present to help and classmates are there to collaborate. However, this style of learning can be challenging for some students who prefer different ways of learning or who don’t develop a good rapport with group members. For these reasons, I have started to consider the framework of complex instruction.

Complex instruction is a style of pedagogy that centers group work through the use of “groupworthy” tasks—tasks that truly require varied mindsets, opinions, and skills. While some students have found it helpful to be in groups to talk through the concepts prompted by my in-class exercises, there is nothing truly groupworthy about the exercises—nothing that truly necessitates diverse mindsets and skills. Although I have been guilty of using group projects rather than individual projects to cut down on my workload, I would like to be more intentional about group work in my courses going forward. Complex instruction captures exactly what I want in group work: the opportunity for students to engage in meaningful tasks together to promote better learning for everyone. Alana Unfried’s talks at ECOTS 2020 and at USCOTS 2021 are excellent resources for learning more about complex instruction.

Summary

Specifications grading brings to the table the specific practice of creating pass/fail criteria for student work. When used throughout a course, this framework can lead to heightened clarity for instructors and students. This benefits instructors with time savings during grading and improvements in communication about student understanding. This benefits students by making clear what they need to do to succeed, which can motivate them to aim higher. The pass/fail nature of specifications grading naturally pairs with the policy of offering retakes on assessments, which is a core part of an equitable grading approach. Feldman’s equity grading approach builds on this to encourage thinking more broadly about student motivation, the role of instructors’ implicit biases and of students’ life circumstances in grading practices, and the accuracy of grades in reflecting students’ ultimate understanding. Considering the implementation of these practices in our courses can lead to better outcomes for both students and instructors.