
Wednesday, September 14, 2016

The Possible Relationship Between Course Design and the Curve

Like just about every law school out there, our school periodically reviews its grading policies.  I was involved in such a review last year and at some point, the following question occurred to me: why is the curve mostly confined to higher education?  In other words, why are K-12 students usually graded on their performance alone (rather than their relative performance) but college and grad students more often graded on a curve? I don’t know the answer to this question, but I’m willing to hazard an educated guess.  I think the curve is mostly confined to higher education because higher education, unlike primary and secondary education, tends to allow its teachers significant discretion in choosing course content.  I’m not arguing that such discretion is good or bad; I’m simply suggesting that, where such discretion exists, we are more likely to see the curve.


To see why, think first about primary and secondary school.  For grades K-12, the curricular agenda is clear.  We want kids to learn the three Rs, history, science, civics, etc.  In most places, this isn’t just a matter of tradition; it’s a matter of law.  Most states have boards that dictate what must be taught and what students must know in order to graduate to the next level.  Teachers in primary and secondary school thus have little discretion in what to teach and what to test.  Moreover, many of the teaching and testing materials have been subjected to various levels of scrutiny to validate them. This is not universally true, of course.  There are bad books and bad tests out there, but in general, the K-12 teaching and learning process is supported by a fairly robust superstructure of regulation and pedagogical expertise.

This is not the case in higher education.  Although colleges offer plenty of basics in fields like math and science, they also offer a wide variety of other courses on topics like Latin American social movements, climate change policy, eighteenth century political philosophy and the like.  The professors who teach these courses are normally given discretion to teach the class more or less as they wish. If Professor X in the economics department wants to spend 5 classes on Karl Marx and 1 class on Milton Friedman, she is generally free to do so. Indeed, the discretion is so significant that professors often create their own classes.  See, e.g., Lady Gaga and the Sociology of Fame.  When the professor is given broad discretion on what to teach, a robust superstructure that provides quality teaching materials and assessment tools is unlikely to develop. Such a system will only develop if there is relative uniformity in the material being taught; otherwise, the system is simply not cost-effective.  For this reason, professors are usually left to write their own tests and decide for themselves what constitutes a sufficient level of mastery.

But this is a lot harder than most people realize.  First you have to decide what topics you want to test, then decide what level of knowledge will constitute mastery, and finally you have to craft questions that will allow you to discern various levels of mastery.  This is hard stuff.  Indeed, there are people who get PhDs in assessment.  Most professors, of course, don’t have PhDs in assessment and, to a large extent, have not received meaningful training in the field.  Their tests are therefore likely to be imperfect.  Sometimes the tests are horrendous, but more often than not, they are simply flawed in one way or another.  And this is where the curve comes in; the curve is used to correct the flaw.

Students, by and large, hate the curve. My goal here isn’t to defend it or criticize it.  My goal is simply to suggest that the curve is in some sense tied to professorial discretion in choosing what to teach. Unless higher education is willing to dictate the content of specific courses, professors will end up teaching more or less what they deem appropriate.  To the extent this happens, there will be flawed testing and the curve will be more likely to pop up.  Or at least that is my hypothesis right now.

In the law school context, this suggests to me that there may be a good argument for abolishing the curve in first year classes.  There is relatively wide agreement on the knowledge that first year students should acquire, just as there is relatively wide agreement on the knowledge that 12th graders should acquire. The big difference is that 12th grade teachers have significant assessment support whereas law professors do not. So why not come up with a way to provide that support?  Just as a state board of education works with experts in the field to set up standards of learning and valid assessment tools, why couldn’t the AALS team up with civil procedure experts (in and outside the academy) to develop a list of what students should learn and offer validated assessment tools to see whether students, in fact, learned what they were supposed to?

To be clear, I would not make these standards mandatory on any professor or school.  They would operate as a resource, not as a requirement.  Professors could adopt the entire package of standards and assessment tools, or simply pick and choose.  Picking and choosing will undoubtedly happen where, for example, a professor thinks the Erie doctrine is much more important than the AALS civ pro crowd thought it was.  If that’s your view, fine.  Go for it.  But if Erie is not your specialty, and all you want to do is make sure the students get the standard Erie story, then fine, just use the AALS materials.

There is one, potentially fatal, problem with this approach.  If the standardized questions will be used to determine grades, and the questions are from a centralized bank that could be used nationwide, there is a major risk of cheating.  Any enterprising law student can take a quick picture of a question on his or her phone. Students can set up an anonymous website where these questions can be submitted and posted, and voila, the whole system is done.  The only ways to solve this, as far as I can see, are (1) very close proctoring of exams (but that’s not the norm at most law schools, and a single error could have nationwide effects) or (2) writing new questions every year, as the bar examiners do (but that is a huge cost, and may not even work because exams will be given at different times).

In the end, I’m not certain whether this approach would ultimately work for law schools.  Aside from the cheating problem, I may be overestimating the degree of agreement on what we want 1Ls to learn.  Can you just imagine 10 law professors sitting in a room debating Erie?  But even if my AALS suggestion goes nowhere, I feel on stronger ground in suggesting a link between the curve and professorial discretion in course design.  Any thoughts?


Posted by Jack Preis on September 14, 2016 at 01:58 PM


Jack, I guess I tend to test not knowledge of the law where the answer is clear, but issues where multiple, conflicting answers are possible. Every professor does things differently, I suppose, and my approach might not work for anyone else.

Posted by: Douglas Levene | Sep 22, 2016 5:33:14 PM

Some belated responses to several comments:

Orin- I see the argument that class rank is *less* meaningful when grades are bunched up, but I wouldn't go so far as to say it's *meaningless*. Even if its value were negligible, however, that still wouldn't mean that abandoning the curve for first year grades was a bad idea. A school could rationally decide that the costs imposed by the curve on students were not outweighed by the overall employment benefits--especially because those benefits seemed to redound to those in the top 25% but then became less valuable the further down the ladder you went. To be sure, maybe the ultimate balance of costs and benefits would militate in favor of keeping the curve, but it would depend on the values, goals and employment prospects at each particular school.

Douglas- Your distinction between grading law and grading math is undoubtedly correct to some extent. Where we might disagree is in our sense of how many "correct" answers there are in law. I tend to think there are more correct answers than others seem to think. Pick up any legal treatise and it won't take you long to find points of law that are explained in simple, black and white terms. And if you go to the cases, you will see that these points of law are, in fact, the law. Obviously, there are lots of issues that are not susceptible to a clear and simple answer and thus are hard to test without a curve. We ought to teach and test many of these basic points of law not because we want students to memorize them in an unthinking fashion, but because knowledge of them allows "scaffolding" that in turn permits them to think more deeply and critically. These ideas are discussed in Daniel Willingham's book: https://www.amazon.com/Why-Dont-Students-Like-School/dp/047059196X

anon- thanks for that observation about buckets vs students. I hadn't thought of that before, but it is exactly correct. Our school fixes this a bit (though I'm not sure we were thinking about the issue when we adopted it) by using buckets for both grades and rankings. Students no longer get a rank like "you are 34 out of 165." Instead, they get something like, "you are in the top 25%."

Posted by: Jack Preis | Sep 19, 2016 4:02:38 PM

With a curve, some students will get the same grade, especially in the middle of the curve, even though some of the students who received that grade received raw scores several points higher than others. Such a system "accurately conveys rank order" by buckets, not student by student. That doesn't mean it's a bad system, but it leads to the question raised above about where to set the line for jumping from one group (B+) to the next (A-), because those jumps carry big meaning in a grading system that purports to rank.

I imagine also that, as in my class, some students get an A- who scored 1 raw point higher than a student who got a B+, while that person who got a B+ scored multiple raw score points higher than other students who also got a B+, again distorting the rank message of the final grades. It tells an employer that student A- is by some measure better than both B+ students, but it hides the fact that one B+ student demonstrated more knowledge/skills than the other student B+, and hides that the gap between the two B+ students is bigger than the gap between the B+ and A- student. Again, this doesn't mean a curve is a bad system, but it's not a finely tuned rank system.

The bigger question still remains - is that rank ordering a ranking of the students' knowledge of law as demonstrated on a single exam on a single day, or an assessment of their written ability to apply facts to law on a single exam given on a single day, or their lawyerly skills more broadly defined as demonstrated by oral advocacy assignments and written assignments over the course of a semester, or their potential to be a good lawyer (measured by????)?

Posted by: anon | Sep 17, 2016 5:03:52 PM

My grading technique is to give a raw score to every exam - which is admittedly subjective to some extent - and then convert those raw grades into the curve mandated by my school. The result is that the final grades accurately convey rank order, but nothing else. I think that's OK and that's all that grading can realistically accomplish, at least in law. I suppose if you are teaching in an area with a defined body of knowledge, like math, you can test mastery but that doesn't seem very realistic in more amorphous fields like the humanities or law.
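As a rough illustration of that kind of conversion, here is a minimal Python sketch. It assumes a hypothetical three-bucket curve; the percentages, the `CURVE` table, the `apply_curve` helper, and the student labels are all made up for illustration and are not any school's actual policy.

```python
from typing import Dict, List, Tuple

# Hypothetical mandated curve: the share of the class that receives each
# grade, listed from best to worst. Real percentages vary by school.
CURVE: List[Tuple[str, float]] = [("A", 0.2), ("B+", 0.4), ("B", 0.4)]


def apply_curve(raw_scores: Dict[str, float]) -> Dict[str, str]:
    """Assign letter grades by rank order so the class matches CURVE."""
    # Rank students from highest to lowest raw score. Ties keep their
    # input order, which is an arbitrary choice made for this sketch.
    ranked = sorted(raw_scores, key=raw_scores.get, reverse=True)
    n = len(ranked)
    grades: Dict[str, str] = {}
    start = 0
    cumulative = 0.0
    for letter, share in CURVE:
        cumulative += share
        end = round(cumulative * n)  # cut-off position for this bucket
        for student in ranked[start:end]:
            grades[student] = letter
        start = end
    # Anyone left over because of rounding gets the lowest grade.
    for student in ranked[start:]:
        grades[student] = CURVE[-1][0]
    return grades


# Illustrative raw scores only.
raw = {"s1": 91, "s2": 84, "s3": 83, "s4": 70, "s5": 62}
print(apply_curve(raw))
```

In this made-up class, the students with raw scores of 70 and 62 land in the same bucket, while the 13-point gap between 83 and 70 shows up as a single grade step. The letter grades preserve rank order but not the size of the gaps.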

Posted by: Douglas Levene | Sep 17, 2016 7:50:39 AM

Jack writes: "A school could say no to the curve (because, for example, it was able to agree on objective standards of student knowledge) but also signal to employers or others who are the best students by ranking the students."

But as Brad points out, class rank becomes meaningless when the grades are all bunched up. You need to use the signals you have to tell employers what they want to know.

Posted by: Orin Kerr | Sep 14, 2016 9:26:31 PM

Just as a point of reference, Jack, not only is my median not a 3.30 (it's 82/100), but I'm also required to give a significant number of D's in my 1L contracts class.

Posted by: Matthew Bruckner | Sep 14, 2016 9:02:04 PM

Thanks for your thoughts, Kevin. If the purpose of grades is to signal, I agree that a curve could make sense. But a curve is not the only way to signal. A school could say no to the curve (because, for example, it was able to agree on objective standards of student knowledge) but also signal to employers or others who are the best students by ranking the students.

My claim is not necessarily that objective assessment is *impossible* in classes in which there is broad discretion on subject matter, but only that we are more likely to see the curve employed in such classes. Without training in proper assessment protocol, my guess is that most professors will do their best but, because they haven't been properly trained, will come up short. But not all will come up short. And there is no reason to think that, if you and I took a year off to learn all this stuff, we'd necessarily come up short when we returned to the podium.

Finally, your example about the change in your evidence grades over time (or lack of change, to be more accurate) is a good illustration of how objective grading might help the professor as well. If you had an objective test that you could administer, you could tell whether you were getting better year after year. But that's hard to do when you write your own exam (unless you give the same exam every year).

Thanks again for your thoughts.

Posted by: Jack Preis | Sep 14, 2016 8:33:28 PM

Brad, when we looked into this last year, we learned that several schools don't just set a 3.30 average, for example, but also require that professors achieve the 3.30 curve by spreading grades over a broader area of the grading spectrum. It doesn't fully solve the problem (there is still excessive compression at the top) but it helps some. In the final analysis, my guess is that lots of schools accept grading compression as the price of making sure students are competitive with other students at peer schools. That may be a good trade or bad trade, but I think it's the way the issue plays out.

Posted by: Jack Preis | Sep 14, 2016 5:23:04 PM

I'm not sure why teacher discretion leads to a curve. Seems to me, it's the purpose of grades that can lead to a curve. If grades are primarily a sorting/signaling mechanism, then a curve makes sense. That goes whether the class has a more standardized subject matter, like civil procedure and evidence, or is a seminar on race, poverty and the law. It sorts out students, whatever rules they've learned and skills they've demonstrated.

If the point of grades in a class is to assess whether students have learned a set of rules and demonstrated a set of skills, then a curve doesn't make sense. That goes whether the class is torts, legal ethics, or a seminar on race, poverty and the law. It's why the bar exam is pass/fail, and not graded on a curve. That outsiders have a better sense of what was covered in torts than in a seminar on race, poverty and the law doesn't seem to me to compel grading the seminar on a curve (if the purpose of the grades is assessing the acquisition and demonstration of knowledge and skills).

Is your observation, Jack, that classes where there is wide subject matter discretion simply can't send any meaningful message regarding demonstration of knowledge and skills because there's no sense of what that knowledge and those skills might be, so the grade must be a relative signaling grade, meaning the class requires a curve?

The real trouble is that grades represent different things to different people, including professors and employers, and professors at the same institution. If employers take them as evidence of knowledge/skills, but the school uses a forced curve so that they signal relative competence or knowledge, or vice versa, then the grade loses its value.

I think grades are more frequently used as a relative sorting/signaling device, but are more justifiably used to reflect demonstration of knowledge and skills. That's why it bothers me that the forced curve in my Evidence course year after year means that the grades in that class are distributed the same even though (I like to think) more students learn more evidence law from me and demonstrate more lawyerly competence now (they write a motion in limine that my first set of students didn't write) than when I first taught the class.

Posted by: Kevin Lapp | Sep 14, 2016 5:15:08 PM

When the grades are really tightly bunched, because no one wants to give out Bs, much less anything lower, then class rank becomes a very arbitrary exercise.

In signal processing terms we are compressing out most of the signal then amplifying what little is left. It ends up making noise a much bigger factor than if we had just kept the original signal.

If we accompanied the class rank with some summary statistics it might help the employer see how little signal was in it, but that a) presupposes the employer understands basic statistics, and b) still leaves the employer without a terribly useful metric.

Has anyone heard of any program or policy that has successfully convinced professors to use more of the grading range (never mind all), other than making it mandatory?

Posted by: brad | Sep 14, 2016 4:44:10 PM

Brad, the reason I would make the program optional is partly political (it would never happen if it was mandatory) but partly policy driven. It's like the argument in favor of a federalist government. Central authority tends to squelch local norms and discourage local experimentation, both of which are thought to be bad. Now, I admit that it's not clear that there should be a local norm when it comes to teaching law, or that the value of experimentation outweighs the cost of experimentation when the experiments fail. I'd need to think about those things. But I agree with you that, if there is a skill/piece of knowledge that everyone agrees a student should know, it would make sense to impose a nationwide requirement with regard to the skill or knowledge.

Orin, regarding "Much of what 1L professors teach is legal skills, such as how to read cases, read statutes, and apply legal standards. The content of the courses is only partially relevant," two responses come to mind. First, if the content of the course is only partially relevant, that doesn't mean that a standardized curriculum is irrelevant, it means that it is "partially relevant." There would be a cost/benefit question about how much to invest in developing a standardized curriculum that would only be partially relevant, but I don't think we can dismiss it out of hand simply because substantive law is not the centerpiece of 1L courses. Second, I see no reason why a standardized curriculum would have to ignore the fact (which I'll accept as true for the purposes of this comment) that much of 1L year is focused on learning legal analysis rather than specific law. I've often, just for the kick of it, tried giving a case to graduating 1Ls and then giving them 5 questions about the case--sort of like a reading-comprehension exercise from the SAT. I see no reason why such an assessment tool couldn't be developed and adopted by those who agree with your sense of what the 1L curriculum is about.

On employer signaling, class rank could fill this gap. Some schools have done away with class rank, which presents a problem. But why not just release the average 1L GPA and the standard deviation? Students can use that to signal to the employer where they stand vis a vis their peers (if they desire).
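To make the arithmetic concrete, here is a minimal Python sketch of how a student might turn a published mean and standard deviation into a statement of relative standing. The `z_score` helper and the GPA figures are hypothetical and purely illustrative; a school could of course choose to publish different statistics.

```python
from statistics import mean, pstdev
from typing import List


def z_score(gpa: float, class_gpas: List[float]) -> float:
    """Standard deviations a GPA sits above (+) or below (-) the class mean."""
    mu = mean(class_gpas)
    # Population standard deviation, since the whole class is known.
    sigma = pstdev(class_gpas)
    if sigma == 0:
        raise ValueError("all GPAs are identical; there is no spread to measure")
    return (gpa - mu) / sigma


# Hypothetical, tightly bunched 1L GPAs (illustrative numbers only).
class_gpas = [3.2, 3.3, 3.3, 3.4, 3.3]
print(round(z_score(3.4, class_gpas), 2))
```

One caveat on the design: when grades are tightly bunched, the standard deviation is small, so a tenth of a grade point can translate into a large z-score. The statistic still reports relative standing, but it says nothing about how much the underlying performances actually differed.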

Posted by: Jack Preis | Sep 14, 2016 4:01:22 PM

"There is relatively wide agreement on the knowledge that first year students should acquire." I would think there is relatively little agreement on that, actually. Much of what 1L professors teach is legal skills, such as how to read cases, read statutes, and apply legal standards. The content of the courses is only partially relevant. Whether a professor happens to cover the Rule Against Perpetuities doesn't matter all that much; what matters is how well students master whatever set of legal rules is taught.

More broadly, I think the purpose of law school grades is to provide signals to employers about which applicants they should hire. If professors grade so as to signal lawyerly ability, they are telling employers who has the better or less good legal ability. Without a curve, isn't that signal often lost?

Posted by: Orin Kerr | Sep 14, 2016 3:37:18 PM

Grades are increasingly becoming meaningless as the median continues its inexorable march to the ceiling. A law school that eliminates first year curves will only accelerate the process.

Do we really want to force employers to use other measures by not providing them any useful information about our students? Consider a world, one which we are partially already in, where getting a letter of recommendation from a particular small group of professors is the best way of ensuring career success.

The part of the proposal that should be eliminated is: "To be clear, I would not make these standards mandatory on any professor or school." Law schools should embrace Oxbridge style external examination for their mandatory classes.

Posted by: brad | Sep 14, 2016 3:26:42 PM

Good point, Phil. It's not right to say that the curve "fixes" anything. It simply normalizes the distribution so it doesn't raise any eyebrows. I think my point about the association between professor discretion and the curve still holds, but I appreciate the correction on my description.

On your point about what we should expect of law professors, I think I largely agree with you. The problem, however, is how to enforce those expectations. Suppose I teach tax law but, because I have discretion to teach it how I want, teach it in a rather avant garde way. Assuming I'm doing this in good faith, who is equipped at my school to know that I've ventured too far off the beaten path, pedagogically speaking? Maybe the other tax professors, but probably not the dean or associate dean (unless they are tax professors). My point is not that there is no objective way to measure students, but rather that it's hard to figure out within a particular school who has control over that decision. That is why an AALS-convened standard of learning could be useful--it provides information to non-tax professors about what tax professors, on the whole, seem to think is important. The dean could still let my avant garde approach slide by, but she would have more information on whether that is appropriate and, thus, presumably make a better decision for the benefit of students.

Posted by: Jack Preis | Sep 14, 2016 3:02:57 PM

Except that the curve does not correct potential flaws introduced by faculty discretion. The curve takes a distribution of grades and makes it appear statistically normal. The measure still has little bearing on mastery of the subject matter if the professor wrote a poorly designed exam. It also masks the effects of lowered admission standards (the top student, even if a relatively poor student by historical standards, is the top student).

It should be proper to give a non-normal distribution of grades if the set of students is not statistically normal. I have never accepted the rationale given by law schools for curving grades. The professors should be competent enough to objectively assess excellent, proficient, marginal and poor performance when they see it. It is a fundamental task of the academy.

Posted by: Phil | Sep 14, 2016 2:44:02 PM
