Photo shows a lecture hall with many students sitting in rows, facing the front of the room where a professor stands near a podium.
Photo by the University of Manchester School and College’s Photostream, Flickr CC

It’s teaching evaluation season again, when universities collect anonymous student evaluations of each class, contingent faculty wonder whether their scores will help them get another contract, and female faculty brace themselves for comments about their appearance. Teaching evaluations continue to be the most widely used (often the only) tool to evaluate and reward college-level teaching, despite a long history of research on gender bias in evaluations. New research considers how the design of evaluations affects their outcomes, and whether simply changing the number of points in a rating scale reduces the size of gender gaps.

Lauren A. Rivera and András Tilcsik studied a large North American university that moved from a 10-point scale to a 6-point scale in its teaching evaluations. The change in scale allowed the researchers to test whether the same professors, teaching the same courses, were evaluated differently on the 6-point scale than on the 10-point scale. Rivera and Tilcsik also performed an experiment in which participants evaluated identical lecture transcripts, in order to control for teacher quality and improvement.

Changing from a 10-point to a 6-point scale significantly affected the gender gap in the most male-dominated fields. Specifically, on the 10-point scale 31.4% of male professors received the highest score, while only 19.5% of female professors did. A ten was the most common rating for male professors, followed by a nine and then an eight. For female professors, an eight was the most common, followed by a nine and then a ten.

After switching to the 6-point scale, the gender gap disappeared. On the 6-point scale, 41.2% of male professors and 41.7% of female professors received the highest score. The experiment likewise showed a statistically significant difference between ratings of men and women on the 10-point scale, and no statistically significant difference on the 6-point scale, for professors in male-dominated fields.

The authors hypothesize that a ten on a 10-point scale connotes brilliance, a trait that students are less likely to attribute to female professors in these male-dominated fields. While ingrained biases are difficult to shift, careful construction of evaluation instruments is an achievable step for organizations looking to mitigate gendered effects.