As teams assemble questionnaires for us to review, the draft survey instruments often end up with a mishmash of scales: different questions use scales with three, four, five, seven, eleven, or more points, as everyone incorporates their favorite scale. One of the first revisions we make is to replace all the different-sized unipolar scales with five-point scales.
In 2008, Jon Krosnick, professor of communication at Stanford, and then-doctoral student Alexander Tahk wrote “The Optimal Length of Rating Scales to Maximize Reliability and Validity”:
Survey research frequently uses multi-point scales to assess respondents’ views. These scales vary from two points (e.g., agree or disagree) to 101 points (e.g., the American National Election Study’s thermometer-style ratings). Scales can also vary in another regard: being bipolar (meaning the zero point is in the middle and the end points are opposites, such as extremely positive and extremely negative) or unipolar (meaning the zero point is at one end, as in “not at all important”). However, different scale lengths may differ in reliability, so it is important to understand how the length of the scales affects the reliability of the responses.
To explore the relation between scale length and reliability, we conducted a meta-analysis of the results of many past studies. Our data consist of results from 706 tests of reliability taken from thirty different between-subject studies. We combined various measures of reliability and various sample sizes, controlling for these and other factors in determining the relation of scale length to reliability.
In general, we found that five- or seven-point scales produced the most reliable results. Bipolar scales performed best with seven points, whereas unipolar scales performed best with five. We also found that offering a midpoint on a bipolar scale, indicating a neutral position, increased reliability.
The situation has evolved for bipolar scales since this was published, with the best practice now being to break such questions into batteries of three or four questions. For more on bipolar scales, see “When and How to Use Bipolar Scales”.
The other best practice for five-point scales is to label every point (e.g., Not at all satisfied, Hardly satisfied, Somewhat satisfied, Very satisfied, Completely satisfied) and to hide any numbers that might be used for analysis behind the scenes. See “The Case for Fully Labeled Scales” for the research behind that.
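As a rough illustration of keeping the numbers behind the scenes, here is a minimal Python sketch. It assumes the five labels above are coded 1 through 5 by position; that coding, and the names `SATISFACTION_SCALE`, `LABEL_TO_CODE`, `displayed_choices`, `encode_responses`, and `mean_score`, are hypothetical choices made for this example rather than part of any survey platform.

```python
# Hypothetical five-point unipolar satisfaction scale.
# Respondents see only the labels; numeric codes are kept for analysis.
SATISFACTION_SCALE = [
    "Not at all satisfied",
    "Hardly satisfied",
    "Somewhat satisfied",
    "Very satisfied",
    "Completely satisfied",
]

# Assumed coding: 1..5 by position. These codes are never shown to respondents.
LABEL_TO_CODE = {
    label: code for code, label in enumerate(SATISFACTION_SCALE, start=1)
}


def displayed_choices():
    """Return what the respondent sees: fully labeled points, no numbers."""
    return list(SATISFACTION_SCALE)


def encode_responses(responses):
    """Convert the labels respondents selected into numeric codes."""
    return [LABEL_TO_CODE[label] for label in responses]


def mean_score(responses):
    """Average the numeric codes for a list of selected labels."""
    codes = encode_responses(responses)
    return sum(codes) / len(codes)
```

With this arrangement the questionnaire renders only the output of `displayed_choices()`, while something like `mean_score(collected_labels)` works on the hidden codes at analysis time.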
Originally published on December 5, 2017. Updated to use currently recommended scale wording.