An Experiment Testing Six Formats of 0 to 100 Rating Scales
University of Michigan
Sample size: 2106
Field period: 04/28/2014-10/21/2014
Although short- to medium-length rating scales are popular in surveys, long scales are not uncommon. For example, the American National Election Studies (ANES) use a 0-to-100 scale to measure people's feelings toward various subjects such as political parties, candidates, and racial and ethnic groups. Although longer scales are supposed to provide more refined measures, one major measurement issue is the rounding of responses. In the 2012 ANES, for instance, the lowest rounding rate across the 45 feeling thermometer questions is 95%. One alternative to the numeric question is the visual analog scale (VAS). As many surveys move to the Web, the VAS becomes easier to implement. However, the VAS has received little attention in the literature, and most of the existing research has compared the VAS with other scale formats that have very different response options. In this study, we tested six types of 0-to-100 response scales. The main goal is to determine which type of scale has lower response difficulty and yields more precise answers.
The VAS will reduce the task difficulty of responding to the survey questions. Hence, there will be fewer rounded answers, less item nonresponse, and shorter response latency. Also, respondents' self-reported task difficulty will be lower in the VAS conditions than in the numeric open text box condition.
Experimental condition 1:
Respondents in this condition will be provided with a VAS with dynamic feedback. The VAS ranges from 0 to 100, with no numeric labels. Only the two endpoints will be given appropriate verbal labels. The respondent will be instructed to drag a slider bar to the location that indicates his or her attitude or opinion. The respondent will receive numerical feedback as he or she moves the slider bar (e.g., 73%).
Experimental condition 2:
The VAS in this condition is identical to the one described in experimental condition 1 with one alteration: no dynamic feedback will be provided to the respondent as he or she moves the slider.
Experimental condition 3:
Respondents in this condition will be provided with a VAS with dynamic feedback. The VAS ranges from 0 to 100, with verbal labels on the two endpoints. Numeric labels at multiples of 5 and 10 will also be provided on the VAS. No numeric labels or verbal labels will be used at any other point on the scale. The numeric feedback is the same as in experimental condition 1.
Experimental condition 4:
The VAS in this condition is identical to the one in experimental condition 3 with one alteration: no dynamic feedback will be provided to the respondent as he or she moves the slider.
Experimental condition 5:
A drop-down menu is provided for each question, ranging from 0 to 100, with verbal labels for the top and bottom two options only. The respondent will be instructed to choose the number that best describes his or her attitude or opinion.
Experimental condition 6 (control condition):
This condition requires a numeric input directly from the respondents. A small box is provided after the question and the respondent will be instructed to give one number from 0 to 100 to describe his or her attitude or opinion.
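The six conditions above can be read as a crossing of numeric labels and dynamic feedback within the VAS, plus two non-VAS formats. A minimal sketch of that layout as a lookup table follows; the class, field, and function names are illustrative assumptions and do not come from the study materials.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Condition:
    number: int
    response_format: str              # "vas", "dropdown", or "textbox" (hypothetical codes)
    numeric_labels: Optional[bool]    # None where the attribute does not apply
    dynamic_feedback: Optional[bool]  # None where the attribute does not apply


# Attribute values follow the condition descriptions in the text.
CONDITIONS = [
    Condition(1, "vas", numeric_labels=False, dynamic_feedback=True),
    Condition(2, "vas", numeric_labels=False, dynamic_feedback=False),
    Condition(3, "vas", numeric_labels=True, dynamic_feedback=True),
    Condition(4, "vas", numeric_labels=True, dynamic_feedback=False),
    Condition(5, "dropdown", numeric_labels=None, dynamic_feedback=None),
    Condition(6, "textbox", numeric_labels=None, dynamic_feedback=None),  # control
]


def vas_conditions(numeric_labels: Optional[bool] = None,
                   dynamic_feedback: Optional[bool] = None) -> List[Condition]:
    """Return the VAS conditions, optionally filtered on either attribute."""
    selected = [c for c in CONDITIONS if c.response_format == "vas"]
    if numeric_labels is not None:
        selected = [c for c in selected if c.numeric_labels == numeric_labels]
    if dynamic_feedback is not None:
        selected = [c for c in selected if c.dynamic_feedback == dynamic_feedback]
    return selected
```

For instance, `vas_conditions(numeric_labels=False)` selects conditions 1 and 2, the two VAS versions with only the endpoints labeled.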
1. The percentage of rounded numeric answers (i.e., responses that are multiples of 5)
2. The percentage of item nonresponse
3. Response latency in seconds
4. Self-report of task difficulty
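The first two outcomes can be computed directly from the recorded answers. The sketch below is one possible way to do so, not the authors' analysis code. It assumes that responses are stored as integers from 0 to 100, that a skipped item is recorded as `None`, and that the rounding percentage is taken over answered items only; the function names are hypothetical.

```python
from typing import Dict, List, Optional


def is_rounded(value: int) -> bool:
    """A response counts as rounded when it is a multiple of 5."""
    return value % 5 == 0


def summarize(responses: List[Optional[int]]) -> Dict[str, float]:
    """Percentage of rounded answers and of item nonresponse.

    Assumption: None marks a skipped item, and the rounding
    percentage uses answered items as its denominator.
    """
    answered = [r for r in responses if r is not None]
    n_total = len(responses)
    n_missing = n_total - len(answered)
    pct_nonresponse = 100 * n_missing / n_total if n_total else float("nan")
    pct_rounded = (
        100 * sum(is_rounded(r) for r in answered) / len(answered)
        if answered
        else float("nan")
    )
    return {"pct_rounded": pct_rounded, "pct_nonresponse": pct_nonresponse}


# Made-up illustrative answers, not study data.
example = summarize([50, 73, None, 85, 42])
```

In this made-up example, two of the four answered items (50 and 85) are multiples of 5, and one of the five items is missing.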
Summary of Results
The proportion of rounded answers is lower in the VAS than in the numeric text input questions. Across the four versions of the VAS and all the items, the proportion of rounded answers decreases to about 60% or less. Among them, the VAS designs without numeric labels (only end points are labeled) and dynamic feedback consistently produce fewer rounded answers than the other three variations of VAS. Also, the two versions of the VAS without numeric labels tend to evoke fewer rounded answers than the VAS with numeric labels. The presence or absence of numeric feedback has little effect on the rate of rounded answers. These findings are strong evidence that the VAS yields better measurement quality than the text input question for 101-point rating scales.
The item nonresponse rates are very low in all six conditions and do not differ from one another. This suggests that data quality, when measured by item nonresponse, does not differ across the question formats in any reliable way.
Task difficulty is measured through both response times and the respondents' subjective ratings. For response times, the differences arise primarily because the drop-down and open-ended numeric text input formats are slower than the four versions of the VAS, which have very similar response times. For the subjective task difficulty rating, there is no significant difference across the six rating scale designs. These findings suggest that the improved measurement accuracy of the VAS does not come at the cost of increased difficulty.
References
Liu, Mingnan, and Frederick G. Conrad. "An experiment testing six formats of 101-point rating scales." Computers in Human Behavior 55 (2016): 364-371.