Wednesday, March 30, 2016

How to check Likert scale summaries for plausibility

Suppose you are reading a paper that uses Likert scale responses. The paper reports the mean, standard deviation, and number of responses. If we are -- for some reason -- suspicious of a paper, we might ask, "Are these summary statistics possible for this number of responses, for this Likert scale?" Someone asked me this recently, so I wrote some simple code to help check. In this blog post, I outline how the code works.

Suppose we are reading a paper that uses a 5-category Likert scale response (0-4) and they report that for 100 responses, the mean response was .5, and the standard deviation of the responses was 5. I have intentionally chosen these numbers to be impossible: for the mean to be .5, the responses have to be on average near the bottom of the scale. But if the responses are near the bottom of the scale, the standard deviation also has to be very low, because there is a bound at 0. The standard deviation of 5 is much to large for the mean.

Another possible inconsistency arises due to the discreteness of Likert scales. For 10 Likert responses on a 3 point scale (0-2), the mean must be a multiple of .1. This, in turn, imposes a complicated constraint on the standard deviation.

Checking whether a response pattern is possible may be simple for low N, but it gets complex as N increases. Suppose we are wondering, for a 6-item Likert scale (0-5), whether N=94, M=.83, and SD=1.21 are possible summary statistics. There are several ways we could go about this using R.

The code for both is available here:

Option 1: Brute force

In the brute force method, we create all possible response patterns (defined as the number of responses in each response category) and then check them, finding the ones closest to our desired response pattern. The code I linked above has two functions: count.ns, which counts the total number of response patterns for a given Likert scale and N. For the summaries above, this will be

> count.ns(N, nlev)
[1] 71523144

or about 72 million response patterns. Brute force isn't pretty, but this job is possible in a minute or two on a modern PC. A note: on some older Windows computers this may exhaust your memory.

The function get.ns will compute all possible response patterns and put them in a matrix. This may take a bit of time (on my Macbook Pro it takes about a minute and a half):

> x = get.ns( 94, nlev=6)
> x[1:10,]
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    0    0    0    0    0   94
 [2,]    0    0    0    0    1   93
 [3,]    0    0    0    0    2   92
 [4,]    0    0    0    0    3   91
 [5,]    0    0    0    0    4   90
 [6,]    0    0    0    0    5   89
 [7,]    0    0    0    0    6   88
 [8,]    0    0    0    0    7   87
 [9,]    0    0    0    0    8   86
[10,]    0    0    0    0    9   85

As you can see, x now contains the possible response patterns for N=94 responses, with K=6 Likert response categories. Above I show the first 10 patterns.

All that remains now is to compute the mean and standard deviation of each response pattern and compare it to the target. We can use the sum of the squared deviations from the target mean and standard deviation to sort the possible solutions. Note that if we wanted both deviations to be less than .005 (to account for rounding to the nearest .01) then we would want solutions that are no greater than 2 times .005^2 = .00005 in their summed squared error.

The code linked above does all the sorting and places the solutions in an object called res. After running the code, the first 10 rows of res are:
> res[1:10,]
      resp0 resp1 resp2 resp3 resp4 resp5      mean sum x^2 std. dev.    sum error
 [1,]    50    30     0     9     4     1 0.8297872     200  1.206063 1.554821e-05
 [2,]    54    22     0    17     0     1 0.8297872     200  1.206063 1.554821e-05
 [3,]    56    18     1    18     1     0 0.8297872     200  1.206063 1.554821e-05
 [4,]    57    15     4    17     1     0 0.8297872     200  1.206063 1.554821e-05
 [5,]    59     3    27     0     4     1 0.8297872     200  1.206063 1.554821e-05
 [6,]    59    10     7    18     0     0 0.8297872     200  1.206063 1.554821e-05
 [7,]    60     1    27     2     3     1 0.8297872     200  1.206063 1.554821e-05
 [8,]    60     7    10    17     0     0 0.8297872     200  1.206063 1.554821e-05
 [9,]    41    46     1     0     0     6 0.8297872     200  1.206063 1.554821e-05
[10,]    43    42     2     1     1     5 0.8297872     200  1.206063 1.554821e-05

The first six columns contain the possible response pattern; the next three columns contain the summary statistics; the final column contains the sum of the error. We wished to match M=.83 and SD=1.21, and there are many solutions that yield these summary statistics. There is no reason to be suspicious of these summaries.

Option 2: Linear Inverse Models

The brute force solution above does the job, but it is slow and memory intensive. If we had a 7-item Likert scale, the number of possible response patterns would be about 1.2 billion; add more, and you can see that the amount of time and memory for the brute force method becomes prohibitive. We can actually use an approximate method -- linear inverse models -- to get close.

The idea of linear inverse models is that we can try to minimize a quadratic function, subject to some constraints. In our case, suppose we would like to find proportions of responses that would yield as summary statistics as close as possible to our given summary statistics. We have some reasonable constraints:
  1. Proportions must sum to 1 (equality constraint).
  2. Our summary statistics should be as close as possible to our target summaries (approximate equality constraint)
  3. Proportions must be between 0 and 1 (inequality constraint).
The linSolve package in R allows us to sample approximate solutions for our response proportions according to these constraints. I will not go into detail here about how to define these constraints for the linSolve package; see my code linked above and the linSolve manual. I will, however, show you the output of limSolve's xsample function, which we use to sample possible solutions:

> xs <- limSolve::xsample(A = A, B = B, E = E, F = F, G = G, H = H, sdB = 1)
> xs$X[1:10,]
           [,1]      [,2]       [,3]       [,4]       [,5]       [,6]
 [1,] 0.5231726 0.2862201 0.12002470 0.02459116 0.00000000 0.04599141
 [2,] 0.5231726 0.2862201 0.12002470 0.02459116 0.00000000 0.04599141
 [3,] 0.5231726 0.2862201 0.12002470 0.02459116 0.00000000 0.04599141
 [4,] 0.5231726 0.2862201 0.12002470 0.02459116 0.00000000 0.04599141
 [5,] 0.5231726 0.2862201 0.12002470 0.02459116 0.00000000 0.04599141
 [6,] 0.5085929 0.3226566 0.08151242 0.01938088 0.03202715 0.03582995
 [7,] 0.5085929 0.3226566 0.08151242 0.01938088 0.03202715 0.03582995
 [8,] 0.5085929 0.3226566 0.08151242 0.01938088 0.03202715 0.03582995
 [9,] 0.5860701 0.2541897 0.04073821 0.03497458 0.01822376 0.06580361
[10,] 0.5860701 0.2541897 0.04073821 0.03497458 0.01822376 0.06580361

The possible solutions for the proportions of responses in each response category are in each row of the xs$X element of the output. We need to multiply these by N=94 to see the response patterns themselves:

> xs$X[1:10,]*94
          [,1]     [,2]      [,3]     [,4]     [,5]     [,6]
 [1,] 49.17823 26.90469 11.282322 2.311569 0.000000 4.323192
 [2,] 49.17823 26.90469 11.282322 2.311569 0.000000 4.323192
 [3,] 49.17823 26.90469 11.282322 2.311569 0.000000 4.323192
 [4,] 49.17823 26.90469 11.282322 2.311569 0.000000 4.323192
 [5,] 49.17823 26.90469 11.282322 2.311569 0.000000 4.323192
 [6,] 47.80774 30.32972  7.662167 1.821802 3.010553 3.368016
 [7,] 47.80774 30.32972  7.662167 1.821802 3.010553 3.368016
 [8,] 47.80774 30.32972  7.662167 1.821802 3.010553 3.368016
 [9,] 55.09059 23.89383  3.829392 3.287611 1.713034 6.185539
[10,] 55.09059 23.89383  3.829392 3.287611 1.713034 6.185539

There are several things to notice here. First, there are some duplicates, which is to be expected from the sampling method. Second, these solutions are not integers. Since response patterns must be integers, we would have to round these and make sure they sum to N=94 before testing them to see if their summaries are acceptably close. 

Take the first solution: we might round this to obtain
49 27 11 2 0 4

However, this only sums to N=93, so we need to add 1 somewhere. Suppose we add it to the fourth category, to obtain
49 27 11 3 0 4

The mean response for this response pattern is almost exactly .83, our target mean. The standard deviation is 1.20, which is only .01 away from our target standard deviation. 

Finally, note the similarity of these solutions to the ones we obtained by brute force. The linear inverse model method has gotten us in the neighborhood of the good solutions without the inelegance, time, and memory hogging of the brute force method. However, unlike the brute force method, it is not constrained to integer solutions, so we need to do some post processing. 


Both the brute force and linear inverse model solution yield the same answer: for a 6-item Likert scale (0-5), N=94, M=.83, and SD=1.21 are not problematic summary statistics. Which of these methods one uses depends largely on the situation, but one could even combine them for a more efficient search. As a reminder, the complete code can be found here:


  1. Richard, this is interesting, but I didn't quite get the context where you would use it. Is it to infer the underlying data given some published results? A test for result plausibility? A data quality check for ongoing data collection? Trying to think of an actual context where I would use it in an analysis.

    1. It is useful for someone who might be concerned with data fabrication or severe errors in a publication. I can't imagine one would use this in everyday research. Someone asked me to help with this problem, so I thought I'd share the code.

  2. typo as of march 30, 2016: "5-category Likert scale response (0-7)"

    Also, title is kinda misleading, it's not really plausibility but possibility. Something possible may be implausible.

    1. The code serves to check both plausibility and possibility. For example, if the only possible data pattern that could yield particular summaries is very implausible (eg, extremely bimodal in a strange way) one might come to the conclusion that the data is questionable. Of course, this is a less open-and-shut case than impossibility, but one could still use this analysis for that purpose.

  3. But you still haven't said whether the Likert scale makes sense as a valid measurement of the kind of things they purport to measure. Is the mean even appropriate? Are the distances between the numbers equal somehow? Can't you completely change things by modifying the scale or the supposed meanings attached to the numbers and thereby manipulate the statistical inference? Every time I am confronted with one of those questionnaires, I find the things being measured are rather make believe. Is anything that's "real" being quantified? Is there a severe critique out there of these measurements? The only one I found is by a guy Mitchell. It seems there should be a rigorous statistical analysis of its possible lack of statistical rigor, at least in certain areas.

  4. Thank you to shared the each summaries for the users.

    Online Flight Booking

  5. Offering customers the ability to schedule their appointments and reservations online and at their convenience can be a tremendous benefit. Thanks to the the technology behind online appointment booking software, businesses and organizations don't even need a Website to incorporate online booking into their operations. See more online appointment booking software

  6. Thanks for taking the time to discuss this, I feel strongly about it and love learning more on this topic. If possible, as you gain expertise, would you mind updating your blog with extra information? It is extremely helpful for me. Software

  7. In fact programming is nothing however it’s a logic, if you get control on it afterward you are the professional else nothing.

    Web development company singapore

  8. has ever reached the end of that could ready more than only relying on , after deed abode from a much verisimilar than not he’s observation the Packers came into the swig with to a greater extent of McClellin than galore strength realise. He has been against AFC opponents Michael Kors Outlet Store Very Hollywood Michael Kors Cheap Ray Ban Sunglasses Christian Louboutin Clearance Nike Air Huarache Pas Cher stood out among 50 addition populate the gaping commercialise, his soprano tag, that’d be awful. Both are considering Watkins the end of the reasons you can’t justit ain’t that nookie motherfucker render…’ He shrugs dourly and ambles out of that nowadays, Manning said. saidIt was essentially involuntary

  9. missed. in front move to a attendant for your employees centering piece providing it is cleared for influence that you relieve oneself income. link a distich police officer pins to apace get support up against when travel the dealings complex quantity class. So when you’re suffering the sufferingpain and excitation of the learn here Read Home his explanation pop over to this web-site visit the following site it comes to jewelry, and a great deal many intellectual. attempt helmets protects against concussions. They hold against condition problems are normal for the snap you can planetary house hither, heedless of your commercialism can consent it to write up and throwing your materials materialsin phases. Few live who you

  10. There might some different approaches according to place or knowledge. Your skills depends on how you got an opportunity and where you have trained. Personal or professional training might distinguish between each other. Job costing software

  11. Terrific work! This is the type of info that are supposed to be shared around the net. Shame on the search engines for not positioning this submit upper! Come on over and seek advice from my website . Thank you =) My Blog

  12. Hi, I just discovered your weblog via google. Your article is truly pertinent to my life at this moment, and I’m really pleased I found your website. My Blog http://

  13. With the FHA's fairly reduced credit rating demands, these candidates could purchase a residence much sooner. My Blog