Include zero frequencies in the frequency table for Likert data

I have a dataset with responses to a Likert item on a 9-point scale. I would like to create a frequency table (and bar plot) of the data, but some scale values never occur in my data, so table() drops them from the frequency table. I would like those values to appear with a frequency of 0 instead. That is, given the following data

    # Assume a 5-point Likert scale for ease of example
    data <- c(1, 1, 2, 1, 4, 4, 5)

I would like to get the following frequency table without having to manually insert a column named 3 with a value of 0.

    1 2 3 4 5
    3 1 0 2 1

I am new to R, so maybe I missed something basic, but I have not come across a function or parameter that gives the desired result.

+8
r frequency
3 answers

table creates a contingency table, whereas tabulate creates a frequency table that includes zero counts.

    tabulate(data)
    # [1] 3 1 0 2 1

Another way (this assumes integer values starting at 1, but is easy to adapt to other cases):

    setNames(tabulate(data), 1:max(data))  # names make the output easier to read
    # 1 2 3 4 5
    # 3 1 0 2 1
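One caveat with the tabulate approach: by default it only counts up to the largest value that occurs, so unused categories at the top of the scale are still dropped. A minimal sketch, assuming a 9-point scale (the name `n_points` is made up for this example), uses the `nbins` argument to fix the number of categories:

```r
# Example responses; the values 6 to 9 never occur
data <- c(1, 1, 2, 1, 4, 4, 5)

# Assumed length of the scale (hypothetical name, adjust to your instrument)
n_points <- 9

# nbins forces tabulate to return one count per scale point,
# including the unused points at the top of the scale
counts <- setNames(tabulate(data, nbins = n_points), seq_len(n_points))
print(counts)
# 1 2 3 4 5 6 7 8 9
# 3 1 0 2 1 0 0 0 0
```

Note that tabulate ignores values that are not positive integers, so this only fits scales coded 1, 2, 3 and so on.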
+5

EDIT:

tabulate creates frequency tables, and table creates contingency tables. However, to get zero frequencies in a one-dimensional contingency table, as in the example above, the code below still works.


This question provided the missing link. By converting the Likert item to a factor and explicitly defining its levels, levels with a frequency of 0 are still counted:

    data <- factor(data, levels = c(1:5))
    table(data)

This displays the desired result.
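The factor approach can be wrapped in a small function so the scale length is stated in one place. This is only a sketch: `likert_counts` and `n_points` are hypothetical names, and it assumes the scale runs from 1 to `n_points`.

```r
# Hypothetical helper: count responses on a scale running from 1 to n_points,
# keeping scale points that never occur as zero counts
likert_counts <- function(x, n_points) {
  table(factor(x, levels = seq_len(n_points)))
}

responses <- c(1, 1, 2, 1, 4, 4, 5)

freq <- likert_counts(responses, n_points = 5)
print(freq)

# Proportions instead of counts; the empty category stays at 0
props <- prop.table(freq)
print(round(props, 3))
```

Because `freq` is an ordinary table, passing it to barplot() should draw one bar per scale point, with a bar of height zero for the empty category.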

+17

If you want to quickly compute counts or proportions for several similar items and get the result as a single table with one row per item, you might like the response.frequencies function in the psych package.

Let's create some data (note that there are no 8s or 9s):

    df <- data.frame(item1 = sample(1:7, 2000, replace = TRUE),
                     item2 = sample(1:7, 2000, replace = TRUE),
                     item3 = sample(1:7, 2000, replace = TRUE))

If you want the proportion in each category:

 psych::response.frequencies(df, max = 1000, uniqueitems = 1:9) 

You will get the following:

               1      2     3      4      5      6      7 8 9 miss
    item1 0.1450 0.1435 0.139 0.1325 0.1380 0.1605 0.1415 0 0    0
    item2 0.1535 0.1315 0.126 0.1505 0.1535 0.1400 0.1450 0 0    0
    item3 0.1320 0.1505 0.132 0.1465 0.1425 0.1535 0.1430 0 0    0

If you want counts, you can multiply by the sample size:

 psych::response.frequencies(df, max = 1000, uniqueitems = 1:9) * nrow(df) 

You get the following:

            1   2   3   4   5   6   7 8 9 miss
    item1 290 287 278 265 276 321 283 0 0    0
    item2 307 263 252 301 307 280 290 0 0    0
    item3 264 301 264 293 285 307 286 0 0    0
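For comparison, similar counts can be produced in base R without the psych package. This is a minimal sketch, assuming every item shares a 1 to 9 scale; the names `items` and `scale_points` are made up for the example, and the seed is set only so the random data is reproducible.

```r
set.seed(42)  # only for reproducibility of the simulated data

# Simulated responses: three items, values 1 to 7 only (no 8s or 9s)
items <- data.frame(
  item1 = sample(1:7, 2000, replace = TRUE),
  item2 = sample(1:7, 2000, replace = TRUE),
  item3 = sample(1:7, 2000, replace = TRUE)
)

# Assumed full set of scale points, including those never observed
scale_points <- 1:9

# Tabulate each column against the full scale, then put items in rows
counts <- t(sapply(items, function(x) table(factor(x, levels = scale_points))))
print(counts)

# Proportions per item
props <- counts / nrow(items)
print(round(props, 4))
```

Unlike the psych output above, this sketch has no "miss" column; missing values are simply left out of the counts.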

A few notes:

  • The default max is 10, so if you have more than 10 response options you will run into problems. Otherwise, in your case and in many cases with Likert items, you can omit the max argument.
  • uniqueitems specifies the possible values. If every value occurred in at least one item, it would be inferred from the data.
  • I think the function only works with numeric data, so if your response categories are coded as text ("Strongly disagree", etc.), it will not work.
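If the responses are stored as text labels, the factor approach from the earlier answer still applies. A minimal sketch, assuming a 5-point agreement scale; the label set and the names `scale_labels` and `responses` are illustrative, not taken from the question:

```r
# Assumed response labels, in scale order (hypothetical example scale)
scale_labels <- c("Strongly disagree", "Disagree", "Neutral",
                  "Agree", "Strongly agree")

# Example responses; "Disagree" and "Strongly agree" never occur
responses <- c("Agree", "Agree", "Strongly disagree", "Neutral", "Agree")

# Explicit levels keep the unused labels and fix their order
f <- factor(responses, levels = scale_labels)

freq <- table(f)
print(freq)

# Integer codes (1 = first level), for functions that need numeric input
codes <- as.integer(f)
print(codes)
# [1] 4 4 1 3 4
```

The integer codes could then be handed to a function that expects numeric data, with the caveat that the coding depends entirely on the order given in `scale_labels`.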
0