Mimic Stata's tabulate command in R

I am trying to produce a two-way table in R like the one below from Stata. I tried the CrossTable function from the gmodels package, but the resulting table is different. Do you know how to do this in R?

Ideally, I would at least like to get the frequencies for when cursmoke1 == "Yes" and cursmoke2 == "No", and vice versa.

In R, I only get totals for Yes, No, and NA.

Here is the result:

Stata

    . tabulate cursmoke1 cursmoke2, cell column miss row

    +-------------------+
    | Key               |
    |-------------------|
    |     frequency     |
    |  row percentage   |
    | column percentage |
    |  cell percentage  |
    +-------------------+

       Current |
       smoker, |      Current smoker, exam 2
        exam 1 |        No        Yes          . |     Total
    -----------+---------------------------------+----------
            No |     1,898        131        224 |     2,253
               |     84.24       5.81       9.94 |    100.00
               |     86.16       7.59      44.44 |     50.81
               |     42.81       2.95       5.05 |     50.81
    -----------+---------------------------------+----------
           Yes |       305      1,596        280 |     2,181
               |     13.98      73.18      12.84 |    100.00
               |     13.84      92.41      55.56 |     49.19
               |      6.88      35.99       6.31 |     49.19
    -----------+---------------------------------+----------
         Total |     2,203      1,727        504 |     4,434
               |     49.68      38.95      11.37 |    100.00
               |    100.00     100.00     100.00 |    100.00
               |     49.68      38.95      11.37 |    100.00

R

    > CrossTable(cursmoke2, cursmoke1, missing.include = T, format="SAS")

       Cell Contents
    |-------------------------|
    |                       N |
    | Chi-square contribution |
    |           N / Row Total |
    |           N / Col Total |
    |         N / Table Total |
    |-------------------------|

    Total Observations in Table:  4434

                 | cursmoke1
       cursmoke2 |        No |       Yes |        NA | Row Total |
    -------------|-----------|-----------|-----------|-----------|
              No |      2203 |         0 |         0 |      2203 |
                 |  1122.544 |   858.047 |   250.409 |           |
                 |     1.000 |     0.000 |     0.000 |     0.497 |
                 |     1.000 |     0.000 |     0.000 |           |
                 |     0.497 |     0.000 |     0.000 |           |
    -------------|-----------|-----------|-----------|-----------|
             Yes |         0 |      1727 |         0 |      1727 |
                 |   858.047 |  1652.650 |   196.303 |           |
                 |     0.000 |     1.000 |     0.000 |     0.389 |
                 |     0.000 |     1.000 |     0.000 |           |
                 |     0.000 |     0.389 |     0.000 |           |
    -------------|-----------|-----------|-----------|-----------|
              NA |         0 |         0 |       504 |       504 |
                 |   250.409 |   196.303 |  3483.288 |           |
                 |     0.000 |     0.000 |     1.000 |     0.114 |
                 |     0.000 |     0.000 |     1.000 |           |
                 |     0.000 |     0.000 |     0.114 |           |
    -------------|-----------|-----------|-----------|-----------|
    Column Total |      2203 |      1727 |       504 |      4434 |
                 |     0.497 |     0.389 |     0.114 |           |
    -------------|-----------|-----------|-----------|-----------|
1 answer

Maybe I'm missing something, but the default settings for CrossTable seem to give essentially what you are looking for.

Here's CrossTable with minimal arguments. (I've loaded the dataset as "temp".) Note that the results match the Stata output you posted; you just need to multiply by 100 if you want the proportions as percentages.

    library(gmodels)
    with(temp, CrossTable(cursmoke1, cursmoke2, missing.include=TRUE))

       Cell Contents
    |-------------------------|
    |                       N |
    | Chi-square contribution |
    |           N / Row Total |
    |           N / Col Total |
    |         N / Table Total |
    |-------------------------|

    Total Observations in Table:  4434

                 | cursmoke2
       cursmoke1 |        No |       Yes |        NA | Row Total |
    -------------|-----------|-----------|-----------|-----------|
              No |      1898 |       131 |       224 |      2253 |
                 |   541.582 |   635.078 |     4.022 |           |
                 |     0.842 |     0.058 |     0.099 |     0.508 |
                 |     0.862 |     0.076 |     0.444 |           |
                 |     0.428 |     0.030 |     0.051 |           |
    -------------|-----------|-----------|-----------|-----------|
             Yes |       305 |      1596 |       280 |      2181 |
                 |   559.461 |   656.043 |     4.154 |           |
                 |     0.140 |     0.732 |     0.128 |     0.492 |
                 |     0.138 |     0.924 |     0.556 |           |
                 |     0.069 |     0.360 |     0.063 |           |
    -------------|-----------|-----------|-----------|-----------|
    Column Total |      2203 |      1727 |       504 |      4434 |
                 |     0.497 |     0.389 |     0.114 |           |
    -------------|-----------|-----------|-----------|-----------|

Alternatively, you can use format="SPSS" if you want the numbers displayed as percentages.

    with(temp, CrossTable(cursmoke1, cursmoke2, missing.include=TRUE, format="SPSS"))

       Cell Contents
    |-------------------------|
    |                   Count |
    | Chi-square contribution |
    |             Row Percent |
    |          Column Percent |
    |           Total Percent |
    |-------------------------|

    Total Observations in Table:  4434

                 | cursmoke2
       cursmoke1 |        No |       Yes |        NA | Row Total |
    -------------|-----------|-----------|-----------|-----------|
              No |      1898 |       131 |       224 |      2253 |
                 |   541.582 |   635.078 |     4.022 |           |
                 |   84.243% |    5.814% |    9.942% |   50.812% |
                 |   86.155% |    7.585% |   44.444% |           |
                 |   42.806% |    2.954% |    5.052% |           |
    -------------|-----------|-----------|-----------|-----------|
             Yes |       305 |      1596 |       280 |      2181 |
                 |   559.461 |   656.043 |     4.154 |           |
                 |   13.984% |   73.177% |   12.838% |   49.188% |
                 |   13.845% |   92.415% |   55.556% |           |
                 |    6.879% |   35.995% |    6.315% |           |
    -------------|-----------|-----------|-----------|-----------|
    Column Total |      2203 |      1727 |       504 |      4434 |
                 |   49.684% |   38.949% |   11.367% |           |
    -------------|-----------|-----------|-----------|-----------|

Update: prop.table()

Just FYI (to save you the tedious work of constructing your own data.frame by hand, as you did), you might also be interested in the prop.table() function.

Again, using the data you linked to and assuming it is named "temp", the following gives you the underlying tables from which you can construct your data.frame. You might also want to look at the functions margin.table() and addmargins():

    ## Your basic table
    CurSmoke <- with(temp, table(cursmoke1, cursmoke2, useNA = "ifany"))
    CurSmoke
    #          cursmoke2
    # cursmoke1   No  Yes <NA>
    #       No  1898  131  224
    #       Yes  305 1596  280

    ## Row proportions
    prop.table(CurSmoke, 1) # * 100 # If you so desire
    #          cursmoke2
    # cursmoke1         No        Yes       <NA>
    #       No  0.84243231 0.05814470 0.09942299
    #       Yes 0.13984411 0.73177442 0.12838148

    ## Column proportions
    prop.table(CurSmoke, 2) # * 100 # If you so desire
    #          cursmoke2
    # cursmoke1         No        Yes       <NA>
    #       No  0.86155243 0.07585408 0.44444444
    #       Yes 0.13844757 0.92414592 0.55555556

    ## Cell proportions
    prop.table(CurSmoke) # * 100 # If you so desire
    #          cursmoke2
    # cursmoke1         No        Yes       <NA>
    #       No  0.42805593 0.02954443 0.05051872
    #       Yes 0.06878665 0.35994587 0.06314840
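As a rough sketch of how addmargins() could supply the "Total" row and column that Stata prints, the block below rebuilds the table by hand from the counts shown above, so it runs without the original dataset. The hand-typed matrix, the "<NA>" label, and the names freq, row_pct, col_pct, cell_pct, yes_no and no_yes are assumptions made only for this illustration; with the real data you would start from the table() call instead.

```r
# Rebuild the 2 x 3 table of counts from the output above.
# Assumption: counts are typed in by hand; "<NA>" is just a text label here.
CurSmoke <- as.table(matrix(
  c(1898,  131, 224,
     305, 1596, 280),
  nrow = 2, byrow = TRUE,
  dimnames = list(cursmoke1 = c("No", "Yes"),
                  cursmoke2 = c("No", "Yes", "<NA>"))
))

# Frequencies with a "Sum" row and column, similar to Stata's "Total" margins
freq <- addmargins(CurSmoke)

# Row, column and cell percentages, rounded to two decimals like Stata
row_pct  <- round(100 * prop.table(CurSmoke, 1), 2)
col_pct  <- round(100 * prop.table(CurSmoke, 2), 2)
cell_pct <- round(100 * prop.table(CurSmoke), 2)

# The two off-diagonal frequencies asked about in the question
yes_no <- CurSmoke["Yes", "No"]   # cursmoke1 == "Yes", cursmoke2 == "No"
no_yes <- CurSmoke["No", "Yes"]   # cursmoke1 == "No",  cursmoke2 == "Yes"

print(freq)
print(row_pct)
```

Individual cells can be picked out by name in the same way from any of these tables, for example row_pct["No", "No"] for the row percentage in the first cell.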