Dummy variables in Julia

R has nice functionality for running a regression with dummy variables for each level of a categorical variable: an R factor is automatically expanded into a set of 1/0 indicator variables, one for each level of the factor.

Is there an equivalent way to do this in Julia? This is what I have at the moment:

    x = randn(1000)
    group = repmat(1:25 , 40)
    groupMeans = randn(25)
    y = 3*x + groupMeans[group]
    data = DataFrame(x=x, y=y, g=group)
    for i in levels(group)
        data[parse("I$i")] = data[:g] .== i
    end
    lm(y~x+I1+I2+I3+I4+I5+I6+I7+I8+I9+I10+
       I11+I12+I13+I14+I15+I16+I17+I18+I19+I20+
       I21+I22+I23+I24, data)
dataframe julia-lang glm
1 answer

If you are using the DataFrames package, once you pool the data, the package takes care of the rest. From the documentation:

Pooling columns is important for working with the GLM package. When fitting regression models, PooledDataArray columns in the input are translated into 0/1 indicator columns in the ModelMatrix, with one column for each of the levels of the PooledDataArray.
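To make the translation into 0/1 columns concrete, here is a rough sketch that builds the indicator columns by hand and solves the least-squares problem directly, using only Base Julia and the Random standard library. The helper name `indicator_columns` is hypothetical, and dropping the first level as the reference (so the columns are not collinear with the intercept) is an assumption about the coding scheme, not something taken from the package internals.

```julia
using Random

Random.seed!(1)
n, k = 1000, 25
x = randn(n)
group = repeat(1:k, n ÷ k)          # 1, 2, ..., 25, 1, 2, ...
group_means = randn(k)
y = 3 .* x .+ group_means[group]    # noise-free, as in the question

# Hypothetical helper: one 0/1 column per level, skipping the first
# level, which acts as the reference category.
function indicator_columns(g::AbstractVector, lvls::AbstractVector)
    return Float64[g[i] == l for i in eachindex(g), l in lvls[2:end]]
end

lvls = sort(unique(group))
D = indicator_columns(group, lvls)  # n x (k - 1) matrix of zeros and ones
X = hcat(ones(n), x, D)             # intercept, x, then the indicators
beta = X \ y                        # least-squares solution

println(size(X))
```

With this coding, `beta[1]` estimates the mean of the reference group, `beta[2]` the slope on `x`, and each remaining coefficient the difference between a group mean and the reference group mean.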

You can see the rest of the documentation on pooled data here.
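For readers on newer package versions: PooledDataArray was later replaced by categorical arrays, so the same idea looks a little different there. The following is a sketch that assumes current releases of DataFrames, GLM and CategoricalArrays are installed. The small noise term is an addition that is not in the question, included only so the fit is not exact.

```julia
using DataFrames, GLM, CategoricalArrays, Random

Random.seed!(1)
n = 1000
x = randn(n)
group = repeat(1:25, 40)
group_means = randn(25)
# Assumption: add a little noise (not in the original question)
y = 3 .* x .+ group_means[group] .+ 0.1 .* randn(n)

# Marking g as categorical is the modern counterpart of pooling the column
data = DataFrame(x = x, y = y, g = categorical(group))

# The formula machinery expands g into indicator columns automatically
model = lm(@formula(y ~ x + g), data)

println(length(coef(model)))
```

The integer column has to be made categorical explicitly; otherwise `g` would be treated as a single numeric predictor rather than as 25 levels.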
