I want to create a vector of dummy variables (each can only be 0 or 1). I have the following:
data = ['one','two','three','four','six']
variables = ['two','five','ten']
I came up with the following two approaches:
dummy = []
for variable in variables:
    if variable in data:
        dummy.append(1)
    else:
        dummy.append(0)
or with a list comprehension:
dummy = [1 if variable in data else 0 for variable in variables]
The result, in the same order as variables:
>>> dummy
[1, 0, 0]
Is there a built-in function that does this faster? My approach seems slow when there are thousands of variables.
Edit: timing results measured with time.time(), using the following data:
data = ['one','two','three','four','six']*100
variables = ['two','five','ten']*100000
- Loop (from my example): 2.11 s
- List comprehension: 1.55 s
- List comprehension (using a set): 0.0004992 s
- Example from St. Petersburg: 0.0004999 s
- Example from falsetrue: 0.000502 s
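For reference, here is a minimal sketch of what I mean by the set-based variant. I am assuming it amounts to converting data to a set once before the comprehension. The name make_dummies is made up for illustration, and the timing will differ from machine to machine.

```python
import time


def make_dummies(variables, data):
    """Return 1 for each variable found in data, else 0, in the order of variables."""
    # Assumption: building the set once is what makes this variant fast,
    # since set membership is O(1) on average versus O(n) for a list.
    lookup = set(data)
    return [1 if variable in lookup else 0 for variable in variables]


data = ['one', 'two', 'three', 'four', 'six'] * 100
variables = ['two', 'five', 'ten'] * 100000

start = time.time()
dummy = make_dummies(variables, data)
elapsed = time.time() - start

print(dummy[:3])            # first three flags, same order as variables
print(f"{elapsed:.6f} s")   # wall-clock time; varies by machine
```

The output keeps one entry per element of variables, so duplicates in variables are preserved; only data is turned into a set.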