Quickly apply string operations in a pandas DataFrame

Suppose I have a DataFrame with 100k rows and a column name. I would like to split this name into first and last name as efficiently as possible. My current method:

    def splitName(name):
        return pandas.Series(name.split()[0:2])

    df[['first', 'last']] = df.apply(lambda x: splitName(x['name']), axis=1)

Unfortunately, DataFrame.apply is really, really slow. Is there anything I can do to make this string operation almost as fast as a numpy operation?

Thanks!

python pandas
1 answer

Try this (requires pandas >= 0.8.1):

    # Use the vectorized .str accessor on the whole column instead of a row-wise apply
    splits = df['name'].str.split()
    df['first'] = splits.str[0]
    df['last'] = splits.str[1]
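For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The sample names are made up for illustration (the question does not show its data), and it assumes the column holds whitespace-separated strings. As in the question's code, "last" is simply the second token, so a middle name would land there.

```python
import pandas as pd

# Hypothetical sample data; the original question does not show its rows.
df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Mathison Turing", "Cher"]})

# Vectorized split on whitespace: each cell becomes a list of tokens.
splits = df["name"].str.split()

# .str[i] indexes into each list; a missing position gives a missing value
# instead of raising an error.
df["first"] = splits.str[0]
df["last"] = splits.str[1]

print(df)
```

If you need the true last token rather than the second one, `splits.str[-1]` should work the same way, but whether that is what you want depends on how your names are formatted.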