Quickly apply string operations in pandas DataFrame

Question

Suppose I have a DataFrame with 100k rows and a column name. I would like to split this name into first and last name as efficiently as possible. My current method:

    import pandas

    def splitName(name):
        # Keep only the first two whitespace-separated tokens
        return pandas.Series(name.split()[0:2])

    df[['first', 'last']] = df.apply(lambda x: splitName(x['name']), axis=1)

Unfortunately, DataFrame.apply is really, really slow. Is there anything I can do to make this string operation almost as fast as a numpy operation?

Thanks!

python pandas

duckworthd Oct 10 '12 at 22:29

1 answer

Wes McKinney · Accepted Answer · 2012-10-11T20:03:34+0000

Try (requires pandas >= 0.8.1):

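    # Split every name on whitespace in one vectorized call
    splits = df['name'].str.split()
    # Pick the first and second token of each split
    df['first'] = splits.str[0]
    df['last'] = splits.str[1]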
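For anyone who wants to try this end to end, here is a minimal self-contained sketch of the vectorized `.str` approach. The sample names are made up for illustration, and the snippet assumes a reasonably recent pandas; it is not a benchmark.

```python
import pandas as pd

# Hypothetical sample data; the real frame in the question has ~100k rows.
df = pd.DataFrame(
    {"name": ["Wes McKinney", "Ada Lovelace", "Guido van Rossum", "Cher"]}
)

# Vectorized split: each element becomes a list of whitespace-separated tokens.
splits = df["name"].str.split()

# .str[i] takes the i-th item of each list; rows with too few tokens get NaN.
df["first"] = splits.str[0]
df["last"] = splits.str[1]

print(df)
```

As with the `[0:2]` slice in the question, only the first two tokens are kept, so "Guido van Rossum" yields "van" as the last name, and a single-token name ends up with a missing value in `last`.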
