How to get cell value length in pandas dataframe?

I have a pandas DataFrame:

idx Event
0   abc/def
1   abc
2   abc/def/hij

I ran: df['EventItem'] = df['Event'].str.split("/")

Got:

idx EventItem
0   ['abc','def']
1   ['abc']
2   ['abc','def','hij']

I want the length of each cell, so I ran: df['EventCount'] = len(df['EventItem'])

Got:

idx EventCount
0   6
1   6
2   6

How can I get the correct count for each row, like this?

idx EventCount
0   2
1   1
2   3
2 answers

You can use .str.len() to get the length of each list, even though the values are lists rather than strings:

df['EventCount'] = df['Event'].str.split("/").str.len()

Alternatively, the count you are looking for is just one more than the number of "/" characters in each string, so you can add 1 to the result of .str.count:

df['EventCount'] = df['Event'].str.count("/") + 1

Either method gives the same result:

         Event  EventCount
0      abc/def           2
1          abc           1
2  abc/def/hij           3
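
For reference, here is a self-contained sketch that rebuilds the three-row frame from the question and checks both approaches. It also shows why the original attempt went wrong: len() on a Series returns the number of rows as a single integer, and pandas broadcasts that one number to every row on assignment. The variable names (row_count, via_split, via_count) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"]})

# len() of a Series is the row count (one int), not a per-row length.
# Assigning it to a column repeats the same value in every row.
row_count = len(df["Event"])

# Method 1: split into lists, then take the length of each list.
via_split = df["Event"].str.split("/").str.len()

# Method 2: count the separators and add one.
via_count = df["Event"].str.count("/") + 1

df["EventCount"] = via_split
print(df)
```

Note that .str.count treats its argument as a regular expression, so a separator with special meaning (such as "." or "|") would need escaping; "/" is safe as written.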

Timing on a slightly larger DataFrame:

%timeit df['Event'].str.count("/") + 1
100 loops, best of 3: 3.18 ms per loop

%timeit df['Event'].str.split("/").str.len()
100 loops, best of 3: 4.28 ms per loop

%timeit df['Event'].str.split("/").apply(len)
100 loops, best of 3: 4.08 ms per loop
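
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The test frame below (the three sample values repeated 1,000 times) is an assumption, since the answer does not say what the "slightly larger" DataFrame was, and the numbers will vary with machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test data: the three sample values repeated 1,000 times.
df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"] * 1000})

candidates = {
    "count + 1": lambda: df["Event"].str.count("/") + 1,
    "split + str.len": lambda: df["Event"].str.split("/").str.len(),
    "split + apply(len)": lambda: df["Event"].str.split("/").apply(len),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, converted to ms per call.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1000
    print(f"{name}: {timings[name]:.2f} ms per call")
```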

Use apply with len:

df['EventItem'].apply(len)

0    2
1    1
2    3
Name: EventItem, dtype: int64
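
One difference worth knowing, sketched below on the assumption that the column may contain missing values (the question's data has none): apply(len) calls the built-in len() on every cell, so it raises a TypeError on NaN, while .str.len() passes the missing value through as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's values with one missing entry.
events = pd.Series(["abc/def", np.nan, "abc/def/hij"], name="Event")
items = events.str.split("/")  # the missing value stays missing

# .str.len() propagates the missing value instead of failing.
lengths = items.str.len()

# apply(len) calls len() on the float NaN, which raises TypeError.
try:
    items.apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True

print(lengths)
print("apply(len) raised TypeError:", apply_failed)
```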