Using len() in a Pandas DataFrame

Question

This is what my DataFrame looks like:

     StateAb  GivenNm    Surname    PartyNm                  PartyAb  ElectedOrder
 35  WA       Joe        BULLOCK    Australian Labor Party   ALP      2
 36  WA       Michaelia  CASH       Liberal                  LP       3
 37  WA       Linda      REYNOLDS   Liberal                  LP       4
 38  WA       Wayne      DROPULICH  Australian Sports Party  SPRT     5
 39  WA       Scott      LUDLAM     The Greens (WA)          GRN      6

and I want to get a list of senators whose surname is longer than 9 characters.

So I think the code should be like this:

 df[len(df.Surname) > 9]

but this raises a KeyError. Where did I go wrong?

python pandas dataframe

Dong Sep 03 '16 at 11:09

2 answers

ayhan · Answer 1 · 2016-09-03T11:21:40+0000

The right way to filter a DataFrame by the length of the strings in a column is:

 df[df['Surname'].str.len() > 9]

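df['Surname'].str.len() creates a Series holding the length of each value in the Surname column, and df[df['Surname'].str.len() > 9] keeps only the rows where that length is greater than 9. What you did instead was check the length of the Series itself (how many rows it has). That comparison gives a single True or False rather than one value per row, and pandas then treats it as a column label to look up, hence the KeyError.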
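To make the difference concrete, here is a minimal sketch using the five rows shown in the question (only two columns are rebuilt, and the index values are copied from the sample). Since none of the sample surnames is longer than 9 characters, the sketch also uses a lower threshold of 7, which is purely an illustrative assumption.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame(
    {
        "GivenNm": ["Joe", "Michaelia", "Linda", "Wayne", "Scott"],
        "Surname": ["BULLOCK", "CASH", "REYNOLDS", "DROPULICH", "LUDLAM"],
    },
    index=[35, 36, 37, 38, 39],
)

# len() on a Series counts its rows, so this is one plain bool (5 > 9).
scalar_mask = len(df.Surname) > 9

# .str.len() works element-wise and returns one length per row.
lengths = df["Surname"].str.len()

# A boolean Series used as an indexer selects the matching rows.
over_nine = df[lengths > 9]    # empty for this five-row sample
over_seven = df[lengths > 7]   # illustrative threshold, not from the question

print(scalar_mask)
print(lengths.tolist())
print(over_seven)
```

With the full data set from the question, df[lengths > 9] would return whichever rows have a surname of ten or more characters.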

Sytse reitsma · Answer 2 · 2016-09-03T11:44:23+0000

Take a look at Python's built-in filter function. It does exactly what you want.

 # Note: df here is a plain list of dicts, not a pandas DataFrame.
 df = [
     {"Surname": "Bullock-ish"},
     {"Surname": "Cash"},
     {"Surname": "Reynolds"},
 ]
 longnames = list(filter(lambda s: len(s["Surname"]) > 9, df))
 print(longnames)
 # prints [{'Surname': 'Bullock-ish'}]
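The snippet above filters a list of dicts rather than a DataFrame. If the data starts out as a DataFrame, one possible way to use the same filter call is to convert the rows to dicts first. The sketch below assumes that approach; the names records, long_records and long_df are made up for illustration.

```python
import pandas as pd

# Same three surnames as in the answer, but held in a DataFrame.
df = pd.DataFrame({"Surname": ["Bullock-ish", "Cash", "Reynolds"]})

# to_dict("records") yields one dict per row, the shape filter() expects here.
records = df.to_dict("records")

# Keep only rows whose surname is longer than 9 characters.
long_records = list(filter(lambda row: len(row["Surname"]) > 9, records))

# Optionally turn the surviving rows back into a DataFrame.
long_df = pd.DataFrame(long_records)

print(long_records)
```

This round trip through Python objects is likely to be slower on large frames than the vectorised .str.len() approach from the first answer.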

Sytse
