Using len() on a pandas DataFrame column

My DataFrame looks like this:

     StateAb  GivenNm    Surname    PartyNm                  PartyAb  ElectedOrder
 35  WA       Joe        BULLOCK    Australian Labor Party   ALP      2
 36  WA       Michaelia  CASH       Liberal                  LP       3
 37  WA       Linda      REYNOLDS   Liberal                  LP       4
 38  WA       Wayne      DROPULICH  Australian Sports Party  SPRT     5
 39  WA       Scott      LUDLAM     The Greens (WA)          GRN      6

and I want to get a list of senators whose surname is longer than 9 characters.

So I think the code should be like this:

 df[len(df.Surname) > 9] 

but this raises a KeyError. Where did I go wrong?

2 answers

The right way to filter a DataFrame based on the length of the strings in a column is:

 df[df['Surname'].str.len() > 9] 

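df['Surname'].str.len() creates a Series holding the length of each string in the Surname column, and df[df['Surname'].str.len() > 9] keeps only the rows where that length is greater than 9. What you did was check the length of the Series itself (how many rows it has). That comparison produces a single True or False, which pandas then tries to look up as a column label, hence the KeyError.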
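To make the difference concrete, here is a minimal sketch using the surnames from the question. The last row ("Alex", "LONGSURNAME-EXAMPLE") is a made-up entry that is not in the original data; it is only there so that the filter has something to return, since none of the five sample surnames is longer than 9 characters.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical surname longer than
# 9 characters (not in the original data) so the filter returns a row.
df = pd.DataFrame(
    {
        "GivenNm": ["Joe", "Michaelia", "Linda", "Wayne", "Scott", "Alex"],
        "Surname": ["BULLOCK", "CASH", "REYNOLDS", "DROPULICH", "LUDLAM",
                    "LONGSURNAME-EXAMPLE"],
    }
)

# len() on the Series counts rows, so the comparison yields one plain bool.
row_count = len(df["Surname"])
scalar_mask = row_count > 9

# Indexing with a single bool is treated as a column lookup, which fails.
try:
    df[scalar_mask]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# .str.len() measures each string, so the comparison yields one bool per row.
name_lengths = df["Surname"].str.len()
long_names = df[name_lengths > 9]

print(long_names)
```

Only the hypothetical row survives the filter, because DROPULICH is exactly 9 characters long and the condition is strictly greater than 9.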


Take a look at Python's built-in filter function. It does what you want when the records are held in a plain list of dictionaries rather than a DataFrame:

 df = [
     {"Surname": "Bullock-ish"},
     {"Surname": "Cash"},
     {"Surname": "Reynolds"},
 ]
 longnames = list(filter(lambda s: len(s["Surname"]) > 9, df))
 print(longnames)
 # [{'Surname': 'Bullock-ish'}]

Sytse
