How to self-reference a column in a pandas DataFrame?

Question

In Python Pandas, I use a Data Frame as such:

drinks = pandas.read_csv(data_url)

where data_url is a string URL to a CSV file.

When indexing the frame for all "light drinkers", where a light drinker is defined as 1 drink, the following is written:

drinks.light_drinker[drinks.light_drinker == 1]

Is there a more DRY-like way to self-reference the "parent"? That is, something like:

drinks.light_drinker[self == 1]

python scipy pandas

James graham Jan 23 '15 at 0:48

3 answers

elyase · Answer 1 · 2015-01-23T03:13:22+0000

Now you can use query or assign depending on what you need:

drinks.query('light_drinker == 1')

or, to derive a new column from the df (assign returns a new DataFrame):

df.assign(strong_drinker = lambda x: x.light_drinker + 100)
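As a minimal sketch of both calls, assuming a small made-up frame in place of the CSV at data_url (the country column and the values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV at data_url.
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "light_drinker": [1, 0, 1, 3],
})

# query: the column is named inside the expression string, so the
# frame's variable name is written only once.
light = drinks.query("light_drinker == 1")

# assign: the lambda receives the frame itself, playing the role of "self".
# It returns a new DataFrame and leaves `drinks` unchanged.
with_strong = drinks.assign(strong_drinker=lambda x: x.light_drinker + 100)
```

Here light holds only the rows where light_drinker equals 1, and with_strong is a copy of drinks with the extra strong_drinker column.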

Old answer

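For simple cases, where might be enough. The API proposed at the time (df.set is not an actual pandas method) looked like this: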

df.set(new_column=lambda self: self.light_drinker*2)

aus_lacy · Answer 2 · 2015-01-23T01:23:16+0000

As far as I know, there is no self or this in Pandas, but if you want something more DRY, you can use where().

drinks.where(drinks.light_drinker == 1, inplace=True)

Windchimes · Answer 3 · 2016-07-06T20:46:23+0000

In newer versions of pandas, .where() also accepts a callable!

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.where.html?highlight=where#pandas.DataFrame.where

So, you can do the following:

drinks.light_drinker.where(lambda x: x == 1)

Note that this operates on the column alone (not the whole DataFrame, just light_drinker).
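A short sketch of the Series case, using made-up values. It also shows .loc with a callable, an alternative not mentioned in the answers, which is probably the closest match to the drinks.light_drinker[self == 1] form asked about:

```python
import pandas as pd

# Hypothetical values for the light_drinker column.
light_drinker = pd.Series([1, 0, 1, 3], name="light_drinker")

# where keeps every position and fills the non-matching ones with NaN.
kept_shape = light_drinker.where(lambda x: x == 1)

# .loc with a callable returns only the matching entries; the callable
# receives the Series itself, so its name is not repeated.
only_matches = light_drinker.loc[lambda s: s == 1]
```

With where, the result has the same length as the input. With .loc, only the entries at positions 0 and 2 remain.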

To apply the condition to the whole DataFrame, use:

drinks.where(lambda x: x.light_drinker == 1)

Note that this preserves the shape of self (meaning you will have rows where all entries are NaN, because the condition failed for the light_drinker value at that index).
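The shape-preserving behaviour can be seen in a small sketch, again assuming hypothetical data:

```python
import pandas as pd

# Hypothetical sample data.
drinks = pd.DataFrame({
    "country": ["A", "B", "C"],
    "light_drinker": [1, 0, 1],
})

# Rows where the condition fails are kept, but every entry becomes NaN.
masked = drinks.where(lambda x: x.light_drinker == 1)

# Dropping the all-NaN rows afterwards recovers just the matching rows.
matching = masked.dropna(how="all")
```

masked has the same number of rows and columns as drinks, and the row for country "B" is entirely NaN.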

If you don't want to preserve the shape of the DataFrame (i.e. you want to drop the NaN rows), use:

drinks.query('light_drinker == 1')

Note that the items in DataFrame.index and DataFrame.columns are placed in the query namespace by default, which means that you do not need to reference self.
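As an illustration of that namespace, the sketch below assumes a hypothetical frame whose index is named country; both the column name and the index name can then appear directly in the expression:

```python
import pandas as pd

# Hypothetical frame with a named index.
drinks = pd.DataFrame(
    {"light_drinker": [1, 0, 1]},
    index=pd.Index(["A", "B", "C"], name="country"),
)

# Column names and the index name are both visible inside the expression,
# so neither `drinks` nor a "self" needs to be written.
result = drinks.query("light_drinker == 1 and country != 'A'")
```

Only the row with index "C" satisfies both conditions.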
