Filtering and selecting from pivot tables executed with python pandas

Question

I am struggling with hierarchical indexes in the Python pandas package. In particular, I do not understand how to filter and compare data in rows after the table has been pivoted.

Here is an example table from the documentation:

 import pandas as pd
 import numpy as np

 In [1027]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 6,
                               'B' : ['A', 'B', 'C'] * 8,
                               'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
                               'D' : np.random.randn(24),
                               'E' : np.random.randn(24)})

 In [1029]: pd.pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
 Out[1029]:
 C             bar       foo
 A     B
 one   A -1.154627 -0.243234
       B -1.320253 -0.633158
       C  1.188862  0.377300
 three A -1.327977       NaN
       B       NaN -0.079051
       C -0.832506       NaN
 two   A       NaN -0.128534
       B  0.835120       NaN
       C       NaN  0.838040

I would like to analyze the following:

1) Filter this table by column values, for example by selecting only the rows where foo is negative:

 C             bar       foo
 A     B
 one   A -1.154627 -0.243234
       B -1.320253 -0.633158
 three B       NaN -0.079051
 two   A       NaN -0.128534

2) Compare the remaining values of index level B across the different groups of index level A. I'm not sure how to access this information: {'one': ['A', 'B'], 'two': ['A'], 'three': ['B']}, and then determine which B values are unique to one key and which appear under several keys.

Is there a way to do this directly on the pivot table, or do I need to convert it back to a regular pandas DataFrame?

Update: I think this code is a step in the right direction. It at least lets me access individual values in the table, but I am still hard-coding the index values:

 table = pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])
 table.ix['one', 'A']

python pandas indexing pivot-table pivot

alexhli Aug 15 '12 at 17:04

2 answers

Chang she · Answer 1 · 2012-08-15T19:47:23+0000

pivot_table returns a DataFrame, so you can just filter it:

 In [15]: pivoted = pivot_table(df, values='D', rows=['A', 'B'], cols=['C'])

 In [16]: pivoted[pivoted.foo < 0]
 Out[16]:
 C             bar       foo
 A     B
 one   A -0.412628 -1.062175
 three B       NaN -0.562207
 two   A       NaN -0.007245
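For readers on a current pandas release, here is a rough equivalent of the filter above. It assumes a version where the rows and cols arguments of pivot_table have been replaced by index and columns; the seeded random generator is my own addition so that the output is reproducible, and it is not part of the original example.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)

df = pd.DataFrame({
    'A': ['one', 'one', 'two', 'three'] * 6,
    'B': ['A', 'B', 'C'] * 8,
    'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
    'D': rng.standard_normal(24),
    'E': rng.standard_normal(24),
})

# `index` and `columns` are the current names of the old `rows` and `cols`.
pivoted = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])

# Boolean mask on the 'foo' column. NaN < 0 evaluates to False,
# so rows without a foo value are dropped as well.
negative_foo = pivoted[pivoted['foo'] < 0]
print(negative_foo)
```

The result keeps the (A, B) MultiIndex, so it can be sliced further in the same way as the full table.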

You can use something like

 pivoted.ix['one']

to select all rows in the group where A is 'one'

or

 pivoted.ix['one', 'A']

to select the single row for a specific combination of A and B
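For the second part of the question (which B values remain under each A key after filtering), one possible approach is to work on the index labels themselves. This is only a sketch: the small table below is built by hand from the question's example output, and the names filtered, labels, b_by_a, shared and unique are placeholders of mine.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the filtered pivot table
# (values copied from the question's example output).
index = pd.MultiIndex.from_tuples(
    [('one', 'A'), ('one', 'B'), ('three', 'B'), ('two', 'A')],
    names=['A', 'B'],
)
filtered = pd.DataFrame(
    {'bar': [-1.154627, -1.320253, np.nan, np.nan],
     'foo': [-0.243234, -0.633158, -0.079051, -0.128534]},
    index=index,
)

# Turn the two index levels into ordinary columns, then collect B per A.
labels = filtered.index.to_frame(index=False)
b_by_a = labels.groupby('A')['B'].apply(list).to_dict()
print(b_by_a)  # {'one': ['A', 'B'], 'three': ['B'], 'two': ['A']}

# Count how many distinct A groups each B label appears in.
groups_per_b = labels.groupby('B')['A'].nunique()
shared = groups_per_b[groups_per_b > 1].index.tolist()   # B labels in several groups
unique = groups_per_b[groups_per_b == 1].index.tolist()  # B labels in exactly one group
```

With this data both 'A' and 'B' appear under two different keys, so shared holds both labels and unique is empty. No conversion back to the original long DataFrame is needed, because the labels are read from the pivot table's own index.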

mjbsgll · Answer 2 · 2019-02-06T14:42:46+0000

Just to add to the previous answer: when you use pivoted.ix['one'] with a newer pandas version (here under Python 3.7), you get the following warning:

/usr/lib/python3.7/site-packages/ipykernel_launcher.py:7: DeprecationWarning:
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
  import sys

So with these pandas versions, use .loc instead:

 pivoted.loc['one']
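A short sketch of how the .ix selections from the first answer might be written with .loc on a two-level index. The table and its values are placeholders made up for illustration; only the index labels follow the question.

```python
import pandas as pd

# Placeholder table with the same (A, B) index layout as the pivot table.
index = pd.MultiIndex.from_tuples(
    [('one', 'A'), ('one', 'B'), ('three', 'B'), ('two', 'A')],
    names=['A', 'B'],
)
pivoted = pd.DataFrame(
    {'bar': [1.0, 2.0, 3.0, 4.0], 'foo': [-1.0, -2.0, -3.0, -4.0]},
    index=index,
)

# All rows where A is 'one'; the result is indexed by B only.
group_one = pivoted.loc['one']

# A single row, addressed by a tuple of (A, B) labels; returns a Series.
row_one_a = pivoted.loc[('one', 'A')]

# All rows where B is 'B', across every A group (cross-section on level B).
all_b = pivoted.xs('B', level='B')
```

Passing the labels as a tuple makes it explicit that both values address index levels rather than a row and a column.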
