I want to count the occurrences of each element using pandas' pivot_table.
The problem is that the elements are not being counted.

▼ What I tried

import pandas as pd
# Create a data frame
df = pd.DataFrame({
    "name": ["tarou", "tarou", "tarou", "tarou", "hanako", "hanako", "hanako", "jun", "jun"],
    "food": ["orange", "apple", "banana", "orange", "banana", "orange", "apple", "orange", "banana"],
})
# Count by element
df.pivot_table(index="name", columns="food", aggfunc="count")

Running the code above does not produce a count for each element; the result comes back empty.

What I expected was a table of counts for each combination of name and food.

Could you suggest how to fix the pivot_table code?

  • Answer #1

    pivot_table() uses GroupBy internally. When the index and columns arguments between them use up every column of the data frame, no "value column" is left, so specifying 'count' gives an empty result.

    df.pivot_table(index=index, columns=columns, aggfunc=aggfunc)
    # ↑ This is
    # ↓ roughly equivalent to this
    df.groupby([index, columns]).agg(aggfunc).unstack(columns)

    In [10]: df.groupby(['name', 'food']).count()
    Out[10]:
    Empty DataFrame
    Columns: []
    Index: [(hanako, apple), (hanako, banana), (hanako, orange), (jun, banana), (jun, orange), (tarou, apple), (tarou, banana), (tarou, orange)]
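
To illustrate the point about the missing value column, here is a minimal sketch. The helper column n is hypothetical, added only for this illustration and not part of the original question; once a value column exists, 'count' has something to count.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["tarou", "tarou", "tarou", "tarou", "hanako", "hanako", "hanako", "jun", "jun"],
    "food": ["orange", "apple", "banana", "orange", "banana", "orange", "apple", "orange", "banana"],
})

# Both columns are used as grouping keys, so no column is left to count.
empty = df.groupby(["name", "food"]).count()

# Hypothetical helper column "n": gives 'count' a value column to work on.
counted = df.assign(n=1).pivot_table(
    index="name", columns="food", values="n", aggfunc="count", fill_value=0
)
print(counted)
```

This is only a workaround sketch; adding a dummy column is usually less direct than counting rows per group.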

    If you want to count in a case like this, use 'size' (len also works).

    In [14]: df.groupby(['name', 'food']).size().unstack('food', fill_value=0)
    Out[14]:
    food    apple  banana  orange
    name
    hanako      1       1       1
    jun         0       1       1
    tarou       1       1       2

    In [15]: df.pivot_table(index='name', columns='food', aggfunc='size', fill_value=0)
    Out[15]:
    food    apple  banana  orange
    name
    hanako      1       1       1
    jun         0       1       1
    tarou       1       1       2

    In addition, you can use pd.crosstab() or, with pandas 1.1.0 or later, the value_counts() method (both do much the same thing internally).

    In [16]: pd.crosstab(df['name'], df['food'])
    Out[16]:
    food    apple  banana  orange
    name
    hanako      1       1       1
    jun         0       1       1
    tarou       1       1       2

    In [17]: df.value_counts().unstack('food', fill_value=0)
    Out[17]:
    food    apple  banana  orange
    name
    hanako      1       1       1
    jun         0       1       1
    tarou       1       1       2
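
As a quick sanity check, the approaches above can be compared against each other. This is a minimal sketch; the helper same_counts and the variable names are my own, introduced only for illustration, and the value_counts() line assumes pandas 1.1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["tarou", "tarou", "tarou", "tarou", "hanako", "hanako", "hanako", "jun", "jun"],
    "food": ["orange", "apple", "banana", "orange", "banana", "orange", "apple", "orange", "banana"],
})

by_size = df.groupby(["name", "food"]).size().unstack("food", fill_value=0)
by_crosstab = pd.crosstab(df["name"], df["food"])
by_value_counts = df.value_counts().unstack("food", fill_value=0)  # pandas >= 1.1.0


def same_counts(a, b):
    """Hypothetical helper: compare counts after aligning row and column order."""
    b = b.reindex(index=a.index, columns=a.columns)
    return bool((a.to_numpy() == b.to_numpy()).all())


all_agree = same_counts(by_size, by_crosstab) and same_counts(by_size, by_value_counts)
print(all_agree)
```

Aligning with reindex before comparing avoids depending on the row or column order each method happens to return.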