I want to add a column after pivoting with pandas
data, time, item, num, fav
20201010,0900, A, 3, BB
20201010,0910, A, 2, BB
20201010,0900, B, 4, AA
20201010,0910, B, 5, AA

df = pd.read_csv('File.csv')
matrix = df.pivot(index['date','time'], columns='time', values='item')

Output
         0900 0910
202010 A    3    2
       B    4    5

         0900 0910 fav
202010 A    3    2  BB
       B    4    5  AA

I want to add fav from the original data as above.

What I tried
matrix = matrix.merge(df['fav'], on=['date','item'])


This put the data back into the shape it had before pivoting ...
What should I do?

  • Answer #1

    First of all, there are mistakes in both the CSV file shown in the question and the code that follows it.

    In the first line of the CSV file, data is presumably a typo for date (judging from the code written afterwards).

    When reading the CSV file, dtype={'time': str} should be specified. Otherwise time is treated as a number and its leading zero is lost.
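
As a quick check of the dtype point above, this minimal sketch reads the same two rows with and without dtype. The inline CSV text and the variable names are only for illustration and are not part of the question.

```python
import io

import pandas as pd

csv_text = "date,time,item,num,fav\n20201010,0900,A,3,BB\n20201010,0910,A,2,BB\n"

# Without dtype, pandas infers time as an integer column, so "0900" becomes 900.
as_number = pd.read_csv(io.StringIO(csv_text))

# With dtype, time stays a string and keeps its leading zero.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={'time': str})

print(as_number['time'].tolist())  # [900, 910]
print(as_text['time'].tolist())    # ['0900', '0910']
```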

    The pivot code is also incorrect. To get the pivot shown in the table, index must be date and item, and values must be num.

    - matrix = df.pivot(index['date','time'], columns='time', values='item')
    + matrix = df.pivot(index=['date','item'], columns='time', values='num')

    Once these are fixed, you can bring fav along when pivoting: either specify values=['num','fav'], or omit the optional values argument altogether (in which case every column not used for index or columns is used as values).
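
The second option can be sketched as follows, assuming a pandas version in which pivot accepts a list for index. The names csv_text, matrix_all and top_level are illustrative only.

```python
import io

import pandas as pd

csv_text = """date,time,item,num,fav
20201010,0900,A,3,BB
20201010,0910,A,2,BB
20201010,0900,B,4,AA
20201010,0910,B,5,AA
"""
df = pd.read_csv(io.StringIO(csv_text), dtype={'time': str})

# values is omitted, so every column not named in index or columns
# (here num and fav) is pivoted.
matrix_all = df.pivot(index=['date', 'item'], columns='time')

# The top level of the column MultiIndex lists the pivoted value columns.
top_level = list(matrix_all.columns.get_level_values(0).unique())
print(top_level)
print(matrix_all)
```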

    With values=['num','fav'], fav becomes part of the column MultiIndex just like num, so

                   num       fav
    time          0900 0910 0900 0910
    date     item
    20201010 A       3    2   BB   BB
             B       4    5   AA   AA

    you get this result. If you want to combine the fav columns into one, see the end of the code below.

    import pandas as pd
    import io

    txt = """
    date,time,item,num,fav
    20201010,0900,A,3,BB
    20201010,0910,A,2,BB
    20201010,0900,B,4,AA
    20201010,0910,B,5,AA
    """
    # Read time as a string so that the leading zero is kept
    df = pd.read_csv(io.StringIO(txt), dtype={'time': str})
    # df = pd.read_csv('File.csv', dtype={'time': str})
    # print(df)
    matrix = df.pivot(index=['date','item'], columns='time', values=['num','fav'])
    # print(matrix)
    # Combine the fav columns into one (column 2 is ('fav', '0900'))
    matrix['tmp'] = matrix.iloc[:, 2]
    matrix.drop(columns='fav', inplace=True)
    matrix.rename(columns={'tmp':'fav'}, inplace=True)
    print(matrix)
                   num       fav
    time          0900 0910
    date     item
    20201010 A       3    2   BB
             B       4    5   AA
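
As a possible alternative to dropping and renaming columns, fav could be carried in the index while only num is pivoted. This is a sketch, not the method used in the answer, and it assumes that fav has a single value for each (date, item) pair, as it does in the sample data. The name wide is illustrative.

```python
import io

import pandas as pd

csv_text = """date,time,item,num,fav
20201010,0900,A,3,BB
20201010,0910,A,2,BB
20201010,0900,B,4,AA
20201010,0910,B,5,AA
"""
df = pd.read_csv(io.StringIO(csv_text), dtype={'time': str})

# Assumption: fav is constant within each (date, item) pair, so it can
# ride along in the index while num is spread across the time columns.
wide = df.pivot(index=['date', 'item', 'fav'], columns='time', values='num')

# Move fav out of the index and place it after the time columns.
wide = wide.reset_index('fav')
wide = wide[['0900', '0910', 'fav']]
print(wide)
```

Because values is a single column here, the result has flat column labels rather than a MultiIndex, which may be easier to work with afterwards.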