
I have the following DataFrame.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([
    [np.nan, np.nan, 0.1, 0.1],
    [np.nan, 0.0, 0.2, 0.4],
    [np.nan, np.nan, np.nan, 0.0],
    [0.8, 0.6, 0.4, 0.2],
    [np.nan, 1, 0.9, 1],
]), columns=['col_1', 'col_2', 'col_3', 'col_4'])

col_1  col_2  col_3  col_4
  NaN    NaN    0.1    0.1
  NaN    0.0    0.2    0.4
  NaN    NaN    NaN    0.0
  0.8    0.6    0.4    0.2
  NaN    1.0    0.9    1.0

I want to shift the values in each row to the left, so that the NaNs end up on the right:

col_1  col_2  col_3  col_4
  0.1    0.1    NaN    NaN
  0.0    0.2    0.4    NaN
  0.0    NaN    NaN    NaN
  0.8    0.6    0.4    0.2
  1.0    0.9    1.0    NaN

What I tried

I tried converting each row into a list and putting the lists back into the DataFrame, but I gave up because I did not know how to remove the missing values from the lists.

col_1
[nan, nan, 0.1, 0.1]
[nan, 0.0, 0.2, 0.4]
[nan, nan, nan, 0.0]
[0.8, 0.6, 0.4, 0.2]
[nan, 1, 0.9, 1]

→ I could not remove the missing values.

  • Answer # 1

    This is basically the same as nomuken's answer, but keeping the original column names takes a little extra processing.

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(
        [[np.nan, np.nan, 0.1, 0.1],
         [np.nan, 0.0, 0.2, 0.4],
         [np.nan, np.nan, np.nan, 0.0],
         [0.8, 0.6, 0.4, 0.2],
         [np.nan, 1, 0.9, 1]],
        columns=['col_1', 'col_2', 'col_3', 'col_4'])
    # Drop the NaNs in each row, renumber the remaining values from 0,
    # then map the positions 0..3 back to the original column names.
    df = (df.apply(lambda d: d.dropna().reset_index(drop=True), axis=1)
            .rename(pd.Series(df.columns.values), axis=1))
    print(df)
    #    col_1  col_2  col_3  col_4
    # 0    0.1    0.1    NaN    NaN
    # 1    0.0    0.2    0.4    NaN
    # 2    0.0    NaN    NaN    NaN
    # 3    0.8    0.6    0.4    0.2
    # 4    1.0    0.9    1.0    NaN

    Alternatively, you can simply write it as follows.
    Note, however, that this method raises an error if removing the NaNs changes the total number of columns.

    col = df.columns.values
    df = df.apply(lambda d: d.dropna().reset_index(drop=True), axis=1)
    df.columns = col
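
One way to sidestep the column-count caveat is to pad the shifted result back to the original width before restoring the names. The following is only a minimal sketch and not part of the original answer: the two-row frame is hypothetical data chosen so that no row is completely filled, and the name `shifted` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row has at least one NaN, so the shifted
# result ends up with fewer columns than the original frame.
df = pd.DataFrame(
    [[np.nan, 0.1, 0.2],
     [np.nan, np.nan, 0.3]],
    columns=['col_1', 'col_2', 'col_3'],
)

shifted = df.apply(lambda d: d.dropna().reset_index(drop=True), axis=1)
# At this point `shifted` has only the columns 0 and 1, so assigning the
# three original names directly would fail with a length mismatch.

# Pad with all-NaN columns up to the original width, then restore the names.
shifted = shifted.reindex(columns=range(df.shape[1]))
shifted.columns = df.columns
print(shifted)
```

`reindex` keeps the existing columns and adds the missing positions as NaN, so the later assignment always sees the expected number of columns.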

  • Answer # 2

    Something like this, perhaps?

    import pandas as pd
    import numpy as np
    df = pd.DataFrame(np.array([
        [np.nan, np.nan, 0.1, 0.1],
        [np.nan, 0.0, 0.2, 0.4],
        [np.nan, np.nan, np.nan, 0.0],
        [0.8, 0.6, 0.4, 0.2],
        [np.nan, 1, 0.9, 1],
    ]))
    # Drop the NaNs in each row and renumber the remaining values from 0.
    df = df.apply(lambda x: x.dropna().reset_index(drop=True), axis=1)
    print(df)
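
For comparison, the list-based approach described in the question can also be completed. This is a rough sketch rather than something either answer proposed: the helper name `shift_left` is hypothetical, and it assumes that missing values are the ones detected by `pd.isna`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[np.nan, np.nan, 0.1, 0.1],
     [np.nan, 0.0, 0.2, 0.4],
     [np.nan, np.nan, np.nan, 0.0],
     [0.8, 0.6, 0.4, 0.2],
     [np.nan, 1, 0.9, 1]],
    columns=['col_1', 'col_2', 'col_3', 'col_4'],
)

def shift_left(values):
    """Keep the non-missing values in order and pad the end with NaN."""
    kept = [v for v in values if not pd.isna(v)]
    return kept + [np.nan] * (len(values) - len(kept))

# Convert each row to a list, shift it, and rebuild the frame with the
# original column names and index.
result = pd.DataFrame(
    [shift_left(row) for row in df.to_numpy().tolist()],
    columns=df.columns,
    index=df.index,
)
print(result)
```

Because every list is padded back to its original length, the column count never changes, so the original column names can be passed straight to the constructor.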