I scraped a table with Python, and it looks like the first table below. Could you tell me what code I should write to reshape it into the second table?

I want to detect a particular pattern in a string and move the matching part into a new column.

Before formatting
Exam date                Place name
2018/09/15               Nagoya
2018/09/15               Tokyo-1Day
2018/09/15 (2018/09/16)  Tokyo-2Day
2018/09/29               Nagoya
2018/09/29 (2018/09/30)  Osaka-2Day

After formatting
Exam date   Exam date 2  Place name
2018/09/15               Nagoya
2018/09/15               Tokyo-1Day
2018/09/15  2018/09/16   Tokyo-2Day
2018/09/29               Nagoya
2018/09/29  2018/09/30   Osaka-2Day
  • Answer #1

    How about using Series.str.extract() and DataFrame.update(), like this?

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'Exam date': ['2018/09/15',
                      '2018/09/15',
                      '2018/09/15 (2018/09/16)',
                      '2018/09/29',
                      '2018/09/29 (2018/09/30)'],
        'Place name': ['Nagoya',
                       'Tokyo-1Day',
                       'Tokyo-2Day',
                       'Nagoya',
                       'Osaka-2Day']})
    # New column for the second date; object dtype so it can hold strings later
    df['Exam date 2'] = pd.Series(np.nan, index=df.index, dtype=object)
    # Two capture groups: the first date, then the date inside the parentheses
    ptn = r"(\d{4}/\d{2}/\d{2}) \((\d{4}/\d{2}/\d{2})\)"
    # Rows that do not match yield NaN, and update() skips NaN values,
    # so only the rows that contain two dates are overwritten
    df.update(df['Exam date'].str.extract(ptn, expand=True)
              .rename(columns={0: 'Exam date', 1: 'Exam date 2'}))
    print(df)
    #     Exam date  Place name  Exam date 2
    # 0  2018/09/15      Nagoya          NaN
    # 1  2018/09/15  Tokyo-1Day          NaN
    # 2  2018/09/15  Tokyo-2Day   2018/09/16
    # 3  2018/09/29      Nagoya          NaN
    # 4  2018/09/29  Osaka-2Day   2018/09/30
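
As a possible variation on the same idea, the second date can be made an optional part of the pattern, so that every row matches and no update() step is needed. This is only a sketch under the assumption that each cell holds one date, optionally followed by a second date in parentheses. The group names first and second and the variables dates and result are made up for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Exam date': ['2018/09/15', '2018/09/15', '2018/09/15 (2018/09/16)',
                  '2018/09/29', '2018/09/29 (2018/09/30)'],
    'Place name': ['Nagoya', 'Tokyo-1Day', 'Tokyo-2Day', 'Nagoya', 'Osaka-2Day'],
})

# The parenthesised date sits in an optional non-capturing group, so rows
# with a single date still match and get a missing value for 'second'.
ptn = r"^(?P<first>\d{4}/\d{2}/\d{2})(?:\s*\((?P<second>\d{4}/\d{2}/\d{2})\))?\s*$"

# With named groups, str.extract() returns a DataFrame whose columns
# are named after the groups.
dates = df['Exam date'].str.extract(ptn)

# Rebuild the table in the column order asked for in the question.
result = pd.DataFrame({
    'Exam date': dates['first'],
    'Exam date 2': dates['second'],
    'Place name': df['Place name'],
})
print(result)
```

Building a new DataFrame also puts the columns in the order shown in the question (Exam date, Exam date 2, Place name), which the update() approach does not do by itself.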