What I want to achieve
  • I want to convert multiple JSON files with different columns into one CSV format
  • I can't work out how to handle the differing columns. Could you explain the logic?
Original data
a = {
      "id": 0,
      "name": "Jeffrey",
      "age": "23",
      "address": "Tokyo",
      "hoby": "tennis",
      "favorite_food": "pizza",
    }
b = {
      "id": 1,
      "name": "Essie",
      "age": "18",
      "favorite_drink": "coke"
    }
c = {
      "id": 2,
      "name": "Juan",
      "hoby": "basketball",
      "favorite_food": "cake",
      "favorite_drink": "tea"
    }
Expected processing result
id,name,age,address,hoby,favorite_food,favorite_drink
0,Jeffrey,23,Tokyo,tennis,pizza,
1,Essie,18,,,,coke
2,Juan,,,basketball,cake,tea
  • Answer # 1

    In the code below, a, b, and c are already Python dictionaries. When the input is
    actually a JSON string, convert it to a dictionary first with json.loads (or json.load for a file).
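
    As a minimal sketch of that conversion step, assuming each input file holds exactly one JSON object: the helper name load_records and the "data" directory in the usage note are hypothetical and do not come from the question.

```python
import json
from pathlib import Path

def load_records(paths):
    """Read one JSON object per file and return them as a list of dicts."""
    records = []
    for path in paths:
        # json.load parses an open file; json.loads parses a string.
        with Path(path).open(encoding="utf-8") as f:
            records.append(json.load(f))
    return records

# Parsing a JSON string directly instead of a file:
b = json.loads('{"id": 1, "name": "Essie", "age": "18", "favorite_drink": "coke"}')
```

    With files in a directory, something like load_records(sorted(Path("data").glob("*.json"))) would produce the list that is passed to pd.json_normalize below.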

    import pandas as pd
    a = {
          "id": 0,
          "name": "Jeffrey",
          "age": "23",
          "address": "Tokyo",
          "hoby": "tennis",
          "favorite_food": "pizza"
        }
    b = {
          "id": 1,
          "name": "Essie",
          "age": "18",
          "favorite_drink": "coke"
        }
    c = {
          "id": 2,
          "name": "Juan",
          "hoby": "basketball",
          "favorite_food": "cake",
          "favorite_drink": "tea"
        }
    dics = [a, b, c]
    # json_normalize uses the union of all keys as columns; a key missing from a record becomes NaN.
    df = pd.json_normalize(dics)
    # Alternative: normalize each dict and concatenate them; the single call above is faster.
    # (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.)
    # df = pd.concat([pd.json_normalize(dic) for dic in dics], ignore_index=True)
    # Save as a CSV file with the given name (to_csv returns None when a path is passed).
    # NaN values are written as empty fields.
    df.to_csv('data.csv', index=False)
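
    Since the question asks for the logic itself, here is a sketch of the same idea using only the standard library, not what pandas does internally. It makes two passes: first collect the union of all keys in first-seen order, then write each record and leave missing keys empty. The function name records_to_csv is an assumption made for illustration.

```python
import csv
import io

def records_to_csv(records):
    """Return CSV text whose header is the union of all keys, in first-seen order."""
    # Pass 1: collect every column name once; a dict preserves insertion order.
    columns = {}
    for record in records:
        for key in record:
            columns.setdefault(key, None)
    # Pass 2: write each record; restval fills keys that a record lacks.
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(columns), restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

records = [
    {"id": 0, "name": "Jeffrey", "age": "23", "address": "Tokyo",
     "hoby": "tennis", "favorite_food": "pizza"},
    {"id": 1, "name": "Essie", "age": "18", "favorite_drink": "coke"},
    {"id": 2, "name": "Juan", "hoby": "basketball",
     "favorite_food": "cake", "favorite_drink": "tea"},
]
print(records_to_csv(records))
```

    Collecting the header before writing any row is the key point: a CSV header cannot be extended once rows have been written, so all records must be seen first.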