
I would appreciate help from anyone who takes a look at this question. The solution is most likely quite simple, but I cannot find it myself. Thanks in advance <3

task:

  • Convert the given dictionary into a dataframe with the columns word | genre, one row per word, and save it to a variable.

given a dictionary:

genre_dict = {
    'comedy': ['satirical', 'adventurous', 'funny'],
    'melodrama': ['choice', 'shame'],
    'fairy tale': ['adventure', 'cute', 'family'],
    'detective': ['mystery', 'unravel', 'mysterious'],
    'thriller': ['horror', 'sinister', 'nervous']
}

should be turned into a dataframe like this:

genre       word
comedy      satirical
comedy      adventurous
comedy      funny
melodrama   choice
melodrama   shame
fairy tale  adventure
fairy tale  cute
fairy tale  family
detective   mystery
detective   unravel
detective   mysterious
thriller    horror
thriller    sinister
thriller    nervous

However, when I try to recreate the dataframe from the dictionary with the following code:

genres_df = pd.DataFrame(genre_dict.items(), columns=['genre', 'word'])
genres_df['word'] = genres_df['word'].str.join(', ')

I get this:

genre       word
comedy      satirical, adventurous, funny
melodrama   choice, shame
fairy tale  adventure, cute, family
detective   mystery, unravel, mysterious
thriller    horror, sinister, nervous

This is not quite what I need. How can I get one row per word instead? Thank you.
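A short sketch of why the attempt gives one row per genre. `genre_dict.items()` yields one (genre, list of words) pair per key, so the constructor builds one row per genre with the whole list sitting in the `word` column, and `str.join` only turns that list into a single string. The dictionary is shortened here for illustration, `joined` is a name made up for this sketch, and `list(...)` around `items()` is just a precaution for older pandas versions.

```python
import pandas as pd

# Shortened version of the dictionary from the question (illustration only).
genre_dict = {
    'comedy': ['satirical', 'adventurous', 'funny'],
    'melodrama': ['choice', 'shame'],
}

# items() gives one (genre, list_of_words) pair per key,
# so the frame has one row per genre, not one row per word.
genres_df = pd.DataFrame(list(genre_dict.items()), columns=['genre', 'word'])
print(genres_df.shape)                    # (2, 2)
print(type(genres_df.loc[0, 'word']))     # <class 'list'>

# str.join merges each list into one string; the number of rows stays the same.
joined = genres_df['word'].str.join(', ')
print(joined[0])                          # satirical, adventurous, funny
```

So the lists have to be spread across rows rather than joined.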

  • Answer # 1

    Try this:

    df = pd.DataFrame([[k, x] for k, v in genre_dict.items() for x in v],
                      columns=["genre", "word"])
    

    result:

    In [72]: df
    Out[72]:
             genre         word
    0       comedy    satirical
    1       comedy  adventurous
    2       comedy        funny
    3    melodrama       choice
    4    melodrama        shame
    5   fairy tale    adventure
    6   fairy tale         cute
    7   fairy tale       family
    8    detective      mystery
    9    detective      unravel
    10   detective   mysterious
    11    thriller       horror
    12    thriller     sinister
    13    thriller      nervous
    

    You can also get the desired result from your genres_df (before joining the lists) using the explode method:

    genres_df = pd.DataFrame(genre_dict.items(), columns=['genre', 'word'])
    In [75]: genres_df.explode("word")
    Out[75]:
            genre         word
    0      comedy    satirical
    0      comedy  adventurous
    0      comedy        funny
    1   melodrama       choice
    1   melodrama        shame
    2  fairy tale    adventure
    2  fairy tale         cute
    2  fairy tale       family
    3   detective      mystery
    3   detective      unravel
    3   detective   mysterious
    4    thriller       horror
    4    thriller     sinister
    4    thriller      nervous
    

    Works as it should, thanks a lot :)

    izcrezi 2022-02-14 08:15:38
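For anyone who wants to run this end to end, here is a self-contained sketch of both approaches from the answer. Two things are additions and not part of the answer: `reset_index(drop=True)`, which assumes you want a fresh 0..13 index instead of the repeated one that `explode` leaves, and the final column reordering to word | genre as named in the task. The names `exploded` and `result` are made up for this sketch, and `explode` needs pandas 0.25 or newer.

```python
import pandas as pd

genre_dict = {
    'comedy': ['satirical', 'adventurous', 'funny'],
    'melodrama': ['choice', 'shame'],
    'fairy tale': ['adventure', 'cute', 'family'],
    'detective': ['mystery', 'unravel', 'mysterious'],
    'thriller': ['horror', 'sinister', 'nervous'],
}

# Approach 1: build one [genre, word] row per word with a list comprehension.
df = pd.DataFrame(
    [[genre, word] for genre, words in genre_dict.items() for word in words],
    columns=['genre', 'word'],
)

# Approach 2: one row per genre first, then explode the lists into rows.
# explode repeats the original index (0, 0, 0, 1, 1, ...), so
# reset_index(drop=True) renumbers the rows (an addition, not in the answer).
exploded = (
    pd.DataFrame(list(genre_dict.items()), columns=['genre', 'word'])
    .explode('word')
    .reset_index(drop=True)
)

# Reorder the columns to word | genre, as the task describes (also an addition).
result = exploded[['word', 'genre']]
print(result.head())
```

Both approaches give the same 14 rows; the list comprehension avoids the intermediate frame, while `explode` is handy when the one-row-per-genre frame already exists.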