I have a word list saved in a separate text file. I want to find those words in Japanese sentences and count how often each one appears.

Example) Given a word list such as "Orange apple pineapple ...", find those words in the sentences and output their counts:
Apple 8
Pineapple 3

When I search, I only find methods that look up one word at a time, typed in by hand. Since I have many sentences and word lists to analyze, I want to read a sentence file and a word list and produce all the results in one batch. What should I do?

  • Answer # 1


    If that is the requirement, one approach is to write a program that runs morphological analysis on the Japanese sentences in the target text file and increments a counter each time a noun from the word list appears. I think that would work well.

    Wikipedia: Morphological analysis

    After a quick read of the above, you may want to google "Python morphological analysis". You can also find many articles on Qiita along the lines of "I tried morphological analysis".
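
    The approach above could be sketched roughly as follows. This is only an illustration under some assumptions: it uses the third-party Janome tokenizer (any morphological analyzer would do), it assumes both files are UTF-8 and that the word list is separated by whitespace, and the function names and file paths are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_listed_words(tokens, word_list):
    """Count how often each word in word_list occurs among the tokens."""
    targets = set(word_list)
    counts = Counter(tok for tok in tokens if tok in targets)
    # Report every listed word, including those that never appear (count 0).
    return {word: counts[word] for word in word_list}


def noun_surfaces(text):
    """Yield the surface form of each noun in text.

    Assumption: Janome is installed (pip install janome). It is imported
    here so the counting function above works without it.
    """
    from janome.tokenizer import Tokenizer

    for token in Tokenizer().tokenize(text):
        # part_of_speech is a comma-separated string; nouns start with 名詞.
        if token.part_of_speech.startswith("名詞"):
            yield token.surface


def count_in_files(text_path, word_list_path):
    """Batch version: read both files (hypothetical paths) and count."""
    text = Path(text_path).read_text(encoding="utf-8")
    words = Path(word_list_path).read_text(encoding="utf-8").split()
    return count_listed_words(noun_surfaces(text), words)
```

    Note that a tokenizer may split a compound word into several tokens, in which case a listed word would not match as a single token. It is worth checking how your listed words are actually tokenized.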

  • Answer # 2

    If you already have something that "searches by typing one word at a time," you can wrap it in a for loop over the word list.
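
    As a minimal sketch of that idea, assuming the "one word at a time" search is a plain substring count (Python's built-in `str.count`) and using a hypothetical function name:

```python
def count_by_substring(text, word_list):
    """Loop over the word list and count plain substring matches for each word."""
    results = {}
    for word in word_list:
        # str.count returns the number of non-overlapping occurrences.
        results[word] = text.count(word)
    return results


# Substring matching is simple but can overcount:
# 京都 is also found inside 東京都.
print(count_by_substring("東京都と京都", ["京都", "東京都", "大阪"]))
# {'京都': 2, '東京都': 1, '大阪': 0}
```

    This needs no extra libraries, but because it ignores word boundaries it may count a word that only appears as part of a longer one. Morphological analysis avoids that at the cost of an additional dependency.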

