I'd like to replace strings in Python to normalize inconsistent notation.
For example, suppose the text contains "green onion", "long onion", and "Kujo onion". I want to replace every word that contains "onion" with just "onion".
I searched but couldn't find anything that does what I want.
With replace() I can only replace one word at a time, and writing out every pattern takes a long time.
Is there a good way to do this?
Can I replace "green onion", "long onion", "Kujo onion", and so on without naming each one explicitly? There are many varieties, so it is hard to know them all and hard to write every pattern.
Answer # 1
"Replace every word that contains 'onion' with 'onion'"
To do this, you first need to perform morphological analysis.
Once the text is split into words, you only need to know which words to replace.
Answer # 2
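Carnegie Hall, Negitoro, Koganegiku, Magical Teacher Negima, Negishi Systex, Takamine Guitar, and so on: in the original Japanese, all of these contain "negi" (the word for onion) as a substring, so a plain substring match would catch them too.
Or, to put it more simply: should onions and leeks be treated as the same thing?
(That is obviously a poor example. Whether two kinds of onion count as the same depends on what you want to build.)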
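The false-positive problem in the list above can be reproduced with a small sketch. It assumes romanized text in which "negi" stands in for the target word; the sample strings and the function name `naive_normalize` are made up for illustration.

```python
import re

def naive_normalize(text):
    # Naive rule: collapse any run of letters that contains "negi" to "negi".
    # This is the substring-based approach, shown only to expose its weakness.
    return re.sub(r"[A-Za-z]*negi[A-Za-z]*", "negi", text, flags=re.IGNORECASE)

# Intended targets and unrelated proper nouns are rewritten alike.
print(naive_normalize("negitoro"))       # negi
print(naive_normalize("Carnegie Hall"))  # negi Hall
```

"Carnegie" is rewritten because it happens to contain the letters "negi", which is why a substring rule alone cannot decide what belongs to the group.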
Maintaining a dictionary by hand takes a lot of work, and so does finding word-like sequences by aggregating the frequencies of neighboring characters over large-scale text data.
Also, I think it is quite difficult to judge automatically whether words can be grouped under one representative just because part of the string matches. Whether one term can be treated as a broader concept of another still has to be decided from information that people have organized by hand.
"Can I replace green onion, long onion, Kujo onion, and so on without naming each one explicitly? There are many varieties, so it is hard to know them all and hard to write every pattern."
If there really are so many kinds that they are hard to grasp, then when you rewrite with patterns you cannot measure how much should have been rewritten, how much actually was, and how much was missed.
You would be building a system without knowing how well it works.
If "there are so many kinds that it is hard to grasp them all" is true, it is better to avoid rewriting with patterns.
Is it really true that there are too many kinds to grasp?
(If the situation were "there are many users and errors can be collected gradually after launch, so it can be rough at first", that would be a different story.)
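One rough way to get at least some measurement is to have the rewrite report what it left alone. The sketch below assumes English text of the form "<word> onion"; the set `KNOWN_PREFIXES` and the function `normalize_with_report` are hypothetical names, not part of any library.

```python
import re

# Hypothetical list of known variants; a real list would come from your data.
KNOWN_PREFIXES = {"green", "long", "kujo"}

# Any "<word> onion" pair is treated as a candidate for rewriting.
CANDIDATE = re.compile(r"\b([A-Za-z]+) onion\b", re.IGNORECASE)

def normalize_with_report(text):
    replaced = []   # prefixes that were rewritten
    missed = set()  # candidate prefixes that are not in the known list

    def repl(match):
        prefix = match.group(1).lower()
        if prefix in KNOWN_PREFIXES:
            replaced.append(prefix)
            return "onion"
        missed.add(prefix)
        return match.group(0)  # leave unknown candidates untouched

    normalized = CANDIDATE.sub(repl, text)
    return normalized, len(replaced), sorted(missed)

result = normalize_with_report("Green onion, long onion, red onion and Kujo onion")
print(result)  # ('onion, onion, red onion and onion', 3, ['red'])
```

The missed list is only a heuristic: it will also flag ordinary phrases such as "and onion", so someone still has to review it. It does not answer how many cases exist in total, which is the concern raised above.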
Answer # 3
Wouldn't it be better to use regular expressions? Use the re module.
In [1]: import re
In [2]: target = 'Green onion, long onion and Kujo onion are onions'
In [3]: re.sub('(Green|long|Kujo) onion', 'onion', target)
Out[3]: 'onion, onion and onion are onions'
'(Green|long|Kujo) onion' matches "Green", "long", or "Kujo" followed by " onion".
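If the list of variants grows, the alternation can be generated from a list instead of typed by hand. This is a minimal sketch of that idea; `VARIANTS`, `build_pattern`, and `normalize` are illustrative names, and the variant list is assumed to be known in advance.

```python
import re

# Hypothetical variant list; extend it as new spellings are found.
VARIANTS = ["Green", "long", "Kujo"]

def build_pattern(prefixes):
    # Escape each prefix so regex metacharacters are treated literally,
    # and try longer prefixes first in case one is a prefix of another.
    ordered = sorted(prefixes, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"\b(?:%s) onion\b" % alternatives, re.IGNORECASE)

def normalize(text, prefixes=VARIANTS):
    return build_pattern(prefixes).sub("onion", text)

print(normalize("Green onion, long onion and Kujo onion are onions"))
# onion, onion and onion are onions
```

The word boundaries (`\b`) keep the pattern from touching "onions" or words that merely end in one of the prefixes.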
Answer # 4
Please try a Google search.
You can quickly get a variety of information from the results.
You do have to judge whether the information is good or bad, but the same is true of answers on a Q&A site.
Answer # 5
If you try to tackle this head-on, complex natural language processing is likely to be required.
However, whether you really need that depends on your purpose, so please explain what you want to use this for.