I want to remove duplicates from a text file, judged by the string before the comma on each line.
In the text file handled here, every line is a string of the form **,********, and there is always a comma. The conditions are as follows.
- Regardless of whether it is duplicated, the string before the comma is removed and is not written when exporting to another file.
- If there are duplicates, the entire line is removed and is not written to the other file.
I managed to write the code below, which skips lines that duplicate an entire earlier line, but I could not write code that meets the conditions above. I would appreciate any answers.
lines_seen = set()
outfile = open("******.txt", "w")
for line in open("*****.txt", "r"):
    if line not in lines_seen:
        outfile.write(line)
        lines_seen.add(line)
outfile.close()
Answer # 1
The code below assumes that the input file is encoded in UTF-8.
lines_seen = set()
outfile = open("out.txt", "w")
for line in open("in.txt", "r", encoding="utf-8"):
    try:
        # "There is always a comma on each line" does not guarantee exactly
        # one comma, so split only at the first comma.
        key, data = line.split(',', 1)
    except ValueError:
        print("--- Read a line without a comma. Skipping it and continuing. ---")
        continue
    # Write only the part after the comma, and only for the first line with this key.
    if key not in lines_seen:
        outfile.write(data)
        lines_seen.add(key)
outfile.close()
print("Processing completed.")
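The second condition in the question can be read in two ways: keep the first line for each key (what the code above does), or drop every line whose key appears more than once. As a rough sketch only, the two readings could be separated into functions that work on any iterable of lines. The names `first_per_key` and `unique_keys_only` and the sample data are made up for illustration and do not come from the question.

```python
from collections import Counter


def first_per_key(lines):
    """Keep the text after the first comma for the first line seen with each key."""
    seen = set()
    kept = []
    for line in lines:
        key, sep, data = line.partition(",")
        if not sep:  # no comma on this line: skip it
            continue
        if key not in seen:
            seen.add(key)
            kept.append(data)
    return kept


def unique_keys_only(lines):
    """Drop every line whose key appears more than once; keep the rest."""
    pairs = [line.partition(",") for line in lines]
    counts = Counter(key for key, sep, _ in pairs if sep)
    return [data for key, sep, data in pairs if sep and counts[key] == 1]


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the difference between the two readings.
    sample = ["a,1\n", "b,2\n", "a,3\n", "c,4,5\n"]
    print(first_per_key(sample))
    print(unique_keys_only(sample))
```

To use either function with files, pass the opened input file as `lines` and hand the returned list to `writelines()` on the output file. `str.partition` is used here instead of `split(',', 1)` so that a line without a comma can be detected without an exception.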