I want to create a process for reading CSV files with Pandas.
Currently, I have a file uploader built with Django that uploads CSV files and reads them.
However, if I upload a file in which some row has a different number of columns, I get a ParserError.
import pandas as pd
df = pd.read_csv('CSV_file', header=None, low_memory=False)
This is how the file is read with pandas.
I understand that if I pass column names through the names argument, the missing fields are filled in as missing values, and that if I set error_bad_lines=False (on_bad_lines='skip' in pandas 1.3 and later), the malformed lines are skipped.
On the other hand, the CSV files to be read have a specified number of columns, and if a file's number of columns differs from that specification, processing must be terminated. I realized that if I pass the names argument, every file would satisfy the requirement.
Also, because header=None, I think pandas assigns the column names automatically, but it does not seem to read the file the same way it does when the column names are given through names.
In this situation, is there a way to keep validating against the specified number of columns without raising a ParserError?
Thank you in advance for your help.
Answer # 1
All I can say is that this kind of processing is not a good fit for pandas.
It does its best to read table-like data, but it cannot help with data that was not table-like to begin with.
For now, one option is to process the text until it has the form of the desired table, for example by reading it with the standard csv module, and then convert the result to a pandas DataFrame.
I suppose it is not impossible to handle this inside pandas based on where the missing values appear ...
If the requirement is that "there is a specified number of columns, and if the number of columns differs from it, processing is terminated", then that approach is the quickest.
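The csv-module approach described in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's own code: the names ColumnCountError, read_validated_rows and to_dataframe are hypothetical, and the sketch assumes the whole file fits in memory as a string.

```python
import csv
import io


class ColumnCountError(ValueError):
    """Hypothetical error raised when a row has the wrong number of fields."""


def read_validated_rows(text, expected_columns):
    """Parse CSV text with the standard csv module and return its rows.

    Raises ColumnCountError as soon as a row's field count differs from
    expected_columns. Note that csv.reader yields an empty list for a
    blank line, so blank lines are rejected as well.
    """
    rows = []
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) != expected_columns:
            raise ColumnCountError(
                f"row {row_number}: expected {expected_columns} fields, "
                f"got {len(row)}"
            )
        rows.append(row)
    return rows


def to_dataframe(rows):
    """Convert already-validated rows to a pandas DataFrame.

    pandas is imported here so that the validation step above depends
    only on the standard library. All values arrive as strings, so any
    type conversion has to be done afterwards.
    """
    import pandas as pd

    return pd.DataFrame(rows)
```

With this shape, the upload handler would call read_validated_rows first and only build the DataFrame when no error was raised, so rows that are too short are rejected as well as rows that are too long.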
Answer # 2
What is the purpose of validation?
If you want to reject a file that does not pass, just catch the error with except and end the processing there.
If no error is raised even when the file does not pass, what would be the point of the validation?
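A minimal sketch of that idea, assuming the read_csv call from the question, might look like the following. The function name load_or_reject and the choice to return None on rejection are hypothetical. One caveat: with header=None pandas takes the column count from the first row and raises ParserError only when a later row has more fields than that; rows with fewer fields are padded with missing values instead, so the sketch also checks the resulting width.

```python
import io

import pandas as pd
from pandas.errors import ParserError


def load_or_reject(source, expected_columns):
    """Return a DataFrame, or None when the file fails validation.

    source may be a path or a file-like object, as accepted by read_csv.
    """
    try:
        df = pd.read_csv(source, header=None, low_memory=False)
    except ParserError:
        # A later row had more fields than the first row.
        return None

    # Rows with too few fields do not raise; they are padded with NaN.
    # This check therefore only confirms the overall width of the table.
    if df.shape[1] != expected_columns:
        return None
    return df


# Example with in-memory data standing in for an uploaded file.
good = load_or_reject(io.StringIO("1,2,3\n4,5,6\n"), expected_columns=3)
bad = load_or_reject(io.StringIO("1,2,3\n4,5,6,7\n"), expected_columns=3)
```

Because short rows are padded silently, this check cannot tell a truncated row from a row with genuinely empty trailing fields; if that distinction matters, counting fields before handing the data to pandas is the safer route.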