
I am getting the paper size and the monochrome/color information of multiple image files (TIFF, JPEG) in a directory.

print(size+color)

Executing this gave the following results.

A4 mono
A4 mono
A4 mono
A4 mono
A0 color
A0 mono
A0 color
A0 mono
A1 color
A1 mono
A1 color
A1 mono
A2 color
A2 mono
A2 color
A2 mono

How should I write code that totals these results and writes them to Excel, split into monochrome and color, with the number of files for each, as shown below? [A4 mono:] goes in column A and [4] in column B.
Column A  Column B
A4 mono: 4
A2 mono: 2
A1 mono: 2

A2 color: 2
A1 color: 2

* 6/29: Added the entire program to the question.

from PIL import Image

for filename in os.listdir(folder_path):
path_in = os.path.join(folder_path, filename)

# Open the file and calculate the resolution
img = Image.open(path_in,'r')

dpi = img.info['dpi']

if dpi == (200, 200):
result = 200

if dpi == (300, 300):
result = 300

if dpi == (400, 400):
result = 400

if dpi == (600, 600):
result = 600

# Get file size (width, height)
w, h = img.size

w = int(w/result * 25.4)
h = int(h/result * 25.4)

# Allocate to column A based on the acquired size

if w<= 210 and h<= 297:
result ='A4'

elif w<= 420 and h<= 297:
result ='A3'




else:
result ='A0_OVERSIZE'

size = result

# Judge monochrome or color
mode = img.mode

if mode == '1':
result ='mono'

if mode != '1':
result ='color'

color = result

# Display result
print(size+color)

  • Answer # 1

    First, some comments on the program in the question.
    -Python is a programming language that delimits blocks of code by indentation (the whitespace at the beginning of each line). Without proper indentation, the program will not work as expected. If you plan to program in Python, paying attention to indentation is the first step.
    -Using the same variable for different purposes invites bugs. In the question's program, result, w, h, etc. are used that way.
    -Be sure to give each variable a meaningful name. That makes the program easier to understand when you review or modify it later.

    I modified the program with these points in mind.
    The totals are written to a CSV file in no particular order, so please open that file in Excel.

    To total the number of files separately for monochrome and color as shown in the question, please do the rest in Excel: use worksheet functions to put the first two characters of column A into column C and the remaining characters into column D, then sort columns A to D by column D in descending order and then by column C in descending order.

    Of course, the same result can also be achieved by modifying the program below.
    Please give it a try.
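
    One way the sorting could be done inside the program instead of in Excel is sketched here. It assumes the totals are held in a dictionary (called `counts` here, a hypothetical name with made-up sample values) whose keys are the paper size followed by `mono` or `color`, and that the paper size is always the first two characters. A label such as `A0_OVERSIZE` would need its own handling.

```python
# Hypothetical aggregated totals: key = paper size + mono/color, value = number of files
counts = {'A1color': 2, 'A4mono': 4, 'A2color': 2, 'A1mono': 2, 'A2mono': 2}


def sort_key(item):
    """Split a key into (mode, paper size): size is the first two characters."""
    key, _count = item
    return (key[2:], key[:2])


# Descending by mode ('mono' sorts after 'color'), then descending by paper size
ordered = sorted(counts.items(), key=sort_key, reverse=True)

for key, count in ordered:
    print(key + ':', count)
```

    This mirrors the Excel steps: the tuple returned by `sort_key` plays the role of columns D and C, and `reverse=True` gives the descending order on both.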

    The program below has not been run (it is difficult to prepare files for testing), so it may contain bugs.

    import os
    from PIL import Image

    # folder_path is assumed to already hold the path of the image directory
    csvFilePath = os.path.join(folder_path, "data.csv")
    counts = {}  # Dictionary used for aggregation; created once, before the loop, so it is not reset per file
    for filename in os.listdir(folder_path):
        path_in = os.path.join(folder_path, filename)
        if path_in == csvFilePath:
            continue  # Skip the output CSV left over from a previous run
        # Open the file and get the resolution
        img = Image.open(path_in, 'r')
        dpi = img.info['dpi']
        if dpi == (200, 200):
            dotsPerInch = 200
        elif dpi == (300, 300):
            dotsPerInch = 300
        elif dpi == (400, 400):
            dotsPerInch = 400
        elif dpi == (600, 600):
            dotsPerInch = 600
        else:
            print(filename + ': unsupported resolution', dpi)
            continue  # Without a known resolution the size in mm cannot be calculated
        # Get the image size in pixels (width, height)
        w, h = img.size
        # Convert the image size to mm
        w_mm = int(w / dotsPerInch * 25.4)
        h_mm = int(h / dotsPerInch * 25.4)
        # Determine the paper size from the size in mm
        if w_mm <= 210 and h_mm <= 297:
            paperSize = 'A4'
        elif w_mm <= 420 and h_mm <= 297:
            paperSize = 'A3'
        # The judgment part for A0, A1, A2 is omitted
        else:
            paperSize = 'A0_OVERSIZE'
        # Judge monochrome or color
        mode = img.mode
        if mode == '1':
            mono_color = 'mono'
        else:
            mono_color = 'color'
        # Display the result of judging the image size and mono/color
        print(paperSize + mono_color)
        # Aggregate with the dictionary (the key is the image data such as 'A4mono', the value is how many times it appeared)
        key = paperSize + mono_color
        if key in counts:  # This image data is already in the dictionary
            counts[key] = counts[key] + 1  # Increase the count by 1
        else:
            counts[key] = 1  # First occurrence, so the count is 1
    # Output the aggregation result
    with open(csvFilePath, mode='w') as csvF:  # Open the CSV file in write mode; it is closed automatically
        for k, v in counts.items():
            print(k + ':', v)  # Display the aggregated result on the screen
            csvF.write(k + ':' + ',' + str(v) + '\n')  # Write one row per key to the CSV file
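
    The program above leaves out the A0, A1 and A2 judgments. As a rough sketch only, one possible way to cover the whole A series is a table of the ISO 216 sizes in mm. The table `A_SERIES` and the function `classify_paper` are hypothetical names, and unlike the program above this sketch assumes the orientation of the image does not matter.

```python
# ISO 216 A-series sizes in mm as (name, short side, long side), smallest first
A_SERIES = [
    ('A4', 210, 297),
    ('A3', 297, 420),
    ('A2', 420, 594),
    ('A1', 594, 841),
    ('A0', 841, 1189),
]


def classify_paper(w_mm, h_mm):
    """Return the smallest A-series size the image fits on, ignoring orientation."""
    short_side, long_side = sorted((w_mm, h_mm))
    for name, max_short, max_long in A_SERIES:
        if short_side <= max_short and long_side <= max_long:
            return name
    return 'A0_OVERSIZE'  # Same fallback label as the program above


print(classify_paper(210, 297))
print(classify_paper(420, 297))
print(classify_paper(500, 700))
print(classify_paper(900, 1300))
```

    Because `w_mm` and `h_mm` are truncated with `int()`, a scan that is a fraction of a millimetre over a limit still lands in the intended size. Adding a small tolerance to the limits may be worth considering for real scans.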

  • Answer # 2

    CSV is a format that Excel can read.
    Separate the items with commas and the rows with line breaks.

    A4 mono, 4
    A2 mono, 2
    A1 mono, 2
    ...

    If you output the results in this form, you can write them to a file and open it in Excel.
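
    A minimal sketch of producing such rows, assuming the per-file labels have already been collected in a list (the `labels` list below is made-up sample data, and `write_counts` is a hypothetical helper). It uses `collections.Counter` for the totals and the standard `csv` module for the output, writing to an in-memory buffer so it can be run without any files.

```python
import csv
import io
from collections import Counter

# Hypothetical labels, standing in for the size+color strings printed per file
labels = ['A4 mono', 'A4 mono', 'A2 color', 'A4 mono', 'A2 color', 'A1 mono']

counts = Counter(labels)  # Maps each label to the number of times it occurs


def write_counts(counts, out):
    """Write one 'label,count' row per entry to the file-like object out."""
    writer = csv.writer(out, lineterminator='\n')
    for label, n in counts.items():
        writer.writerow([label, n])


buffer = io.StringIO()  # In-memory stand-in for a real file
write_counts(counts, buffer)
print(buffer.getvalue())
```

    To write a real file, the buffer could be replaced with `open('data.csv', 'w', newline='')`, where the file name is only an example.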