
A function that reads a txt file and stores it in a dictionary.

I got an error saying that element [0] is out of range.

IndexError                                Traceback (most recent call last)
<ipython-input-30-8f3a925c59a3> in <module>
     24 
     25 # Link training data captions with image names
---> 26 train_imagecaptions_dict = make_imagecaptions_dictionary('caption_mold.txt', train)
     27 # Link validation data captions with image names
     28 val_imagecaptions_dict = make_imagecaptions_dictionary('caption_mold.txt', val)

<ipython-input-30-8f3a925c59a3> in make_imagecaptions_dictionary(filename, setname)
     10         word = line.split()
     11         # Read the first word as the image name and all the rest as the caption
---> 12         image_id = word[0]
     13         image_sentence = word[1:]
     14         # If the image name is included in the dataset, do the following

IndexError: list index out of range
Corresponding source code
# Function to create a dictionary that associates the image name with the captions
def make_imagecaptions_dictionary(filename, setname):  # setname is train or val
    text = load_file(filename)
    # Create the dictionary
    imagecaptions = {}
    # Loop over the text line by line
    word = []
    for line in text.split('\n'):
        # Split on whitespace
        word = line.split()
        # Read the first word as the image name and all the rest as the caption
        image_id = word[0]
        image_sentence = word[1:]
        # If the image name is included in the set, do the following
        if image_id in setname:
            # Create a list the first time this image name appears
            if image_id not in imagecaptions:
                imagecaptions[image_id] = list()
            # Enclose the caption with the start and end words
            image_sentence = 'startseq ' + ' '.join(image_sentence) + ' endseq'
            imagecaptions[image_id].append(image_sentence)  # Store in dictionary
    return imagecaptions
# Link training data captions with image names
train_imagecaptions_dict = make_imagecaptions_dictionary('caption_mold.txt', train)
# Link validation data captions with image names
val_imagecaptions_dict = make_imagecaptions_dictionary('caption_mold.txt', val)
What I tried

When I printed word[0], the values looked normal.
Here is part of the printed word[0] output:

COCO_train2014_000000000009
COCO_train2014_000000000009
COCO_train2014_000000000009
COCO_train2014_000000000009
COCO_train2014_000000000009
COCO_train2014_000000000025
COCO_train2014_000000000025
COCO_train2014_000000000025
COCO_train2014_000000000025
COCO_train2014_000000000025
COCO_train2014_000000000030

The file passed as the argument has about 600,000 lines (about 62KB). A test file made from a few lines extracted from the top of this file did not cause the error, so is the error related to the size of the file?

Supplementary information (FW/tool version, etc.)

Part of the contents of caption_mold.txt, the file passed as the argument:

COCO_train2014_000000000009 Bread, vegetables, meat, fruits and cookies lunch box
COCO_train2014_000000000009 Broccoli is in the yellow case
COCO_train2014_000000000009 Food is served in the lunch box
COCO_train2014_000000000009 Inside the lunch box are bread with hamburger, broccoli and cheese, nuts and fruits.
COCO_train2014_000000000009 Colorful Side dishes packed in a lunch box
COCO_train2014_000000000025 There is a giraffe eating leaves
COCO_train2014_000000000025 1 One giraffe is eating tall tree grass
COCO_train2014_000000000025 Giraffe is eating the leaves on the tree
COCO_train2014_000000000025 A giraffe eating food on a tree branch
COCO_train2014_000000000025 A giraffe is eating food on a tree
COCO_train2014_000000000030 White and red flowers are displayed in a white vase
COCO_train2014_000000000030 There is a vase with lots of flowers on the balcony.
COCO_train2014_000000000030 Red and white flowers in a white vase
COCO_train2014_000000000030 Colorful flowers are alive in a white pottery vase
COCO_train2014_000000000030 White and red flowers are laid in a white vase

Part of the contents of the argument train:

{'COCO_train2014_000000180055','COCO_train2014_000000525932','COCO_train2014_000000433084','COCO_train2014_000000449780','COCO_train2014_000000150877','COCO_train2014_000000099628','COCO_train2014_000000437660','COCO_train2014_000000437660',' COCO_train2014_000000437660'

  • Answer # 1

    If there is a blank line anywhere in the text, line.split() returns an empty list for it, and word[0] raises IndexError: list index out of range.

    Please modify the loop as follows. Even if a line is blank and word is empty, no error occurs because processing moves on to the next line.

    for line in text.split('\n'):
        word = line.split()
        if not word:
            continue
        image_id = word[0]
        # (rest omitted)
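
For reference, here is a self-contained sketch of the whole function with this check applied. It is only an illustration: it takes the file text directly instead of calling load_file (which is not shown in the question), and the name build_captions and the sample data are made up for this example.

```python
def build_captions(text, setname):
    """Map each image name in setname to its list of captions, skipping blank lines."""
    imagecaptions = {}
    for line in text.split('\n'):
        word = line.split()
        if not word:
            # Blank or whitespace-only line: nothing to index, move on
            continue
        image_id, image_sentence = word[0], word[1:]
        if image_id in setname:
            # setdefault creates the list the first time an image name appears
            imagecaptions.setdefault(image_id, []).append(
                'startseq ' + ' '.join(image_sentence) + ' endseq')
    return imagecaptions


# Hypothetical sample data with a blank line in the middle
sample_text = (
    "COCO_train2014_000000000009 Broccoli is in the yellow case\n"
    "\n"
    "COCO_train2014_000000000025 There is a giraffe eating leaves\n"
)
sample_set = {'COCO_train2014_000000000009'}
result = build_captions(sample_text, sample_set)
print(result)
```

The blank line and the trailing newline are both skipped, and only the image name contained in sample_set ends up in the dictionary.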

  • Answer # 2

    If the list is empty, even index 0 is out of range.
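
So the error comes from a line that splits into an empty list, not from the size of the file. A small sketch for locating such lines is shown below; find_blank_lines and the sample text are hypothetical names and data used only for illustration. A newline at the very end of the file is one common source, because text.split('\n') then yields a final empty string.

```python
def find_blank_lines(text):
    """Return the 1-based numbers of lines that split() into an empty list."""
    return [n for n, line in enumerate(text.split('\n'), start=1)
            if not line.split()]


# Hypothetical sample: line 2 is whitespace only, and the text ends with a newline
sample_text = "COCO_x caption one\n   \nCOCO_y caption two\n"
blank = find_blank_lines(sample_text)
print(blank)

# Indexing an empty list raises the same error as in the question
try:
    [][0]
except IndexError as e:
    message = str(e)
print(message)
```

Running the same check on the real text returned by load_file would show which lines of caption_mold.txt trigger the error.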