
I eventually want to build a model that combines a CNN and an RNN, and as a first step I am making a sample using MNIST. The accuracy on the training data is close to 1, but the test data is completely off (R^2 even comes out negative...).
I wrote the code with reference to this page (link).
Since I rewrote it in Keras, the model structure and the training part are different, but the way the sample data is created is exactly the same, so please refer to that page for it.
I thought the model structure might be wrong given such low accuracy, so I tried changing various things, but nothing seems to make any difference.
I am a beginner at machine learning and may be doing something strange, but I would appreciate your help.

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense, LSTM
from tensorflow.keras.models import Model
from sklearn.metrics import mean_squared_error, r2_score

def CNN_LSTM_Model(sequence_size, img_num, img_size, hidden_size):
    # CNN: apply the same convolutional stack to each image in the sequence
    input = Input(shape=(sequence_size, img_size, img_size, 1))
    cnn_list = []
    for i in range(sequence_size):
        x = Conv2D(16, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='elu')(input[:, i, :, :, :])
        x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
        x = Dropout(0.3)(x)
        x = Conv2D(16, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='elu')(x)
        x = MaxPooling2D(pool_size=(2, 2), strides=(2, 2))(x)
        x = Dropout(0.3)(x)
        x = Flatten()(x)
        output1 = Dense(hidden_size, activation='elu', kernel_initializer=tf.keras.initializers.VarianceScaling())(x)
        cnn_list.append(output1)
    # stack the per-step feature vectors into (batch, sequence, hidden) for the LSTM
    cnn_list = tf.transpose(cnn_list, perm=[1, 0, 2])
    # cnn_list = tf.reshape(cnn_list, (img_num, sequence_size, hidden_size))
    # LSTM
    x = LSTM(hidden_size, stateful=False)(cnn_list)
    output2 = Dense(19, activation='softmax')(x)
    model = Model(inputs=input, outputs=output2)
    model.compile(loss="categorical_crossentropy", optimizer="Adam", metrics=['accuracy'])
    model.summary()
    return model
img_num = 128
sequence_size = 2
img_size = 28
output_size = 19
hidden_size = 256
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# sampling_data() is the data-creation function taken as-is from the linked page (not shown here)
x_T, y_T, y_T_onehot = sampling_data(x_train, y_train)
y_T_onehot = np.reshape(y_T_onehot, (img_num, output_size))
# reshape and transpose so that the batch dimension comes first
x_T = np.reshape(x_T, (sequence_size, img_num, img_size, img_size, 1))
x_T = np.transpose(x_T, (1, 0, 2, 3, 4))
model = CNN_LSTM_Model(sequence_size=sequence_size, img_num=img_num, img_size=img_size, hidden_size=hidden_size)
model.fit(x_T, y_T_onehot, batch_size=img_num, epochs=500, verbose=1, validation_split=0.0)
x_T_test, y_T_test, y_T_onehot_test = sampling_data(x_test, y_test)
y_T_onehot_test = np.reshape(y_T_onehot_test, (img_num, output_size))
x_T_test = np.transpose(x_T_test, (1, 0, 2, 3, 4))
pred_train = model.predict(x_T, batch_size=img_num, verbose=0)
pred_test = model.predict(x_T_test, batch_size=img_num, verbose=0)
train_ans = []
for i in range(img_num):
    max_index = np.argmax(pred_train[i])
    ans = max_index - 9
    train_ans.append(ans)
test_ans = []
for i in range(img_num):
    max_index = np.argmax(pred_test[i])
    ans = max_index - 9
    test_ans.append(ans)
print("RMSE: %.0f" % (np.sqrt(mean_squared_error(y_T, train_ans))))
print("R2: %.3f" % (r2_score(y_T, train_ans)))
print("RMSE: %.0f" % (np.sqrt(mean_squared_error(y_T_test, test_ans))))
print("R2: %.3f" % (r2_score(y_T_test, test_ans)))
# Train RMSE: 0
# Train R2: 1.000
# Test RMSE: 4
# Test R2: 0.054
  • Answer #1

    I should have gone through the sample code properly from the start.
    I don't really know TensorFlow, so I was sloppy about how the training is actually executed...
    Maybe the same thing can't be done in Keras? In any case, in the original sample the data is replaced every epoch (the function that creates the sample data is called again each time).
    In other words, with a batch of 128 drawn at each of the 10,000 epochs, the original effectively trains on 1,280,000 x 2 images.
    I didn't realize that and trained on only 128 x 2 images, which would naturally cause overfitting...
    It seems you can't do that directly with Keras model.fit, so for the time being I increased the number of images sampled at one time to 10,000, and the accuracy came out at about 80%. I didn't notice this for about 3 weeks... (a rough sketch of such a per-epoch resampling loop is shown below.)
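
    A minimal sketch, assuming sampling_data() returns a fresh random batch on every call, of how that per-epoch resampling could be reproduced with a manual loop; this is only an illustration, not the code of the original sample:

    for epoch in range(10000):
        # draw a fresh batch of sequences each iteration, as the original sample does
        x_T, y_T, y_T_onehot = sampling_data(x_train, y_train)
        y_T_onehot = np.reshape(y_T_onehot, (img_num, output_size))
        x_T = np.reshape(x_T, (sequence_size, img_num, img_size, img_size, 1))
        x_T = np.transpose(x_T, (1, 0, 2, 3, 4))
        # one gradient step on the freshly sampled batch
        model.train_on_batch(x_T, y_T_onehot)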

    Well, it turned out the model itself wasn't the problem, but I learned a lot in the process of trial and error, so let's call it OK...

    Thank you to everyone who answered!

  • Answer #2

    I don't usually use Keras, so this is an intuitive answer.

    First, since the training accuracy is close to 1, we can rule out a defect in the model itself.

    Since the gap between train and test is large, check for differences in the code between the two.

    There is a difference between how train and test are prepared.

    x_T = np.reshape(x_T, (sequence_size, img_num, img_size, img_size, 1))
    x_T = np.transpose(x_T, (1, 0, 2, 3, 4))

    versus

    x_T_test = np.transpose(x_T_test, (1, 0, 2, 3, 4))

    Aren't you missing the np.reshape transformation for the test data?
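
    As an illustration of that suggestion (assuming x_T_test comes back from sampling_data in the same shape as x_T), the test data would presumably need the same reshape before the transpose:

    x_T_test = np.reshape(x_T_test, (sequence_size, img_num, img_size, img_size, 1))
    x_T_test = np.transpose(x_T_test, (1, 0, 2, 3, 4))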