Problems and what I want to achieve

The loss became NaN while training in TensorFlow, so I thought I could fix it by writing my own loss function.
However, when I train with my custom loss function, training does not progress at all.
I would appreciate it if you could point out anything that is wrong.
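
For context, the usual way cross-entropy becomes NaN is log(0) when a predicted probability collapses to exactly zero, which is presumably why the custom loss below adds a small epsilon inside the log. A minimal illustration of that failure mode:

import tensorflow as tf

# log(0) is -inf, and 0 * -inf is NaN, which then poisons the whole loss sum.
p = tf.constant([1.0, 0.0])             # a collapsed softmax output
print(tf.math.log(p).numpy())           # [0., -inf]
print((p * tf.math.log(p)).numpy())     # [0., nan]
print(tf.math.log(p + 1e-10).numpy())   # finite: [~0., ~-23.03]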

Implemented code
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
import tensorflow as tf
import datetime
import os

datagen = ImageDataGenerator(
    rescale=1./255,        # normalize pixel values with rescale
    validation_split=0.3   # validation_split splits off a validation dataset
)
aug_datagen = ImageDataGenerator(
    rescale=1./255,          # normalize pixel values with rescale
    rotation_range=20,       # randomly rotate within ±20 degrees
    width_shift_range=10,    # randomly shift horizontally within ±10 px
    height_shift_range=10    # randomly shift vertically within ±10 px
)
batch_size = 10
train_generator = datagen.flow_from_directory(
    '/content/drive/My Drive/my_data_set/train/',
    target_size=(224, 224),
    class_mode='categorical',
    batch_size=batch_size,
    subset='training',
)
aug_train_generator = aug_datagen.flow_from_directory(
    '/content/drive/My Drive/my_data_set/train/',
    target_size=(224, 224),
    class_mode='categorical',
    batch_size=batch_size,
    subset='training',
)
val_generator = datagen.flow_from_directory(
    '/content/drive/My Drive/my_data_set/train/',
    target_size=(224, 224),
    class_mode='categorical',
    batch_size=batch_size,
    subset='validation'
)
%load_ext tensorboard
os.chdir('/content/drive/My Drive/my_data_set/')
!rm -rf ./logs/
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = tf.keras.layers.Conv2D(8, (1, 1), activation='relu', input_shape=(224, 224, 3))
        self.conv1_2 = tf.keras.layers.Conv2D(32, (5, 5), activation='relu')
        self.conv2 = tf.keras.layers.Conv2D(16, (1, 1), activation='relu')
        self.conv2_2 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')
        self.conv3 = tf.keras.layers.Conv2D(32, (1, 1), activation='relu')
        self.conv3_2 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu')
        self.pooling1 = tf.keras.layers.MaxPooling2D((2, 2), strides=None, padding='valid')
        self.pooling2 = tf.keras.layers.MaxPooling2D((2, 2), strides=None, padding='valid')
        self.flatten = tf.keras.layers.Flatten()
        self.fc1 = tf.keras.layers.Dense(256, activation='relu')
        self.fc2 = tf.keras.layers.Dense(9, activation='relu')  # 9-class output (note: relu, not softmax)
        self.dropout = tf.keras.layers.Dropout(0.2)

    def call(self, x, training=False):
        x = self.conv1(x)
        x = self.conv1_2(x)
        x = self.pooling1(x)
        x = self.conv2(x)
        x = self.conv2_2(x)
        x = self.pooling2(x)
        x = self.conv3(x)
        x = self.conv3_2(x)
        x = self.pooling2(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.dropout(x, training=training)
        x = self.fc2(x)
        return x
model = MyModel()
Adam = tf.keras.optimizers.Adam(learning_rate=0.001, clipvalue=1.0)

def custom_loss(y_true, y_pred):
    # softmax the raw outputs, then cross-entropy; +1e-10 guards against log(0)
    error = -tf.reduce_sum(y_true * tf.math.log(tf.nn.softmax(y_pred) + 1e-10))
    return error

model.compile(optimizer=Adam,
              loss=custom_loss,
              metrics=['accuracy'])
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, write_graph=True)
early_stopping_call_back = tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=10, verbose=0, mode='auto')
tf.random.set_seed(0)
model.fit(train_generator, epochs=40)
history = model.fit(
    aug_train_generator,
    epochs=50,
    validation_data=val_generator,
    callbacks=[tensorboard_callback, early_stopping_call_back],
)
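
For reference, this custom loss is essentially softmax cross-entropy computed from logits, which Keras already provides in a numerically stable built-in form. A minimal sketch of the equivalent, assuming the model outputs raw logits as above:

import tensorflow as tf

# Built-in equivalent: softmax + cross-entropy fused into one stable op.
# from_logits=True tells Keras the model outputs raw scores, not probabilities.
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

y_true = tf.constant([[0., 1., 0.]])       # one-hot label
y_pred = tf.constant([[1.2, 3.4, -0.5]])   # raw logits
print(loss_fn(y_true, y_pred).numpy())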
What I tried

I wondered whether some additions or changes were needed to support multi-class classification, but right now I have no idea how to solve this.

Execution environment

Google Colab
TensorFlow 2.3.0

  • Answer #1

    It seems that softmax is not included in the loss function in TensorFlow, and the problem was solved by adding a softmax function at the end of the network definition.
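
    A minimal sketch of that fix, under the assumption that everything else stays the same: give the final Dense layer a softmax activation so the model outputs probabilities, then compile with the built-in categorical cross-entropy. (FixedModel below is a hypothetical, shortened version of the question's MyModel; only the last layer and the loss change.)

import tensorflow as tf

class FixedModel(tf.keras.Model):
    def __init__(self):
        super(FixedModel, self).__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.fc1 = tf.keras.layers.Dense(256, activation='relu')
        # softmax added here so the network emits class probabilities
        self.fc2 = tf.keras.layers.Dense(9, activation='softmax')

    def call(self, x, training=False):
        x = self.flatten(x)
        x = self.fc1(x)
        return self.fc2(x)

model = FixedModel()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',  # expects probabilities, not logits
              metrics=['accuracy'])

    Alternatively, the last layer can stay linear (no activation) and the loss can be tf.keras.losses.CategoricalCrossentropy(from_logits=True), which is the more numerically stable route.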