
I have a couple of questions about Darknet-53.

The YOLOv3 paper says that the softmax function is no longer used.
https://qiita.com/mdo4nt6n/items/7cd5f106adc775e5d92b
Why, then, is there a Softmax layer at the bottom of the Darknet-53 configuration shown on the site above?

Also, Darknet-53 is said to consist of 53 convolutional layers, but when I count the Conv layers in the figure on that site, I only find 52. Am I miscounting?

  • Answer # 1

    Why, then, is there a Softmax layer at the bottom of the Darknet-53 configuration shown on the site above?
    Also, Darknet-53 is said to consist of 53 convolutional layers, but when I count the Conv layers in the figure on that site, I only find 52. Am I miscounting?

    The training proceeds in two stages:

    1. The weights are first learned from scratch on the ImageNet classification task, using the 52 convolutional layers plus a fully connected (softmax) output layer, i.e. 53 layers in total (Darknet-53). This pretrains the feature extractor.

    2. Of these, the 52 layers excluding the fully connected output layer are carried over into YOLOv3 and fine-tuned for object detection.

    So the Darknet-53 that is part of YOLOv3 has only 52 layers, although the original network has 53; the softmax belongs only to the classification version, which is why it appears at the bottom of the table on that site. This is not stated explicitly in the YOLOv3 paper, but the procedure is described in the YOLOv2 paper.
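
    Below is a minimal PyTorch-style sketch of that two-stage idea. The class name, the layers written out, and the `backbone` attribute are illustrative assumptions, not the actual Darknet-53 definition; only the overall structure (52 conv layers plus one fully connected softmax head for classification, with the head dropped for detection) follows the explanation above.

```python
import torch
import torch.nn as nn

class Darknet53Classifier(nn.Module):
    """Illustrative stand-in: 52 conv layers + 1 fully connected (softmax) head = 53 layers."""
    def __init__(self, num_classes=1000):
        super().__init__()
        # `backbone` represents the 52 convolutional layers; only the first few
        # are written out here, the residual blocks are omitted for brevity.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.LeakyReLU(0.1),
            # ... remaining conv / residual layers of Darknet-53 ...
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(32, num_classes)  # the 53rd layer; softmax sits on top

    def forward(self, x):
        x = self.pool(self.backbone(x)).flatten(1)
        return self.fc(x)  # CrossEntropyLoss applies the softmax during training

# Step 1: pretrain the full 53-layer classifier on ImageNet (feature extractor).
clf = Darknet53Classifier()
# ... train clf with nn.CrossEntropyLoss() on ImageNet images/labels ...

# Step 2: drop the fully connected (softmax) head; only the 52 conv layers
# are carried over into YOLOv3 and fine-tuned for detection.
detection_backbone = clf.backbone
```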

    YOLO9000.pdf

    For YOLOv2 we first fine tune the classification network
    at the full 448 x 448 resolution for 10 epochs on ImageNet.
    This gives the network time to adjust its filters to work better
    on higher resolution input. We then fine tune the resulting
    network on detection.
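
    As a rough sketch of the schedule quoted above (reusing the illustrative Darknet53Classifier from the previous snippet, with a dummy dataset standing in for ImageNet at 448x448), the classification fine-tune followed by detection fine-tuning looks something like this:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

clf = Darknet53Classifier()  # assumed already pretrained on ImageNet at lower resolution
optimizer = torch.optim.SGD(clf.parameters(), lr=1e-3, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

# Dummy stand-in for an ImageNet loader serving 448x448 crops.
dummy = TensorDataset(torch.randn(8, 3, 448, 448), torch.randint(0, 1000, (8,)))
loader_448 = DataLoader(dummy, batch_size=4)

# First fine-tune the classification network at the full 448x448 resolution
# for 10 epochs, as the quoted passage describes.
for epoch in range(10):
    for images, labels in loader_448:
        optimizer.zero_grad()
        loss_fn(clf(images), labels).backward()
        optimizer.step()

# Then fine-tune the resulting network on detection: the conv backbone is
# reused inside the detection model.
detection_backbone = clf.backbone
```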

    Since object-detection papers build incrementally on earlier work, I think you need to read the major object-detection papers from Faster R-CNN onward, in chronological order, to understand them properly.