Why do we usually have more than one fully connected layer in the late stages of a CNN?

As I've noticed, many popular convolutional neural network architectures (such as AlexNet) use more than one fully connected layer of roughly the same size to combine the features extracted by the earlier layers.

Why not use just a single FC layer? Why might this hierarchical arrangement of fully connected layers be more useful?


2 answers

Actually, this is no longer standard practice. Architectures from 2015 onward (e.g., ResNet, Inception v4) end with global average pooling (GAP) + softmax and drop the large fully connected layers entirely. The 2 fully connected layers in VGG16 account for most of its parameters (on the order of 80%). Historically, the FC block is essentially a 2-layer MLP classifier sitting on top of the convolutional features: 1 linear layer can only produce linear decision boundaries over those features, while 2 layers give a non-linear classifier that still trains well with SGD.
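To make the parameter argument concrete, here is a minimal sketch (PyTorch is my assumption; the answer names no framework) that counts the parameters of a VGG16-style FC head versus a GAP head. The layer sizes (512×7×7 conv output, two 4096-wide layers, 1000 classes) follow VGG16's published configuration; dropout is omitted since it adds no parameters.

```python
# Sketch: parameter cost of a VGG16-style FC head vs. a GAP + linear head.
import torch.nn as nn

num_classes = 1000  # ImageNet, as in VGG16

# VGG16-style classifier: flatten the 512 x 7 x 7 conv output,
# then two 4096-wide fully connected layers and a final linear layer.
fc_head = nn.Sequential(
    nn.Flatten(),
    nn.Linear(512 * 7 * 7, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, num_classes),
)

# Modern head: global average pooling collapses each 7x7 feature map
# to a single value, followed by one linear layer (softmax lives in the loss).
gap_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(512, num_classes),
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"FC head:  {count(fc_head):,} parameters")   # ~123.6M
print(f"GAP head: {count(gap_head):,} parameters")  # ~0.5M
```

Running this shows the FC head alone holds over a hundred million parameters, while the GAP head holds about half a million, which is why the GAP design displaced stacked FC layers.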


Think of the classic XOR problem: a single fully connected layer is a linear classifier and cannot separate classes that are not linearly separable in the feature space. Adding a second fully connected layer (with a non-linearity in between) turns the head into a small MLP that can carve out non-linear decision boundaries over the convolutional features, which is what the hierarchy of FC layers buys you.
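As a small illustration of the XOR point, the sketch below (again assuming PyTorch; the hidden width and training hyperparameters are arbitrary choices of mine) fits both a single linear layer and a 2-layer MLP to XOR. The linear model's loss plateaus; the MLP's typically drops toward zero.

```python
# Sketch: one linear layer cannot fit XOR; a 2-layer MLP can.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])  # XOR labels

def train(model, steps=2000):
    opt = torch.optim.Adam(model.parameters(), lr=0.1)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

linear = nn.Linear(2, 1)                          # one "FC layer"
mlp = nn.Sequential(nn.Linear(2, 8), nn.ReLU(),   # hidden layer adds the
                    nn.Linear(8, 1))              # needed non-linearity

print("linear loss:", train(linear))  # stays high: XOR is not linearly separable
print("mlp loss:   ", train(mlp))     # typically drops toward 0
```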

