In the traditional residual block, is the "addition" of layer N to the output of layer N+2 (prior to non-linearity) element-wise addition or concatenation?

The literature indicates something like this:

X1 = X
X2 = relu(conv(X1))
X3 = conv(X2)
X4 = relu(X3 + X1)

1 Answer


It has to be element-wise: with concatenation you don't get a residual function. You also have to use the proper padding mode ("same" padding) so that the convolutions produce outputs with the same spatial dimensions as the block input; otherwise the addition is not even defined.
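A minimal NumPy sketch of this, assuming single-channel feature maps (the function names `conv3x3_same` and `residual_block` are illustrative, not from any library):

```python
import numpy as np

def conv3x3_same(x, w):
    # x: (H, W) feature map, w: (3, 3) kernel.
    # Zero-pad by 1 so the output keeps the input's spatial size
    # ("same" padding), which the element-wise addition requires.
    xp = np.pad(x, 1)
    H, W = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w)
    return out

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, w1, w2):
    # y = relu(conv(relu(conv(x))) + x): element-wise addition,
    # not concatenation, so all shapes must match exactly.
    h = relu(conv3x3_same(x, w1))
    h = conv3x3_same(h, w2)
    return relu(h + x)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
w1, w2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
y = residual_block(x, w1, w2)
print(y.shape)  # (8, 8) -- same as the input
```

Note that concatenating instead (`np.concatenate([h, x])`) would double one dimension, so the block could no longer learn a residual `F(x) = H(x) - x` on top of the identity.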

2 Comments

Thanks for that explanation. Since the convolutions must retain dimensions, does the input ever change size as it flows through the net?
@rodrigo-silveira It does, in the downsampling blocks placed between groups of residual ones. Read the ResNet paper for more information about the architecture.
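A toy sketch of that downsampling step, assuming a stride-2 block (the scalar weight stands in for a real 1x1 convolution; `downsample_block` is a hypothetical name):

```python
import numpy as np

def downsample_block(x, w):
    # Stride-2 "1x1 convolution" sketch: sampling every other pixel
    # halves each spatial dimension, as the downsampling blocks in
    # ResNet do. The shortcut must be strided/projected the same way
    # so the element-wise addition still lines up.
    main = x[::2, ::2] * w       # stand-in for a stride-2 1x1 conv
    shortcut = x[::2, ::2]       # strided shortcut, matching shape
    return np.maximum(main + shortcut, 0)  # ReLU after the addition

x = np.ones((8, 8))
y = downsample_block(x, 0.5)
print(y.shape)  # (4, 4) -- spatial size halved between blocks
```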
