In the traditional residual block, is the "addition" of layer N to the output of layer N+2 (prior to non-linearity) element-wise addition or concatenation?

The literature indicates something like this:

X1 = X
X2 = relu(conv(X1))
X3 = conv(X2)
X4 = relu(X3 + X1)

1 Answer


It has to be element-wise: with concatenation you don't get a residual function. You also have to use the proper padding mode ("same" padding) so that the convolutions produce outputs with the same spatial dimensions as the block input; otherwise the addition is not even defined.
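A minimal NumPy sketch of this, assuming single-channel feature maps (the function names `conv3x3_same` and `residual_block` are illustrative, not from any library):

```python
import numpy as np

def conv3x3_same(x, w):
    # x: (H, W) feature map, w: (3, 3) kernel.
    # Zero-pad by 1 so the output keeps the input's spatial size
    # ("same" padding), which the element-wise addition requires.
    xp = np.pad(x, 1)
    H, W = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * w)
    return out

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, w1, w2):
    # y = relu(conv(relu(conv(x))) + x): element-wise addition,
    # not concatenation, so all shapes must match exactly.
    h = relu(conv3x3_same(x, w1))
    h = conv3x3_same(h, w2)
    return relu(h + x)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
w1, w2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
y = residual_block(x, w1, w2)
print(y.shape)  # (8, 8) -- same as the input
```

Note that concatenating instead (`np.concatenate([h, x])`) would double one dimension, so the block could no longer learn a residual `F(x) = H(x) - x` on top of the identity.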

2 Comments

Thanks for that explanation. Since the convolutions must retain dimensions, does the input ever change size as it flows through the net?
@rodrigo-silveira It does, in the downsampling blocks placed between groups of residual ones. Read the ResNet paper for more information about the architecture.
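A toy sketch of that downsampling step, assuming a stride-2 block (the scalar weight stands in for a real 1x1 convolution; `downsample_block` is a hypothetical name):

```python
import numpy as np

def downsample_block(x, w):
    # Stride-2 "1x1 convolution" sketch: sampling every other pixel
    # halves each spatial dimension, as the downsampling blocks in
    # ResNet do. The shortcut must be strided/projected the same way
    # so the element-wise addition still lines up.
    main = x[::2, ::2] * w       # stand-in for a stride-2 1x1 conv
    shortcut = x[::2, ::2]       # strided shortcut, matching shape
    return np.maximum(main + shortcut, 0)  # ReLU after the addition

x = np.ones((8, 8))
y = downsample_block(x, 0.5)
print(y.shape)  # (4, 4) -- spatial size halved between blocks
```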
