Pooling and convolution operations slide a "window" across the input tensor. Using tf.nn.conv2d as an example: if the input tensor has 4 dimensions, [batch, height, width, channels] , then the convolution operates on a 2D window over the height and width dimensions.
strides determines how far the window shifts in each dimension. Typical use sets the first (batch) and last (depth) stride to 1.
Let me use a very specific example: running a 2D convolution over a 32x32 grayscale input image. I say grayscale because then the input image has depth = 1, which helps keep it simple. Let the image look like this:
00 01 02 03 04 ...
10 11 12 13 14 ...
20 21 22 23 24 ...
30 31 32 33 34 ...
...
Run a 2x2 convolution window over a single example (batch size = 1). We will give the convolution an output channel depth of 8.
The convolution input has shape=[1, 32, 32, 1] .
If you specify strides=[1, 1, 1, 1] with padding=SAME , then the filter output will have shape [1, 32, 32, 8].
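To make these shapes concrete, here is a minimal NumPy sketch of the same windowing. The function name conv2d_same is my own, not TensorFlow's API; the real operator is tf.nn.conv2d, but the shape arithmetic below follows the SAME-padding rule it uses.

```python
import numpy as np

# Hypothetical sketch of tf.nn.conv2d's SAME-padding shape behavior
# (conv2d_same is my own name, not part of TensorFlow).
def conv2d_same(x, filters, stride):
    """x: [batch, h, w, in_ch]; filters: [fh, fw, in_ch, out_ch]."""
    batch, h, w, in_ch = x.shape
    fh, fw, _, out_ch = filters.shape
    # SAME padding keeps output spatial size = ceil(input / stride)
    out_h = -(-h // stride)
    out_w = -(-w // stride)
    pad_h = max((out_h - 1) * stride + fh - h, 0)
    pad_w = max((out_w - 1) * stride + fw - w, 0)
    xp = np.pad(x, ((0, 0),
                    (pad_h // 2, pad_h - pad_h // 2),
                    (pad_w // 2, pad_w - pad_w // 2),
                    (0, 0)))
    out = np.zeros((batch, out_h, out_w, out_ch))
    for i in range(out_h):
        for j in range(out_w):
            window = xp[:, i*stride:i*stride+fh, j*stride:j*stride+fw, :]
            # F(window): dot each flattened window with each filter
            out[:, i, j, :] = np.tensordot(window, filters,
                                           axes=([1, 2, 3], [0, 1, 2]))
    return out

x = np.random.rand(1, 32, 32, 1)          # one 32x32 grayscale image
f = np.random.rand(2, 2, 1, 8)            # 2x2 filter, 8 output channels
print(conv2d_same(x, f, stride=1).shape)  # (1, 32, 32, 8)
print(conv2d_same(x, f, stride=2).shape)  # (1, 16, 16, 8)
```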
First, the filter will create an output for:
F(00 01
  10 11)
And then for:
F(01 02
  11 12)
etc. Then it will move to the second row, computing:
F(10 11
  20 21)
then
F(11 12
  21 22)
If you specify strides=[1, 2, 2, 1], the windows will not overlap. It will compute:
F(00 01
  10 11)
and then
F(02 03
  12 13)
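The overlapping vs. non-overlapping behavior can be sketched by listing the top-left corners the window visits. This uses a toy 4x4 input with no padding for brevity, and the helper window_origins is hypothetical, just for illustration:

```python
# Top-left corners visited by a k x k window over a size x size input
# (no padding, for simplicity; window_origins is a made-up helper).
def window_origins(size, k, stride):
    return [(i, j) for i in range(0, size - k + 1, stride)
                   for j in range(0, size - k + 1, stride)]

print(window_origins(4, 2, 1))  # stride 1: overlapping windows
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
print(window_origins(4, 2, 2))  # stride 2: non-overlapping windows
# [(0, 0), (0, 2), (2, 0), (2, 2)]
```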
Strides work similarly for the pooling operators.
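For instance, here is a minimal NumPy sketch of 2x2 max pooling with stride 2, the same [1, 2, 2, 1] windowing that tf.nn.max_pool applies (the function max_pool_2x2 is my own name, not TensorFlow's):

```python
import numpy as np

# Sketch of 2x2 max pooling with stride 2 over [batch, h, w, channels].
# Assumes h and w are even; max_pool_2x2 is a made-up helper name.
def max_pool_2x2(x):
    b, h, w, c = x.shape
    # Split each spatial axis into (blocks, 2) and take the max per block
    return x.reshape(b, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

img = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
print(max_pool_2x2(img)[0, :, :, 0])
# [[ 5.  7.]
#  [13. 15.]]
```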
Question 2: Why strides [1, x, y, 1] for convnets
The first one is the batch: you usually don't want to skip over examples in your batch, or you shouldn't have included them in the first place. :)
The last one is the convolution depth: you usually don't want to skip inputs, for the same reason.
The conv2d operator is more general, so you could create convolutions that slide the window along other dimensions, but that's not typical use in convnets. The typical use is to use them spatially.
Why reshape to -1
-1 is a placeholder that says "adjust as necessary to match the size needed for the full tensor." It's a way of making the code independent of the input batch size, so you can change your pipeline and not have to adjust the batch size in the code.
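A short sketch of the same behavior; NumPy's reshape treats -1 the same way tf.reshape does:

```python
import numpy as np

# Batch of 7 grayscale 32x32 images; the batch size could be anything.
batch = np.random.rand(7, 32, 32, 1)

# -1 tells reshape: "infer this dimension from the total element count."
flat = batch.reshape(-1, 32 * 32)
print(flat.shape)  # (7, 1024)
```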