Tensorflow CNN Training Images Have Different Sizes

I created a deep convolutional neural network to classify individual pixels in an image. My training data will always be the same size (32x32x7), but my testing data can be of any size.

Github repository

Currently, my model will only work on images of one size. I based my model heavily on the TensorFlow MNIST tutorial, which uses only 28x28 images. How can I change the following MNIST model so that it accepts images of any size?

  x = tf.placeholder(tf.float32, shape=[None, 784])
  y_ = tf.placeholder(tf.float32, shape=[None, 10])
  W = tf.Variable(tf.zeros([784, 10]))
  b = tf.Variable(tf.zeros([10]))
  x_image = tf.reshape(x, [-1, 28, 28, 1])

To make things a little more complicated, my model uses transposed convolutions, for which the output shape must be specified. How can I adjust the following line of code so that the transposed convolution produces an output of the same shape as its input?

  DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w,
                                    output_shape=[1, 32, 32, 7, 1],
                                    strides=[1, 2, 2, 2, 1],
                                    padding='SAME')
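For context on the shape arithmetic: with padding='SAME', conv3d_transpose accepts any output_shape whose dimensions satisfy ceil(out / stride) == in, and the simplest valid choice is in × stride per spatial dimension. A minimal sketch of that arithmetic (the helper name and the tf.shape usage in the comment are illustrative, not from the repository):

```python
def same_padding_deconv_shape(input_shape, strides, out_channels=1):
    """Canonical output_shape for conv3d_transpose with padding='SAME'.

    With 'SAME' padding, any output size o with ceil(o / stride) == in
    is accepted; the simplest valid choice is in * stride for each
    spatial dimension.  input_shape is [batch, depth, height, width,
    channels]; strides are the three spatial strides.
    """
    batch, d, h, w, _ = input_shape
    sd, sh, sw = strides
    return [batch, d * sd, h * sh, w * sw, out_channels]

# Inside the graph, the same idea works with dynamic shapes (untested
# TF 1.x sketch -- layer1 and w are the tensors from the question):
#   s = tf.shape(layer1)
#   out_shape = tf.stack([s[0], s[1] * 2, s[2] * 2, s[3] * 2, 1])
#   DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w,
#                                     output_shape=out_shape,
#                                     strides=[1, 2, 2, 2, 1],
#                                     padding='SAME')
```

Note that the hard-coded output_shape [1, 32, 32, 7, 1] with strides of 2 implies an input of [1, 16, 16, 4, 1]; an output depth of either 7 or 8 would be accepted there, since ceil(7/2) == ceil(8/2) == 4.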
python deep-learning tensorflow conv-neural-network deconvolution

3 answers

Unfortunately, there is no way to build dynamic graphs in TensorFlow (you could try Fold, but that is beyond the scope of this question). This leaves you with two options:

  • Bucketing: create several input tensors at a few manually selected sizes, then at runtime pick the right bucket (see the Seq2seq-with-bucketing example). Either way, you will probably also need the second option.

  • Resize the input and output images. Assuming the images share the same aspect ratio, you can resize them before feeding them in. It is not clear why you care about the output size, since MNIST is a classification task.

In either case, you can use the same resizing approach:

  from PIL import Image

  basewidth = 28  # MNIST image width
  img = Image.open('your_input_img.jpg')
  wpercent = basewidth / float(img.size[0])
  hsize = int(float(img.size[1]) * wpercent)
  img = img.resize((basewidth, hsize), Image.ANTIALIAS)
  # Save the image, or feed it directly to TensorFlow
  img.save('feed_to_tf.jpg')

The MNIST model code you mention is an example of a fully connected (FC) network, not a convolutional one. The input shape [None, 784] is hard-wired to the MNIST size (28 x 28): it is an FC network with a fixed input size.

What you are asking for is not possible with FC networks, because the number of weights and biases depends on the input shape. It is possible with a fully convolutional architecture, so my suggestion is to use one: in a fully convolutional network, the weights and biases are independent of the input shape.
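To see why convolutional weights are input-size independent, here is a minimal pure-Python sketch of a 'VALID' 2-D convolution (cross-correlation, as TF's conv ops actually compute) applied with the same kernel to two differently sized inputs; the function name is illustrative:

```python
def conv2d_valid(image, kernel):
    """'VALID' 2-D convolution over nested lists.

    The kernel carries all the learnable weights and has a fixed size;
    the image can be any size, which is why a fully convolutional
    network has no fixed-input-shape constraint.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(iw - kw + 1)]
            for i in range(ih - kh + 1)]

kernel = [[1.0] * 3 for _ in range(3)]                         # the same 9 weights either way
small = conv2d_valid([[1.0] * 28 for _ in range(28)], kernel)  # 26 x 26 output
large = conv2d_valid([[1.0] * 64 for _ in range(64)], kernel)  # 62 x 62 output
```

The output size tracks the input size, but the parameter count does not; an FC layer, by contrast, needs one weight per input pixel per output unit, so its weight matrix is tied to one input shape.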


To add to @gidim's answer: you can resize the images in TensorFlow itself and feed the results directly to your network. Note that this method scales and distorts the image, which can increase your loss.

All credit goes to Prasad Pai's article on data augmentation.

  import tensorflow as tf
  import numpy as np
  from PIL import Image

  IMAGE_SIZE = 32
  CHANNELS = 1

  def tf_resize_images(X_img_file_paths):
      X_data = []
      tf.reset_default_graph()
      X = tf.placeholder(tf.float32, (None, None, CHANNELS))
      tf_img = tf.image.resize_images(X, (IMAGE_SIZE, IMAGE_SIZE),
                                      tf.image.ResizeMethod.NEAREST_NEIGHBOR)
      with tf.Session() as sess:
          sess.run(tf.global_variables_initializer())
          # Each image is resized individually, as different images
          # may have different sizes.
          for index, file_path in enumerate(X_img_file_paths):
              img = Image.open(file_path)
              resized_img = sess.run(tf_img, feed_dict={X: img})
              X_data.append(resized_img)
      X_data = np.array(X_data, dtype=np.float32)  # Convert to numpy
      return X_data
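If you want to see what nearest-neighbour resizing does without starting a session: for each output pixel it simply copies the nearest source pixel. A rough pure-Python equivalent (an illustration of the idea, not TF's exact kernel; the function name is hypothetical):

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D nested list.

    Each output pixel (i, j) copies the source pixel at
    (floor(i * in_h / out_h), floor(j * in_w / out_w)), roughly what
    tf.image.resize_images does with ResizeMethod.NEAREST_NEIGHBOR.
    """
    in_h, in_w = len(image), len(image[0])
    return [[image[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)]
            for i in range(out_h)]

# Upscaling a 2x2 image to 4x4 turns each source pixel into a 2x2 block.
img = [[1, 2],
       [3, 4]]
big = resize_nearest(img, 4, 4)
```

Because pixels are copied rather than interpolated, nearest-neighbour keeps the original value set intact, which is why it is a common choice for label or mask images; for the input images themselves, bilinear interpolation usually distorts less.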