TensorFlow VarLenFeature vs FixedLenFeature

I tried to save images of different sizes in a TFRecord file. I found that, although the images have different sizes, I can still load them using FixedLenFeature.

Checking the documentation for FixedLenFeature and VarLenFeature, I found that the difference is that VarLenFeature returns a sparse tensor.

Can someone illustrate situations in which one should use FixedLenFeature and situations that call for VarLenFeature?

+22
python tensorflow
2 answers

You can load the images, probably because you saved them using tf.train.BytesList(), with the entire image data stored as one large byte value inside the list.

If I'm right, you are using tf.decode_raw to get the image data out of what you load from the TFRecord.
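The round trip this describes can be sketched with NumPy (a made-up 2x3 RGB array stands in for a real image; np.frombuffer plays the role of tf.decode_raw):

```python
import numpy as np

# A hypothetical 2x3 RGB "image"
image = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

# Saving: flatten the whole image into one byte string, which is what
# ends up as the single large value inside tf.train.BytesList
raw = image.tobytes()

# Loading: decode the bytes back (the NumPy analogue of tf.decode_raw);
# the shape must be restored separately from stored height/width/depth
restored = np.frombuffer(raw, dtype=np.uint8).reshape(2, 3, 3)
assert np.array_equal(restored, image)
```

Because the byte string carries no shape information, images of different sizes all fit in the same single-element bytes feature.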

As for usage examples: I use VarLenFeature to save datasets for an object-detection task. There is a variable number of bounding boxes per image (equal to the number of objects in the image), so I need another feature, objects_number, to track the number of objects (and bboxes). Each bounding box is a list of 4 floating-point coordinates.
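To illustrate why objects_number is needed, here is a minimal NumPy sketch (the coordinate values are made up): VarLenFeature delivers all the coordinates as one flat vector, and the stored count restores the [N, 4] shape.

```python
import numpy as np

# Hypothetical image with 2 objects; VarLenFeature yields the bbox
# coordinates as one flat vector of 4 * objects_number floats
flat_bboxes = np.array([0.1, 0.2, 0.5, 0.6,   # box 1: 4 float coords
                        0.3, 0.3, 0.9, 0.8],  # box 2
                       dtype=np.float32)
objects_number = 2

# After densifying the sparse tensor the shape info is gone, so the
# loader reshapes with [objects_number, 4]
bboxes = flat_bboxes.reshape(objects_number, 4)
assert bboxes.shape == (2, 4)
```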

I use the following code to load it:

    features = tf.parse_single_example(
        serialized_example,
        features={
            # We know the length of both fields. If not, the
            # tf.VarLenFeature could be used
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'depth': tf.FixedLenFeature([], tf.int64),
            # Label part
            'objects_number': tf.FixedLenFeature([], tf.int64),
            'bboxes': tf.VarLenFeature(tf.float32),
            'labels': tf.VarLenFeature(tf.int64),
            # Dense data
            'image_raw': tf.FixedLenFeature([], tf.string)
        })

    # Get metadata
    objects_number = tf.cast(features['objects_number'], tf.int32)
    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)
    depth = tf.cast(features['depth'], tf.int32)

    # Actual data
    image_shape = tf.parallel_stack([height, width, depth])
    bboxes_shape = tf.parallel_stack([objects_number, 4])

    # BBOX data is actually dense; convert it to a dense tensor
    bboxes = tf.sparse_tensor_to_dense(features['bboxes'], default_value=0)

    # Since information about the shape is lost, reshape it
    bboxes = tf.reshape(bboxes, bboxes_shape)

    image = tf.decode_raw(features['image_raw'], tf.uint8)
    image = tf.reshape(image, image_shape)

Please note that "image_raw" is a fixed-length feature (it has one element) holding a value of type "bytes"; however, a "bytes" value can itself have variable size (it is a byte string and can contain many characters). Thus, "image_raw" is a list with ONE element of type "bytes", and that element can be very large.

To further clarify how this works: features are lists of values, and those values have a specific "type".

The data types for features are a subset of the data types for tensors. You have:

  • int64 (64 bits of memory)
  • bytes (takes up as many bytes in memory as you need)
  • float (32 bits of memory)

You can check the data types of tensors in the TensorFlow documentation.

So you can store variable-length data without VarLenFeature at all (which is in fact what you are already doing with the image): first convert it to a bytes/string feature, then decode it on load. This is the most common method.
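The same bytes-then-decode trick applied to the bounding boxes can be sketched with NumPy (the names bboxes and objects_number mirror the loading snippet above; np.frombuffer stands in for tf.decode_raw):

```python
import numpy as np

# Hypothetical variable-length data: 2 bounding boxes of 4 float32 coords
bboxes = np.array([[0.1, 0.2, 0.5, 0.6],
                   [0.3, 0.3, 0.9, 0.8]], dtype=np.float32)

# Store everything as ONE byte string in a fixed-length bytes feature,
# instead of using a VarLenFeature of floats
raw = bboxes.tobytes()

# On load, decode the bytes and restore the shape from a separately
# stored objects_number feature
objects_number = 2
decoded = np.frombuffer(raw, dtype=np.float32).reshape(objects_number, 4)
assert np.array_equal(decoded, bboxes)
```

Note that the dtype passed to the decode step must match the dtype the data was written with, or the values come back garbled.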

+38

@Xyz

    import tensorflow as tf

    def __format(record):
        features = tf.parse_single_example(
            record,
            features={
                'image/height': tf.FixedLenFeature([], tf.int64),
                'image/width': tf.FixedLenFeature([], tf.int64),
                'image/depth': tf.FixedLenFeature([], tf.int64),
                'image/source': tf.FixedLenFeature([], tf.string)
            })

        height = tf.cast(features['image/height'], tf.int32)
        width = tf.cast(features['image/width'], tf.int32)
        depth = tf.cast(features['image/depth'], tf.int32)

        # Actual data
        image_shape = tf.parallel_stack([height, width, depth])
        image = tf.decode_raw(features['image/source'], tf.uint8)
        image = tf.reshape(image, image_shape)
        return image

    dataset = tf.data.TFRecordDataset("train.tfrecord")
    dataset = dataset.repeat()
    dataset = dataset.map(__format)
    dataset = dataset.batch(1)  # From the paper

    iterator = dataset.make_one_shot_iterator()
    image = iterator.get_next()
    print(image.get_shape())

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        pass

This prints (?, ?, ?, ?): the image has no static shape, so it cannot be fed into a CNN.

What am I doing wrong?

0
source share
