How to read JSON files in TensorFlow?

I am trying to write a function that reads JSON files in TensorFlow. The JSON files have the following structure:

    {
      "bounding_box": { "y": 98.5, "x": 94.0, "height": 197, "width": 188 },
      "rotation": { "yaw": -27.97019577026367, "roll": 2.206029415130615, "pitch": 0.0 },
      "confidence": 3.053506851196289,
      "landmarks": {
        "1": { "y": 180.87722778320312, "x": 124.47326660156205 },
        "0": { "y": 178.60653686523438, "x": 183.41931152343795 },
        "2": { "y": 224.5936889648438, "x": 141.62365722656205 }
      }
    }

I only need the bounding box information. There are some examples of how to write read_and_decode functions, and I am trying to adapt them into a function for JSON files, but many questions remain open:

    def read_and_decode(filename_queue):
        reader = tf.WhichKindOfReader()  # ???
        _, serialized_example = reader.read(filename_queue)
        features = tf.parse_single_example(
            serialized_example,
            features={
                'bounding_box': {
                    'y': tf.VarLenFeature(<whatstheproperdatatype>) ???
                    'x':
                    'height':
                    'width':
                    # I only need the bounding box... - do I need to write
                    # the format information for the other features...???
                }
            })
        y = tf.decode()  # decoding necessary?
        x =
        height =
        width =
        return x, y, height, width

I have been searching the Internet for several hours, but I can't find anything really detailed about how to read JSON in TensorFlow.

Maybe someone can give me a hint.

+5
2 answers

Update

The solution below works, but it is not very efficient; see the comments for details.

Original answer

You can use standard Python JSON parsing with TensorFlow if you wrap the parsing functions with tf.py_func:

    import json
    import numpy as np
    import tensorflow as tf

    def get_bbox(raw_json):
        # The input is decoded as UTF-8 bytes before being parsed as JSON
        obj = json.loads(raw_json.decode('utf-8'))
        bbox = obj['bounding_box']
        return np.array([bbox['x'], bbox['y'], bbox['height'], bbox['width']], dtype='f')

    def get_multiple_bboxes(raw_jsons):
        # The outer list corresponds to the single-element output list [tf.float32]
        return [[get_bbox(x) for x in raw_jsons]]

    raw = tf.placeholder(tf.string, [None])
    [parsed] = tf.py_func(get_multiple_bboxes, [raw], [tf.float32])

Note that tf.py_func returns a list of tensors rather than a single tensor, which is why the result is unpacked with [parsed] on the left-hand side. Otherwise, parsed would have the shape [1, None, 4] rather than the desired shape [None, 4] (where None is the batch size).

Using your data, you get the following results:

    json_string = """{
      "bounding_box": { "y": 98.5, "x": 94.0, "height": 197, "width": 188 },
      "rotation": { "yaw": -27.97019577026367, "roll": 2.206029415130615, "pitch": 0.0 },
      "confidence": 3.053506851196289,
      "landmarks": {
        "1": { "y": 180.87722778320312, "x": 124.47326660156205 },
        "0": { "y": 178.60653686523438, "x": 183.41931152343795 },
        "2": { "y": 224.5936889648438, "x": 141.62365722656205 }
      }
    }"""
    my_data = np.array([json_string, json_string, json_string])

    init_op = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_op)
        print(sess.run(parsed, feed_dict={raw: my_data}))
        print(sess.run(tf.shape(parsed), feed_dict={raw: my_data}))

Output:

    [[ 94. 98.5 197. 188. ]
     [ 94. 98.5 197. 188. ]
     [ 94. 98.5 197. 188. ]]
    [3 4]
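
The parsing step itself does not depend on TensorFlow, so it can be checked in isolation before it is wrapped in tf.py_func. The following is only a minimal sketch using the standard library; the names extract_bbox and extract_bboxes are hypothetical and not part of the answer above, and the sample string is a shortened version of the JSON from the question.

```python
import json

def extract_bbox(raw_json):
    """Return [x, y, height, width] as floats from one JSON document.

    Hypothetical helper: accepts either bytes or str.
    """
    if isinstance(raw_json, bytes):
        raw_json = raw_json.decode("utf-8")
    bbox = json.loads(raw_json)["bounding_box"]
    # Fixed key order, so every row has the same column layout
    return [float(bbox[key]) for key in ("x", "y", "height", "width")]

def extract_bboxes(raw_jsons):
    """Parse a sequence of JSON documents into a list of bounding-box rows."""
    return [extract_bbox(item) for item in raw_jsons]

# Shortened sample: only two of the fields from the question are kept
sample = ('{"bounding_box": {"y": 98.5, "x": 94.0, "height": 197, "width": 188}, '
          '"confidence": 3.053506851196289}')

boxes = extract_bboxes([sample, sample.encode("utf-8")])
print(boxes)
```

Each row uses the same x, y, height, width order as get_bbox above, so the result could be turned into a float32 array of shape [N, 4] if needed.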
+4

This sidesteps the problem rather than solving it, but you could preprocess your data with a command-line tool such as jq (https://stedolan.imtqy.com/jq/tutorial/) to convert it into a flat format such as CSV. That may be more efficient.
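
As an illustration of the same preprocessing idea, here is a rough sketch in Python instead of jq (a substitution on my part, not what the answer proposes). It writes one CSV row per JSON document. The function name write_bboxes_csv and the column order are assumptions made for this example; in practice the documents would be read from your files and the output written to a real file rather than an in-memory buffer.

```python
import csv
import io
import json

# Assumed column order for the flat output format
FIELDS = ("x", "y", "height", "width")

def write_bboxes_csv(json_documents, out_file):
    """Write a header plus one bounding-box row per JSON document.

    Hypothetical helper: json_documents is an iterable of JSON strings,
    out_file is any writable text file object.
    """
    writer = csv.writer(out_file)
    writer.writerow(FIELDS)
    for document in json_documents:
        bbox = json.loads(document)["bounding_box"]
        writer.writerow([bbox[name] for name in FIELDS])

# Shortened version of the JSON from the question
sample = ('{"bounding_box": {"y": 98.5, "x": 94.0, "height": 197, "width": 188}, '
          '"confidence": 3.053506851196289}')

buffer = io.StringIO()
write_bboxes_csv([sample], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The resulting file contains only numeric columns, which is generally easier to feed into an input pipeline than nested JSON.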

-1