How to save and restore a partitioned variable in TensorFlow

I have a large matrix.

I use a partitioner to create this variable so that it is split into a number of shards:

softmax_w = tf.get_variable("softmax_w", [hps.vocab_size, hps.projected_size], partitioner=tf.fixed_size_partitioner(hps.num_shards, 0)) 

Creation log:

 model/softmax_w/part_0:0 (99184, 512) /cpu:0
 model/softmax_w/part_1:0 (99184, 512) /cpu:0
 model/softmax_w/part_2:0 (99184, 512) /cpu:0
 model/softmax_w/part_3:0 (99184, 512) /cpu:0
 model/softmax_w/part_4:0 (99184, 512) /cpu:0
 model/softmax_w/part_5:0 (99184, 512) /cpu:0
 model/softmax_w/part_6:0 (99183, 512) /cpu:0
 model/softmax_w/part_7:0 (99183, 512) /cpu:0
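
For reference, a minimal standalone sketch of the same setup (TF 1.x API; the sizes below are placeholders inferred from the shard shapes in the log, not my actual hps values) that prints one line per shard, like the log above:

    import tensorflow as tf

    # Sketch only (TF 1.x API). The sizes are placeholders inferred from the
    # shard shapes in the log above (6 * 99184 + 2 * 99183 = 793470 rows),
    # not the actual hps values.
    vocab_size, projected_size, num_shards = 793470, 512, 8

    with tf.variable_scope("model"):
        softmax_w = tf.get_variable(
            "softmax_w", [vocab_size, projected_size],
            partitioner=tf.fixed_size_partitioner(num_shards, axis=0))

    # The partitioned variable is backed by one variable per shard; this
    # prints a line per part, matching the creation log.
    for part in tf.global_variables():
        print(part.name, part.get_shape())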

Training works fine, but when I try to restore the model I get this error:

 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_7 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_6 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_5 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_4 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_3 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_2 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_1 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_0 not found in checkpoint
 W tensorflow/core/framework/op_kernel.cc:975] Not found: Key model/softmax_w/part_7 not found in checkpoint

It looks like TensorFlow did not save the variable as separate parts: the checkpoint contains only a single softmax_w key, no longer a partitioned variable.
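
The keys actually written to a checkpoint can be listed with tf.train.NewCheckpointReader, which is how the missing part_* entries can be confirmed (the path below is a placeholder):

    import tensorflow as tf

    # Placeholder path; point this at the real checkpoint prefix.
    ckpt_path = "/path/to/model.ckpt"

    reader = tf.train.NewCheckpointReader(ckpt_path)
    # With the bug present, this shows a single "model/softmax_w" entry
    # instead of "model/softmax_w/part_0" ... "model/softmax_w/part_7".
    for key, shape in sorted(reader.get_variable_to_shape_map().items()):
        print(key, shape)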

python deep-learning machine-learning tensorflow
1 answer

This happened in TensorFlow 0.12 and does not happen in 1.3 (the latest version as of October 2017). The same author filed a GitHub issue about it, which has since been fixed. So if you see this error, just upgrade TensorFlow.
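
If you want to verify the fix yourself, here is a rough, self-contained round-trip sketch (TF >= 1.3; the shapes and checkpoint path are arbitrary placeholders, not the asker's values):

    import tensorflow as tf

    # Build a partitioned variable the same way as in the question,
    # but with small placeholder sizes.
    def build():
        with tf.variable_scope("model"):
            return tf.get_variable(
                "softmax_w", [1000, 64],
                partitioner=tf.fixed_size_partitioner(4, axis=0))

    ckpt = "/tmp/partitioned_softmax_w.ckpt"

    # Save.
    tf.reset_default_graph()
    build()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt)

    # Restore into a freshly built graph that uses the same partitioner.
    tf.reset_default_graph()
    build()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, ckpt)
        # On 1.3+ this restores all part_* shards without the
        # "not found in checkpoint" warnings.
        print([v.name for v in tf.global_variables()])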

