Tensorflow cannot restore vocabulary during assessment process

Question

I am new to TensorFlow and neural networks. I started a project that detects errors in Persian texts. I used the code at this address and developed my own code here. Please check the code there, because I cannot post all of it here.

What I want to do is give a few Persian sentences to the model for training, and then see if the model can detect the wrong sentences. The model works great with English data, but when I use it for Persian data, I run into this problem.

The code is too long to post here, so I will point to the part that I think might cause the problem. I used these lines in train.py, which work fine and store the dictionaries:

 x_text, y = data_helpers.load_data_labels(datasets)
 # Build vocabulary
 max_document_length = max([len(x.split(" ")) for x in x_text])
 vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
 x = np.array(list(vocab_processor.fit_transform(x_text)))

However, after training, when I run this code in eval.py:

 vocab_path = os.path.join(FLAGS.checkpoint_dir, "..", "vocab")
 vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
 x_test = np.array(list(vocab_processor.transform(x_raw)))

this error occurs:

     vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
   File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\contrib\learn\python\learn\preprocessing\text.py", line 226, in restore
     return pickle.loads(f.read())
   File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 118, in read
     self._preread_check()
   File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 78, in _preread_check
     compat.as_bytes(self.__name), 1024 * 512, status)
   File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\contextlib.py", line 66, in __exit__
     next(self.gen)
   File "C:\WinPython-64bit-3.5.2.3Qt5\python-3.5.2.amd64\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
     pywrap_tensorflow.TF_GetCode(status))
 tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ..\vocab : The system cannot find the file specified.

I think the problem is that it cannot read the dictionary stored after training, because the data is in Unicode rather than English. Can anyone help me?

+7

python tensorflow python-unicode

Masoud masoumi moghadam Dec 03 '17 at 10:49

2 answers

Have you tried adding this to the top of your file?

 # -*- coding: utf-8 -*-

+3

Myles hollowed Dec 05 '17 at 10:53

Masoud masoumi moghadam · Accepted Answer · 2017-12-10T09:13:55+0000

The problem occurs because the path to the vocab file is incorrect. In train.py, after line 144, which sets out_dir, I added the following:

 # Record the training output directory so eval.py can find it later
 with open('model_dir.txt', 'w') as f:
     f.write(out_dir)

After training, the path of the model's output directory is stored in a file named model_dir.txt.

Then in eval.py I added the following:

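 # Read the recorded directory back and build the vocab path from it
 with open('model_dir.txt') as f:
     model_dir = f.readline().strip()
 vocab_path = os.path.join(model_dir, "vocab")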

Now the path is set correctly, and the code works without problems.
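The error message shows the path as ..\vocab, which suggests that FLAGS.checkpoint_dir was empty, so the relative path was resolved against the current working directory. As a minimal sketch of the same fix, the two snippets above could be wrapped in small helpers that store an absolute path and fail with a clear message when the vocab file is missing. The function names save_model_dir and load_vocab_path are hypothetical and not part of the original project, and the sketch assumes the vocabulary is saved as a file named vocab directly inside out_dir.

```python
import os

# File name taken from the answer above; the helper names are hypothetical.
MODEL_DIR_FILE = "model_dir.txt"


def save_model_dir(out_dir, record_path=MODEL_DIR_FILE):
    """Call from train.py: record the absolute training output directory."""
    with open(record_path, "w", encoding="utf-8") as f:
        f.write(os.path.abspath(out_dir))


def load_vocab_path(record_path=MODEL_DIR_FILE):
    """Call from eval.py: return the vocab path inside the recorded directory."""
    with open(record_path, encoding="utf-8") as f:
        model_dir = f.readline().strip()
    # Assumption: the vocabulary was saved as "<out_dir>/vocab" during training.
    vocab_path = os.path.join(model_dir, "vocab")
    if not os.path.isfile(vocab_path):
        raise FileNotFoundError("No vocab file found at " + vocab_path)
    return vocab_path
```

In eval.py, the result of load_vocab_path() would then be passed to learn.preprocessing.VocabularyProcessor.restore. Storing the absolute path means the lookup no longer depends on the directory eval.py is started from.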
