Encog Neural Net - How to structure training data?

In every example that I have seen for Encog neural networks, XOR or something very simple was involved. I have about 10,000 sentences, and each word in a sentence has some type of tag. The input layer must accept 2 inputs: the previous word and the current word. If there is no previous word, then the first input is not activated at all. I need to go through every sentence like that. Each word depends on the previous word, so I cannot just have an array similar to the XOR example. Also, I really don't want to load all the words from 10,000 sentences into an array; I would prefer to scan one sentence at a time and, as soon as I reach EOF, start again from the beginning.

How can I do it? I am not very comfortable with Encog, because all the examples that I saw were either XOR or extremely complex.

There are 2 inputs. Each input consists of 30 neurons, and the probability that the word is a specific tag is used as the input value, so most neurons get 0 and the others get probabilities such as .5, .3 and .2. When I say “not activated”, I mean that all 30 neurons are set to 0. The output layer represents all possible tags, so it has 30 neurons. Whichever output neuron has the largest value is the tag that gets selected.
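For example, picking the tag from the 30 outputs would look roughly like this (the tags array and the helper name are just an illustration of what I mean):

    // Returns the tag whose output neuron has the largest value.
    // 'tags' holds the 30 possible tags in the same order as the output neurons.
    static String pickTag(String[] tags, double[] output) {
        int best = 0;
        for (int i = 1; i < output.length; i++) {
            if (output[i] > output[best]) {
                best = i;
            }
        }
        return tags[best];
    }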

I’m not sure how to loop through all 10,000 sentences and, for every word in every sentence, feed the previous/current pair as the input and activate it; nothing like that appears in any of the Encog demo files I have seen.

It seems that the networks are trained using a single array containing all the training data, looping until the network is trained. I would like to train the network with many different arrays (one array per sentence) and then iterate over them again.

This format obviously does not work for what I am doing:

    do {
        train.iteration();
        System.out.println("Epoch #" + epoch + " Error:" + train.getError());
        epoch++;
    } while (train.getError() > 0.01);
2 answers

So I'm not sure how to tell you this, but that’s not how a neural network works. You cannot just use a word as an input, and you cannot just “not activate” an input. At the most basic level, this is what you need to run a neural network on a problem:

  • A fixed-length input vector (whatever you feed in must be represented numerically, with a fixed length; each entry in the vector is a single number)
  • A set of labels (each input vector must correspond to exactly one fixed-length output vector)

After you have these two, the neural network classifies each example, then adjusts its own weights to better match the labels.
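As a rough sketch of what those two requirements look like in Encog (the 60-input / 30-output sizes come from the question; the hidden layer size, the error threshold and the way inputs/ideals get filled are placeholders, not a recommendation):

    import org.encog.Encog;
    import org.encog.engine.network.activation.ActivationSigmoid;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;
    import org.encog.neural.networks.BasicNetwork;
    import org.encog.neural.networks.layers.BasicLayer;
    import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

    // inputs: one fixed-length row (60 numbers) per training example.
    // ideals: the matching 30-entry label vector for each example.
    static void trainTagger(double[][] inputs, double[][] ideals) {
        MLDataSet trainingSet = new BasicMLDataSet(inputs, ideals);

        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 60));                     // input layer
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 40));  // hidden layer, size arbitrary
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 30)); // one output neuron per tag
        network.getStructure().finalizeStructure();
        network.reset();

        ResilientPropagation train = new ResilientPropagation(network, trainingSet);
        int epoch = 1;
        do {
            train.iteration();
            System.out.println("Epoch #" + epoch + " Error:" + train.getError());
            epoch++;
        } while (train.getError() > 0.01);
        train.finishTraining();
        Encog.getInstance().shutdown();
    }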

If you want to work with words and deep learning, you should map your words to an existing vector representation (I would highly recommend GloVe, but word2vec is also decent), and then learn on top of that representation.


After a deeper look at what you are trying to do here, I think the point is that you are dealing with 60 inputs, not two. These inputs are a concatenation of the existing tag predictions for both words (if there is no previous word, the first 30 entries are 0). You have to take care of that mapping yourself (it should be very simple), and then just treat it as trying to predict 30 numbers from 60 numbers.
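Such a mapping might look roughly like this (the method and parameter names are mine, purely illustrative):

    // Builds the 60-entry input vector: the first 30 entries are the previous
    // word's tag probabilities (left at zero when there is no previous word),
    // the next 30 are the current word's tag probabilities.
    static double[] buildInput(double[] prevProbs, double[] currentProbs) {
        double[] input = new double[60];
        if (prevProbs != null) {
            System.arraycopy(prevProbs, 0, input, 0, 30);
        }
        System.arraycopy(currentProbs, 0, input, 30, 30);
        return input;
    }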

I feel obligated to tell you that, with the problem set up as you have it, you will see terrible performance. With such a sparse (mostly zero) vector and such a small data set, deep learning methods will perform VERY poorly compared to other methods. You are better off using GloVe + an SVM, or a random forest model, on your existing data.


You can use other implementations of MLDataSet besides BasicMLDataSet.

I ran into a similar problem with windows of DNA sequences. Creating an array of all windows would not be scalable.

Instead, I implemented my own VersatileDataSource and wrapped it in VersatileMLDataSet .

VersatileDataSource has just a few methods to implement:

    public interface VersatileDataSource {
        String[] readLine();
        void rewind();
        int columnIndex(String name);
    }

On each readLine() call you can return the entries for the previous/current word pair and advance the position to the next word.
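As a sketch of the idea (untested, and assuming one whitespace-separated record per line in a text file; the class name and file layout are mine, not Encog's):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    import org.encog.ml.data.versatile.sources.VersatileDataSource;

    // Streams records one line at a time instead of holding everything in memory.
    public class SentenceDataSource implements VersatileDataSource {

        private final String path;
        private BufferedReader reader;

        public SentenceDataSource(String path) {
            this.path = path;
            rewind();
        }

        @Override
        public String[] readLine() {
            try {
                String line = reader.readLine();
                if (line == null) {
                    return null; // end of data; the wrapping dataset stops reading here
                }
                // Assumed layout: "previousWord currentWord tag" on each line.
                return line.trim().split("\\s+");
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }

        @Override
        public void rewind() {
            try {
                if (reader != null) {
                    reader.close();
                }
                reader = new BufferedReader(new FileReader(path));
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }

        @Override
        public int columnIndex(String name) {
            return -1; // optional; only needed if you refer to columns by name
        }
    }

You can then wrap such a source in a VersatileMLDataSet, define the columns on it, and let Encog pull the data through it instead of building one big array.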

