First question: if you train for longer, do you get better accuracy? You may simply not have trained enough.
Also, what is the accuracy on the training data, and what is the accuracy on the test data? If both are high and roughly equal, the model is not overfitting yet, so you can train longer or use a more complex model. If training accuracy is noticeably better than test accuracy, you are essentially at the limit of your data. (That is, brute-force scaling of the model will not help, but smarter improvements might, e.g. using convolutional networks.)
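The train-versus-test comparison can be sketched roughly as follows. The function name `diagnose`, the two labels it returns, and the 0.05 gap tolerance are all assumptions made for illustration; a sensible tolerance depends on your dataset and how noisy your accuracy estimates are.

```python
def diagnose(train_acc: float, test_acc: float, gap_tol: float = 0.05) -> str:
    """Rough reading of a train/test accuracy pair.

    gap_tol is a hypothetical threshold, not a standard value.
    """
    gap = train_acc - test_acc
    if gap > gap_tol:
        # Training clearly beats testing: a bigger model alone is unlikely
        # to help; more data or a better-suited architecture might.
        return "data-limited"
    # No meaningful gap: the model is not overfitting yet, so training
    # longer or adding capacity is worth trying.
    return "room-to-grow"


print(diagnose(0.98, 0.71))  # data-limited
print(diagnose(0.80, 0.79))  # room-to-grow
```

Running this check after every few epochs, rather than once at the end, also answers the first question: if the label stays "room-to-grow" and accuracy is still rising, you probably have not trained long enough.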
Finally, complex and noisy data can require a lot of examples before reasonable classification is possible. So you may need many, many images.
As for deep stacked autoencoders: as I understand it, they are an unsupervised method, which is not directly suited to classification.