Torch, why does my artificial neural network always predict zeros?

I work with Torch7 on Linux CentOS 7. I am trying to apply an artificial neural network (ANN) to my dataset to solve a binary classification problem. I use a simple multi-layer perceptron.

I use the following Torch packages: optim, torch.

The problem is that my perceptron always predicts zero (every element is classified as 0), and I cannot understand why.

Here is my dataset ("dataset_file.csv"). There are 34 features and 1 target label (the last column, which can be 0 or 1):

    0.55,1,0,1,0,0.29,1,0,1,0.46,1,1,0,0.67,1,0.37,0.41,1,0.08,0.47,0.23,0.13,0.82,0.46,0.25,0.04,0,0,0.52,1,0,0,0,0.33,0
    0.65,1,0,1,0,0.64,1,0,0,0.02,1,1,1,1,0,0.52,0.32,0,0.18,0.67,0.47,0.2,0.64,0.38,0.23,1,0.24,0.18,0.04,1,1,1,1,0.41,0
    0.34,1,0.13,1,0,0.33,0,0.5,0,0.02,0,0,0,0.67,1,0.25,0.55,1,0.06,0.23,0.18,0.15,0.82,0.51,0.22,0.06,0,0,0.6,1,0,0,0,0.42,1
    0.46,1,0,1,0,0.14,1,0,0,0.06,0,1,1,0,1,0.37,0.64,1,0.14,0.22,0.17,0.1,0.94,0.65,0.22,0.06,0.75,0.64,0.3,1,1,0,0,0.2,0
    0.55,1,0,1,0,0.14,1,0.5,1,0.03,1,1,0,1,1,0.42,0.18,0,0.16,0.55,0.16,0.12,0.73,0.55,0.2,0.03,0.54,0.44,0.35,1,1,0,0,0.11,0
    0.67,1,0,1,0,0.71,0,0.5,0,0.46,1,0,1,1,1,0.74,0.41,0,0.1,0.6,0.15,0.15,0.69,0.42,0.27,0.04,0.61,0.48,0.54,1,1,0,0,0.22,1
    0.52,1,0,1,0,0.21,1,0.5,0,0.01,1,1,1,0.67,0,0.27,0.64,0,0.08,0.34,0.14,0.21,0.85,0.51,0.2,0.05,0.51,0.36,0.36,1,1,0,0,0.23,0
    0.58,1,0.38,1,0,0.36,1,0.5,1,0.02,0,1,0,1,1,0.38,0.55,1,0.13,0.57,0.21,0.23,0.73,0.52,0.19,0.03,0,0,0.6,1,0,0,0,0.42,0
    0.66,1,0,1,0,0.07,1,0,0,0.06,1,0,0,1,1,0.24,0.32,1,0.06,0.45,0.16,0.13,0.92,0.57,0.27,0.06,0,0,0.55,1,0,0,0,0.33,0
    0.39,1,0.5,1,0,0.29,1,0,1,0.06,0,0,0,1,1,0.34,0.45,1,0.1,0.31,0.12,0.16,0.81,0.54,0.21,0.02,0.51,0.27,0.5,1,1,0,0,0.32,0
    0.26,0,0,1,0,0.21,1,0,0,0.02,1,1,1,0,1,0.17,0.36,0,0.19,0.41,0.24,0.26,0.73,0.55,0.22,0.41,0.46,0.43,0.42,1,1,0,0,0.52,0
    0.96,0,0.63,1,0,0.86,1,0,1,0.06,1,1,1,0,0,0.41,0.5,1,0.08,0.64,0.23,0.19,0.69,0.45,0.23,0.06,0.72,0.43,0.45,1,1,0,0,0.53,0
    0.58,0,0.25,1,0,0.29,1,0,1,0.04,1,0,0,0,1,0.4,0.27,1,0.09,0.65,0.21,0.16,0.8,0.57,0.24,0.02,0.51,0.28,0.5,1,1,1,0,0.63,0
    0.6,1,0.5,1,0,0.73,1,0.5,1,0.04,1,0,1,0,1,0.85,0.64,1,0.16,0.71,0.24,0.21,0.72,0.45,0.23,0.1,0.63,0.57,0.13,1,1,1,1,0.65,0
    0.72,1,0.25,1,0,0.29,1,0,0,0.06,1,0,0,1,1,0.31,0.41,1,0.17,0.78,0.24,0.16,0.75,0.54,0.27,0.09,0.78,0.68,0.19,1,1,1,1,0.75,0
    0.56,0,0.13,1,0,0.4,1,0,0,0.23,1,0,0,1,1,0.42,1,0,0.03,0.14,0.15,0.13,0.85,0.52,0.24,0.06,0,0,0.56,1,0,0,0,0.33,0
    0.67,0,0,1,0,0.57,1,0,1,0.02,0,0,0,1,1,0.38,0.36,0,0.08,0.12,0.11,0.14,0.8,0.49,0.22,0.05,0,0,0.6,1,0,0,0,0.22,0
    0.67,0,0,1,0,0.36,1,0,0,0.23,0,1,0,0,0,0.32,0.73,0,0.25,0.86,0.26,0.16,0.62,0.35,0.25,0.02,0.46,0.43,0.45,1,1,1,0,0.76,0
    0.55,1,0.5,1,0,0.57,0,0.5,1,0.12,1,1,1,0.67,1,1,0.45,0,0.19,0.94,0.19,0.22,0.88,0.41,0.35,0.15,0.47,0.4,0.05,1,1,1,0,0.56,1
    0.61,0,0,1,0,0.43,1,0.5,1,0.04,1,0,1,0,0,0.68,0.23,1,0.12,0.68,0.25,0.29,0.68,0.45,0.29,0.13,0.58,0.41,0.11,1,1,1,1,0.74,0
    0.59,1,0.25,1,0,0.23,1,0.5,0,0.02,1,1,1,0,1,0.57,0.41,1,0.08,0.05,0.16,0.15,0.87,0.61,0.25,0.04,0.67,0.61,0.45,1,1,0,0,0.65,0
    0.74,1,0.5,1,0,0.26,1,0,1,0.01,1,1,1,1,0,0.76,0.36,0,0.14,0.72,0.12,0.13,0.68,0.54,0.54,0.17,0.93,0.82,0.12,1,1,0,0,0.18,0
    0.64,0,0,1,0,0.29,0,0,1,0.15,0,0,1,0,1,0.33,0.45,0,0.11,0.55,0.25,0.15,0.75,0.54,0.27,0.05,0.61,0.64,0.43,1,1,0,0,0.23,1
    0.36,0,0.38,1,0,0.14,0,0.5,0,0.02,1,1,1,0.33,1,0.18,0.36,0,0.17,0.79,0.21,0.12,0.75,0.54,0.24,0.05,0,0,0.52,1,0,0,0,0.44,1
    0.52,0,0.75,1,0,0.14,1,0.5,0,0.04,1,1,1,0,1,0.36,0.68,1,0.08,0.34,0.12,0.13,0.79,0.59,0.22,0.02,0,0,0.5,1,0,0,0,0.23,0
    0.59,0,0.75,1,0,0.29,1,0,0,0.06,1,1,0,0,1,0.24,0.27,0,0.12,0.7,0.2,0.16,0.74,0.45,0.26,0.02,0.46,0.32,0.52,1,0,0,0,0.33,0
    0.72,1,0.38,1,0,0.43,0,0.5,0,0.06,1,0,1,0.67,1,0.53,0.32,0,0.2,0.68,0.16,0.13,0.79,0.45,0.25,0.09,0.61,0.57,0.15,1,1,0,0,0.22,1

And here is my Lua torch code:

    -- add comma to separate thousands
    function comma_value(amount)
      local formatted = amount
      while true do
        formatted, k = string.gsub(formatted, "^(-?%d+)(%d%d%d)", '%1,%2')
        if (k==0) then
          break
        end
      end
      return formatted
    end

    -- function that computes the confusion matrix
    function confusion_matrix(predictionTestVect, truthVect, threshold, printValues)

      local tp = 0
      local tn = 0
      local fp = 0
      local fn = 0
      local MatthewsCC = -2
      local accuracy = -2
      local arrayFPindices = {}
      local arrayFPvalues = {}
      local arrayTPvalues = {}
      local areaRoc = 0

      local fpRateVett = {}
      local tpRateVett = {}
      local precisionVett = {}
      local recallVett = {}

      for i=1,#predictionTestVect do

        if printValues == true then
          io.write("predictionTestVect["..i.."] = ".. round(predictionTestVect[i],4).."\ttruthVect["..i.."] = "..truthVect[i].." ");
          io.flush();
        end

        if predictionTestVect[i] >= threshold and truthVect[i] >= threshold then
          tp = tp + 1
          arrayTPvalues[#arrayTPvalues+1] = predictionTestVect[i]
          if printValues == true then print(" TP ") end
        elseif predictionTestVect[i] < threshold and truthVect[i] >= threshold then
          fn = fn + 1
          if printValues == true then print(" FN ") end
        elseif predictionTestVect[i] >= threshold and truthVect[i] < threshold then
          fp = fp + 1
          if printValues == true then print(" FP ") end
          arrayFPindices[#arrayFPindices+1] = i;
          arrayFPvalues[#arrayFPvalues+1] = predictionTestVect[i]
        elseif predictionTestVect[i] < threshold and truthVect[i] < threshold then
          tn = tn + 1
          if printValues == true then print(" TN ") end
        end
      end

      print("TOTAL:")
      print(" FN = "..comma_value(fn).." / "..comma_value(tonumber(fn+tp)).."\t (truth == 1) & (prediction < threshold)");
      print(" TP = "..comma_value(tp).." / "..comma_value(tonumber(fn+tp)).."\t (truth == 1) & (prediction >= threshold)\n");
      print(" FP = "..comma_value(fp).." / "..comma_value(tonumber(fp+tn)).."\t (truth == 0) & (prediction >= threshold)");
      print(" TN = "..comma_value(tn).." / "..comma_value(tonumber(fp+tn)).."\t (truth == 0) & (prediction < threshold)\n");

      local continueLabel = true

      if continueLabel then
        upperMCC = (tp*tn) - (fp*fn)
        innerSquare = (tp+fp)*(tp+fn)*(tn+fp)*(tn+fn)
        lowerMCC = math.sqrt(innerSquare)

        MatthewsCC = -2
        if lowerMCC>0 then MatthewsCC = upperMCC/lowerMCC end
        local signedMCC = MatthewsCC
        print("signedMCC = "..signedMCC)

        if MatthewsCC > -2 then
          print("\n::::\tMatthews correlation coefficient = "..signedMCC.."\t::::\n");
        else
          print("Matthews correlation coefficient = NOT computable");
        end

        accuracy = (tp + tn)/(tp + tn +fn + fp)
        print("accuracy = "..round(accuracy,2).. " = (tp + tn) / (tp + tn +fn + fp) \t \t [worst = -1, best = +1]");

        local f1_score = -2
        if (tp+fp+fn)>0 then
          f1_score = (2*tp) / (2*tp+fp+fn)
          print("f1_score = "..round(f1_score,2).." = (2*tp) / (2*tp+fp+fn) \t [worst = 0, best = 1]");
        else
          print("f1_score CANNOT be computed because (tp+fp+fn)==0")
        end

        local totalRate = 0
        if MatthewsCC > -2 and f1_score > -2 then
          totalRate = MatthewsCC + accuracy + f1_score
          print("total rate = "..round(totalRate,2).." in [-1, +3] that is "..round((totalRate+1)*100/4,2).."% of possible correctness");
        end

        local numberOfPredictedOnes = tp + fp;
        print("numberOfPredictedOnes = (TP + FP) = "..comma_value(numberOfPredictedOnes).." = "..round(numberOfPredictedOnes*100/(tp + tn + fn + fp),2).."%");

        io.write("\nDiagnosis: ");
        if (fn >= tp and (fn+tp)>0) then print("too many FN false negatives"); end
        if (fp >= tn and (fp+tn)>0) then print("too many FP false positives"); end

        if (tn > (10*fp) and tp > (10*fn)) then print("Excellent ! ! !");
        elseif (tn > (5*fp) and tp > (5*fn)) then print("Very good ! !");
        elseif (tn > (2*fp) and tp > (2*fn)) then print("Good !");
        elseif (tn >= fp and tp >= fn) then print("Alright");
        else print("Baaaad");
        end
      end

      return {accuracy, arrayFPindices, arrayFPvalues, MatthewsCC};
    end

    -- Permutations
    -- tab = {1,2,3,4,5,6,7,8,9,10}
    -- permute(tab, 10, 10)
    function permute(tab, n, count)
      n = n or #tab
      for i = 1, count or n do
        local j = math.random(i, n)
        tab[i], tab[j] = tab[j], tab[i]
      end
      return tab
    end

    -- round a real value
    function round(num, idp)
      local mult = 10^(idp or 0)
      return math.floor(num * mult + 0.5) / mult
    end

    -- ##############################3

    local profile_vett = {}
    local csv = require("csv")

    local fileName = "dataset_file.csv"

    print("Readin' "..tostring(fileName))
    local f = csv.open(fileName)
    local column_names = {}

    local j = 0
    for fields in f:lines() do

      if j>0 then
        profile_vett[j] = {}
        for i, v in ipairs(fields) do
          profile_vett[j][i] = tonumber(v);
        end
        j = j + 1
      else
        for i, v in ipairs(fields) do
          column_names[i] = v
        end
        j = j + 1
      end
    end

    OPTIM_PACKAGE = true
    local output_number = 1
    THRESHOLD = 0.5 -- ORIGINAL
    DROPOUT_FLAG = false
    MOMENTUM = false
    MOMENTUM_ALPHA = 0.5
    MAX_MSE = 4
    LEARN_RATE = 0.001
    ITERATIONS = 100

    local hidden_units = 2000
    local hidden_layers = 1

    local hiddenUnitVect = {2000, 4000, 6000, 8000, 10000}
    -- local hiddenLayerVect = {1,2,3,4,5}
    local hiddenLayerVect = {1}

    local profile_vett_data = {}
    local label_vett = {}

    for i=1,#profile_vett do
      profile_vett_data[i] = {}

      for j=1,#(profile_vett[1]) do
        if j<#(profile_vett[1]) then
          profile_vett_data[i][j] = profile_vett[i][j]
        else
          label_vett[i] = profile_vett[i][j]
        end
      end
    end

    print("Number of value profiles (rows) = "..#profile_vett_data);
    print("Number features (columns) = "..#(profile_vett_data[1]));
    print("Number of targets (rows) = "..#label_vett);

    local table_row_outcome = label_vett
    local table_rows_vett = profile_vett

    -- ########################################################

    -- START

    local indexVect = {};
    for i=1, #table_rows_vett do indexVect[i] = i; end
    permutedIndexVect = permute(indexVect, #indexVect, #indexVect);

    TEST_SET_PERC = 20
    local test_set_size = round((TEST_SET_PERC*#table_rows_vett)/100)

    print("training_set_size = "..(#table_rows_vett-test_set_size).." elements");
    print("test_set_size = "..test_set_size.." elements\n");

    local train_table_row_profile = {}
    local test_table_row_profile = {}
    local original_test_indexes = {}

    for i=1,#table_rows_vett do
      if i<=(tonumber(#table_rows_vett)-test_set_size) then
        train_table_row_profile[#train_table_row_profile+1] = {torch.Tensor(table_rows_vett[permutedIndexVect[i]]), torch.Tensor{table_row_outcome[permutedIndexVect[i]]}}
      else
        original_test_indexes[#original_test_indexes+1] = permutedIndexVect[i];
        test_table_row_profile[#test_table_row_profile+1] = {torch.Tensor(table_rows_vett[permutedIndexVect[i]]), torch.Tensor{table_row_outcome[permutedIndexVect[i]]}}
      end
    end

    require 'nn'
    perceptron = nn.Sequential()
    input_number = #table_rows_vett[1]

    perceptron:add(nn.Linear(input_number, hidden_units))
    perceptron:add(nn.Sigmoid())
    if DROPOUT_FLAG==true then perceptron:add(nn.Dropout()) end

    for w=1,hidden_layers do
      perceptron:add(nn.Linear(hidden_units, hidden_units))
      perceptron:add(nn.Sigmoid())
      if DROPOUT_FLAG==true then perceptron:add(nn.Dropout()) end
    end
    perceptron:add(nn.Linear(hidden_units, output_number))

    function train_table_row_profile:size() return #train_table_row_profile end
    function test_table_row_profile:size() return #test_table_row_profile end

    -- OPTIMIZATION LOOPS
    local MCC_vect = {}

    for a=1,#hiddenUnitVect do
      for b=1,#hiddenLayerVect do

        local hidden_units = hiddenUnitVect[a]
        local hidden_layers = hiddenLayerVect[b]
        print("hidden_units = "..hidden_units.."\t output_number = "..output_number.." hidden_layers = "..hidden_layers)

        local criterion = nn.MSECriterion()
        local lossSum = 0
        local error_progress = 0

        require 'optim'
        local params, gradParams = perceptron:getParameters()
        local optimState = nil

        if MOMENTUM==true then
          optimState = {learningRate = LEARN_RATE}
        else
          optimState = {learningRate = LEARN_RATE, momentum = MOMENTUM_ALPHA }
        end

        local total_runs = ITERATIONS*#train_table_row_profile

        local loopIterations = 1
        for epoch=1,ITERATIONS do
          for k=1,#train_table_row_profile do

            -- Function feval
            local function feval(params)
              gradParams:zero()

              local thisProfile = train_table_row_profile[k][1]
              local thisLabel = train_table_row_profile[k][2]

              local thisPrediction = perceptron:forward(thisProfile)
              local loss = criterion:forward(thisPrediction, thisLabel)

              -- print("thisPrediction = "..round(thisPrediction[1],2).." thisLabel = "..thisLabel[1])

              lossSum = lossSum + loss
              error_progress = lossSum*100 / (loopIterations*MAX_MSE)

              if ((loopIterations*100/total_runs)*10)%10==0 then
                io.write("completion: ", round((loopIterations*100/total_runs),2).."%" )
                io.write(" (epoch="..epoch..")(element="..k..") loss = "..round(loss,2).." ")
                io.write("\terror progress = "..round(error_progress,5).."%\n")
              end

              local dloss_doutput = criterion:backward(thisPrediction, thisLabel)

              perceptron:backward(thisProfile, dloss_doutput)

              return loss,gradParams
            end
            optim.sgd(feval, params, optimState)
            loopIterations = loopIterations+1
          end
        end

        local correctPredictions = 0
        local atleastOneTrue = false
        local atleastOneFalse = false
        local predictionTestVect = {}
        local truthVect = {}

        for i=1,#test_table_row_profile do
          local current_label = test_table_row_profile[i][2][1]
          local prediction = perceptron:forward(test_table_row_profile[i][1])[1]

          predictionTestVect[i] = prediction
          truthVect[i] = current_label

          local labelResult = false

          if current_label >= THRESHOLD and prediction >= THRESHOLD then
            labelResult = true
          elseif current_label < THRESHOLD and prediction < THRESHOLD then
            labelResult = true
          end

          if labelResult==true then correctPredictions = correctPredictions + 1; end

          print("\nCorrect predictions = "..round(correctPredictions*100/#test_table_row_profile,2).."%")

          local printValues = false

          local output_confusion_matrix = confusion_matrix(predictionTestVect, truthVect, THRESHOLD, printValues)
        end
      end
    end

Does anyone have an idea why my script predicts only zeros?

EDIT: I replaced the original dataset with my normalized version, which I use in my script

+7
lua neural-network torch
2 answers

When I run your original code, I sometimes get all zeros predicted and sometimes get perfect performance. This suggests that your original model is very sensitive to the initial parameter values.

If I fix the seed with torch.manualSeed(0) (so the initialization is always the same), I get perfect performance every time. But this is not a general solution.
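As a side note on what a seed controls here: the script shuffles rows with Lua's own `math.random` (inside `permute`), which is a separate generator from Torch's, so `torch.manualSeed` only pins down the Torch side such as weight initialization. A minimal plain-Lua sketch of seeding the shuffle, where `shuffled_indices` is a hypothetical helper and not part of the original script:

```lua
-- Hypothetical helper: shuffle 1..n with the same scheme as permute() in the script.
local function shuffled_indices(n, seed)
  math.randomseed(seed)              -- seeds Lua's generator, not Torch's
  local idx = {}
  for i = 1, n do idx[i] = i end
  for i = 1, n do
    local j = math.random(i, n)
    idx[i], idx[j] = idx[j], idx[i]
  end
  return idx
end

-- Two shuffles with the same seed give the same row order.
local first = shuffled_indices(26, 0)
local second = shuffled_indices(26, 0)
local identical = true
for i = 1, #first do
  if first[i] ~= second[i] then identical = false end
end
print(identical)

-- In the Torch script, the weights would be seeded separately with:
-- torch.manualSeed(0)
```

Seeding both generators makes a single run repeatable, which helps debugging, but it does not make the model any less sensitive to initialization.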

To get a more general improvement, I made the following changes:

  • I reduced the number of hidden units. In the original code you have a single hidden layer of 2000 units, but you only have 34 inputs and 1 output. Often the number of hidden units only needs to be somewhere between the number of inputs and the number of outputs. I reduced it to 50.
  • The labels are imbalanced: only 5/27 (19%) of the labels are ones, so you should really split the train and test sets in a way that preserves the ratio of ones to zeros. For now, I just increased the test set size to 50%.
  • I also increased the learning rate to 0.01, turned on MOMENTUM and increased ITERATIONS to 200.
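The ratio-preserving split mentioned in the second point could look roughly like the following. This is only a sketch in plain Lua: `stratified_split` is a hypothetical helper that is not in the original script, and the example labels are made up for illustration.

```lua
-- Hypothetical helper: split row indices so that each class contributes
-- roughly the same fraction of its rows to the test set.
local function stratified_split(labels, test_fraction)
  -- Group row indices by class label (0 or 1).
  local by_class = {}
  for i, y in ipairs(labels) do
    by_class[y] = by_class[y] or {}
    table.insert(by_class[y], i)
  end

  local train, test = {}, {}
  for _, idx in pairs(by_class) do
    -- Shuffle the rows of this class (Fisher-Yates).
    for i = #idx, 2, -1 do
      local j = math.random(1, i)
      idx[i], idx[j] = idx[j], idx[i]
    end
    -- The first n_test shuffled rows of this class go to the test set.
    local n_test = math.floor(#idx * test_fraction + 0.5)
    for k, row in ipairs(idx) do
      if k <= n_test then
        test[#test + 1] = row
      else
        train[#train + 1] = row
      end
    end
  end
  return train, test
end

-- Illustrative labels: 6 ones followed by 20 zeros, half held out for testing.
local labels = {}
for i = 1, 26 do labels[i] = (i <= 6) and 1 or 0 end
local train, test = stratified_split(labels, 0.5)
print(#train, #test)
```

The returned index lists would replace `permutedIndexVect` when building `train_table_row_profile` and `test_table_row_profile`, so both sets see ones and zeros in about the same proportion.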

When I ran this model 20 times (unseeded), I got "Excellent" performance 19 times. To improve further, you can tune the hyperparameters. You should also consider trying several initializations and using a separate validation set to choose the "best" model (although this will further divide your already very small dataset).
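Choosing among several initializations can be sketched as a small loop. Here `best_of_restarts` and `train_and_score` are hypothetical names: in the Torch script, `train_and_score` would rebuild the perceptron, train it, and return a score (for example the MCC) measured on held-out validation rows.

```lua
-- Hypothetical sketch: train_and_score is supplied by the caller and returns
-- (model, validation_score) for one freshly initialized training run.
local function best_of_restarts(n_restarts, train_and_score)
  local best_model, best_score = nil, -math.huge
  for run = 1, n_restarts do
    local model, score = train_and_score(run)
    if score > best_score then
      best_model, best_score = model, score
    end
  end
  return best_model, best_score
end

-- Stand-in for real training: each run just reports a preset validation score.
local fake_scores = {0.2, 0.9, 0.5}
local model, score = best_of_restarts(3, function(run)
  return "model_" .. run, fake_scores[run]
end)
print(model, score)
```

The final quality estimate should still come from test rows that were not used for this selection.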

    -- add comma to separate thousands
    function comma_value(amount)
      local formatted = amount
      while true do
        formatted, k = string.gsub(formatted, "^(-?%d+)(%d%d%d)", '%1,%2')
        if (k==0) then
          break
        end
      end
      return formatted
    end

    -- function that computes the confusion matrix
    function confusion_matrix(predictionTestVect, truthVect, threshold, printValues)

      local tp = 0
      local tn = 0
      local fp = 0
      local fn = 0
      local MatthewsCC = -2
      local accuracy = -2
      local arrayFPindices = {}
      local arrayFPvalues = {}
      local arrayTPvalues = {}
      local areaRoc = 0

      local fpRateVett = {}
      local tpRateVett = {}
      local precisionVett = {}
      local recallVett = {}

      for i=1,#predictionTestVect do

        if printValues == true then
          io.write("predictionTestVect["..i.."] = ".. round(predictionTestVect[i],4).."\ttruthVect["..i.."] = "..truthVect[i].." ");
          io.flush();
        end

        if predictionTestVect[i] >= threshold and truthVect[i] >= threshold then
          tp = tp + 1
          arrayTPvalues[#arrayTPvalues+1] = predictionTestVect[i]
          if printValues == true then print(" TP ") end
        elseif predictionTestVect[i] < threshold and truthVect[i] >= threshold then
          fn = fn + 1
          if printValues == true then print(" FN ") end
        elseif predictionTestVect[i] >= threshold and truthVect[i] < threshold then
          fp = fp + 1
          if printValues == true then print(" FP ") end
          arrayFPindices[#arrayFPindices+1] = i;
          arrayFPvalues[#arrayFPvalues+1] = predictionTestVect[i]
        elseif predictionTestVect[i] < threshold and truthVect[i] < threshold then
          tn = tn + 1
          if printValues == true then print(" TN ") end
        end
      end

      print("TOTAL:")
      print(" FN = "..comma_value(fn).." / "..comma_value(tonumber(fn+tp)).."\t (truth == 1) & (prediction < threshold)");
      print(" TP = "..comma_value(tp).." / "..comma_value(tonumber(fn+tp)).."\t (truth == 1) & (prediction >= threshold)\n");
      print(" FP = "..comma_value(fp).." / "..comma_value(tonumber(fp+tn)).."\t (truth == 0) & (prediction >= threshold)");
      print(" TN = "..comma_value(tn).." / "..comma_value(tonumber(fp+tn)).."\t (truth == 0) & (prediction < threshold)\n");

      local continueLabel = true

      if continueLabel then
        upperMCC = (tp*tn) - (fp*fn)
        innerSquare = (tp+fp)*(tp+fn)*(tn+fp)*(tn+fn)
        lowerMCC = math.sqrt(innerSquare)

        MatthewsCC = -2
        if lowerMCC>0 then MatthewsCC = upperMCC/lowerMCC end
        local signedMCC = MatthewsCC
        print("signedMCC = "..signedMCC)

        if MatthewsCC > -2 then
          print("\n::::\tMatthews correlation coefficient = "..signedMCC.."\t::::\n");
        else
          print("Matthews correlation coefficient = NOT computable");
        end

        accuracy = (tp + tn)/(tp + tn +fn + fp)
        print("accuracy = "..round(accuracy,2).. " = (tp + tn) / (tp + tn +fn + fp) \t \t [worst = -1, best = +1]");

        local f1_score = -2
        if (tp+fp+fn)>0 then
          f1_score = (2*tp) / (2*tp+fp+fn)
          print("f1_score = "..round(f1_score,2).." = (2*tp) / (2*tp+fp+fn) \t [worst = 0, best = 1]");
        else
          print("f1_score CANNOT be computed because (tp+fp+fn)==0")
        end

        local totalRate = 0
        if MatthewsCC > -2 and f1_score > -2 then
          totalRate = MatthewsCC + accuracy + f1_score
          print("total rate = "..round(totalRate,2).." in [-1, +3] that is "..round((totalRate+1)*100/4,2).."% of possible correctness");
        end

        local numberOfPredictedOnes = tp + fp;
        print("numberOfPredictedOnes = (TP + FP) = "..comma_value(numberOfPredictedOnes).." = "..round(numberOfPredictedOnes*100/(tp + tn + fn + fp),2).."%");

        io.write("\nDiagnosis: ");
        if (fn >= tp and (fn+tp)>0) then print("too many FN false negatives"); end
        if (fp >= tn and (fp+tn)>0) then print("too many FP false positives"); end

        if (tn > (10*fp) and tp > (10*fn)) then print("Excellent ! ! !");
        elseif (tn > (5*fp) and tp > (5*fn)) then print("Very good ! !");
        elseif (tn > (2*fp) and tp > (2*fn)) then print("Good !");
        elseif (tn >= fp and tp >= fn) then print("Alright");
        else print("Baaaad");
        end
      end

      return {accuracy, arrayFPindices, arrayFPvalues, MatthewsCC};
    end

    -- Permutations
    -- tab = {1,2,3,4,5,6,7,8,9,10}
    -- permute(tab, 10, 10)
    function permute(tab, n, count)
      n = n or #tab
      for i = 1, count or n do
        local j = math.random(i, n)
        tab[i], tab[j] = tab[j], tab[i]
      end
      return tab
    end

    -- round a real value
    function round(num, idp)
      local mult = 10^(idp or 0)
      return math.floor(num * mult + 0.5) / mult
    end

    -- ##############################3

    local profile_vett = {}
    local csv = require("csv")

    local fileName = "dataset_file.csv"

    print("Readin' "..tostring(fileName))
    local f = csv.open(fileName)
    local column_names = {}

    local j = 0
    for fields in f:lines() do

      if j>0 then
        profile_vett[j] = {}
        for i, v in ipairs(fields) do
          profile_vett[j][i] = tonumber(v);
        end
        j = j + 1
      else
        for i, v in ipairs(fields) do
          column_names[i] = v
        end
        j = j + 1
      end
    end

    OPTIM_PACKAGE = true
    local output_number = 1
    THRESHOLD = 0.5 -- ORIGINAL
    DROPOUT_FLAG = false
    MOMENTUM_ALPHA = 0.5
    MAX_MSE = 4

    -- CHANGE: increased learn_rate to 0.01, reduced hidden units to 50, turned momentum on, increased iterations to 200
    LEARN_RATE = 0.01
    local hidden_units = 50
    MOMENTUM = true
    ITERATIONS = 200
    -------------------------------------

    local hidden_layers = 1

    local hiddenUnitVect = {2000, 4000, 6000, 8000, 10000}
    -- local hiddenLayerVect = {1,2,3,4,5}
    local hiddenLayerVect = {1}

    local profile_vett_data = {}
    local label_vett = {}

    for i=1,#profile_vett do
      profile_vett_data[i] = {}

      for j=1,#(profile_vett[1]) do
        if j<#(profile_vett[1]) then
          profile_vett_data[i][j] = profile_vett[i][j]
        else
          label_vett[i] = profile_vett[i][j]
        end
      end
    end

    print("Number of value profiles (rows) = "..#profile_vett_data);
    print("Number features (columns) = "..#(profile_vett_data[1]));
    print("Number of targets (rows) = "..#label_vett);

    local table_row_outcome = label_vett
    -- NOTE: profile_vett still contains the label as its last column
    -- (profile_vett_data is the features-only copy), so the inputs built
    -- from table_rows_vett below include the label.
    local table_rows_vett = profile_vett

    -- ########################################################

    -- START

    -- Seed random number generator
    -- torch.manualSeed(0)

    local indexVect = {};
    for i=1, #table_rows_vett do indexVect[i] = i; end
    permutedIndexVect = permute(indexVect, #indexVect, #indexVect);

    -- CHANGE: increase test_set to 50%
    TEST_SET_PERC = 50
    ---------------------------

    local test_set_size = round((TEST_SET_PERC*#table_rows_vett)/100)

    print("training_set_size = "..(#table_rows_vett-test_set_size).." elements");
    print("test_set_size = "..test_set_size.." elements\n");

    local train_table_row_profile = {}
    local test_table_row_profile = {}
    local original_test_indexes = {}

    for i=1,#table_rows_vett do
      if i<=(tonumber(#table_rows_vett)-test_set_size) then
        train_table_row_profile[#train_table_row_profile+1] = {torch.Tensor(table_rows_vett[permutedIndexVect[i]]), torch.Tensor{table_row_outcome[permutedIndexVect[i]]}}
      else
        original_test_indexes[#original_test_indexes+1] = permutedIndexVect[i];
        test_table_row_profile[#test_table_row_profile+1] = {torch.Tensor(table_rows_vett[permutedIndexVect[i]]), torch.Tensor{table_row_outcome[permutedIndexVect[i]]}}
      end
    end

    require 'nn'
    perceptron = nn.Sequential()
    input_number = #table_rows_vett[1]

    perceptron:add(nn.Linear(input_number, hidden_units))
    perceptron:add(nn.Sigmoid())
    if DROPOUT_FLAG==true then perceptron:add(nn.Dropout()) end

    for w=1,hidden_layers do
      perceptron:add(nn.Linear(hidden_units, hidden_units))
      perceptron:add(nn.Sigmoid())
      if DROPOUT_FLAG==true then perceptron:add(nn.Dropout()) end
    end
    perceptron:add(nn.Linear(hidden_units, output_number))

    function train_table_row_profile:size() return #train_table_row_profile end
    function test_table_row_profile:size() return #test_table_row_profile end

    -- OPTIMIZATION LOOPS
    local MCC_vect = {}

    for a=1,#hiddenUnitVect do
      for b=1,#hiddenLayerVect do

        local hidden_units = hiddenUnitVect[a]
        local hidden_layers = hiddenLayerVect[b]
        print("hidden_units = "..hidden_units.."\t output_number = "..output_number.." hidden_layers = "..hidden_layers)

        local criterion = nn.MSECriterion()
        local lossSum = 0
        local error_progress = 0

        require 'optim'
        local params, gradParams = perceptron:getParameters()
        local optimState = nil

        -- NOTE: as written, MOMENTUM==true selects the state WITHOUT momentum,
        -- and MOMENTUM==false the one with momentum = MOMENTUM_ALPHA.
        if MOMENTUM==true then
          optimState = {learningRate = LEARN_RATE}
        else
          optimState = {learningRate = LEARN_RATE, momentum = MOMENTUM_ALPHA }
        end

        local total_runs = ITERATIONS*#train_table_row_profile

        local loopIterations = 1
        for epoch=1,ITERATIONS do
          for k=1,#train_table_row_profile do

            -- Function feval
            local function feval(params)
              gradParams:zero()

              local thisProfile = train_table_row_profile[k][1]
              local thisLabel = train_table_row_profile[k][2]

              local thisPrediction = perceptron:forward(thisProfile)
              local loss = criterion:forward(thisPrediction, thisLabel)

              -- print("thisPrediction = "..round(thisPrediction[1],2).." thisLabel = "..thisLabel[1])

              lossSum = lossSum + loss
              error_progress = lossSum*100 / (loopIterations*MAX_MSE)

              if ((loopIterations*100/total_runs)*10)%10==0 then
                io.write("completion: ", round((loopIterations*100/total_runs),2).."%" )
                io.write(" (epoch="..epoch..")(element="..k..") loss = "..round(loss,2).." ")
                io.write("\terror progress = "..round(error_progress,5).."%\n")
              end

              local dloss_doutput = criterion:backward(thisPrediction, thisLabel)

              perceptron:backward(thisProfile, dloss_doutput)

              return loss,gradParams
            end
            optim.sgd(feval, params, optimState)
            loopIterations = loopIterations+1
          end
        end

        local correctPredictions = 0
        local atleastOneTrue = false
        local atleastOneFalse = false
        local predictionTestVect = {}
        local truthVect = {}

        for i=1,#test_table_row_profile do
          local current_label = test_table_row_profile[i][2][1]
          local prediction = perceptron:forward(test_table_row_profile[i][1])[1]

          predictionTestVect[i] = prediction
          truthVect[i] = current_label

          local labelResult = false

          if current_label >= THRESHOLD and prediction >= THRESHOLD then
            labelResult = true
          elseif current_label < THRESHOLD and prediction < THRESHOLD then
            labelResult = true
          end

          if labelResult==true then correctPredictions = correctPredictions + 1; end

          print("\nCorrect predictions = "..round(correctPredictions*100/#test_table_row_profile,2).."%")

          local printValues = false

          local output_confusion_matrix = confusion_matrix(predictionTestVect, truthVect, THRESHOLD, printValues)
        end
      end
    end

Below is the output from one of the 20 runs:

    Correct predictions = 100%
    TOTAL:
     FN = 0 / 4      (truth == 1) & (prediction < threshold)
     TP = 4 / 4      (truth == 1) & (prediction >= threshold)

     FP = 0 / 9      (truth == 0) & (prediction >= threshold)
     TN = 9 / 9      (truth == 0) & (prediction < threshold)

    signedMCC = 1

    ::::    Matthews correlation coefficient = 1    ::::

    accuracy = 1 = (tp + tn) / (tp + tn +fn + fp)    [worst = -1, best = +1]
    f1_score = 1 = (2*tp) / (2*tp+fp+fn)     [worst = 0, best = 1]
    total rate = 3 in [-1, +3] that is 100% of possible correctness
    numberOfPredictedOnes = (TP + FP) = 4 = 30.77%

    Diagnosis: Excellent ! ! !
+2

Most likely your NN is learning too slowly and therefore is not learning anything. Deeplearning4j has an excellent article on troubleshooting neural network problems that can shed light on the effect that various hyperparameters can have.

Looking at your code, I would first try the following:

  • Adjust your learning rate:
    You set the learning rate to LEARN_RATE = 0.001. Try values between 1e-1 and 1e-8.
  • Adjust your hidden layer size:
    You may need to tune the hidden layer size: hiddenUnitVect = {2000, 4000, 6000, 8000, 10000}. That seems rather large for this task. Try a smaller network first and increase the size if it doesn't generalize very well.
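The two suggestions above amount to a small grid of settings to try. A minimal sketch in plain Lua follows; the learning-rate range comes from the text above, while the hidden-layer sizes are illustrative assumptions, not recommendations from the original answer.

```lua
-- Log-spaced learning rates from 1e-1 down to 1e-8.
local learning_rates = {}
for exponent = 1, 8 do
  learning_rates[#learning_rates + 1] = 10 ^ (-exponent)
end

-- Smaller hidden-layer sizes to try first (assumed values, for illustration only).
local hidden_sizes = {10, 25, 50}

-- Enumerate every (learning rate, hidden size) pair to train and evaluate.
local configs = {}
for _, lr in ipairs(learning_rates) do
  for _, h in ipairs(hidden_sizes) do
    configs[#configs + 1] = {learningRate = lr, hiddenUnits = h}
  end
end
print(#configs)
```

Each entry would then drive one training run, with `learningRate` passed to the optimizer state and `hiddenUnits` used when building the `nn.Linear` layers.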
0
