I am trying to reproduce some Weka tests in Matlab. I have already built a model with Weka 3.7, and now I want to reproduce its results in Matlab. First, I created a training set (traindata) and a test set (testdata). For example, the ARFF file I have looks like this:
> ...
> @attribute Tmp numeric
> @attribute Hum numeric
> @attribute Wsp numeric
> @attribute Wnd numeric
> @attribute class {IN,no}
> @data
> ...
Then I load the model and get the predictions with:
classifier = weka.core.SerializationHelper.read('myMIX750.model');
numInst = testdata.numInstances();
pred = zeros(numInst,1);
predProbs = zeros(numInst, traindata.numClasses());
for i=1:numInst
    pred(i) = classifier.classifyInstance( testdata.instance(i-1) );
    predProbs(i,:) = classifier.distributionForInstance( testdata.instance(i-1) );
end
... and it works! However, by mistake, I used an input file that does not match the model. My model is as follows:
disp( char(classifier.toString()) )

J48 pruned tree
------------------

WxCat = 0: 0 (14003.0)
WxCat = 1
|   WndVar <= 0
|   |   PcpCatVar <= 1.00015: 0 (4436.0/389.0)
|   |   PcpCatVar > 1.00015: 1 (2499.0/143.0)
|   WndVar > 0: 1 (18636.0/1592.0)
Note that the traindata / testdata attributes are different from the attributes used by the model. Nevertheless, the code runs and returns a classification:
=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1,000    1,000    0,500      1,000    0,667      0,000    0,500     0,500     IN
                 0,000    0,000    0,000      0,000    0,000      0,000    0,500     0,500     no
Weighted Avg.    0,500    0,500    0,250      0,500    0,333      0,000    0,500     0,500
So how does Weka classify these instances, whose attributes differ from those used by the model?
This is the rest of the code used:
eval = weka.classifiers.Evaluation(traindata);
eval.evaluateModel(classifier, testdata, javaArray('java.lang.Object',1));

fprintf('=== Run information ===\n\n')
fprintf('Scheme: %s %s\n', ...
    char(classifier.getClass().getName()), ...
    char(weka.core.Utils.joinOptions(classifier.getOptions())) )
fprintf('Relation: %s\n', char(traindata.relationName))
fprintf('Instances: %d\n', traindata.numInstances)
fprintf('Attributes: %d\n\n', traindata.numAttributes)

fprintf('=== Classifier model ===\n\n')
disp( char(classifier.toString()) )

fprintf('=== Summary ===\n')
disp( char(eval.toSummaryString()) )
disp( char(eval.toClassDetailsString()) )   %Detailed Accuracy By Class
disp( char(eval.toMatrixString()) )         %Confusion Matrix
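For reference, a check along the following lines might catch this kind of mismatch before any prediction is made. It is only a sketch and rests on an assumption: that 'myMIX750.model' was saved from the Weka Explorer, which, as far as I know, writes the training header (a weka.core.Instances object) into the file after the classifier. If the file holds only the classifier, the check cannot be done this way. The variable names objs and header are illustrative; testdata is the test set already loaded above.

```matlab
% Sketch only. Assumption: the model file also contains the training header
% as a second serialized object (the Explorer is believed to save it that way).
objs = weka.core.SerializationHelper.readAll('myMIX750.model');
classifier = objs(1);   % first object in the file: the trained classifier

if length(objs) > 1 && isa(objs(2), 'weka.core.Instances')
    header = objs(2);   % structure of the data the model was trained on
    % equalHeaders compares the attribute structure of the two datasets
    if ~header.equalHeaders(testdata)
        error('Test data attributes do not match the model''s training header.');
    end
else
    warning('No training header found in the model file; compatibility not checked.');
end
```

With the files described above, the expectation is that a check like this would stop with the error instead of silently producing predictions, but I have not verified that.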
Thanks in advance!
pgam