To build a multiclass NaiveBayes classifier, I use CrossValidator to select the best parameters for my pipeline:
    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEstimatorParamMaps(paramGrid)
      .setEvaluator(new MulticlassClassificationEvaluator)
      .setNumFolds(10)

    val cvModel = cv.fit(trainingSet)
The pipeline contains the usual transformers and estimators in the following order: Tokenizer, StopWordsRemover, HashingTF, IDF, and finally NaiveBayes.
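For context, the pipeline and parameter grid look roughly like the sketch below. The column names (`text`, `label`, and the intermediate columns) and the grid values are illustrative assumptions, not my exact setup; only the order of the stages matches what is described above.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.NaiveBayes
import org.apache.spark.ml.feature.{HashingTF, IDF, StopWordsRemover, Tokenizer}
import org.apache.spark.ml.tuning.ParamGridBuilder

// Column names below are placeholders chosen for illustration.
val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val remover   = new StopWordsRemover().setInputCol("words").setOutputCol("filtered")
val hashingTF = new HashingTF().setInputCol("filtered").setOutputCol("rawFeatures")
val idf       = new IDF().setInputCol("rawFeatures").setOutputCol("features")
val nb        = new NaiveBayes().setLabelCol("label").setFeaturesCol("features")

// Stages run in the order listed: Tokenizer, StopWordsRemover, HashingTF, IDF, NaiveBayes.
val pipeline = new Pipeline()
  .setStages(Array[PipelineStage](tokenizer, remover, hashingTF, idf, nb))

// Hypothetical grid: 2 x 2 = 4 parameter combinations for CrossValidator to compare.
val paramGrid = new ParamGridBuilder()
  .addGrid(hashingTF.numFeatures, Array(1000, 10000))
  .addGrid(nb.smoothing, Array(0.5, 1.0))
  .build()
```

With a grid like this, `cv.fit` trains and evaluates each of the four combinations on every fold, so there should be one averaged metric per combination.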
Is it possible to access metrics calculated for the best model?
Ideally, I would like to access the metrics of all models to see how changing the parameters affects classification quality. For now, though, the metrics of the best model alone would be enough.
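To make the goal concrete, this is the kind of output I am after. The sketch assumes that the `CrossValidatorModel` returned by `fit` exposes `avgMetrics` (one fold-averaged evaluator value per parameter map) in the same order as `getEstimatorParamMaps`, and that a larger metric is better, which I believe holds for the default metric of `MulticlassClassificationEvaluator`.

```scala
// Assumes `cvModel` is the CrossValidatorModel produced by cv.fit(trainingSet) above.
// Pair each ParamMap with its metric averaged over the folds.
val metricsByParams = cvModel.getEstimatorParamMaps.zip(cvModel.avgMetrics)

// Print combinations from best to worst, assuming a larger metric is better.
metricsByParams.sortBy { case (_, metric) => -metric }.foreach {
  case (params, metric) =>
    println(s"metric = $metric")
    println(params)
}
```

If that assumption is right, the first entry printed would correspond to the best model.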
FYI, I am using Spark 1.6.0.
apache-spark apache-spark-mllib apache-spark-ml
Rami