I am using Apache Spark MLlib 1.4.1 through PySpark (the Python API for Spark) to train a decision tree on my LabeledPoint data. The tree is trained correctly, and I can print it to the terminal (extracting the rules, as the question "How to extract the rules from the MLlib spark of the decision tree" calls it) using:
model = DecisionTree.trainClassifier( ... )
print(model.toDebugString())
But I want to visualize or plot the decision tree rather than print it to the terminal. Is there a way to plot a decision tree in PySpark, or can I save the decision tree data and plot it in R? Thanks!
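One possible route, sketched here rather than offered as a built-in MLlib feature, is to turn the text from toDebugString() into Graphviz DOT source and render that outside Spark. The sketch assumes the debug string uses the nested "If (...)" / "Else (...)" / "Predict: ..." layout shown in SAMPLE, which is a made-up example and not output from a real model. The function name debug_string_to_dot and the yes/no edge labels are hypothetical choices for illustration.

```python
import re

# Assumed layout of model.toDebugString(); this sample is hand-written.
SAMPLE = """DecisionTreeModel classifier of depth 2 with 5 nodes
  If (feature 0 <= 0.5)
   If (feature 1 <= 1.0)
    Predict: 0.0
   Else (feature 1 > 1.0)
    Predict: 1.0
  Else (feature 0 > 0.5)
   Predict: 1.0
"""


def debug_string_to_dot(debug_string):
    """Convert If/Else/Predict rule text into Graphviz DOT source."""
    # Keep only the rule lines; the header line is dropped.
    lines = [ln.strip() for ln in debug_string.splitlines()]
    lines = [ln for ln in lines if ln.startswith(("If", "Else", "Predict"))]
    out = ["digraph tree {"]
    pos = 0       # index of the next unread rule line
    next_id = 0   # counter used to name DOT nodes n0, n1, ...

    def parse():
        # Reads one subtree in preorder and returns its root node id.
        nonlocal pos, next_id
        node = next_id
        next_id += 1
        line = lines[pos]
        pos += 1
        if line.startswith("Predict"):
            out.append('  n%d [shape=box, label="%s"];' % (node, line))
            return node
        # Internal node: the "If" condition becomes the node label.
        cond = re.match(r"If \((.*)\)$", line).group(1)
        out.append('  n%d [label="%s"];' % (node, cond))
        left = parse()
        pos += 1  # skip the "Else (...)" line paired with this "If"
        right = parse()
        out.append('  n%d -> n%d [label="yes"];' % (node, left))
        out.append('  n%d -> n%d [label="no"];' % (node, right))
        return node

    parse()
    out.append("}")
    return "\n".join(out)


dot = debug_string_to_dot(SAMPLE)
print(dot)
```

With a real model, SAMPLE would be replaced by model.toDebugString(). If the printed text is saved to a file such as tree.dot and Graphviz is installed, running dot -Tpng tree.dot -o tree.png should produce an image of the tree.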