Can I use sample weights when training a Spark MLlib Random Forest?

I am using the Spark 1.5.0 MLlib Random Forest algorithm (Scala code) for a binary classification problem. Since my data set is highly imbalanced, I downsample the majority class at a sampling rate of 10%.

Is it possible to use the sample weight (10 in this case) when training a Spark Random Forest? I do not see a weight among the input parameters of trainClassifier() in RandomForest.

1 answer

Not at all in Spark 1.5, and only partially (LogisticRegression / LinearRegression) in Spark 1.6. See the relevant JIRA issues:

- https://issues.apache.org/jira/browse/SPARK-7685
- https://issues.apache.org/jira/browse/SPARK-9610
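In the meantime, a common workaround (a sketch, not part of the original answer) is to emulate an integer weight by replicating the weighted class's examples before calling trainClassifier(). The function name and RDD are hypothetical; only `flatMap` and `Seq.fill` from the standard Spark/Scala APIs are used:

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Emulate an integer sample weight by replication: each example whose label
// matches `weightedLabel` is emitted `weight` times, others once.
// (Hypothetical helper, not a Spark API.)
def weightByReplication(data: RDD[LabeledPoint],
                        weightedLabel: Double,
                        weight: Int): RDD[LabeledPoint] =
  data.flatMap { lp =>
    if (lp.label == weightedLabel) Seq.fill(weight)(lp) else Seq(lp)
  }

// e.g. give the downsampled majority class (label 0.0) an effective weight of 10,
// then pass the result to RandomForest.trainClassifier as usual:
// val weighted = weightByReplication(trainingData, 0.0, 10)
```

Note that replication inflates the training set size tenfold for that class, so this only approximates true per-instance weighting; adjusting classThresholds or the decision threshold on the predicted probabilities is a lighter-weight alternative for handling imbalance.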

