Can I use sample weights when training a Spark MLlib Random Forest?

I am using the Spark 1.5.0 MLlib Random Forest algorithm (Scala code) for a binary classification problem. Since my data set is highly imbalanced, I downsample the majority class at a sampling rate of 10%.

Is it possible to use the sample weight (10 in this case) when training a Spark Random Forest? I do not see a weight among the input parameters of trainClassifier() in RandomForest.

1 answer

Not at all in Spark 1.5, and only partially (LogisticRegression / LinearRegression) in Spark 1.6. See the relevant JIRA issues:

- https://issues.apache.org/jira/browse/SPARK-7685
- https://issues.apache.org/jira/browse/SPARK-9610
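In the meantime, a common workaround (a sketch, not part of the original answer) is to emulate an integer weight by replicating the weighted class's examples before calling trainClassifier(). The function name and RDD are hypothetical; only `flatMap` and `Seq.fill` from the standard Spark/Scala APIs are used:

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Emulate an integer sample weight by replication: each example whose label
// matches `weightedLabel` is emitted `weight` times, others once.
// (Hypothetical helper, not a Spark API.)
def weightByReplication(data: RDD[LabeledPoint],
                        weightedLabel: Double,
                        weight: Int): RDD[LabeledPoint] =
  data.flatMap { lp =>
    if (lp.label == weightedLabel) Seq.fill(weight)(lp) else Seq(lp)
  }

// e.g. give the downsampled majority class (label 0.0) an effective weight of 10,
// then pass the result to RandomForest.trainClassifier as usual:
// val weighted = weightByReplication(trainingData, 0.0, 10)
```

Note that replication inflates the training set size tenfold for that class, so this only approximates true per-instance weighting; adjusting classThresholds or the decision threshold on the predicted probabilities is a lighter-weight alternative for handling imbalance.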

