I use the Spark 1.5.0 MLlib Random Forest algorithm (Scala code) for a two-class classification. Since the data set I use is very unbalanced, I downsample the majority class with a sampling rate of 10%.
Is it possible to use a sample weight (in this case 10) when training a Spark Random Forest? I do not see a weight among the input parameters for trainClassifier() in RandomForest.
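For reference, here is a minimal sketch of what I currently do (names such as data, the 0.0/1.0 label convention, and the tree parameters are placeholders, not my exact setup):

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.tree.RandomForest
import org.apache.spark.mllib.tree.model.RandomForestModel
import org.apache.spark.rdd.RDD

def trainDownsampled(data: RDD[LabeledPoint]): RandomForestModel = {
  // Keep only 10% of the majority class (label 0.0) to reduce the imbalance
  val majority = data.filter(_.label == 0.0)
    .sample(withReplacement = false, fraction = 0.1, seed = 42L)
  val minority = data.filter(_.label == 1.0)
  val training = majority.union(minority)

  // trainClassifier() exposes no per-sample weight parameter in MLlib 1.5.0,
  // which is why I resort to downsampling instead of weighting by 10
  RandomForest.trainClassifier(
    training,
    numClasses = 2,
    categoricalFeaturesInfo = Map[Int, Int](),
    numTrees = 100,
    featureSubsetStrategy = "auto",
    impurity = "gini",
    maxDepth = 5,
    maxBins = 32)
}
```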