Failed to declare a drive of type String

I am trying to define a battery variable of type String in the Scala shell (driver), but I keep getting the following error: -

scala> val myacc = sc.accumulator("Test") <console>:21: error: could not find implicit value for parameter param: org.apache.spark.AccumulatorParam[String] val myacc = sc.accumulator("Test") ^ 

This does not seem to be a problem for batteries like Int or Double.

thanks

+5
source share
1 answer

This is because Spark provides only Long , Double and Float batteries by default. If you need something else, you need to expand AccumulatorParam .

 import org.apache.spark.AccumulatorParam object StringAccumulatorParam extends AccumulatorParam[String] { def zero(initialValue: String): String = { "" } def addInPlace(s1: String, s2: String): String = { s"$s1 $s2" } } val stringAccum = sc.accumulator("")(StringAccumulatorParam) val rdd = sc.parallelize("foo" :: "bar" :: Nil, 2) rdd.foreach(s => stringAccum += s) stringAccum.value 

Note

In general, you should avoid using batteries for tasks where data can increase significantly over time. Its behavior will be similar to group an collect , and in the worst case scenario, the script may fail due to lack of resources. Batteries are useful mainly for simple diagnostic tasks, such as tracking basic statistics.

+10
source

All Articles