How is thread safety handled in Spark? I have something like this in Java:
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;

    // Variant A: every instance of the function owns its own calculator.
    class A implements Function<String, Boolean> {
        NotThreadSafe3rdParty calculator = new NotThreadSafe3rdParty();

        public Boolean call(String s) {
            return calculator.calc(s);
        }
    }

    // Variant B: one static calculator shared by all instances in the JVM.
    class B implements Function<String, Boolean> {
        static NotThreadSafe3rdParty calculator;

        static {
            calculator = new NotThreadSafe3rdParty();
        }

        public Boolean call(String s) {
            return calculator.calc(s);
        }
    }

    class MyRun {
        public static void main(String[] args) {
            String myPath = "/data/path";
            SparkConf conf = new SparkConf().setAppName("Simple Application");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaRDD<String> myData = sc.textFile(myPath);
            long numAs = myData.filter(new A()).count();
            long numBs = myData.filter(new B()).count();
        }
    }
- Is class A used correctly?
- Is class B used correctly?
- What if NotThreadSafe3rdParty in class A is a JNI wrapper over C code (e.g. crfsuite)?
- How should such non-thread-safe dependencies be used?
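One pattern that might fit the last question is to keep a separate calculator per thread instead of one shared static instance. The sketch below uses only the JDK. FakeCalculator, PerThreadFilter and PerThreadDemo are hypothetical names standing in for NotThreadSafe3rdParty and a Spark Function, and a plain thread pool stands in for the task threads of a Spark executor. It is an illustration under those assumptions, not a confirmed answer for Spark or for JNI wrappers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for NotThreadSafe3rdParty: it keeps mutable scratch
// state, so a single instance must not be used by two threads at once.
class FakeCalculator {
    // Counts how many calculators were created (for illustration only).
    static final AtomicInteger INSTANCES = new AtomicInteger();
    private final StringBuilder scratch = new StringBuilder();

    FakeCalculator() {
        INSTANCES.incrementAndGet();
    }

    // Returns true when the input has an even number of characters.
    boolean calc(String s) {
        scratch.setLength(0);
        scratch.append(s);
        return scratch.length() % 2 == 0;
    }
}

// Same shape as class B, but the static field hands out one calculator per thread.
class PerThreadFilter {
    private static final ThreadLocal<FakeCalculator> CALCULATOR =
            ThreadLocal.withInitial(FakeCalculator::new);

    public Boolean call(String s) {
        return CALCULATOR.get().calc(s);
    }
}

class PerThreadDemo {
    // Applies the filter on a pool of worker threads and counts the inputs that pass.
    static long countMatches(List<String> data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            PerThreadFilter filter = new PerThreadFilter();
            List<Future<Boolean>> results = new ArrayList<>();
            for (String s : data) {
                Callable<Boolean> task = () -> filter.call(s);
                results.add(pool.submit(task));
            }
            long matches = 0;
            for (Future<Boolean> r : results) {
                if (r.get()) {
                    matches++;
                }
            }
            return matches;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real job the same ThreadLocal field could sit inside a class implementing Function<String, Boolean>. Whether that is enough for a JNI library depends on whether the native code keeps global state, which this sketch does not address.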
Piotr