How to use non-thread-safe 3rd-party dependencies in Spark?

How does Spark handle thread safety? I have something like this in Java:

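import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

class A implements Function<String, Boolean> {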
    NotThreadSafe3rdParty calculator = new NotThreadSafe3rdParty();
    public Boolean call(String s) {
        return calculator.calc(s);
    }
}

class B implements Function<String, Boolean> {
    static NotThreadSafe3rdParty calculator;
    static {
        calculator = new NotThreadSafe3rdParty();
    }
    public Boolean call(String s) {
        return calculator.calc(s);
    }
}

class MyRun {
    public static void main(String[] args) {
        String myPath = "/data/path";
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> myData = sc.textFile(myPath);

        long numAs = myData.filter(new A()).count();
        long numBs = myData.filter(new B()).count();
    }
}

  • Is class A used correctly?
  • Is class B used correctly?
  • What if NotThreadSafe3rdParty in class A is a JNI wrapper over C code (e.g. crfsuite)?
  • What is the right way to use such dependencies?
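
To make the last question concrete, here is a minimal sketch of one pattern that might apply: keep the static field from class B, but hold one calculator per thread through a ThreadLocal instead of a single shared instance. It runs without Spark on the classpath. NotThreadSafe3rdParty below is a hypothetical stand-in for the real library, and PerThreadCalculator and PerThreadCalculatorDemo are names invented for illustration; the thread pool only imitates several tasks running concurrently in one JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the real library: it keeps mutable scratch state,
// so sharing one instance across threads would be unsafe.
class NotThreadSafe3rdParty {
    private final StringBuilder scratch = new StringBuilder();

    boolean calc(String s) {
        scratch.setLength(0);
        scratch.append(s);
        return scratch.length() % 2 == 0; // arbitrary rule: even-length input matches
    }
}

// Same shape as class B, but each thread lazily gets its own calculator
// instead of all threads sharing one static instance.
class PerThreadCalculator {
    private static final ThreadLocal<NotThreadSafe3rdParty> CALCULATOR =
            ThreadLocal.withInitial(NotThreadSafe3rdParty::new);

    static boolean call(String s) {
        return CALCULATOR.get().calc(s);
    }
}

class PerThreadCalculatorDemo {
    // Counts matching inputs on a pool of threads (assumption: this roughly
    // resembles concurrent tasks inside one JVM).
    static long countMatches(List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String s : inputs) {
                Callable<Boolean> task = () -> PerThreadCalculator.call(s);
                results.add(pool.submit(task));
            }
            long count = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    count++;
                }
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(Arrays.asList("ab", "abc", "abcd", "a"), 4));
    }
}
```

This only isolates state held inside each Java object. If the JNI library keeps global state on the native side, per-thread instances may not be enough. In Spark itself, a comparable option might be to create the calculator once per partition inside mapPartitions rather than storing it in a field of the function.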