(Spark) object {name} is not a member of package org.apache.spark.ml

I am trying to run a standalone Scala application in Apache Spark, following this example: http://spark.apache.org/docs/latest/ml-pipeline.html

Here is my full code:

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.Row

object mllibexample1 {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder()
      .master("local[*]")
      .appName("logistic regression example 1")
      .getOrCreate()

    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (0.0, Vectors.dense(2.0, 1.3, 1.0)),
      (1.0, Vectors.dense(0.0, 1.2, -0.5))
    )).toDF("label", "features")

    val lr = new LogisticRegression()
    println("LogisticRegression parameters:\n" + lr.explainParams() + "\n")

    lr.setMaxIter(100)
      .setRegParam(0.01)

    val model1 = lr.fit(training)
    println("Model 1 was fit using parameters: " + model1.parent.extractParamMap)
  }
}

Dependencies in build.sbt:

 name := "example" version := "1.0.0" scalaVersion := "2.11.8" libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % "2.0.1", "org.apache.spark" %% "spark-sql" % "2.0.1", "org.apache.spark" %% "spark-mllib-local" % "2.0.1", "com.github.fommil.netlib" % "all" % "1.1.2" ) 

However, after running the program in the sbt shell, I received the following error:

[info] Compiling 1 Scala source to /dataplatform/example/target/scala-2.11/classes...
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:1: object classification is not a member of package org.apache.spark.ml
[error] import org.apache.spark.ml.classification.LogisticRegression
[error] ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:3: object param is not a member of package org.apache.spark.ml
[error] import org.apache.spark.ml.param.ParamMap
[error] ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:8: not found: value SparkSession
[error] val spark = SparkSession
[error] ^
[error] /dataplatform/example/src/main/scala/mllibexample1.scala:22: not found: type LogisticRegression
[error] val lr = new LogisticRegression()

I can run this code successfully in the interactive Spark shell. Am I missing something in the .sbt file?

Thank you Bai

Tags: scala, sbt, apache-spark, apache-spark-mllib
2 answers

You missed the MLlib dependency:

 "org.apache.spark" %% "spark-mllib" % "2.0.1" 

spark-mllib-local alone is not enough: it mainly contains the linear algebra classes (org.apache.spark.ml.linalg), while org.apache.spark.ml.classification and org.apache.spark.ml.param live in spark-mllib.
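For reference, here is a sketch of the full dependency list with spark-mllib swapped in, assuming the same Spark 2.0.1 / Scala 2.11.8 versions from the question:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.0.1",
  "org.apache.spark" %% "spark-sql"   % "2.0.1",
  // spark-mllib provides org.apache.spark.ml (classification, param, ...)
  // and already depends on spark-mllib-local, so the separate -local entry can be dropped
  "org.apache.spark" %% "spark-mllib" % "2.0.1",
  "com.github.fommil.netlib" % "all" % "1.1.2"
)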


I had the same problem in a Maven Scala project.

Adding the Maven dependency below resolved it:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib_2.11</artifactId>
  <version>2.0.2</version>
</dependency>
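For anyone using sbt like the asker, a rough equivalent of this Maven dependency (assuming Spark 2.0.2 and Scala 2.11) would be:

// %% appends the Scala binary version, giving the spark-mllib_2.11 artifact
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.0.2"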
