When you convert a DataFrame to an RDD, you get an RDD[Row], so when you use map, your function receives a Row as its parameter. Therefore, you must use the methods of Row to access its members (note that the index starts at 0):
df.rdd.map {
  row: Row => (row.getString(1) + "_" + row.getString(2), row)
}.take(5)
You can see the other methods available on Row in the Spark scaladoc.
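Besides getString, Row offers typed and null-safe accessors. A minimal sketch (the row values here are made up to match a three-column df like the one above):

```scala
import org.apache.spark.sql.Row

object RowAccessDemo {
  def main(args: Array[String]): Unit = {
    // A standalone Row standing in for one record of the df above
    val row = Row("id-1", "foo", "bar")

    // Positional, typed access (0-based)
    val a = row.getString(1)        // "foo"
    val b = row.getAs[String](2)    // "bar" — getAs lets you name the target type

    // Check for nulls before extracting, to avoid a NullPointerException
    val safe = if (row.isNullAt(1)) "" else row.getString(1)

    println(a + "_" + b)
    println(safe)
  }
}
```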
On the other hand, if you only need to concatenate those columns into a String, you can do it directly on the DataFrame, without dropping down to the RDD:
import org.apache.spark.sql.functions._
val newDF = df.withColumn("concat", concat(df("col2"), lit("_"), df("col3")))
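A related sketch, assuming the same df and column names as above: concat_ws takes the separator as its first argument and, unlike concat, skips null inputs instead of returning null for the whole result. The toy data here is invented for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{concat_ws, col}

object ConcatDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("concat-demo")
      .getOrCreate()
    import spark.implicits._

    // Toy data standing in for the df in the question (column names assumed)
    val df = Seq(("1", "a", "x"), ("2", "b", null)).toDF("col1", "col2", "col3")

    // concat would yield null for row "2"; concat_ws drops the null and keeps "b"
    val out = df.withColumn("concat", concat_ws("_", col("col2"), col("col3")))
    out.show()

    spark.stop()
  }
}
```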