Spark DataFrame: trim a column and convert empty strings to NULL

In Scala / Spark, how do I convert an empty string, such as " ", to "NULL"? I need to trim it first and then convert it to "NULL". Thanks.

dataframe.na.replace("cut", Map(" " -> "NULL")).show //wrong 
1 answer

You can create a simple function for this. First, a couple of imports:

 import org.apache.spark.sql.functions.{trim, length, when}
 import org.apache.spark.sql.Column

and the definition:

 def emptyToNull(c: Column) = when(length(trim(c)) > 0, c) 

Finally, a quick test:

 val df = Seq(" ", "foo", "", "bar").toDF
 df.withColumn("value", emptyToNull($"value")).show

which should give the following result:

 +-----+
 |value|
 +-----+
 | null|
 |  foo|
 | null|
 |  bar|
 +-----+
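
For reference, the steps above can be put together as one self-contained script. This is only a minimal sketch: the local SparkSession (the `local[1]` master and the app name) is an assumption added for illustration and is not part of the original answer. In spark-shell, `spark` and its implicits are already available.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{length, trim, when}

// Assumed local session for the sketch; getOrCreate reuses an existing one.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("emptyToNull-sketch")
  .getOrCreate()
import spark.implicits._ // provides toDF and the $"..." column syntax

// `when` without `otherwise` yields null for rows that fail the condition,
// so blank or whitespace-only strings become null.
def emptyToNull(c: Column): Column = when(length(trim(c)) > 0, c)

val df = Seq(" ", "foo", "", "bar").toDF("value")
val cleaned = df.withColumn("value", emptyToNull($"value"))

// Wrap each cell in Option so nulls are visible as None on the driver.
val result = cleaned.collect().map(row => Option(row.getString(0))).toSeq
```

The original column value is returned untrimmed when it is non-empty; wrap `c` in `trim` inside `when` if the stored value should be trimmed as well.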

If you want to replace the empty string with the string "NULL", you can add an otherwise clause:

 def emptyToNullString(c: Column) = when(length(trim(c)) > 0, c).otherwise("NULL") 
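
Keep in mind that the string "NULL" is an ordinary value, not a SQL NULL, so `isNull` filters treat the two variants differently. The following is a minimal sketch of that difference; the local SparkSession settings and the names `realNulls`, `stringNulls` and `literalNulls` are assumptions introduced here for illustration.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{length, trim, when}

// Assumed local session for the sketch; getOrCreate reuses an existing one.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("emptyToNullString-sketch")
  .getOrCreate()
import spark.implicits._

def emptyToNull(c: Column): Column = when(length(trim(c)) > 0, c)
def emptyToNullString(c: Column): Column =
  when(length(trim(c)) > 0, c).otherwise("NULL")

val df = Seq(" ", "foo", "", "bar").toDF("value")
val asNull = df.withColumn("value", emptyToNull($"value"))
val asString = df.withColumn("value", emptyToNullString($"value"))

// Rows that are real SQL NULLs after each conversion
val realNulls = asNull.filter($"value".isNull).count()
val stringNulls = asString.filter($"value".isNull).count()
// Rows holding the literal four-character string "NULL"
val literalNulls = asString.filter($"value" === "NULL").count()
```

With the sample data, the blank rows are matched by `isNull` only in the first variant; in the second they have to be matched by comparing against the literal string.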