What is the preferred way to avoid SQL injection in Spark-SQL (on Hive)

Suppose that a SchemaRDD rdd has a registered customer table, and you want to filter its entries according to user input. One obvious approach is:

 rdd.sqlContext.sql(s"SELECT * FROM customer WHERE name='$userInput'") 

However, we know from the old days of PHP that this pattern can lead to unpleasant things. Is there an equivalent to PreparedStatement? The only thing I could find that looked even remotely relevant was org.apache.commons.lang.StringEscapeUtils.escapeSql.
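For context on what escapeSql-style escaping actually does: for string literals it boils down to doubling embedded single quotes so user input cannot terminate the literal early. A minimal sketch in Java (escapeSqlLiteral is a hypothetical stand-in for the commons-lang helper, not its real implementation):

```java
public class SqlEscapeSketch {
    // Hypothetical stand-in for StringEscapeUtils.escapeSql: double every
    // single quote so it cannot close the surrounding SQL string literal.
    static String escapeSqlLiteral(String input) {
        return input.replace("'", "''");
    }

    public static void main(String[] args) {
        String userInput = "x' OR '1'='1";
        // The injected quotes become escaped literal characters, not delimiters.
        String query = "SELECT * FROM customer WHERE name='"
                + escapeSqlLiteral(userInput) + "'";
        System.out.println(query);
    }
}
```

Note that escaping like this is brittle (it only covers quoted string contexts, not numeric or identifier positions); real parameter binding, as discussed below, is the safer tool when it is available.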

1 answer

One option is to run the Thrift JDBC/ODBC server (thriftserver) that ships with Spark SQL and connect to it over JDBC. You can then use the usual techniques (PreparedStatement, etc.) to prevent SQL injection.
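A sketch of that approach, assuming the thriftserver is listening on localhost:10000 with a default database and the Hive JDBC driver is on the classpath (the host, port, database, and credentials here are all illustrative assumptions):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ThriftServerQuery {
    // The '?' placeholder keeps user input out of the SQL text entirely.
    static final String QUERY = "SELECT * FROM customer WHERE name = ?";

    public static void main(String[] args) throws Exception {
        String userInput = args.length > 0 ? args[0] : "alice";
        // Assumed endpoint: Spark SQL thriftserver speaking the HiveServer2
        // protocol on localhost:10000.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "user", "");
             PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setString(1, userInput); // bound as data, never parsed as SQL
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}
```

The key point is that the query text is fixed at prepare time; whatever the user types is passed through setString as a value, so it cannot change the structure of the statement.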

