As mentioned in the discussion here, the problem stems from the laziness of the map operation on the partition iterator. Because of this laziness, for each partition the connection is created and closed first, and only later (when an action is run on the RDD) is readMatchingFromDB actually called.
To fix this, you must force a traversal of the iterator before closing the connection, for example by converting it to a list (and then back to an iterator):
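The ordering problem can be reproduced without Spark, since a plain Scala Iterator has the same lazy map. The sketch below is only an illustration: FakeConnection and lookup are hypothetical stand-ins for DbConnection and readMatchingFromDB, which are not shown in the question.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a real database connection; it only tracks its state.
class FakeConnection {
  var open = true
  def close(): Unit = { open = false }
}

object LazyMapDemo {
  // Records whether the connection was open each time lookup ran.
  val openAtCallTime = ListBuffer.empty[Boolean]

  // Hypothetical stand-in for readMatchingFromDB.
  def lookup(record: Int, connection: FakeConnection): Int = {
    openAtCallTime += connection.open
    record * 2
  }

  def main(args: Array[String]): Unit = {
    val connection = new FakeConnection
    // map on an Iterator is lazy: lookup is not called on this line.
    val mapped = Iterator(1, 2, 3).map(r => lookup(r, connection))
    connection.close()
    println(s"calls before traversal: ${openAtCallTime.size}")
    // Traversing the iterator is what finally triggers the lookups,
    // and by then the connection is already closed.
    val results = mapped.toList
    println(s"results: $results")
    println(s"connection open during calls: ${openAtCallTime.toList}")
  }
}
```

Running main should show zero calls before the traversal and every lookup seeing a closed connection, which is the same situation the partition iterator ends up in.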
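    val newRd = myRdd.mapPartitions(partition => {
      val connection = new DbConnection // one connection per partition

      val newPartition = partition.map(record => {
        readMatchingFromDB(record, connection)
      }).toList // consumes the iterator, so readMatchingFromDB runs here

      connection.close() // safe now: all lookups have already run
      newPartition.iterator // convert back to an iterator
    })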
Tzach Zohar, Jun 20 '16 at 5:29