There is no need to convert to an RDD; the mapping can be performed directly on the `Dataset`, as shown below.
```java
public static void mapMethod() {
    // Read the data from a JSON file on the classpath
    // (assumes an existing SparkSession field named sparkSession).
    Dataset<Row> df = sparkSession.read().json("file1.json");

    Encoder<String> encoder = Encoders.STRING();

    // Prior to Java 8: anonymous MapFunction implementation.
    List<String> rowsList = df.map(new MapFunction<Row, String>() {
        private static final long serialVersionUID = 1L;

        @Override
        public String call(Row row) throws Exception {
            return "string:>" + row.getString(0) + "<";
        }
    }, encoder).collectAsList();

    // From Java 8 onwards: a lambda. The cast to MapFunction is needed
    // because Dataset.map is overloaded and the lambda is otherwise ambiguous.
    List<String> rowsList1 = df.map(
            (MapFunction<Row, String>) row -> "string:>" + row.getString(0) + "<",
            encoder).collectAsList();

    System.out.println(">>> " + rowsList);
    System.out.println(">>> " + rowsList1);
}
```