How to create a Row from a List or array in Spark using Java

In Java, I use RowFactory.create() to create a Row:

Row row = RowFactory.create(record.getLong(1), record.getInt(2), record.getString(3)); 

where "record" is a record from the database. I cannot know the length of "record" in advance, so I want to build the Row from a List or an array. In Scala, I can use Row.fromSeq() to create a Row from a list or array, but how can I achieve this in Java?

+6
4 answers

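I'm not sure I understood your question correctly, but you can use RowFactory to create a Row from an ArrayList in Java. RowFactory.create accepts a variable number of Object arguments, so the array returned by toArray() supplies one field per element: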

 List<MyData> mlist = new ArrayList<MyData>();
 mlist.add(d1);
 mlist.add(d2);
 Row row = RowFactory.create(mlist.toArray());
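Why toArray() matters here can be shown without Spark on the classpath. The following is a minimal sketch, not Spark code: the class VarargsSpread and its create method are hypothetical stand-ins that only mimic the Object... parameter of RowFactory.create. Passing an Object[] fills the varargs slots one element per field, whereas passing the List itself would produce a single field that holds the whole list.

```java
import java.util.ArrayList;
import java.util.List;

public class VarargsSpread {
    // Hypothetical stand-in for RowFactory.create(Object... values);
    // it simply returns the values it received so they can be inspected.
    public static Object[] create(Object... values) {
        return values;
    }

    public static void main(String[] args) {
        // The number of fields is only known at runtime.
        List<Object> fields = new ArrayList<>();
        fields.add(1L);
        fields.add(2);
        fields.add("three");

        // An Object[] is passed through as the varargs array: one slot per element.
        Object[] spread = create(fields.toArray());

        // A List is not an array, so it is wrapped as a single argument.
        Object[] wrapped = create(fields);

        System.out.println("toArray(): " + spread.length + " fields");
        System.out.println("List:      " + wrapped.length + " field");
    }
}
```

Assuming Spark is available, the same pattern should carry over directly: collect the values into a List<Object> of whatever length the record turns out to have, then call RowFactory.create(fields.toArray()).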
+10

Real applications often need to create Datasets or DataFrames. The following example creates rows and a Dataset in a Java application:

 // Initialize the SQLContext first
 SQLContext sqlContext = ...
 // createStructField, StringType and IntegerType are static members of DataTypes
 StructType schemata = DataTypes.createStructType(
     new StructField[]{
         createStructField("NAME", StringType, false),
         createStructField("STRING_VALUE", StringType, false),
         createStructField("NUM_VALUE", IntegerType, false),
     });
 Row r1 = RowFactory.create("name1", "value1", 1);
 Row r2 = RowFactory.create("name2", "value2", 2);
 List<Row> rowList = ImmutableList.of(r1, r2);
 Dataset<Row> data = sqlContext.createDataFrame(rowList, schemata);
 // Print the resulting table
 data.show();
 +-----+------------+---------+
 | NAME|STRING_VALUE|NUM_VALUE|
 +-----+------------+---------+
 |name1|      value1|        1|
 |name2|      value2|        2|
 +-----+------------+---------+
+7

// Create DTO List

 List<MyDTO> dtoList = Arrays.asList(.....);

// Create DTO Dataset

 Dataset<MyDTO> dtoSet = sparkSession.createDataset(dtoList, Encoders.bean(MyDTO.class)); 

// If You Need A Row Dataset

 Dataset<Row> rowSet = dtoSet.select("col1", "col2", "col3");
0

For simple list values, you can use Encoders:

 List<Row> rows = ImmutableList.of(RowFactory.create(new Timestamp(currentTime)));
 Dataset<Row> input = sparkSession.createDataFrame(rows, Encoders.TIMESTAMP().schema());
-1