The fastest way to read a CSV file in Java

I noticed that java.util.Scanner is very slow when reading large files (in my case, CSV files).

I want to change the way I read files to improve performance. The following is what I have at the moment. Note that I am developing for Android:

    InputStreamReader inputStreamReader;
    try {
        inputStreamReader = new InputStreamReader(context.getAssets().open("MyFile.csv"));
        Scanner inputStream = new Scanner(inputStreamReader);
        inputStream.nextLine(); // Ignores the first line
        while (inputStream.hasNext()) {
            String data = inputStream.nextLine(); // Gets a whole line
            String[] line = data.split(","); // Splits the line up into a string array

            if (line.length > 1) {
                // Do stuff, e.g:
                String value = line[1];
            }
        }
        inputStream.close();
    } catch (IOException e) {
        e.printStackTrace();
    }

Using Traceview, I was able to find the main performance problems, specifically java.util.Scanner.nextLine() and java.util.Scanner.hasNext().
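
To check whether the difference matters on a particular device, a rough timing comparison can help. The following is only a minimal sketch, not a rigorous benchmark: it reads an in-memory CSV (a stand-in for the asset file) once with Scanner and once with BufferedReader. The names ReaderTiming, buildCsv, countWithScanner and countWithBufferedReader are made up for this illustration, and no particular timings are claimed. One commonly cited reason for the gap is that Scanner matches its input against regular expressions internally, while BufferedReader only looks for line terminators in a buffered block of characters.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

class ReaderTiming {
    // Builds an in-memory CSV with one header row and the given number of data rows.
    static String buildCsv(int rows) {
        StringBuilder sb = new StringBuilder("id,name,value\n");
        for (int i = 0; i < rows; i++) {
            sb.append(i).append(",name").append(i).append(',').append(i * 2).append('\n');
        }
        return sb.toString();
    }

    // Counts data rows with Scanner, similar to the loop in the question.
    static int countWithScanner(String csv) {
        int count = 0;
        try (Scanner scanner = new Scanner(new StringReader(csv))) {
            scanner.nextLine(); // skip the header row
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
        }
        return count;
    }

    // Counts data rows with BufferedReader.
    static int countWithBufferedReader(String csv) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            reader.readLine(); // skip the header row
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String csv = buildCsv(30_000); // roughly the file size mentioned in the question
        long t0 = System.nanoTime();
        int scannerRows = countWithScanner(csv);
        long t1 = System.nanoTime();
        int bufferedRows = countWithBufferedReader(csv);
        long t2 = System.nanoTime();
        System.out.println("Scanner:        " + scannerRows + " rows, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("BufferedReader: " + bufferedRows + " rows, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

A single run like this is noisy (JIT warm-up, garbage collection), so it is best treated as a quick sanity check and repeated several times on the target device.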

I looked at other questions (like this one) and came across some CSV readers such as Apache Commons CSV, but they don't seem to have much information on how to use them, and I'm not sure how much faster they would be.

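I have also been told about using FileReader and BufferedReader in other answers, but again, I do not know whether the improvement would be significant.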

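My file is about 30,000 lines long, and with the code I have at the moment (see above), it takes at least 1 minute to read values from about 600 lines down, so I have not timed how long it would take to read values from more than 2,000 lines down, but sometimes, while reading, the Android app becomes unresponsive and crashes.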

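Although I could simply change parts of my code and see for myself, I would like to know whether there are faster alternatives I have not mentioned, or whether I should just use FileReader and BufferedReader. Would it be faster to split the huge file into smaller files and choose which one to read depending on the information I need? Ideally, I would also like to know why the fastest method is the fastest (that is, what makes it fast).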


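When an operation may take a long time, as reading a large file can, it is good practice to run it in a background task rather than on the UI thread, so that the app stays responsive.

The AsyncTask class helps with that: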

private class LoadFilesTask extends AsyncTask<String, Integer, Long> {

    // Runs on a background thread; str[0] is the asset file name
    protected Long doInBackground(String... str) {
        long lineNumber = 0;
        InputStreamReader inputStreamReader;
        try {
            inputStreamReader = new InputStreamReader(context.getAssets().open(str[0]));
            Scanner inputStream = new Scanner(inputStreamReader);
            inputStream.nextLine(); // Ignores the first line

            while (inputStream.hasNext()) {
                lineNumber++;
                String data = inputStream.nextLine(); // Gets a whole line
                String[] line = data.split(","); // Splits the line up into a string array

                if (line.length > 1) {
                    // Do stuff, e.g:
                    String value = line[1];
                }
            }
            inputStream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return lineNumber;
    }

    // If you need to show the progress use this method
    protected void onProgressUpdate(Integer... progress) {
        setYourCustomProgressPercent(progress[0]);
    }

    // This method is triggered at the end of the process, in your case when the loading has finished
    protected void onPostExecute(Long result) {
        showDialog("File Loaded: " + result + " lines");
    }
}

... and to run it:

new LoadFilesTask().execute("MyFile.csv");
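
The same idea can be sketched in plain Java with an ExecutorService, which may be useful since AsyncTask has been deprecated in newer Android API levels. This is only a sketch under assumptions: BackgroundCsvLoader, countRows and loadAsync are hypothetical names, the CSV comes from a string rather than an asset, and delivering the result back to the Android main thread (which onPostExecute does for you) is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundCsvLoader {
    // Counts the data rows of a CSV source, skipping the header row.
    static long countRows(Reader source) {
        long rows = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            reader.readLine(); // skip the header row
            while (reader.readLine() != null) {
                rows++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    // Submits the counting work to the executor and returns a Future for the result,
    // so the calling thread is not blocked while the file is read.
    static Future<Long> loadAsync(ExecutorService executor, String csv) {
        Callable<Long> task = () -> countRows(new StringReader(csv));
        return executor.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Long> result = loadAsync(executor, "h1,h2\na,b\nc,d\n");
            // get() blocks until the background work is done; in a UI app you would
            // hand the result over to the UI thread instead of blocking on it.
            System.out.println("File loaded: " + result.get() + " lines");
        } finally {
            executor.shutdown();
        }
    }
}
```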



uniVocity-parsers has the fastest CSV parser you will find (2 times faster than OpenCSV, 3 times faster than Apache Commons CSV), with many unique features.

Here is a simple example of how to use it:

CsvParserSettings settings = new CsvParserSettings(); // many options here, have a look at the tutorial

CsvParser parser = new CsvParser(settings);

// parses all rows in one go
List<String[]> allRows = parser.parseAll(new FileReader(new File("your/file.csv")));

To make the process faster, you can select only the columns you are interested in:

settings.selectFields("Column X", "Column A", "Column Y");

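Normally, you should be able to parse 4 million rows in around 2 seconds. With column selection, the speed will improve by roughly 30%.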

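It is even faster if you use a RowProcessor. There are many implementations available out of the box for handling conversions to objects, POJOs, etc. The documentation explains all of the available features. It works like this: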

// let's get the values of all columns using a column processor
ColumnProcessor rowProcessor = new ColumnProcessor();
settings.setRowProcessor(rowProcessor);

// create the parser after configuring the settings
// (assumption: the parser picks up its row processor when it is constructed)
CsvParser parser = new CsvParser(settings);

// the parse() method will submit all rows to the row processor
parser.parse(new FileReader(new File("/examples/example.csv")));

// get the result from your row processor:
Map<String, List<String>> columnValues = rowProcessor.getColumnValuesAsMapOfNames();



Instead of Scanner, you should use BufferedReader:

BufferedReader reader = null;
try {
    reader = new BufferedReader(new InputStreamReader(context.getAssets().open("MyFile.csv")));
    reader.readLine(); // Ignores the first line
    String data;
    while ((data = reader.readLine()) != null) { // Gets a whole line
        String[] line = data.split(","); // Splits the line up into a string array
        if (line.length > 1) {
            // Do stuff, e.g:
            String value = line[1];
        }
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    if (reader != null) {
        try {
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        } 
    } 
}
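
A more compact variant of the same loop is sketched below, assuming the build supports try-with-resources so the reader is closed automatically. SecondColumnReader and readSecondColumn are hypothetical names, and the method takes any Reader so it can be tried without an Android asset. Like the code above, it relies on a plain split(","), so it does not handle quoted fields that contain commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class SecondColumnReader {
    // Returns the second column of every data row; the header row and
    // rows with fewer than two fields are skipped.
    static List<String> readSecondColumn(Reader source) throws IOException {
        List<String> values = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            reader.readLine(); // ignore the header row
            String data;
            while ((data = reader.readLine()) != null) {
                String[] line = data.split(","); // naive split, no quoting support
                if (line.length > 1) {
                    values.add(line[1]);
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,alice\n2,bob\n";
        System.out.println(readSecondColumn(new StringReader(csv)));
    }
}
```

On Android, the Reader passed in could be the same InputStreamReader over context.getAssets().open(...) used in the snippet above.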