Out of memory error while reading a large CSV file (millions of lines) in Java

I get an out of memory error while reading a large CSV file in Java. How can I deal with this problem? I increased the heap size and also tried using BufferedReader, but the problem persists. Here is my code:

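import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class CsvParser {
    public static void main(String[] args) {
        // try-with-resources closes the file even if parsing fails
        try (FileReader fr = new FileReader((args.length > 0) ? args[0] : "data.csv")) {
            Map<String, List<String>> values = parseCsv(fr, " ", true);
            System.out.println(values);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Reads the whole file into memory as a map of column name -> column values.
    public static Map<String, List<String>> parseCsv(Reader reader, String separator, boolean hasHeader)
            throws IOException {
        Map<String, List<String>> values = new LinkedHashMap<String, List<String>>();
        List<String> columnNames = new ArrayList<String>(); // accessed by index below
        BufferedReader br = new BufferedReader(reader);
        String line;
        boolean firstLine = true;
        while ((line = br.readLine()) != null) {
            // skip blank lines and comment lines
            if (line.trim().isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] tokens = line.split(separator); // note: separator is treated as a regex
            if (firstLine) {
                firstLine = false;
                for (int i = 0; i < tokens.length; ++i) {
                    String name = hasHeader ? tokens[i] : ("row_" + i);
                    columnNames.add(name);
                    values.put(name, new LinkedList<String>());
                }
                if (hasHeader) {
                    continue; // the header line holds no data
                }
            }
            // tokens beyond the known columns are ignored instead of throwing
            for (int i = 0; i < tokens.length && i < columnNames.size(); ++i) {
                values.get(columnNames.get(i)).add(tokens[i]);
            }
        }
        return values;
    }
}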
6 answers

- , , , . RandomAccessFile MappedByteBuffer NIO. , . , . .
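As a rough illustration of the MappedByteBuffer (NIO) approach named above, here is a minimal sketch that maps a file in fixed-size chunks and scans the bytes without loading the file onto the heap. The class name `MappedLineCounter`, the method `countLines`, and the chunk size are hypothetical choices for this example, and counting newlines only stands in for whatever per-byte processing is actually needed.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedLineCounter {
    // Counts '\n' bytes by mapping the file chunk by chunk (hypothetical helper).
    static long countLines(Path file, int chunkSize) throws IOException {
        long lines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            for (long pos = 0; pos < size; pos += chunkSize) {
                long len = Math.min(chunkSize, size - pos);
                // Only this window of the file is mapped at a time.
                MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    if (buf.get() == '\n') {
                        lines++;
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch runs on its own.
        Path file = Files.createTempFile("sample", ".csv");
        file.toFile().deleteOnExit();
        Files.write(file, "a b\n1 2\n3 4\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(countLines(file, 4)); // prints 3
    }
}
```

Mapping only moves the raw bytes off the heap. If every value is still collected into a map of lists afterwards, the memory problem from the question remains.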

+1

csv-file, , values, "" + .

, "" CSV . , , , , ,
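Because the `values` map in the question keeps every field of the file on the heap, one alternative is to handle each row as it is read and keep only the result. The sketch below assumes the work can be done one row at a time; `StreamingCsv`, `forEachRow`, and the summing handler are hypothetical names for this example, and header handling is left out for brevity.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

public class StreamingCsv {
    // Passes each data row to the handler and stores nothing; returns the row count.
    static long forEachRow(Reader reader, String separator, Consumer<String[]> handler)
            throws IOException {
        long rows = 0;
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                // same skipping rules as the question: blank lines and '#' comments
                if (line.trim().isEmpty() || line.startsWith("#")) {
                    continue;
                }
                handler.accept(line.split(separator));
                rows++;
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        long[] sum = {0}; // single-element array so the lambda can update it
        long rows = forEachRow(new StringReader("1 2\n3 4\n# comment\n5 6\n"), " ",
                tokens -> sum[0] += Long.parseLong(tokens[1]));
        System.out.println(rows + " rows, sum of second column = " + sum[0]);
        // prints: 3 rows, sum of second column = 12
    }
}
```

With this shape, memory use depends on the longest line and on what the handler keeps, not on the number of lines in the file.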

+1

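A rough estimate: let C be the number of columns, L the number of lines, and B the average number of bytes per value, on a 64-bit JVM:

  • The CSV file itself is C × L × B bytes. Held as Strings, each value takes about (32 + 24 + 2 × B) bytes instead of B. Stored as UTF-8 byte arrays, each value takes about (24 + B) bytes.

  • A LinkedList adds about 40 bytes per node, so 40 × C × L bytes in total. An ArrayList needs only about 8 bytes per element.

In total that is about (96 + 2 × B) × L × C bytes for Strings in LinkedLists. With ArrayLists of byte arrays it is about (32 + B) × L × C bytes.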
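To make the two totals concrete, the sketch below plugs sample numbers into them. The figures (10 columns, 5 million lines, 8 bytes per value) are assumptions chosen only for illustration, and `CsvMemoryEstimate` and its methods are hypothetical names.

```java
public class CsvMemoryEstimate {
    // Strings in LinkedLists: (96 + 2 × B) × L × C bytes
    static long stringsInLinkedLists(long columns, long lines, long bytesPerValue) {
        return (96 + 2 * bytesPerValue) * lines * columns;
    }

    // Byte arrays in ArrayLists: (32 + B) × L × C bytes
    static long byteArraysInArrayLists(long columns, long lines, long bytesPerValue) {
        return (32 + bytesPerValue) * lines * columns;
    }

    public static void main(String[] args) {
        long c = 10;         // columns (assumed)
        long l = 5_000_000L; // lines (assumed)
        long b = 8;          // average bytes per value (assumed)

        System.out.println("file size:             " + (c * l * b));
        System.out.println("Strings + LinkedList:  " + stringsInLinkedLists(c, l, b));
        System.out.println("byte[] + ArrayList:    " + byteArraysInArrayLists(c, l, b));
        // file size:             400000000
        // Strings + LinkedList:  5600000000
        // byte[] + ArrayList:    2000000000
    }
}
```

Under these assumptions a 400 MB file needs roughly 5.6 GB as Strings in LinkedLists, which is why raising the heap size alone may not help.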

+1

. , , .

Try uniVocity-parsers for parsing the CSV file. It is a CSV parser library for Java, released as open source (Apache V2.0 license).

, 42 1 ,

Here is how to parse a CSV file with uniVocity-parsers:

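import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;
import java.io.FileReader;
import java.util.List;

// default settings
CsvParserSettings settings = new CsvParserSettings();
CsvParser parser = new CsvParser(settings);

// parses all rows in one go; every row is held in memory as a String[]
List<String[]> allRows = parser.parseAll(new FileReader(yourFile));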
+1

, .

, OutOfMemory.

, . , sqlite, , . , , , .
