java.lang.OutOfMemoryError when transforming XML in a huge directory

I want to transform XML files with XSLT 2 in a huge directory with many levels. There are more than 1 million files, each between 4 and 10 kB. After a while, I always get java.lang.OutOfMemoryError: Java heap space.

My command: java -Xmx3072M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512M ...

Increasing -Xmx is not a good solution.

Here is my code:

for (File file : dir.listFiles()) {
    if (file.isDirectory()) {
        pushDocuments(file);
    } else {
        indexFiles.index(file);
    }
}

public void index(File file) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

    try {
        xslTransformer.xslTransform(outputStream, file);
        outputStream.flush();
        outputStream.close();
    } catch (IOException e) {
        System.err.println(e.toString());
    }
}

The XSLT transformation uses net.sf.saxon.s9api:

public void xslTransform(ByteArrayOutputStream outputStream, File xmlFile) {
    try {
        XdmNode source = proc.newDocumentBuilder().build(new StreamSource(xmlFile));
        Serializer out = proc.newSerializer();
        out.setOutputStream(outputStream);
        transformer.setInitialContextNode(source);
        transformer.setDestination(out);
        transformer.transform();

        out.close();
    } catch (SaxonApiException e) {
        System.err.println(e.toString());
    }
}
+4
4 answers

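With the Saxon s9api interface, the usual recommendation is to reuse the XsltExecutable but create a new XsltTransformer for each transformation. The XsltTransformer caches documents it has read in case they are needed again, which is not what you want here.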

Alternatively, call xsltTransformer.getUnderlyingController().clearDocumentPool() after each transformation.
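The "compile once, new transformer per document" pattern can be sketched with the standard JAXP API, where Templates plays the role of XsltExecutable and Transformer the role of XsltTransformer. This is only an analogy under stated assumptions: the class name PerDocumentTransformer is hypothetical, and the JDK's built-in processor handles XSLT 1.0 only. In s9api the corresponding calls would be XsltCompiler.compile(...) once and XsltExecutable.load() per file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: compile the stylesheet once, use a fresh transformer per document. */
final class PerDocumentTransformer {
    // Compiled stylesheet; it is reused for every document.
    private final Templates templates;

    PerDocumentTransformer(String xslt) throws TransformerException {
        templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
    }

    /** Transforms one document with a new Transformer that can be collected afterwards. */
    String transform(String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

Because no transformer outlives a single call, any per-transformer state becomes unreachable as soon as that call returns.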

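(Note that you can also ask Saxon questions at saxonica.plan.io, where they are more likely to be noticed. On StackOverflow, a question without a product-specific tag can easily be missed.)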

+5


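Run jstat -gc {pid} 10s to watch how the heap and the garbage collector behave over time. If memory usage does not drop even after a Full GC, attach a tool such as VisualVM to see what is holding on to the memory. jmap -histo:live {pid} | head -20 lists the classes that occupy the most heap.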
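As a complement to watching the process from outside, heap usage can also be logged from inside the JVM every so many files. The sketch below is an assumption and not part of the original answer: HeapLogger, summary and logEvery are hypothetical names, and it relies only on the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper that reports heap usage from inside the JVM. */
final class HeapLogger {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Returns a one-line summary of the current heap usage in MiB. */
    static String summary() {
        MemoryUsage heap = MEMORY.getHeapMemoryUsage();
        long usedMiB = heap.getUsed() / (1024 * 1024);
        long max = heap.getMax(); // -1 when the maximum is undefined
        String maxText = max < 0 ? "undefined" : (max / (1024 * 1024)) + " MiB";
        return "heap used=" + usedMiB + " MiB, max=" + maxText;
    }

    /** Prints the summary to stderr once every `interval` processed files. */
    static void logEvery(long processed, long interval) {
        if (processed % interval == 0) {
            System.err.println(processed + " files: " + summary());
        }
    }
}
```

Calling HeapLogger.logEvery(count, 10000) inside the indexing loop would show whether the used heap keeps climbing as more files are processed.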


0

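String[] files = dir.list();   // names only, relative to dir; null if dir cannot be read
if (files != null) {
    for (String fileName : files) {
        File file = new File(dir, fileName);   // resolve the name against its parent directory
        if (file.isDirectory()) {
            pushDocuments(file);
        } else {
            indexFiles.index(file);
        }
    }
}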
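Going one step further than dir.list(), the java.nio.file API can walk the tree while reading directory entries incrementally instead of returning each listing as an array. This is a minimal sketch, not the answerer's code: TreeWalker and forEachFile are hypothetical names, and the handler stands in for indexFiles.index.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.function.Consumer;

/** Hypothetical helper that visits every regular file below a root directory. */
final class TreeWalker {
    /** Calls handler for each regular file under root and returns how many were visited. */
    static long forEachFile(Path root, Consumer<Path> handler) throws IOException {
        long[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    handler.accept(file);
                    count[0]++;
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }
}
```

With the names from the question, a call might look like TreeWalker.forEachFile(dir.toPath(), p -> indexFiles.index(p.toFile())).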
0

I had a similar problem caused by the javax.xml.transform package, which used a ThreadLocalMap to cache XML fragments read during the XSLT run. I had to run the XSLT in its own thread so that the ThreadLocalMap would be cleared when that thread died, which freed the memory. See here: https://www.ahoi-it.de/ahoi/news/java-xslt-memory-leak/1446
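The idea of running the work on a thread that dies afterwards can be sketched as follows. FreshThreadRunner and runInFreshThread are hypothetical names, and the task stands in for one transformation or a batch of them. Whether this helps depends on the memory really being held in thread-locals.

```java
/** Hypothetical helper: run a task on a short-lived thread so its thread-locals die with it. */
final class FreshThreadRunner {
    /** Runs the task on a new thread, waits for it to finish, and rethrows any failure. */
    static void runInFreshThread(Runnable task) throws InterruptedException {
        Throwable[] failure = new Throwable[1];
        Thread worker = new Thread(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure[0] = t; // handed back to the caller after join()
            }
        }, "xslt-worker");
        worker.start();
        worker.join(); // once the thread has ended, its ThreadLocalMap is unreachable
        if (failure[0] != null) {
            throw new IllegalStateException("task failed", failure[0]);
        }
    }
}
```

Starting one thread per file would be costly with a million files, so in practice the task would probably cover a batch of files, for example one subdirectory.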

0