What is the most efficient way to write from a database to a (zip) file in Java?

My program is fast enough, but I would rather give up some of that speed to reduce memory usage: a single user can use up to 300 MB, which means that a few of them at once can repeatedly crash the application. Most of the answers I found were about speed optimization, while others were just general ("if you write directly from the database to memory, there should not be much memory usage"). Well, it seems that there is :) I was thinking about not posting the code, so that I would not "lock" anyone into a particular idea, but on the other hand I could waste your time if you cannot see what I have already done, so here it is:

    // First I get the data from the database in a way that I think can't be more
    // optimized since i've done some testing and it seems to me that the problem
    // isn't in the RS and setting FetchSize and/or direction does not help.
    public static void generateAndWriteXML(String query, String oznaka,
            BufferedOutputStream bos, Connection conn) throws Exception {
        ResultSet rs = null;
        Statement stmt = null;
        try {
            stmt = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
            rs = stmt.executeQuery(query);
            writeToZip(rs, oznaka, bos);
        } finally {
            ConnectionManager.close(rs, stmt, conn);
        }
    }

    // then I open up my streams. In the next method I'll generate an XML from the
    // ResultSet and I want that XML to be saved in an XML, but since its size takes up
    // to 300MB, I want it to be saved in a ZIP. I'm thinking that maybe by writing
    // first to file, then to zip I could get a slower but more efficient program.
    private static void writeToZip(ResultSet rs, String oznaka, BufferedOutputStream bos)
            throws SAXException, SQLException, IOException {
        ZipEntry ze = new ZipEntry(oznaka + ".xml");
        ZipOutputStream zos = new ZipOutputStream(bos);
        zos.putNextEntry(ze);
        OutputStreamWriter writer = new OutputStreamWriter(zos, "UTF8");
        writeXMLToWriter(rs, writer);
        try {
            writer.close();
        } catch (IOException e) {
        }
        try {
            zos.closeEntry();
        } catch (IOException e) {
        }
        try {
            zos.flush();
        } catch (IOException e) {
        }
        try {
            bos.close();
        } catch (IOException e) {
        }
    }

    // And finally, the method that does the actual generating and writing.
    // This is the second point I think I could do the memory optimization since the
    // DataWriter is custom and it extends a custom XMLWriter that extends the standard
    // org.xml.sax.helpers.XMLFilterImpl I've tried with flushing at points in program,
    // but the memory that is occupied remains the same, it only takes longer.
    public static void writeXMLToWriter(ResultSet rs, Writer writer)
            throws SAXException, SQLException, IOException {
        //Set up XML
        DataWriter w = new DataWriter(writer);
        w.startDocument();
        w.setIndentStep(2);
        w.startElement(startingXMLElement);
        // Get the metadata
        ResultSetMetaData meta = rs.getMetaData();
        int count = meta.getColumnCount();
        // Iterate over the set
        while (rs.next()) {
            w.startElement(rowElement);
            for (int i = 0; i < count; i++) {
                Object ob = rs.getObject(i + 1);
                if (rs.wasNull()) {
                    ob = null;
                }
                // XML elements are repeated so they could benefit from caching
                String colName = meta.getColumnLabel(i + 1).intern();
                if (ob != null) {
                    if (ob instanceof Timestamp) {
                        w.dataElement(colName, Util.formatDate((Timestamp) ob, dateFormat));
                    } else if (ob instanceof BigDecimal) {
                        // Possible benefit from writing ints as strings and interning them
                        w.dataElement(colName, Util.transformToHTML(new Integer(((BigDecimal) ob).intValue())));
                    } else {
                        // there enough of data that repeated to validate the use of interning
                        w.dataElement(colName, ob.toString().intern());
                    }
                } else {
                    w.emptyElement(colName);
                }
            }
            w.endElement(rowElement);
        }
        w.endElement(startingXMLElement);
        w.endDocument();
    }

EDIT: Here is an example of the memory usage (measured with VisualVM):

Memory usage screenshot

EDIT2: The database is Oracle 10.2.0.4, and after setting ResultSet.TYPE_FORWARD_ONLY I got a maximum of 50 MB of usage! As I said in the comments, I will keep an eye on this, but it is really promising.

Memory usage after adding ResultSet.TYPE_FORWARD_ONLY

EDIT3: There seems to be another possible optimization. As I said, I am generating XML, which means that a lot of data is repeated (if nothing else, the tags), so String.intern() could help me here. I will post an update once I have tested this.
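To illustrate the idea behind EDIT3, here is a minimal sketch of what String.intern() does. The class and method names (`InternDemo`, `sameInstance`) are hypothetical and exist only for this illustration. Note that interning can only save memory for strings that stay referenced; the temporary string produced by a call such as `toString()` is still allocated before it is interned.

```java
public final class InternDemo {

    // Returns true when both arguments refer to the same object on the heap.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // Two equal strings created at runtime are separate objects.
        String first = new String("ROW");
        String second = new String("ROW");

        // Without interning: two distinct copies of "ROW".
        System.out.println(sameInstance(first, second));

        // With interning: both calls return the single pooled copy.
        System.out.println(sameInstance(first.intern(), second.intern()));
    }
}
```

The first line printed is `false` and the second is `true`, since `intern()` always returns the canonical instance for a given character sequence.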

3 answers

I did some more tests, and the conclusions are:

  • The biggest gain came from the JVM version (or VisualVM has problems monitoring the Java 5 heap space :). When I first reported that ResultSet.TYPE_FORWARD_ONLY gave me a significant gain, I was mistaken. The biggest gain came from using Java 5, in which the program used up to 50 MB of heap, in contrast to Java 6, in which the same code took up to 150 MB.
  • The second gain is from ResultSet.TYPE_FORWARD_ONLY, which made the program take as little memory as possible.
  • The third gain is from String.intern(), which made the program take a bit less memory, since it reuses cached strings instead of creating new ones.

This is the memory usage with optimizations 2 and 3 (without String.intern() the graph would look the same, you would only have to add about 5 MB to each point):

[Image: memory usage graph with optimizations 2 and 3]

and this is the usage without them (the drop at the end is because the program ran out of memory :)):

[Image: memory usage graph without the optimizations]

Thank you all for your help.


Can you use ResultSet.TYPE_FORWARD_ONLY?

You used ResultSet.TYPE_SCROLL_INSENSITIVE. I believe that for some databases (you did not say which one you are using), this leads to loading the entire result set into memory.
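As a rough sketch of the suggestion above, the statement could be created as forward-only and read-only, with a fetch size hint. The class name `ForwardOnlyQuery`, the method name `openForwardOnly`, and the idea of passing the fetch size as a parameter are assumptions made for this illustration; whether rows are actually streamed rather than cached depends on the database and JDBC driver.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public final class ForwardOnlyQuery {

    // Creates a statement whose result sets can only be read front to back,
    // so the driver has no need to keep earlier rows around for scrolling.
    // The caller is responsible for closing the returned statement.
    static Statement openForwardOnly(Connection conn, int fetchSize) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
        // A hint only: the driver may ignore it or round it.
        stmt.setFetchSize(fetchSize);
        return stmt;
    }
}
```

In the question's code this would replace the `conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ...)` call; the rest of the method can stay as it is, because the result set is only ever read with `rs.next()`.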


Since this is Java, memory usage should only spike temporarily, unless you are leaking references, for example by pushing things onto a list that is a member of a singleton that lives for the entire program. More likely, in my experience, it is a resource leak. That happens (I am assuming this applies to Java, although I am thinking of C#) when objects that use unmanaged resources, such as file handles, never get their cleanup code called, a condition usually caused by empty exception handlers that don't re-throw to the parent stack frame, which has the net effect of bypassing the finally block...
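One way to make sure the cleanup always runs is to let the language close the stream. The sketch below assumes Java 7 or later (try-with-resources was not available on the Java 5/6 JVMs mentioned in the question) and uses a hypothetical `ZipStreamSketch.writeRowsToZip` helper that takes ready-made row strings instead of a ResultSet, so that it is self-contained. It is an illustration of the pattern, not a drop-in replacement for the question's `writeToZip`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public final class ZipStreamSketch {

    // Writes each row straight into a single ZIP entry. Only one row is held
    // by this method at a time, and the ZIP stream (and the stream it wraps)
    // is closed even if writing fails, with the exception still propagating.
    static void writeRowsToZip(Iterable<String> rows, String entryName, OutputStream out)
            throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            zos.putNextEntry(new ZipEntry(entryName + ".xml"));
            Writer writer = new OutputStreamWriter(zos, StandardCharsets.UTF_8);
            for (String row : rows) {
                writer.write(row);
                writer.write('\n');
            }
            // Push any buffered characters into the ZIP entry before closing it.
            writer.flush();
            zos.closeEntry();
        }
    }
}
```

Compared with a chain of `try { ... } catch (IOException e) { }` blocks, nothing is swallowed here: a failure while writing surfaces to the caller, and the stream is still released.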
