How to create a large (30 MB+) XML file in Java?

The file itself is not that large and should fit into memory. Combined with other overhead, however, it starts to become a problem. We currently build a DOM in memory, and that does not scale for us. Writing to raw output streams seems problematic because we would have to take care of escaping special characters ourselves.

What are good approaches for this?

Are there any libraries for this?

+6
java dom xml stream scalability
4 answers

StAX (the Streaming API for XML) provides a convenient API for writing XML to an output stream. A simple tutorial is here.
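
As a rough illustration of that approach, here is a minimal sketch that uses the JDK's built-in `javax.xml.stream.XMLStreamWriter`. The class name `LargeXmlWriter`, the element names `items` and `item`, and the output file `large.xml` are made up for the example and are not from the question. Elements are written to the stream one at a time, so no DOM is held in memory, and `writeCharacters` takes care of escaping `<` and `&`.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class LargeXmlWriter {

    // Streams an <items> document with `count` <item> children to `out`.
    // Each element is written as it is produced; no tree is built in memory.
    public static void writeItems(OutputStream out, int count) throws XMLStreamException {
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        try {
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("items");
            for (int i = 0; i < count; i++) {
                w.writeStartElement("item");
                w.writeAttribute("id", Integer.toString(i));
                // writeCharacters escapes < and &, so no manual escaping is needed
                w.writeCharacters("a < b & c");
                w.writeEndElement();
            }
            w.writeEndElement();
            w.writeEndDocument();
            w.flush();
        } finally {
            w.close(); // closes the writer only, not the underlying stream
        }
    }

    public static void main(String[] args) throws IOException, XMLStreamException {
        // Hypothetical output path; pass a different one as the first argument.
        String path = args.length > 0 ? args[0] : "large.xml";
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(Paths.get(path)))) {
            writeItems(out, 1_000_000);
        }
    }
}
```

In a real application the loop body would pull records from wherever the data lives (a database cursor, an iterator) rather than writing a fixed string, so memory use should stay roughly constant regardless of the file size.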

+9

Try XStream

Features

  • Ease of use. A high-level façade is provided that simplifies common use cases.
  • No mappings required. Most objects can be serialized without the need to specify mappings.
  • Performance. Speed and a low memory footprint are a crucial part of the design, which makes it suitable for large object graphs or systems with high message throughput.
  • Clean XML. No information is duplicated that can be obtained via reflection. This results in XML that is easier for humans to read and more compact than native Java serialization.
  • Requires no modifications to objects. Serializes internal fields, including private and final. Supports non-public and inner classes. Classes are not required to have a default constructor.
  • Full object graph support. Duplicate references encountered in the object model will be maintained. Supports circular references.
  • Integrates with other XML APIs. By implementing an interface, XStream can serialize directly to/from any tree structure (not just XML).
  • Customizable conversion strategies. Strategies can be registered, allowing you to customize how specific types are represented as XML.
  • Error messages. When an exception is thrown due to malformed XML, detailed diagnostic information is provided to help isolate and fix the problem.
  • Alternative output format. The modular design allows the use of other output formats. XStream currently ships with JSON support and morphing.
+1

With Saxon, you can use the StAX XMLStreamWriter API in conjunction with a Serializer, which gives you full control over the serialization properties defined for xsl:output, such as controlling indentation or using CDATA sections. See the s9api Serializer class.

+1

It depends on how your data is structured, but a StAX implementation may be what you are looking for, for example Woodstox.

0
