What are the disadvantages of XML?

By reading StackOverflow and listening to the Joel Spolsky and Jeff Atwood podcasts, I am starting to believe that many developers hate using XML, or at least try to avoid it as much as possible for storing or exchanging data.

On the other hand, I really enjoy using XML for several reasons:

  • XML serialization is implemented in most modern languages and is extremely easy to use,
  • Although slower than binary serialization, XML serialization is very useful when the same data is used from several programming languages, or when the data is meant to be read and understood by a human, even if only for debugging (JSON, for example, is more difficult to understand),
  • XML supports Unicode and, when used properly, causes no problems with different encodings, characters, etc.,
  • There are plenty of tools that make working with XML data easy. XSLT, for example, simplifies presenting and transforming data; XPath makes searching for data easy,
  • XML can be stored in some SQL servers, which enables scenarios where data too complicated to be easily stored in SQL tables must be saved and manipulated; JSON or binary data, for example, cannot be manipulated directly through SQL (except by manipulating strings, which is crazy in most cases),
  • XML does not require any application to be installed. If I want my application to use a database, I must first install a database server. If I want my application to use XML, I don't have to install anything,
  • XML is much more explicit and extensible than, for example, the Windows Registry or INI files,
  • In most cases there are no CR-LF problems, thanks to the level of abstraction XML provides.
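
To make the serialization and XPath points above concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The `settings`, `user`, and `option` names are invented for illustration only, and ElementTree supports just a subset of XPath, so this is not a statement about any particular application's format.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings document, used only for illustration.
root = ET.Element("settings")
user = ET.SubElement(root, "user", {"name": "Zoë"})  # non-ASCII text
ET.SubElement(user, "option", {"key": "theme"}).text = "dark"
ET.SubElement(user, "option", {"key": "lang"}).text = "fr"

# Serialize the tree to UTF-8 encoded bytes.
data = ET.tostring(root, encoding="utf-8")

# Parse the bytes back and query them with ElementTree's XPath subset.
parsed = ET.fromstring(data)
theme = parsed.find("./user/option[@key='theme']").text
user_name = parsed.find("user").get("name")

print(theme, user_name)
```

No server or extra package is needed here, which is the "nothing to install" point from the list.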

So, taking into account all the benefits of using XML, why do many developers hate using it? IMHO, the only problem is that:

  • XML is too verbose and requires much more space than most other forms of data, especially when it comes to Base64 encoding.
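
The Base64 part of that cost is easy to quantify. The sketch below uses random bytes as a stand-in for binary data and a made-up `<blob>` element; only the 3-bytes-to-4-characters ratio is a property of Base64 itself.

```python
import base64
import os

payload = os.urandom(3000)            # stand-in for binary data such as an image
encoded = base64.b64encode(payload)   # text-safe form that can sit inside an element

# Base64 maps every 3 input bytes to 4 output characters.
overhead = len(encoded) / len(payload)

# Hypothetical wrapper element; the tags add a little more on top.
xml_fragment = b"<blob>" + encoded + b"</blob>"

print(len(payload), len(encoded), round(overhead, 2))
```

So embedded binary data grows by about a third before any markup is counted.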

Of course, there are many scenarios in which XML is not suitable at all. Storing SO questions and answers in an XML file on the server side would be completely wrong. And when you are storing AVI videos or a bunch of JPG images, XML is the worst thing you could use.

But what about other scenarios? What are the disadvantages of XML?


For people who thought this question was not a real question:

Unlike questions such as "Significant new inventions in computing since 1980", my question is a very clear one: it asks people to explain what weaknesses they experience when using XML and why they dislike it. It does not invite discussion about whether, for example, XML is good or bad. It does not require extended discussion either; the answers received so far are short and precise and provide the information I wanted.

It is a community wiki, though, since there cannot be a single best answer to this question.

According to SO, "not a real question" is a question where "It's hard to say what is being asked here. This question is ambiguous, vague, incomplete or rhetorical and cannot be reasonably answered in its current form."

  • What is being asked here: I think the question itself is very clear, and the several paragraphs of text above make it even clearer,
  • This question is ambiguous, vague, incomplete: again, there is nothing ambiguous, vague, or incomplete about it,
  • or rhetorical: it is not; the answer to my question is not something obvious,
  • and cannot be reasonably answered: several people have already given great answers, showing that the question can be answered reasonably.

It also seems quite obvious how to evaluate the answers and determine the accepted one. If an answer gives good reasons for what is wrong with XML, there is a good chance that it will be upvoted and then accepted.

+6
xml xml-serialization data-storage
6 answers

Some disadvantages:

  • It is difficult to link XML files with external resources, which is why the Office file formats use a zip envelope that contains skeleton XML files and the associated resource files. The alternative, Base64 encoding, is very verbose and does not allow good random access, which leads to the next point:
  • Random access is difficult. Neither of the two traditional ways of reading an XML file, building a DOM or forward-only SAX-style reading, really allows random access.
  • Concurrent write access to different parts of a file is difficult, so its use in Windows executables is error-prone.
  • What encoding does an XML file use? Strictly speaking, you first guess the encoding, then read the file and verify that the guess was correct.
  • It is difficult to version parts of a file. Therefore, if you want to provide granular version control, you need to split your data. This is not just a problem of the file format; it is also because tools usually provide per-file semantics: version control tools, synchronization tools like DropBox, etc.
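
The random-access point can be illustrated with a small sketch built on Python's standard `xml.etree.ElementTree.iterparse`. The `items`/`item` document and the `nth_item` helper are hypothetical; the point is only that reaching one record means parsing everything in front of it.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical document with 1000 <item> records.
doc = "<items>" + "".join(f"<item id='{i}'>v{i}</item>" for i in range(1000)) + "</items>"


def nth_item(stream, n):
    """Return the text of the n-th <item> (0-based) and the number of items parsed to reach it."""
    seen = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "item":
            continue
        if seen == n:
            return elem.text, seen + 1
        seen += 1
        elem.clear()  # discard the content of records already passed
    raise IndexError(n)


text, parsed_count = nth_item(io.BytesIO(doc.encode("utf-8")), 750)
print(text, parsed_count)
```

There is no way to seek straight to record 750: without an external index, the cost grows with the record's position in the file.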
+5
<xml> <noise> The </noise> <adjective> main </adjective> <noun> weakness </noun> <noise> of </noise> <subject> XML </subject> <noise> , </noise> <whocares> in my opinion </whocares> <noise> , </noise> <wildgeneralisation> is its verbosity </wildgeneralisation> <noise> . </noise> </xml> 
+5

I am not one of the people you are asking about, since I am a big XML fan. However, I can tell you one of the main complaints I have heard:

It is hard to work with. Hard in the sense that it requires knowledge of an API and that you need to write a relatively large amount of code to parse your XML. Although I would not say it is really that complicated, I can agree that a language designed to describe objects is easier to work with from a language that supports dynamically created objects.
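
As a rough illustration of that complaint, the sketch below loads the same made-up record from JSON and from XML using only the Python standard library. The `person` layout is an assumption chosen for the example.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
json_text = '{"name": "Chris", "tags": ["a", "b"], "age": 42}'
xml_text = "<person><name>Chris</name><tags><tag>a</tag><tag>b</tag></tags><age>42</age></person>"

# JSON maps directly onto native dicts, lists, strings and numbers.
from_json = json.loads(json_text)

# XML yields a tree of elements; the mapping to native types is written by hand.
root = ET.fromstring(xml_text)
from_xml = {
    "name": root.findtext("name"),
    "tags": [tag.text for tag in root.findall("./tags/tag")],
    "age": int(root.findtext("age")),  # every value starts out as a string
}

print(from_json == from_xml)
```

Both end up as the same dictionary, but the XML side needs knowledge of the element layout and explicit type conversion.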

+1

I think the overall reaction is simply because XML is overused.

However, if there is one thing I hate about XML, with a passion, it is namespaces. The productivity lost to namespace problems is terrible.
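
A typical way this bites, sketched with Python's standard `xml.etree.ElementTree`; the `feed`/`entry` document and the `http://example.com/ns` URI are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a default namespace; the URI is made up.
doc = '<feed xmlns="http://example.com/ns"><entry>first</entry></feed>'
root = ET.fromstring(doc)

# The obvious query silently finds nothing: the element's real name
# is the namespace-qualified "{http://example.com/ns}entry".
naive = root.find("entry")

# The query only matches once the namespace is spelled out.
ns = {"f": "http://example.com/ns"}
qualified = root.find("f:entry", ns)

print(naive, qualified.text, root.tag)
```

The first lookup returns `None` without any error, which is the kind of failure that tends to cost time.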

+1

XML descends from SGML, the Standard Generalized Markup Language. The purpose of SGML, and by extension XML, is to annotate text. XML does this well and has a wide range of tools that extend its usefulness to a variety of applications.

The problem, as I see it, is that XML is often used not to annotate text but to represent structured data, which is a subtle but important difference. From a practical point of view, structured data should be concise for a variety of reasons. Performance is the obvious one, especially when bandwidth is limited. This is probably one of the main reasons why JSON is so popular for web applications. A compact representation of the data on the wire means better scalability.

Unfortunately, JSON is not very readable without extra whitespace, which is almost always omitted. On the other hand, if you have ever tried to edit a large XML file in a command-line editor, you know it can be very awkward.
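
A quick way to see the size difference for structured data is to serialize the same records both ways. The sketch below uses only the Python standard library; the record layout is hypothetical and the exact ratio depends entirely on the data and on the element names chosen.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical list of records.
records = [{"sku": f"BL{i}", "quantity": i, "price": "450.00"} for i in range(100)]

# Compact JSON: no whitespace between tokens.
as_json = json.dumps(records, separators=(",", ":")).encode("utf-8")

# One element per field; every name appears in an opening and a closing tag.
root = ET.Element("products")
for rec in records:
    product = ET.SubElement(root, "product")
    for key, value in rec.items():
        ET.SubElement(product, key).text = str(value)
as_xml = ET.tostring(root, encoding="utf-8")

ratio = len(as_xml) / len(as_json)
print(len(as_json), len(as_xml), round(ratio, 2))
```

Compression narrows the gap on the wire, so this measures raw size only.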

Personally, I found that YAML strikes a good balance between the two extremes. Compare the following (copied from yaml.org with minor changes).

YAML:

    invoice: 34843
    date: 2001-01-23
    billto: &id001
        given: Chris
        family: Dumars
        address:
            lines: |
                458 Walkman Dr.
                Suite #292
            city: Royal Oak
            state: MI
            postal: 48046
    shipto: *id001
    product:
        - sku: BL394D
          quantity: 4
          description: Basketball
          price: 450.00
        - sku: BL4438H
          quantity: 1
          description: Super Hoop
          price: 2392.00
    tax: 251.42
    total: 4443.52
    comments: >
        Late afternoon is best.
        Backup contact is Nancy
        Billsmer @ 338-4338.

XML:

    <invoice>
        <number>34843</number>
        <date>2001-01-23</date>
        <billto id="id001">
            <given>Chris</given>
            <family>Dumars</family>
            <address>
                <lines>
                    458 Walkman Dr.
                    Suite #292
                </lines>
                <city>Royal Oak</city>
                <state>MI</state>
                <postal>48046</postal>
            </address>
        </billto>
        <shipto xref="id001" />
        <products>
            <product>
                <sku>BL394D</sku>
                <quantity>4</quantity>
                <description>Basketball</description>
                <price>450.00</price>
            </product>
            <product>
                <sku>BL4438H</sku>
                <quantity>1</quantity>
                <description>Super Hoop</description>
                <price>2392.00</price>
            </product>
        </products>
        <tax>251.42</tax>
        <total>4443.52</total>
        <comments>
            Late afternoon is best.
            Backup contact is Nancy
            Billsmer @ 338-4338.
        </comments>
    </invoice>

Both represent the same data, but the YAML is about 30% smaller and arguably more readable. Which would you rather edit in a text editor? There are many libraries for parsing and emitting YAML (e.g. SnakeYAML for Java developers).

As with everything, the right tool for the right job is the best rule to follow.

+1

My favorite nasty problem is with XML serialization formats that use attributes, XAML for example.

This works:

 <ListBox ItemsSource="{Binding Items}" SelectedItem="{Binding CurrentSelection}"/> 

This does not:

 <ListBox SelectedItem="{Binding CurrentSelection}" ItemsSource="{Binding Items}"/> 

XAML deserialization assigns property values as they are read from the XML stream. So in the second example, when the SelectedItem property is assigned, the control's ItemsSource has not been set yet, and SelectedItem is being assigned an item that does not exist yet.

If you use Visual Studio to create your XAML files, everything will be fine, because Visual Studio preserves attribute ordering. But edit your XAML in some XML tool that takes the XML recommendation at its word when it says that attribute order is not significant, and boy, are you in a world of hurt.
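
To see why a generic XML tool feels free to reorder attributes, the sketch below compares the two `ListBox` elements with Python's standard `xml.etree.ElementTree`. It says nothing about how a XAML loader behaves; it only shows the XML-level view, and `ET.canonicalize` assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

first = '<ListBox ItemsSource="{Binding Items}" SelectedItem="{Binding CurrentSelection}"/>'
second = '<ListBox SelectedItem="{Binding CurrentSelection}" ItemsSource="{Binding Items}"/>'

# As parsed XML, the two elements carry exactly the same attributes.
same_attributes = ET.fromstring(first).attrib == ET.fromstring(second).attrib

# Canonical XML (C14N) writes attributes in sorted order, so both inputs
# serialize to the same text and the original ordering is gone.
same_canonical_form = ET.canonicalize(first) == ET.canonicalize(second)

print(same_attributes, same_canonical_form)
```

Any tool that round-trips a document through such a representation can legitimately emit either ordering.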

0