I have a long (about 3 thousand lines) XML document that usually looks like:
<chapter someAttributes="someValues">
  <title>someTitle</title>
  <p>multiple paragraphs</p>
  <p>...</p>
  <li>
    <p>- some text</p>
  </li>
  <li>
    <p>- some other text</p>
  </li>
  <p>multiple other paragraphs</p>
  <p>...</p>
  <li>
    <p>1. some text</p>
  </li>
  <li>
    <p>2. some other text</p>
  </li>
  <p>multiple other paragraphs</p>
  <p>...</p>
</chapter>
I want to wrap every separate sequence of li elements (that is, each run of li elements sitting between paragraphs, tables, illustrations, etc.) in an ol or ul, depending on the following semantics, and return the wrapped XML:
- if the first character in the paragraph is "-", then it should be ul with the mark="DASH" attribute
- if paragraphs start with "1.", "2.", "3.", etc., then I want ol with numeration="ARABIC"
For example (this is just one sequence):
<ul mark="DASH">
  <li>
    <p> some text</p>
  </li>
  <li>
    <p> some other text</p>
  </li>
</ul>
As you can see, I also need to strip the leading marker character(s) from all paragraphs, that is, "-" or "1.", "2.", "3.", etc.
The real XML input is more complex than I described (nested sequences, sequences inside table elements), but I'm looking for an idea, especially how to catch and process a specific sequence based on these semantics.
I want to get XML with exactly the same ordering, only with li elements wrapped. XSLT 2.0 / EXSLT are available if necessary.
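To make the intended grouping concrete, here is a minimal Python sketch of the logic I have in mind. It is not an XSLT solution, only an illustration. The function names (list_kind, wrap_lists) and the two regular expressions are my own assumptions, and it assumes the marker sits at the start of the first p inside each li. Only the marker itself is removed, so the space after it stays, as in the example above.

```python
import re
import xml.etree.ElementTree as ET

# Assumed marker patterns: a leading dash, or digits followed by a dot.
DASH = re.compile(r"^\s*-")
ARABIC = re.compile(r"^\s*\d+\.")


def list_kind(li):
    """Return (wrapper tag, wrapper attributes, marker pattern) or None."""
    p = li.find("p")
    text = (p.text or "") if p is not None else ""
    if DASH.match(text):
        return "ul", {"mark": "DASH"}, DASH
    if ARABIC.match(text):
        return "ol", {"numeration": "ARABIC"}, ARABIC
    return None


def wrap_lists(parent):
    """Wrap each run of adjacent, same-kind li children; recurse into children."""
    new_children = []
    wrapper, wrapper_tag = None, None
    for child in list(parent):
        kind = list_kind(child) if child.tag == "li" else None
        wrap_lists(child)  # handle nested sequences (e.g. inside tables)
        if kind is None:
            wrapper = None  # any other element ends the current run
            new_children.append(child)
            continue
        tag, attrs, pattern = kind
        if wrapper is None or wrapper_tag != tag:
            wrapper, wrapper_tag = ET.Element(tag, attrs), tag
            new_children.append(wrapper)
        p = child.find("p")
        p.text = pattern.sub("", p.text, count=1)  # cut out the marker
        wrapper.append(child)
    parent[:] = new_children  # same order, with li runs now wrapped


sample = (
    "<chapter><p>intro</p>"
    "<li><p>- some text</p></li><li><p>- some other text</p></li>"
    "<p>middle</p>"
    "<li><p>1. some text</p></li><li><p>2. some other text</p></li>"
    "</chapter>"
)
root = ET.fromstring(sample)
wrap_lists(root)
print(ET.tostring(root, encoding="unicode"))
```

The key point is that a run ends as soon as a non-li element (or an li of a different kind) appears, so document order is preserved. In XSLT 2.0 I would expect the same idea to map onto xsl:for-each-group with group-adjacent, but I have not worked that out.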