Best way to convert custom XML to syntax

Question

Best way to convert custom XML to syntax

Using Python

So basically I have the XML syntax as a tag, but the tags have no attributes. So, <a>but not <a value='t'>. They regularly close with </a>.

Here is my question. I have something similar to this:

<al>
1. test
2. test2
 test with new line
3.  test3
<al>
    1. test 4
    <al>
        2. test 5
        3. test 6
        4. test 7
    </al>
</al>
4. test 8
</al>

And I want to convert it to:

<al>
<li>test</li>
<li> test2</li>
<li> test with new line</li>
<li>  test3
<al>
    <li> test 4 </li>
    <al>
        <li> test 5</li>
        <li> test 6</li>
        <li> test 7</li>
    </al>
    </li>
</al>
</li>
<li> test 8</li>
</al>

I'm really not looking for a turnkey solution, but rather a push in the right direction. I'm just wondering how people here approach this problem. The only REGEX? to write a complete custom parser for tag syntax without attributes? Deploy existing XML parsers? and etc.

Thanks in advance

+5

python xml parsing tags

Peach passion Jul 15 '11 at 18:12

source share

4 answers

:

from xml.dom.minidom import parse, parseString

xml = parse(...)
l = xml.getElementsByTagName('al')

l, ( <al> ).

Python.

, chunk.split('\n') <li> , .

<al> xml.toxml(), xml .

, , , xml, xml .

, , .

+2

spacediver 15 . '11 18:45

, "XML ". , XML, XML, XSLT XQuery.

, XML, , , XML XML- SAX. XML-, XML.

+2

Michael Kay 16 . '11 12:02

, , script, :

cat in.txt | perl -pe 'if(!/<\/?al>/){s#^(\s*)([0-9]+\.)?(.*)$#$1<li>$3</li>#}'

. , ;) .

+1

markijbema 16 . '11 1:23

mac · Accepted Answer · 2011-07-16T00:54:31+0000

, .

.

, , , - . , .

<li> </li>; , , "".

, , , , : , () .

" " . Regex " ": , . , "" ( , <a></a> <a/> HTML <br> XHTML <br/>), - beautifulsoup - "", , ( ) , , .

, , regex. [ , , 72 ].

, python readability counts, , // .

!

Best way to convert custom XML to syntax

More articles: