XML parsing in Clojure

I am new to Clojure, so please bear with me. I have an XML file that looks like

    <?xml version="1.0" encoding="UTF-8"?>
    <XVar Id="cdx9" Type="Dictionary">
      <XVar Id="Base.AccruedPremium" Type="Multi" Value="" Rows="1" Columns="1">
        <Row Id="0">
          <Col Id="0" Type="Num" Value="0"/>
        </Row>
      </XVar>
      <XVar Id="TrancheAnalysis.IndexDuration" Type="Multi" Value="" Rows="1" Columns="1">
        <Row Id="0">
          <Col Id="0" Type="Num" Value="3.4380728252313069"/>
        </Row>
      </XVar>
      <XVar Id="TrancheAnalysis.IndexLevel01" Type="Multi" Value="" Rows="1" Columns="1">
        <Row Id="0">
          <Col Id="0" Type="Num" Value="30693.926279941188"/>
        </Row>
      </XVar>
      <XVar Id="TrancheAnalysis.TrancheDelta" Type="Multi" Value="" Rows="1" Columns="1">
        <Row Id="0">
          <Col Id="0" Type="Num" Value="8.9304387917502073"/>
        </Row>
      </XVar>
      <XVar Id="TrancheAnalysis.TrancheDuration" Type="Multi" Value="" Rows="1" Columns="1">
        <Row Id="0">
          <Col Id="0" Type="Num" Value="3.0775955481964035"/>
        </Row>
      </XVar>
    </XVar>

This structure is repeated. From it, I want to create a CSV file with these columns:

    IndexName,TrancheAnalysis.IndexDuration,TrancheAnalysis.TrancheDuration
    cdx9,3.4380728252313069,3.0775955481964035
    .........................................
    .........................................

I can parse a simple XML file like

    <?xml version="1.0" encoding="UTF-8"?>
    <CalibrationData>
      <IndexList>
        <Index>
          <Calibrate>Y</Calibrate>
          <UseClientIndexQuotes>Y</UseClientIndexQuotes>
          <IndexName>HYCDX10</IndexName>
          <Tenor>06/20/2013</Tenor>
          <TenorName>3Y</TenorName>
          <IndexLevels>219.6</IndexLevels>
          <Tranche>Equity0To0.15</Tranche>
          <TrancheStart>0</TrancheStart>
          <TrancheEnd>0.15</TrancheEnd>
          <UseBreakEvenSpread>1</UseBreakEvenSpread>
          <UseTlet>0</UseTlet>
          <IsTlet>0</IsTlet>
          <PctExpectedLoss>0</PctExpectedLoss>
          <UpfrontFee>52.125</UpfrontFee>
          <RunningFee>0</RunningFee>
          <DeltaFee>5.3</DeltaFee>
          <CentralCorrelation>0.1</CentralCorrelation>
          <Currency>USD</Currency>
          <RescalingMethod>PTIndexRescaling</RescalingMethod>
          <EffectiveDate>06/17/2011</EffectiveDate>
        </Index>
      </IndexList>
    </CalibrationData>

with this code

    (ns DynamicProgramming
      (:require [clojure.xml :as xml]))

    ;Get the Input Files
    (def calibrationFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/CalibrationQuotes.xml")
    (def mktdataFile "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/MarketData.xml")
    (def sample "C:/ashwani/Eclipse/HistoricalTrancheAnalysis/src/Sample.xml")

    ;Parse the Calibration Input File
    (def CalibOp
      (for [x (xml-seq (xml/parse (java.io.File. calibrationFile)))
            :when (or (= :IndexName (:tag x))
                      (= :Tenor (:tag x))
                      (= :UpfrontFee (:tag x))
                      (= :RunningFee (:tag x))
                      (= :DeltaFee (:tag x))
                      (= :IndexLevels (:tag x))
                      (= :TrancheStart (:tag x))
                      (= :TrancheEnd (:tag x)))]
        (first (:content x))))

    (println CalibOp)

But the second XML is simple. I don’t know how to iterate through the nested structure of the first XML example and extract the necessary information.

Any help would be great.

xml clojure
Jun 24
1 answer

I would use data.zip (formerly clojure.contrib.zip-filter). It provides many XML parsing capabilities and makes it easy to write XPath-like expressions. The README describes it as a system for filtering trees, and XML trees in particular.

Below is some sample code for creating a "row" for a CSV file. A row is a map from column name to attribute value.

    (ns work
      (:require [clojure.xml :as xml]
                [clojure.zip :as zip]
                ; data.zip; this namespace was formerly clojure.contrib.zip-filter.xml
                [clojure.data.zip.xml :as zf]))

    ; create a zip from the xml file
    (def zip (zip/xml-zip (xml/parse "data.xml")))

    ; pulls out a list of all of the root "Id" attribute values
    (zf/xml-> zip (zf/attr :Id))

    (defn value
      "Finds the id and value for a particular element"
      [xvar-zip]
      (let [id    (-> xvar-zip zip/node :attrs :Id) ; manual access
            value (zf/xml1-> xvar-zip               ; use xpath like expression to pull value out
                             :Row                   ; need the row element
                             :Col                   ; then the column element
                             (zf/attr :Value))]     ; and finally pull the Value out
        {id value}))

    ; gets the "column-value" pair for a single column
    (zf/xml1-> zip
               (zf/attr= :Id "cdx9")                          ; filter on id "cdx9"
               :XVar                                          ; filter on XVars under it
               (zf/attr= :Id "TrancheAnalysis.IndexDuration") ; filter on id
               value)                                         ; apply the value function on the result of above

    ; creates a map of every column key to its corresponding value
    (apply merge (zf/xml-> zip (zf/attr= :Id "cdx9") :XVar value))
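Once you have a row map like that, turning it into a CSV line is mostly a matter of picking the columns in a fixed order. Here is a minimal sketch of that step. The names csv-header, csv-line, columns and sample-row are made up for illustration, and the sample map is typed in by hand rather than parsed, so this only needs clojure.string.

```clojure
(ns csv-sketch
  (:require [clojure.string :as str]))

;; Hypothetical column order, taken from the CSV header in the question.
(def columns
  ["TrancheAnalysis.IndexDuration" "TrancheAnalysis.TrancheDuration"])

(defn csv-header
  "Header line: IndexName followed by the chosen column names."
  [cols]
  (str/join "," (cons "IndexName" cols)))

(defn csv-line
  "One CSV line for a dictionary: its id, then the value of each column.
   Columns missing from the row become empty strings."
  [index-name row cols]
  (str/join "," (cons index-name (map #(get row % "") cols))))

;; Hand-written stand-in for the map built by (apply merge ...) in the answer.
(def sample-row
  {"Base.AccruedPremium"             "0"
   "TrancheAnalysis.IndexDuration"   "3.4380728252313069"
   "TrancheAnalysis.TrancheDuration" "3.0775955481964035"})

(println (csv-header columns))
(println (csv-line "cdx9" sample-row columns))
```

This does no quoting or escaping, which is fine for plain numeric values like these; if your values can contain commas or quotes you would want a real CSV library instead.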

I'm not sure how the XML will work with multiple Dictionary XVars, since the Dictionary XVar is the root element. If you need it, another function that is useful for this type of work is mapcat, which concatenates all the values returned from the mapping function.
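To show what mapcat buys you, here is a rough sketch with two dictionaries. It assumes the repeated dictionaries sit under a single wrapper element, which I have called Results; that wrapper, the second dictionary cdx10 and its value are all made up for the example. To keep it dependency-free it walks the tree returned by clojure.xml directly instead of using data.zip, and the helper names (parse-str, children, entries) are hypothetical.

```clojure
(ns mapcat-sketch
  (:require [clojure.xml :as xml]))

;; Assumption: a made-up <Results> wrapper around the repeated dictionaries.
;; cdx10 and its value are invented sample data.
(def sample-xml
  "<Results>
     <XVar Id=\"cdx9\" Type=\"Dictionary\">
       <XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">
         <Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"3.4380728252313069\"/></Row>
       </XVar>
     </XVar>
     <XVar Id=\"cdx10\" Type=\"Dictionary\">
       <XVar Id=\"TrancheAnalysis.IndexDuration\" Type=\"Multi\">
         <Row Id=\"0\"><Col Id=\"0\" Type=\"Num\" Value=\"4.12\"/></Row>
       </XVar>
     </XVar>
   </Results>")

(defn parse-str
  "Parses an XML string with clojure.xml."
  [^String s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn children
  "Child elements of node that have the given tag."
  [node tag]
  (filter #(= tag (:tag %)) (:content node)))

(defn entries
  "All [dictionary-id column-id value] triples for one dictionary element."
  [dict]
  (for [xvar (children dict :XVar)
        row  (children xvar :Row)
        col  (children row :Col)]
    [(-> dict :attrs :Id) (-> xvar :attrs :Id) (-> col :attrs :Value)]))

;; map would give one sequence per dictionary; mapcat concatenates them
;; into a single flat sequence of triples.
(def all-entries
  (mapcat entries (children (parse-str sample-xml) :XVar)))

(doseq [e all-entries]
  (println e))
```

The same idea should carry over to the data.zip version: map a per-dictionary function over the dictionary locs and let mapcat flatten the results.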

There are several more examples in the test source.

Another big recommendation is to make sure that you use many small functions. You will find them much easier to debug, test and work with.

Jun 24 '11 at 16:52