Create and populate complex XML with a Perl script

I have the following XML file template that I want to create and populate with a Perl script. All of the XML element values come from an SQL database, via several different queries. My XML contains several collection-type elements.

I am having trouble deciding which Perl module to use, because there are many alternatives on CPAN. I would also like to know how to approach this problem.

Any help is greatly appreciated.

    <TumorDetails>
      <personUpi>String</personUpi>
      <ageAtDiagnosis>3.14159E0</ageAtDiagnosis>
      <biopsyPathologyReportSummary>String</biopsyPathologyReportSummary>
      <primarySiteCollection>
        <tissueSite>
          <description>String</description>
          <name>String</name>
        </tissueSite>
      </primarySiteCollection>
      <distantMetastasisSite>
        <description>String</description>
        <name>String</name>
      </distantMetastasisSite>
      <siteGroup>
        <description>String</description>
        <name>String</name>
      </siteGroup>
      <tmStaging>
        <clinicalDescriptor>String</clinicalDescriptor>
        <clinicalMStage>String</clinicalMStage>
        <siteGroupEdition5>
          <description>String</description>
          <name>String</name>
        </siteGroupEdition5>
        <siteGroupEdition6>
          <description>String</description>
          <name>String</name>
        </siteGroupEdition6>
      </tmStaging>
      <pediatricStaging>
        <doneBy>String</doneBy>
        <group>String</group>
      </pediatricStaging>
      <histologicTypeCollection>
        <histologicType>
          <description>String</description>
          <system>String</system>
          <value>String</value>
        </histologicType>
      </histologicTypeCollection>
      <histologicGradeCollection>
        <histologicGrade>
          <gradeOrDifferentiation>String</gradeOrDifferentiation>
        </histologicGrade>
      </histologicGradeCollection>
      <familyHistoryCollection>
        <familyHistory>
          <otherCancerDiagnosed>String</otherCancerDiagnosed>
          <sameCancerDiagnosed>String</sameCancerDiagnosed>
        </familyHistory>
      </familyHistoryCollection>
      <comorbidityOrComplicationCollection>
        <comorbidityOrComplication>
          <value>String</value>
        </comorbidityOrComplication>
      </comorbidityOrComplicationCollection>
      <tumorBiomarkerTest>
        <her2NeuDerived>String</her2NeuDerived>
        <her2NeuFish>String</her2NeuFish>
      </tumorBiomarkerTest>
      <patientHistoryCollection>
        <patientHistory>
          <cancerSite>String</cancerSite>
          <sequence>2147483647</sequence>
        </patientHistory>
      </patientHistoryCollection>
      <tumorHistory>
        <cancerStatus>String</cancerStatus>
        <cancerStatusFollowUpDate>1967-08-13</cancerStatusFollowUpDate>
        <cancerStatusFollowUpType>String</cancerStatusFollowUpType>
        <qualityOfSurvival>String</qualityOfSurvival>
      </tumorHistory>
      <placeOfDiagnosis>
        <initials>String</initials>
      </placeOfDiagnosis>
      <followUp>
        <dateFollowUpChanged>String</dateFollowUpChanged>
        <dateOfLastCancerStatus>1967-08-13</dateOfLastCancerStatus>
        <nextFollowUpHospital>
          <initials>String</initials>
        </nextFollowUpHospital>
        <lastFollowUpHospital>
          <initials>String</initials>
        </lastFollowUpHospital>
        <tumorFollowUpBiomarkerTest>
          <her2NeuDerived>String</her2NeuDerived>
          <her2NeuFish>String</her2NeuFish>
        </tumorFollowUpBiomarkerTest>
      </followUp>
    </TumorDetails>

+2
4 answers

First, a very, very important concept: there is no such thing as an "XML template"! The whole point of using XML is to be able to read and write data in accordance with some schema. If you have a (consistent) XML sample but no schema definition (XSD), use trang to infer one:

    java -jar trang.jar sample.xml sample.xsd

For the provided sample, the created XSD file is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
      <xs:element name="TumorDetails">
        <xs:complexType>
          <xs:sequence>
            <xs:element ref="personUpi"/>
            <xs:element ref="ageAtDiagnosis"/>
            <xs:element ref="biopsyPathologyReportSummary"/>
            <xs:element ref="primarySiteCollection"/>
            <xs:element ref="distantMetastasisSite"/>
            <xs:element ref="siteGroup"/>
            <xs:element ref="tmStaging"/>
            <xs:element ref="pediatricStaging"/>
            <xs:element ref="histologicTypeCollection"/>
            <xs:element ref="histologicGradeCollection"/>
            <xs:element ref="familyHistoryCollection"/>
            <xs:element ref="comorbidityOrComplicationCollection"/>
            <xs:element ref="tumorBiomarkerTest"/>
            <xs:element ref="patientHistoryCollection"/>
            <xs:element ref="tumorHistory"/>
            <xs:element ref="placeOfDiagnosis"/>
            <xs:element ref="followUp"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      ...
    </xs:schema>

And now the best part, called XML::Compile. It takes your XSD schema, compiles it, and fits and validates native Perl data structures against it, producing XML as output:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    use XML::Compile::Schema;
    use XML::LibXML;    # provides XML::LibXML::Document, used below

    # Native Perl structure mirroring the XML to be produced
    my $node = {
        personUpi                    => 'String',
        ageAtDiagnosis               => '3.14159E0',
        biopsyPathologyReportSummary => 'String',
        primarySiteCollection        => {
            tissueSite => {
                description => 'String',
                name        => 'String',
            },
        },
        ...
    };

    # Compile the schema into a writer for the root element
    my $schema = XML::Compile::Schema->new('sample.xsd');
    my $writer = $schema->compile(WRITER => 'TumorDetails');

    my $doc = XML::LibXML::Document->new(q(1.0), q(UTF-8));
    print $writer->($doc, $node)->toString;
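Since the question says the values come from several SQL queries, the hard-coded structure above would in practice be assembled from query results. The following is only a minimal sketch using DBI: the in-memory SQLite database (which needs DBD::SQLite), the `tumor` and `primary_site` tables, their columns, and the `build_record` helper are all hypothetical stand-ins. Whether a schema-driven writer accepts an array reference for `tissueSite` depends on the schema declaring that element as repeatable, so treat the shape of the collection here as an assumption.

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite database standing in for the real one.
# All table and column names below are hypothetical.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE tumor (person_upi TEXT, age_at_diagnosis REAL)');
$dbh->do('CREATE TABLE primary_site (person_upi TEXT, name TEXT, description TEXT)');
$dbh->do(q{INSERT INTO tumor VALUES ('P001', 54.2)});
$dbh->do(q{INSERT INTO primary_site VALUES ('P001', 'Breast', 'Upper outer quadrant')});
$dbh->do(q{INSERT INTO primary_site VALUES ('P001', 'Lymph node', 'Axillary')});

# Hypothetical helper: runs one query per part of the document and
# returns a nested Perl structure keyed by the XML element names.
sub build_record {
    my ($dbh, $upi) = @_;

    # Query 1: the scalar fields of the record
    my $row = $dbh->selectrow_hashref(
        'SELECT person_upi, age_at_diagnosis FROM tumor WHERE person_upi = ?',
        undef, $upi,
    ) or return;

    # Query 2: zero or more rows for a collection, as an array of hashes
    my $sites = $dbh->selectall_arrayref(
        'SELECT name, description FROM primary_site WHERE person_upi = ? ORDER BY name',
        { Slice => {} }, $upi,
    );

    return {
        personUpi             => $row->{person_upi},
        ageAtDiagnosis        => $row->{age_at_diagnosis},
        primarySiteCollection => { tissueSite => $sites },
    };
}

my $record = build_record($dbh, 'P001');
printf "%d primary site(s) for %s\n",
    scalar @{ $record->{primarySiteCollection}{tissueSite} },
    $record->{personUpi};
```

Keeping the database access in one place like this means the XML-producing code only ever sees a plain Perl structure, whichever module ends up writing it.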
+3

A lot depends on what you are already familiar with. If you are comfortable navigating XML documents using the Document Object Model, then XML::DOM, XML::LibXML or XML::Twig are good choices. XML::TreeBuilder is a similar module with its own API, and you will only find out whether it suits you by trying it.

However, all of these modules are designed for navigating and accessing existing XML data, and they are only partially useful for creating new XML from scratch. By contrast, the XML::Generator, XML::Writer and XML::API modules are designed specifically for this purpose, and they all have broadly similar interfaces. My preference, and my recommendation to you, is XML::API, which has the most flexible interface and should fit your purpose well.

Using XML::API, the code for creating this XML document has a one-to-one correspondence with the resulting XML. Each statement corresponds to one XML element or tag, and tag names, attribute names and text values can be derived at run time, for example from information in the database.

This program recreates your XML sample. Note that subsections can be coded separately as subroutines by passing the XML::API object to each of them. It is also possible to generate the XML non-linearly, since each method returns a reference to the element being created, and there is a _goto method that accepts such a reference and sets the location for subsequent additions. Indeed, rather than writing any data, the _close method simply performs a _goto to the parent of the current element.

    use strict;
    use warnings;

    use XML::API;

    my $xml = XML::API->new(doctype => 'xhtml');

    $xml->_open('TumorDetails');
    $xml->_element('personUpi', 'String');
    $xml->_element('ageAtDiagnosis', '3.14159E0');
    $xml->_element('biopsyPathologyReportSummary', 'String');

    $xml->_open('primarySiteCollection');
    $xml->_open('tissueSite');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
    $xml->_close('tissueSite');
    $xml->_close('primarySiteCollection');

    $xml->_open('distantMetastasisSite');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
    $xml->_close('distantMetastasisSite');

    $xml->_open('siteGroup');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
    $xml->_close('siteGroup');

    $xml->_open('tmStaging');
    $xml->_element('clinicalDescriptor', 'String');
    $xml->_element('clinicalMStage', 'String');
    $xml->_open('siteGroupEdition5');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
    $xml->_close('siteGroupEdition5');
    $xml->_open('siteGroupEdition6');
    $xml->_element('description', 'String');
    $xml->_element('name', 'String');
    $xml->_close('siteGroupEdition6');
    $xml->_close('tmStaging');

    $xml->_open('pediatricStaging');
    $xml->_element('doneBy', 'String');
    $xml->_element('group', 'String');
    $xml->_close('pediatricStaging');

    $xml->_open('histologicTypeCollection');
    $xml->_open('histologicType');
    $xml->_element('description', 'String');
    $xml->_element('system', 'String');
    $xml->_element('value', 'String');
    $xml->_close('histologicType');
    $xml->_close('histologicTypeCollection');

    $xml->_open('histologicGradeCollection');
    $xml->_open('histologicGrade');
    $xml->_element('gradeOrDifferentiation', 'String');
    $xml->_close('histologicGrade');
    $xml->_close('histologicGradeCollection');

    $xml->_open('familyHistoryCollection');
    $xml->_open('familyHistory');
    $xml->_element('otherCancerDiagnosed', 'String');
    $xml->_element('sameCancerDiagnosed', 'String');
    $xml->_close('familyHistory');
    $xml->_close('familyHistoryCollection');

    $xml->_open('comorbidityOrComplicationCollection');
    $xml->_open('comorbidityOrComplication');
    $xml->_element('value', 'String');
    $xml->_close('comorbidityOrComplication');
    $xml->_close('comorbidityOrComplicationCollection');

    $xml->_open('tumorBiomarkerTest');
    $xml->_element('her2NeuDerived', 'String');
    $xml->_element('her2NeuFish', 'String');
    $xml->_close('tumorBiomarkerTest');

    $xml->_open('patientHistoryCollection');
    $xml->_open('patientHistory');
    $xml->_element('cancerSite', 'String');
    $xml->_element('sequence', '2147483647');
    $xml->_close('patientHistory');
    $xml->_close('patientHistoryCollection');

    $xml->_open('tumorHistory');
    $xml->_element('cancerStatus', 'String');
    $xml->_element('cancerStatusFollowUpDate', '1967-08-13');
    $xml->_element('cancerStatusFollowUpType', 'String');
    $xml->_element('qualityOfSurvival', 'String');
    $xml->_close('tumorHistory');

    $xml->_open('placeOfDiagnosis');
    $xml->_element('initials', 'String');
    $xml->_close('placeOfDiagnosis');

    $xml->_open('followUp');
    $xml->_element('dateFollowUpChanged', 'String');
    $xml->_element('dateOfLastCancerStatus', '1967-08-13');
    $xml->_open('nextFollowUpHospital');
    $xml->_element('initials', 'String');
    $xml->_close('nextFollowUpHospital');
    $xml->_open('lastFollowUpHospital');
    $xml->_element('initials', 'String');
    $xml->_close('lastFollowUpHospital');
    $xml->_open('tumorFollowUpBiomarkerTest');
    $xml->_element('her2NeuDerived', 'String');
    $xml->_element('her2NeuFish', 'String');
    $xml->_close('tumorFollowUpBiomarkerTest');
    $xml->_close('followUp');

    $xml->_close('TumorDetails');

    print $xml;
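The remark about coding subsections as subroutines might look roughly like the sketch below. It uses only the `_open`, `_element` and `_close` calls and the constructor arguments that appear in the program above; the `add_site` helper and the sample values are hypothetical, introduced purely for illustration.

```perl
use strict;
use warnings;
use XML::API;

# Hypothetical helper: emits one <description>/<name> pair under the
# given tag, using the XML::API object passed in by the caller.
sub add_site {
    my ($xml, $tag, $site) = @_;
    $xml->_open($tag);
    $xml->_element('description', $site->{description});
    $xml->_element('name',        $site->{name});
    $xml->_close($tag);
    return;
}

# Constructor call copied from the program above
my $xml = XML::API->new(doctype => 'xhtml');

$xml->_open('TumorDetails');

# The same helper serves every element that has this shape
add_site($xml, 'distantMetastasisSite',
    { description => 'Axillary node', name => 'Lymph node' });
add_site($xml, 'siteGroup',
    { description => 'Breast tumors', name => 'Breast' });

$xml->_close('TumorDetails');

print $xml;
```

The repeated description/name pairs in the sample (distantMetastasisSite, siteGroup, siteGroupEdition5, siteGroupEdition6) are the obvious candidates for this kind of factoring.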
+2

If the structure of the data is always the same, then ddoxey's Template Toolkit solution is good. However, if some of the tags are sometimes absent, you need to build the XML from scratch each time.

I recently did a bit of work with XML and was very pleased with XML::Writer.
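As a rough illustration of that point, here is a minimal sketch with XML::Writer in which an optional section is skipped when its data is missing and a collection is written with a loop. The `$rec` structure and its values are made up for the example; only the element names come from the question.

```perl
use strict;
use warnings;
use XML::Writer;

# Hypothetical record: pediatricStaging is deliberately absent and the
# primary site collection holds two entries.
my $rec = {
    personUpi             => 'P001',
    primarySiteCollection => [
        { name => 'Breast',     description => 'Upper outer quadrant' },
        { name => 'Lymph node', description => 'Axillary' },
    ],
};

# OUTPUT => 'self' collects the document inside the writer object
my $w = XML::Writer->new(OUTPUT => 'self', DATA_MODE => 1, DATA_INDENT => 2);

$w->startTag('TumorDetails');
$w->dataElement(personUpi => $rec->{personUpi});

# One <tissueSite> per entry in the collection
$w->startTag('primarySiteCollection');
for my $site (@{ $rec->{primarySiteCollection} }) {
    $w->startTag('tissueSite');
    $w->dataElement(description => $site->{description});
    $w->dataElement(name        => $site->{name});
    $w->endTag('tissueSite');
}
$w->endTag('primarySiteCollection');

# Optional section: written only when the record actually has it
if (my $ped = $rec->{pediatricStaging}) {
    $w->startTag('pediatricStaging');
    $w->dataElement(doneBy => $ped->{doneBy});
    $w->dataElement(group  => $ped->{group});
    $w->endTag('pediatricStaging');
}

$w->endTag('TumorDetails');
$w->end;

my $xml_out = $w->to_string;
print $xml_out;
```

XML::Writer also escapes character data and checks that tags are properly nested, which removes two common sources of malformed output.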

+1

I am somewhat partial to the Template Toolkit. See:

    #!/usr/bin/perl -Tw
    use strict;
    use warnings;

    use Template;

    my $tmpl = get_template();
    my $rec  = get_record();
    my $xml;

    # Fill the template from the record, collecting the result in $xml
    my $template = Template->new();
    $template->process( \$tmpl, $rec, \$xml )
        || die $template->error();

    print "$xml";

    # ...

    # Sample data; in practice this would be built from the database
    sub get_record {
        return {
            personUpi                    => 'String',
            ageAtDiagnosis               => '3.14159E0',
            biopsyPathologyReportSummary => 'String',
            primarySiteCollection        => {
                tissueSite => {
                    description => 'String',
                    name        => 'String',
                },
            },
            distantMetastasisSite => {
                description => 'String',
                name        => 'String',
            },
            siteGroup => {
                description => 'String',
                name        => 'String',
            },
            tmStaging => {
                clinicalDescriptor => 'String',
                clinicalMStage     => 'String',
                siteGroupEdition5  => {
                    description => 'String',
                    name        => 'String',
                },
                siteGroupEdition6 => {
                    description => 'String',
                    name        => 'String',
                },
            },
            pediatricStaging => {
                doneBy => 'String',
                group  => 'String',
            },
            histologicTypeCollection => {
                histologicType => {
                    description => 'String',
                    system      => 'String',
                    value       => 'String',
                },
            },
            histologicGradeCollection => {
                histologicGrade => {
                    gradeOrDifferentiation => 'String',
                },
            },
            familyHistoryCollection => {
                familyHistory => {
                    otherCancerDiagnosed => 'String',
                    sameCancerDiagnosed  => 'String',
                },
            },
            comorbidityOrComplicationCollection => {
                comorbidityOrComplication => {
                    value => 'String',
                },
            },
            tumorBiomarkerTest => {
                her2NeuDerived => 'String',
                her2NeuFish    => 'String',
            },
            patientHistoryCollection => {
                patientHistory => {
                    cancerSite => 'String',
                    sequence   => '2147483647',
                },
            },
            tumorHistory => {
                cancerStatus             => 'String',
                cancerStatusFollowUpDate => '1967-08-13',
                cancerStatusFollowUpType => 'String',
                qualityOfSurvival        => 'String',
            },
            placeOfDiagnosis => {
                initials => 'String',
            },
            followUp => {
                dateFollowUpChanged    => 'String',
                dateOfLastCancerStatus => '1967-08-13',
                nextFollowUpHospital   => {
                    initials => 'String',
                },
                lastFollowUpHospital => {
                    initials => 'String',
                },
                tumorFollowUpBiomarkerTest => {
                    her2NeuDerived => 'String',
                    her2NeuFish    => 'String',
                },
            },
        };
    }

    # The XML skeleton with Template Toolkit placeholders
    sub get_template {
        return <<'END_TEMPL';
    <TumorDetails>
      <personUpi>[% personUpi %]</personUpi>
      <ageAtDiagnosis>[% ageAtDiagnosis %]</ageAtDiagnosis>
      <biopsyPathologyReportSummary>[% biopsyPathologyReportSummary %]</biopsyPathologyReportSummary>
      <primarySiteCollection>
        <tissueSite>
          <description>[% primarySiteCollection.tissueSite.description %]</description>
          <name>[% primarySiteCollection.tissueSite.name %]</name>
        </tissueSite>
      </primarySiteCollection>
      <distantMetastasisSite>
        <description>[% distantMetastasisSite.description %]</description>
        <name>[% distantMetastasisSite.name %]</name>
      </distantMetastasisSite>
      <siteGroup>
        <description>[% siteGroup.description %]</description>
        <name>[% siteGroup.name %]</name>
      </siteGroup>
      <tmStaging>
        <clinicalDescriptor>[% tmStaging.clinicalDescriptor %]</clinicalDescriptor>
        <clinicalMStage>[% tmStaging.clinicalMStage %]</clinicalMStage>
        <siteGroupEdition5>
          <description>[% tmStaging.siteGroupEdition5.description %]</description>
          <name>[% tmStaging.siteGroupEdition5.name %]</name>
        </siteGroupEdition5>
        <siteGroupEdition6>
          <description>[% tmStaging.siteGroupEdition6.description %]</description>
          <name>[% tmStaging.siteGroupEdition6.name %]</name>
        </siteGroupEdition6>
      </tmStaging>
      <pediatricStaging>
        <doneBy>[% pediatricStaging.doneBy %]</doneBy>
        <group>[% pediatricStaging.group %]</group>
      </pediatricStaging>
      <histologicTypeCollection>
        <histologicType>
          <description>[% histologicTypeCollection.histologicType.description %]</description>
          <system>[% histologicTypeCollection.histologicType.system %]</system>
          <value>[% histologicTypeCollection.histologicType.value %]</value>
        </histologicType>
      </histologicTypeCollection>
      <histologicGradeCollection>
        <histologicGrade>
          <gradeOrDifferentiation>[% histologicGradeCollection.histologicGrade.gradeOrDifferentiation %]</gradeOrDifferentiation>
        </histologicGrade>
      </histologicGradeCollection>
      <familyHistoryCollection>
        <familyHistory>
          <otherCancerDiagnosed>[% familyHistoryCollection.familyHistory.otherCancerDiagnosed %]</otherCancerDiagnosed>
          <sameCancerDiagnosed>[% familyHistoryCollection.familyHistory.sameCancerDiagnosed %]</sameCancerDiagnosed>
        </familyHistory>
      </familyHistoryCollection>
      <comorbidityOrComplicationCollection>
        <comorbidityOrComplication>
          <value>[% comorbidityOrComplicationCollection.comorbidityOrComplication.value %]</value>
        </comorbidityOrComplication>
      </comorbidityOrComplicationCollection>
      <tumorBiomarkerTest>
        <her2NeuDerived>[% tumorBiomarkerTest.her2NeuDerived %]</her2NeuDerived>
        <her2NeuFish>[% tumorBiomarkerTest.her2NeuFish %]</her2NeuFish>
      </tumorBiomarkerTest>
      <patientHistoryCollection>
        <patientHistory>
          <cancerSite>[% patientHistoryCollection.patientHistory.cancerSite %]</cancerSite>
          <sequence>[% patientHistoryCollection.patientHistory.sequence %]</sequence>
        </patientHistory>
      </patientHistoryCollection>
      <tumorHistory>
        <cancerStatus>[% tumorHistory.cancerStatus %]</cancerStatus>
        <cancerStatusFollowUpDate>[% tumorHistory.cancerStatusFollowUpDate %]</cancerStatusFollowUpDate>
        <cancerStatusFollowUpType>[% tumorHistory.cancerStatusFollowUpType %]</cancerStatusFollowUpType>
        <qualityOfSurvival>[% tumorHistory.qualityOfSurvival %]</qualityOfSurvival>
      </tumorHistory>
      <placeOfDiagnosis>
        <initials>[% placeOfDiagnosis.initials %]</initials>
      </placeOfDiagnosis>
      <followUp>
        <dateFollowUpChanged>[% followUp.dateFollowUpChanged %]</dateFollowUpChanged>
        <dateOfLastCancerStatus>[% followUp.dateOfLastCancerStatus %]</dateOfLastCancerStatus>
        <nextFollowUpHospital>
          <initials>[% followUp.nextFollowUpHospital.initials %]</initials>
        </nextFollowUpHospital>
        <lastFollowUpHospital>
          <initials>[% followUp.lastFollowUpHospital.initials %]</initials>
        </lastFollowUpHospital>
        <tumorFollowUpBiomarkerTest>
          <her2NeuDerived>[% followUp.tumorFollowUpBiomarkerTest.her2NeuDerived %]</her2NeuDerived>
          <her2NeuFish>[% followUp.tumorFollowUpBiomarkerTest.her2NeuFish %]</her2NeuFish>
        </tumorFollowUpBiomarkerTest>
      </followUp>
    </TumorDetails>
    END_TEMPL
    }

    __END__
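The template above assumes exactly one entry per collection and that every section is present. A possible variation, sketched here with made-up data and a flattened `primarySites` key that is not part of the original record, uses the Template Toolkit `FOREACH` and `IF` directives for repeated and optional sections, and the `xml` filter so that characters such as `&` in database values are escaped.

```perl
use strict;
use warnings;
use Template;

# Cut-down template: a repeated section, an optional section, and
# the xml filter applied to every inserted value.
my $tmpl = <<'END_TEMPL';
<TumorDetails>
  <personUpi>[% personUpi | xml %]</personUpi>
  <primarySiteCollection>
[% FOREACH site IN primarySites -%]
    <tissueSite>
      <description>[% site.description | xml %]</description>
      <name>[% site.name | xml %]</name>
    </tissueSite>
[% END -%]
  </primarySiteCollection>
[% IF pediatricStaging -%]
  <pediatricStaging>
    <doneBy>[% pediatricStaging.doneBy | xml %]</doneBy>
    <group>[% pediatricStaging.group | xml %]</group>
  </pediatricStaging>
[% END -%]
</TumorDetails>
END_TEMPL

# Hypothetical record: two primary sites, no pediatricStaging
my $rec = {
    personUpi    => 'P001',
    primarySites => [
        { name => 'Breast',     description => 'Upper & outer quadrant' },
        { name => 'Lymph node', description => 'Axillary' },
    ],
};

my $tt      = Template->new();
my $xml_out = '';
$tt->process( \$tmpl, $rec, \$xml_out )
    or die $tt->error();

print $xml_out;
```

Unlike a dedicated XML writer, a template gives no guarantee that the result is well-formed, so it is worth parsing or validating the output afterwards.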
-1

Source: https://habr.com/ru/post/922345/

