How to parse an OFX file (version 1.0.2) in PHP?

I have an OFX file downloaded from Citibank. The DTD for this format is at http://www.ofx.net/DownloadPage/Files/ofx102spec.zip (the OFXBANK.DTD file), and the OFX file looks like valid SGML. I tried DOMDocument in PHP 5.4.13, but I get a few warnings and the file is not parsed. My code is:

    $file = "source/ACCT_013.OFX";
    $dtd = "source/ofx102spec/OFXBANK.DTD";
    $doc = new DomDocument();
    $doc->loadHTMLFile($file);
    $doc->schemaValidate($dtd);
    $dom->validateOnParse = true;

The OFX file starts with:

    OFXHEADER:100
    DATA:OFXSGML
    VERSION:102
    SECURITY:NONE
    ENCODING:USASCII
    CHARSET:1252
    COMPRESSION:NONE
    OLDFILEUID:NONE
    NEWFILEUID:NONE

    <OFX>
    <SIGNONMSGSRSV1>
    <SONRS>
    <STATUS>
    <CODE>0
    <SEVERITY>INFO
    </STATUS>
    <DTSERVER>20130331073401
    <LANGUAGE>SPA
    </SONRS>
    </SIGNONMSGSRSV1>
    <BANKMSGSRSV1>
    <STMTTRNRS>
    <TRNUID>0
    <STATUS>
    <CODE>0
    <SEVERITY>INFO
    </STATUS>
    <STMTRS>
    <CURDEF>COP
    <BANKACCTFROM>
    ...

I am open to installing any program on the server (CentOS) and calling it from PHP.

P.S.: This class http://www.phpclasses.org/package/5778-PHP-Parse-and-extract-financial-records-from-OFX-files.html does not work for me.

+6
2 answers

First of all, even though XML is a subset of SGML, a valid SGML file does not have to be a well-formed XML file. XML is stricter and does not use all the features that SGML offers.

Since DOMDocument is based on XML (not SGML), it is not fully compatible with this file.
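To make the incompatibility concrete, here is a minimal sketch that feeds an OFX-style SGML fragment with unclosed leaf tags to `DOMDocument::loadXML()` in its default strict mode. The fragment is a shortened, made-up excerpt modelled on the file in the question, and the variable names are only illustrative.

```php
<?php
// Made-up SGML fragment: <CODE> and <SEVERITY> are never closed,
// which OFX 1.x allows but XML does not.
$sgml = '<STATUS><CODE>0<SEVERITY>INFO</STATUS>';

// Collect libxml errors instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$loaded = $doc->loadXML($sgml); // strict mode: no recover flag set

$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

var_dump($loaded); // false, because the fragment is not well-formed XML
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

The reported messages come from libxml and point at the mismatched opening and closing tags, which is why the SGML has to be converted (or parsed in recover mode) before DOMDocument can use it.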

Besides this problem, please see section 2.2, "Open Financial Exchange Headers", in Ofexfin1.doc, which explains that

The contents of an Open Financial Exchange file consists of a simple set of headers, followed by the content defined by this header

and further:

An empty line follows the last heading. Then (for the OFXSGML type) SGML-readable data begins with the <OFX> tag.

So locate the first blank line and strip everything up to and including it. Then load the SGML part into a DOMDocument, converting the SGML to XML first:

    $source = fopen('file.ofx', 'r');
    if (!$source) {
        throw new Exception('Unable to open OFX file.');
    }

    // skip headers of OFX file
    $headers = array();
    // maps the CHARSET header value to an iconv encoding name
    $charsets = array(
        1252 => 'WINDOWS-1252',
    );

    while (!feof($source)) {
        $line = trim(fgets($source));
        if ($line === '') {
            break; // the blank line ends the header block
        }
        list($header, $value) = explode(':', $line, 2);
        $headers[$header] = $value;
    }

    $buffer = '';

    // dead-cheap SGML to XML conversion
    // see as well http://www.hanselman.com/blog/PostprocessingAutoClosedSGMLTagsWithTheSGMLReader.aspx
    while (!feof($source)) {
        $line = trim(fgets($source));
        if ($line === '') {
            continue;
        }

        $line = iconv($charsets[$headers['CHARSET']], 'UTF-8', $line);
        // a line that does not end with '>' is "<TAG>value": append the closing tag
        if (substr($line, -1, 1) !== '>') {
            list($tag) = explode('>', $line, 2);
            $line .= '</' . substr($tag, 1) . '>';
        }
        $buffer .= $line . "\n";
    }
    fclose($source);

    // use DOMDocument with non-standard recover mode
    $doc = new DOMDocument();
    $doc->recover = true;
    $doc->preserveWhiteSpace = false;
    $doc->formatOutput = true;
    $save = libxml_use_internal_errors(true);
    $doc->loadXML($buffer);
    libxml_use_internal_errors($save);

    echo $doc->saveXML();

This code example outputs the following (reformatted) XML, which also shows that the DOMDocument loaded the data correctly:

    <?xml version="1.0"?>
    <OFX>
      <SIGNONMSGSRSV1>
        <SONRS>
          <STATUS>
            <CODE>0</CODE>
            <SEVERITY>INFO</SEVERITY>
          </STATUS>
          <DTSERVER>20130331073401</DTSERVER>
          <LANGUAGE>SPA</LANGUAGE>
        </SONRS>
      </SIGNONMSGSRSV1>
      <BANKMSGSRSV1>
        <STMTTRNRS>
          <TRNUID>0</TRNUID>
          <STATUS>
            <CODE>0</CODE>
            <SEVERITY>INFO</SEVERITY>
          </STATUS>
          <STMTRS><CURDEF>COP</CURDEF><BANKACCTFROM> ...</BANKACCTFROM>
          </STMTRS>
        </STMTTRNRS>
      </BANKMSGSRSV1>
    </OFX>

I do not know whether this can be validated against the DTD; it might work. Also, if the SGML is not written with each value on the same line as its tag (and with only one element per line), this fragile conversion will break.
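If the one-element-per-line assumption does not hold for your file, one possible alternative is a regex-based conversion that closes every `<TAG>value` pair wherever it appears. This is only a sketch under stated assumptions: `ofxSgmlToXml` is a hypothetical helper name, tag names are assumed to consist of letters, digits, dots and underscores, and values are assumed not to contain a literal `<`.

```php
<?php
// Hypothetical helper: closes unclosed OFX leaf elements regardless of line layout.
function ofxSgmlToXml($sgml)
{
    // Match an opening tag, its text value, and an optional explicit closing tag.
    return preg_replace_callback(
        '~<([A-Za-z0-9_.]+)>([^<]+)(</\1>)?~',
        function (array $m) {
            $value = trim($m[2]);
            if ($value === '') {
                // Container tag followed only by whitespace: leave untouched.
                return $m[0];
            }
            // Leaf element: emit it with exactly one closing tag.
            return '<' . $m[1] . '>' . $value . '</' . $m[1] . '>';
        },
        $sgml
    );
}

// Several elements on one line, no closing tags on the leaves.
echo ofxSgmlToXml('<STATUS><CODE>0<SEVERITY>INFO</STATUS>'), "\n";
// prints <STATUS><CODE>0</CODE><SEVERITY>INFO</SEVERITY></STATUS>
```

The header block still has to be stripped and the character set converted beforehand; this sketch only replaces the line-by-line closing step.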

+3

The simplest approach: parse the OFX into an array that gives easy access to all values and transactions.

    function parseOFX($ofx) {
        // each chunk looks like "TAG>value"
        $OFXArray = explode("<", $ofx);
        $a = array();
        foreach ($OFXArray as $v) {
            $pair = explode(">", $v);
            if (isset($pair[1])) {
                if ($pair[1] != NULL) {
                    if (isset($a[$pair[0]])) {
                        // repeated tag: collect the values in a list
                        if (is_array($a[$pair[0]])) {
                            $a[$pair[0]][] = $pair[1];
                        } else {
                            $temp = $a[$pair[0]];
                            $a[$pair[0]] = array();
                            $a[$pair[0]][] = $temp;
                            $a[$pair[0]][] = $pair[1];
                        }
                    } else {
                        $a[$pair[0]] = $pair[1];
                    }
                }
            }
        }
        return $a;
    }
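
Note that the function above stores each raw chunk as the value, so in a file with line breaks the values keep their trailing newlines and container tags end up with whitespace-only entries. A possible variant that trims values and skips such entries might look like the sketch below. `parseOfxFlat` is a hypothetical name, and the sample data is made up using the standard OFX tags `STMTTRN`, `TRNAMT` and `NAME`.

```php
<?php
// Hypothetical variant of the flat-array idea: trims values, ignores container and closing tags.
function parseOfxFlat($ofx)
{
    $result = array();
    foreach (explode('<', $ofx) as $chunk) {
        $pair = explode('>', $chunk, 2);
        if (count($pair) < 2) {
            continue; // text before the first tag
        }
        $tag = $pair[0];
        $value = trim($pair[1]);
        if ($value === '' || $tag === '' || $tag[0] === '/') {
            continue; // container tag, empty tag name, or closing tag
        }
        if (!isset($result[$tag])) {
            $result[$tag] = $value;               // first occurrence: plain string
        } elseif (is_array($result[$tag])) {
            $result[$tag][] = $value;             // third and later occurrences
        } else {
            $result[$tag] = array($result[$tag], $value); // second occurrence
        }
    }
    return $result;
}

// Made-up sample with two transactions, one element per line.
$sample = "<STMTTRN>\n<TRNAMT>-10.00\n<NAME>A\n</STMTTRN>\n"
        . "<STMTTRN>\n<TRNAMT>5.00\n<NAME>B\n</STMTTRN>\n";
$parsed = parseOfxFlat($sample);
print_r($parsed['TRNAMT']); // the two amounts, in file order
```

As with the original function, the result is flat: values of the same tag are grouped by tag name, not by the transaction they belong to, so the i-th `TRNAMT` and the i-th `NAME` line up only if every transaction contains both.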
+1