I built a GPX analyzer using xml-conduit and ended up with overly verbose and fragile code for identifying elements and skipping unwanted tags.
Identification of elements (slight irritation)
I explicitly ignore the namespace by comparing only the nameLocalName of each element name. I assume the correct way is to hardcode the right namespace into the program and have a helper that constructs my element names for comparison in the tag* functions? This is a bit annoying, because I have to support at least two namespaces (GPX 1.0 and 1.1) that are similar enough that they require no code changes for my purposes.
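To make the namespace question concrete, here is a minimal sketch of the kind of helper I have in mind. It assumes the Name type from xml-types (Data.XML.Types) and the two published GPX namespace URIs; the names gpxNamespaces and isGpx are made up for illustration and are not part of any library.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.XML.Types (Name (..))

-- | Namespace URIs of the two GPX versions I need to support.
gpxNamespaces :: [Text]
gpxNamespaces =
  [ "http://www.topografix.com/GPX/1/0"
  , "http://www.topografix.com/GPX/1/1"
  ]

-- | Hypothetical helper: True when the name has the given local part
-- and its namespace is one of the GPX namespaces listed above.
-- Names without any namespace are rejected.
isGpx :: Text -> Name -> Bool
isGpx local name =
  nameLocalName name == local
    && maybe False (`elem` gpxNamespaces) (nameNamespace name)
```

A predicate like isGpx "extensions" could then stand in wherever I currently pass ((== n) . nameLocalName), so both GPX versions are accepted while unrelated namespaces are not.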
Skipping items
The GPX schema is fairly large, and the set of user extensions is larger still. Since my tool needs only a small part of that information, I decided to ignore certain tags along with all their subelements. For instance:
<trkpt lat="45.19843" lon="-122.428">
    <ele>4</ele>
    <time>...</time>
    <extensions> ... </extensions>
</trkpt>
To ignore extensions and similar tags with numerous subelements, I wrote a sink that consumes events up to the closing EventEndElement:
    skipTagAndContents :: (MonadThrow m) => Text -> Sink Event m (Maybe ())
    skipTagAndContents n =
        tagPredicate ((== n) . nameLocalName)
                     ignoreAttrs
                     (const $ many (skipElements n) >> return ())

    skipElements t = do
        x <- await
        case x of
            Just (EventEndElement n) | nameLocalName n == t -> Done x Nothing
            Nothing -> Done x Nothing
            _ -> return (Just ())
There doesn't seem to be a tag* variant that does this for me (that is, one that succeeds even when not all children have been consumed). Does that mean I'm missing a simple combinator, or should I send a patch?
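For comparison, this is roughly the behaviour I would expect from such a combinator, written as a depth-counting sketch. It assumes a recent conduit (the ConduitT type from conduit 1.3 or later) rather than the Sink/Done API used above, and skipContents, sampleEvents and remaining are names I made up for illustration. Unlike my skipElements, it does not stop at a nested end element that happens to share the parent's local name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Conduit (ConduitT, await, leftover, runConduitPure, (.|))
import qualified Data.Conduit.List as CL
import Data.XML.Types (Content (..), Event (..))

-- | Consume every event that belongs to the children of the current element.
-- The element's own closing event is pushed back with 'leftover', so the
-- enclosing parser can still see it.
skipContents :: Monad m => ConduitT Event o m ()
skipContents = go (0 :: Int)
  where
    go depth = do
      mev <- await
      case mev of
        Nothing -> return ()                           -- stream ended early
        Just (EventBeginElement _ _) -> go (depth + 1) -- entering a child
        Just ev@(EventEndElement _)
          | depth == 0 -> leftover ev                  -- parent's closing event
          | otherwise -> go (depth - 1)                -- leaving a child
        Just _ -> go depth                             -- text, comments, etc.

-- | Hypothetical event stream: the inside of <extensions>, then what follows.
sampleEvents :: [Event]
sampleEvents =
  [ EventBeginElement "vendor" []
  , EventBeginElement "extensions" []   -- nested element with the same name
  , EventContent (ContentText "ignored")
  , EventEndElement "extensions"
  , EventEndElement "vendor"
  , EventEndElement "extensions"        -- closes the element being skipped
  , EventEndElement "trkpt"
  ]

-- | Events left in the stream after the children have been skipped.
remaining :: [Event]
remaining =
  runConduitPure (CL.sourceList sampleEvents .| (skipContents >> CL.consume))
```

Here remaining should start at the outer closing extensions event, which is what a tag-style parser needs to see in order to finish the element.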