Well... where to start with this one.
First of all, "write a parser" is a very broad expression, especially given the question you are asking.
Your opening statement was that you need a simple arithmetic "parser". Technically, the first thing that calls for is not a parser but a lexical analyzer, similar to what you would use when creating a new language ( http://en.wikipedia.org/wiki/Lexical_analysis ). I can understand where the confusion comes from, since the two are often treated as one and the same. It is important to note that lexical analysis is ALSO something you will want to understand if you are going to write language / script parsers. That is not strictly parsing, because you are interpreting the instructions rather than simply extracting information from them.
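To make the lexical analysis idea a bit more concrete, here is a minimal sketch of a tokenizer for arithmetic expressions. The class name `ArithmeticLexer`, the `KIND:text` token format and the supported operator set are all my own assumptions for illustration, not something from your question. It only splits the input into tokens and does not evaluate anything.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: splits an arithmetic expression into tokens.
public static class ArithmeticLexer
{
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (char.IsWhiteSpace(c))
            {
                i++; // whitespace only separates tokens
            }
            else if (char.IsDigit(c))
            {
                // Consume a run of digits and dots as one number token.
                // (No validation of things like "1.2.3"; this is a sketch.)
                int start = i;
                while (i < input.Length && (char.IsDigit(input[i]) || input[i] == '.'))
                {
                    i++;
                }
                tokens.Add("NUMBER:" + input.Substring(start, i - start));
            }
            else if ("+-*/".IndexOf(c) >= 0)
            {
                tokens.Add("OP:" + c);
                i++;
            }
            else if (c == '(' || c == ')')
            {
                tokens.Add("PAREN:" + c);
                i++;
            }
            else
            {
                throw new FormatException(
                    "Unexpected character '" + c + "' at position " + i);
            }
        }
        return tokens;
    }

    public static void Main()
    {
        foreach (string token in Tokenize("3 + 4.5 * (2 - 1)"))
        {
            Console.WriteLine(token);
        }
    }
}
```

A parser or interpreter would then work on this token list instead of on raw characters, which is the part that decides what `3 + 4.5 * (2 - 1)` actually means.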
Back to the parsing issue...
Parsing is what you do when you take a rigidly defined file structure and extract information from it.
In general, you really do not need to write a parser for XML / HTML, because there are already plenty of them around. And if the XML you are parsing was produced by the .NET runtime in the first place, you do not even need to parse it, you just need to "serialize" and "deserialize".
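As a rough illustration of the serialize / deserialize route, here is a small sketch using `XmlSerializer` from `System.Xml.Serialization`. The `Movie` class and the `MovieSerialization` helper are hypothetical names I made up for the example; your own types will look different.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical data class; XmlSerializer needs a public type
// with a public parameterless constructor.
public class Movie
{
    [XmlAttribute("id")]
    public int Id { get; set; }

    [XmlElement("name")]
    public string Name { get; set; }
}

public static class MovieSerialization
{
    // Turn an object into an XML string.
    public static string ToXml(Movie movie)
    {
        var serializer = new XmlSerializer(typeof(Movie));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, movie);
            return writer.ToString();
        }
    }

    // Turn the XML string back into an object.
    public static Movie FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Movie));
        using (var reader = new StringReader(xml))
        {
            return (Movie)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml = ToXml(new Movie { Id = 1, Name = "Tron" });
        Movie copy = FromXml(xml);
        Console.WriteLine(copy.Id + " " + copy.Name);
    }
}
```

No hand-written parsing is involved here: the attributes on the class describe the XML shape and the framework does the rest.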
In the interest of learning, however, parsing XML (or something similar, like HTML) is very straightforward in most cases.
If we start with the following XML:
<movies>
    <movie id="1">
        <name>Tron</name>
    </movie>
    <movie id="2">
        <name>Tron Legacy</name>
    </movie>
</movies>
We can load the data into an XElement as follows:
XElement myXML = XElement.Load("mymovies.xml");
After that, myXML is itself the root 'movies' element (if you load with XDocument.Load instead, you get the root element via its Root property).
More interestingly, however, you can easily use LINQ to get at the nested tags:
var myElements = from p in myXML.Elements("movie") select p;
This gives you a sequence of XElements, each of which contains one <movie>...</movie> element, which you can work with using something like:
foreach (var v in myElements)
{
    // Attribute("id") reads the id attribute; Element("name") reads the nested <name> tag.
    Console.WriteLine(string.Format("ID {0} = {1}", (int)v.Attribute("id"), (string)v.Element("name")));
}
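Putting the pieces above together, a self-contained version might look like the sketch below. To keep it runnable without a file on disk it uses `XElement.Parse` on an inline string instead of `XElement.Load("mymovies.xml")`; the class name `MovieXmlDemo` and the `Describe` helper are just names I picked for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MovieXmlDemo
{
    // Same data as the sample XML, with single-quoted attributes
    // so it fits comfortably inside a C# string.
    public const string SampleXml =
        "<movies>" +
        "<movie id='1'><name>Tron</name></movie>" +
        "<movie id='2'><name>Tron Legacy</name></movie>" +
        "</movies>";

    // Returns one formatted line per <movie> element.
    public static List<string> Describe(string xml)
    {
        XElement root = XElement.Parse(xml); // root is the <movies> element

        var lines = from m in root.Elements("movie")
                    select string.Format("ID {0} = {1}",
                        (int)m.Attribute("id"),
                        (string)m.Element("name"));

        return lines.ToList();
    }

    public static void Main()
    {
        foreach (string line in Describe(SampleXml))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // ID 1 = Tron
        // ID 2 = Tron Legacy
    }
}
```

The explicit casts on `XAttribute` and `XElement` are part of LINQ to XML itself and save you from calling `.Value` and `int.Parse` by hand.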
For anything other than XML-like data structures, I'm afraid you will have to start learning the art of regular expressions. A tool like "Regex Coach" ( http://weitz.de/regex-coach/ ), or one of the many similar tools, will really help you.
You will also need to become familiar with the .NET regex objects; ( http://www.codeproject.com/KB/dotnet/regextutorial.aspx ) should give you a good start.
Once you know how your regular expressions work, in most cases it is a simple matter of reading the file in one line at a time and making sense of each line with whichever method you feel comfortable with.
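As a sketch of that line-at-a-time approach, the example below assumes a made-up file format with one `key = value` pair per line. The format, the `LineParser` class and the regex are all assumptions for illustration; a real format needs its own pattern.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class LineParser
{
    // Hypothetical format: "key = value", whitespace around both parts ignored.
    private static readonly Regex KeyValue =
        new Regex(@"^\s*(?<key>\w+)\s*=\s*(?<value>.*?)\s*$");

    // Reads the input one line at a time and keeps every line that matches.
    public static Dictionary<string, string> Parse(TextReader reader)
    {
        var result = new Dictionary<string, string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Match m = KeyValue.Match(line);
            if (m.Success)
            {
                result[m.Groups["key"].Value] = m.Groups["value"].Value;
            }
            // Lines that do not match (blank lines, comments) are skipped.
        }
        return result;
    }

    public static void Main()
    {
        string sample = "title = Tron Legacy  \n# a comment\nyear=2010";
        using (var reader = new StringReader(sample))
        {
            foreach (var pair in Parse(reader))
            {
                Console.WriteLine(pair.Key + " -> " + pair.Value);
            }
        }
    }
}
```

For a real file you would pass something like `File.OpenText(path)` instead of the `StringReader`; both are `TextReader`s, so `Parse` does not change.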
A good free source of file format descriptions for almost anything you can imagine can be found at ( http://www.wotsit.org/ )