For a system software development course, I am working on a complete assembler for an assembly language developed by the instructor. I am currently working on the tokenizer. While doing some searches, I came across the Java StringTokenizer class ... but I can see that it is essentially deprecated. However, it seems much easier to use than the String.split method with regular expressions.
Is there any reason why I should avoid using StringTokenizer? Is there anything else in the standard Java libraries that is well suited for this task and that I don't know about?
EDIT: More information.
The reason I consider String.split complicated is that my knowledge of regular expressions is limited to knowing, more or less, what they are. Although learning them would be useful for my general knowledge as a software developer, I'm not sure I want to spend the time right now, especially if there is a simpler alternative.
In terms of my use of the tokenizer: it will go through a text file containing assembler code and break it into tokens, passing the text and token to the parser. Separators include whitespace (spaces, tabs, newlines), the comment start character '|' (which may occur on a line of its own or after other text) and a comma to separate the operands of an instruction.
I would write it more mathematically, but my knowledge of formal languages is a little rusty.
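To make the separators described above concrete, here is a minimal sketch using StringTokenizer on one line at a time. The class name AsmTokenizerSketch, the method tokenizeLine, and the sample instruction "ADD r1, r2" are made up for illustration; they are not taken from the instructor's assembly language. It also assumes the parser does not need the commas themselves, so they are dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class AsmTokenizerSketch {

    // Whitespace, the operand separator ',' and the comment start '|'.
    private static final String DELIMITERS = " \t\r\n,|";

    /** Splits one line of assembler source into tokens, dropping any comment. */
    public static List<String> tokenizeLine(String line) {
        List<String> tokens = new ArrayList<>();
        // returnDelims = true: delimiters come back as one-character tokens,
        // which lets us notice '|' and stop at the start of a comment.
        StringTokenizer st = new StringTokenizer(line, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            if (token.equals("|")) {
                break; // the rest of the line is a comment
            }
            boolean isDelimiter =
                    token.length() == 1 && DELIMITERS.indexOf(token.charAt(0)) >= 0;
            if (!isDelimiter) {
                tokens.add(token); // whitespace and commas are skipped
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical input line; the real mnemonics may look different.
        System.out.println(tokenizeLine("ADD r1, r2 | add the registers")); // [ADD, r1, r2]
        System.out.println(tokenizeLine("| a comment on its own line"));    // []
    }
}
```

If the parser needs to see the commas, the skip step could be changed to keep "," as a token of its own.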
EDIT 2: Stating the question more clearly
I have looked at the documentation for the StringTokenizer class. It works well for my purposes, but its use is not recommended. Besides String.split, is there anything in the standard Java libraries that would be useful?