This answer assumes that you really want to write a parser and are ready to make the necessary effort.
You should start with the formal JSON specification. I found RFC 4627 at http://www.ietf.org/rfc/rfc4627.txt . It defines the language precisely. You MUST implement everything in the specification and write tests for it. Your parser MUST cater for invalid JSON (like yours) and throw exceptions.
If you want to write a parser, stop, think, and then don't. It is a lot of work to get one working correctly. Whatever you do, do it properly: incomplete parsers are a menace and should never be distributed.
You MUST write code that conforms. Here are some phrases from the specification. If you do not understand them, you will have to research them carefully until you do:
"JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."
"A JSON parser MUST accept all texts that conform to the JSON grammar."
"Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32
JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON is written in UTF-8, JSON is 8bit compatible. When JSON is written in UTF-16 or UTF-32, the binary content-transfer-encoding must be used."
"Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A through F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C"."
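To make the last quote concrete, here is a minimal sketch of decoding the escapes inside a JSON string body. The class name `JsonEscapes` and the method `unescape` are hypothetical names chosen for this illustration, and the input is assumed to be the text between the quotes, already decoded into a Java `String`. It is not a complete string parser, only a demonstration of how strict the rule is: lowercase `u`, exactly four hex digits, upper or lower case.

```java
final class JsonEscapes {
    private JsonEscapes() {}

    // Accept only ASCII hex digits; Character.digit would also accept
    // non-ASCII digits, which the JSON grammar does not allow.
    private static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    /** Decodes the body of a JSON string (the text between the quotes). */
    static String unescape(String body) {
        StringBuilder out = new StringBuilder(body.length());
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i++);
            if (c < 0x20 || c == '"') {
                throw new IllegalArgumentException("character must be escaped at index " + (i - 1));
            }
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i >= body.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = body.charAt(i++);
            switch (e) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                case 'u':
                    // A lowercase u must be followed by exactly four hex digits.
                    if (i + 4 > body.length()) {
                        throw new IllegalArgumentException("truncated unicode escape");
                    }
                    int code = 0;
                    for (int k = 0; k < 4; k++) {
                        int digit = hex(body.charAt(i++));
                        if (digit < 0) {
                            throw new IllegalArgumentException("bad hex digit in unicode escape");
                        }
                        code = code * 16 + digit;
                    }
                    // One UTF-16 code unit; a surrogate pair arrives as two escapes.
                    out.append((char) code);
                    break;
                default:
                    throw new IllegalArgumentException("invalid escape: \\" + e);
            }
        }
        return out.toString();
    }
}
```

With this sketch, the six-character input `\u005C` (or `\u005c`) decodes to a single reverse solidus, while `\U005C` with an uppercase U is rejected. Characters outside the Basic Multilingual Plane come out correctly as long as the input supplies both halves of the surrogate pair.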
If you understand this and still want to write a parser, check out some other parsers and see if they have any conformance tests. Borrow them for your application.
If you are still interested, you should seriously consider using a parser generator. Examples are JavaCC, CUP and my preferred tool, ANTLR. ANTLR is very powerful, but it can be difficult to get started with. See also the suggestion of Parboiled, which I would recommend. JSON is relatively simple, so it would be a useful exercise. Most parser generators produce a complete parser that can either emit executable code or build a parse tree of your JSON.
There is a JSON parser using ANTLR at http://www.antlr.org/wiki/display/ANTLR3/JSON+Interpreter if you are allowed to look at it. I also just discovered a Parboiled parser-generator for JSON. If your main reason for writing a parser is to learn how to do it, this is probably a good starting point.
If you are not allowed to use a parser generator (or do not want to), you will have to create your own parser. This usually comes in two parts:
A lexer / tokenizer. This recognizes the basic primitives defined in the language specification. In this case, it has to recognize curly braces, quotation marks, and so on. It will probably also build the representation of numbers.
An abstract syntax tree (AST, http://en.wikipedia.org/wiki/Abstract_syntax_tree ) generator. Here you write code to build a tree representing the abstraction of your JSON (for example, whitespace and curly braces are discarded).
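As a rough illustration of the first part, a hand-written tokenizer could look like the sketch below. The names `JsonLexer`, `Kind`, `Token` and `tokenize` are hypothetical and not taken from any library. The sketch assumes the input is already a decoded Java `String`, and it keeps string escapes raw instead of decoding them. It only splits the text into tokens. Rejecting sequences that lex cleanly but are not valid JSON (for example `01`, which comes out as two number tokens) is left to the tree-building stage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JsonLexer {
    enum Kind { LBRACE, RBRACE, LBRACKET, RBRACKET, COLON, COMMA, STRING, NUMBER, TRUE, FALSE, NULL }

    static final class Token {
        final Kind kind;
        final String text; // raw source text (for strings: the body without the quotes)
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    // JSON number grammar: optional minus, integer part without leading
    // zeros, optional fraction, optional exponent.
    private static final Pattern NUMBER =
        Pattern.compile("-?(0|[1-9][0-9]*)(\\.[0-9]+)?([eE][+-]?[0-9]+)?");

    private static Kind punctuation(char c) {
        switch (c) {
            case '{': return Kind.LBRACE;
            case '}': return Kind.RBRACE;
            case '[': return Kind.LBRACKET;
            case ']': return Kind.RBRACKET;
            case ':': return Kind.COLON;
            case ',': return Kind.COMMA;
            default:  return null;
        }
    }

    static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        Matcher number = NUMBER.matcher(src);
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            // Only these four characters count as insignificant whitespace.
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') { i++; continue; }

            Kind punct = punctuation(c);
            if (punct != null) { tokens.add(new Token(punct, String.valueOf(c))); i++; continue; }

            if (c == '"') {
                int start = ++i; // first character after the opening quote
                while (true) {
                    if (i >= src.length()) throw new IllegalArgumentException("unterminated string");
                    char s = src.charAt(i);
                    if (s == '"') break;
                    if (s < 0x20) throw new IllegalArgumentException("control character in string at " + i);
                    i += (s == '\\') ? 2 : 1; // skip the escaped character so an escaped quote does not end the string
                }
                tokens.add(new Token(Kind.STRING, src.substring(start, i)));
                i++; // skip the closing quote
                continue;
            }

            if (src.startsWith("true", i))  { tokens.add(new Token(Kind.TRUE, "true"));   i += 4; continue; }
            if (src.startsWith("false", i)) { tokens.add(new Token(Kind.FALSE, "false")); i += 5; continue; }
            if (src.startsWith("null", i))  { tokens.add(new Token(Kind.NULL, "null"));   i += 4; continue; }

            number.region(i, src.length());
            if (number.lookingAt()) {
                tokens.add(new Token(Kind.NUMBER, number.group()));
                i = number.end();
                continue;
            }
            throw new IllegalArgumentException("unexpected character '" + c + "' at index " + i);
        }
        return tokens;
    }
}
```

Note that invalid input such as single-quoted strings is rejected with an exception rather than silently accepted, which is the behaviour the specification requires of a conforming parser.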
Once you have the AST, it should be easy to iterate over the nodes and create the output you want.
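As an example of what walking the tree might look like, the sketch below assumes (purely for illustration) that the AST is represented with plain Java collections: a `Map` for an object, a `List` for an array, and `String`, `Number`, `Boolean` or `null` for leaves. The class `JsonTreeWalker` and its `flatten` method are hypothetical names. The "desired result" here is one `path = value` line per leaf.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class JsonTreeWalker {
    private JsonTreeWalker() {}

    /** Returns one "path = value" line per leaf, in tree order. */
    static List<String> flatten(Object root) {
        List<String> out = new ArrayList<>();
        walk("$", root, out);
        return out;
    }

    private static void walk(String path, Object node, List<String> out) {
        if (node instanceof Map) {
            // Object node: descend into each member, extending the path with its name.
            for (Map.Entry<?, ?> member : ((Map<?, ?>) node).entrySet()) {
                walk(path + "." + member.getKey(), member.getValue(), out);
            }
        } else if (node instanceof List) {
            // Array node: descend into each element, extending the path with its index.
            List<?> elements = (List<?>) node;
            for (int i = 0; i < elements.size(); i++) {
                walk(path + "[" + i + "]", elements.get(i), out);
            }
        } else {
            // Leaf node: string, number, boolean or null.
            out.add(path + " = " + node);
        }
    }
}
```

In this sketch empty objects and arrays produce no output lines, and the member order is whatever the `Map` implementation yields, so a tree built with `LinkedHashMap` preserves document order.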
But writing parsers, even for a simple language like JSON, is a lot of work.