A Python-based JSON parser and scanner that reads tokenized input from .txt files, builds a parse tree, and outputs valid JSON or a structured error message.
- Custom scanner to tokenize JSON-like inputs
- Recursive descent parser
- Handles objects, arrays, strings, numbers, booleans, and nulls
- Type validation for arrays (homogeneous elements)
- Detects and reports:
  - Invalid tokens
  - Duplicate keys in objects
  - Type mismatches in arrays
  - Malformed numbers and strings
  - Extra trailing commas
- Outputs valid .json files or JSON-formatted error messages
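
As a rough illustration of the recursive descent approach and the duplicate-key, array-type, and trailing-comma checks listed above, here is a minimal sketch. It is not the code in json_parser.py: the names `Parser`, `ParseError`, and `json_type` are hypothetical, and tokens are assumed to be `(kind, value)` pairs whose values are already Python objects.

```python
class ParseError(Exception):
    """Raised when the token stream is not valid (hypothetical name)."""


def json_type(value):
    """Map a Python value to a JSON type name for the homogeneity check."""
    if isinstance(value, bool):  # must come before the int check
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    return "object" if isinstance(value, dict) else "array"


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens  # assumed: list of (kind, value) pairs
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, kind):
        token = self.advance()
        if token[0] != kind:
            raise ParseError(f"expected {kind!r}, got {token[0]!r}")
        return token

    def parse(self):
        value = self.parse_value()
        if self.peek()[0] != "eof":
            raise ParseError("unexpected tokens after the top-level value")
        return value

    def parse_value(self):
        kind, value = self.peek()
        if kind == "{":
            return self.parse_object()
        if kind == "[":
            return self.parse_array()
        if kind in ("str", "num", "bool", "null"):
            self.advance()
            return value
        raise ParseError(f"unexpected token {kind!r}")

    def parse_object(self):
        self.expect("{")
        result = {}
        if self.peek()[0] == "}":
            self.advance()
            return result
        while True:
            _, key = self.expect("str")
            if key in result:
                raise ParseError(f"duplicate key {key!r}")
            self.expect(":")
            result[key] = self.parse_value()
            if self.peek()[0] == ",":
                self.advance()
                if self.peek()[0] == "}":
                    raise ParseError("trailing comma in object")
                continue
            self.expect("}")
            return result

    def parse_array(self):
        self.expect("[")
        items = []
        if self.peek()[0] == "]":
            self.advance()
            return items
        while True:
            items.append(self.parse_value())
            # Every element must share the JSON type of the first one.
            if json_type(items[-1]) != json_type(items[0]):
                raise ParseError("mixed types in array")
            if self.peek()[0] == ",":
                self.advance()
                if self.peek()[0] == "]":
                    raise ParseError("trailing comma in array")
                continue
            self.expect("]")
            return items
```

Each grammar rule (value, object, array) gets its own method, which is the defining trait of recursive descent. Treating integers and floats as the same "number" type is a design choice of this sketch, not something the project documents.
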
- json_scanner.py | Defines TokenType, Token, and Lexer classes
- json_parser.py | Main parser logic
- test0X.txt | Example input files
- output0X.json | Output files (created after running)
Input .txt files contain pre-tokenized lines, one token per line. Example:
<{>
<str,"name">
<:>
<str,"Ben">
<,>
<str,"age">
<:>
<num,21>
<}>
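
Token lines in the shape shown above could be read with something like the following sketch. The function name `read_tokens` and the regular expression are assumptions made for illustration, not the actual `Lexer` API in json_scanner.py, and only the two line shapes seen in the example (`<punct>` and `<kind,value>`) are handled.

```python
import re

# Assumed line shapes: "<{>" style punctuation, or "<kind,value>".
TOKEN_LINE = re.compile(
    r"^<(?:(?P<punct>[{}\[\]:,])|(?P<kind>[a-z]+),(?P<value>.*))>$"
)


def read_tokens(lines):
    """Turn pre-tokenized lines into (kind, value) pairs."""
    tokens = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = TOKEN_LINE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: not a token: {line!r}")
        if match.group("punct") is not None:
            tokens.append((match.group("punct"), None))
        else:
            # The value is kept as raw text, quotes included.
            tokens.append((match.group("kind"), match.group("value")))
    return tokens
```

For example, `read_tokens(['<{>', '<num,21>', '<}>'])` yields one pair per line, with `None` as the value of the punctuation tokens.
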
On success:
{
  "name": "Ben",
  "age": 21
}
On error:
{
  "error": "Error type 5 at \"name\": Duplicate keys found."
}
- Ensure you have python3 installed.
- Clone the repo
git clone https://github.com/your-username/json-parser.git
cd json-parser
- Adjust the test files to your liking (test00.txt, test01.txt, etc.)
- Run the parser
python3 json_parser.py
Specific Types:
- Malformed number (starts/ends with a dot)
- Empty string
- Number starts with + or leading 0
- Reserved words used as strings (e.g. "true")
- Duplicate keys in objects
- Mixed types in arrays
Other:
- Extra comma: trailing commas in arrays or objects
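
A few of the lexical checks above might look like the sketch below. The numeric codes are an assumption: they follow the order of the list, which agrees with the "Error type 5" duplicate-key example shown earlier, but the project's real numbering may differ. The function names `check_number`, `check_string`, and `error_report` are hypothetical.

```python
import json

RESERVED = {"true", "false", "null"}


def check_number(text):
    """Return an assumed error type for a number lexeme, or None if it passes."""
    if text.startswith(".") or text.endswith("."):
        return 1  # malformed number: starts or ends with a dot
    if text.startswith("+"):
        return 3  # explicit plus sign
    digits = text.lstrip("-")
    if len(digits) > 1 and digits[0] == "0" and digits[1].isdigit():
        return 3  # leading zero such as 007 (0 and 0.5 pass)
    return None


def check_string(text):
    """Return an assumed error type for unquoted string content, or None."""
    if text == "":
        return 2  # empty string
    if text in RESERVED:
        return 4  # reserved word used as a string
    return None


def error_report(error_type, lexeme, message):
    """Format an error in the JSON shape used by the error output example."""
    text = f'Error type {error_type} at "{lexeme}": {message}'
    return json.dumps({"error": text}, indent=2)
```

Calling `error_report(5, "name", "Duplicate keys found.")` produces a JSON object with a single "error" key, matching the shape of the error output example.
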
MIT License