This is a compiler for the PL/0 language, which I am building for the Compiler/Interpreter course at HTW Dresden.
It uses the rather uncommon graph-based approach to parsing taught in that course.
It does not generate machine code; instead, it generates byte code for the virtual machine supplied in the course.
The implemented grammar differs from the original PL/0 grammar in the following ways:
- Statement is not permitted to be empty.
- It supports C-style comments in the /* your comment here */ syntax (these also work across multiple lines). It does not support // single-line comments, however.
- It permits string output in the form of !"This is a string with \"escaped quote marks\" and an escaped backslash \\". Refer to the syntax graph of Statement and the Lexer FSM for details.
- It permits conditional statements with ELSE branches. See the Conditional Statement graph.
- It allows arrays of variables:
var index, b[3];
begin
?index;
?b[index];
!b[index]
end.
See the syntax graphs of Variable declaration, Factor, Assignment statement, Input statement and Array Index for details.
- It supports logical expressions in the conditions of loops and conditionals, with the operators NOT, OR, AND and curly braces {} for grouping. See the graphs for Logical expression, Logical term, and Logical factor as well as Conditional statement and Loop statement.
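The escaping rule for string output mentioned in the list above can be illustrated with a short sketch. This is not the project's lexer code: the class name StringLiteralSketch and the method unescape are hypothetical, and the sketch assumes that a backslash simply makes the following character literal, which covers the two escapes shown (\" and \\) but may be more permissive than the real Lexer FSM.

```java
/** Hypothetical sketch: decodes a quoted PL/0 string literal as described in the feature list. */
final class StringLiteralSketch {

    /** Takes the literal including its surrounding quotes and returns the unescaped content. */
    static String unescape(String literal) {
        int last = literal.length() - 1;
        if (last < 1 || literal.charAt(0) != '"' || literal.charAt(last) != '"') {
            throw new IllegalArgumentException("literal must be enclosed in double quotes");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < last; i++) {
            char c = literal.charAt(i);
            if (c == '\\') {
                i++; // assumption: the character after a backslash is taken literally
                if (i >= last) {
                    throw new IllegalArgumentException("closing quote is escaped");
                }
                out.append(literal.charAt(i));
            } else if (c == '"') {
                throw new IllegalArgumentException("unescaped quote inside literal");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\"escaped \\\"quote\\\" and backslash \\\\\""));
    }
}
```

Rejecting an unescaped quote inside the literal is a design choice of this sketch; a real lexer would instead end the token at that quote.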
This is the complete EBNF:
PROGRAM = BLOCK, ".";
BLOCK = [ CONST_DECLARATION_LIST ],
[ VAR_DECLARATION_LIST ],
[ PROCEDURE_DECLARATION ],
STATEMENT;
CONST_DECLARATION_LIST = "CONST",
CONST_DECLARATION,
{ ",", CONST_DECLARATION },
";";
CONST_DECLARATION = IDENTIFIER, "=", NUMERAL;
VAR_DECLARATION_LIST = "VAR",
VAR_DECLARATION,
{ ",", VAR_DECLARATION },
";";
VAR_DECLARATION = IDENTIFIER, [ "[", NUMERAL, "]" ];
PROCEDURE_DECLARATION = "PROCEDURE", IDENTIFIER, ";", BLOCK, ";";
STATEMENT = ASSIGNMENT_STATEMENT
| CONDITIONAL_STATEMENT
| LOOP_STATEMENT
| COMPOUND_STATEMENT
| PROCEDURE_CALL
| INPUT_STATEMENT
| OUTPUT_STATEMENT;
ASSIGNMENT_STATEMENT = IDENTIFIER, [ ARRAY_INDEX ], ":=", EXPRESSION;
CONDITIONAL_STATEMENT = "IF", LOGICAL_EXPRESSION, "THEN", STATEMENT, [ "ELSE", STATEMENT ];
LOOP_STATEMENT = "WHILE", LOGICAL_EXPRESSION, "DO", STATEMENT;
COMPOUND_STATEMENT = "BEGIN", STATEMENT, { ";", STATEMENT }, "END";
PROCEDURE_CALL = "CALL", IDENTIFIER;
INPUT_STATEMENT = "?", IDENTIFIER, [ ARRAY_INDEX ];
OUTPUT_STATEMENT = "!",
(
EXPRESSION
| '"', STRING, '"'
);
ARRAY_INDEX = "[", EXPRESSION, "]";
CONDITION = ( "ODD", EXPRESSION )
| (
EXPRESSION,
( "=" | "#" | ">" | ">=" | "<" | "<=" ),
EXPRESSION
);
EXPRESSION = [ "-" ], TERM, { ( "+" | "-" ), TERM };
TERM = FACTOR, { ( "*" | "/" ), FACTOR };
FACTOR = NUMERAL
| "(", EXPRESSION, ")"
| IDENTIFIER, [ ARRAY_INDEX ];
LOGICAL_EXPRESSION = LOGICAL_TERM, { "OR", LOGICAL_TERM };
LOGICAL_TERM = [ "NOT" ], LOGICAL_FACTOR, { "AND", [ "NOT" ], LOGICAL_FACTOR };
LOGICAL_FACTOR = CONDITION
| (
"{",
LOGICAL_EXPRESSION,
"}"
);
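To show how the three LOGICAL_* rules nest, here is a small hand-written recursive-descent recognizer. It is only an illustration of the grammar, not the project's parser, which is graph-based. The class name LogicalRecognizer is hypothetical, tokens are assumed to be separated by whitespace, keywords are assumed to be case-insensitive, and EXPRESSION is simplified to a single identifier or numeral.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: recognizes LOGICAL_EXPRESSION over whitespace-separated tokens. */
final class LogicalRecognizer {
    private static final List<String> RELOPS = Arrays.asList("=", "#", ">", ">=", "<", "<=");
    private static final List<String> KEYWORDS = Arrays.asList("NOT", "OR", "AND", "ODD");

    private final List<String> tokens;
    private int pos = 0;

    private LogicalRecognizer(List<String> tokens) { this.tokens = tokens; }

    static boolean accepts(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) return false;
        LogicalRecognizer r = new LogicalRecognizer(Arrays.asList(trimmed.split("\\s+")));
        return r.logicalExpression() && r.pos == r.tokens.size();
    }

    /** Consumes the next token if it equals the expected one (ignoring case). */
    private boolean take(String expected) {
        if (pos < tokens.size() && tokens.get(pos).equalsIgnoreCase(expected)) { pos++; return true; }
        return false;
    }

    // LOGICAL_EXPRESSION = LOGICAL_TERM, { "OR", LOGICAL_TERM };
    private boolean logicalExpression() {
        if (!logicalTerm()) return false;
        while (take("OR")) {
            if (!logicalTerm()) return false;
        }
        return true;
    }

    // LOGICAL_TERM = [ "NOT" ], LOGICAL_FACTOR, { "AND", [ "NOT" ], LOGICAL_FACTOR };
    private boolean logicalTerm() {
        take("NOT"); // optional
        if (!logicalFactor()) return false;
        while (take("AND")) {
            take("NOT"); // optional
            if (!logicalFactor()) return false;
        }
        return true;
    }

    // LOGICAL_FACTOR = CONDITION | ( "{", LOGICAL_EXPRESSION, "}" );
    private boolean logicalFactor() {
        if (take("{")) return logicalExpression() && take("}");
        return condition();
    }

    // CONDITION, with EXPRESSION simplified to a single operand (assumption of this sketch)
    private boolean condition() {
        if (take("ODD")) return operand();
        return operand() && relop() && operand();
    }

    private boolean relop() {
        if (pos < tokens.size() && RELOPS.contains(tokens.get(pos))) { pos++; return true; }
        return false;
    }

    private boolean operand() {
        if (pos >= tokens.size()) return false;
        String t = tokens.get(pos);
        boolean ok = t.matches("[A-Za-z][A-Za-z0-9]*|[0-9]+") && !KEYWORDS.contains(t.toUpperCase());
        if (ok) pos++;
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(accepts("NOT { a = 1 OR ODD b } AND c # 0"));
    }
}
```

Because OR is handled in LOGICAL_EXPRESSION and AND in LOGICAL_TERM, AND binds more tightly than OR, and the curly braces are needed whenever an OR should be evaluated first.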
This project uses Java 8 and Maven. If you don't have Maven, you can also compile with javac.
To build the compiler, run
mvn package
This runs all tests and generates an executable jar file.
You can run the compiler by executing
java -jar path/to/jar <source file> [<output file>]
The API reference can be generated by running
mvn javadoc:javadoc
The generated docs can then be found under target/site/apidocs.
You can run the tests with
mvn test
The lexer is implemented with a finite state machine described in the following state chart diagram:
All states starting with E are final states signifying a completed token. Z10, Z11, and Z12 are responsible for comments. As the diagram shows, comments are not turned into tokens but simply skipped like whitespace.
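The idea of skipping comments inside the state machine can be sketched in a few lines. This is a simplified illustration and not the project's lexer: the class name CommentSkipper and the states CODE, SLASH, IN_COMMENT and STAR are hypothetical and do not map one-to-one onto Z10, Z11 and Z12. The sketch also ignores string literals and silently drops an unterminated comment.

```java
/** Hypothetical sketch: a four-state machine that removes C-style block comments. */
final class CommentSkipper {
    private enum State { CODE, SLASH, IN_COMMENT, STAR }

    /** Returns the source text with every block comment removed. */
    static String strip(String source) {
        StringBuilder out = new StringBuilder();
        State state = State.CODE;
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            switch (state) {
                case CODE:
                    if (c == '/') state = State.SLASH; // may start a comment or be a division
                    else out.append(c);
                    break;
                case SLASH:
                    if (c == '*') {
                        state = State.IN_COMMENT;      // "/*" seen: comment starts
                    } else if (c == '/') {
                        out.append('/');               // first slash was a division, second is pending
                    } else {
                        out.append('/').append(c);     // plain division operator
                        state = State.CODE;
                    }
                    break;
                case IN_COMMENT:
                    if (c == '*') state = State.STAR;  // possible end of comment
                    break;
                case STAR:
                    if (c == '/') state = State.CODE;  // "*/" seen: comment ends
                    else if (c != '*') state = State.IN_COMMENT;
                    break;
            }
        }
        if (state == State.SLASH) out.append('/');     // input ended right after a division sign
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("x := 4 / 2; /* halve\n the value */ !x"));
    }
}
```

A real lexer would not build a stripped copy of the source; it would stay in the comment states without emitting a token, which is the behavior the state chart describes.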
The implemented grammar is described by syntax graphs, which are implemented in the class de.htw_dresden.informatik.s75924.parser.Graph. These are their visual representations:
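For readers unfamiliar with graph-based parsing, the following sketch shows one way a syntax graph can be stored as a table of arcs and walked by a single generic routine. It is not the API of the Graph class named above: SyntaxGraphSketch, Arc, ArcType and the two tiny graphs are assumptions made for illustration, and the grammar is reduced to EXPRESSION = TERM { "+" TERM } and TERM = "n" | "(" EXPRESSION ")".

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a parser driven by syntax graphs stored as arc tables. */
final class SyntaxGraphSketch {
    enum ArcType { SYMBOL, GRAPH, END }

    /** One arc: what to match, where to go on success (next) and on failure (alt, -1 = none). */
    static final class Arc {
        final ArcType type; final String value; final int next; final int alt;
        Arc(ArcType type, String value, int next, int alt) {
            this.type = type; this.value = value; this.next = next; this.alt = alt;
        }
    }

    private final Map<String, Arc[]> graphs = new HashMap<>();
    private List<String> tokens;
    private int pos;

    SyntaxGraphSketch() {
        graphs.put("EXPRESSION", new Arc[] {
            new Arc(ArcType.GRAPH, "TERM", 1, -1),        // 0: first TERM
            new Arc(ArcType.SYMBOL, "+", 2, 3),           // 1: "+", otherwise finish at 3
            new Arc(ArcType.GRAPH, "TERM", 1, -1),        // 2: next TERM, loop back to 1
            new Arc(ArcType.END, null, -1, -1)            // 3: end of graph
        });
        graphs.put("TERM", new Arc[] {
            new Arc(ArcType.SYMBOL, "n", 4, 1),           // 0: numeral placeholder, else try 1
            new Arc(ArcType.SYMBOL, "(", 2, -1),          // 1: opening parenthesis
            new Arc(ArcType.GRAPH, "EXPRESSION", 3, -1),  // 2: nested expression
            new Arc(ArcType.SYMBOL, ")", 4, -1),          // 3: closing parenthesis
            new Arc(ArcType.END, null, -1, -1)            // 4: end of graph
        });
    }

    /** Returns true if the token sequence is a complete EXPRESSION. */
    boolean accepts(String... input) {
        tokens = Arrays.asList(input);
        pos = 0;
        return parse("EXPRESSION") && pos == tokens.size();
    }

    /** Walks one graph: follow next on a match, alt on a mismatch, fail if there is no alt. */
    private boolean parse(String graphName) {
        Arc[] arcs = graphs.get(graphName);
        int i = 0;
        while (true) {
            Arc arc = arcs[i];
            if (arc.type == ArcType.END) return true;
            boolean matched;
            if (arc.type == ArcType.SYMBOL) {
                matched = pos < tokens.size() && tokens.get(pos).equals(arc.value);
                if (matched) pos++;
            } else {
                matched = parse(arc.value);               // GRAPH arc: descend into the sub-graph
            }
            if (matched) i = arc.next;
            else if (arc.alt >= 0) i = arc.alt;
            else return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(new SyntaxGraphSketch().accepts("(", "n", "+", "n", ")", "+", "n"));
    }
}
```

The sketch does no backtracking: once a sub-graph has consumed tokens and then fails, the whole input is rejected. A compiler built this way would typically also attach code-generation actions to the arcs, which this sketch omits.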