Skip to content

Compiler for the PL0 language for HTW Dresden course Compiler/Interpreter

License

Notifications You must be signed in to change notification settings

revilovs/pl0_compiler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

72 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pl0_compiler

This is a compiler for the PL/0 language, which I am constructing for the course Compiler/ Interpreter at HTW Dresden.

It uses the rather uncommon graph based approach of a parser, as taught in the course at HTW.

It does not generate machine code, it generates a byte code that works with the virtual machine supplied in the course.

Language

The grammar implemented has some additional features that the actual PL/0 grammar doesn't have:

  • Statement is not permitted to be empty.
  • It supports comments in the C-style /* your comment here */ syntax (also works across multiple lines). It does not support // single line comments in this style, however.
  • It permits string output in the form of !"This is a string with \"escaped quote marks\" and an escaped backslash \\". Refer to the syntax graph of Statement and the Lexer FSM for details.
  • It permits conditional statements with ELSE branches. See the Conditional Statement graph
  • It allows arrays of variables:
var index, b[3];
begin
    ?index;
    ?b[index]
    !b[index]
end.

See the syntax graphs of Variable declaration Factor, Assignment statement, Input statement and Array Index for details

This is the complete EBNF:

PROGRAM = BLOCK, ".";
BLOCK   = [ CONST_DECLARATION_LIST ], 
    [ VAR_DECLARATION_LIST ], 
    [ PROCEDURE_DECLARATION ], 
    STATEMENT;

CONST_DECLARATION_LIST = "CONST", 
    CONSTANT_DECLARATION, 
    { ",", CONSTANT_DECLARATION }, 
    ";";
CONST_DECLARATION      = IDENTIFIER, "=", NUMERAL;
VAR_DECLARATION_LIST   = "VAR", 
    VAR_DECLARATION, 
    { ",", VAR_DECLARATION }, 
    ";";
VAR_DECLARATION        = IDENTIFIER, [ "[", NUMERAL "]" ];
PROCEDURE_DECLARATION  = "PROCEDURE", IDENTIFIER, ";", BLOCK, ";";

STATEMENT = ASSIGNMENT_STATEMENT
    | CONDITIONAL_STATEMENT
    | LOOP_STATEMENT
    | COMPOUND_STATEMENT
    | PROCEDURE_CALL
    | INPUT_STATEMENT
    | OUTPUT_STATEMENT;

ASSIGNMENT_STATEMENT  = IDENTIFIER, [ ARRAY_INDEX ], := EXPRESSION;
CONDITIONAL_STATEMENT = "IF", LOGICAL_EXPRESSION, "THEN", STATEMENT, [ "ELSE", STATEMENT ];
LOOP_STATEMENT        = "WHILE", LOGICAL_EXPRESSION, "DO", STATEMENT;
COMPOUND_STATEMENT    = "BEGIN", STATEMENT, { ";", STATEMENT }, "END";
PROCEDURE_CALL        = "CALL", IDENTIFIER;
INPUT_STATEMENT       = "?" IDENTIFIER, [ ARRAY_INDEX ];
OUTPUT_STATEMENT      = "!", 
    (
        EXPRESSION
        | '"', STRING, '"'
    );

ARRAY_INDEX = "[", EXPRESSION, "]";

CONDITION = ( "ODD", EXPRESSION ) 
    | ( 
        EXPRESSION, 
        ( "=" | "#" | ">" | ">=" | "<" | "<=" ),
        EXPRSSION
    );

EXPRESSION = [ "-" ], TERM, { ( "+" | "-" ), TERM };
TERM       = FACTOR, { ( "*" | "/" ), FACTOR };
FACTOR     = NUMERAL
    | "(", EXPRESSION, ")"
    | IDENTIFIER, [ ARRAY_INDEX ];

LOGICAL_EXPRESSION = LOGICAL_TERM, { "OR", LOGICAL_TERM };
LOGICAL_TERM       = [ "NOT" ], LOGICAL_FACTOR, { "AND", [ "NOT" ], LOGICAL_FACTOR };
LOGICAL_FACTOR     = CONDITION
    | (
        "{",
        LOGICAL_EXPRESSION,
        "}"
    );

Requirements

This project uses Java 8 and maven. If you don't have maven, you can also compile with javac.

Building

To build the compiler, run

mvn package

This runs all tests and generates an executable jar file. If you don't have maven installed, you can also use javac.

Running

You can run the compiler by executing

java -jar path/to/jar <source file> [<output file>]

API Reference

The API reference can be generated by running

mvn javadoc:javadoc

The generated docs can then be found under target/site/apidocs

Testing

You can run the tests with

mvn test

Lexer

The lexer is implemented with a finite state machine described in the following state chart diagram: Lexer's FSM diagram All States starting with E are final states signifying a completed token. Z10, Z11, and Z12 are responsible for comments. As you can see, comments are not formed into a token but simply skipped like whitespace.

Grammar

The implemented grammar is described by syntax graphs, which are implemented in the class de.htw_dresden.informatik.s75924.parser.Graph. These are their visual representations:

Program

Graph for program

Block

Graph for block

Constant declaration list

Graph for constant declaration list

Constant declaration

Graph for constant declaration

Variable declaration list

Graph for variable declaration list

Variable declaration

Graph for variable declaration

Procedure declaration

Graph for procedure declaration

Statement

Graph for statement

Assignment statement

Graph for assignment statement

Conditional statement

Graph for conditional statement

Loop statement

Graph for loop statement

Compound statement

Graph for compound statement

Procedure call

Graph for procedure call

Input statement

Graph for input statement

Output statement

Graph for output statement

Expression

Graph for expression

Term

Graph for term

Factor

Graph for factor

Condition

Graph for condition

Array Index

Graph for array index

Logical expression

Graph for logical expression

Logical term

Graph for logical term

Logical factor

Graph for logical factor

About

Compiler for the PL0 language for HTW Dresden course Compiler/Interpreter

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages