This project holds modules which together build up mjavac - a MiniJava compiler (with extensions).
Each part is self-contained and reusable. For example, the parser may be used as a library to build syntax highlighters, interpreters and so on. The parser library is used by mjavac to build an AST for the source code.
The top-level project builds the complete compiler.
make build
./build/mjavac test/examples/factorial.java
The project can be built and used with Docker like so.
make docker
docker run --rm -it mjavac --help
Quickstart
Features
Documentation
Documentation - mjavac
Documentation - Parser
Documentation - Virtual Machine
Development
- Reusable parser library
- Reusable stack-based virtual machine
- Helpful error messages with GCC-like reporting
- Can be built with included Graphviz support for parse graphs
- Flags for validating only the syntax and/or semantics, usable in CI
Note: although every project can be built and used in isolation, doing so may require additional configuration. The compiler flags in each project represent a bare minimum and are meant to be tuned, as is done by the main project's Makefile. To compile the projects in their simplest form, ensure that `$CXXFLAGS` contains at least `-std=c++17`, like so: `CXXFLAGS=-std=c++17 make -C ...`. The projects can be built in debug mode by setting the `DEBUG` environment variable, like so: `DEBUG=true ...`. However, the debug mode provided by the main Makefile is much more powerful, as it also enables checks such as AddressSanitizer.
The mjavac project is the complete tool made available in this project. It combines the parser, compiler and more and provides a complete tool for dealing with MiniJava files.
The simplest way to build the CLI is to run `make mjavac`. It can be built by itself just like the other projects, but that generally makes little sense. Just like with the parser, run `make -C mjavac` to build.
The resulting binary will be available in the `mjavac/build` directory under the chosen build configuration (`production`, `debug`).
The included Graphviz support requires `graphviz` to be installed (`brew install graphviz` on macOS). It is enabled by default, but may be disabled when building, like so: `GRAPHVIZ_SUPPORT=false make mjavac`. Make sure that the headers and libraries are available in the correct location, or that the environment variables `CPPFLAGS` and `LDFLAGS` are set accordingly.
To get usage information, run `mjavac --help`. Depending on which features were included in your build, the output will resemble the following.
OVERVIEW: MiniJava compiler with extensions
USAGE: mjavac [options] file
OPTIONS:
-h, --help Print this help page
--parse-only Only parse the source
--semantics-only Only validate the semantics of the source
--execute Execute the source
--ast <file.dot> Output a dot-formatted parse graph
--cfg <file.dot> Output a dot-formatted control flow graph
--ast-graph <file.(pdf|png|jpg)> Render the parse graph as a pdf
--cfg-graph <file.(pdf|png|jpg)> Render the control flow graph as a pdf
--symbol-table <file.txt> Output the symbol table
The mjavac tool has four main uses: parsing, checking semantics, compiling and executing a program.
# Parse, analyze semantics and compile a source file
mjavac source.java
# Only validate the syntax of a source file
mjavac --parse-only source.java
# Parse and validate the semantics of a source file
mjavac --semantics-only source.java
# Parse, validate semantics and execute a source file
mjavac --execute source.java
During the parsing phase, mjavac will check for the following flags:
- `--ast <file.dot>` - write a parse tree to `file.dot` in the Graphviz dot format
- `--ast-graph <file.(pdf|png|jpg)>` - render the parse tree to `file` in the specified format
- `--parse-only` - terminate immediately after the parsing phase
During the semantics check, mjavac will check for the following flags:
- `--symbol-table <file.txt>` - output a plaintext representation of the symbol table
- `--semantics-only` - terminate immediately after the semantics check
After the semantics check, the compilation phase begins. mjavac will check for the following flags:
- `--cfg <file.dot>` - write a control flow graph to `file.dot` in the Graphviz dot format
- `--cfg-graph <file.(pdf|png|jpg)>` - render the control flow graph to `file` in the specified format
A full example looks as follows:
mjavac \
--ast ast.dot \
--ast-graph ast.pdf \
--symbol-table symbol-table.txt \
--cfg cfg.dot \
--cfg-graph cfg.pdf \
source.java
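Because `--parse-only` and `--semantics-only` stop before compilation, they lend themselves to a quick validation step in CI. The following is a minimal sketch rather than part of the project: the `MJAVAC` variable and the `check_sources` function are hypothetical names, and the script assumes that mjavac exits with a non-zero status when validation fails, which the help output does not state.

```shell
#!/bin/sh
# Hypothetical CI helper; not part of the mjavac project.
# ASSUMPTION: mjavac exits with a non-zero status when validation fails.

# Path to the mjavac binary; may be overridden from the environment.
MJAVAC="${MJAVAC:-mjavac}"

# Validate the semantics of every file given as an argument.
# Prints one line per file and returns 1 if any file failed, 0 otherwise.
check_sources() {
  failed=0
  for file in "$@"; do
    if "$MJAVAC" --semantics-only "$file"; then
      echo "ok: $file"
    else
      echo "FAILED: $file"
      failed=1
    fi
  done
  return "$failed"
}
```

It could be called as `check_sources src/*.java` from a CI job. Swapping `--semantics-only` for `--parse-only` would restrict the check to syntax.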
The parser project contains the lexer, parser and AST for the language. It's built as a statically linked library.
To build the parser, `bison` v3.5 or higher as well as `flex` 2.6 or higher are required. These can be installed like so:
- macOS: `brew install bison flex`
- Ubuntu: `sudo apt install bison flex`
The simplest way to build the parser is to run `make parser`. To build the parser by itself, run `make -C parser`.
Once built, the `parser/build` directory is populated with the following directories for the chosen build configuration (`production`, `debug`):
- `include`
- `lib`
These directories are to be referenced when building projects using the parser like so:
g++ my-compiler.cc -I path/to/build/production/include -L path/to/build/production/lib/mjavac -l mjavacparser
The virtual machine project contains the bytecode, instructions and virtual machine. It's built as a statically linked library.
The simplest way to build the virtual machine is to run `make virtual-machine`. To build the virtual machine by itself, run `make -C virtual-machine`.
Once built, the `virtual-machine/build` directory is populated with the following directories for the chosen build configuration (`production`, `debug`):
- `include`
- `lib`
These directories are to be referenced when building projects using the virtual-machine like so:
g++ my-compiler.cc -I path/to/build/production/include -L path/to/build/production/lib/mjavac -l mjavacvm
Make sure you meet the following prerequisites:
Building:
- `$CXX` refers to a modern C++ compiler:
  - `g++` 10 or newer (`brew install gcc` on macOS)
  - `clang++` 10 or newer (`brew install llvm` on macOS)
- `$AR` refers to an appropriate version of `ar` for the same toolchain
- `bison` version 3.7 or higher is available in the appropriate path or in `$CPPFLAGS` and `$LDFLAGS`
- `flex` version 2.7 or higher is available in the appropriate path or in `$CPPFLAGS` and `$LDFLAGS`
- If building with Graphviz support, `graphviz` (or `libgraphviz-dev`) is required to be available in the appropriate path or in `$CPPFLAGS` and `$LDFLAGS`
Developing:
- `clang` 10 or newer is installed
- `scan-build` refers to version 7 or newer which comes with `clang`
- `clang-format` refers to version 7 or newer which comes with `clang`
# Lint
make lint
# Perform static analysis
make analyze
# The report is now available in build/reports/static-analysis
# Format the code
make format
When the main project is built in its debug configuration, it builds the parser and mjavac using clang with the address (and leak) sanitizers enabled. At runtime, further checks may be enabled by running mjavac like so: `ASAN_OPTIONS=detect_leaks=1 mjavac ...`.
To profile CPU usage, one may use Valgrind's callgrind tool (`brew install --HEAD LouisBrunner/valgrind/valgrind` on macOS, `sudo apt install valgrind` on Ubuntu).
Build mjavac for production and run like so:
# Note that the direct path to the mjavac binary must be used, not a link
valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes --callgrind-out-file=profile.out ./mjavac/build/production/mjavac test/correct/adder.java
One may visualize the resulting file in many ways, such as using kcachegrind (`brew install qcachegrind` on macOS) or by generating a graph using `gprof2dot` (`brew install gprof2dot` on macOS).
qcachegrind profile.out
gprof2dot --format=callgrind --output=profile.dot profile.out
dot -Tpdf -o profile.pdf profile.dot
The test directory contains several (Mini)Java files. mjavac supports testing parsing, semantics and execution of source code. Each file may have a mjavac test file header, which looks and works as follows.
// mjavac test file header
// header: 6
// parse: succeed - valid syntax
// semantics: succeed - valid program
// output: 1
// 3
//
The entire header is commented using `//`. The lines specified below all start with an implicit `//`.
The first line must be equal to `mjavac test file header`. The subsequent lines are key-value pairs separated by `:`. The values for the keys `parse` and `semantics` are further separated by `-`.
- `parse`
  - `succeed` - parse the file and ensure it succeeds
  - `fail` - parse the file and ensure it fails
  - `ignore` - don't parse the file (same as excluding the `parse` line completely)
- `semantics`
  - `succeed` - parse the file and ensure its semantics are valid
  - `fail` - parse the file and ensure its semantics are invalid
  - `ignore` - don't parse and evaluate the semantics of the file (same as excluding the `semantics` line completely)
The last key-value pair is `output`. Its value is the number of lines to read at the end of the header and use as the expected output for the execution of the source code.
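To make the header format concrete, the following is a minimal sketch of how a script might read it. It is not the project's own test runner: `header_value` and `header_expectation` are hypothetical helper names, and for simplicity the sketch scans the whole file for the first matching key instead of stopping at the end of the header.

```shell
#!/bin/sh
# Hypothetical helpers for reading a mjavac test file header;
# not part of the mjavac project.

# Print the raw value of KEY (e.g. "succeed - valid syntax") from FILE.
# Returns 1 if the file does not start with the header marker line.
header_value() {
  file=$1
  key=$2
  [ "$(head -n 1 "$file")" = "// mjavac test file header" ] || return 1
  # Keep only the first "// key: value" line, stripped of "// key: ".
  sed -n "s|^// $key: ||p" "$file" | head -n 1
}

# Print only the expectation for KEY (succeed, fail or ignore), that is,
# the part of the value before the " - " separator.
# A missing line is treated the same as "ignore".
header_expectation() {
  value=$(header_value "$1" "$2") || value=""
  value=${value%% - *}
  echo "${value:-ignore}"
}
```

For instance, `header_expectation test/examples/factorial.java parse` would print the expectation recorded for the parsing phase, or `ignore` if the file has no `parse` line.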
The mjavac logo was created by Amanda Svensson and is licensed under Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0).