A basic compiler implementation in C++ that supports a subset of C-like syntax. Currently implements lexical analysis, parsing, and semantic analysis phases.
- Lexical Analysis (Tokenization)
- Abstract Syntax Tree (AST) Generation
- Semantic Analysis
- Variable scope tracking
- Variable initialization checking
- Basic type checking
Currently supports:
int main() {
    int x = 42;
    if (x > 0) {
        return x * 2;
    }
    return 0;
}
- Integer variables
- Basic arithmetic operations (+, -, *, /)
- Comparison operators (>, <, >=, <=, ==)
- If statements
- Return statements
- Block scoping
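As a rough illustration of how the constructs listed above might be tokenized with line and column tracking, here is a minimal sketch. It is not the code in include/lexer.hpp: the `Token` struct, `tokenize`, and `symbolType` are hypothetical names, and the only token type names taken from this README are `INT`, `IDENTIFIER`, and `GREATER`. Everything else (`IF`, `RETURN`, `NUMBER`, `SYMBOL`) is a placeholder.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record; the real definition may differ.
struct Token {
    std::string type;   // e.g. "INT", "IDENTIFIER", "GREATER"
    std::string value;  // the matched source text
    int line;           // 1-based line of the first character
    int column;         // 1-based column of the first character
};

bool isIdentStart(char c) { return std::isalpha(static_cast<unsigned char>(c)) || c == '_'; }
bool isIdentPart(char c)  { return std::isalnum(static_cast<unsigned char>(c)) || c == '_'; }
bool isDigit(char c)      { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

// Maps operator and delimiter text to a type name.
std::string symbolType(const std::string& text) {
    if (text == ">") return "GREATER";  // the only operator name shown in the sample output
    return "SYMBOL";                    // placeholder for every other operator or delimiter
}

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    int line = 1, column = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (c == '\n') { ++line; column = 1; ++i; continue; }
        if (std::isspace(static_cast<unsigned char>(c))) { ++column; ++i; continue; }
        if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
            while (i < src.size() && src[i] != '\n') ++i;  // skip a // comment
            continue;
        }

        const std::size_t start = i;
        std::string type;
        if (isIdentStart(c)) {
            while (i < src.size() && isIdentPart(src[i])) ++i;
            const std::string word = src.substr(start, i - start);
            if (word == "int") type = "INT";
            else if (word == "if") type = "IF";
            else if (word == "return") type = "RETURN";
            else type = "IDENTIFIER";
        } else if (isDigit(c)) {
            while (i < src.size() && isDigit(src[i])) ++i;
            type = "NUMBER";
        } else {
            ++i;
            // Two-character comparisons: >=, <=, ==
            if ((c == '>' || c == '<' || c == '=') && i < src.size() && src[i] == '=') ++i;
            type = symbolType(src.substr(start, i - start));
        }
        assert(i > start);  // every branch must consume at least one character
        tokens.push_back({type, src.substr(start, i - start), line, column});
        column += static_cast<int>(i - start);
    }
    return tokens;
}
```

Calling `tokenize` on a string that starts with a newline followed by `int main() {` yields an `INT` token at line 2, column 1 and an `IDENTIFIER` token at line 2, column 5.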
- C++17 compatible compiler (Clang++ recommended)
- Make build system
- CMake (optional)
.
├── include/
│ ├── lexer.hpp
│ ├── parser.hpp
│ ├── semantic_analyzer.hpp
│ └── utils.hpp
├── src/
│ └── main.cpp
├── Makefile
├── .gitignore
└── README.md
- Clone the repository:
git clone https://github.com/Sunsvea/coulstock-cpp-compiler.git
cd coulstock-cpp-compiler
- Create build directory:
mkdir build
- Build using Make:
make
This will create the compiler executable in the build directory.
- Basic usage:
./build/compiler
The compiler will process the default test program embedded in main.cpp.
- To modify the input program, edit the input string in src/main.cpp:
std::string input = R"(
// Your program here
)";
The compiler produces detailed output showing each compilation phase:
Tokens:
Token: INT | Value: 'int' | Line: 2 | Column: 1
Token: IDENTIFIER | Value: 'main' | Line: 2 | Column: 5
...
Parsing AST:
Function: main
  Block:
    Variable Declaration: x
      Initializer:
        Number: 42
    If Statement:
      Condition:
        Binary Expression:
          Left:
            Identifier: x
          Operator: GREATER
          Right:
            Number: 0
...
Semantic analysis completed successfully!
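The AST dump in the output above is a tree, and one way to print such a tree with its nesting visible is a short recursive walk. The sketch below assumes a generic `Node` with a label and children and two spaces of indentation per level. Both are assumptions for illustration; the real AST in include/parser.hpp presumably uses dedicated node types and its own formatting.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical generic AST node: a printable label plus child nodes.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Writes `node` and its subtree to `out`, indenting two spaces per level.
void printTree(const Node& node, std::ostream& out, int depth = 0) {
    assert(depth >= 0);
    out << std::string(static_cast<std::size_t>(depth) * 2, ' ') << node.label << '\n';
    for (const Node& child : node.children) {
        printTree(child, out, depth + 1);
    }
}

// Convenience wrapper that returns the dump as a string.
std::string renderTree(const Node& root) {
    std::ostringstream out;
    printTree(root, out);
    return out.str();
}
```

A node labelled `Variable Declaration: x` with an `Initializer:` child that in turn holds `Number: 42` renders as three lines, each indented two spaces deeper than its parent.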
The compiler provides detailed error messages for various types of errors:
- Lexical errors (invalid characters, malformed tokens)
- Syntax errors (invalid program structure)
- Semantic errors:
- Use of undeclared variables
- Use of uninitialized variables
- Variable redeclaration
- Scope violations
Example error messages:
Semantic Error: Use of undeclared variable 'y'
Semantic Error: Use of uninitialized variable 'x'
Semantic Error: Variable 'x' is already declared in this scope
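The semantic checks listed above can be pictured as lookups in a stack of scopes. The following is a minimal sketch under stated assumptions, not the code in include/semantic_analyzer.hpp: `ScopeStack` and its methods are hypothetical names, errors are reported as `std::runtime_error`, and an inner block is assumed to be allowed to shadow an outer variable (suggested by the wording "already declared in this scope"). The message strings are the ones shown in the examples above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scoped symbol table: one map per open block, innermost last.
class ScopeStack {
public:
    ScopeStack() { enterScope(); }  // start with the outermost scope

    void enterScope() { scopes_.emplace_back(); }

    void exitScope() {
        assert(scopes_.size() > 1 && "cannot close the outermost scope");
        scopes_.pop_back();  // names declared in the block are no longer visible
    }

    // Declares `name` in the innermost scope; `initialized` records whether
    // the declaration carried an initializer.
    void declare(const std::string& name, bool initialized) {
        auto& scope = scopes_.back();
        if (scope.count(name) != 0) {
            throw std::runtime_error("Semantic Error: Variable '" + name +
                                     "' is already declared in this scope");
        }
        scope[name] = initialized;
    }

    // Checks a read of `name`, searching from the innermost scope outward.
    void use(const std::string& name) const {
        const bool* initialized = find(name);
        if (initialized == nullptr) {
            throw std::runtime_error("Semantic Error: Use of undeclared variable '" + name + "'");
        }
        if (!*initialized) {
            throw std::runtime_error("Semantic Error: Use of uninitialized variable '" + name + "'");
        }
    }

private:
    const bool* find(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto it = scope->find(name);
            if (it != scope->end()) return &it->second;
        }
        return nullptr;
    }

    std::vector<std::unordered_map<std::string, bool>> scopes_;
};
```

In this sketch a scope violation needs no separate check: once `exitScope` runs, a variable declared inside the block is simply not found, so reading it is reported as an undeclared variable.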
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Implement x86 assembly generation
- Add basic register allocation
- Create proper linking phase
- Generate executable binaries
- Add position-independent code support
- Implement system call interface
- Add while loops and for loops
- Implement function calls and parameters
- Add support for additional types (float, bool, string)
- Implement arrays and pointers
- Add struct/class support
- Support header files and includes
- Implement constant folding
- Add dead code elimination
- Support common subexpression elimination
- Add loop optimization
- Implement function inlining
- Add peephole optimization
- Add line numbers and column information to errors
- Implement error recovery for better error reporting
- Add warning system with different severity levels
- Provide source code suggestions for common mistakes
- Add color-coded error output
- Implement detailed error explanations
- Based on modern compiler implementation practices
- Inspired by the LLVM project structure