Sunsvea/coulstock-cpp-compiler

Simple C++ Compiler

A basic compiler front end written in C++ for a small subset of C-like syntax. It currently implements the lexical analysis, parsing, and semantic analysis phases; code generation is not yet implemented.

Features

  • Lexical Analysis (Tokenization)
  • Abstract Syntax Tree (AST) Generation
  • Semantic Analysis
    • Variable scope tracking
    • Variable initialization checking
    • Basic type checking

Supported Syntax

Currently supports:

int main() {
    int x = 42;
    if (x > 0) {
        return x * 2;
    }
    return 0;
}

Language Features

  • Integer variables
  • Basic arithmetic operations (+, -, *, /)
  • Comparison operators (>, <, >=, <=, ==)
  • If statements
  • Return statements
  • Block scoping
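
As a rough illustration of how the operators listed above might be tokenized, here is a minimal sketch. It is not the code in lexer.hpp: the `Token` struct, the `tokenize` function, and most of the `TokenType` names are hypothetical and exist only for this example. The sketch also skips comments and line/column tracking to stay short.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token kinds; the real lexer.hpp may name these differently.
enum class TokenType {
    INT, IF, RETURN, IDENTIFIER, NUMBER,
    PLUS, MINUS, STAR, SLASH,
    GREATER, LESS, GREATER_EQUAL, LESS_EQUAL, EQUAL_EQUAL, ASSIGN,
    LPAREN, RPAREN, LBRACE, RBRACE, SEMICOLON
};

struct Token {
    TokenType type;
    std::string value;
};

// Splits source text into tokens. One character of lookahead is used to
// tell '>' from '>=', '<' from '<=', and '=' from '=='.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }

        if (std::isdigit(c)) {  // integer literal
            const std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({TokenType::NUMBER, src.substr(start, i - start)});
            continue;
        }

        if (std::isalpha(c) || c == '_') {  // keyword or identifier
            const std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            const std::string word = src.substr(start, i - start);
            const TokenType t = word == "int"    ? TokenType::INT
                              : word == "if"     ? TokenType::IF
                              : word == "return" ? TokenType::RETURN
                                                 : TokenType::IDENTIFIER;
            out.push_back({t, word});
            continue;
        }

        // Operators and punctuation; 'eq' is the one-character lookahead.
        const bool eq = i + 1 < src.size() && src[i + 1] == '=';
        TokenType t;
        std::size_t len = 1;
        switch (c) {
            case '+': t = TokenType::PLUS; break;
            case '-': t = TokenType::MINUS; break;
            case '*': t = TokenType::STAR; break;
            case '/': t = TokenType::SLASH; break;
            case '(': t = TokenType::LPAREN; break;
            case ')': t = TokenType::RPAREN; break;
            case '{': t = TokenType::LBRACE; break;
            case '}': t = TokenType::RBRACE; break;
            case ';': t = TokenType::SEMICOLON; break;
            case '>': t = eq ? TokenType::GREATER_EQUAL : TokenType::GREATER; len = eq ? 2 : 1; break;
            case '<': t = eq ? TokenType::LESS_EQUAL : TokenType::LESS; len = eq ? 2 : 1; break;
            case '=': t = eq ? TokenType::EQUAL_EQUAL : TokenType::ASSIGN; len = eq ? 2 : 1; break;
            default:
                throw std::runtime_error(
                    std::string("Lexical Error: unexpected character '") + src[i] + "'");
        }
        out.push_back({t, src.substr(i, len)});
        i += len;
    }
    return out;
}
```

Calling `tokenize("if (x >= 10) return x == 2;")` yields eleven tokens, with `>=` and `==` each emitted as a single two-character token.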

Prerequisites

  • C++17 compatible compiler (Clang++ recommended)
  • Make build system
  • CMake (optional)

Project Structure

.
├── include/
│   ├── lexer.hpp
│   ├── parser.hpp
│   ├── semantic_analyzer.hpp
│   └── utils.hpp
├── src/
│   └── main.cpp
├── Makefile
├── .gitignore
└── README.md

Building from Source

  1. Clone the repository:
git clone https://github.com/Sunsvea/coulstock-cpp-compiler.git
cd coulstock-cpp-compiler
  2. Create the build directory:
mkdir build
  3. Build using Make:
make

This will create the compiler executable in the build directory.

Usage

  1. Basic usage:
./build/compiler

The compiler processes the default test program embedded in src/main.cpp.

  2. To modify the input program, edit the input string in src/main.cpp:
std::string input = R"(
    // Your program here
)";

Example Output

The compiler produces detailed output showing each compilation phase:

Tokens:
Token: INT | Value: 'int' | Line: 2 | Column: 1
Token: IDENTIFIER | Value: 'main' | Line: 2 | Column: 5
...

Parsing AST:
Function: main
  Block:
    Variable Declaration: x
      Initializer:
        Number: 42
    If Statement:
      Condition:
        Binary Expression:
          Left:
            Identifier: x
          Operator: GREATER
          Right:
            Number: 0
...

Semantic analysis completed successfully!
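
An indented listing like the AST dump above can be produced by a small recursive printer. The sketch below shows one possible approach for expression nodes only. The node types (`Expr`, `Number`, `Identifier`, `Binary`) and the `dump` helper are hypothetical and are not taken from parser.hpp.

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical expression node hierarchy, for illustration only.
struct Expr {
    virtual ~Expr() = default;
    // Writes this node at the given indentation (in spaces).
    virtual void print(std::ostream& os, int indent) const = 0;
};

inline std::string pad(int indent) { return std::string(indent, ' '); }

struct Number : Expr {
    int value;
    explicit Number(int v) : value(v) {}
    void print(std::ostream& os, int indent) const override {
        os << pad(indent) << "Number: " << value << '\n';
    }
};

struct Identifier : Expr {
    std::string name;
    explicit Identifier(std::string n) : name(std::move(n)) {}
    void print(std::ostream& os, int indent) const override {
        os << pad(indent) << "Identifier: " << name << '\n';
    }
};

struct Binary : Expr {
    std::unique_ptr<Expr> left, right;
    std::string op;  // operator name as printed, e.g. "GREATER"
    Binary(std::unique_ptr<Expr> l, std::string o, std::unique_ptr<Expr> r)
        : left(std::move(l)), right(std::move(r)), op(std::move(o)) {}
    void print(std::ostream& os, int indent) const override {
        os << pad(indent) << "Binary Expression:\n";
        os << pad(indent + 2) << "Left:\n";
        left->print(os, indent + 4);  // children sit two levels deeper
        os << pad(indent + 2) << "Operator: " << op << '\n';
        os << pad(indent + 2) << "Right:\n";
        right->print(os, indent + 4);
    }
};

// Renders an expression tree to a string, starting at column 0.
std::string dump(const Expr& e) {
    std::ostringstream ss;
    e.print(ss, 0);
    return ss.str();
}
```

For the condition `x > 0`, building `Binary(std::make_unique<Identifier>("x"), "GREATER", std::make_unique<Number>(0))` and passing it to `dump` gives the same relative layout as the "Binary Expression" lines in the listing.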

Error Handling

The compiler reports three categories of errors:

  1. Lexical errors (invalid characters, malformed tokens)
  2. Syntax errors (invalid program structure)
  3. Semantic errors:
    • Use of undeclared variables
    • Use of uninitialized variables
    • Variable redeclaration
    • Scope violations

Example error messages:

Semantic Error: Use of undeclared variable 'y'
Semantic Error: Use of uninitialized variable 'x'
Semantic Error: Variable 'x' is already declared in this scope
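
To make these checks concrete, here is a minimal sketch of a scoped symbol table that raises the three messages above. The `ScopeTracker` class and its methods are hypothetical, and the real semantic_analyzer.hpp may be organized quite differently. The sketch also assumes that an inner block may shadow a variable from an outer scope, which the source does not confirm.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scope tracker: a stack of maps from variable name to
// "has been initialized". The innermost scope is at the back.
class ScopeTracker {
public:
    ScopeTracker() { enterScope(); }  // start with a global scope

    void enterScope() { scopes_.emplace_back(); }

    void exitScope() {
        assert(scopes_.size() > 1 && "cannot exit the global scope");
        scopes_.pop_back();
    }

    // Declares a variable in the innermost scope only. Shadowing an outer
    // declaration is allowed here (an assumption of this sketch).
    void declare(const std::string& name, bool initialized) {
        auto& current = scopes_.back();
        if (current.count(name) != 0) {
            throw std::runtime_error(
                "Semantic Error: Variable '" + name + "' is already declared in this scope");
        }
        current[name] = initialized;
    }

    // Marks the nearest visible declaration as initialized.
    void assign(const std::string& name) { lookup(name) = true; }

    // Checks that a variable is both visible and initialized.
    void use(const std::string& name) {
        if (!lookup(name)) {
            throw std::runtime_error(
                "Semantic Error: Use of uninitialized variable '" + name + "'");
        }
    }

private:
    // Searches from the innermost scope outward.
    bool& lookup(const std::string& name) {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        throw std::runtime_error(
            "Semantic Error: Use of undeclared variable '" + name + "'");
    }

    std::vector<std::unordered_map<std::string, bool>> scopes_;
};
```

A semantic pass would call `enterScope` and `exitScope` around each block, `declare` for each variable declaration, and `use` for each identifier that appears in an expression.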

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Future Improvements

1. Code Generation Phase

  • Implement x86 assembly generation
  • Add basic register allocation
  • Create proper linking phase
  • Generate executable binaries
  • Add position-independent code support
  • Implement system call interface

2. Language Feature Expansion

  • Add while loops and for loops
  • Implement function calls and parameters
  • Add support for additional types (float, bool, string)
  • Implement arrays and pointers
  • Add struct/class support
  • Support header files and includes

3. Optimization Phase

  • Implement constant folding
  • Add dead code elimination
  • Support common subexpression elimination
  • Add loop optimization
  • Implement function inlining
  • Add peephole optimization
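
As one example of what this phase could involve, the sketch below shows constant folding over a tiny expression tree. It is only an illustration of the technique: the `Node` struct and the `number`, `ident`, `binary`, and `fold` helpers are hypothetical and do not correspond to anything in the current code base.

```cpp
#include <climits>
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node, for illustration only.
struct Node {
    enum class Kind { Number, Identifier, Binary };
    Kind kind = Kind::Number;
    int value = 0;      // used when kind == Number
    std::string name;   // used when kind == Identifier
    char op = 0;        // used when kind == Binary: '+', '-', '*', '/'
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> number(int v) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Number;
    n->value = v;
    return n;
}

std::unique_ptr<Node> ident(std::string name) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Identifier;
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Binary;
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Folds bottom-up: children first, then this node if both are literals.
// Division by zero and results outside the int range are left unfolded so
// that a later phase can diagnose them.
std::unique_ptr<Node> fold(std::unique_ptr<Node> n) {
    if (n->kind != Node::Kind::Binary) return n;
    n->left = fold(std::move(n->left));
    n->right = fold(std::move(n->right));
    if (n->left->kind != Node::Kind::Number || n->right->kind != Node::Kind::Number) return n;

    const long long a = n->left->value;
    const long long b = n->right->value;
    long long result = 0;
    switch (n->op) {
        case '+': result = a + b; break;
        case '-': result = a - b; break;
        case '*': result = a * b; break;
        case '/':
            if (b == 0) return n;
            result = a / b;
            break;
        default: return n;
    }
    if (result < INT_MIN || result > INT_MAX) return n;
    return number(static_cast<int>(result));
}
```

With these helpers, folding `(2 + 3) * 4` collapses to a single number node with value 20, while `x * (2 + 3)` keeps the multiplication and only replaces its right operand with 5.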

4. Enhanced Error Handling

  • Add line numbers and column information to errors
  • Implement error recovery for better error reporting
  • Add warning system with different severity levels
  • Provide source code suggestions for common mistakes
  • Add color-coded error output
  • Implement detailed error explanations

Acknowledgments

  • Based on modern compiler implementation practices
  • Inspired by the LLVM project structure