Skip to content

A toy C compiler that compatible with C23 and can do limited optimizations

License

Notifications You must be signed in to change notification settings

AinsleySnow/Ginkgo

Repository files navigation

Ginkgo -- A Hobby Optimizing(?) C Compiler for C23

Ginkgo is an C23 compiler that can do limited optimizations written in C++ 17.

I intend to implement all C23 features and newly introduced library functions. Currently, I have implemented most of the C11 features and some significant C23 features that greatly impact user experience, such as type inference, the typeof specifier, and improved normal enumerations. I will gradually implement other features in the furture.

Ginkgo's architecture is heavily influenced by LLVM. In fact, Ginkgo's IR is very similar to that of LLVM! You can generally view Ginkgo as four parts: a parser, an IR generator, an optimizing pipeline, and a code generator, which converts the IR to x64 assembly code. LLVM has taught me a lot!

I managed to keep the code simple and understandable and make room for migrating Ginkgo to other platforms. So, although Ginkgo currently supports x86-64 Linux only, you can easily migrate it. The only cost is that you may need to rewrite one or more of the four parts mentioned above.

Note that this is only a hobby project and I may not actively maintain it. But if you want to, you can commit to this repo and I will give you a big thanks!

I hope this compiler would be useful to you. Have fun!

Environment

  1. x86-64
  2. Ubuntu 22.04 (technically it can run on any Linux distribution; not tested, though)
  3. Bison 3.8.1 and Flex 2.6
  4. GCC 12.1 (or any compiler supports C++ 17)
  5. CMake 3.16 or later

Build

mkdir build && cd build
cmake ..
make all -j 10
make test

If you modified the .yy or .ll file, you will need to type "make parser" or "make lexer" to update the parser or lexer source file.

Parser

The parser is generated by Bison and the lexer is generated by Flex. But I recently realized that using a generated parser to parse C23 is totally a mistake and I may replace that by a hand written recursive descent parser.

Translate From C to IR

I achived that by simulating what clang does.

Optimizing

There are only two optimizations: one is constant folding and the other is register allocation. The register allocating algorithm is a simple preemptive one - first come, first get. This seems awkward but if you not run the mem2reg pass and not do constant/copy propergation, it is enough to map most of the temporary variables to physics registers. It should generate correct but low-quality allocation if these optimizations are done (not tested). I may update it to a linear-scan one if these passes are implemented.

Generate Assembly Code From IR

There's so many details to consisder when translating from IR to x64 assemble! The format of x64 instructions varies in detail, so I have to write huge amount of code to tackle this.

Memory Management

Smart pointers are heavily used in this project. Almost every bare pointer you see in my source code corresponds to a smart pointer somewhere. Maybe I will use memory pools instead of scattered smart pointers.

Reference

  1. wgtcc
  2. Online Document of PKU Compiler Course
  3. chibicc
  4. Working Draft of C23 (n3096)
  5. System V Application Binary Interface, AMD64 Architecture Processor Supplement
  6. Bison's Manual
  7. Flex's Manual
  8. LLVM Instruction Reference
  9. SSA-Based Compiler Design

About

A toy C compiler that compatible with C23 and can do limited optimizations

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published