|
1 |
| -This series of articles is a tutorial for building a C compiler from scratch. |
| 1 | +# Preface |
2 | 2 |
|
3 |
| -I lied a little in the above sentence: it is actually an _interpreter_ instead |
4 |
| -of _compiler_. I lied because what the hell is a "C interpreter"? You will |
5 |
| -however, understand compilers better by building an interpreter. |
| 3 | +This is a multi-part tutorial on how to build a C compiler from scratch. |
6 | 4 |
|
7 |
| -Yeah, I wish you can get a basic understanding of how a compiler is |
8 |
| -constructed, and realize it is not that hard to build one. Good Luck! |
| 5 | +Well, I lied a little in the previous sentence: it's actually an _interpreter_, |
| 6 | +not a _compiler_. I had to lie, because what on earth is a "C interpreter"? |
| 7 | +You will however gain a better understanding of compilers by building an |
| 8 | +interpreter. |
9 | 9 |
|
10 |
| -Finally, this series is written in Chinese in the first place, feel free to |
11 |
| -correct me if you are confused by my English. And I would like it very much if |
12 |
| -you could teach me some "native" English :) |
| 10 | +Yeah, I want to provide you with a basic understanding of how a compiler is |
| 11 | +constructed, and to help you realize that it's not that hard to build one, after all. |
| 12 | +Good Luck! |
13 | 13 |
|
14 |
| -We won't write any code in this chapter, feel free to skip it if you are |
15 |
| -desperate to see some code... |
| 14 | +This tutorial was originally written in Chinese, so feel free to correct me if |
| 15 | +you're confused by my English. Also, I would really appreciate it if you could |
| 16 | +teach me some "native" English. :smile: |
16 | 17 |
|
17 |
| -## Why you should care about compiler theory? |
| 18 | +We won't be writing any code in this chapter; so if you're eager to see some code, feel free to skip it. |
18 | 19 |
|
19 |
| -Because it is **COOL**! |
20 | 20 |
|
21 |
| -And it is very useful. Programs are built to do something for us, when they |
22 |
| -are used to translate some forms of data into another form, we can call them |
23 |
| -a compiler. Thus by learning some compiler theory we are trying to master a very |
24 |
| -powerful technique of solving problems. Isn't that cool enough to you? |
| 21 | +## Why Should I Care about Compiler Theory? |
| 22 | + |
| 23 | +Because it's **COOL**! |
| 24 | + |
| 25 | +And it's also very useful. Programs are designed to do something for us; when |
| 26 | +they are used to translate some form of data into another form, we can call |
| 27 | +them compilers. Thus, by learning some compiler theory, we are trying to |
| 28 | +master a very powerful problem-solving technique. Doesn't this sound cool |
| 29 | +enough to you? |
| 30 | + |
| 31 | +People used to say that understanding how a compiler works would help you to |
| 32 | +write better code. Some would argue that modern compilers are so good at |
| 33 | +optimizing that you shouldn't care any more. Well, that's true, most people |
| 34 | +don't need to learn compiler theory to improve code performance — and by "most |
| 35 | +people" I mean _you_! |
25 | 36 |
|
26 |
| -People used to say understanding how a compiler works would help you to write |
27 |
| -better code. Some would argue that modern compilers are so good at |
28 |
| -optimization that you should not care any more. Well, that's true, most people |
29 |
| -don't need to learn compiler theory only to improve the efficency of the code. |
30 |
| -And by most people, I mean you! |
31 | 37 |
|
32 | 38 | ## We Don't Like Theory Either
|
33 | 39 |
|
34 |
| -I have always been in awe of compiler theory because that's what makes |
35 |
| -programing easy. Anyway can you imaging building a web browser in only |
36 |
| -assembly language? So when I got a chance to learn compiler theory in college, |
37 |
| -I was so excited! And then... I quit, not understanding what that it. |
| 40 | +I've always been in awe of compiler theory because that's what makes programming |
| 41 | +easy. Anyway, can you imagine building a web browser entirely in assembly |
| 42 | +language? So when I got a chance to learn compiler theory in college, I was so |
| 43 | +excited! And then ... I quit! And left without understanding what it's all |
| 44 | +about. |
38 | 45 |
|
39 |
| -Normally a course of compiler will cover: |
| 46 | +Normally, a compiler course covers the following topics: |
40 | 47 |
|
41 |
| -1. How to represent syntax (such as BNF, etc.) |
42 |
| -2. Lexer, with somewhat NFA(Nondeterministic Finite Automata), |
43 |
| - DFA(Deterministic Finite Automata). |
44 |
| -3. Parser, such as recursive descent, LL(k), LALR, etc. |
| 48 | +1. How to represent syntax (e.g. BNF). |
| 49 | +2. Lexers, using NFA (Nondeterministic Finite Automata) and |
| 50 | + DFA (Deterministic Finite Automata). |
| 51 | +3. Parsers, such as recursive descent, LL(k), LALR, etc. |
45 | 52 | 4. Intermediate Languages.
|
46 | 53 | 5. Code generation.
|
47 | 54 | 6. Code optimization.
|
48 | 55 |
|
49 |
| -Perhaps more than 90% students will not care anything beyond the parser, and |
50 |
| -what's more, we still don't know how to build a compiler! Even after all the |
51 |
| -effort learning the theories. Well the main reason is that what "Compiler |
52 |
| -Thoery" trys to teach is "How to build a parser generator", namely a tool that |
53 |
| -consumes syntax gramer and generates a compiler for you. lex/yacc or |
54 |
| -flex/bison or things like that. |
| 56 | +Perhaps more than 90% of the students won't really care about any of that, |
| 57 | +except for the parser, and what's more, we still won't know how to actually |
| 58 | +build a compiler, even after all the effort of learning the theory. Well, the |
| 59 | +main reason is that what "Compiler Theory" tries to teach is "how to build a |
| 60 | +parser generator" — i.e. a tool that consumes a syntax grammar and generates a |
| 61 | +compiler for you, like lex/yacc or flex/bison, or similar tools. |
| 62 | + |
| 63 | +These theories try to teach us how to solve the general challenges of |
| 64 | +generating compilers automatically. Once you've mastered them, you're able to |
| 65 | +deal with all kinds of grammars. They are indeed useful in the industry. |
| 66 | +Nevertheless, they are too powerful and too complicated for students and most |
| 67 | +programmers. If you try to read lex/yacc's source code you'll understand what |
| 68 | +I mean. |
55 | 69 |
|
56 |
| -These theories try to teach us how to solve the general problems of generating |
57 |
| -compilers automatically. That means once you've mastered them, you are able to |
58 |
| -deal with all kinds of grammars. They are indeed useful in industry. |
59 |
| -Nevertheless they are too powerful and too complicated for students and most |
60 |
| -programmers. You will understand that if you try to read lex/yacc's source |
61 |
| -code. |
| 70 | +The good news is that building a compiler can be much simpler than you ever |
| 71 | +imagined. I won't lie, it's not easy, but definitely not hard. |
62 | 72 |
|
63 |
| -Good news is building a compiler can be much simpler than you ever imagined. |
64 |
| -I won't lie, not easy, but definitely not hard. |
65 | 73 |
|
66 |
| -## Birth of this project |
| 74 | +## How This Project Began |
67 | 75 |
|
68 |
| -One day I came across the project [c4](https://github.com/rswier/c4) on |
69 |
| -Github. It is a small C interpreter which is claimed to be implemented by only |
70 |
| -4 functions. The most amazing part is that it is bootstrapping (that interpret |
71 |
| -itself). Also it is done with about 500 lines! |
| 76 | +One day I came across the project [c4] on GitHub, a small C interpreter |
| 77 | +claiming to be implemented with only 4 functions. The most amazing part is |
| 78 | +that it's [bootstrapping] (i.e. it can interpret itself). Furthermore, it's |
| 79 | +done in around 500 lines of code! |
72 | 80 |
|
73 |
| -Meanwhile I've read a lot of tutorials about compiler, they are either too |
74 |
| -simple(such as implementing a simple calculator) or using automation |
75 |
| -tools(such as flex/bison). c4 is however implemented all from scratch. The |
76 |
| -sad thing is that it try to be minimal, that makes the code quite a mess, hard |
77 |
| -to understand. So I started a new project to: |
| 81 | +Meanwhile, I've read many tutorials on compiler design, and found them to be |
| 82 | +either too simple (such as implementing a simple calculator) or using |
| 83 | +automation tools (such as flex/bison). [C4], however, is implemented entirely |
| 84 | +from scratch. The sad thing is that it aims to be "an exercise in minimalism," |
| 85 | +which makes the code quite messy and hard to understand. So I started a new |
| 86 | +project, in order to: |
78 | 87 |
|
79 |
| -1. Implement a working C compiler(interpreter actually) |
80 |
| -2. Write a tutorial of how it is built. |
| 88 | +1. Implement a working C compiler (an interpreter, actually). |
| 89 | +2. Write a step-by-step tutorial on how it was built. |
81 | 90 |
|
82 |
| -It took me 1 week to re-write it, resulting 1400 lines including comments. The |
83 |
| -project is hosted on Github: [Write a C Interpreter](https://github.com/lotabout/write-a-C-interpreter). |
| 91 | +It took me one week to re-write it, resulting in 1400 lines of code (including |
| 92 | +comments). The project is hosted on GitHub: [Write a C Interpreter]. |
84 | 93 |
|
85 |
| -Thanks rswier for bringing us a wonderful project! |
| 94 | +Thanks to [@rswier] for sharing [c4] with us; it's such a wonderful project! |
86 | 95 |
|
87 |
| -## Before you go |
88 | 96 |
|
89 |
| -Implementing a compiler could be boring and it is hard to debug. So I hope you |
90 |
| -can spare enough time studying, as well as type the code. I am sure that you |
91 |
| -will feel a great sense of accomplishment just like I do. |
| 97 | +## Before You Begin |
| 98 | + |
| 99 | +Implementing a compiler can be boring and hard to debug. So I hope you can |
| 100 | +spare enough time for studying and typing out the code. I'm sure that you will feel a |
| 101 | +great sense of accomplishment, just like I do. |
| 102 | + |
92 | 103 |
|
93 | 104 | ## Good Resources
|
94 | 105 |
|
95 |
| -1. [Let’s Build a Compiler](http://compilers.iecc.com/crenshaw/): a very good |
96 |
| - tutorial of building a compiler for fresh starters. |
97 |
| -2. [Lemon Parser Generator](http://www.hwaci.com/sw/lemon/): the parser |
98 |
| - generator that is used in SQLite. Good to read if you want to understand |
99 |
| - compiler theory with code. |
| 106 | +1. _[Let’s Build a Compiler]_: a very good tutorial on building a compiler, |
| 107 | + written for beginners. |
| 108 | +2. [Lemon Parser Generator]: the parser generator used by SQLite. |
| 109 | + Good to read if you want to understand compiler theory with code. |
| 110 | + |
| 111 | +In the end, I am just a person with an average level of expertise, so there |
| 112 | +will inevitably be some mistakes in my articles and code (and also in my |
| 113 | +English). Feel free to correct me! |
| 114 | + |
| 115 | +I hope you'll enjoy it. |
100 | 116 |
|
101 |
| -In the end, I am human with a general level, there will be inevitably wrong |
102 |
| -with the articles and codes(also my English). Feel free to correct me! |
| 117 | +<!----------------------------------------------------------------------------- |
| 118 | + REFERENCE LINKS |
| 119 | +------------------------------------------------------------------------------> |
103 | 120 |
|
104 |
| -Hope you enjoy it. |
| 121 | +[@rswier]: https://github.com/rswier "Visit @rswier's GitHub profile" |
| 122 | +[bootstrapping]: https://en.wikipedia.org/wiki/Bootstrapping_(compilers) "Wikipedia » Bootstrapping (compilers)" |
| 123 | +[c4]: https://github.com/rswier/c4 "Visit the c4 repository on GitHub" |
| 124 | +[Lemon Parser Generator]: http://www.hwaci.com/sw/lemon/ "Visit Lemon homepage" |
| 125 | +[Let’s Build a Compiler]: http://compilers.iecc.com/crenshaw/ "15-part tutorial series, by Jack Crenshaw" |
| 126 | +[Write a C Interpreter]: https://github.com/lotabout/write-a-C-interpreter "Visit the 'Write a C Interpreter' repository on GitHub" |