From 96506c2463adc6749d450a5f5cf3d0121744b0d3 Mon Sep 17 00:00:00 2001
From: isuckatcs <65320245+isuckatcs@users.noreply.github.com>
Date: Tue, 30 Jul 2024 03:12:21 +0200
Subject: [PATCH] [www] use a spell-checker that catches mistakes that the previous one didn't catch

---
 www/index.html   | 39 ++++++++++++-----------
 www/lexing.html  | 64 +++++++++++++++++++-------------------
 www/parsing.html | 80 ++++++++++++++++++++++++------------------------
 3 files changed, 91 insertions(+), 92 deletions(-)

diff --git a/www/index.html b/www/index.html
index 4a0960b..f113882 100644
--- a/www/index.html
+++ b/www/index.html
@@ -49,8 +49,7 @@

How to Compile Your Language

This guide is intended to be a practical introduction to how to design your language and implement a modern - compiler for it. The source code of the compiler is - available on + compiler for it. The compiler's source code is available on How to Compile Your Language >.

- When designing a language it helps if there is an idea what - the language is going to be used for. Is it indented to be + When designing a language, it helps if there is an idea of + what the language will be used for. Is it intended to make systems programming safer like Rust? Is it targeting AI developers like Mojo?

- In this case the goal of the language is to showcase various - algorithms and techniques that are used in the + In this case, the goal of the language is to showcase + various algorithms and techniques that are used in the implementation of some of the most popular languages like - C++, Kotlin or Rust. + C++, Kotlin, or Rust.

- The guide also covers how to create a platform specific + The guide also covers how to create a platform-specific executable with the help of the LLVM compiler infrastructure, which all of the previously mentioned languages use for the same purpose. Yes, even Kotlin can be @@ -82,10 +81,10 @@

What Does Every Language Have in Common?

When creating a new language, the first question is how to get started. There is something that every existing language and your language must define too, which is the entry - point from which the execution starts. + point from which the execution begins.

- In scripting languages like JavaScript the execution of the + In scripting languages like JavaScript, the execution of the code usually starts from the first line of the source file, while most programming languages including your language treat the main() function @@ -99,16 +98,16 @@

What Does Every Language Have in Common?

already popular language.

- In the past 50 years the syntax of a function declaration + In the past 50 years, the syntax of a function declaration was the name of the function followed by the list of arguments enclosed by ( and ). At - first glance it is tempting to introduce some new exotic + first glance, it is tempting to introduce some new exotic syntax like main<> {}, but in many popular languages <> might mean something completely - different, in this case a generic argument list. Using such - syntax for a function definition would probably cause - confusion for developers who try to get familiar with this - new language, which is something to keep in mind. + different, in this case, a generic argument list. Using such + syntax for a function definition would probably confuse + developers who are trying to get familiar with this new + language, which is something to keep in mind.

How Is This Text Turned into an Executable?

@@ -121,7 +120,7 @@

How Is This Text Turned into an Executable?

The frontend contains the actual implementation of the language; it is responsible for ensuring that the program written in the specific language doesn't contain any - errors, and reporting every issue it finds to the developer. + errors and reporting every issue it finds to the developer.

After validating the program, it turns it into an @@ -141,9 +140,9 @@

How Is This Text Turned into an Executable?

Is It Possible to Learn All These Topics?

- Yes, with enough time. However there is no need to learn all - of them to create a successful language. In fact even a lot - of modern popular languages like C++, + Yes, with enough time. However, there is no need to learn + all of them to create a successful language. In fact, even a + lot of modern popular languages like C++, Rust, Swift, Haskell or Kotlin/Native rely on LLVM for optimization and code generation.
diff --git a/www/lexing.html b/www/lexing.html
index 2d72967..6836d56 100644
--- a/www/lexing.html
+++ b/www/lexing.html
@@ -50,7 +50,7 @@

Tokenization

The first step of the compilation process is to take the - textual representation of the program and brake it down into + textual representation of the program and break it down into a list of tokens. Just as sentences in spoken languages are composed of nouns, verbs, adjectives, etc., programming languages are similarly composed of a set of tokens. @@ -64,8 +64,8 @@

Tokenization

be named anything else like foo or bar. One thing these names have in common is that each of them uniquely identifies the given function, so - the token that represent such piece of source code is called - the Identifier token. + the token that represents such a piece of source code is + called the Identifier token.

enum class TokenKind : char {
   Identifier
@@ -79,8 +79,8 @@ 

Tokenization

functions called fn or void.

- Each keyword gets it's own unique token, so that it's easy - to differentiate between them. + Each keyword gets its own unique token so that it's easy to + differentiate between them.

enum class TokenKind : char {
   ...
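  // a sketch of how the keyword enumerators might look; the exact names
  // (KwFn, KwVoid) are assumptions based on the keywords mentioned above
  KwFn,
  KwVoid,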
@@ -99,7 +99,7 @@ 

Tokenization

The rest of the tokens, including EOF, are composed of a single character. To make creating them easier, each of these tokens is placed into an array and - their respective enumerator values are the ascii code of + their respective enumerator values are the ASCII code of their corresponding character.

constexpr char singleCharTokens[] = {'\0', '(', ')', '{', '}', ':'};
@@ -115,10 +115,10 @@ 

Tokenization

Colon = singleCharTokens[5], };

- It might happen that a developer writes something in the - source code that cannot be represented by any of the known - tokens. In such cases an Unk token is used, - that represents every unknown piece of source code. + A developer might write something in the source code that + cannot be represented by any of the known tokens. In such + cases, an Unk token is used, which represents + every unknown piece of source code.

enum class TokenKind : char {
   Unk = -128,
@@ -153,12 +153,12 @@ 

The Lexer

The lexer is the part of the compiler that is responsible for producing the tokens. It iterates over a source file - character by character and does it's best to select the + character by character and does its best to select the correct token for each piece of code.

- Within the compiler a source file is represented by it's - path and a buffer filled with it's content. + Within the compiler, a source file is represented by its + path and a buffer filled with its content.

struct SourceFile {
   std::string_view path;
@@ -171,7 +171,7 @@ 

The Lexer

traverses the buffer. Because initially none of the characters in the source file is processed, the lexer points to the first character of the buffer and starts at the - position of line 1 column 0, or with other words, before the + position of line 1 column 0, or in other words, before the first character of the first line. The next Token is returned on demand by the getNextToken() method. @@ -194,7 +194,7 @@

The Lexer

eatNextChar() helper methods are introduced. The former returns which character is to be processed next, while the latter returns that character and advances the - lexer to the next character, while updating the correct line + lexer to the next character while updating the correct line and column position in the source file.

class Lexer {
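  // a rough sketch of the state and helpers described above; every name
  // except peekNextChar() and eatNextChar() is an assumption
  const SourceFile *source;
  size_t idx = 0;
  int line = 1;
  int column = 0;

  char peekNextChar() const { return source->buffer[idx]; }
  char eatNextChar() {
    char c = source->buffer[idx++];

    if (c == '\n') {
      ++line;
      column = 0;
    } else {
      ++column;
    }

    return c;
  }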
@@ -267,14 +267,14 @@ 

The Lexer

... }

- A for loop is used to iterate over the single - character tokens array and if the current character matches - one of them, the corresponding token is returned. This is - the benefit of storing the characters in an array and making - their corresponding TokenKind have the value of - the ascii code of the character the token represents. This - way the TokenKind can immediately be returned - with a simple cast. + A for loop is used to iterate over the + single-character tokens array and if the current character + matches one of them, the corresponding token is returned. + This is the benefit of storing the characters in an array + and making their corresponding TokenKind have + the value of the ASCII code of the character the token + represents. This way the TokenKind can + immediately be returned with a simple cast.

Token Lexer::getNextToken() {
   ...
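  // a sketch of the loop described above; 'currentChar' and the exact shape
  // of Token are assumptions based on the surrounding snippets
  for (char c : singleCharTokens)
    if (c == currentChar)
      return Token{tokenStartLocation, static_cast<TokenKind>(c)};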
@@ -288,7 +288,7 @@ 

The Lexer

Design Note

- In production grade compilers single character tokens + In production-grade compilers, single-character tokens are usually handled using hardcoded branches, as that will lead to the fastest running code in general.

@@ -310,7 +310,7 @@

Design Note

if (currentChar == '\0') return Token{tokenStartLocation, TokenKind::eof};

- In this compiler the goal is to use a representation + In this compiler, the goal is to use a representation that takes as little boilerplate code to implement and extend as possible.

@@ -352,7 +352,7 @@

Design Note

While comments are not important for this compiler, other compilers that convert one language to another (e.g.: Java to Kotlin) or formatting tools do need to - know about them. In such cases the lexer might return a + know about them. In such cases, the lexer might return a dedicated Comment token with the contents of the comment.

@@ -360,8 +360,8 @@

Design Note

Identifiers and Keywords

Identifiers consist of multiple characters in the form of - (a-z|A-Z)(a-z|A-Z|0-9)*. Initially keywords are - also lexed as identifiers but later their corresponding + (a-z|A-Z)(a-z|A-Z|0-9)*. Initially, keywords + are also lexed as identifiers but later their corresponding TokenKind is looked up from the map and the correct token representing them is returned.
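As a rough sketch, the identifier branch of getNextToken() might look something like this, assuming a keywords map from spelling to TokenKind, an isAlnum() helper, and a Token that stores the spelling of identifiers:

if (isAlpha(peekNextChar())) {
  std::string value{eatNextChar()};

  while (isAlnum(peekNextChar()))
    value += eatNextChar();

  // keywords are lexed as identifiers first, then looked up in the map
  if (auto it = keywords.find(value); it != keywords.end())
    return Token{tokenStartLocation, it->second, std::move(value)};

  return Token{tokenStartLocation, TokenKind::Identifier, std::move(value)};
}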

@@ -390,12 +390,12 @@

Identifiers and Keywords

}

Notice how isSpace, isAlpha, etc. - are all custom functions, when the C++ standard library also + are all custom functions when the C++ standard library also provides std::isspace, std::isalpha, etc.

- These functions are dependant on the current locale, so if + These functions are dependent on the current locale, so if for example 'a' is not considered alphabetic in the current locale, the lexer will no longer work as expected. @@ -403,8 +403,8 @@
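The custom helpers mentioned above sidestep the locale issue by hardcoding the ASCII ranges. A minimal sketch of what they might look like:

bool isSpace(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f' || c == '\v';
}
bool isAlpha(char c) { return ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z'); }
bool isNum(char c) { return '0' <= c && c <= '9'; }
bool isAlnum(char c) { return isAlpha(c) || isNum(c); }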

Identifiers and Keywords

If none of the above conditions matches the current character and the end of the function is reached, the lexer - wasn't able to figure out which token represents the piece - of code starting at the current character, so an + can't figure out which token represents the piece of code + starting at the current character, so an Unk token is returned.

Token Lexer::getNextToken() {
diff --git a/www/parsing.html b/www/parsing.html
index 7eebbbe..d60c931 100644
--- a/www/parsing.html
+++ b/www/parsing.html
@@ -54,7 +54,7 @@ 

The Abstract Syntax Tree

building blocks (nouns, verbs, etc.) of sentences in a spoken language. The "This section talks about the parser." sentence - is valid in the english language, because the mentioned + is valid in the English language because the mentioned building blocks follow each other in the correct order. Similarly fn main(): void {} is a valid function declaration in your language for the same

The Abstract Syntax Tree

virtual void dump(size_t level = 0) const = 0; };

- Currently the only Decl in the language is the - FunctionDecl, which additionally to what every + Currently, the only Decl in the language is the + FunctionDecl, which in addition to what every declaration has in common, also has a return type and a body.
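Based on that description, the node might look roughly like this; the exact member names and types are assumptions:

struct FunctionDecl : public Decl {
  Type type;
  std::unique_ptr<Block> body;

  void dump(size_t level = 0) const override;
};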

@@ -121,7 +121,7 @@

The Abstract Syntax Tree

To make the dumping of the node easier, the indent() helper is introduced, which returns the indentation of a given level. For the indentation of - each level 2 spaces are used. + each level, 2 spaces are used.

std::string indent(size_t level) { return std::string(level * 2, ' '); }

@@ -153,7 +153,7 @@

The Abstract Syntax Tree

};

Because a Block doesn't have any child nodes, - it's textual representation only includes the name of the + its textual representation only includes the name of the node.

void Block::dump(size_t level) const {
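  // a sketch of the body implied by the text above; the output stream and
  // exact formatting are assumptions
  std::cerr << indent(level) << "Block\n";
}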
@@ -162,7 +162,7 @@ 

The Abstract Syntax Tree

Design Note

- Lately some compiler engineers started using + Lately, some compiler engineers started using std::variant instead of inheritance to model the AST, where the variant acts as a union of nodes. @@ -199,8 +199,8 @@
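As a sketch of the idea, with made-up node names:

struct NumberLiteral {
  double value;
};

struct DeclRefExpr {
  std::string identifier;
};

// the variant acts as a union of the possible expression nodes
using Expr = std::variant<NumberLiteral, DeclRefExpr>;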

Design Note

Expr *innerExpr; };

- In this case the question is, who owns the memory for - the innerExpr field. Who allocates it, who + In this case, the question is, who owns the memory for + the innerExpr field? Who allocates it, who is responsible for freeing it, etc. The workaround for this problem is to use a std::unique_ptr.

@@ -208,9 +208,9 @@

Design Note

std::unique_ptr<Expr> innerExpr; };

- Now it's clear that the node is the owner of it's child - node. However to know the current type of the variant, - innerExpr needs to be type checked. The + Now it's clear that the node is the owner of its child + node. However, to know the current type of the variant, + innerExpr needs to be type-checked. The same type checking however could also be performed on the pointer itself if Expr was a polymorphic base class. To avoid complexities, this @@ -270,8 +270,8 @@
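A sketch of what that type check might look like with the variant, reusing the made-up NumberLiteral node from above:

if (auto *numberLiteral = std::get_if<NumberLiteral>(innerExpr.get())) {
  // the inner expression is known to be a number literal here
}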

Types

Design Note

- Theoretically a function is also a separate type, so in - a more complex language with a more complex type system + Theoretically, a function is also a separate type, so in + a more complex language with a more complex type system, this should also be encapsulated somehow.

@@ -280,7 +280,7 @@

Design Note

function type. To be able to model the complexity of C++ types precisely, Clang uses a layer-based type system, where - each layer is a different higher level type. + each layer is a different higher-level type.

An int * is represented using 2 layers, one @@ -327,7 +327,7 @@

The Parser

nextToken(lexer.getNextToken()) {} };

- Once the parser finished processing the next token, it calls + Once the parser finishes processing the next token, it calls the eatNextToken() helper, which consumes it and calls the lexer for the following one.
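A minimal sketch of the helper, assuming the parser stores the lexer and the next token in members called lexer and nextToken:

void eatNextToken() { nextToken = lexer.getNextToken(); }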

@@ -440,9 +440,9 @@

The Parser

... }

- It might happen that the source code is invalid and the - parser fails to process it completely. In that case the AST - is incomplete, which is marked by the + The source code might be invalid and the parser fails to + process it completely. In that case, the AST is incomplete, + which is marked by the incompleteAST flag.

class Parser {
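  // a sketch of the flag described above; the rest of the members are
  // omitted here
  bool incompleteAST = false;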
@@ -552,9 +552,9 @@ 

Parsing Functions

return report(nextToken.location, msg);

The parseFunctionDecl() method expects the - current token to be KwFn, saves it's location - as the beginning of the function and checks if the rest of - the tokens are in the correct order. + current token to be KwFn, saves its location as + the beginning of the function and checks if the rest of the + tokens are in the correct order.

// <functionDecl>
 //  ::= 'fn' <identifier> '(' ')' ':' <type> <block>
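// a rough sketch of how the method might begin, pieced together from the
// description above; the SourceLocation type name, the 'kind' field and the
// error message are assumptions
std::unique_ptr<FunctionDecl> Parser::parseFunctionDecl() {
  SourceLocation location = nextToken.location;
  eatNextToken(); // eat 'fn'

  if (nextToken.kind != TokenKind::Identifier)
    return report(nextToken.location, "expected identifier");
  ...
}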
@@ -583,7 +583,7 @@ 

Parsing Functions

}

The next tokens denoting the start and end of the argument - list are single character tokens, which don't require any + list are single-character tokens, which don't require any special handling.

std::unique_ptr<FunctionDecl> Parser::parseFunctionDecl() {
@@ -612,7 +612,7 @@ 

Parsing Functions

... }

- Finally the Block is parsed by the + Finally, the Block is parsed by the parseBlock() method. Similarly to the current method, parseBlock() also expects the first token to be the start of the block, so that token is checked @@ -627,7 +627,7 @@

Parsing Functions

... }

- If everything was successful, the + If everything is successful, the FunctionDecl node is returned.

std::unique_ptr<FunctionDecl> Parser::parseFunctionDecl() {
@@ -637,9 +637,9 @@ 

Parsing Functions

}

Parsing the type has been extracted into a dedicated helper - method, so that it can be reused later when the language is + method so that it can be reused later when the language is extended. The number type is handled in a later - chapter as so far there is no token that represents it. + chapter as so far no token can represent it.

This method checks if the current token is @@ -713,10 +713,10 @@

Parsing Functions

}

If main() is not found and the AST is complete, - an error is reported. In case of an incomplete AST it might - have been parsing the main() function that - caused the syntax error, so nothing is reported to avoid - false positives. + an error is reported. In the case of an incomplete AST it + might have been parsing the main() function + that caused the syntax error, so nothing is reported to + avoid false positives.

std::pair<std::vector<std::unique_ptr<FunctionDecl>>, bool>
 Parser::parseSourceFile() {
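  // a sketch of the check described above; 'functions', the 'identifier'
  // member and the wording of the error are assumptions
  bool hasMainFunction = false;
  for (auto &&fn : functions)
    hasMainFunction |= fn->identifier == "main";

  if (!hasMainFunction && !incompleteAST)
    report(nextToken.location, "main function not found");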
@@ -808,7 +808,7 @@ 

Language Design

the syntax of a language. It might be tempting to introduce a certain syntax, but it can easily increase the difficulty of parsing that language and can even make expanding a - grammar rule dependant on the semantics of the source code. + grammar rule dependent on the semantics of the source code.

As an example take a look at the function declaration syntax @@ -817,7 +817,7 @@

Language Design

int foo(int); declares a function named foo, which returns an int and - accepts an int as parameter. + accepts an int as a parameter. int foo(0); is also valid C++ code that declares an int variable and initializes it to 0. @@ -826,7 +826,7 @@

Language Design

The issue arises when int foo(x); is encountered by the parser. Since C++ allows the creation of user-defined types, - x can either be a type, or a value. If + x can either be a type or a value. If x is a type, the above sequence of tokens is a function declaration, if x is a value, it is a variable declaration. @@ -845,8 +845,8 @@

Language Design

When the same sequence of symbols can have a different meaning based on what context they appear in, the grammar is called ambiguous. C++ is known to have multiple ambiguities - in it's grammar, though some are inherited from C such as - the pointer syntax. + in its grammar, though some are inherited from C such as the + pointer syntax.

typedef char a;
 a * b; // declares 'b', a pointer to 'a'
@@ -864,7 +864,7 @@ 

Language Design

A well-known source of ambiguity in programming languages is the generic syntax. Consider the following generic function call, which can appear in both C++ and Kotlin - function<type>(argument). For the parser + function<type>(argument). For the parser, this is a sequence of Identifier, <, Identifier, >, (, Identifier and ). @@ -883,13 +883,13 @@

Language Design

The source of the problem is that < can - either mean the start of a generic argument list, or the + either mean the start of a generic argument list or the less-than operator. Rust resolved this ambiguity by introducing the turbofish (::<>). The Rust parser knows that < always means the - less-than operator in confusing situations, because a - generic argument list must begin with - :: followed by the <. + less-than operator in confusing situations because a generic + argument list must begin with :: followed by + the <.

fn f<T>() {}