A sophisticated compiler implementation for a custom programming language that performs complete lexical analysis, syntax parsing, semantic analysis, and intermediate code generation via quadruples.
- Project Overview
- Features
- System Architecture
- Language Features
- Error Handling
- Tokens
- Quadruples
- Building and Running
- Examples
QuadStack-Compiler is a complete compiler implementation that transforms a high-level programming language into an intermediate representation using quadruples. The compilation process follows the standard multi-phase design pattern:
- Lexical Analysis: Tokenizes the source code using Flex
- Syntax Parsing: Validates grammatical structure using Bison/YACC
- Semantic Analysis: Ensures meaningful code with static type checking
- Symbol Table Management: Tracks variables, functions, scopes, and types
- Intermediate Code Generation: Produces quadruples representing the program
The compiler's design emphasizes robust error detection and reporting, including syntax errors, type mismatches, scope violations, and semantic inconsistencies.
The lexical analyzer (lexer.l
) is implemented using Flex. It:
- Converts source code into a stream of tokens
- Identifies keywords, identifiers, literals, operators, and punctuation
- Tracks line numbers for error reporting
- Handles comments (single and multi-line)
- Passes token values to the parser
The parser (parser.y
) is implemented using Bison/YACC. It:
- Defines the formal grammar of the language
- Constructs a parse tree according to syntax rules
- Initiates semantic analysis checks
- Calls the quadruple generator during parsing
- Handles syntax error recovery
Semantic analysis is integrated into the parser. It performs:
- Type checking for expressions and assignments
- Variable declaration and usage validation
- Constant reassignment prevention
- Function argument validation
- Control flow validation (e.g., break outside loops)
- Array access validation
The symbol table module (symbol_table.c/h
) provides a robust system for tracking program entities:
- Structure: Linked list with scope-based lookup
- Entries: Store name, type, scope, initialization status, usage info
- Scopes: Support nested blocks with proper scope entry/exit
- Operations:
- Variable declaration tracking
- Function and parameter declaration
- Type retrieval and validation
- Constant tracking
- Array information
- Usage and initialization tracking
The quadruples generator (quadruples.c/h
) produces a stack-based intermediate representation:
- Model: Stack machine with push/pop operations
- Structure: Operation code + up to three operands
- Generation: Emits quadruples during parsing
- Labels: Manages control flow with automatically generated labels
- Output: Produces human-readable quadruple file
-
Basic Types:
int
: Integer valuesfloat
: Floating-point valueschar
: Single character valuesstring
: Text stringsbool
: Boolean values (true/false)void
: Used for functions without return values
-
Type Checking:
- Static type checking at compile time
- Implicit conversions where appropriate (int → float)
- Warning messages for potentially unsafe conversions
- Error messages for incompatible types
-
Conditional:
if-else
statements with proper nestingswitch-case
statements with default case handling
-
Loops:
while
loops with condition testing at startrepeat-until
loops with condition testing at endfor
loops with initialization, condition, and update expressionsbreak
andcontinue
statements with context validation
- Arithmetic: Addition, subtraction, multiplication, division, modulo, power
- Comparison: Equal, not equal, greater than, less than, greater/less or equal
- Logical: AND, OR, NOT
- Bitwise: AND, OR, left shift, right shift
- Assignment: Simple assignment, compound assignments
- Increment/Decrement: Pre/post-increment, pre/post-decrement
- Declaration: Function name, return type, parameter list
- Parameters: Type-checked parameters with proper scope
- Return Values: Type-checked return statements
- Calls: Argument validation against parameter types and counts
- Quadruples Generation: Function calls are translated to appropriate CALL/RET operations
- Declaration: Fixed-size arrays with type and dimensions
- Access: Index-based access with bounds checking
- Assignment: Element assignment with type checking
- Declaration: Using
const
modifier with any data type - Protection: Prevents reassignment after initialization
The compiler performs extensive error detection and reporting:
- Syntax Errors: Reports unexpected tokens with line numbers
- Type Errors: Validates type compatibility in expressions and assignments
- Declaration Errors: Prevents variable redeclaration in the same scope
- Usage Errors: Detects use of undeclared variables
- Initialization Warnings: Warns about using variables before initialization
- Constant Modification: Prevents reassignment to constants
- Function Errors: Validates return types, parameter counts, and argument types
- Control Flow Errors: Ensures break/continue are used in proper contexts
- Array Errors: Validates array indices and dimensions
Token Category | Examples | Description |
---|---|---|
Keywords | if , else , while , for , int , float |
Language keywords for control flow, types, etc. |
Identifiers | Variable and function names | User-defined names for program entities |
Literals | 123 , 45.67 , "hello" , true |
Number, string, character, boolean constants |
Operators | + , - , * , / , = , == , != , && , || |
Arithmetic, assignment, comparison, logical operators |
Bitwise Operators | & , | , << , >> |
Operators for bit manipulation |
Punctuation | ; , , , ( , ) , { , } , [ , ] |
Delimiters and structural tokens |
Special Tokens | ++ , -- , -> |
Increment, decrement, and other special operators |
Quadruples are four-part instructions that represent operations in a stack-based execution model:
Quadruple | Description | Example |
---|---|---|
PUSH x |
Push the value of x onto the stack | PUSH 5 |
POP x |
Pop a value from the stack and store it in x | POP result |
ADD |
Pop two values, add them, push the result | ADD |
SUB |
Pop two values, subtract second from first, push result | SUB |
MUL |
Pop two values, multiply them, push result | MUL |
DIV |
Pop two values, divide first by second, push result | DIV |
MOD |
Pop two values, compute modulo, push result | MOD |
NEG |
Pop a value, negate it, push result | NEG |
EQ |
Compare for equality, push boolean result | EQ |
NE |
Compare for inequality, push boolean result | NE |
LT |
Compare less than, push boolean result | LT |
GT |
Compare greater than, push boolean result | GT |
LE |
Compare less than or equal, push boolean result | LE |
GE |
Compare greater than or equal, push boolean result | GE |
LOGICAL_AND |
Logical AND of two values | LOGICAL_AND |
LOGICAL_OR |
Logical OR of two values | LOGICAL_OR |
BITWISE_AND |
Bitwise AND of two values | BITWISE_AND |
BITWISE_OR |
Bitwise OR of two values | BITWISE_OR |
LSHIFT |
Left shift operation | LSHIFT |
RSHIFT |
Right shift operation | RSHIFT |
NOT |
Logical NOT of a value | NOT |
JMP Label |
Unconditional jump to label | JMP EndLoop_1 |
JF Label |
Jump to label if top of stack is false | JF FalseLabel_2 |
Label: |
Define a label for jumps | StartWhile_1: |
CALL func |
Call a function | CALL sum |
RET |
Return from a function | RET |
END func |
End of function declaration | END sum |
To build the compiler on Windows, you can use the provided batch file:
run.batGraphical User Interface
The project includes a graphical user interface built with Python and Tkinter for a more user-friendly compilation experience.
Features
Source code editor with syntax highlighting
File open/save functionality
One-click compilation
Tabbed output display showing compiler messages and generated quadruples
Status bar for process feedback