diff --git a/README.md b/README.md index 415c54e..cd8d1bc 100644 --- a/README.md +++ b/README.md @@ -1,543 +1,110 @@ -# **UNFINISHED, PARSER NOTUP TO DATE** +# Zenith +Zenith is an in-development, hybrid programming language compiler written in C++23. It aims to combine JVM interoperability, low-level memory control, and dynamic scripting in a single language. - - -# Specific language specification - -### Keywords -- `auto` - Type inference -- ```dynamic``` - Dynamic type (primitive) (heap allocated) -- ```fun``` - Declares a dynamic function, can be used with static return type functions too -- [DEPRECATED] ```hoist``` - Hoists a variable to the top of its scope. Defines at the normal point -- `unsigned` - Makes a number variable unsigned -- `signed` - Makes a number variable signed -- ```class``` - Makes a class -- ```struct``` - Makes a struct -- ```unsafe``` - Creates a block where you can use unsafe functions -- ```new``` - New keyword, used when creating objects - - -**Class Scopes** -- ```public``` - Class public scope -- ```protected``` - Class protected scope -- ```private``` - Class private scope -- ```privatew``` - Class privatew scope (only private scope can write, everyone can read) -- ```protectedw``` - Class protectedw scope (only protected/private scope can write, everyone can read - -**Imports** -- ```import``` - Imports a file -- ```import java``` - Imports a java class(must be compiled to JVM or have a JVM it can reference to) -- ```extern "C"``` - Creates a block where you can declare C functions, must be in an unsafe block -- ```package``` - Sets a package, needed when compiling to JVM or integrating with java - -### Types - -**Built-in types** - -**Number Types** -- `short` - 2 byte integer -- `int` - 4 byte integer -- `long` - 8 byte integer -- `byte` - 1 byte integer -- `float` - 4 byte decimal number -- `double` - 8 byte decimal number -- `dynamic` - heap allocated dynamic data type +## Usage -**Other primitive Types** -- `string` - A 
string !! Not a `class` -- `freeobj` - A JavaScript like object that can store data, functions, etc. If it can be lowered to a structure it will be to save on heap memory and overhead -- [DEPRECATED] -- `Number` - A JS like number that supports decimals and whole numbers up to 8 bytes -- `BigInt` - A JS like dynamically allocated integer, up to 32 bytes -- `BigNumber` - A dynamically allocated number that can support numbers up to 32 bytes -``` -//Easiest way -auto dynamic = { //inferred as freeobj - name: "Zenith", - version: 1.0f, - getInfo: () => "${this.name} v${this.version}" -} -//Redundant freeobj way -auto dynamic = freeobj { - name: "Zenith", - version: 1.0f, - getInfo: () => "${this.name} v${this.version}" -} -//Type strict way -freeobj dynamic = freeobj { //second freeobj Infront of { is optional - name: "Zenith", - version: 1.0f, - getInfo: () => `"${this.name} v${this.version}"` -} -//Edge cases -struct somestruct{...} -somestruct name = {} //will be a struct -somestruct nicename = freeobj {} //Type conversion error as you are specifically making a freeobject but assigning it to a structure yype -``` -**Built-in `class` types** - -- `IO` - simple console and file IO -- Std lib not done - -### Blocks -**Functions** -``` -fun type name(type argName){} -``` -**Loops** -- `for` loop -``` -for(let i=0;i<123;i++){ - // -} -int arr[1024] = ... 
-for(int i : arr){ - //Log i -} -``` -- `while` loop -``` -while(somecondition){ - // -} -``` -- `do-while` loop -``` -do { - // -} -while (condition) ``` -**If statements** +Zenith [options] ``` -if(condition){ - //do this -}else{ - //do that -} -if(condition){ - //do 1 -}else if(condition){ - //do 2 -} -``` -**Objects** -**Classes** +### Options -``` -class Cat{ - privatew string name; - privatew double age // ; optional - public void pet(){ - IO.print(`"${this.name} feels very Loved!"`) - } - public void age(double age){ - if(age>0){ - this.age+=age - } - } +| Flag | Values | Default | Description | +|------|--------|---------|-------------| +| `--target=` | `native`, `jvm` | `native` | Compilation target | +| `--braces=` | `required`, `optional` | `required` | Whether braces are mandatory around blocks | +| `--gc=` | `generational`, `refcounting`, `none` | `generational` | Garbage-collection strategy | - public Cat(string name, Gender gender){ - this.name = name - this.age = 0 - } +### Example -} -``` -**Structs** -``` -struct Vector3{ - int x; - int y; - int z; -} -``` -[DEPRECATED] -**Union** -``` -union Pet{ - Cat; - Dog; -} +```bash +./build/Zenith --target=native hello.zn ``` -### Metaprogramming - Annotations, Decorators -- **Annotations** can be added to anything and hold some extra info, work the same as Java Annotations can hold anything, they give runtime metadata(in some cases with the built-in ones it's compile time) -``` -@Breedable -class Cat{...} -@Register(id="some_registry_entry",registry="block") -class Cube{...} +After a successful run you will find: -@NotNull -let somevarthatcantbenull = 1 +- `lexerout.log` — token stream produced by the lexer +- `parserout.log` — AST dump produced by the parser -@IsAGreatFunction -fun givesGreatResponses{...} -``` -- **Decorators** - can be added to functions/methods and function as a wrapper before executing the main code -``` - @@Memoize - fun fibbonaci(){} -``` -#### Creating -**Annotations** -``` 
-annotation ComesFromNet -@ComesFromNet -/************************/ -annotation Name{ - string value; -} -@Name("John Cena") -/************************/ -annotation Register{ - string registry; - string id; -} -@Register(id="actor",registry="con") -/************************/ -annotation Register{ - string registry; - Cat mother; -} -``` -**Decorators** - -[WIP] - -### Object-Oriented Programming (OOP) in Depth -**Classes** - -Classes are the foundation of OOP, encapsulating data (fields) and behavior (methods). They support: - -- Inheritance - -- Operator overloading (though not shown in this first example) - -- Automatic getters/setters (though not shown in this first example) - -- Constructors, destructors, and method overriding - -Key Modifiers - -- `const` – Makes a field immutable (must be initialized at declaration or in the constructor). -- `static` - Makes the member not need an instance to be accessed +## Running the Tests +```bash +cmake --build build --target ptest +cd build && ctest --output-on-failure ``` -class Dog { - // Fields with restricted write access - privatew double age; - protectedw string breed; - const public string name; - - public Dog(string name, string breed) : name(name) { - this.age = 0; - this.breed = breed; - } - - public void describe() { - IO.print(`"${this.name} is a ${this.breed} aged ${this.age}."`); - this.age = 5 - } - // Protected method (can modify protected fields) - protected void setBreed(string newBreed) { - this.breed = newBreed; - } -} +The test suite uses [GoogleTest](https://github.com/google/googletest) and covers the lexer/parser pipeline. -class Mallinois : Dog { - public Mallinois(string name) : Dog(name, "Mallinois") {} +## Project Structure - public void train() { - this.age = 3; - this.breed = "Trained Mallinois"; - IO.print(`"${super.name} is now a ${this.breed}!"`); - } -} - -// Example Usage -fun main() { - Dog buddy = new Dog("Buddy", "Labrador"); - buddy.describe(); // "Buddy is a Labrador aged 0." 
- - // Can READ but NOT WRITE privatew/protectedw from public scope: - // buddy.age = 5; // ERROR: privatew write outside class! - // buddy.breed = "Poodle"; // ERROR: protectedw write outside hierarchy! - - Mallinois rex = new Mallinois("Rex"); - rex.train(); // "Rex is now a Trained Mallinois!" - rex.describe(); // "Rex is a Trained Mallinois aged 0." -} ``` - +Zenith/ +├── CMakeLists.txt +├── docs/ +│ └── LANGUAGE_SPEC.md # Full language reference +└── src/ + ├── main.cpp # Compiler entry point + ├── lexer/ # Tokeniser + ├── parser/ # Recursive-descent parser → AST + ├── ast/ # AST node definitions + ├── SemanticAnalysis/ # Type checking and symbol resolution + ├── visitor/ # Visitor pattern for AST traversal + ├── core/ # Polymorphic type utilities + ├── exceptions/ # LexError, ParseError, SemanticAnalysisError + ├── utils/ # File I/O, argument parsing, helpers + └── test/ # GoogleTest unit tests ``` -class Vector3D { - // Private fields (readable everywhere, writable only in Vector3D) - privatew double x; - privatew double y; - privatew double z; - - // Constructor - public Vector3D(double x, double y, double z) { - this.x = x; - this.y = y; - this.z = z; - } - - // --- Setters (with validation) --- - public void setter(x) setX(double x) { - if (!x.isNaN()) { // Example validation - this.x = x; - } - } - // (Similar for setY/setZ...) 
- - // --- Operator Overloading --- - // Vector addition (+) - public Vector3D operator+(Vector3D other) { - return new Vector3D( - this.x + other.x, - this.y + other.y, - this.z + other.z - ); - } - - // Scalar multiplication (*) - public Vector3D operator*(double scalar) { - return new Vector3D( - this.x * scalar, - this.y * scalar, - this.z * scalar - ); - } - - // Equality check (==) - public bool operator==(Vector3D other) { - return this.x == other.x - && this.y == other.y - && this.z == other.z; - } - - // --- String representation --- - public string toString() { - return `"(${this.x}, ${this.y}, ${this.z})"`; - } -} - -// Example Usage -fun main() { - Vector3D v1 = new Vector3D(1.0, 2.0, 3.0); - Vector3D v2 = new Vector3D(4.0, 5.0, 6.0); - - // Operator overloading - Vector3D sum = v1 + v2; // (5.0, 7.0, 9.0) - Vector3D scaled = v1 * 2.0; // (2.0, 4.0, 6.0) - bool isEqual = (v1 == v2); // false - // Getters/setters - IO.print(v1.getX()); // 1.0 - v1.x = 10.0; // Valid write ↓ - // v1.x = 10.0; // Usually would error, but will automatically use setter, ERROR: privatew write outside class! 
+## Roadmap - IO.print(sum.toString()); // "(5.0, 7.0, 9.0)" -} -``` -**Structs** -- Structs are the same as classes, but have a default access level of public -``` - struct Vector3{ - public int x; - public int y; - public int z; - //By default they don't have a constuctor - } - Vector3 a = {1,2,4} - Vector b = { - a: 1, - b: 2, - c: 4 - } -``` +- [ ] Complete parser +- [ ] Finish semantic analysis +- [ ] Native code generation +- [ ] JVM compilation target +- [ ] Standard library (`IO`, collections, …) +- [ ] IDE / LSP support -[DEPRECATED] -**Union** -- A union is a special data type that lets you store different types of data in the same memory space, but only one at a time -Key Idea: -- Unlike a struct (where each field has its own memory), a union shares the same memory for all its fields -- Changing one field overwrites the others -- Size = largest member’s size (since memory is shared) -- Do not directly support dynamic variables -- If they contain an object that contains dynamic variables they will be ignored (in size) -- No way to know what type is active -``` -union arbnum { - int, - float -} -``` +## License +No license file is present in this repository. All rights are reserved by the author unless otherwise stated. Contact the repository owner for licensing enquiries. -# Enum error handling model (syntax not mentioned up top needs to be fixed) +--- -``` -enum ReadFileRV { - SUCCESS(data: string) - ERR_NO_ACCESS, - ERR_IN_USE -} -``` -Used with -``` -match value{ - SUCCESS => { - //LAMBDA - - } - //etc... -} -``` -# Pipelines (Wip) -``` -path -|> readFile -|> fileType -|> match { - XML |> parseXml |> docFromXml - JSON |> parseJson |> docFromJson - _ |=> {/*lambda*/ IO.print("wrong type")} |> break //breaks parent lambda -} -|> modifyDoc -|> saveDoc(path, _) //_ is result from previous +The full language reference has moved to **[docs/LANGUAGE_SPEC.md](docs/LANGUAGE_SPEC.md)**. 
diff --git a/docs/LANGUAGE_SPEC.md b/docs/LANGUAGE_SPEC.md
new file mode 100644
index 0000000..b355021
--- /dev/null
+++ b/docs/LANGUAGE_SPEC.md
@@ -0,0 +1,414 @@
+# Zenith Language Specification
+
+> **Note:** This specification is a work in progress. Some features described here are not yet fully implemented in the compiler.
+
+---
+
+## Table of Contents
+
+1. [Keywords](#keywords)
+2. [Types](#types)
+3. [Blocks and Control Flow](#blocks-and-control-flow)
+4. [Object-Oriented Programming](#object-oriented-programming)
+5. [Metaprogramming](#metaprogramming)
+6. [Error Handling](#error-handling)
+7. [Pipelines](#pipelines)
+8. [Unique Features](#unique-features)
+9. [Grammar Highlights](#grammar-highlights-ebnf)
+
+---
+
+## Keywords
+
+### General
+
+| Keyword | Description |
+|---------|-------------|
+| `auto` | Type inference |
+| `dynamic` | Dynamic (heap-allocated) type |
+| `fun` | Declares a function (works with both static and dynamic return types) |
+| `unsigned` | Makes a numeric variable unsigned |
+| `signed` | Makes a numeric variable signed |
+| `class` | Declares a class |
+| `struct` | Declares a struct |
+| `unsafe` | Opens a block where unsafe operations are allowed |
+| `new` | Creates a new object instance |
+| `let` | Declares a dynamic variable |
+| `var` | Declares a variable (dynamic, hoistable) |
+| `hoist` *(deprecated)* | Hoists a variable to the top of its scope |
+
+### Class Access Modifiers
+
+| Modifier | Description |
+|----------|-------------|
+| `public` | Accessible from anywhere |
+| `protected` | Accessible within the class and subclasses |
+| `private` | Accessible only within the class |
+| `privatew` | Readable everywhere, writable only within the class |
+| `protectedw` | Readable everywhere, writable only within protected/private scope |
+
+### Imports
+
+| Keyword | Description |
+|---------|-------------|
+| `import` | Imports a Zenith file |
+| `import java` | Imports a Java class (requires JVM target) |
+| `extern "C"` | Declares C functions (must be inside an `unsafe` block) |
+| `package` | Sets a package namespace (required for JVM compilation) |
+
+---
+
+## Types
+
+### Primitive Number Types
+
+| Type | Size | Description |
+|------|------|-------------|
+| `byte` | 1 byte | Integer |
+| `short` | 2 bytes | Integer |
+| `int` | 4 bytes | Integer |
+| `long` | 8 bytes | Integer |
+| `float` | 4 bytes | Decimal number |
+| `double` | 8 bytes | Decimal number |
+| `dynamic` | heap | Heap-allocated dynamic data |
+
+### Other Primitive Types
+
+| Type | Description |
+|------|-------------|
+| `string` | String (not a class) |
+| `freeobj` | JavaScript-like object; lowered to a struct at compile time if its shape is static |
+| `bool` | Boolean |
+
+### `freeobj` Examples
+
+```zenith
+// Inferred as freeobj
+auto obj = {
+    name: "Zenith",
+    version: 1.0f,
+    getInfo: () => `"${this.name} v${this.version}"`
+}
+
+// Explicit freeobj
+freeobj obj = freeobj {
+    name: "Zenith",
+    version: 1.0f
+}
+
+// Edge cases
+struct SomeStruct { ... }
+SomeStruct a = {}          // becomes a struct
+SomeStruct b = freeobj {}  // ERROR: explicit freeobj cannot be assigned to struct type
+```
+
+### Built-in Class Types
+
+- `IO`: basic console and file I/O
+
+> The standard library is not yet complete.
+
+---
+
+## Blocks and Control Flow
+
+### Functions
+
+```zenith
+fun returnType name(type argName) {
+    // body
+}
+```
+
+### `for` Loop
+
+```zenith
+// C-style
+for (let i = 0; i < 123; i++) {
+    // ...
+}
+
+// Range-based
+int arr[1024] = ...
+for (int i : arr) {
+    // iterate over arr
+}
+```
+
+### `while` Loop
+
+```zenith
+while (condition) {
+    // ...
+}
+```
+
+### `do-while` Loop
+
+```zenith
+do {
+    // ...
+} while (condition)
+```
+
+### `if` / `else`
+
+```zenith
+if (condition) {
+    // ...
+} else {
+    // ...
+}
+
+if (condition) {
+    // ...
+} else if (otherCondition) {
+    // ...
+}
+```
+
+---
+
+## Object-Oriented Programming
+
+### Classes
+
+Classes encapsulate data and behaviour. They support:
+
+- Inheritance
+- Operator overloading
+- Constructors and destructors
+- Method overriding
+- `const` fields (immutable after construction)
+- `static` members (no instance required)
+
+```zenith
+class Dog {
+    privatew double age;
+    protectedw string breed;
+    const public string name;
+
+    public Dog(string name, string breed) : name(name) {
+        this.age = 0;
+        this.breed = breed;
+    }
+
+    public void describe() {
+        IO.print(`"${this.name} is a ${this.breed} aged ${this.age}."`);
+    }
+
+    protected void setBreed(string newBreed) {
+        this.breed = newBreed;
+    }
+}
+
+class Mallinois : Dog {
+    public Mallinois(string name) : Dog(name, "Mallinois") {}
+
+    public void train() {
+        this.age = 3;
+        this.breed = "Trained Mallinois";
+        IO.print(`"${super.name} is now a ${this.breed}!"`);
+    }
+}
+```
+
+#### Operator Overloading
+
+```zenith
+class Vector3D {
+    privatew double x;
+    privatew double y;
+    privatew double z;
+
+    public Vector3D(double x, double y, double z) {
+        this.x = x; this.y = y; this.z = z;
+    }
+
+    public Vector3D operator+(Vector3D other) {
+        return new Vector3D(this.x + other.x, this.y + other.y, this.z + other.z);
+    }
+
+    public Vector3D operator*(double scalar) {
+        return new Vector3D(this.x * scalar, this.y * scalar, this.z * scalar);
+    }
+
+    public bool operator==(Vector3D other) {
+        return this.x == other.x && this.y == other.y && this.z == other.z;
+    }
+
+    public string toString() {
+        return `"(${this.x}, ${this.y}, ${this.z})"`;
+    }
+}
+```
+
+### Structs
+
+Structs are like classes but default to `public` access. They do not have a constructor by default.
+
+```zenith
+struct Vector3 {
+    int x;
+    int y;
+    int z;
+}
+
+Vector3 a = {1, 2, 4}
+Vector3 b = { x: 1, y: 2, z: 4 }
+```
+
+---
+
+## Metaprogramming
+
+### Annotations
+
+Annotations attach metadata to any declaration. Built-in annotations may be processed at compile time; user-defined annotations provide runtime metadata.
+
+```zenith
+@Breedable
+class Cat { ... }
+
+@Register(id="some_entry", registry="block")
+class Cube { ... }
+
+@NotNull
+let value = 1
+
+@IsAGreatFunction
+fun givesGreatResponses() { ... }
+```
+
+#### Defining Annotations
+
+```zenith
+annotation ComesFromNet // marker annotation
+
+annotation Name {
+    string value;
+}
+
+annotation Register {
+    string registry;
+    string id;
+}
+```
+
+### Decorators
+
+Decorators wrap functions/methods with additional behaviour at compile time.
+
+```zenith
+@@Memoize
+fun fibonacci(int n) -> int { ... }
+
+@@Log
+fun compute() { ... }
+```
+
+> Custom decorators are a work in progress.
+
+---
+
+## Error Handling
+
+### Enum-Based Model
+
+```zenith
+enum ReadFileRV {
+    SUCCESS(data: string),
+    ERR_NO_ACCESS,
+    ERR_IN_USE
+}
+```
+
+Use with `match`:
+
+```zenith
+match value {
+    SUCCESS => {
+        // handle success, data is available here
+    }
+    ERR_NO_ACCESS => {
+        IO.print("Access denied")
+    }
+    ERR_IN_USE => {
+        IO.print("File is in use")
+    }
+}
+```
+
+### JVM Exceptions
+
+```zenith
+@Throws
+fun riskyOperation() { ... }
+```
+
+---
+
+## Pipelines
+
+> Pipelines are a work in progress.
+
+```zenith
+path
+|> readFile
+|> fileType
+|> match {
+    XML |> parseXml |> docFromXml
+    JSON |> parseJson |> docFromJson
+    _ |=> { IO.print("wrong type") } |> break
+}
+|> modifyDoc
+|> saveDoc(path, _) // _ is the result from the previous step
+```
+
+---
+
+## Unique Features
+
+| Feature | Example | Notes |
+|---------|---------|-------|
+| Write restrictions | `privatew int x;` | Class-only write, public read |
+| Hoisting *(deprecated)* | `hoist var x = 10;` | Like JavaScript |
+| Configurable braces | `--braces=optional` | Compiler flag |
+| Free objects | `let obj = { x: 10 };` | Optimised to struct when shape is static |
+| Dual GC | `--gc=none` | Manual and GC modes can coexist |
+
+---
+
+## Grammar Highlights (EBNF)
+
+```ebnf
+freeobj = "{" [ IDENT ":" expr { "," IDENT ":" expr } ] "}" ;
+unsafe_block = "unsafe" "{" { stmt } "}" ;
+async_func = "@Async" "fun" IDENT "(" params ")" "->" "Future" "<" type ">" ;
+```
+
+---
+
+## JVM Interop Example
+
+```zenith
+@JVMExport
+class Main {
+    static void main() {
+        let obj = { msg: "Hello, JVM!" };
+        IO.print(obj.msg);
+    }
+}
+```
+
+> To export to Java you must set a `package`. When exported, standalone functions appear as methods in a `functions` class on the Java side.
+
+## Low-Level Control Example
+
+```zenith
+unsafe {
+    let buf = malloc(1024);
+    buf[0] = 42; // pointer arithmetic
+    free(buf);
+}
+```
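For readers more familiar with C++ (the compiler's implementation language), the `unsafe` block above can be read as roughly the following sketch. It assumes that Zenith's `malloc` and `free` behave like the C library functions of the same name, which the spec does not state. The function name `write_and_read_first_byte` and the null check are illustrative additions, not part of Zenith.

```cpp
#include <cstdlib>

// Hypothetical C++ analogue of the Zenith `unsafe` block.
// Assumption: Zenith's malloc/free map onto the C library functions.
// Returns the byte read back from the buffer, or -1 if allocation fails.
int write_and_read_first_byte() {
    // Allocate 1024 raw bytes; std::malloc returns nullptr on failure.
    auto* buf = static_cast<unsigned char*>(std::malloc(1024));
    if (buf == nullptr) {
        return -1;
    }
    buf[0] = 42;          // indexing is pointer arithmetic: *(buf + 0) = 42
    int first = buf[0];   // read the value back before releasing the memory
    std::free(buf);       // buf must not be dereferenced after this point
    return first;
}
```

Calling `write_and_read_first_byte()` from a `main` function is enough to exercise the sketch. The allocation check is something the Zenith snippet leaves implicit; whether the language would insert one inside `unsafe` is not specified.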