Skip to content

Investigate more type-safe (and compact) AST representation #1

@AiredaleDev

Description

@AiredaleDev

Presently, the meaning of the children of ASTNodes is dependent on the ASTNodeType within it. This means that it is possible to construct ASTs with bogus arguments, and the meaning of their arguments is defined at runtime (and therefore the code is not super-readable).

I'm curious if it would be worth representing an AST like this instead:

// This type specifies which slotmap to look up.
// NOTE: IIRC there isn't any macro like this, but there is a trait for it.
#[derive(slotmap_key)]
enum ASTKey {
   // One variant for each of the following:
}

// No smallvec is allocated, making these as theoretically small as possible.
struct Literal<'src> {
    val: &'src str,
}

struct Ident<'src> {
    name: &'src str,
}

struct UnOpNode {
    op: UnOp,
    operand: ASTKey,
}

struct BinOpNode {
  op: BinOp,
  lhs: ASTKey,
  rhs: ASTKey,
}

struct Block {
    stmts: Vec<ASTKey>,
}

// ...

// We could define a macro that generates this struct from everything we tag as an AST node too if we'd like.
struct AST<'src> {
    lits: SlotMap<ASTKey, Literal<'src>>,
    idents: SlotMap<ASTKey, Ident<'src>>,
    unops: SlotMap<ASTKey, UnOpNode>,
    binops: SlotMap<ASTKey, BinOpNode>,
    // ...
    root: ASTKey,
}

This would make the AST more type-safe in addition to improving cache usage by making things more compact, at the expense of making keys a little bigger (probably no more than an extra byte). We'd also avoid poisoning nodes that don't need lifetime parameters with extraneous lifetime parameters. IIRC this is how Zig does its AST (except it doesn't use SlotMaps, it uses a MultiArrayList )

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions