SML/NJ Primitive Operations

This document describes the primitive operators (primops) that the compiler exposes. These are used to define the InlineT structure, which, in turn, is used in the implementation of the Basis Library. With the addition of 64-bit targets, the mapping from primop to internal representation has become target-specific in many cases.

Relevant source files

  • compiler/ElabData/prim/primop.sml
    this file defines the Primop structure, which includes the various datatypes used to represent primitive operations internally in the front-end of the compiler. The main type is Primop.primop.

  • compiler/ElabData/prim/primop.sig
    this file defines the PRIMOP signature used for the Primop structure.

  • compiler/Semant/prim/primop-bindings.sml
    this file defines the bindings between the SML variables exposed by the compiler and the internal Primop.primop type.

  • system/smlnj/init/built-in32.sml
    this file defines the InlineT structure for 32-bit targets

Naming conventions

Operations that "belong" to a specific type (e.g., addition) have an initial prefix that specifies the type as follows:

  • "int" -- default tagged integer type (i.e., either Int31.int or Int63.int)
  • "word" -- default tagged word type (i.e., either Word31.word or Word63.word)
  • "int32" -- 32-bit integers
  • "word32" -- 32-bit words
  • "int64" -- 64-bit integers
  • "word64" -- 64-bit words
  • "intinf" -- arbitrary precision integers
  • "real32" -- 32-bit real numbers (not yet supported)
  • "real64" -- 64-bit real numbers
  • "ptr" -- machine address
  • "barr" -- bytearray (used for arrays of Word8.word and char)
  • "bvec" -- bytevector (used for strings and vectors of Word8.word)
  • "arr" -- polymorphic arrays
  • "vec" -- polymorphic vectors
  • "seq" -- sequence types (arrays and vectors)

We use the attribute "raw" to denote direct machine operations that are not directly accessible in the Basis Library (e.g., shift operations, where the Basis versions clamp the shift amount to the word size, but the raw versions do not).
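
As a small illustration of the clamping behaviour, the sketch below uses only Basis Library functions (Word.<<, Word.>>, Word.~>>, Word.notb, and Word.wordSize). The raw primops are not callable from user code, and what they return for an oversized shift amount depends on the machine, so they are not shown.

```sml
(* Shift amounts at or beyond the word size are well defined in the Basis. *)
val n = Word.fromInt Word.wordSize

val leftAll  = Word.<< (0w1, n)        (* every bit is shifted out: 0w0 *)
val rightAll = Word.>> (0w1, n)        (* logical shift fills with zeros: 0w0 *)
val allOnes  = Word.notb 0w0
val arithAll = Word.~>> (allOnes, n)   (* arithmetic shift keeps the sign bit *)
```

With a set sign bit, the arithmetic right shift leaves the all-ones pattern unchanged, while both logical shifts yield zero.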

We use the attribute "unsafe" for operations that could potentially result in a crash (e.g., array subscript operations that do not check the index against the array bounds).

Primitive operators

Size-independent primops

Continuation operators

  • callcc : ('a cont -> 'a) -> 'a

  • throw : 'a cont -> 'a -> 'b

  • capture : ('a control_cont -> 'a) -> 'a

  • isolate : ('a -> unit) -> 'a cont

  • cthrow : 'a control_cont -> 'a -> 'b
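
A minimal sketch of how callcc and throw are typically used, assuming the SMLofNJ.Cont structure, which exposes them to user code with the types listed above. The function findFirst is a hypothetical helper written for this example.

```sml
structure C = SMLofNJ.Cont

(* Return the first element satisfying p.  The traversal is abandoned as
 * soon as a match is found by throwing to the captured continuation. *)
fun findFirst p xs =
      C.callcc (fn k =>
        (List.app (fn x => if p x then C.throw k (SOME x) else ()) xs;
         NONE))

val r1 = findFirst (fn x => x > 2) [1, 2, 3, 4]   (* SOME 3 *)
val r2 = findFirst (fn x => x > 9) [1, 2, 3, 4]   (* NONE *)
```

Because throw has result type 'b, the throwing branch can be used where a unit result is expected.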

Reference operations

  • ! : 'a ref -> 'a

  • := : 'a ref * 'a -> unit

  • makeref : 'a -> 'a ref

Boxity tests

  • boxed : 'a -> bool

  • unboxed : 'a -> bool

Type cast

  • cast : 'a -> 'b

Equality tests

  • = : ''a * ''a -> bool

  • <> : ''a * ''a -> bool

  • ptr_eql : 'a * 'a -> bool

  • ptr_neq : 'a * 'a -> bool

Runtime hooks

  • getvar : unit -> 'a

  • setvar : 'a -> unit

  • mkspecial : int * 'a -> 'b

  • getspecial : 'a -> int

  • setspecial : 'a * int -> unit

  • gethdlr : unit -> 'a cont

  • sethdlr : 'a cont -> unit

  • gettag : 'a -> int

  • objlength : 'a -> int
    extracts the length field from an object's header word (P.OBJLENGTH)

Inline operations

These primops are Basis Library functions that should be inlined for efficiency.

  • compose : ('b -> 'c) * ('a -> 'b) -> 'a -> 'c

  • before : 'a * 'b -> 'a

  • ignore : 'a -> unit

  • identity : 'a -> 'a

  • bool_not : bool -> bool

Some additional candidates for inlined operations include hd, tl, null, chr, and ord. If the compiler had the option and order datatypes builtin (like bool and list), then valOf, isSome, isNone and some of the compare functions could be inlined.
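
For reference, the behaviour of the inlined functions listed above can be written in a few lines of SML. This is a sketch of what they compute, not the compiler's definitions; the names before_ and ignore_ are hypothetical stand-ins chosen to avoid clashing with the infix before and the top-level ignore.

```sml
(* Reference behaviour only; the compiler inlines equivalent code. *)
fun compose (f, g) = fn x => f (g x)   (* same result as f o g *)
fun before_ (x, _) = x                 (* keep the first value, drop the second *)
fun ignore_ _ = ()
fun identity x = x
fun bool_not b = if b then false else true

val eleven = compose (fn x => x + 1, fn x => x * 2) 5   (* (5 * 2) + 1 *)
```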

Bytearray and bytevector operations

Operations on byte/char array/vectors. We renamed these to make it clear which operations do bounds checking and which do not.

  • bvec_unsafe_sub : 'a * int -> 'b
    subscript from byte vector without bounds checking (P.NUMSUBSCRIPT{kind=P.INT 8, checked=false, immutable=true})

  • barr_unsafe_sub : 'a * int -> 'b
    subscript from byte array without bounds checking (P.NUMSUBSCRIPT{kind=P.INT 8, checked=false, immutable=false})

  • barr_unsafe_update : 'a * int * 'b -> unit
    update byte array without bounds checking (P.NUMUPDATE{kind=P.INT 8, checked=false})

  • bvec_sub : 'a * int -> 'b
    subscript from byte vector (P.NUMSUBSCRIPT{kind=P.INT 8, checked=true, immutable=true})

  • barr_sub : 'a * int -> 'b
    subscript from byte array (P.NUMSUBSCRIPT{kind=P.INT 8, checked=true, immutable=false})

  • barr_update : 'a * int * 'b -> unit
    update byte array (P.NUMUPDATE{kind=P.INT 8, checked=true})
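
The checked variants behave like the corresponding Basis Library operations. The sketch below uses only Basis functions (Byte.stringToBytes, Word8Vector.sub, Word8Array.array, Word8Array.update, Word8Array.sub) to show in-bounds access and the Subscript exception raised on an out-of-bounds index. It does not call the primops directly.

```sml
(* A string viewed as a byte vector. *)
val bytes : Word8Vector.vector = Byte.stringToBytes "abc"

(* In-bounds access: the byte for #"b". *)
val b1 = Word8Vector.sub (bytes, 1)

(* Out-of-bounds access with a checked operation raises Subscript. *)
val caught = (Word8Vector.sub (bytes, 3); false) handle Subscript => true

(* Byte arrays are mutable; update is bounds checked as well. *)
val arr = Word8Array.array (2, 0w0)
val () = Word8Array.update (arr, 0, 0wxFF)
val b2 = Word8Array.sub (arr, 0)
```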

Polymorphic array and vector

  • mkarray : int * 'a -> 'a array
    create a polymorphic array (P.INLMKARRAY)

  • arr_unsafe_sub : 'a array * int -> 'a
    subscript from polymorphic array without bounds checking (P.SUBSCRIPT)

  • arr_sub : 'a array * int -> 'a
    subscript from polymorphic array (P.INLSUBSCRIPT)

  • vec_unsafe_sub : 'a vector * int -> 'a
    subscript from polymorphic vector without bounds checking (P.SUBSCRIPTV)

  • vec_sub : 'a vector * int -> 'a
    subscript from polymorphic vector (P.INLSUBSCRIPTV)

  • arr_unsafe_update : 'a array * int * 'a -> unit
    update a polymorphic array without bounds checking (P.UPDATE)

  • arr_update : 'a array * int * 'a -> unit
    update a polymorphic array (P.INLUPDATE)

  • arr_unboxed_update : 'a array * int * 'a -> unit
    update a polymorphic array with an unboxed value, which means that there is no store-list entry created for the update (P.UNBOXEDUPDATE)
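
The relationship between the checked and unchecked subscript operations can be modelled as an index test followed by an unchecked access. In the sketch below, checkedSub is a hypothetical function written for illustration, and Unsafe.Array.sub is assumed to be the unchecked subscript that SML/NJ exposes in its Unsafe structure. The code the compiler generates for P.INLSUBSCRIPT may differ.

```sml
(* Model of a checked subscript: test the index, then do the unchecked access. *)
fun checkedSub (a, i) =
      if i < 0 orelse i >= Array.length a
        then raise Subscript
        else Unsafe.Array.sub (a, i)

val a = Array.fromList [10, 20, 30]
val x = checkedSub (a, 2)                                  (* last element *)
val oob = (checkedSub (a, 3); false) handle Subscript => true
```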

Sequence operations

Sequence values (e.g., string, 'a array, RealVector.vector, etc.) are represented by a header consisting of a length (in elements) and a data pointer to the raw sequence data.

  • newArray0 : unit -> 'a

  • seq_length : 'a -> int
    get the length field from a sequence header (P.LENGTH)

  • seq_data : 'a -> 'b
    get the data pointer from a sequence header (P.GET_SEQ_DATA)

  • unsafe_record_sub : 'a * int -> 'b

  • raw64Sub : 'a * int -> real64
    Unclear what purpose this primop serves (P.SUBSCRIPT_RAW64)
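
The header layout described at the start of this section can be pictured with a small model. This is purely illustrative: real sequence headers are runtime objects, not SML records, and the names seq_header, seqLength, and seqData are hypothetical.

```sml
(* Hypothetical model: a header pairs a length (in elements) with the data. *)
type 'a seq_header = {length : int, data : 'a}

fun seqLength ({length, ...} : 'a seq_header) = length
fun seqData ({data, ...} : 'a seq_header) = data

(* A three-character "string" modelled as a header over a list of chars. *)
val s : char list seq_header = {length = 3, data = [#"a", #"b", #"c"]}
val n = seqLength s
val d = seqData s
```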

Numeric primops

Default tagged integer operations

These are the primitive operations on the default tagged integer type (Int.int).

  • int_add : int * int -> int
    Signed integer addition with overflow checking. P.ARITH{oper=P.ADD, overflow=true, kind=P.INT <int-size>}

  • int_unsafe_add : int * int -> int
    Signed integer addition without overflow checking. P.ARITH{oper=P.ADD, overflow=false, kind=P.INT <int-size>}

  • int_sub : int * int -> int
    Signed integer subtraction with overflow checking. P.ARITH{oper=P.SUB, overflow=true, kind=P.INT <int-size>}

  • int_unsafe_sub : int * int -> int
    Signed integer subtraction without overflow checking. P.ARITH{oper=P.SUB, overflow=false, kind=P.INT <int-size>}

  • int_mul : int * int -> int
    P.ARITH{oper=P.MUL, overflow=true, kind=P.INT <int-size>}

  • int_div : int * int -> int
    P.ARITH{oper=P.DIV, overflow=true, kind=P.INT <int-size>}

  • int_mod : int * int -> int
    P.ARITH{oper=P.MOD, overflow=true, kind=P.INT <int-size>}

  • int_quot : int * int -> int
    P.ARITH{oper=P.QUOT, overflow=true, kind=P.INT <int-size>}

  • int_rem : int * int -> int
    P.ARITH{oper=P.REM, overflow=true, kind=P.INT <int-size>}

  • int_orb : int * int -> int
    P.ARITH{oper=P.ORB, overflow=false, kind=P.INT <int-size>}

  • int_xorb : int * int -> int
    P.ARITH{oper=P.XORB, overflow=false, kind=P.INT <int-size>}

  • int_andb : int * int -> int
    P.ARITH{oper=P.ANDB, overflow=false, kind=P.INT <int-size>}

  • int_neg : int -> int
    P.ARITH{oper=P.NEG, overflow=true, kind=P.INT <int-size>}

  • int_raw_rshift : int * word -> int
    P.ARITH{oper=P.RSHIFT, overflow=false, kind=P.INT <int-size>}

  • int_raw_lshift : int * word -> int
    P.ARITH{oper=P.LSHIFT, overflow=false, kind=P.INT <int-size>}

  • int_gt : int * int -> bool
    P.CMP{oper=P.GT, kind=P.INT <int-size>}

  • int_ge : int * int -> bool
    P.CMP{oper=P.GTE, kind=P.INT <int-size>}

  • int_lt : int * int -> bool
    P.CMP{oper=P.LT, kind=P.INT <int-size>}

  • int_le : int * int -> bool
    P.CMP{oper=P.LTE, kind=P.INT <int-size>}

  • int_eql : int * int -> bool
    P.CMP{oper=P.EQL, kind=P.INT <int-size>}

  • int_neq : int * int -> bool
    P.CMP{oper=P.NEQ, kind=P.INT <int-size>}

  • int_min : int * int -> int
    P.INLMIN (P.INT <int-size>)

  • int_max : int * int -> int
    P.INLMAX (P.INT <int-size>)

  • int_abs : int -> int
    P.INLABS (P.INT <int-size>)
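
The difference between the div/mod and quot/rem pairs, and the effect of overflow checking, can be observed through the Basis Library. The sketch below uses only standard functions and operators (div, mod, Int.quot, Int.rem, Int.maxInt). It illustrates the source-level semantics rather than the primops themselves.

```sml
(* div/mod round toward negative infinity; quot/rem round toward zero. *)
val d = ~7 div 2            (* ~4 *)
val m = ~7 mod 2            (* 1 *)
val q = Int.quot (~7, 2)    (* ~3 *)
val r = Int.rem (~7, 2)     (* ~1 *)

(* Checked addition raises Overflow instead of wrapping around. *)
val overflowed =
      (case Int.maxInt
         of SOME mx => (mx + 1; false)
          | NONE => false)      (* unbounded int: no overflow possible *)
      handle Overflow => true
```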

Default tagged word operations

These are the primitive operations on the default tagged word type (Word.word).

  • word_mul : word * word -> word
    P.ARITH{oper=P.MUL, overflow=false, kind=P.UINT <int-size>}

  • word_div : word * word -> word
    P.ARITH{oper=P.QUOT, overflow=false, kind=P.UINT <int-size>}

  • word_mod : word * word -> word
    P.ARITH{oper=P.REM, overflow=false, kind=P.UINT <int-size>}

  • word_add : word * word -> word
    P.ARITH{oper=P.ADD, overflow=false, kind=P.UINT <int-size>}

  • word_sub : word * word -> word
    P.ARITH{oper=P.SUB, overflow=false, kind=P.UINT <int-size>}

  • word_orb : word * word -> word
    P.ARITH{oper=P.ORB, overflow=false, kind=P.UINT <int-size>}

  • word_xorb : word * word -> word
    P.ARITH{oper=P.XORB, overflow=false, kind=P.UINT <int-size>}

  • word_andb : word * word -> word
    P.ARITH{oper=P.ANDB, overflow=false, kind=P.UINT <int-size>}

  • word_notb : word -> word
    P.ARITH{oper=P.NOTB, overflow=false, kind=P.UINT <int-size>}

  • word_neg : word -> word
    P.ARITH{oper=P.NEG, overflow=false, kind=P.UINT <int-size>}

  • word_rshift : word * word -> word
    P.INLRSHIFT(P.UINT <int-size>)

  • word_rshiftl : word * word -> word
    P.INLRSHIFTL(P.UINT <int-size>)

  • word_lshift : word * word -> word
    P.INLLSHIFT(P.UINT <int-size>)

  • word_gt : word * word -> bool
    P.CMP{oper=P.GT, kind=P.UINT <int-size>}

  • word_ge : word * word -> bool
    P.CMP{oper=P.GTE, kind=P.UINT <int-size>}

  • word_lt : word * word -> bool
    P.CMP{oper=P.LT, kind=P.UINT <int-size>}

  • word_le : word * word -> bool
    P.CMP{oper=P.LTE, kind=P.UINT <int-size>}

  • word_eql : word * word -> bool
    P.CMP{oper=P.EQL, kind=P.UINT <int-size>}

  • word_neq : word * word -> bool
    P.CMP{oper=P.NEQ, kind=P.UINT <int-size>}

  • word_raw_rshift : word * word -> word
    P.ARITH{oper=P.RSHIFT, overflow=false, kind=P.UINT <int-size>}

  • word_raw_rshiftl : word * word -> word
    P.ARITH{oper=P.RSHIFTL, overflow=false, kind=P.UINT <int-size>}

  • word_raw_lshift : word * word -> word
    P.ARITH{oper=P.LSHIFT, overflow=false, kind=P.UINT <int-size>}

  • word_min : word * word -> word
    P.INLMIN (P.UINT <int-size>)

  • word_max : word * word -> word
    P.INLMAX (P.UINT <int-size>)
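
Word arithmetic never raises Overflow, and word comparisons are unsigned. The sketch below shows both properties using only Basis Library operations on Word.word. It is an illustration of the source-level behaviour, not of the primop encodings.

```sml
val allOnes = 0w0 - 0w1                (* wraps around; no Overflow is raised *)
val wraps   = (allOnes + 0w1 = 0w0)

(* Comparison is unsigned, so the all-ones word is the largest value. *)
val isMax   = Word.> (allOnes, 0w1)

(* The same bit pattern, read as a signed integer, is ~1. *)
val asInt   = Word.toIntX allOnes
```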

8-bit word operations

32-bit integer operations

  • int32_add : int32 * int32 -> int32
    P.ARITH{oper=P.ADD, overflow=true, kind=P.INT 32}

  • int32_sub : int32 * int32 -> int32
    P.ARITH{oper=P.SUB, overflow=true, kind=P.INT 32}

  • int32_mul : int32 * int32 -> int32
    P.ARITH{oper=P.MUL, overflow=true, kind=P.INT 32}

  • int32_div : int32 * int32 -> int32
    P.ARITH{oper=P.DIV, overflow=true, kind=P.INT 32}

  • int32_mod : int32 * int32 -> int32
    P.ARITH{oper=P.MOD, overflow=true, kind=P.INT 32}

  • int32_quot : int32 * int32 -> int32
    P.ARITH{oper=P.QUOT, overflow=true, kind=P.INT 32}

  • int32_rem : int32 * int32 -> int32
    P.ARITH{oper=P.REM, overflow=true, kind=P.INT 32}

  • int32_orb : int32 * int32 -> int32
    P.ARITH{oper=P.ORB, overflow=false, kind=P.INT 32}

  • int32_xorb : int32 * int32 -> int32
    P.ARITH{oper=P.XORB, overflow=false, kind=P.INT 32}

  • int32_andb : int32 * int32 -> int32
    P.ARITH{oper=P.ANDB, overflow=false, kind=P.INT 32}

  • int32_neg : int32 -> int32
    P.ARITH{oper=P.NEG, overflow=true, kind=P.INT 32}

  • int32_raw_rshift : int32 * word -> int32
    P.ARITH{oper=P.RSHIFT, overflow=false, kind=P.INT 32}

  • int32_raw_lshift : int32 * word -> int32
    P.ARITH{oper=P.LSHIFT, overflow=false, kind=P.INT 32}

  • int32_gt : int32 * int32 -> bool
    P.CMP{oper=P.GT, kind=P.INT 32}

  • int32_ge : int32 * int32 -> bool
    P.CMP{oper=P.GTE, kind=P.INT 32}

  • int32_lt : int32 * int32 -> bool
    P.CMP{oper=P.LT, kind=P.INT 32}

  • int32_le : int32 * int32 -> bool
    P.CMP{oper=P.LTE, kind=P.INT 32}

  • int32_eql : int32 * int32 -> bool
    P.CMP{oper=P.EQL, kind=P.INT 32}

  • int32_neq : int32 * int32 -> bool
    P.CMP{oper=P.NEQ, kind=P.INT 32}

  • int32_min : int32 * int32 -> int32
    P.INLMIN (P.INT 32)

  • int32_max : int32 * int32 -> int32
    P.INLMAX (P.INT 32)

  • int32_abs : int32 -> int32
    P.INLABS (P.INT 32)

32-bit word operations

64-bit integer operations

  • int64_add : int64 * int64 -> int64
    P.ARITH{oper=P.ADD, overflow=true, kind=P.INT 64}

  • int64_sub : int64 * int64 -> int64
    P.ARITH{oper=P.SUB, overflow=true, kind=P.INT 64}

  • int64_mul : int64 * int64 -> int64
    P.ARITH{oper=P.MUL, overflow=true, kind=P.INT 64}

  • int64_div : int64 * int64 -> int64
    P.ARITH{oper=P.DIV, overflow=true, kind=P.INT 64}

  • int64_mod : int64 * int64 -> int64
    P.ARITH{oper=P.MOD, overflow=true, kind=P.INT 64}

  • int64_quot : int64 * int64 -> int64
    P.ARITH{oper=P.QUOT, overflow=true, kind=P.INT 64}

  • int64_rem : int64 * int64 -> int64
    P.ARITH{oper=P.REM, overflow=true, kind=P.INT 64}

  • int64_orb : int64 * int64 -> int64
    P.ARITH{oper=P.ORB, overflow=false, kind=P.INT 64}

  • int64_xorb : int64 * int64 -> int64
    P.ARITH{oper=P.XORB, overflow=false, kind=P.INT 64}

  • int64_andb : int64 * int64 -> int64
    P.ARITH{oper=P.ANDB, overflow=false, kind=P.INT 64}

  • int64_neg : int64 -> int64
    P.ARITH{oper=P.NEG, overflow=true, kind=P.INT 64}

  • int64_raw_rshift : int64 * word -> int64
    P.ARITH{oper=P.RSHIFT, overflow=false, kind=P.INT 64}

  • int64_raw_lshift : int64 * word -> int64
    P.ARITH{oper=P.LSHIFT, overflow=false, kind=P.INT 64}

  • int64_gt : int64 * int64 -> bool
    P.CMP{oper=P.GT, kind=P.INT 64}

  • int64_ge : int64 * int64 -> bool
    P.CMP{oper=P.GTE, kind=P.INT 64}

  • int64_lt : int64 * int64 -> bool
    P.CMP{oper=P.LT, kind=P.INT 64}

  • int64_le : int64 * int64 -> bool
    P.CMP{oper=P.LTE, kind=P.INT 64}

  • int64_eql : int64 * int64 -> bool
    P.CMP{oper=P.EQL, kind=P.INT 64}

  • int64_neq : int64 * int64 -> bool
    P.CMP{oper=P.NEQ, kind=P.INT 64}

  • int64_min : int64 * int64 -> int64
    P.INLMIN (P.INT 64)

  • int64_max : int64 * int64 -> int64
    P.INLMAX (P.INT 64)

  • int64_abs : int64 -> int64
    P.INLABS (P.INT 64)
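
Negation and absolute value are overflow-checked because the most negative two's-complement value has no positive counterpart. The sketch below demonstrates this at the source level with the Basis Library Int64 structure, assuming Int64.minInt is SOME, as it is for a fixed-size integer type.

```sml
val minI = valOf Int64.minInt

(* Negating or taking the absolute value of the most negative value overflows. *)
val negOverflows = (Int64.~ minI; false) handle Overflow => true
val absOverflows = (Int64.abs minI; false) handle Overflow => true

(* min and max never overflow. *)
val lo = Int64.min (minI, 0)
val hi = Int64.max (minI, 0)
```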

64-bit word operations

64-bit real operations
