This specification is the standard guideline for implementing and understanding sushiscript. It is currently under construction and reflects the progress of the implementation.
<identifier> = <ident head> <ident tail>*
<ident head> = <letter> | '_'
<ident tail> = <digit> | <ident head>
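The identifier rule above can be checked with a single regular expression. The following is a minimal bash sketch, assuming `<letter>` means ASCII letters and `<digit>` means `0`-`9` (neither is defined in this part of the grammar); `is_identifier` is a hypothetical helper name, not part of sushiscript.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of sushiscript.
# Assumption: <letter> is [A-Za-z] and <digit> is [0-9].
is_identifier() {
  local re='^[A-Za-z_][A-Za-z0-9_]*$'
  [[ $1 =~ $re ]]
}

# Classify a few sample words against <identifier>.
for word in foo _tmp1 1abc "a-b"; do
  if is_identifier "$word"; then
    echo "$word: identifier"
  else
    echo "$word: not an identifier"
  fi
done
```

Keywords such as `define` also match this pattern, so a lexer built this way would still need to check the keyword list separately.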
The following identifiers are keywords:
or not and
define return export
if else
switch case default
for in break continue
redirect from to append here
The following punctuation marks are meaningful:
+ - * // %
> < >= <= == !=
, = : ;
[ ] { } ( )
! $ " ' #
<built-in type>
= "()" | "Bool" | "Char" | "Int" | "String" | "Function"
| "Path" | "RelPath" | "Array" | "Map" | "ExitCode" | "FD"
<line break> = '\n'
<whitespace> = ' ' | '\t'
Special characters inside a string-like token can be expressed by escaping: put a `\` before a character to escape it. The set of meaningful escaped characters differs between contexts (specified below). Escaping a character that has no special meaning in the current context preserves the original character. In some contexts, certain characters must be escaped to avoid lexical ambiguity.
TODO: the list of string escape characters may be incomplete
<tradition special> =
'\' ('a' | 'b' | 'f' | 'n' | 'r' | 't' | 'v' | '\')
The lexer's raw mode is turned on in special contexts (typically when [invoking shell command](#3.1.4.2 Invoking Command)), where dropping the usual lexical restrictions conforms better to shell tradition and is more intuitive and convenient for users.
<raw token> = <raw char>*
<raw char> = [^<raw restrict>] | <raw escape>
<raw restrict> = ' ' | ';' | '"' | "'" | '\' | '|' | ',' | ')' | ':' | '}'
<raw escape> = '\' (<raw restrict> | <interpolate char>)
<string char> = [^<string restrict>] | <string escape>
<string escape>
= '\' ( <string restrict> | <interpolate char>)
| <tradition special>
<string restrict> = '"'
<char char> = [^<char restrict>] | <tradition special>
<char restrict> = "'"
<bool literal> = "true" | "false"
<unit literal> = "()"
<fd literal> = "stdin" | "stdout" | "stderr"
TODO: integer literal of various radix support
<integer> = ('+' | '-')? <digit>+
<char literal> = "'" <char char> "'"
<string literal> = '"' ( <string char> | <interpolation> ) * '"'
<path literal> = "~" ('/' <path tail>)? | '/' <path tail>?
<relpath literal> = '.'+ ('/' <path tail>)?
<path tail> = (<raw char> | <interpolation>)*
Interpolation is a lexical construct that embeds an expression in certain contexts (typically string interpolation).
<interpolation> = "${" ... "}"
Interpolation can be recognized recursively. e.g.
${ ... "${ ... }" ... }
"${ ... "${ ... }" ... }"
Interpolation can appear in various contexts (such as string literals). The lexer yields control tokens to interact with the parser, so that the parser knows when to:
- Enter an interpolation context
- Extract a "normal" string segment
- Start parsing an interpolated expression
- Exit the interpolation context

Each context has a unique indicator for entering its interpolation context, so the parser can start parsing the corresponding literal.
Depending on the character configuration of the context, the lexer tries to extract as many consecutive normal characters as possible to form a segment for the interpolation context.
In all interpolation contexts, the character sequence `${` is recognized as the start of an interpolated expression. The lexer then enters the normal context to recognize the tokens of the following expression. When the parser recognizes the `}` matching the `${`, the interpolation context is resumed.
Depending on the character configuration of the context, the lexer yields a control token to indicate the end of the interpolation context, typically when a restricted character of the context is found, such as `"` for a string literal or a space for a path literal.
For example:
"hello${a + b}!"
- The first `"` is recognized as the start of a string literal.
- The segment `hello` is recognized.
- `${` is recognized as the start of an interpolated expression.
- The normal context of the lexer then recognizes `identifier: a`, `+`, `identifier: b` and `}`.
- The `}` is considered the end of the interpolated expression. The parser tells the lexer to resume the interpolation context.
- The segment `!` is recognized.
- The final `"` is recognized as the end of the string literal and a control token is yielded.
When a hash sign `#` is recognized, all following characters on the same line are regarded as a comment and discarded by the lexer.
<comment> = '#' <any character>* $
An indent of length N is N consecutive spaces at the start of a line; tab characters (`\t`) are not allowed.
<indent N> = ^[ ]{N}
When the last character of a line is `\`, the backslash, the following line break and the indentation of the next line are ignored. If the current line is otherwise empty, only the indentation of the next line counts.
<literal>
= <int literal> | <bool literal>
| <unit literal> | <fd literal>
| <char literal> | <string literal>
| <path literal> | <relpath literal>
| <array literal> | <map literal>
<array literal> = '{' <array items>? '}'
<array items> = <expression> (',' <expression>)*
<map literal> = '{' <map items>? '}'
<map items> = <map item> (',' <map item>)*
<map item> = <expression> ':' <expression>
<atom expr> = <literal> | <identifier> | <paren expr> | <atom expr> <index>
<paren expr> = '(' <expression> ')'
<index> = '[' <expression> ']'
<binop expr> = <expression> <binary op> <expression>
<binary op>
= '+' | '-' | '*' | '//' | '%'
| '>' | '<' | '>=' | '<=' | '==' | '!='
| "or" | "and"
| Operator | Associativity | Precedence |
|---|---|---|
| `or` `and` | left | 0 |
| `==` `!=` | left | 1 |
| `>` `<` `>=` `<=` | left | 2 |
| `+` `-` | left | 3 |
| `*` `//` `%` | left | 4 |
<unop expr> = <unary op> <atom expr>
<unary op> = '+' | '-' | "not"
<function call> = <identifier> <atom expr>+
<command> = '!' <command param>* <command end>
<command param>
= <raw token>
| <string literal>
| <char literal>
<command end> = <line break> | ','
<redirection> = "redirect" <redirect item> (',' <redirect item>)*
<redirect item> = <fd literal>? <redirect action>
<redirect action>
= "to" (<expression> "append"? | "here")
| "from" <expression>
<procedure call> = ( <function call> | <command> ) <redirection>? <pipeline>?
<pipeline> = '|' <procedure call>
<expression>
= <procedure call>
| <atom expr>
| <binop expr>
| <unop expr>
| <indexing>
<type> = <simple type> | <array type> | <map type> | <function type>
<simple type>
= "Int" | "Bool" | "String"
| "Path" | "()" | "FD" | "ExitCode"
<array type> = "Array" <simple type>
<map type> = "Map" <simple type> <simple type>
<function type> = "Function" <atom type> <atom type>*
<atom type> = <simple type> | '(' <type> ')'
When we say "`<statement>` introduces a new block on `<program>`" and `<program>` starts on a different line from `<statement>`, the indentation of every token in `<program>` must be greater than the indentation of the line containing the starting token of `<statement>`. That is:
<indent N> <statement> ( <program> | <line break>
<indent (N+x)> <program> )
All statements within the same innermost block must have the same level of indentation.
If there is no line break before `<program>`, the indentation level of that block is the column number of the first token of that `<program>` minus 1.
<definition> = <define var> | <define func>
<define var> =
"export"? "define" <identifier> ( ':' <type> )? '=' <expression>
<define func> =
"export"? "define" <identifier> '(' <param list>? ')' (':' <type>)? '=' <program>
<param list> = <param dec> (',' <param dec>)*
<param dec> = <identifier> ':' <type>
`<define func>` introduces a new block on `<program>`.
<assignment> = <expression> '=' <expression>
`return` returns from the control flow of the enclosing function. The expression may be omitted.
<return> = "return" <expression>?
<if> = "if" <expression> (':' | <line break>) <program> <else>?
<else> = "else" ':'? <program>
`<if>` and `<else>` both introduce a new block on their corresponding `<program>`.
An `if` and its matching `else` must be at the same level of indentation. If two or more unmatched `if`s are at the same indentation level, the next `else` matches the closest `if`.
<switch> = "switch" <expression> <case>+ <default>?
<case> = "case" <expression> (':' | <line break>) <program>
<default> = "default" ':'? <program>
`<case>` and `<default>` both introduce a new block on their corresponding `<program>`.
The indentation of each `<case>` must be equal to or greater than that of its `<switch>`, and all `<case>`s must have identical indentation.
<for> = "for" <loop condition> ':' <loop body>
<loop condition>
= <identifier> "in" <expression>
| <expression>
<loop body> = ( <statement> | <break> | <continue> )*
`<for>` introduces a new block on `<loop body>`.
<break> = "break" <integer>?
<continue> = "continue" <integer>?
<statement>
= <expression>
| <definition>
| <assignment>
| <if>
| <switch>
| <for>
<program> = <statement>*
The type system provides abstractions over many operating system concepts, such as a "Path" in the filesystem or the exit status of a process. It distinguishes these concepts from one another at the language level, even though their low-level representations may be the same.
sushiscript is designed to be a statically and strongly typed language. "Static" means that type checking is performed at compile time by the type checker. "Strong" means that the type checker attempts few implicit conversions when the actual type of an expression differs from the expected type (there are some, though).
The type system is also closely connected to the syntax structure: every expression has an unambiguous type that depends on the context. Most `<expression>`s appearing in the syntax structure mean "an expression of certain types"; expressions whose type is not among those types are rejected even if they are syntactically valid. The type requirements often distinguish different semantics of the same syntax structure.
We use monospaced strings enclosed in angle brackets to represent expandable syntactic rules. `<E>` usually denotes some concrete `<expression>` and `<S>` some concrete `<statement>`.
A colon inside the angle brackets, as in `<E : T>`, means that expression `<E>` has type `T`. It can be useful to explicitly assign a type to an expression this way.
Simple types (or atom types) are types that do not contain other types.
A parameterized type (as its name suggests) can be provided with type parameters to form different types with different parameters.
An instantiation of a type parameter is the process in which the type parameter is replaced by a specific type, once the actual type of the expression is known from the context.
An expression `<E>` is a direct sub-expression of a statement or an expression `<R>` if it is the first `<expression>` encountered on some path while walking down the parse tree rooted at `<R>`.
Conversely, if `<E'>` is a sub-expression of `<E>`, then `<E>` is a super-expression of `<E'>`.
Type `T` is _implicitly convertible_ to type `T'` when there is a built-in or user-provided way to transform an object of type `T` into an object of type `T'`. Intuitively, the conversion can be done implicitly, with no additional code written.
A well-typed expression is defined recursively as follows:
- A literal of simple type is a well-typed expression.
- A defined identifier is a well-typed expression.
- Any other expression is well typed if all of its direct sub-expressions are well typed and their types together satisfy one of the typing rules (explained later) of that composite expression.
A composite expression `<E : T>` or a statement `<S>` may explicitly require, in one of its typing rules R, that a direct sub-expression `<E'>` has one of the types `T1', T2', ...`. These together form a type requirement, and `T1', T2', ...` are the types required by `<E : T>` or `<S>` on `<E'>` in that requirement.
Sometimes the types required by `<E>` or `<S>` on `<E'>` also depend on the types of other direct sub-expressions of `<E>` or `<S>`, due to the typing rule.
Let `<E'>` be a direct sub-expression of `<E : T>` or `<S>` with actual type `T'`. An implicit type conversion (or coercion) happens when `T'` is not one of the required types in the corresponding type requirement and `T'` is implicitly convertible to exactly one type `T''` among the required types. We then say that `<E' : T'>` successfully implicitly converts to `T''` in the requirement.
If `T'` is implicitly convertible to more than one of the required types, the conversion is considered ambiguous and the implicit conversion fails.
We say `<E' : T'>` satisfies the requirement by `<E : T>` or `<S>` on `<E'>` if one of the following is true:
- `T'` is the same type as one of the required types by `<E>` or `<S>` on `<E'>`.
- `<E' : T'>` successfully implicitly converts to one of the required types `T''` in the requirement.
- ~~There's only one type in the required types, and `T'` unifies with that type.~~
A type-correct program (i.e. a program that passes type checking) is one in which the type requirements recursively generated by all statements and composite expressions are satisfied by their corresponding direct sub-expressions.
Typing rules, as mentioned above, are the most important source of type requirements. They usually come together with the semantics of the program, so the two are introduced together.
We use a notation closely related to the one used for the syntax structure:
<expression : T> = <[optional-name :] T1> other tokens <T2>
# or
<statement> = <[optional-name :] T1> other tokens <T2 | T3>
The items enclosed in angle brackets are direct sub-expressions of the expression or statement, each with an optional name and a type `T1, T2, T3, ...`. These may be specific types or type parameters; either way, types with the same name are the same type within a rule. Other tokens are no longer enclosed in double or single quotes, for clarity.
The part to the left of the equal sign (`=`) is called the head of the rule; the part to the right is the body of the rule.
Type deduction is performed mostly bottom-up and always left-to-right with respect to the abstract syntax tree. That is, the actual type of a super-expression is obtained from the types of its sub-expressions and the applicable typing rule.
Whenever a type parameter in a rule is successfully deduced as an actual type `T`, all other type parameters with the same name (in both the head and the body) are instantiated with `T`.
A specific type `T'` of `<E' : T'>` in the body, whenever it is instantiated, is added to the types required by `<E : T>` on `<E'>`. For example:
<E : T> = <E' : T'> <E'' : T'>
When `<E'>` is deduced with actual type `TA1`, it instantiates `T'` to `TA1` in `<E''>` and adds `TA1` to the required types of `<E''>`. Then, when `<E''>` is deduced with `TA2`, `<E'' : TA2>` is checked against the satisfaction condition of the requirement with required type `TA1`.
- `Unit`: "Trivial" type with only one value (`unit`)
- `Bool`: Boolean type with only two values (`true` and `false`)
- `Char`: Character. todo: specify the range of Char
- `Int`: Signed integer. todo: specify the range of Int
- `String`: Character sequence
- `Path`: Absolute path that may or may not point to a valid file
- `RelPath`: Relative path that may or may not point to a valid file
- `ExitCode`: Exit status of a process or command
- `FD`: File descriptor, the logical handle of an opened file
- `Array [Element]`: Linear sequence of values of simple type `[Element]`.
- `Map [Key] [Value]`: Associative structure that maps values of simple type `[Key]` to values of simple type `[Value]`.
- `Function [Ret] [Param1] [Param2] ...`: Function with parameters of types `[Param1], [Param2], ...` and return type `[Ret]`.
- `ExitCode` to `Bool`: a non-zero `ExitCode` converts to `false` (exit with error), a zero `ExitCode` to `true` (success).
- `ExitCode` to `Int`: converts the exit status code to the integer of the same value.
- `RelPath` to `Path`: the relative path is applied to the current working directory.
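As a rough illustration of the two `ExitCode` conversions, here is a bash sketch. It assumes the 1/0 encoding of `true`/`false` that the translation part of this document uses for Bool right values; the function names are hypothetical and not part of sushiscript.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the implicit conversions.
# Assumption: Bool is encoded as 1 (true) / 0 (false).
exit_code_to_bool() {
  if (( $1 == 0 )); then echo 1; else echo 0; fi
}

# ExitCode to Int keeps the numeric value unchanged.
exit_code_to_int() {
  echo "$(( $1 ))"
}

echo "$(exit_code_to_bool 0)"   # zero exit status means success
echo "$(exit_code_to_bool 2)"   # non-zero exit status means error
echo "$(exit_code_to_int 2)"
```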
Not (planned to) implement yet
<array literal : Array T> = {<T>, <T>, ...} | {}
An array literal is of type `Array T` when all of its elements are well typed and of the same type `T`.
<map literal : Map K V> = {<K> : <V>, <K> : <V>, ...} | {}
A map literal is of type `Map K V` when all of its keys and values are well typed, all keys are of the same type `K`, and all values are of the same type `V`.
Empty array and map literals satisfy every requirement whose required types contain an array or map type with any combination of type parameters.
<abs : Int> = + <Int>
`+` as a unary operator turns an integer into its absolute value.
<negate : Int> = - <Int>
`-` as a unary operator turns an integer into its negative counterpart.
<logical negate : Bool> = not <Bool>
`not` negates a boolean value.
<int plus : Int> = <Int> + <Int>
<string concat : String> = <String> + <String>
<array concat : Array T> = <Array T> + <Array T>
<map merge : Map K V> = <Map K V> + <Map K V>
- `<int plus>`: integer addition.
- `<string concat>`: concatenates two strings, producing a new string.
- `<array concat>`: concatenates two arrays of the same element type, producing a new array.
- `<map merge>`: combines two maps; keys on the right overwrite the values of the same keys on the left.

<int minus : Int> = <Int> - <Int>
- `<int minus>`: integer subtraction.

<int mult : Int> = <Int> * <Int>
<string dup : String> = <String> * <Int>
- `<int mult>`: integer multiplication.
- `<string dup>`: duplicates the string multiple times.

<int div : Int> = <Int> // <Int>
<path join1 : Path> = <RelPath> // <RelPath>
<path join2 : Path> = <Path> // <RelPath>
- `<int div>`: integer division.
- `<path join>`: applies the right-hand relative path to the left-hand path (absolute or relative).

<int mod : Int> = <Int> % <Int>
- `<int mod>`: modulo operation.
<equal compare : Bool> = <EqualComparable> (== | !=) <EqualComparable>
The following types are allowed to instantiate `EqualComparable`:
- `Unit`: two units are always equal
- `Bool`: boolean equality (`true == true, false == false`)
- `Char`: character equality
- `Int`: integer equality
- `String`: case-sensitive string equality
- `Path`: determines whether two paths point to the same filesystem location
- `ExitCode`: just integer equality
- `FD`: just integer equality
- `Array EqualComparable`: element-wise array equality; two arrays are equal if all elements at the same index of the two arrays are equal.

<order compare : Bool> = <OrderComparable> (> | <) <OrderComparable>
The following types are allowed to instantiate `OrderComparable`:
- `Char`: ASCII (Unicode) value comparison
- `Int`: integer comparison
- `String`: lexical order comparison

<equal order compare : Bool> = <EqualOrderComparable> (>= | <=) <EqualOrderComparable>
Types that can instantiate both `EqualComparable` and `OrderComparable` can instantiate `EqualOrderComparable`.
- `a <= b` is always equivalent to `a < b or a == b`
- `a >= b` is always equivalent to `a > b or a == b`

<logical operation : Bool> = <Bool> (or | and) <Bool>
- `a or b == false` if and only if `a == false and b == false`
- `a and b == true` if and only if `a == true and b == true`
<array index : T> = <Array T> [ <Int> ]
<map index : V> = <Map K V> [ <K> ]
<string index : Char> = <String> [ <Int> ]
- `<array index>`: array indexing; the index counts from `0`
- `<map index>`: retrieves the mapped value by the key
- `<string index>`: string indexing; the index counts from `0`

todo: what happens when out of range?
<function call : Ret> = <Function Ret P1 P2 ...> <P1> <P2> ...
| <Function Ret> ()
- `<call>`: calls a defined function and retrieves its return value
<redirect to here : String> = (<function call : Ret> | command) redirect to here
<redirect output> = redirect to <RelPath | Path | FD>
<redirect input> = redirect from <RelPath | Path>
- `<redirect to here>`: if the redirected command is a function call, the original return value of the function is discarded (todo: to be discussed). The value of the whole command expression is the output of the command as a string.
- `<redirect output>`: redirects the output to the file pointed to by `Path`, or to another file descriptor.
- `<redirect input>`: redirects the content of the file pointed to by `Path` to the input file descriptor.
<variable def> = define ident : type = <expr : T>
- If the type of `ident` is explicitly stated, the statement generates a type requirement on `<expr>` whose required type is the one denoted by `type`, say `T'`, and makes `ident` a well-typed expression of type `T'` in the rest of the program.
- Otherwise, if `expr` is a well-typed expression, `ident` is a well-typed expression of type `T` in the rest of the program.
<ident assign> = <ident : T> = <expr : T>
<array index set> = <ident : Array T> [ <Int> ] = <T>
<map value set> = <ident : Map K V> [ <K> ] = <V>
- `<ident assign>`: assigns the value of `<expr>` to an identifier of type `T`
- `<array index set>`: sets the value of an array at an integer index
- `<map value set>`: sets the value of a specific key, overwriting the previous value if one was set before.
<function def> = define func (p1 : type1, p2 : type2, ...) : type = ...
<return> = return <expr : T>
- If there is no function parameter, the default function parameter type is `()`.
- If the return type of `func` is explicitly stated, the definition statement generates type requirements on all the `return` statements inside the body of the function, with the type denoted by the type expression `type`. If all requirements are satisfied, `func` is a well-typed expression in the rest of the program.
- Otherwise, the first `return` statement with no recursive call to `func` (a call to `func` is not a sub-expression of `expr`) instantiates the type parameter `T` and generates type requirements on the expressions of the other return statements accordingly. Only if (a) such a `return` statement is found and (b) all type requirements generated by this `return` statement are satisfied does `func` become a well-typed expression in the rest of the program.
- If `<expr>` in the return statement is omitted, `<expr>` is `unit` by default.
<if> = if <Bool>: ...
else if <Bool>: ...
The type of the condition expression of an `if` statement is required to be `Bool`.
<switch> = switch <T>
case <T>: ...
The required types of the expression following `case` are:
- `T`: requires `T` to also be `EqualComparable` (see `==`); if the two values are equal according to the equality comparison, the branch is chosen.
- `Function Bool T`: calls the function with the switching value; if the function returns `true`, this branch is chosen.
<normal for> = for <Bool>: ...
<range for> = for ident in <Array T>: ...
- `<normal for>`: checks the loop condition before every iteration; if the expression evaluates to `true`, the loop body is entered, otherwise the loop body is skipped.
- `<range for>`: requires the range expression to have type `Array T`; `ident` becomes a well-typed expression of type `T` inside the loop body.
Unlike bash, statements like `if` / `for` have scope in sushi. To implement this:
- Sushi records all the definitions in a scope, and when the scope exits, sushi `unset`s all of those variables.

      if true:
        define a : Int = 1
        define b = "test"
      -->
      if [[ 1 -ne 0 ]]; then
        local a=$((1))
        local b="test"
        unset a
        unset b
      fi

- If identifiers have the same name, sushi translates them into different names by adding the suffix `_scope_<num>` (`<num>` is a number).

      define a = 1
      if true:
        define a = 2
        define b = "test"
      -->
      local a=$((1))
      if [[ 1 != 0 ]]; then
        local a_scope_1=$((2))
        local b="test"
        unset a_scope_1
        unset b
      fi
Temporary variables are named like `_sushi_t_<num>_`. They are maintained by sushi.
- Identifiers with the prefix `_sushi_` are usually variables maintained by sushi. Using such identifiers directly is not recommended.
- Variables
  - `_sushi_unit_`: A special variable representing `Unit` in sushi. It has a read-only default value `0`.
  - `_sushi_func_ret_`: A variable that temporarily stores a function's return value.
- Functions (here "return" means `echo ...`)
  - `_sushi_abs_`: Get the absolute value of the parameter.
    - Parameters: 1 parameter, the number whose absolute value is taken.
    - Return: The absolute value of the parameter.
  - `_sushi_dup_str_`: Duplicate a string a given number of times.
    - Parameters: 2 parameters. The 1st is the string to duplicate, the 2nd is the number of times to duplicate it.
    - Return: The string par1 duplicated par2 times.
  - `_sushi_path_concat_`: Concatenate 2 paths.
    - Parameters: 2 parameters, the 2 paths to concatenate.
    - Return: The concatenation of the 2 paths.
  - `_sushi_file_eq_`: Compare 2 file paths.
    - Parameters: 2 parameters, the 2 paths to compare.
    - Return: `1` if the 2 paths refer to the same file, otherwise `0`. In addition, the exit status is `0` if they are the same and `1` otherwise (the opposite of the "echo return").
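The implementations of these built-ins are not given in this document. The following is one possible bash sketch that matches the descriptions above; the bodies are assumptions, only the function names and the "return by echo" convention come from the list.

```shell
#!/usr/bin/env bash
# Sketches only: the real sushi built-ins may be implemented differently.

# Echo the absolute value of the integer in $1.
_sushi_abs_() {
  local n=$1
  if (( n < 0 )); then printf '%s\n' "$(( -n ))"; else printf '%s\n' "$n"; fi
}

# Echo the string $1 repeated $2 times.
_sushi_dup_str_() {
  local s=$1 times=$2 out="" i
  for (( i = 0; i < times; i++ )); do out+="$s"; done
  printf '%s\n' "$out"
}

# Echo path $1 joined with relative path $2.
# Assumption: a single "/" separator, no normalization of "." or "..".
_sushi_path_concat_() {
  printf '%s\n' "${1%/}/$2"
}

# Echo 1 (exit status 0) if both paths name the same file, else echo 0 (exit status 1).
_sushi_file_eq_() {
  if [[ $1 -ef $2 ]]; then
    echo 1; return 0
  else
    echo 0; return 1
  fi
}

echo "$(_sushi_abs_ -5)"
echo "$(_sushi_dup_str_ ab 3)"
echo "$(_sushi_path_concat_ /usr/ bin)"
echo "$(_sushi_file_eq_ / /.)"
```

Because each helper "returns" through standard output, callers capture the value with command substitution, which is the form used in the translation rules of this document.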
One expression in sushi may be translated to several lines of bash. To make this easier to explain, a translation is separated into 2 parts: "code before" and "value" (or "val").
"code before" is the code that runs before "val", usually a few lines; "val" is a bash expression, which can be just an identifier. For example, to translate interpolation,
define a = "c"
define b = "d"
define c = "0" + "${"ab" + ${a + b}}"
-->
a="c"
b="d"
_sushi_t_0_="${a}${b}" # With post-order, the interpolation `${a + b}` is firstly translated and stored in a temporary variable _sushi_t_0_
_sushi_t_1_="ab${_sushi_t_0_}" # Then the interpolation `${"ab" + ${a + b}}` is translated and stored in _sushi_t_1_, and this is the identifier the interpolation finally becomes
c="0${_sushi_t_1_}" # Use _sushi_t_1_ to replace the interpolation, this is like translating `define c = "0" + _sushi_t_1_`
For the interpolation `${"ab" + ${a + b}}`,
# code before
_sushi_t_0_="${a}${b}"
_sushi_t_1_="ab${_sushi_t_0_}"
# val
${_sushi_t_1_} # this is what in "c=" assignment statement
Refer to 6.2.2 Variable , 6.2.3 Literal and 6.2.6 Indexing.
Note that assignments on Map are different in "MapLit as right value" and "Map Variable as right value".
<t_lhs>=<t_rhs>
<t_lhs>=(<t_rhs>)
eval "<t_lhs>=(<t_rhs>)"
Except for Variable and Indexing, the following expressions only have a right-value translation.
`<expr>` is an expression in sushi; `<t_expr>` is the translated expression "value" in bash.
Because `<expr>` may be translated to multiple bash statements, `<t_expr>` is then only the final identifier. When `<expr>` can be translated directly to one bash expression, `<t_expr>` is that expression (like `"string"`, `$((2 + 2))`, etc.).
Interpolation means that expressions can be interpolated in PathLit, RelPathLit and StringLit.
<expr> -> $<t_expr>
`<t_expr>` is the identifier that `<expr>` finally becomes, but an interpolation is usually translated to several lines. Interpolations are traversed in post-order: each expression in an interpolation is translated and stored in a temporary variable, and finally a temporary variable replaces the interpolation. An example follows.
define a = "c"
define b = "d"
define c = "0" + "${"ab" + ${a + b}}"
-->
a="c"
b="d"
_sushi_t_0_="${a}${b}" # With post-order, the interpolation `${a + b}` is firstly translated and stored in a temporary variable _sushi_t_0_
_sushi_t_1_="ab${_sushi_t_0_}" # Then the interpolation `${"ab" + ${a + b}}` is translated and stored in _sushi_t_1_, and this is the identifier the interpolation finally becomes
c="0${_sushi_t_1_}" # Use _sushi_t_1_ to replace the interpolation, this is like translating `define c = "0" + _sushi_t_1_`
As left value: <identifier> -> <identifier>
As right value:
- Array type as right value: "${<t_id>[@]}"
- Map type as right value:

      # code before
      _sushi_t_0_=`declare -p t_map`
      _sushi_t_0_=${_sushi_t_0_#*=}
      _sushi_t_0_=${_sushi_t_0_:1:-1}
      # val
      $_sushi_t_0_

- Simple type: <identifier> -> $<identifier>

Exception:
- Bool type as condition (the translation will be wrapped in [[ ]]): <identifier> -> $<identifier> != 0
[<expr>, <expr>, ...] -> $<t_expr> $<t_expr> ...
An ArrayLit of <expr>s is translated to an indexed array in bash: the <expr>s separated by spaces. The <expr>s can only be of simple types and must all have the same type.
As condition wrapped in [[ ]]: true -> (1 -ne 0), false -> (0 -ne 0)
As right value in assignment: true -> 1, false -> 0
'c' -> "c"
A CharLit is translated directly into a string.
stdin -> 0
stdout -> 1
stderr -> 2
An FdLit is translated to the corresponding file descriptor number.
100 -> $((100))
An IntLit is translated to bash arithmetic: the integer literal is wrapped in $(( )).
{ <expr> : <expr>, <expr> : <expr>, ... } -> [$<t_expr>]=$<t_expr> [$<t_expr>]=$<t_expr> ...
A MapLit of <expr>s is translated to an associative array in bash. K-V pairs have the form [<key>]=<value> and are separated by spaces.
PathLit, RelPathLit and StringLit are interpolated strings; refer to the Interpolation part.
Unit is a type in sushi. It is translated to a special bash variable named _sushi_unit_.
not <expr>
- Bool not:
  - As condition wrapped in [[ ]]: (! $<t_expr> -ne 0). not is translated to !.
  - As right value in assignment: $((! $<t_expr>))
- Int abs: +<expr> -> `_sushi_abs_ $<t_expr>`
  _sushi_abs_ is a function that returns the absolute value of its 1st parameter. + is only applied to the Int type. It is translated to "pass <t_expr> to the built-in function _sushi_abs_ and take its output", as above.
- Int neg: -<expr> -> $((-$<t_expr>))
  - is only applied to the Int type. It is translated to "a $(( )) wrapping -$<t_expr>", as above.
<expr> + <expr> ->
- Int addition: $(($<t_expr> + $<t_expr>))
- String concat: "${<t_expr>}${<t_expr>}"
- Array concat: <expr> may be an ArrayLit or a Variable.
  - ArrayLit: <t_expr> without ( ). <t_expr> is the array elements separated by spaces, e.g. [1, 2, 3] -> 1 2 3 (an ArrayLit is translated to (1 2 3) in general, but differently here).
  - Variable: ${<t_expr>[@]}. <t_expr> is an identifier in bash, e.g. arr -> ${arr[@]}
  - The final result is stored in a temp variable, and <t_expr> is its identifier. For example,
define arr = [1, 2]
define res = arr + [3, 4] + [5, 6]
-->
local -a arr=(1 2)
_sushi_t_0_=(${arr[@]} 3 4)
_sushi_t_1_=(${_sushi_t_0_[@]} 5 6)
local -a res=(${_sushi_t_1_[@]})
- Map merge: <expr> may be a MapLit or a Variable.
  - MapLit: <t_expr> without ( ). <t_expr> is the K-V pairs separated by spaces, e.g. {"k0": 1, "k1": 2} -> ["k0"]=1 ["k1"]=2 (a MapLit is translated to (["k0"]=1 ["k1"]=2) in general, but differently here).
  - Variable: `_sushi_extract_map_ ${!<t_expr>[@]} ${<t_expr>[@]}`. <t_expr> is an identifier in bash. (_sushi_extract_map_ is a built-in function. It receives the keys and values as parameters and returns a string like ["key0"]=val0 ["key1"]=val1 ["key2"]=val2 ...) e.g. map -> `_sushi_extract_map_ ${!map[@]} ${map[@]}`
  - The final result is stored in a temp variable, and <t_expr> is its identifier. For example,
define map = {"a": 1, "b": "c"}
define res = map + {"c": 3, "d": 4} + {"e": 5, "f": 6}
-->
local -A map=(["a"]=$((1)) ["b"]="c")
_sushi_t_0_=(`_sushi_extract_map_ ${!map[@]} ${map[@]}` ["c"]=$((3)) ["d"]=$((4)))
_sushi_t_1_=(`_sushi_extract_map_ ${!_sushi_t_0_[@]} ${_sushi_t_0_[@]}` ["e"]=$((5)) ["f"]=$((6)))
local -A res=(`_sushi_extract_map_ ${!_sushi_t_1_[@]} ${_sushi_t_1_[@]}`)
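To make the array concat and map merge rules concrete, here is a runnable bash sketch. The body of `_sushi_extract_map_` is an assumption based only on its description above (keys first, then values, output as `["key"]=val` words), and `merge_demo` is a hypothetical wrapper needed because `local` only works inside a function. The sketch uses `eval`, as in the `eval "<t_lhs>=(<t_rhs>)"` form listed earlier, because as far as we know bash does not re-parse the output of a command substitution as `[key]=value` elements.

```shell
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.

# Sketch of the built-in: the first half of the arguments are keys,
# the second half are the matching values.
# Assumption: keys and values contain no whitespace.
_sushi_extract_map_() {
  local -a args=("$@")
  local half=$(( $# / 2 )) out="" i
  for (( i = 0; i < half; i++ )); do
    out+="[\"${args[i]}\"]=${args[i + half]} "
  done
  printf '%s\n' "${out% }"
}

merge_demo() {
  # Array concat: arr + [3, 4]
  local -a arr=(1 2)
  local -a joined=("${arr[@]}" 3 4)
  echo "${joined[*]}"            # prints: 1 2 3 4

  # Map merge: map + {"c": 3}; eval lets bash parse the [key]=value words.
  local -A map=(["a"]=1 ["b"]=2)
  local -A merged
  eval "merged=($(_sushi_extract_map_ "${!map[@]}" "${map[@]}") [\"c\"]=3)"
  echo "${merged[a]} ${merged[b]} ${merged[c]}"   # prints: 1 2 3
}

merge_demo
```

Passing keys and values as two flat lists relies on `${!map[@]}` and `${map[@]}` enumerating entries in the same order, and it breaks for keys or values containing spaces; a production implementation would need a more robust encoding.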
<expr_0> - <expr_1> ->
- Int subtraction: $(($<t_expr_0> - $<t_expr_1>))

<expr_0> * <expr_1> ->
- Int multiplication: $(($<t_expr_0> * $<t_expr_1>))
- String duplication: `_sushi_dup_str_ $<t_expr_0> $<t_expr_1>`
  _sushi_dup_str_ is a function that duplicates a string. <expr_0> should be of String type and <expr_1> should be of Int type.

<expr_0> // <expr_1> ->
- Integer division: $(($<t_expr_0> / $<t_expr_1>))
- Path concat: `_sushi_path_concat_ $<t_expr_0> $<t_expr_1>`
  _sushi_path_concat_ is a function that concatenates paths.

<expr_0> % <expr_1> ->
- Int modulo operation: $(($<t_expr_0> % $<t_expr_1>))
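Since `//` and `%` are translated directly to bash arithmetic, they inherit bash's integer semantics. The short sketch below, with hypothetical variable names, shows what the translated expressions evaluate to, including the negative-operand case where bash truncates toward zero.

```shell
#!/usr/bin/env bash
# Hypothetical operands standing in for translated sushi Int values.
x=7; y=2; n=-7

quot=$(($x / $y))     # 3
rem=$(($x % $y))      # 1
nquot=$(($n / $y))    # -3: bash truncates toward zero
nrem=$(($n % $y))     # -1: the remainder takes the sign of the dividend

echo "$quot $rem $nquot $nrem"
```

If sushiscript is meant to have floor division for negative operands, the translation would need extra code; the document does not specify this, so it is left open here.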
<expr_0> {<|>|<=|>=} <expr_1> ->
- Char ascii comparison / String lexical order comparison
  - As condition wrapped in [[ ]]: $<t_expr_0> {<|>|<=|>=} $<t_expr_1>
  - As right value in assignment: $(($<t_expr_0> {<|>|<=|>=} $<t_expr_1>))
- Int comparison
  - As condition wrapped in [[ ]]: $<t_expr_0> -{lt|gt|le|ge} $<t_expr_1>
  - As right value in assignment: $(($<t_expr_0> {<|>|<=|>=} $<t_expr_1>))
<expr_0> {==|!=} <expr_1>
->
- Unit
  - As condition wrapped in [[ ]]:
    - ==: (1 -ne 0)
    - !=: (0 -ne 0)
  - As right value:
    - ==: 1
    - !=: 0
- Bool/Char/Int/String/ExitCode/FD
  - As condition wrapped in [[ ]]: (<t_expr_0> {==|!=} <t_expr_1>)
  - As right value: $((<t_expr_0> {==|!=} <t_expr_1>))
- Path
  - As condition wrapped in [[ ]]: (<t_expr_0> -ef <t_expr_1>)
  - As right value: `_sushi_file_eq_ <t_expr_0> <t_expr_1>`
- Array
  - As condition wrapped in [[ ]]:
    # code before
    _sushi_t_0_=0
    if [[ ${#<t_expr_0>[@]} -eq ${#<t_expr_1>[@]} ]]; then _sushi_t_0_=1; fi
    if [[ _sushi_t_0_ -eq 1 ]]; then
      for ((i = 0; i < ${#<t_expr_0>[@]}; i++)); do
        if [[ <t_expr_0>[i] -ne <t_expr_1>[i] ]]; then
          _sushi_t_0_=0
          break
        fi
      done
    fi
    # val
    (_sushi_t_0_ -ne 0)
  - As right value: Code before is the same, val is only _sushi_t_0_
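Instantiating the array equality template with two concrete Int arrays gives a runnable sketch like the one below. The array names a and b are placeholders for the translated identifiers; the explicit ${a[i]} form is used for readability, although bare a[i] is also evaluated arithmetically by -ne inside [[ ]].

```shell
a=(1 2 3)
b=(1 2 3)

# Code before: compare lengths first, then element by element.
_sushi_t_0_=0
if [[ ${#a[@]} -eq ${#b[@]} ]]; then _sushi_t_0_=1; fi
if [[ $_sushi_t_0_ -eq 1 ]]; then
  for ((i = 0; i < ${#a[@]}; i++)); do
    if [[ ${a[i]} -ne ${b[i]} ]]; then
      _sushi_t_0_=0
      break
    fi
  done
fi

# Val: usable as a condition.
if [[ (_sushi_t_0_ -ne 0) ]]; then equal="yes"; else equal="no"; fi
echo "$equal"
```

Because -ne is an integer operator, this form only fits arrays of Int-like elements.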
<expr_0> {and|or} <expr_1>
->
- Bool logical operation
  - As condition wrapped in [[ ]]: $<t_expr_0> {&& | ||} $<t_expr_1>
  - As right value in assignment: $(($<t_expr_0> {&& | ||} $<t_expr_1>))
As described in 4.7.8 Redirection
<redirect to here : String> = (<function call : Ret> | command) redirect to here
<redirect output> = redirect <FdLit>? to <Path | FD>
<redirect input> = redirect <FdLit>? from <Path>
- redirect to here
  - CommandLike with redirect to here will be wrapped in backticks (`), like `<t_func_call>` or `<t_command>`.
- redirect input/output
redirect <FdLit> {to|from} <Path | FD> {append}?
->
<t_FdLit> > <t_Path | t_FD>
if redirect to without "append"
<t_FdLit> >> <t_Path | t_FD>
if redirect to with "append"
<t_FdLit> < <t_Path | t_FD>
if redirect from (always without "append")
<t_FdLit> and <t_FD> above are the corresponding file descriptor numbers and <t_Path> above is the translated path.
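The three redirection forms map onto ordinary bash redirections. The sketch below uses a temporary file as the <t_Path> target and explicit descriptor numbers as <t_FdLit>; redirecting to another FD is not shown, since its exact translated form is not spelled out above.

```shell
tmp=$(mktemp)

# redirect 1 to <path>: truncate and write.
echo "first" 1> "$tmp"

# redirect 1 to <path> append: append to the existing content.
echo "second" 1>> "$tmp"

# redirect 0 from <path>: read standard input from the file.
read -r first_line 0< "$tmp"

content=$(cat "$tmp")
rm -f "$tmp"
echo "$first_line"
```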
! <cmd> <params> <redirection>
->
<t_cmd> <t_params> <t_redir>; _sushi_t_0_=$?
if not "to here"
_sushi_t_0_=`<t_cmd> <t_params> <t_redir>`
if "to here"
The final identifier (<t_expr>) is "_sushi_t_0_" (0 may be changed to another number depending on the current temp variable count).
<cmd> is an interpolation and <params> is a group of interpolations, so they are translated the way interpolations are translated. <redirection> is translated as described above.
<func> <params> <redirection>
->
<t_func> <t_params> <t_redir>; _sushi_t_0_=$_sushi_func_ret_
if not "to here" (_sushi_t_0_'s assignment may be different depending on the return type)
_sushi_t_0_=`<t_func> <t_params> <t_redir>`
if "to here"
The final identifier (<t_expr>) is "_sushi_t_0_" (0 may be changed to another number depending on the current temp variable count).
Detail:
<func>
- Function identifier name.
<params>
- All the params will be translated to variables in bash and used by reference inside the function.
- If a literal is used in a function call, it will be stored in a temp variable first.
<redirection>
- See above.
- The function's return value will be stored in a global variable _sushi_func_ret_
Code before is like above, val is _sushi_t_0_
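The calling convention described above (arguments passed as variable names, literals first stored in temp variables, result read back from _sushi_func_ret_) can be sketched as follows. The function sum_all and its body are hypothetical; namerefs (local -n) require bash 4.3 or later.

```shell
# Hypothetical callee: receives the NAME of an array and sums its elements.
sum_all() {
  local -n nums=$1
  local total=0 x
  for x in "${nums[@]}"; do
    total=$((total + x))
  done
  _sushi_func_ret_=$total
  return 0
}

# A literal argument is stored in a temp variable first...
_sushi_t_0_=(1 2 3)
# ...then the temp variable's name is passed, and the result is copied out.
sum_all _sushi_t_0_; _sushi_t_1_=$_sushi_func_ret_
echo "$_sushi_t_1_"
```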
<expr0>[<expr1>]
->
- As right value
  - Array: ${<t_expr0>[<t_expr1>]}
    <expr1> must be Int type.
  - Map: ${<t_expr0>[<t_expr1>]}
    <expr1> must be String type.
- As left value
  <t_expr0>[<t_expr1>]
export? define <id> (: <type>)? = <expr>
->
local <id>=<t_expr>
if not export
declare -gx <id>=<t_expr>
if export
<type> is ignored in translation. <t_expr> is the translation's final identifier, as described before.
- Declaration with assignment is related to assignment. Refer to 6.1 Assignment.
Exception:
- Map: Declare first and then assign. If the right value is a variable, the translation declares the variable first and uses eval to assign. Otherwise, if the right value is a MapLit, no eval is needed (refer to 6.1 Assignment). For example,
define m : Map String Int = { "a": 1, "b": 2 }
define m_0 : Map String Int = m
-->
local -A m=(["a"]=1 ["b"]=2)
local -A m_0=(); eval "m_0=(`_sushi_extract_map_ ${!m[@]} ${m[@]}`)"
export? define <id> (<params>) =
<program>
->
- Function name is <id>.
- Parameters' translations are to get parameters' references: local -n <param_id>=$1
- Parameters are acquired inside the function body, before all other statements.
- export -f <id> will be placed below the function definition if "export" is present.
For example,
export define foo (a : Int, b : Array Int, c : String Int) =
! echo "${a}"
-->
foo () {
local -n a=$1
local -n b=$2
local -n c=$3
local _sushi_t_0_=${a}
echo "${_sushi_t_0_}"
}
export -f foo
return <expr>
->
- Return type is not Bool
  _sushi_func_ret_=<t_expr>; return 0
- Return type is Bool
  _sushi_func_ret_=<t_expr>
  if [[ _sushi_func_ret_ -ne 0 ]]; then
    return 0
  else
    return 1
  fi
Assignment may be different depending on the type of <expr>.
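For a Bool-returning function, the convention above means the result is available both as _sushi_func_ret_ and as the bash exit status. A small sketch with a hypothetical function is_positive (parameter passed by name via a nameref, bash 4.3 or later):

```shell
# Hypothetical Bool function following the return convention above.
is_positive() {
  local -n n=$1
  _sushi_func_ret_=$((n > 0))
  if [[ _sushi_func_ret_ -ne 0 ]]; then
    return 0
  else
    return 1
  fi
}

_sushi_t_0_=5
_sushi_t_1_=-2

# The exit status drives the condition; _sushi_func_ret_ carries the value.
if is_positive _sushi_t_0_; then pos_status="true"; else pos_status="false"; fi
pos_value=$_sushi_func_ret_
if is_positive _sushi_t_1_; then neg_status="true"; else neg_status="false"; fi
neg_value=$_sushi_func_ret_
echo "$pos_status $pos_value $neg_status $neg_value"
```

Mapping true to exit status 0 is what allows a Bool function call to be used directly as an if or elif condition.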
if <expr>: <program>
else: <program>?
-->
if [[ <t_expr> ]]; then
<t_program>
else
<t_program>
fi
<expr> can only be Bool type.
switch <val : T>
case <cond : T>:
<program0>
case <cond_func : Function Bool T>:
<program1>
default:
<program2>
-->
if [[ <t_val> == <t_cond> ]]; then
<t_program0>
elif <t_cond_func> <t_val>; then
<t_program1>
else
<t_program2>
fi
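A concrete instance of the switch translation might look like the sketch below. The predicate is_even is hypothetical and is assumed to take its argument by name and report its Bool result through the exit status.

```shell
# Hypothetical predicate used as a case condition.
is_even() {
  local -n v=$1
  if (( v % 2 == 0 )); then return 0; else return 1; fi
}

val=4
if [[ $val == 1 ]]; then
  result="one"
elif is_even val; then
  result="even"
else
  result="other"
fi
echo "$result"
```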
for <expr>:
<program>
-->
while [[ <t_expr> ]]; do
<t_program>
done
<expr> above can only be Bool type.
for <ident> in <expr>:
<program>
-->
for <ident> in ${<t_expr>[@]}; do
<t_program>
done
<expr> above can only be Array type.
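Both loop forms translate to familiar bash loops. The sketch below instantiates each template with illustrative variables; note that the unquoted ${arr[@]} in the template splits elements on whitespace, so the example uses single-word elements.

```shell
# for <expr>: ...  becomes a while loop over a Bool condition.
i=0
while [[ $i -lt 3 ]]; do
  i=$((i + 1))
done

# for <ident> in <expr>: ...  iterates over the array elements.
arr=(a b c)
joined=""
for x in ${arr[@]}; do
  joined+=$x
done
echo "$i $joined"
```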
break <num>
-> break <num>
continue <num>
-> continue <num>
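Since break and continue map directly onto the bash builtins, <num> counts enclosing loops just as in bash. A short sketch showing break 2 leaving both levels of a nested loop:

```shell
count=0
for i in 1 2 3; do
  for j in 1 2 3; do
    count=$((count + 1))
    # Leaves the inner and the outer loop at once.
    if [[ $j -eq 2 ]]; then break 2; fi
  done
done
echo "$count"
```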