SASM is a stack-oriented programming language with Assembly-style syntax.
init:
print "Hello World!"
ret
Download SASM Interpreter (sasm-20251031.zip)
To run the interpreter, you will need a Java 21 Runtime Environment.
SASM attempts to capture the feeling of programming in an Assembly language, while exploring the stack-oriented programming paradigm.
I find Assembly programming languages to be quite fascinating because of their structure. They describe how the CPU performs computations, how it reads and writes data from memory, and how it communicates with other devices. As a result, these languages have a very particular style and feel to them.
Stack-oriented programming languages allow you to define algorithms by function composition, in a point-free style. This means that the arguments on which the functions operate are not explicitly named in the algorithm.
In this article, I will focus on some of the key features that set SASM apart from Assembly and from other stack-oriented languages. For a more detailed language reference, check the README.md that is included in the download archive.
The stack is the central data structure of the language. All instructions transform the stack in one way or another.
All arguments of an instruction are automatically pushed onto the stack. The instruction then pops the required number of values from the stack (consume), calculates the result, and pushes it onto the stack (produce).
The following example shows how the stack changes when the instructions are executed by the interpreter:
push 6, 2 ;[6, 2]
add 8, 4 ;[12, 6, 2]
mul 2 ;[24, 6, 2]
sub ;[18, 2]
div ;[9]
Notice how mul has only one argument: the argument is pushed first, so mul takes its second operand from the stack and produces one value. sub and div have no arguments; they consume two values from the stack and produce one.
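To make the trace above easier to follow, here is a small Python model of the behaviour it shows. It is only a sketch and not how the interpreter is implemented: the helper names (`push`, `binary`, `run_example`) are mine, the operand order (top value first) is inferred from the example, and integer division is an assumption.

```python
# Toy model of the stack behaviour described above, not the real interpreter.
# Lists are written top-first, like the ;[...] comments in the example.

def push(stack, *args):
    """Push the arguments so that the first argument ends up on top."""
    for value in reversed(args):
        stack.insert(0, value)

def binary(operation):
    """Build an instruction that pushes its arguments, then consumes two values."""
    def instruction(stack, *args):
        push(stack, *args)
        top = stack.pop(0)
        below = stack.pop(0)
        stack.insert(0, operation(top, below))
    return instruction

add = binary(lambda top, below: top + below)
mul = binary(lambda top, below: top * below)
sub = binary(lambda top, below: top - below)
div = binary(lambda top, below: top // below)  # integer division is an assumption

def run_example():
    """Replay the five instructions and record the stack after each one."""
    stack, trace = [], []
    program = [(push, (6, 2)), (add, (8, 4)), (mul, (2,)), (sub, ()), (div, ())]
    for instruction, args in program:
        instruction(stack, *args)
        trace.append(list(stack))
    return trace
```

Calling `run_example()` yields the same five stack states as the comments in the listing.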
The language also provides instructions to manipulate the stack in different ways. Besides push and pop, there are four other helpful instructions.
dup duplicate the first value from the top
;[3] -> [3, 3]
over duplicate the second value from the top
;[3, 4] -> [4, 3, 4]
rot rotate three elements at the top
;[1, 2, 3] -> [2, 3, 1]
swap swap the two elements at the top
;[3, 4] -> [4, 3]
These are implemented as part of the stack module.
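The four instructions can be modelled in a few lines of Python. This is a sketch that mirrors the stack comments above, with the top of the stack at index 0. It says nothing about how the stack module itself is written.

```python
# Each function takes a stack (top-first list) and returns the new stack.

def dup(stack):
    """Duplicate the first value from the top."""
    return [stack[0]] + stack

def over(stack):
    """Duplicate the second value from the top."""
    return [stack[1]] + stack

def rot(stack):
    """Rotate the three elements at the top."""
    first, second, third, *rest = stack
    return [second, third, first] + rest

def swap(stack):
    """Swap the two elements at the top."""
    first, second, *rest = stack
    return [second, first] + rest
```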
Another way of manipulating the stack is the stack character _. It pops a value from the stack and inserts it into the argument list of an instruction, in places where you would otherwise have to reorder the stack.
push 1, 2 ;[1, 2]
push _, 3 ;[1, 3, 2]
This is particularly useful for comparisons, where the position of the value matters for the condition:
push 5 ;[5]
cmp 4 ;[] CMP=4-5=-1
setl ;[true]
push 5 ;[5]
cmp _, 4 ;[] CMP=5-4=1
setl ;[false]
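One way to picture the stack character is as a placeholder that is replaced by a popped value before the arguments are pushed. The Python sketch below models that reading. The names `UNDERSCORE` and `resolve` are hypothetical, and resolving several placeholders from left to right is an assumption, since the examples only ever use one.

```python
UNDERSCORE = object()  # stands in for the stack character _

def resolve(stack, args):
    """Replace each placeholder with a value popped from the stack (left to right)."""
    return [stack.pop(0) if arg is UNDERSCORE else arg for arg in args]

def push(stack, *args):
    """Resolve placeholders, then push so that the first argument ends up on top."""
    for value in reversed(resolve(stack, args)):
        stack.insert(0, value)

def cmp(stack, *args):
    """Push the arguments, consume two values, and return top minus second."""
    push(stack, *args)
    top = stack.pop(0)
    below = stack.pop(0)
    return top - below

def setl(stack, cmp_value):
    """Push true if the last comparison was 'less than'."""
    stack.insert(0, cmp_value < 0)
```

With this model, `cmp(stack, 4)` on the stack `[5]` returns -1, while `cmp(stack, UNDERSCORE, 4)` returns 1, matching the two listings above.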
The stack is not the only way to work with values. You can also use variables. To assign a value to a variable use the mov instruction.
mov A, 3 ;A=3
After you have assigned a value to a variable, you can use it in other instructions:
add A, 5 ;A=8
Note how this add is different from the stack-based variant of add, which consumes two values and produces one. If the first argument of the add instruction is a variable, it performs an addition and assigns the result to this variable.
This mirrors how most operations work in Assembly languages, and offers you the choice between a stack-based and a variable-based approach to solving problems.
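The difference between the two variants of add can be sketched in Python as a dispatch on the first argument. This is a simplified model under my own assumptions: the `Var` class is hypothetical, and the variable-based form is assumed to take exactly two arguments with a plain value as the second one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """Hypothetical marker for a variable name used as an argument."""
    name: str

def add(stack, variables, *args):
    if args and isinstance(args[0], Var):
        # Variable-based variant: add the value and assign to the variable.
        target, value = args
        variables[target.name] += value
        return
    # Stack-based variant: push the arguments, consume two values, produce one.
    for value in reversed(args):
        stack.insert(0, value)
    top = stack.pop(0)
    below = stack.pop(0)
    stack.insert(0, top + below)
```

For example, with `variables = {"A": 3}`, the call `add(stack, variables, Var("A"), 5)` leaves the stack untouched and sets A to 8.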
Another way to assign values to variables is by using the pop instruction. The following line pops a value from the stack and assigns it to a variable:
;[3]
pop A ;[] A=3
You can also pop multiple values into variables:
;[3, 8, 9]
pop A, B, C ;[] A=3, B=8, C=9
You can use the stack character _ to discard certain values while retaining others:
;[3, 8, 9]
pop A, _, C ;[] A=3, C=9
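The behaviour of pop with several targets can be modelled like this in Python. It is a sketch based only on the three examples above; `pop_into` is a name I made up.

```python
def pop_into(stack, variables, *names):
    """Pop one value per name from the top; a "_" name discards the value."""
    for name in names:
        value = stack.pop(0)  # the stack is a top-first list
        if name != "_":
            variables[name] = value
```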
Variables are only visible inside the function scope. If you want to pass values to functions, or return values from a function, you have to put them on the stack.
Similar to the stack-based and variable-based instructions, you can write functions in a stack-based or variable-based way.
Let's try to implement a clamp function in both styles, using this Python example as reference:
def clamp(v, min, max):
    if v < min:
        return min
    elif v > max:
        return max
    else:
        return v
Here is a variable-based implementation of clamp:
clamp:
pop v, min, max
cmp v, min | jl 1f
cmp v, max | jg 2f
ret v
1: ret min
2: ret max
This looks roughly similar to a naive implementation in Assembly.
Here is a stack-based implementation of clamp:
clamp: ;[v, min, max]
dup ;[v, v, min, max]
rot ;[v, min, v, max]
over ;[min, v, min, v, max]
swap ;[v, min, min, v, max]
cmp ;[min, v, max]
jl 1f ;return min
drop ;[v, max]
dup ;[v, v, max]
rot ;[v, max, v]
over ;[max, v, max, v]
swap ;[v, max, max, v]
cmp ;[max, v]
jg 2f ;return max
drop ;[v]
ret
1: rot ;[v, max, min]
drop ;[max, min]
drop ;[min]
ret
2: swap ;[v, max]
drop ;[max]
ret
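Because the stack comments are easy to get wrong, it helps to replay the listing outside of SASM. The following Python sketch follows the listing step by step with top-first lists. The helper functions are my own model of dup, rot, over, swap, drop, and cmp, and the parameters are called `lo` and `hi` to avoid shadowing Python's built-in min and max.

```python
def dup(s):
    return [s[0]] + s

def over(s):
    return [s[1]] + s

def rot(s):
    return [s[1], s[2], s[0]] + s[3:]

def swap(s):
    return [s[1], s[0]] + s[2:]

def drop(s):
    return s[1:]

def cmp(s):
    """Consume two values; return (top minus second, remaining stack)."""
    return s[0] - s[1], s[2:]

def clamp(v, lo, hi):
    s = [v, lo, hi]                  # [v, min, max]
    s = swap(over(rot(dup(s))))      # [v, min, min, v, max]
    result, s = cmp(s)               # [min, v, max]
    if result < 0:                   # jl 1f
        return drop(drop(rot(s)))    # [min]
    s = drop(s)                      # [v, max]
    s = swap(over(rot(dup(s))))      # [v, max, max, v]
    result, s = cmp(s)               # [max, v]
    if result > 0:                   # jg 2f
        return drop(swap(s))         # [max]
    return drop(s)                   # [v]
```

Each call returns a one-element stack, for example `clamp(15, 0, 10)` returns `[10]`.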
Solving this in a stack-based way makes for a nice little puzzle, but it is easy to make mistakes along the way, and the code is harder to modify correctly. Other algorithms are better suited to the stack-based approach.
In the following example, you can see a function that consists of trivially composable functions. Each function takes its last argument from the stack when that argument is omitted, and produces a value that serves as the last argument of the next function.
calcSum: ;[[1, 2, 3, 4]]
call list.map, {inc} ;[[2, 3, 4, 5]]
call list.reduce, {add} ;[14]
mul 2 ;[28]
ret
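For comparison, the same pipeline can be written point-free in Python. This is only an analogy to show the composition, not a description of list.map or list.reduce. The `compose` helper and the `inc` function are defined here for the sketch.

```python
from functools import partial, reduce
from operator import add, mul

def inc(x):
    return x + 1

def compose(*steps):
    """Chain functions so that each result feeds the next step."""
    def pipeline(value):
        for step in steps:
            value = step(value)
        return value
    return pipeline

# No step names the list it operates on; the value flows through implicitly,
# much like the value on top of the stack in calcSum.
calc_sum = compose(
    partial(map, inc),     # [1, 2, 3, 4] -> 2, 3, 4, 5
    partial(reduce, add),  # -> 14
    partial(mul, 2),       # -> 28
)
```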
Another benefit of having the stack at your disposal is that you can easily return multiple values.
readFile:
...
ret true, text ;successful read
main:
call readFile, "file.txt" ;[true, "foobar"]
cmp true ;["foobar"] CMP=0
jne 1f
call io.print ;[]
1: ret
Using the defer instruction, you can defer the call of a function until the end of the function scope.
init:
defer io.print, 0
defer io.print, 1
defer io.print, 2
call io.print, "A"
call io.print, "B"
call io.print, "C"
ret
The previous example will print the following string to the console:
A
B
C
2
1
0
Note that the numbers are printed in reverse.
This is useful for situations where something needs to be done right before the function is exited.
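Python has a comparable mechanism in `contextlib.ExitStack`, which also runs registered callbacks in reverse order when its scope ends. The sketch below reproduces the example with it. Output is collected in a list instead of being printed so that the order is easy to inspect, and `init` is simply named after the label above.

```python
from contextlib import ExitStack

def init(out):
    with ExitStack() as deferred:
        # Registered callbacks run in reverse order when the block exits.
        deferred.callback(out.append, 0)
        deferred.callback(out.append, 1)
        deferred.callback(out.append, 2)
        out.append("A")
        out.append("B")
        out.append("C")
    return out
```

`init([])` returns `["A", "B", "C", 2, 1, 0]`, the same order as the console output above.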
To concatenate instructions on one line, you can use the pipe character |:
push 8, 4, 2, 6, 2 ;[8, 4, 2, 6, 2]
add | mul | sub | div ;[9]
Used with comparison and branching:
cmp A, 5 | jl .lesser ;A<5
cmp A, 8 | jg .greater ;A>8
Used in anonymous functions:
call list.filter, { cmp _, 5 | setle }, lst
You can create an anonymous function by placing concatenated instructions between two curly braces {, }.
{ add | sub 3 | call fn | dup }
This is useful for functions that take a function pointer (or label) as one of their arguments. To avoid having to write out a complete function and to name it by giving it a label, you can create an anonymous function instead:
call list.reduce, { add }, [1, 2, 3] ;[6]
You can also push, pop, and assign anonymous functions to variables, and call them at a later time:
push { add A, B }
call
Variables used in anonymous functions will not be resolved until the function is called!
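The effect of this late resolution can be illustrated with a Python closure. The sketch assumes that `{ add A, B }` behaves like the variable-based add described earlier (A = A + B). The article does not say which scope supplies the variables at call time, so the sketch passes the variable table explicitly. `make_add_a_b` is a hypothetical name.

```python
def make_add_a_b():
    """Model of { add A, B }: nothing is looked up when the function is created."""
    def anonymous(variables):
        # A and B are read only now, at call time.
        variables["A"] = variables["A"] + variables["B"]
    return anonymous

variables = {"A": 1, "B": 2}
fn = make_add_a_b()    # creating the function does not read A or B
variables["B"] = 40    # the change happens before the call...
fn(variables)          # ...so the call sees B=40 and sets A to 41
```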
Function calls in Assembly usually start with the call mnemonic. In SASM, using call for function calls is optional.
The following two lines do exactly the same thing:
call list.sort, [4, 2, 7, 0]
list.sort [4, 2, 7, 0]
When the interpreter encounters a label in the place of an instruction, it applies the call instruction to that label by default.
Which one you prefer is mostly a stylistic choice.
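The rule can be pictured as a small rewriting step that runs before execution. The Python sketch below is my own illustration, not the interpreter's parser: the `INSTRUCTIONS` set is a made-up subset, and lines are treated as plain strings.

```python
# Hypothetical subset of the instruction names, for illustration only.
INSTRUCTIONS = {"push", "pop", "add", "sub", "mul", "div", "cmp", "jmp", "call", "ret"}

def normalize(line):
    """Prefix a line with call if it starts with a label instead of an instruction."""
    stripped = line.strip()
    head, _, rest = stripped.partition(" ")
    if head in INSTRUCTIONS:
        return stripped
    return f"call {head}, {rest}" if rest else f"call {head}"
```

Both spellings from the example normalize to `call list.sort, [4, 2, 7, 0]`.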
A module is a collection of instructions and labels. Every file or document in SASM is a module. The name of the module is the file name, without the .sasm extension.
There are quite a few similarities between a module in SASM, and a translation unit in Assembly. You can define public symbolic labels, private symbolic labels, as well as private numeric labels. Symbolic labels have to be unique, which means that you cannot define the same symbolic label twice in a module.
The private numeric labels are special because you can define them multiple times in the same module. In return, you have to specify in which direction you are jumping, either backward or forward. Numeric labels are most useful when implementing conditional jumps and loops.
label: ;public symbolic
jmp label ;use
call label
.foo: ;private symbolic
jmp .foo ;use
call .foo
4: ;private numeric
jmp 4f ;jump forward
jmp 4b ;jump backward
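A numeric reference such as 4f or 4b has to be resolved relative to the position of the jump. The Python sketch below assumes that it refers to the nearest definition in the given direction, as with local labels in the GNU assembler. The article does not spell this out, so treat it as an assumption. `resolve_numeric` and the line-index representation are hypothetical.

```python
def resolve_numeric(label_lines, current_line, reference):
    """Resolve a reference like "4f" or "4b" to the line of the nearest definition.

    label_lines maps a label number (as a string) to the line indices where it
    is defined. Returns None if there is no definition in that direction.
    """
    number, direction = reference[:-1], reference[-1]
    positions = label_lines.get(number, [])
    if direction == "f":
        candidates = [line for line in positions if line > current_line]
        return min(candidates) if candidates else None
    if direction == "b":
        # A label on the same line as the jump counts as already defined.
        candidates = [line for line in positions if line <= current_line]
        return max(candidates) if candidates else None
    raise ValueError(f"unknown direction in {reference!r}")
```

With label 4 defined on lines 2 and 10, a jump on line 5 resolves 4f to line 10 and 4b to line 2.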
When you want to call a public function in another module, you just preface it with the module name. For example, if you want to call the function bar from the module foo, you can write:
call foo.bar, 42
The language is very tedious to write programs in. You have to keep many different aspects of the language and its instructions in mind while running the algorithm in your head. There aren't many guardrails to help you with this. Static analysis could improve the situation, but the biggest pitfall remains the manipulation of the stack.
The easiest mistake to make is to push too many values onto the stack, or to pop too many values from it. This affects all subsequent instructions and function calls. If you are unlucky, the mistake is only caught after many more instructions have been executed. Since functions have no signatures, it is difficult to determine whether the right number of arguments has been passed to a function.
In that sense, programming in this language is very much like programming in Assembly.