slashbinbash.de / SASM

SASM is a stack-oriented programming language with an Assembly-style syntax.

init:
    print "Hello World!"
    ret

Download SASM Interpreter (sasm-20251031.zip)

To run the interpreter, you will need a Java 21 Runtime Environment.

Motivation

SASM attempts to capture the feeling of programming in an Assembly language, while exploring the stack-oriented programming paradigm.

I find Assembly programming languages to be quite fascinating because of their structure. They describe how the CPU performs computations, how it reads and writes data from memory, and how it communicates with other devices. As a result, these languages have a very particular style and feel to them.

Stack-oriented programming languages allow you to define algorithms by function composition, in a point-free style. This means that the arguments on which the functions operate are not explicitly named in the algorithm.

In this article, I will focus on some of the key features that set SASM apart from Assembly and stack-oriented languages. For a more detailed language reference, check the README.md that is included in the download archive.

Stack-Oriented

The stack is the central data structure of the language. All instructions transform the stack in one way or another.

All arguments of an instruction are automatically pushed onto the stack. The instruction then pops the required number of values from the stack (consume), calculates the result, and pushes it onto the stack (produce).

The following example shows how the stack changes when the instructions are executed by the interpreter:

push 6, 2  ;[6, 2]
add  8, 4  ;[12, 6, 2]
mul  2     ;[24, 6, 2]
sub        ;[18, 2]
div        ;[9]

Notice how mul has only one argument. That argument is pushed first, so mul consumes one value that was already on the stack and produces one. sub and div have no arguments, so they consume two values that were already on the stack and produce one.
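To make the consume/produce model concrete, here is a minimal Python sketch of the example above. It is not how the interpreter is implemented: the run helper is hypothetical, the top of the stack is kept at index 0 to match the ;[...] comments, and the operand order (top value first) and integer division are assumptions inferred from those comments.

```python
import operator

def run(stack, op=None, *args):
    """Push args so that the first one ends up on top, then apply op (if any)."""
    for value in reversed(args):
        stack.insert(0, value)
    if op is not None:
        top = stack.pop(0)
        second = stack.pop(0)
        stack.insert(0, op(top, second))
    return stack

stack = []
run(stack, None, 6, 2)            # push 6, 2  -> [6, 2]
run(stack, operator.add, 8, 4)    # add  8, 4  -> [12, 6, 2]
run(stack, operator.mul, 2)       # mul  2     -> [24, 6, 2]
run(stack, operator.sub)          # sub        -> [18, 2]
run(stack, operator.floordiv)     # div        -> [9] (integer division assumed)
print(stack)                      # [9]
```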

The language also provides functions to manipulate the stack in different ways. Besides push and pop, there are four other helpful instructions.

dup   duplicate the first value from the top
      ;[3] -> [3, 3]

over  duplicate the second value from the top
      ;[3, 4] -> [4, 3, 4]

rot   rotate three elements at the top
      ;[1, 2, 3] -> [2, 3, 1]

swap  swap the two elements at the top
      ;[3, 4] -> [4, 3]

These are implemented as part of the stack module.
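The four instructions can be modeled in a few lines of Python. This is only a sketch of the behavior described by the comments above, with index 0 as the top of the stack; the function bodies are my own and say nothing about how the stack module is written.

```python
def dup(s):
    """Duplicate the top value."""
    return [s[0]] + s

def over(s):
    """Copy the second value from the top onto the top."""
    return [s[1]] + s

def rot(s):
    """Rotate the three top values: the top one moves to third place."""
    return [s[1], s[2], s[0]] + s[3:]

def swap(s):
    """Exchange the two top values."""
    return [s[1], s[0]] + s[2:]

print(dup([3]))        # [3, 3]
print(over([3, 4]))    # [4, 3, 4]
print(rot([1, 2, 3]))  # [2, 3, 1]
print(swap([3, 4]))    # [4, 3]
```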

Another way of manipulating the stack is the stack character _. It pops a value from the stack and inserts it at that position in the argument list of an instruction, where you would otherwise have to reorder the stack.

push 1, 2   ;[1, 2]
push _, 3   ;[1, 3, 2]

This is particularly useful for comparisons, where the position of the value matters for the condition:

push 5     ;[5]
cmp  4     ;[] CMP=4-5=-1
setl       ;[true]

push 5     ;[5]
cmp  _, 4  ;[] CMP=5-4=1
setl       ;[false]
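The effect of the stack character can be sketched in Python as follows. The names HOLE, push, cmp, and setl are stand-ins chosen for this illustration, and two details are assumptions read off the stack comments: placeholders are filled left to right before the arguments are pushed, and cmp computes "top minus second".

```python
HOLE = object()  # stands in for the stack character _

def push(stack, *args):
    # Fill each placeholder with a value popped from the stack, left to right.
    resolved = [stack.pop(0) if a is HOLE else a for a in args]
    # Push the arguments so that the first one ends up on top (index 0).
    stack[:0] = resolved
    return stack

def cmp(stack, *args):
    # Push the arguments, then consume two values: top minus second.
    push(stack, *args)
    top = stack.pop(0)
    second = stack.pop(0)
    return top - second

def setl(flag):
    # True if the last comparison was "less than".
    return flag < 0

print(push([1, 2], HOLE, 3))     # [1, 3, 2]
print(setl(cmp([5], 4)))         # True,  CMP = 4 - 5
print(setl(cmp([5], HOLE, 4)))   # False, CMP = 5 - 4
```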

Variables

The stack is not the only way to work with values. You can also use variables. To assign a value to a variable use the mov instruction.

mov A, 3  ;A=3

After you have assigned a value to a variable, you can use it in other instructions:

add A, 5  ;A=8

Note how this add is different from the stack-based variant of add, which consumes two values and produces one. If the first argument of the add instruction is a variable, it performs an addition and assigns the result to this variable.

This mirrors how most operations work in Assembly languages, and offers you the choice between a stack-based and a variable-based approach to solving problems.

Another way to assign values to variables is by using the pop instruction. The following line pops a value from the stack and assigns it to a variable:

             ;[3]
pop A        ;[] A=3

You can also pop multiple values into variables:

             ;[3, 8, 9]
pop A, B, C  ;[] A=3, B=8, C=9

You can use the stack character _ to discard certain values while retaining others:

             ;[3, 8, 9]
pop A, _, C  ;[] A=3, C=9
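As a rough model of the three pop examples, the following Python sketch pops one value per name and skips the assignment for _. The helper pop_into and the dictionary that holds the variables are hypothetical; they only mirror the stack comments shown above.

```python
def pop_into(stack, *names):
    """Pop one value per name from the top (index 0); '_' discards the value."""
    variables = {}
    for name in names:
        value = stack.pop(0)
        if name != "_":
            variables[name] = value
    return variables

stack = [3, 8, 9]
print(pop_into(stack, "A", "_", "C"))  # {'A': 3, 'C': 9}
print(stack)                           # []
```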

Variables are only visible inside the function scope. If you want to pass values to functions, or return values from a function, you have to put them on the stack.

Functions

Similar to the stack-based and variable-based instructions, you can write functions in a stack-based or variable-based way.

Let's try to implement a clamp function in both styles, using this Python example as reference:

def clamp(v, min, max):
    if v < min:
        return min
    elif v > max:
        return max
    else:
        return v

Here is a variable-based implementation of clamp:

clamp:
    pop  v, min, max
    cmp  v, min | jl 1f
    cmp  v, max | jg 2f
    ret  v
1:  ret  min
2:  ret  max

This looks roughly similar to a naive implementation in Assembly.

Here is a stack-based implementation of clamp:

clamp:       ;[v, min, max]
    dup      ;[v, v, min, max]
    rot      ;[v, min, v, max]
    over     ;[min, v, min, v, max]
    swap     ;[v, min, min, v, max]
    cmp      ;[min, v, max]
    jl 1f    ;return min

    drop     ;[v, max]
    dup      ;[v, v, max]
    rot      ;[v, max, v]
    over     ;[max, v, max, v]
    swap     ;[v, max, max, v]
    cmp      ;[max, v]
    jg 2f    ;return max

    drop     ;[v]
    ret

1:  rot      ;[v, max, min]
    drop     ;[max, min]
    drop     ;[min]
    ret
2:  swap     ;[v, max]
    drop     ;[max]
    ret

Solving this in a stack-based way is a nice little puzzle, but it is easy to make mistakes along the way, and it is harder to modify the code correctly. Other algorithms are better suited to the stack-based approach.
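One way to check a stack shuffle like this is to replay it step by step. The Python sketch below mirrors the stack-based listing instruction by instruction, with index 0 as the top of the stack. The helpers are hypothetical models of the instructions (drop is assumed to discard the top value, cmp to compute "top minus second"), and the jumps are replaced by if statements.

```python
def dup(s):  s.insert(0, s[0])
def over(s): s.insert(0, s[1])
def swap(s): s[0], s[1] = s[1], s[0]
def rot(s):  s[0], s[1], s[2] = s[1], s[2], s[0]
def drop(s): s.pop(0)
def cmp(s):  return s.pop(0) - s.pop(0)   # top minus second

def clamp(v, lo, hi):
    s = [v, lo, hi]
    dup(s); rot(s); over(s); swap(s)      # [v, lo, lo, v, hi]
    if cmp(s) < 0:                        # jl 1f, stack is [lo, v, hi]
        rot(s); drop(s); drop(s)          # [lo]
        return s
    drop(s)                               # [v, hi]
    dup(s); rot(s); over(s); swap(s)      # [v, hi, hi, v]
    if cmp(s) > 0:                        # jg 2f, stack is [hi, v]
        swap(s); drop(s)                  # [hi]
        return s
    drop(s)                               # [v]
    return s

print(clamp(5, 0, 10))    # [5]
print(clamp(-1, 0, 10))   # [0]
print(clamp(11, 0, 10))   # [10]
```

Each call returns the final stack, which should hold exactly one value if the shuffle is balanced.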

The following function consists of trivially composable functions. Each function is defined so that you can omit its last argument: the function then pops that argument from the stack, and the value it produces serves as the last argument of the next function.

calcSum:                        ;[[1, 2, 3, 4]]
    call list.map, {inc}        ;[[2, 3, 4, 5]]
    call list.reduce, {add}     ;[14]
    mul  2                      ;[28]
    ret
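For comparison, here is the same pipeline in Python with every intermediate value named. The function calc_sum is just a transcription of calcSum for illustration; it shows what the point-free version leaves implicit.

```python
from functools import reduce
import operator

def calc_sum(values):
    incremented = [x + 1 for x in values]       # call list.map, {inc}
    total = reduce(operator.add, incremented)   # call list.reduce, {add}
    return total * 2                            # mul 2

print(calc_sum([1, 2, 3, 4]))  # 28
```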

Another benefit of having the stack at your disposal is that you can easily return multiple values.

readFile:
    ...
    ret  true, text             ;successful read

main:
    call readFile, "file.txt"  ;[true, "foobar"]
    cmp  true                  ;["foobar"] CMP=0
    jne  1f
    call io.print              ;[]
1:  ret

Deferring Function Calls

Using the defer instruction, you can defer the call of a function until the end of the function scope.

init:
    defer io.print, 0
    defer io.print, 1
    defer io.print, 2

    call io.print, "A"
    call io.print, "B"
    call io.print, "C"
    ret

The previous example prints the following lines to the console:

A
B
C
2
1
0

Note that the deferred calls run in the reverse order of their registration, so the numbers are printed in reverse.

This is useful for situations where something needs to be done right before the function is exited.
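Python's contextlib.ExitStack has the same last-in, first-out behavior, so it can serve as a rough analogy. In this sketch a list stands in for the console, and the function name init is borrowed from the example above.

```python
from contextlib import ExitStack

def init():
    out = []                                  # stands in for the console
    with ExitStack() as deferred:
        deferred.callback(out.append, "0")    # defer io.print, 0
        deferred.callback(out.append, "1")    # defer io.print, 1
        deferred.callback(out.append, "2")    # defer io.print, 2
        out.append("A")                       # call io.print, "A"
        out.append("B")
        out.append("C")
    # Leaving the with block runs the callbacks, last registered first.
    return out

print(init())  # ['A', 'B', 'C', '2', '1', '0']
```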

Concatenation

To concatenate instructions on one line, you can use the pipe character |:

push 8, 4, 2, 6, 2     ;[8, 4, 2, 6, 2]
add | mul | sub | div  ;[9]

Used with comparison and branching:

cmp A, 5 | jl .lesser   ;A<5
cmp A, 8 | jg .greater  ;A>8

Used in anonymous functions:

call list.filter, { cmp _, 5 | setle }, lst

Anonymous Functions

You can create an anonymous function by placing concatenated instructions between curly braces { and }.

{ add | sub 3 | call fn | dup }

This is useful for functions that take a function pointer (or label) as one of their arguments. To avoid having to write out a complete function and to name it by giving it a label, you can create an anonymous function instead:

call list.reduce, { add }, [1, 2, 3]  ;[6]

You can also push, pop, and assign anonymous functions to variables, and call them at a later time:

push { add A, B }
call

Variables used in anonymous functions will not be resolved until the function is called!
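The consequence of this late resolution can be illustrated with a Python closure that looks up its variables only when it is called. This is an analogy, not a description of the interpreter: the dictionary variables is a hypothetical stand-in for the function scope, and the article does not say in which scope SASM performs the lookup.

```python
def demo():
    variables = {"A": 1, "B": 2}
    # Roughly: push { add A, B }. Nothing is looked up yet.
    fn = lambda: variables["A"] + variables["B"]
    variables["A"] = 10      # mov A, 10 before the call
    return fn()              # the lookup happens here, so A is 10

print(demo())  # 12
```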

Labels as Instructions

Function calls in Assembly usually start with the call mnemonic. In SASM, using call for function calls is optional.

The following two lines do exactly the same thing:

call      list.sort, [4, 2, 7, 0]
list.sort [4, 2, 7, 0]

When the interpreter encounters a label instead of an instruction, it uses the call instruction on the label, by default.

Which one you prefer is mostly a stylistic choice.

Modules

A module is a collection of instructions and labels. Every file or document in SASM is a module. The name of the module is the file name, without the .sasm extension.

There are quite a few similarities between a module in SASM and a translation unit in Assembly. You can define public symbolic labels, private symbolic labels, as well as private numeric labels. Symbolic labels have to be unique, which means that you cannot define the same symbolic label twice in a module.

The private numeric labels are special because you can define the same label multiple times in the same module. In return, you have to specify in which direction you are jumping, either backward or forward. The numeric labels are most useful when implementing conditional jumps and loops.

label:       ;public symbolic

jmp  label   ;use
call label

.foo:        ;private symbolic

jmp  .foo    ;use
call .foo

4:           ;private numeric

jmp 4f       ;jump forward
jmp 4b       ;jump backward
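Because a numeric label can be defined several times, a reference needs a rule for picking one definition. The Python sketch below assumes the rule used for local labels in the GNU assembler, namely the nearest definition in the given direction; whether SASM resolves them the same way is an assumption, and the function resolve and its data layout are made up for this illustration.

```python
def resolve(definitions, ref, pc):
    """Return the line index that a reference such as '4f' or '4b' points to.

    definitions: list of (line_index, label_name) pairs for numeric labels
    ref:         label name followed by 'f' (forward) or 'b' (backward)
    pc:          line index of the jump instruction
    """
    name, direction = ref[:-1], ref[-1]
    lines = [line for line, label in definitions if label == name]
    if direction == "f":
        ahead = [line for line in lines if line > pc]
        return min(ahead) if ahead else None
    if direction == "b":
        behind = [line for line in lines if line < pc]
        return max(behind) if behind else None
    raise ValueError(f"unknown direction in {ref!r}")

definitions = [(0, "4"), (5, "4"), (9, "4")]
print(resolve(definitions, "4f", 3))  # 5
print(resolve(definitions, "4b", 3))  # 0
```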

When you want to call a public function in another module, you just prefix it with the module name. For example, if you want to call the function bar from the module foo, you can write:

call foo.bar, 42

Conclusion

The language is very tedious to write programs in. You have to keep many different aspects of the language and instructions in mind, while trying to run the algorithms in your head. There aren't many guardrails to help you with this. This could be improved by some static analysis, but the biggest pitfall is the manipulation of the stack.

The easiest mistake you can make is to push too many values onto the stack or to pop too many from it. This will affect all subsequent instructions and function calls. If you are unlucky, this mistake will be caught only after many instructions have already been executed. Since there is no function signature, it is difficult to determine whether the right number of arguments has been passed to a function.
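A first step toward such static analysis could be a stack-depth check over straight-line code. The Python sketch below is only an idea of what that might look like, not a feature of the interpreter. The EFFECTS table lists how many values each argument-free instruction consumes and produces, as inferred from the examples in this article; it ignores instruction arguments, jumps, and function calls.

```python
# (consumed, produced) per instruction without arguments; inferred, not official.
EFFECTS = {
    "dup": (1, 2), "over": (2, 3), "rot": (3, 3), "swap": (2, 2), "drop": (1, 0),
    "add": (2, 1), "sub": (2, 1), "mul": (2, 1), "div": (2, 1),
}

def check(program, depth):
    """Return the final stack depth, or raise ValueError on a stack underflow."""
    for index, name in enumerate(program):
        consumed, produced = EFFECTS[name]
        if depth < consumed:
            raise ValueError(
                f"instruction {index} ({name}) needs {consumed} values, stack has {depth}"
            )
        depth += produced - consumed
    return depth

print(check(["dup", "rot", "over", "swap"], 3))  # 5
print(check(["add", "mul", "sub", "div"], 5))    # 1
```

Comparing the returned depth with the depth a function is expected to leave behind would catch the kind of off-by-one push or pop described above, at least for code without branches.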

In that sense, programming in this language is very much like programming in Assembly.

Created: 2017-05-16 Modified: 2025-11-01