Implements a virtual machine for the Whitespace language.
Whitespace is an imperative, stack-based language. The only
significant tokens are the white space characters: [Space], [Tab], and [LF].
Anything else is treated as a comment and ignored.
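As a rough illustration of the comment rule, the sketch below keeps only the three significant characters and throws everything else away. It is written in Python for illustration only; the function name `significant_tokens` is hypothetical and is not taken from this implementation.

```python
SPACE, TAB, LF = " ", "\t", "\n"

def significant_tokens(source: str) -> str:
    """Return only the characters that Whitespace treats as code."""
    return "".join(ch for ch in source if ch in (SPACE, TAB, LF))

# Letters and punctuation are comments, so only the space, tab and
# line feed survive from this input.
print(repr(significant_tokens("a b\tc\nd")))
```

Note that the spaces between the words of a comment are still code, so comments in real Whitespace programs have to be written without them.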
You can use this implementation to experiment with the Whitespace language, and to disassemble Whitespace code into (slightly) more readable assembler code.
...or, if you use the scripts in the bin directory...
Options are:
-a, gives a more verbose annotation.
-v, but more verbose.

This Whitespace code prints some Fibonacci numbers.
Too hard to read? Here it is with white space made visible.
[Space][Space][Space][Tab][LineFeed] [Space][Space][Space][Space][LineFeed] [Tab][Tab][Space][Space][Space][Space][Tab][LineFeed] [Space][LineFeed] [Space][LineFeed] [Space][Tab][Space][Tab][Tab][Tab][Space][Space][Space][Space][Space][Tab][Tab][Tab][Space][Space][Tab][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][LineFeed] [Space][Space][Space][Tab][LineFeed] [LineFeed] [Space][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][Space][Tab][Tab][Space][Space][Tab][Space][Tab][Space][Tab][Tab][Tab][Tab][Space][Space][Space][Space][Tab][Tab][Tab][Space][Tab][Space][Space][LineFeed] [Space][LineFeed] [Space][LineFeed] [Space][Tab][Space][Tab][Tab][Tab][Space][Space][Space][Space][Space][Tab][Tab][Tab][Space][Space][Tab][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][LineFeed] [Space][LineFeed] [Tab][Space][Tab][Space][Space][Tab][LineFeed] [Tab][Space][Space][Space][Space][Space][Space][Tab][LineFeed] [Space][LineFeed] [Space][Tab][Tab][Tab][Space][Space][Space][Tab][LineFeed] [Tab][Space][Space][Space][Space][LineFeed] [Space][Space][Space][Space][Tab][Space][Space][Space][Space][LineFeed] [Space][LineFeed] [Tab][Tab][Space][Space][Tab][LineFeed] [Tab][Tab][Space][Tab][Tab][Space][Space][Tab][Space][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Tab][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][Space][Tab][Tab][Space][Space][Tab][Space][Tab][LineFeed] [Tab][Tab][Space][LineFeed] [Space][LineFeed] [Space][Tab][Tab][Space][Tab][Tab][Tab][Space][Space][Tab][Tab][Space][Space][Tab][Space][Tab][Space][Tab][Tab][Tab][Tab][Space][Space][Space][Space][Tab][Tab][Tab][Space][Tab][Space][Space][LineFeed] [LineFeed] [Space][Space][Space][Tab][Tab][Space][Space][Tab][Space][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Tab][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][Space][Tab][Tab][Space][Space][Tab][Space][Tab][LineFeed] [LineFeed] [LineFeed] [LineFeed] [LineFeed] 
[Space][Space][Space][Tab][Tab][Tab][Space][Space][Space][Space][Space][Tab][Tab][Tab][Space][Space][Tab][Space][Space][Tab][Tab][Space][Tab][Tab][Tab][Space][LineFeed] [Tab][LineFeed] [Space][Tab][Space][Space][Space][Tab][Space][Tab][Space][LineFeed] [Tab][LineFeed] [Space][Space][LineFeed] [Tab][LineFeed] [LineFeed] [LineFeed] [LineFeed]
If that's still too hard, see the assembler version of the same example.
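The bracketed listing can be converted back into real whitespace (and real whitespace into the bracketed form) with a few lines of code. This is a minimal sketch in Python, not part of this implementation; `from_visible` and `to_visible` are hypothetical names, and the sketch accepts both `[LF]` and `[LineFeed]` because both spellings appear in this README.

```python
import re

# Map each bracketed name to the real character it stands for.
VISIBLE = {"Space": " ", "Tab": "\t", "LF": "\n", "LineFeed": "\n"}

def from_visible(text: str) -> str:
    """Convert '[Space][Tab][LineFeed]' notation into real whitespace.

    Anything outside the brackets (such as the spaces that separate
    instructions in the listing) is ignored.
    """
    names = re.findall(r"\[(\w+)\]", text)
    return "".join(VISIBLE[name] for name in names)

def to_visible(code: str) -> str:
    """Render real whitespace as bracketed names, dropping comments."""
    names = {" ": "[Space]", "\t": "[Tab]", "\n": "[LineFeed]"}
    return "".join(names[ch] for ch in code if ch in names)

# The first instruction of the listing: push the number 1.
push_one = from_visible("[Space][Space][Space][Tab][LineFeed] ")
print(repr(push_one))
print(to_visible(push_one))
```

An unknown name inside brackets raises `KeyError`, which is a deliberate choice here so that typos in a hand-written listing are not silently dropped.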
This implements Whitespace v0.3 from the original Whitespace site
(recovered from the Wayback Machine and now available here). That
means it includes the instructions in the Stack IMP added for v0.3
by the original developer: [Space][LF][LF] and [Space][Tab][LF]
(or in WsAsm, drop and slide).
I found another developer had made a v0.4 which has an instruction added to randomise the stack. This implementation does not include that instruction.
One difference is that this implementation explicitly prevents the use of negative heap addresses. In the original implementation, writing to some negative addresses was OK, but others caused massive memory allocations or a program crash.
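One way to get this behaviour is to check the address before every heap access. The sketch below is in Python and only illustrates the idea; the class `Heap`, the choice of `IndexError`, and the assumption that unwritten cells read as zero are all hypothetical and may not match what this implementation does.

```python
class Heap:
    """Sparse heap that rejects negative addresses (illustrative sketch)."""

    def __init__(self) -> None:
        # A dict keeps memory use proportional to the cells actually written.
        self._cells = {}

    def _check(self, address: int) -> None:
        if address < 0:
            raise IndexError(f"negative heap address: {address}")

    def store(self, address: int, value: int) -> None:
        self._check(address)
        self._cells[address] = value

    def retrieve(self, address: int) -> int:
        self._check(address)
        # Assumption: cells that were never written read as zero.
        return self._cells.get(address, 0)
```

Failing early with a clear error avoids the unpredictable allocations or crashes described above.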
If enabled with the option -x, there are some extensions available
to the original Whitespace language.
[Tab][Tab][LF] (or in WsAsm, x-dump), which outputs info about the
state of the stack and the heap.
[Tab][LF][Space][LF] (or in WsAsm, x-args), gets you access to
command line params as a stream of integers. Every time you use this
instruction, one integer is loaded onto the stack. The first integer
is the number of parameters present. The following integers are the
characters that make up the parameters, with zero as a
separator/terminator. If you read too many times, you will get an
error.
[Tab][LF][Tab][LF][Space] (or in WsAsm, x-readfile), gets you access
to the content of a file as a stream of integers. You push the
address of a file name. Then you use the instruction to read from the
file onto the stack. The first integer returned is a file handle, or
zero if the file could not be opened. You need to push the file
handle before reading so that you read from the right file (there may
be more than one open). The data consists of integer values of the
bytes in the file, or -1 for end of file. To disambiguate calls to
open a file and uses of an existing file handle, file handles are
negative.
[Tab][LF][Tab][LF][LF] (or in WsAsm, x-writefile), lets you write
integers as bytes to a file. You push the address of a file name.
Then you use the instruction to write to the file. The first use
leaves a file handle on the stack, or zero if the file could not be
opened. You need to push the file handle before writing so that you
write to the right file (there may be more than one open). The data
consists of integer values of the bytes. To disambiguate calls to
open a file and uses of an existing file handle, file handles are
negative.
[Tab][LF][Tab][LF][Tab] (or in WsAsm, x-closefile), closes a file.
You push a file handle. Then you use the instruction to close the
file.
[Tab][LF][Tab][Space] (or in WsAsm, readc) returns an integer for a
character as normal, but also -1 for end of input. This means you can
use Whitespace in a pipe.
[LF][LF][LF] (or in WsAsm, end), takes a parameter which is the
return code of the Whitespace interpreter.
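The integer stream described for x-args can be pictured with a short Python sketch. It is an illustration, not this implementation's code: `arg_stream` and `ArgReader` are hypothetical names, and the sketch assumes that every parameter (including the last) is followed by exactly one zero.

```python
from typing import Iterator, List

def arg_stream(params: List[str]) -> Iterator[int]:
    """Yield the parameter count, then each parameter's characters and a 0."""
    yield len(params)
    for param in params:
        for ch in param:
            yield ord(ch)
        yield 0  # assumption: one zero terminates every parameter

class ArgReader:
    """Hands out one integer per x-args instruction."""

    def __init__(self, params: List[str]) -> None:
        self._stream = arg_stream(params)

    def next_int(self) -> int:
        try:
            return next(self._stream)
        except StopIteration:
            # Reading past the end is an error, as described above.
            raise RuntimeError("x-args: no more parameter data") from None

# Two parameters, "ab" and "c".
print(list(arg_stream(["ab", "c"])))
```

Because the count comes first, a Whitespace program can loop over the parameters without ever reading past the end of the stream.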
None of this is enabled unless you run the interpreter with the
extensions flag, -x. The default is standard Whitespace.
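The negative file handle convention used by x-readfile can be sketched as follows. This Python sketch is a guess at the shape of the logic, not this implementation's code. The class `FileReader`, the handle numbering, and the `opener` parameter are hypothetical, and it assumes the file name is stored in the heap as character codes at consecutive addresses, ending with a zero.

```python
from typing import BinaryIO, Callable, Dict

class FileReader:
    """Decides between 'open a file' and 'read a byte' from one operand."""

    def __init__(
        self,
        heap: Dict[int, int],
        opener: Callable[[str], BinaryIO] = lambda name: open(name, "rb"),
    ) -> None:
        self._heap = heap
        self._opener = opener
        self._open_files: Dict[int, BinaryIO] = {}
        self._next_handle = -1  # handles are always negative

    def _file_name(self, address: int) -> str:
        # Assumption: zero-terminated character codes in consecutive cells.
        chars = []
        while self._heap.get(address, 0) != 0:
            chars.append(chr(self._heap[address]))
            address += 1
        return "".join(chars)

    def execute(self, operand: int) -> int:
        """Return the integer that the instruction pushes onto the stack."""
        if operand < 0:
            # Negative operand: an existing handle, so read one byte.
            byte = self._open_files[operand].read(1)
            return byte[0] if byte else -1
        # Non-negative operand: a heap address holding a file name.
        try:
            handle_file = self._opener(self._file_name(operand))
        except OSError:
            return 0  # zero means the file could not be opened
        handle = self._next_handle
        self._next_handle -= 1
        self._open_files[handle] = handle_file
        return handle
```

Heap addresses are never negative in this implementation, so the sign of the operand is enough to tell the two uses apart.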
I occasionally want to get the size of the stack, and there's one unused instruction in the Stack IMP:
[Space][Tab][Tab]
Not so sure about that one. Do I want to waste the only possibility for stack improvements on that?
For real blue sky thinking, how about a mechanism to allow jump targets to be decided at run time? Hmm. Whitespace has one data type, integer, which already has to represent a character, an address and an integer. We'd have to overload it with another meaning: runtime jump target. Well, labels are already represented a lot like integers...
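To make the last remark concrete, here is one hypothetical way to turn a label into an integer, sketched in Python. Nothing like this exists in this implementation; both function names are made up. Reading Space as 0 and Tab as 1 is the obvious mapping, but it makes different labels collide when they differ only in leading spaces, so the second function prefixes a 1 bit to keep them distinct.

```python
def label_to_int(label: str) -> int:
    """Read a label's spaces and tabs as binary digits (Space=0, Tab=1)."""
    value = 0
    for ch in label:
        if ch not in " \t":
            raise ValueError("labels contain only spaces and tabs")
        value = value * 2 + (1 if ch == "\t" else 0)
    return value

def label_to_unique_int(label: str) -> int:
    """Same idea, but with a leading 1 bit so leading spaces still count."""
    return label_to_int("\t" + label)

# "[Space][Tab]" and "[Tab]" are different labels but the same binary number.
print(label_to_int(" \t"), label_to_int("\t"))
print(label_to_unique_int(" \t"), label_to_unique_int("\t"))
```

A run-time jump would then need a table from these integers to instruction positions, built when the program is loaded.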