A Forth for x86-64 personal computers

David Smith 2022 david.a.c.v.smith@gmail.com

New: Check out Andrei Duma's explanation of SmithForth! He has implemented in RISC-V the machine-code part of SmithForth and more!

SmithForth runs on Linux x86-64 systems.
I believe some other Unix x86-64 systems can run Linux ELF binaries. SmithForth should run on these.
SmithForth should run on your Windows system if you have installed the Windows Subsystem for Linux. I haven't tried it.

SmithForth design

SmithForth is an implementation of the Forth programming language for x86-64 desktop computers. SmithForth is a text interpreter that runs in a Linux text console.

You can use SmithForth as you would use any other standard Forth system. SmithForth follows the the Forth standard of 2012. The Forth standard describes some optional word sets concerning features like floating-point arithmetic and dynamic memory allocation that most programmers today would regard as standard, but the Forth community considers optional, perhaps because Forth runs on modest machines like microcontrollers. SmithForth does not yet have floating-point arithmetic, local variables, or file access.

When I consider using a programming environment, I expect it to work on my hardware, and I expect it to be concrete and simple as possible. Good luck finding tools that meet this standard. So I wrote SmithForth. Please take a look at the source code and the comments. SmithForth is implemented in the subroutine-threaded style, which I think is the most straightforward way. The Intel manual is our source of information on the x86 architecture.

We use none of the usual tools from the world of C, not even an assembler. SmithForth is implemented in two source files:

SForth.dmp contains a primitive Forth system in 1000 hand-written bytes of annotated machine code.
system.fs contains 1000 lines of system Forth to complete a standard Forth system.

How to run SmithForth

Combine SForth.dmp and system.fs into one binary file.

Machine code is converted from a human-readable format "DMP" to binary by xxd (hex dump) -r (reverse).
```
$ cut -d'#' -f1 SForth.dmp | xxd -p -r > SForth
$ cat system.fs >> SForth
```
The Forth source text is grafted onto the binary.
Turn the binary file into a proper executable.

Grant permission to execute the file on the Linux system.
```
$ chmod +x SForth
```
You should now have a working executable (if your system is similar enough to mine).

(Optional:) Advanced users might edit the kernel or system.fs. You must ensure that the ELF segment header entry p_filesz contains the number of bytes of the segment that appear in binary file SForth. In my design, that's the whole file, including ELF header, machine code, and system Forth. We prefer not to count those bytes by hand. An easy solution is:
1. Make the binary as above. Don't run the binary yet.
2. Ask the system for the size of the binary file.
3. Write the result into field p_filesz of dmqc.dmp.
4. Remake the binary. The binary is ready to run.
My script make.sh does these steps automatically:
```
$ ./make.sh # if and only if you modified SForth.dmp or system.fs
```
Run the SmithForth executable.

You can use SmithForth interactively, or you can feed your program into the standard input stream.
```
$ ./SForth
$ ./SForth < YourProgram.fs
$ cat YourProgram.fs - | ./SForth
```
The last command allows you to use SmithForth interactively after your programs.

Numbers in machines

If you aren't an experienced programmer, here are some things to know before you read the DMP source file.

Hexadecimal

Numbers in machine-code listings are often written in hexadecimal, the base-sixteen number system with numerals 0123456789ABCDEF. In hexadecimal, the nonnegative integers are:

0, 1, 2, ..., 9, A, B, C, D, E, F, 10, 11, ..., 1F, 20, ..., FF, 100, 101, ...

Bytes

A byte is eight bits. The eight bits of a byte have 2⁸ different states, but our alphabet has fewer symbols. Instead we write these states in base sixteen (=2⁴) using a pair of hexadecimal numerals. The states of a byte are:

00, 01, 02, ..., FE, FF.

Binary

Sometimes we need to convert hexadecimal numbers to and from binary (base 2). Conveniently, one base is the fourth power of the other, and the conversion can be done in one-to-four fashion, repeatedly, independently. For example:

3E hexadecimal = 0011 1110 binary,

as 3 = 0011 and E = 1110. You may see bits of a binary number grouped in various ways. For example, x86 instruction "sub r/m32, r32" has a ModR/M byte written:

11 111 000.

Bits 11 select an access mode for argument r/m32, and bits 000 select register EAX for that argument. The middle three bits 111 select register EDI. To convert this into hexadecimal, we regroup the eight bits:

1111 1000 binary = F8 hexadecimal (as 1111 = F and 1000 = 8).

Endianness

In hexadecimal, and in our usual decimal system, numbers with more than one numeral are written with more significant numerals first:

Decimal has hundreds before tens before ones.
Hexadecimal has two hundred fifty-sixes before sixteens before ones.

Such systems are big-endian. Little-endian is the reverse order, with less significant numerals first.

The little-endian x86

The x86 architecture is little-endian in bytes (that is, little-endian in base 2⁸) ...

When an integer is moved from a CPU register into memory, for example, the least significant byte appears first in memory.

... but each byte is written as a big-endian pair of hexadecimal numerals. This custom is observed:

in Intel's manuals,
in the output of tools (like xxd) that convert binary files to text, and
in the input of tools that convert text (like our DMP file) to binary files.

For example,

hexadecimal 12 is eighteen,
hexadecimal 12 00 is eighteen (= 0012 big-endian), and
hexadecimal 12 00 00 00 is eighteen (= 00000012 big-endian).

Machine arithmetic is modular

There are infinitely many integers, but digital machines have finitely many states. Integers processed by the Arithmetic Logic Unit (ALU) of the machine are more aptly called congruence classes, as the ALU performs modular arithmetic for addition, subtraction, and multiplication, where the modulus of the congruence relation is a power of 2, like 2⁸. However, some operations involve a particular representative of a congruence class.

Two rules are commonly used to select a representative. They are:

unsigned, where the representatives are
0, 1, 2, ..., modulus - 1, and
signed, where the representatives are
0, 1, ..., modulus/2 - 1, (nonnegative) and

-1, -2, ..., modulus/2 (negative);
but the machine has no "minus sign," so instead we use
modulus - 1, modulus - 2, ..., modulus/2 (negative).

For example, if the modulus is 2⁸, the signed representatives are

00, 01, ..., 7F, (nonnegative) FF, FE, ..., 80 (negative).

We can tell negative from nonnegative by the most significant bit.

Nonnegatives have most significant bit 0.
Negatives have most significant bit 1.

Watch for "unsigned" and "signed" in the Intel manual.

Implementing Forth

If you aren't an experienced Forth programmer, here are some things to know before you read the Forth system file.

Conditionals

Compiling in Forth is simple. The compiler sees a word W, looks it up in the dictionary, and appends to the body of the current dictionary entry a CALL instruction. The target of the CALL is the address of the body of the dictionary entry of word W.

CALL facilitates the use of subroutines, which normally return control to the point in the instruction stream where CALL occurred. There are other forms of flow control, mainly conditionals and loops.

: X ... flag IF  ...  ( do this if flag is nonzero )  ...  THEN ... ;
              \                                               /
               -->-- ( skip to THEN if flag is zero ) ---->---

Forth has words IF and THEN for conditional execution. They are used in compilation mode. Normally IF is paired with an instance of THEN later in a word definition. When the interpreter reaches IF at run time, it examines the number at the top of the stack. If the number is nonzero, execution continues past IF. If the number is zero, execution jumps past THEN.

The microprocessor has flow-control features including jumps and conditional jumps. The Forth compiler, when it sees IF, might emit a conditional jump. A likely instruction of this kind in x86 is JZ, jump if zero. JZ has a parameter that indicates where to jump to, past THEN in this case. However, when the compiler first sees IF, it has not yet seen any THEN. So the plan is:

Upon IF, reserve space in the compiler's output for instruction JZ and produce (push) a reference.
Continue compiling as usual.
Upon THEN, consume (pop) a reference and finish formulating instruction JZ. In the Forth standard, this is resolving a forward reference.

Forth implements this plan by using the data stack during compilation:

IF ( flag -- ) Compilation: ( -- orig ) ... ;
THEN ( -- ) Compilation: ( orig -- ) ... ;

Stack-effect comments like ( bef -- aft ) describe the end of the stack before and after the word is executed. The LIFO property of the stack allows IF-THEN statements to be nested.

Loops

Forth offers several ways to write loops. One is BEGIN-WHILE-REPEAT.

              --------------------------------<---------------------------
             /                                                            \
: X ... BEGIN ... flag WHILE  ...  ( do this if flag is nonzero )  ...  REPEAT ... ;
                         \                                                   /
                          --->-- ( skip to REPEAT if flag is zero ) ---->----

BEGIN ( -- ) Compilation: ( -- dest ) ... ;
WHILE ( flag -- ) Compilation: ( dest -- orig dest ) ... ;
REPEAT ( -- ) Compilation: ( orig dest -- )

BEGIN-WHILE-REPEAT is versatile, allowing to exit from the middle of a loop body. BEGIN-WHILE-REPEAT can emulate other control structures:

IF-THEN:: flag BEGIN WHILE ... 0 REPEAT
IF-ELSE-THEN:: flag DUP BEGIN WHILE DROP ... 1 0 REPEAT 0= BEGIN WHILE ... 0 REPEAT
BEGIN-UNTIL:: BEGIN ... flag 0= WHILE REPEAT

Simpler software

Here are my opinions and motivations for this project. Forth is a simple programming language that achieved some fame back when personal computers were young. What else would run on such modest computers? Nowadays computers and languages are fancy and complex. I say Forth is still a good choice. Forth is simple enough that we may reason about it and understand it. Forth provides:

features found in most programming languages:
- conditionals (if-else)
- loops
- variables
- user-defined functions
and more:
- CREATE
- DOES>

A good system helps us to solve our problems quickly. Some systems try to be comprehensive. Others keep it simple. SmithForth is small and flat enough that a programmer can understand it all. SmithForth has only a couple files. We use none of the usual tools from the world of C. You might argue that C makes programming simpler by insulating us from the workings of the machine. I might agree if:

in our daily work, we are allowed to focus on a small number of layers, and
each layer is well documented,

but this has not been my experience. C is intentionally vague about the dimensions of machines -- an irritation the programmer -- so that programs are portable -- a feature I never use. There is always one kind of machine available, usually x86. The Intel architecture is fairly well documented, and although the Intel manual is huge, we can find the few parts we need. For an introduction to x86 machine-language programming (without Forth), please see my video series on that topic.

I can imagine a system that contains its own documentation and C source code and tailors these to the computer on which it is installed in such a way that the user can study them and be unconcerned with other computers. No one does this, as far as I know.

Forth has been implemented many times. Some Forths have a small kernel so that the system can be ported easily to other architectures (by implementing only a small number of kernel words in machine code). But if you want the system to run well, you have to write more machine code anyway, don't you? My goal with SmithForth is not to stop writing machine code early, but to start writing Forth early. In SmithForth, our early Forth contains a lot of machine code.

I believe SmithForth is a good first Forth to learn and a good first programming environment. The shortened presentation may appeal to mathematicians.

If you want to know my ideas for the direction of this project, I would like to write extensions so that SmithForth becomes fast and suitable for scientific computing; and to add enough features of Lisp, Python, Awk, and Tcl that I wouldn't think to use those languages. I would like to adapt it to other machines and make it usable without an underlying operating system like Linux.