# David Smith 2022 david.a.c.v.smith@gmail.com http://dacvs.neocities.org/
# This is the 1st of the 2 files that define SmithForth, a subroutine-threaded Forth for x86-64.

# Understanding this file may be easier if first you watch my video series on hand-made
# Linux x86 executables (featuring the ELF header and general-purpose x86 instructions):
#     https://www.youtube.com/playlist?list=PLZCIHSjpQ12woLj0sjsnqDH8yVuXwTy3p

# ============= ELF FILE HEADER
#
# Linux will run a computing job given the name of an executable file. An executable file
# contains machine code for the processor and information for the operating system about the
# layout of the file and the dimensions of the job. Working without the usual development
# tools, we write this information by hand.

7F 45 4C 46                 # e_ident[EI_MAG]: ELF magic number
            02              # e_ident[EI_CLASS]: 1: 32-bit, 2: 64-bit
               01           # e_ident[EI_DATA]: 1: little-endian, 2: big-endian
                  01        # e_ident[EI_VERSION]: ELF header version; must be 1
                     00     # e_ident[EI_OSABI]: Target OS ABI; should be 0
00                          # e_ident[EI_ABIVERSION]: ABI version; 0 is ok for Linux
   00 00 00 00 00 00 00     # e_ident[EI_PAD]: unused, should be 0
02 00                       # e_type: object file type; 2: executable
      3E 00                 # e_machine: instruction set architecture; 3: x86, 3E: amd64
            01 00 00 00     # e_version: ELF identification version; must be 1
78 00 40 00 00 00 00 00     # e_entry: memory address of entry point (where process starts)
40 00 00 00 00 00 00 00     # e_phoff: file offset where program headers begin (34: 32-bit, 40: 64)
00 00 00 00 00 00 00 00     # e_shoff: file offset where section headers begin
00 00 00 00                 # e_flags: 0 for x86
            40 00           # e_ehsize: size of this header (34: 32-bit, 40: 64-bit)
                  38 00     # e_phentsize: size of each program header (20: 32-bit, 38: 64-bit)
01 00                       # e_phnum: number of program headers
      40 00                 # e_shentsize: size of each section header (28: 32-bit, 40: 64-bit)
            00 00           # e_shnum: number of section headers
                  00 00     # e_shstrndx: index of section header containing section names

# ============= ELF PROGRAM HEADER

01 00 00 00                 # p_type: segment type; 1: loadable
            07 00 00 00     # p_flags: segment-dependent flags (1: X, 2: W, 4: R)
00 00 00 00 00 00 00 00     # p_offset: file offset where segment begins
00 00 40 00 00 00 00 00     # p_vaddr: virtual address of segment in memory (amd64: 00400000)
00 00 00 00 00 00 00 00     # p_paddr: physical address of segment, unspecified by 386 supplement
02 1E 01 00 00 00 00 00 ##### p_filesz: size in bytes of the segment in the file image (see make.sh)
00 00 C0 7F 00 00 00 00     # p_memsz: (>= filesz) size in bytes of the segment in memory
00 10 00 00 00 00 00 00     # p_align: 1000 for x86

# ============= 64-BIT EXTENSIONS
#
# SmithForth has 64-bit Forth cells (i.e., integers) and uses the instruction set
# x86-64 (a.k.a. amd64). Changes from x86 to x86-64 are explained in Intel's
# manual (and in AMD's). See for instance Vol 1 Sec 3.2.1, Vol 2 Sec 2.2.1, and the
# instruction set reference, Vol 2 Chs 3, 4, 5. There are many subtle details. 
#
# In a nutshell, general-purpose registers are widened from 32 bits to 64. The old eight
# 32-bit general-purpose registers EAX, ECX, ..., EDI are still available as operands.
# They are the lower halves of their new 64-bit counterparts RAX, RCX, ..., RDI.
# There are also 8 new 64-bit general-purpose registers R8, R9, ..., R15.
#
# The x86-64 instruction set is almost a superset of x86. Many of the extensions are
# 64-bit counterparts to old 32-bit instructions. Often an instruction on 64-bit operands
# is obtained from a 32-bit instruction by adding a REX prefix byte valued 40 ... 4F (see below).
# Bytes ModR/M and SIB are used in x86-64 as in x86. These bytes provide only 3-bit fields to
# select operands. If we want to select a new register R8 ... R15 as an operand, each 3-bit field
# should have another bit. These bits occur in the REX byte.
#
# The 4 high bits of REX are 0100 (=4). The 4 low bits are named WRXB: (see Intel manual Vol 2 Sec 2.2.1)
#   W=1 iff certain operands are 64 bits wide
#       "REX.W" (W=1) or "REX" (W=0) appears in the instruction reference for most of our instructions.
#   R=1 iff a register R8 ... R15 is referred to by:
#       field reg (middle) of the ModR/M byte;
#   X=1 iff a register R8 ... R15 is referred to by:
#       field index of the SIB byte;
#   B=1 iff a register R8 ... R15 is referred to by:
#       field r/m (last) of the ModR/M byte,
#       field base of the SIB byte, or
#       field reg of the opcode.
#
# Most operations that set a 32-bit register (the lower half of a 64-bit register) also zero
# out the higher 32 bits of the containing 64-bit register. For example, XOR EAX, EAX = 31 C0
# (even without prefix REX) sets all 64 bits of RAX to 0.
#
# Q: In an x86-64 instruction like CMP r/m8, imm8 with opcode 80 /7 ib, does ModR/M byte 00 111 000 refer to [eax] or to [rax]?
# A: [rax]. Register ax contains an address, not an operand. The default AddressSize in 64-bit mode is 64 bits.
# AddressSize is defined not in our usual Volume 2 of Intel's manual, but in Volume 1. See Table 3-4.
# See also AMD's manual, Table 1-3 (p. 9) of https://www.amd.com/system/files/TechDocs/24594.pdf

# ============= FORTH INTERPRETER
#
# Forth words are defined in terms of more primitive Forth words. The most primitive
# SmithForth words are defined in machine language. SmithForth is written "from scratch."
# In the beginning, there is only machine code. We want to switch from writing machine
# code to writing (more pleasant) Forth. Our immediate goal is to write a simple Forth
# interpreter in machine code. Here is an example of Forth input:
#
#     1 2 + . ( Interpreting, not compiling, so far. )
#     : newWord ( -- ) ." After colon, compiling." ; ( After semicolon, interpreting. )
#     newWord ( Interpreting: newWord is executed )
#
# The interpreter reads words and executes them until it reaches a colon. After the colon,
# the interpreter stops executing and starts compiling. To keep the design simple, perhaps
# the colon should require no special treatment. Let the colon, when executed, add a new word
# to the dictionary. We have the following TENTATIVE PLAN:
#
# Interpret:
#     Consume the next word (from the input stream).
#     Find it in the dictionary.
#     Execute it.
#     Go to Interpret.
#
# Colon:
#     Name:
#         Consume the next word (the first word of the colon definition) and write a new Forth
#         header signifying a new dictionary entry named by this word.
#     Constituent:
#         Consume the next word.
#         If it is a semicolon, exit routine Colon. Else:
#         Find the word in the dictionary.
#         Compile the word (append x86 instr. CALL to the dict. entry of the current definition).
#         Go to Constituent.
#
# Loops Interpret and Constituent are similar.
# We can combine them into one loop if we remember whether we are
#     outside a definition (STATE = Interpreting) or
#     inside a definition (STATE = Compiling).
# Some words like semicolon should not be compiled, even when Compiling.
# Such words are labeled "immediate" and treated specially. We have:
#
# ============= THE PLAN
#     (a typical Forth interpreter, simplified)
#
# Set STATE to Interpreting.
# Loop:
#     Consume the next word.
#     Find it in the dictionary.
#     If STATE is Interpreting, or if the word is immediate:
#         Execute the word.
#     Else:
#         Compile the word.
#     Go to Loop.
#
# Colon:
#     Consume the next word and make it the name of a new dictionary entry.
#     Set STATE to Compiling.
#     Return from subroutine Colon.
#
# Semicolon, an immediate word:
#     Set STATE to Interpreting.
#     Return from subroutine Semicolon.
#
# Our first interpreter cannot recognize whole words. We provide special
# commands to start a definition or compile or execute a word. The input
# stream is binary. This binary interpreter (bi) transmits most bytes (all
# but 99) unchanged. A command begins with byte 99. After 99 is a 1-byte
# argument indicating which command is issued. If the command indicates a
# definition, the argument also encodes the length of the word's name, and
# the name of the word is given in full. If the command is to execute or
# compile the word, only the first character of the name is provided, encoded
# in the argument.
#
# MACHINE CODE ########## INTENTION ############ 78 INSTRUCTION ####### OPCODE ######## ModR/M #### SIB ######
BE B2 00 40 00          #:rsi(input)  = 004000__    mov r32, imm32      B8+rd id
BF 30 00 00 10          # rdi(output) = 10000030    mov r32, imm32      B8+rd id

######################### binary interpreter >>> 82 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
E8 02 00 00 00          #+call (bi)                 call rel32
EB F9                   #-jump bi                89 jmp rel8            EB cb
# # # # # # # # # # # # # (bi)                   89
AC                      # al = [rsi++]              lods m8             AC
3C 99                   # cmp al, 99(command)       cmp al, imm8        3C ib
74 02                   #+jump _command if ==    8E je rel8             74 cb
AA                      # [rdi++] = al (xmit)       stos m8             AA
C3                      # return                    ret                 C3
# _command:             #                        90
BA 28 00 00 10          # rdx = Latest              mov r32, imm32      B8+rd id
AC                      # al = [rsi++] (argument)   lods m8             AC
A8 60                   # al & 60(graphic)?         test al, imm8       A8 ib
74 31                   #+jump Head if zero      9A jz rel8             74 cb
48 8B 1A                # rbx = [rdx]               mov r64, r/m64      REX.W 8B /r     00 011 010
# _find1:               #                        9D
50                      # push al                   push r64            50+rd
24 7F                   # al &= 7F                  and al, imm8        24 ib
3A 43 11                # cmp al, [rbx+11]          cmp r8, r/m8        REX 3A /r       01 000 011
58                      # pop al                    pop r64             58+rd
74 06                   #+jump _match if ==      A6 je rel8             74 cb
48 8B 5B 08             # rbx = [rbx+8]             mov r64, r/m64      REX.W 8B /r     01 011 011
EB F1                   #-jump _find1            AC jmp rel8            EB cb
# _match:               #                        AC
A8 80                   # al & 80(exec) ?           test al, imm8       A8 ib
74 09                   #+jump COMPL if zero     B0 jz rel8             74 cb
FF 23                   # jump [rbx] (exec)      B2 jmp r/m64           REX FF /4       00 100 011

######################### Interpreter subroutines ################################################

99 05 43 4F 4D 50 4C #### COMPL Forth's COMPILE, B9 ( ebx=xt -- )
B0 FF AA                # compile >>>>>>>>>>>>>>>>> call r/m64          FF /2           00 010 100  00 100 101
B0 14 AA                #     al = _                mov r8, imm8        B0+rb ib
B0 25 AA                #         [rdi++] = al      stos m8             AA
93                      # eax = ebx                 xchg eax, r32       90+rd
AB                      # [rdi(++4)] = eax          stos m32            AB
C3                      # return                    ret                 C3

99 04 48 65 61 64 ####### Head ================= CB ( al=flag rdx=Latest rsi=addr -- rdx=Latest rsi=addr' )
48 83 C7 0F             # rdi += 0F                 add r/m64, imm8     REX.W 83 /0 ib  11 000 111
48 83 E7 F0             # rdi &= F0                 and r/m64, imm8     REX.W 83 /4 ib  11 100 111
48 8B 0A                # rcx = [rdx]               mov r64, r/m64      REX.W 8B /r     00 001 010
48 89 4F 08             # [rdi+8] = rcx             mov r/m64, r64      REX.W 89 /r     01 001 111
48 89 3A                # [rdx] = rdi               mov r/m64, r64      REX.W 89 /r     00 111 010
48 83 C7 10             # rdi += 10                 add r/m64, imm8     REX.W 83 /0 ib  11 000 111
AA                      # [rdi++] = al              stos m8             AA
91                      # ecx = eax                 xchg eax, r32       90+rd
83 E1 1F                # ecx &= 1F                 and r/m32, imm8     83 /4 ib        11 100 001
F3 A4                   # copy Name                 rep movs m8, m8     F3 A4
48 8B 0A                # rcx = [rdx]               mov r64, r/m64      REX.W 8B /r     00 001 010
48 89 39                # [rcx] = rdi               mov r/m64, r64      REX.W 89 /r     00 111 001
C3                      # return                    ret                 C3

# ============= DICTIONARY FORMAT
#
# Each SmithForth dictionary entry begins with:
#     (8 bytes)         Code
#     (8 bytes)         Link
#     (1 byte)          Flag (3 bits) and Length (5 bits) of Name
#     (Length bytes)    Name, where Length < 2^5.
# Each subroutine call refers to its callee. See argument ZZ in the following example:
#
# WW WW WW WW WW WW WW WW # Code: address of a subroutine (usually right after Name)
# XX XX XX XX XX XX XX XX # Link: address of the next earlier dictionary entry
# YY                      # Flag: 80=IMMEDIATE, 40=HIDDEN ; Name Length
# 2E 53                   # Name: .S ( -- ) show the values on the data stack
# 4D 89 7F F8             # [r15-8] = r15 (obuf)      mov r/m64, r64      REX.W 89 /r     01 111 111
# 49 C7 47 F0 00 00 00 10 # [r15-10] = 10000000 (len) mov r/m64, imm32    REX.W C7 /0 id  01 000 111
# 4D 29 7F F0             # [r15-10] -= r15           sub r/m64, r64      REX.W 29 /r     01 111 111
# 49 83 EF 10             # r15 -= 2 cells            sub r/m64, imm8     REX.W 83 /5 ib  11 101 111
# FF 14 25 ZZ ZZ ZZ ZZ    # call TYPE                 call r/m64          FF /2           00 010 100  00 100 101
# C3                      # return                    ret                 C3

99 03 42 59 45 ########## BYE ( -- ) =============================================================
6A 3C 58                # rax = exit (no return)    push imm8; pop      6A ib ; 58+rd
31 FF                   # rdi = stat                xor r/m32, r32      31 /r           11 111 111
0F 05                   # syscall                   syscall             0F 05

# 99 C2                   # BYE

# Linux syscall: ( RDI RSI RDX R10 R8 R9 RAX=syscall# -- RAX=stat RCX=? R11=? )
#     Manual pages on system calls: `man 2 syscalls ; man 2 exit ; man 2 read ; man 2 write ; man 2 mmap`
#     syscall numbers: /usr/include/x86_64-linux-gnu/asm/unistd_64.h
#     syscall error numbers: /usr/include/asm-generic/errno-base.h
#     mmap flag values: /usr/include/asm-generic/mman-common.h

99 04 54 59 50 45 ####### TYPE ( rsi=addr rdx=u -- rsi=? rdi=? ) show memory [addr, addr+u) ======
6A 01 5F                # rdi(fd) = stdout = 1      push imm8; pop      6A ib ; 58+rd
# _beg:                 #                        00
8B C7                   # rax = write = 1 = rdi     mov r32, r/m32      8B /r           11 000 111
0F 05                   # syscall                   syscall             0F 05
48 85 C0                # cmp rax, 0                test r/m64, r64     REX.W 85 /r     11 000 000
7C 08                   #+jump _end if <         09 jl rel8             7C cb
48 01 C6                # rsi(buf) += rax           add r/m64, r64      REX.W 01 /r     11 000 110
48 29 C2                # rdx(cnt) -= rax           sub r/m64, r64      REX.W 29 /r     11 000 010
7F EF                   #-jump _beg if >         11 jg rel8             7F cb
# _end:                 #                        11
C3                      # return                    ret                 C3

# ============= DEBUGGING
#
# During development, a program like this one may crash with an uninformative error message like
# "Segmentation fault" or "Illegal instruction." How can we work in such an environment?
# We start with a trivial program that works (i.e., simply invokes syscall exit, as in BYE),
# and then expand it gradually until it does what we want. When a program breaks after a small
# change, we know where the bug is. Here is one way to go.
#
# Insert a jump to BYE at the top of the program. You have to compute the length of the jump.
# After this chore, updating it is easy if you expand the program only one instruction at a time.
# You will want to disable and enable parts of the program as you expand it. The most basic ways:
# -- Hide unwanted code in comments. If this disrupts byte counts, replace lost bytes by no-op
#    instructions NOP = 90.
# -- Inside a subroutine, leave early by inserting a return instruction RET = C3.

99 03 64 62 67 ########## dbg ( -- ) show stack and data; use `./SForth | xxd -o 0x0fffffe0` =====
56 57                   # push rsi, rdi             push r64            50+rd
BE E0 FF FF 0F          # rsi = addr                mov r32, imm32      B8+rd id
BA 00 0A 00 00          # rdx = u                   mov r32, imm32      B8+rd id
99 54                   # Call TYPE
5F 5E                   # pop rdi, rsi              pop r64             58+rd
C3                      # return                    ret                 C3

# 99 E4 99 C2             # dbg BYE

99 03 72 65 67 ########## reg ( -- ) show registers; use `./SForth | xxd` ========================
56 57                   # push rsi, rdi             push r64            50+rd
41 57 57 41 56 56       # push r15, rdi, r14, rsi   push r64            REX 50+rd
41 55 55 41 54 54       # push r13, rbp, r12, rsp   push r64            REX 50+rd
41 53 53 41 52 52       # push r11, rbx, r10, rdx   push r64            REX 50+rd
41 51 51 41 50 50       # push r9 , rcx, r8 , rax   push r64            REX 50+rd
48 8B F4                # rsi = rsp                 mov r64, r/m64      REX.W 8B /r     11 110 100
BA 80 00 00 00          # rdx = u                   mov r32, imm32      B8+rd id
99 54                   # Call TYPE
48 83 EC 80             # rsp -= -80                sub r/m64, imm8     REX.W 83 /5 ib  11 101 100
5F 5E                   # pop rdi, rsi              pop r64             58+rd
C3                      # return                    ret                 C3

# 99 F2 99 C2             # reg BYE

# ============= TEXT INTERPRETER
#
# Standard Forth handles input one line at a time.
# SmithForth's text interpreter is a simple interpreter in the standard Forth style.
# SVAL (see standard Forth's EVALUATE) interprets each line.
# REFILL fetches a line of input, including its trailing LF, and sets the input source state.
#     10000000 #IN      cell contains #characters in the current line.
#     10000008 TIB      cell contains the address where the current line begins.
#     10000010 >IN      cell contains #characters in the current line that have been parsed.
#     10000020 STATE    cell contains 0(Interpreting) or 1(Compiling).
#     10000028 Latest   cell contains the execution token (xt) of the latest defined Forth word.
# In Forth, to parse is to remove from the input stream. As a line is parsed, [>IN] increases from 0 to [#IN].
# Forth's "parse area" is the part of the line not yet parsed.

99 06 52 45 46 49 4C 4C # REFILL ( -- ) ==========================================================
49 C7 C1 00 00 00 10    # r9 = VAR                  mov r/m64, imm32    REX.W C7 /0 id  11 000 001
49 8B 01                # rax = [#IN]               mov r64, r/m64      REX.W 8B /r     00 000 001
49 01 41 08             # [TIB] += rax              add r/m64, r64      REX.W 01 /r     01 000 001
49 83 21 00             # [#IN] = 0                 and r/m64, imm8     REX.W 83 /4 ib  00 100 001
49 83 61 10 00          # [>IN] = 0                 and r/m64, imm8     REX.W 83 /4 ib  01 100 001
# _beg:                 #                        00
49 FF 01                # [#IN]++                   inc r/m64           REX.W FF /0     00 000 001
49 8B 41 08             # rax = [TIB]               mov r64, r/m64      REX.W 8B /r     01 000 001
49 03 01                # rax += [#IN]              add r64, r/m64      REX.W 03 /r     00 000 001
80 78 FF 0A             # cmp [rax-1], LF           cmp r/m8, imm8      80 /7 ib        01 111 000
75 F0                   #-jump _beg if !=        10 jne rel8            75 cb
C3                      # return                    ret                 C3

99 04 73 65 65 6B ####### seek ( cl dl "ccc" -- eflags ) parse until 1st char of parse area is within [cl, dl) or parse area is empty
49 C7 C1 00 00 00 10    # r9 = VAR                  mov r/m64, imm32    REX.W C7 /0 id  11 000 001
2A D1                   # dl -= cl                  sub r8, r/m8        2A /r           11 010 001
# _beg:                 #                        00                 like WITHIN ( al cl dl -- eflags )
49 8B 41 10             # rax = [>IN]               mov r64, r/m64      REX.W 8B /r     01 000 001
49 3B 01                # cmp rax, [#IN]            cmp r64, r/m64      REX.W 3B /r     00 000 001
73 16                   #+jump _end if U>=       09 jae rel8            73 cb
49 8B 41 08             # rax = [TIB]               mov r64, r/m64      REX.W 8B /r     01 000 001
49 03 41 10             # rax += [>IN]              add r64, r/m64      REX.W 03 /r     01 000 001
8A 00                   # al = [rax]                mov r8, r/m8        8A /r           00 000 000
2A C1                   # al -= cl                  sub r8, r/m8        2A /r           11 000 001
3A C2                   # cmp al, dl                cmp r8, r/m8        3A /r           11 000 010
72 06                   #+jump _end if U<        19 jb rel8             72 cb
49 FF 41 10             # [>IN]++                   inc r/m64           REX.W FF /0     01 000 001
EB E1                   #-jump _beg              1F jmp rel8            EB cb
# _end:                 #                        1F
C3                      # return                    ret                 C3

99 05 50 41 52 53 45 #### PARSE ( cl dl "ccc<char>" -- rbp=addr rax=u ) addr: where ccc begins ; u: length of ccc
49 C7 C1 00 00 00 10    # r9 = VAR                  mov r/m64, imm32    REX.W C7 /0 id  11 000 001
49 8B 69 10             # rbp = [>IN]               mov r64, r/m64      REX.W 8B /r     01 101 001
99 73                   # Call seek         parse until 1st instance within [cl, dl) is parsed or parse area empty
49 8B 41 10             # rax = [>IN]               mov r64, r/m64      REX.W 8B /r     01 000 001
73 04                   #+jump _end if U>=       00 jae rel8            73 cb
49 FF 41 10             # [>IN]++                   inc r/m64           REX.W FF /0     01 000 001
# _end:                 #                        04
48 29 E8                # rax -= rbp                sub r/m64, r64      REX.W 29 /r     11 101 000
49 03 69 08             # rbp += [TIB]              add r64, r/m64      REX.W 03 /r     01 101 001
C3                      # return                    ret                 C3

99 05 70 6E 61 6D 65 #### pname ( "<spaces>ccc<space>" -- rbp=addr rax=u ) PARSE-NAME ============
B1 21 B2 7F             # (cl, dl) = (BL+1, ...)    mov r8, imm8        B0+rb ib
99 73                   # Call seek
B1 7F B2 21             # (cl, dl) = (..., BL+1)    mov r8, imm8        B0+rb ib
99 50                   # Call PARSE
C3                      # return                    ret                 C3

99 81 5B ################ [ ( -- ) lbracket IMMEDIATE ============================================
6A 00                   # push 0(Interpreting)      push imm8           6A ib
8F 04 25 20 00 00 10    # pop [STATE]               pop r/m64           8F /0           00 000 100  00 100 101
C3                      # return                    ret                 C3

99 01 5D ################ ] ( -- ) rbracket ======================================================
6A 01                   # push 1(Compiling)         push imm8           6A ib
8F 04 25 20 00 00 10    # pop [STATE]               pop r/m64           8F /0           00 000 100  00 100 101 
C3                      # return                    ret                 C3

99 81 5C ################ \ ( "ccc<eol>" -- ) backslash IMMEDIATE ================================
48 8B 04 25 00 00 00 10 # rax = [#IN]               mov r64, r/m64      REX.W 8B /r     00 000 100  00 100 101
48 89 04 25 10 00 00 10 # [>IN] = rax               mov r/m64, r64      REX.W 89 /r     00 000 100  00 100 101
C3                      # return                    ret                 C3

99 81 28 ################ ( ( "ccc<rparen>" -- ) lparen IMMEDIATE ================================
B1 29 B2 2A             # (cl, dl) = (RP, RP+1)     mov r8, imm8        B0+rb ib
99 50                   # Call PARSE            Forth 2012 implies comment ends at rparen or newline.
C3                      # return                    ret                 C3

99 01 3A ################ : ( "<spaces>ccc<space>" -- ) colon ====================================
99 70                   # Call pname                            See Forth 2012 Table 2.1
48 89 EE                # rsi = rbp                 mov r/m64, r64      REX.W 89 /r     11 101 110
BA 28 00 00 10          # rdx = Latest              mov r32, imm32      B8+rd id
99 48                   # Call Head
48 8B 0A                # rcx = [rdx]               mov r64, r/m64      REX.W 8B /r     00 001 010
48 83 C1 10             # rcx += 10                 add r/m64, imm8     REX.W 83 /0 ib  11 000 001
80 09 40                # [rcx] |= 40(HIDDEN)       or r/m8, imm8       80 /1 ib        00 001 001
99 5D                   # Call ]
C3                      # return                    ret                 C3

99 81 3B ################ ; ( C: -- ) semicolon IMMEDIATE ========================================
B0 C3                   # al = opcode ret           mov r8, imm8        B0+rb ib
AA                      # [rdi++] = al              stos m8             AA
48 8B 0C 25 28 00 00 10 # rcx = [Latest]            mov r64, r/m64      REX.W 8B /r     00 001 100  00 100 101
48 83 C1 10             # rcx += 10                 add r/m64, imm8     REX.W 83 /0 ib  11 000 001
80 21 BF                # [rcx] &= BF(~HIDDEN)      and r/m8, imm8      80 /4 ib        00 100 001
99 5B                   # Call [
C3                      # return                    ret                 C3

99 01 2E ################ . ( char -- ) nonstandard name for C, ==================================
41 8A 07                # al = [r15]                mov r8, r/m8        REX 8A /r       00 000 111
49 83 C7 08             # r15 += 8                  add r/m64, imm8     REX.W 83 /0 ib  11 000 111
AA                      # [rdi++] = al              stos m8             AA
C3                      # return                    ret                 C3

99 83 4C 49 54 ########## LIT ( C: x -- ) ( -- x ) IMMEDIATE ===================================== TODO compare xchg r15, rsp ; push imm8 ; xchg r15, rsp
B8 49 83 EF 08 AB       # compile r15 -= 8          sub r/m64, imm8     REX.W 83 /5 ib  11 101 111
B8 6A 41 8F 07 AA       # eax = push x ; pop [r15]  push i8 ; pop r/m64 6A ib;REX 8F /0 00 000 111
41 8A 07 AB             # al = [r15] ; compile      mov r8, r/m8        REX 8A /r       00 000 111
49 83 C7 08             # r15 += 8                  add r/m64, imm8     REX.W 83 /0 ib  11 000 111
C3                      # return                    ret                 C3

99 03 78 74 3D ########## xt= ( rbp=addr rax=u rbx=xt -- rbx=xt rax=? rdi=? eflags ) rbx == 0 or unhidden and matches
48 85 DB                # rbx(xt) ?                 test r/m64, r64     REX.W 85 /r     11 011 011
75 01                   #+jump _nonzero if != 0     jnz rel8            75 cb
C3                      # return                    ret                 C3
# _nonzero:             #
48 8B C8                # rcx = rax(u)              mov r64, r/m64      REX.W 8B /r     11 001 000
48 8D 73 10             # rsi = rbx(xt) + 10        lea r64, m          REX.W 8D /r     01 110 011
AC                      # al = [rsi++]              lods m8             AC
A8 40                   # al & 40(HIDDEN) ?         test al, imm8       A8 ib
74 01                   #+jump _unhidden if == 0    jz rel8             74 cb
C3                      # return                    ret                 C3
# _unhidden:            #
48 83 E0 1F             # rax &= 1F(Length)         and r/m64, imm8     REX.W 83 /4 ib  11 100 000
48 39 C8                # cmp rax, rcx              cmp r/m64, r64      REX.W 39 /r     11 001 000
74 01                   #+jump _lengthEq if ==      je rel8             74 cb
C3                      # return                    ret                 C3
# _lengthEq:            #                          
48 8B FD                # rdi = rbp                 mov r64, r/m64      REX.W 8B /r     11 111 101
F3 A6                   # strings equal ?           repe cmps m8, m8    F3 A6
C3                      # return                    ret                 C3

99 04 46 49 4E 44 ####### FIND ( rbp=addr rax=u -- rbp=addr rax=u rbx=xt ) xt==0 if not found ====
48 8B 1C 25 28 00 00 10 # rbx = [Latest]            mov r64, r/m64      REX.W 8B /r     00 011 100  00 100 101
# _beg:                 #
E8 03 00 00 00          #+call (FIND)               call rel32          E8 cd
75 F9                   #-jump _beg if !=           jne rel8            75 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (FIND)
50 57                   # push rax, rdi             push r64            50+rd
99 78                   # Call xt=  
5F 58                   # pop rdi, rax              pop r64             58+rd
74 04                   #+jump _end if ==           je rel8             74 cb
48 8B 5B 08             # rbx = [rbx+8]             mov r64, r/m64      REX.W 8B /r     01 011 011
# _end:                 #
C3                      # return                    ret                 C3

99 03 4E 75 6D ########## Num ( rbp=addr rax=u -- n ) ============================================
49 83 EF 08             # r15 -= 8                  sub r/m64, imm8     REX.W 83 /5 ib  11 101 111
49 83 27 00             # [r15] = 0                 and r/m64, imm8     REX.W 83 /4 ib  00 100 111
48 89 C1                # rcx = rax                 mov r/m64, r64      REX.W 89 /r     11 000 001  
48 8B F5                # rsi = rbp                 mov r64, r/m64      REX.W 8B /r     11 110 101
# _beg:                 #                        
E8 03 00 00 00          #+call (Num)                call rel32          E8 cd
E2 F9                   #-jump beg if --rcx         loop rel8           E2 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (Num)
AC                      # al = [rsi++]              lods m8             AC
3C 41                   # cmp al, 'A'               cmp al, imm8        3C ib
7C 02                   #+jump _digit if <          jl rel8             7C cb
# _letter:              #
2C 07                   # al -= 7                   sub al, imm8        2C ib
# _digit:               #
2C 30                   # al -= 30                  sub al, imm8        2C ib
49 C1 27 04             # [r15] <<= 4               sal r/m64, imm8     REX.W C1 /4 ib  00 100 111
49 09 07                # [r15] |= rax              or r/m64, r64       REX.W 09 /r     00 000 111
C3                      # return                    ret                 C3

99 04 6D 69 73 73 ####### miss ( rbp=addr rax=u rbx=xt -- |n rbx=xt ) n present iff u nonzero ====
48 85 DB                # rbx(xt) ?                 test r/m64, r64     REX.W 85 /r     11 011 011
74 01                   #+jump (miss) if == 0       jz rel8             74 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (miss)
48 85 C0                # rax(u) ?                  test r/m64, r64     REX.W 85 /r     11 000 000
75 01                   #+jump _nonempty if !=      jne rel8            75 cb
C3                      # return                    ret                 C3
# _nonempty:            #
99 4E                   # Call Num
F6 04 25 20 00 00 10 01 # [STATE] ?                 test r/m8, imm8     F6 /0 ib        00 000 100  00 100 101
75 01                   #+jump _lit if != 0         jnz rel8            75 cb
C3                      # return                    ret                 C3
# _lit:                 #
99 4C                   # Call LIT
C3                      # return                    ret                 C3

99 04 45 58 45 43 ####### EXEC ( rbx=xt -- ) =====================================================
B9 F8 FF FF 7F          # rcx = _                   mov r32, imm32      B8+rd id
57                      # push rdi                  push r64            50+rd
89 CF                   # rdi = rcx                 mov r/m32, r32      89 /r           11 001 111
99 43                   # Call COMPL
B0 C3                   # al = C3                   mov r8, imm8        B0+rb ib
AA                      # [rdi++] = al              stos m8             AA
5F                      # pop rdi                   pop r64             58+rd
FF D1                   # call rcx                  call r/m64          FF /2           11 010 001
C3                      # return                    ret                 C3

99 04 65 78 65 63 ####### exec ( al rbx=xt -- ) iff al != 1 ======================================
3C 01                   # cmp al, 1                 cmp al, imm8        3C ib
75 01                   #+jump (exec) if !=         jne rel8            75 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (exec)
99 45                   # Call EXEC
C3                      # return                    ret                 C3

99 05 63 6F 6D 70 6C #### compl ( al -- al ) iff al == 1 ==========================================
3C 01                   # cmp al, 1                 cmp al, imm8        3C ib
74 01                   #+jump (compl) if ==        je rel8             74 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (compl)
99 43                   # Call COMPL
B0 01                   # al = 1                    mov r8, imm8        B0+rb ib
C3                      # return                    ret                 C3

99 03 68 69 74 ########## hit ( rbx=xt -- ) ======================================================
48 85 DB                # rbx(xt) ?                 test r/m64, r64     REX.W 85 /r     11 011 011
75 01                   #+jump (hit) if != 0        jnz rel8            75 cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (hit)
40 8A 43 10             # al = [rbx+10]             mov r8, r/m8        REX 8A /r       01 000 011
24 80                   # al &= 80(IMMEDIATE)       and al, imm8        24 ib
0A 04 25 20 00 00 10    # al |= [STATE]             or r8, r/m8         0A /r           00 000 100  00 100 101
99 63                   # Call compl
99 65                   # Call exec
C3                      # return                    ret                 C3

99 04 53 56 41 4C ####### SVAL ( i*x -- j*x ) == 00 EVALUATE =====================================
E8 03 00 00 00          #+call (SVAL)            05 call rel32          E8 cd
7C F9                   #-jump SVAL if <         07 jl rel8             7C cb
C3                      # return                    ret                 C3
# # # # # # # # # # # # # (SVAL)                 08
99 70                   # Call pname
99 46                   # Call FIND
99 6D                   # Call miss
99 68                   # Call hit
48 8B 04 25 10 00 00 10 # rax = [>IN]               mov r64, r/m64      REX.W 8B /r     00 000 100  00 100 101
48 3B 04 25 00 00 00 10 # cmp rax, [#IN]            cmp r64, r/m64      REX.W 3B /r     00 000 100  00 100 101
C3                      # return                    ret                 C3

99 02 74 69 ############# ti ( -- ) text interpreter =============================================
49 C7 C7 00 00 00 10    # r15(stack) = 10000000     mov r/m64, imm32    REX.W C7 /0 id  11 000 111
49 89 77 08             # [TIB] = rsi               mov r/m64, r64      REX.W 89 /r     01 110 111
99 5B                   # Call [
# _beg:                 #
E8 02 00 00 00          #+call (ti)                 call rel32          E8 cd
EB F9                   #-jump _beg                 jmp rel8            EB cb
# # # # # # # # # # # # # (ti)
99 52                   # Call REFILL
99 53                   # Call SVAL
C3                      # return                    ret                 C3

# 99 E4 99 C2             # dbg BYE

99 F4                   # ti