<!-- END doctoc generated TOC please keep comment here to allow auto update -->
## Overview
This is a description of the Cannon Fault Proof Virtual Machine (FPVM). The Cannon FPVM emulates
a minimal Linux-based system running on a big-endian 32-bit MIPS32 architecture.
Many of its behaviors are copied from Linux/MIPS, with a few tweaks made for fault proofs.
For the rest of this doc, we refer to the Cannon FPVM as simply the FPVM.
Operationally, the FPVM is a state transition function. A single state transition is referred to as a *Step*
and executes exactly one instruction. We model the VM as a function $f$ that, given an input state $S_{pre}$,
executes the single instruction encoded in that state to produce a new state $S_{post}$.
$$f(S_{pre}) \rightarrow S_{post}$$
Thus, the trace of a program executed by the FPVM is an ordered sequence of VM states.
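
The relationship between $f$ and a trace can be illustrated with a small sketch. The names `trace` and `toy_step` are hypothetical and not part of the FPVM; `toy_step` is a stand-in for $f$ whose "state" is only a program counter.

```python
from typing import Callable, List, TypeVar

S = TypeVar("S")


def trace(step: Callable[[S], S], s_pre: S, n_steps: int) -> List[S]:
    """Return [S_0, S_1, ..., S_n], where S_{i+1} = step(S_i)."""
    states = [s_pre]
    for _ in range(n_steps):
        # Each post-state becomes the pre-state of the next step.
        states.append(step(states[-1]))
    return states


def toy_step(pc: int) -> int:
    """Hypothetical stand-in for f: advance a bare program counter by 4."""
    return pc + 4
```

Because each state depends only on its predecessor, two parties who agree on $S_i$ and on $f$ must also agree on $S_{i+1}$.
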
## State
The virtual machine state highlights the effects of running a Fault Proof Program on the VM.
It consists of the following fields:
1. `memRoot` - A `bytes32` value representing the merkle root of VM memory.
2. `preimageKey` - `bytes32` value of the last requested pre-image key.
3. `preimageOffset` - The 32-bit value of the last requested pre-image offset.
4. `pc` - 32-bit program counter.
5. `nextPC` - 32-bit next program counter. Note that this value may not always be $pc+4$
   when executing a branch/jump delay slot.
6. `lo` - 32-bit MIPS LO special register.
7. `hi` - 32-bit MIPS HI special register.
8. `heap` - 32-bit base address of the most recent memory allocation via mmap.
9. `exitCode` - 8-bit exit code.
10. `exited` - 1-bit indicator that the VM has exited.
11. `registers` - General-purpose MIPS32 registers. Each register is a 32-bit value.
The state is represented by packing the above fields, in order, into a 226-byte buffer.
## Memory
Memory is represented as a binary Merkle tree.
The tree has a fixed depth of 27 levels, with leaf values of 32 bytes each.
Its $2^{27}$ leaves of 32 bytes each span the full 32-bit address space, where each leaf contains
the memory at that part of the tree.
The state `memRoot` represents the Merkle root of the tree, reflecting the effects of memory writes.
As a result of this memory representation, all memory operations are 4-byte aligned.
Memory access doesn't require any privileges. An instruction step can access any memory
location as the entire address space is unprotected.
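
As a rough illustration of the layout described above, the sketch below maps an aligned address to a leaf and a byte offset within that leaf. It assumes leaves cover the address space contiguously in address order, which this document does not state explicitly; `locate` is a hypothetical helper name.

```python
from typing import Tuple

LEAF_SIZE = 32    # bytes per leaf
TREE_DEPTH = 27   # levels between the root and the leaves
WORD_SIZE = 4     # memory operations are 4-byte aligned

# 2**27 leaves of 32 bytes each cover exactly the 32-bit address space.
assert (1 << TREE_DEPTH) * LEAF_SIZE == 1 << 32


def locate(addr: int) -> Tuple[int, int]:
    """Return (leaf index, byte offset inside the leaf) for an aligned address.

    Assumption: leaf i holds bytes [32*i, 32*i + 32) of the address space.
    """
    if not 0 <= addr < (1 << 32):
        raise ValueError("address outside the 32-bit address space")
    if addr % WORD_SIZE != 0:
        raise ValueError("memory operations must be 4-byte aligned")
    return addr >> 5, addr & (LEAF_SIZE - 1)
```

Under this assumption, a 4-byte aligned word never straddles two leaves, so a single memory operation touches exactly one leaf.
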
### Heap
The FPVM state contains a `heap` field that tracks the base address of the most recent memory allocation.
Heap pages are bump allocated at page boundaries, once per `mmap` syscall. The page size is 4096 bytes.
The FPVM has a fixed program break at `0x40000000`. However, the FPVM is permitted to extend the
heap beyond this limit via mmap syscalls.
For simplicity, there are no memory protections against "heap overruns" into other memory segments.
Such VM steps are still considered valid state transitions.
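
Bump allocation at page granularity can be sketched as follows. This is an illustration only: `page_align`, `bump_alloc`, and the `cursor` argument are hypothetical, and how the cursor relates to the `heap` state field is an assumption not settled by this document.

```python
from typing import Tuple

PAGE_SIZE = 4096
MASK32 = 0xFFFFFFFF


def page_align(length: int) -> int:
    """Round a requested length up to the next multiple of the page size."""
    return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def bump_alloc(cursor: int, length: int) -> Tuple[int, int]:
    """Return (base of the new allocation, advanced cursor).

    Assumption: allocations are handed out back to back, and nothing checks
    whether the new region overlaps another memory segment.
    """
    base = cursor
    new_cursor = (cursor + page_align(length)) & MASK32
    return base, new_cursor
```

For example, starting from an arbitrarily chosen page-aligned cursor of `0x10000`, a 100-byte request occupies one full page and moves the cursor to `0x11000`.
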
Specification of memory mappings is outside the scope of this document as it is irrelevant to
the VM state. FPVM implementers may refer to the Linux/MIPS kernel for inspiration.
## Delay Slots
The post-state of a step updates the `nextPC`, indicating the instruction following the `pc`.
However, when a branch instruction is being stepped, the `nextPC` post-state is
set to the branch target, and the `pc` post-state is set to the branch delay slot as usual.
A VM state transition is invalid whenever the current instruction is in a delay slot and is itself
a jump or branch instruction.
That is, whenever $nextPC \neq pc + 4$ while stepping on a jump/branch instruction.
Otherwise, there would be two consecutive delay slots. While this is considered "undefined"
behavior in typical MIPS implementations, the FPVM must raise an exception when stepping on such states.
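
The `pc`/`nextPC` update rules above can be sketched as a small function. The name `advance` and its parameters are hypothetical; the handling of a branch that is not taken (`target=None`, falling through to `nextPC + 4`) follows ordinary MIPS semantics and is an assumption, since this section does not spell it out.

```python
from typing import Optional, Tuple

MASK32 = 0xFFFFFFFF


def advance(pc: int, next_pc: int, is_branch_or_jump: bool,
            target: Optional[int] = None) -> Tuple[int, int]:
    """Return the post-state (pc, nextPC) for one step.

    `target` is the branch/jump destination, or None when the instruction
    does not redirect control flow (assumed to cover untaken branches).
    """
    in_delay_slot = next_pc != ((pc + 4) & MASK32)
    if is_branch_or_jump and in_delay_slot:
        # Two consecutive delay slots: the state transition is invalid.
        raise ValueError("jump/branch instruction in a delay slot")

    new_pc = next_pc  # the delay slot (or simply the next instruction)
    if is_branch_or_jump and target is not None:
        new_next_pc = target & MASK32
    else:
        new_next_pc = (next_pc + 4) & MASK32
    return new_pc, new_next_pc
```

Stepping a taken branch at `0x100` with target `0x200` yields `(0x104, 0x200)`: the delay slot at `0x104` runs next, and only afterwards does execution continue at `0x200`.
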
## Syscalls
Syscalls work similarly to [Linux/MIPS](https://www.linux-mips.org/wiki/Syscall), including the
syscall calling conventions and general syscall handling behavior.
However, the FPVM supports only a subset of Linux/MIPS syscalls, with slightly different behaviors.
The following table summarizes the supported syscalls and their behaviors.
| `$v0` | system call | `$a0` | `$a1` | `$a2` | Effect |
| -- | -- | -- | -- | -- | -- |
| 4090 | mmap | uint32 addr | uint32 len | | Allocates a page from the heap. See [heap](#heap) for details. |
| 4045 | brk | | | | Returns a fixed address for the program break at `0x40000000`. |
| 4120 | clone | | | | Returns 1. |
| 4246 | exit_group | uint8 exit_code | | | Sets the `exited` and `exitCode` state fields to `true` and `$a0` respectively. |
| 4003 | read | uint32 fd | char *buf | uint32 count | Behaves like Linux/MIPS, with support for unaligned reads. See [I/O](#io) for more details. |
| 4004 | write | uint32 fd | char *buf | uint32 count | Behaves like Linux/MIPS, with support for unaligned writes. See [I/O](#io) for more details. |
| 4055 | fcntl | uint32 fd | int32 cmd | | Behaves like Linux/MIPS. Only the `F_GETFL` (3) cmd is supported. Sets errno to `0x16` for all other commands. |
For all of the above syscalls, an error is indicated by setting the return register (`$v0`) to
`0xFFFFFFFF` (-1) and setting `errno` (`$a3`) accordingly.
For unsupported syscalls, the VM must do nothing except zero out the syscall return (`$v0`)
and errno (`$a3`) registers.
Note that the above syscalls have the same syscall numbers and ABIs as Linux/MIPS.
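
The register conventions above can be sketched as a partial dispatcher. This is a minimal illustration, not the FPVM implementation: `handle_syscall` and the dictionary-based `regs`/`state` representation are hypothetical, and `mmap`, `read`, `write`, and the successful `F_GETFL` path are deliberately left unimplemented because their results depend on details covered elsewhere.

```python
MASK32 = 0xFFFFFFFF
PROGRAM_BREAK = 0x40000000
EINVAL = 0x16
F_GETFL = 3

SYS_READ, SYS_WRITE, SYS_BRK, SYS_FCNTL = 4003, 4004, 4045, 4055
SYS_MMAP, SYS_CLONE, SYS_EXIT_GROUP = 4090, 4120, 4246


def handle_syscall(regs: dict, state: dict) -> None:
    """Apply one syscall to hypothetical `regs` and `state` dictionaries."""
    num, a0, a1 = regs["v0"], regs["a0"], regs["a1"]
    v0, errno = 0, 0  # default result, also used for unsupported syscalls

    if num == SYS_BRK:
        v0 = PROGRAM_BREAK
    elif num == SYS_CLONE:
        v0 = 1
    elif num == SYS_EXIT_GROUP:
        state["exited"] = True
        state["exitCode"] = a0 & 0xFF  # exit code is 8 bits wide
        return  # register contents after exit are not modelled here
    elif num == SYS_FCNTL and a1 != F_GETFL:
        v0, errno = MASK32, EINVAL  # -1 with errno set
    elif num in (SYS_MMAP, SYS_READ, SYS_WRITE, SYS_FCNTL):
        raise NotImplementedError("outside the scope of this sketch")
    # Any other syscall number is unsupported: v0 and a3 are zeroed.

    regs["v0"] = v0
    regs["a3"] = errno
```

Note that an unsupported syscall is not an error: it reports success with a zero return value, which lets programs that issue incidental syscalls continue to run.
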
## I/O
The VM does not support Linux open(2). However, the VM can read from and write to a predefined set of file descriptors.