-
Notifications
You must be signed in to change notification settings - Fork 197
Add docs/SPECS.md #2105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
DiegoCivi
wants to merge
32
commits into
main
Choose a base branch
from
docs/specs
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Add docs/SPECS.md #2105
Changes from 5 commits
Commits
Show all changes
32 commits
Select commit
Hold shift + click to select a range
85b5c28
Add SPECS.md with instructions info
DiegoCivi 4c76d36
Minor changes
DiegoCivi 922fd22
Add brief explanation of Opcode Extensions
DiegoCivi 82c6a70
Minor changes
DiegoCivi 04093af
Remove incorrect constraint
DiegoCivi b282b86
Fix titles and subtitles sizes
DiegoCivi 5ade55b
Add some more constraints we have
DiegoCivi 84de826
Start memory model section
DiegoCivi b108442
Some initial info about memory structures
DiegoCivi 77b6a4c
Some more info about memory structures
DiegoCivi 0a2b4ec
A little bit of memory info
DiegoCivi 7484df8
Relocation first steps
DiegoCivi e9e1e88
More on memory relocation
DiegoCivi dc01867
Creation of relocation table
DiegoCivi e84ed44
Explain MemoryCell
DiegoCivi 38b3179
Explain some memory methods
DiegoCivi f15ce1b
Explain Temporary Memory Relocation
DiegoCivi 0790382
Fix relocation explanation order
DiegoCivi 85987d1
Minor changes
DiegoCivi 8981742
MemorySegmentManager methods
DiegoCivi c5f57cd
Add info about real memory
DiegoCivi 93899fd
Info about trace
DiegoCivi 23f4a13
Info about CairoPie and PublicInput
DiegoCivi 748a303
Minor changes
DiegoCivi c466ab8
Add diagram with relocation flow
DiegoCivi e754878
Add diagrams for instructions
DiegoCivi 00de59f
Add links to functions
DiegoCivi ee4389c
Merge branch 'main' into docs/specs
DiegoCivi 3dfed29
Merge branch 'docs/specs' of github.com:lambdaclass/cairo-vm into doc…
DiegoCivi 2c35b4c
Merge branch 'main' into docs/specs
DiegoCivi 7d3298c
Fix typos
DiegoCivi 9f53fc3
Merge branch 'main' into docs/specs
DiegoCivi File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,251 @@ | ||
# Specifications | ||
|
||
## Instructions | ||
## Operations | ||
Each Cairo instruction represents one of the following operations: | ||
|
||
- `AddAP`: Increases the AP register. | ||
- `AssertEq`: Asserts that two values are equal. Also used to write memory cells. | ||
- `Call`: A relative or absolute call. | ||
- `Jnz`: A conditional jump. | ||
- `Jump`: An unconditional jump. | ||
- `Ret`: Returns from a call operation. | ||
|
||
However, the binary encoding is not centered around the operation to perform, but around specific aspects of the instruction execution (i.e. how to modify the AP register). | ||
|
||
## Encoding | ||
|
||
The instruction encoding is specified in the [Cario whitepaper](https://eprint.iacr.org/2021/1063.pdf), page 32. | ||
|
||
``` | ||
┌─────────────────────────────────────────────────────────────────────────┐ | ||
│ off dst (biased representation) │ | ||
├─────────────────────────────────────────────────────────────────────────┤ | ||
│ off op0 (biased representation) │ | ||
├─────────────────────────────────────────────────────────────────────────┤ | ||
│ off op1 (biased representation) │ | ||
├─────┬─────┬───────────┬───────┬───────────┬─────────┬──────────────┬────┤ | ||
│ dst │ op0 │ op1 src │ res │ pc update │ ap │ opcode │ 0 │ | ||
│ reg │ reg │ │ logic │ │ update │ │ │ | ||
├─────┼─────┼───┬───┬───┼───┬───┼───┬───┬───┼────┬────┼────┬────┬────┼────┤ | ||
│ 0 │ 1 │ 2 │ 3 │ 4 │ 5 │ 6 │ 7 │ 8 │ 9 │ 10 │ 11 │ 12 │ 13 │ 14 │ 15 │ | ||
└─────┴─────┴───┴───┴───┴───┴───┴───┴───┴───┴────┴────┴────┴────┴────┴────┘ | ||
``` | ||
The figure shows the structure of the 63-bit that form the first word of each instruction. | ||
- The bits are ordered in a little-endian-like encoding, this implies that `off_dst` is located at bits `[0;15]`, while `dst_reg` is located at bit `48`. | ||
- The last bit (`63`) is unused. | ||
|
||
Each sets of fields determine different aspects of the instruction execution: | ||
- `off_op0`, `off_op1`, `op0_reg`, `op1_src` determine the location of each operand. | ||
- `off_dst`, `dst_reg` determine the location of the destionation. | ||
- `res_logic` determine which operation to compute with the operands. | ||
- `pc_update` determines how the PC register is updated. | ||
- `ap_update` determines how the AP register is updated. | ||
- `opcode` determines both how the FP register is updated, and how some memory cell values are deduced. | ||
|
||
> [!NOTE] | ||
> In our VM: | ||
> - `off0` = `off_dst` | ||
> - `off1` = `off_op0` | ||
> - `off2` = `off_op1` | ||
|
||
## Opcode Extensions | ||
|
||
With the realease of Stwo, a new field was introduced named `opcode_extension`. Execution of instructions vary depending on its extension. The 4 types are `Stone`, `Blake`, `BlakeFinalize` and `QM31Operation`. | ||
|
||
> [!NOTE] | ||
> These are not specified on the whitepaper | ||
|
||
### Stone | ||
|
||
This type is used when `opcode_extension == 0`. Its does not add new behaviour since its the original extension | ||
|
||
### Blake & BlakeFinalize | ||
|
||
Blake is used when `opcode_extension == 1` and BlakeFinalize is used when `opcode_extension == 2` and they have some constraints, if these are not met while having one of the the blake extensions, the instructions becomes invalid: | ||
- `opcode == 0` (no operation) | ||
- `op1_src == 2` (FP) or `op1_src == 4` (AP) | ||
- `res_logic == 0` (op1) | ||
- `pc_update == 0` (Regular) | ||
- `ap_update == 0` (Regular) || `ap_update == 2` (Add1) | ||
|
||
After checking the constraints, if we are working with a `Blake` extension the blake2 algorithm is applied. The operands are contained as follows: | ||
- `op0_addr` points to a sequence of 8 -> represents a state | ||
- `op1_addr` points to a sequence of 16 felts -> represents a message | ||
- `dst_addr` holds the value of the counter | ||
- `ap` points to a sequence of 8 cells which each should either be uninitialised or already contain a value matching that of the output | ||
|
||
The output consists of 8 felts that represent the output of the Blake2s compression. | ||
|
||
On the other side, if we are working with the `BlakeFinalize` extension the operands are the same as with `Blake` with only one change: | ||
- `op0_addr` points to a sequence of 9 felts -> first 8 felts represent the state the last one represents the state | ||
|
||
The out here represents the Blake2s compression of the last block. | ||
|
||
### QM31 | ||
|
||
In this case, when `opcode_extension == 3` we are working with the `QM31Operation` which changes how the arithmetic (add, mul, sub, div) works on the VM by doing it with QM31 elements in reduced form. Again there are some constraints, if these are not met the instruction becomes invalid: | ||
- `res_logic == 1` (Add) || `res_logic == 2` (Mul) | ||
- `op1_src != 0` (Op0) | ||
- `pc_update == 0` (Regular) | ||
- `ap_update == 0` (Regular) || `ap_update == 2` (Add1) | ||
|
||
## Auxiliary Variables | ||
|
||
The instruction execution uses four auxiliary variables, that can be computed from the memory values, and the instruction fields: | ||
- `dst`: Destination. | ||
- `op0`: First operand. | ||
- `op1`: Second operand. | ||
- `res`: Operation result. | ||
|
||
Depending on the instruction, the values of `dst`, `op0` and `op1` may be unknown at the start of execution, and will be deduced during it. | ||
|
||
### Computing `dst` | ||
|
||
The value of `dst` is computed as the value at address `register + off_dst`, where `register` depends on the value of `dst_reg`: | ||
- `dst_reg == 0`: We use `AP`. | ||
- `dst_reg == 1`: We use `FP`. | ||
|
||
If the value at the specified address is undefined, then it must be deduced instead. | ||
|
||
### Computing `op0` | ||
|
||
The value of `op0` is computed as the value at address `register + off_op0`, where `register` depends on the value of `op0_reg`: | ||
- `op0_reg == 0`: We use `AP` | ||
- `op0_reg == 1`: We use `FP`. | ||
|
||
If the value at the specified address is undefined, then it must be deduced instead. | ||
|
||
### Computing `op1` | ||
|
||
The value of `op1` is computed as the value at address `base + off_op1`, where `base` depends on the value of `op1_src`: | ||
- `op1_src == 0`: We use `op0`. | ||
- `op1_src == 1`: We use `pc`. | ||
- `op1_src == 2`: We use `FP`. | ||
- `op1_src == 4`: We use `AP`. | ||
- Otherwise: The instruction is invalid. | ||
|
||
If the value at the specified address is undefined, then it must be deduced instead. | ||
|
||
> [!NOTE] | ||
> When `op1_src == 1` we must assert that `off_op1 == 1`, so that `op1` is an immediate value. This constraint is not specified in the whitepaper, but enforced by our VM. | ||
|
||
### Computing `res` | ||
|
||
The variable `res` computation depends on `res_logic`. | ||
- `res_logic == 0`: We set `res = op1`. | ||
- `res_logic == 1`: We set `res = op0 + op1`. | ||
- `res_logic == 2`: We set `res = op0 * op1`. | ||
- Otherwise: The instruction is invalid. | ||
|
||
> [!NOTE] | ||
> The value of `res` won’t always be used. For example, it won’t be used when `pc_update == 4`. | ||
|
||
## Additional Constraints | ||
|
||
1. When `opcode == 1` (Call), the following conditions must be met: | ||
- `off_dst == 0` | ||
- `dst_reg == AP` | ||
- `off_op0 == 1` | ||
- `op0_reg == AP` | ||
- `ap_update == 0` (add 2) | ||
|
||
2. When `opcode == 2` (Return), the following conditions must be met: | ||
- `off_dst == -2` | ||
- `dst_reg == FP` | ||
- `off_op1 == -1` | ||
- `op1_src == FP` | ||
- `res_logic == 0` (op1) | ||
- `pc_update == 1` (absolute jump) | ||
|
||
|
||
> [!NOTE] | ||
> These constraints are not specified in the whitepaper, but enforced by our VM. If | ||
> they are not met, then the instruction is **invalid**. | ||
|
||
## Deductions | ||
|
||
Some values may be undefined because the associated memory locations haven’t been set. In this case, they must be *deduced*, or *asserted*. | ||
|
||
An **assertion** verifies that two values are equal. This implies deducing one of the values if its undefined. When deducing a value, the corresponding memory cell must be updated with the deduced value. | ||
|
||
If the memory cell corresponds to a builtin segment, the associated builtin runner should be used to assert the value of that memory cell. | ||
|
||
Otherwise, the value will be deduced based on the `opcode`. There are 4 different types of operations: | ||
|
||
- `opcode == 0`: A no-op (no operation). | ||
- `opcode == 1`: A call operation. | ||
- Asserts that `op0 == pc + instruction_size`. | ||
- Asserts that `dst == fp`. | ||
- `opcode == 2`: A ret operation. | ||
- `opcode == 3`: An assert equal operation. | ||
- Asserts that `dst == res`. This may imply deducing `op0` value so that `res == dst`, by performing the inverse operation. | ||
|
||
The `instruction_size` is always `1`, unless `op1` is an immediate, in which case it will be `2`. | ||
|
||
|
||
> [!NOTE] | ||
> If a value is undefined and cannot be deduced, the instruction execution must fail. | ||
> This constraint is not specified in the whitepaper, but enforced by our VM. | ||
|
||
## Updating Registers | ||
|
||
At the end of the execution, the registers must be updated according to the instruction flags. | ||
|
||
### Updating the PC | ||
|
||
The updated PC will be denoted by `next_pc`. | ||
|
||
When updating the program counter, we depend primarily on the `pc_update` field: | ||
- `pc_update == 0`: Advance program counter. | ||
- Set `next_pc = pc + instruction_size`. | ||
- `pc_update == 1`: Absolute jump. | ||
- Set `next_pc = res`. | ||
- `pc_update == 2`: Relative jump. | ||
- Set `next_pc = pc + res`. | ||
- `pc_update == 4`: Conditional relative jump. | ||
- If `dst == 0`: Set `next_pc = pc + instruction_size`. | ||
- If `dst != 0`: Set `next_pc = pc + op1`. | ||
- Otherwise: The instruction is invalid. | ||
|
||
> [!NOTE] | ||
> In our VM: | ||
> `new_pc` = `next_pc` | ||
|
||
### Updating the AP | ||
|
||
The updated AP will be denoted by `next_ap`. | ||
|
||
When updating the allocation pointer, we depend primarily on the `ap_update` field, but also on the current operation: | ||
|
||
- If the `opcode` is *call*, then we must assert that `ap_update == 0`: | ||
- Set `next_ap = ap + 2` | ||
- Else, depending on `ap_update`. | ||
- `ap_update == 0`: Set `next_ap = ap` | ||
- `ap_update == 1`: Set `next_ap = ap + res` | ||
- `ap_update == 2`: Set `next_ap = ap + 1` | ||
- Otherwise: The instruction is invalid. | ||
|
||
> [!NOTE] | ||
> In our VM: | ||
> `new_apset` = `next_ap` | ||
|
||
### Updating the FP | ||
|
||
The updated FP will be denoted by `next_fp`. | ||
|
||
When updating the frame pointer, we depend on the `opcode`: | ||
|
||
- `opcode == 0`: A no-op (no operation). | ||
- Set `next_fp = fp`. | ||
- `opcode == 1`: A call operation. | ||
- Set `next_fp = ap + 2`. | ||
- `opcode == 2`: A return operation. | ||
- Set `next_fp = dst`. | ||
- `opcode == 3`: An assert equal operation. | ||
- Set `next_fp = fp`. | ||
- Otherwise: The instruction is invalid. | ||
|
||
> [!NOTE] | ||
> In our VM: | ||
> `new_fp_offset` = `next_fp` |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
Operations
a subsection ofInstructions
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Did the correction here
b282b86