4.1. Y86-64 Instruction Set Architecture
1. Programmer-Visible State
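The programmer-visible state is the part of the processor state that each instruction can read and modify. The "programmer" here is either someone writing assembly code or a compiler generating machine-level code.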
- The status code `Stat` indicates the overall state of program execution: either normal operation or that some sort of exception has occurred.
2. Y86-64 Instruction Set
- The integer operation instructions operate only on register data. These instructions set the three condition codes `ZF`, `SF`, and `OF` (zero, sign, and overflow).
- The `call` instruction pushes the return address on the stack and jumps to the destination address. The `ret` instruction returns from such a call.
- The `halt` instruction stops instruction execution, and the status code is set to `HLT`.
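As a rough illustration of how the three condition codes relate to the result of an integer operation, here is a minimal Python sketch for a 64-bit addition (the behavior of `addq`). The function name `addq_flags` and the tuple it returns are assumptions made for this example, not part of Y86-64 itself.

```python
MASK64 = (1 << 64) - 1  # Y86-64 registers hold 64-bit words


def addq_flags(a, b):
    """Return (result, ZF, SF, OF) for the 64-bit addition a + b.

    Hypothetical helper: a and b are treated as 64-bit patterns.
    """
    a &= MASK64
    b &= MASK64
    result = (a + b) & MASK64          # wrap around to 64 bits
    zf = result == 0                   # zero flag: result is all zeros
    sf = (result >> 63) == 1           # sign flag: top bit of result is set
    sign_a, sign_b = a >> 63, b >> 63
    # Signed overflow: both operands share a sign that the result lacks.
    of = (sign_a == sign_b) and ((result >> 63) != sign_a)
    return result, zf, sf, of


# 1 + (-1) gives zero without signed overflow.
print(addq_flags(1, MASK64))
# Largest positive value + 1 wraps to a negative value: signed overflow.
print(addq_flags((1 << 63) - 1, 1))
```

The sketch only covers addition; subtraction and the bitwise operations would set the same three flags using their own overflow rules.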
3. Instruction Encoding
- Every instruction has an initial byte identifying the instruction type. This byte is split into two 4-bit parts: the high-order, or code, part, and the low-order, or function, part.
Instructions that need operands have longer encodings:
- There can be an additional register specifier byte specifying either one or two registers. These register fields are called `rA` and `rB` in the figure above. Instructions that have no register operands do not have a register specifier byte. Those that require just one register operand have the other register specifier set to 0xF.
- Some instructions require an additional 8-byte constant word. This word can serve as the immediate data for `irmovq`, the displacement for `rmmovq` and `mrmovq` address specifiers, and the destination of branches and calls. Note that branch and call destinations are given as absolute addresses, rather than using the PC-relative addressing seen in x86-64.
- The program registers are stored within the CPU in a register file, a small random access memory where the register IDs serve as addresses.
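The byte layout described above can be sketched in a few lines of Python. The helper `split_byte` and the constant name `RNONE` are chosen for this illustration only; the sketch simply shows how one byte yields two 4-bit fields, whether those are the code and function parts of the initial byte or the `rA` and `rB` fields of the register specifier byte.

```python
RNONE = 0xF  # illustrative name for the "no register" field value 0xF


def split_byte(byte):
    """Split one byte into its (high-order, low-order) 4-bit parts."""
    return (byte >> 4) & 0xF, byte & 0xF


# Initial byte 0x40: code part 4, function part 0.
icode, ifun = split_byte(0x40)
print(icode, ifun)

# Register specifier byte 0x42: rA = 4, rB = 2.
rA, rB = split_byte(0x42)
print(rA, rB)

# An instruction with a single register operand leaves the other field as 0xF.
rA, rB = split_byte(0xF3)
print(rA == RNONE, rB)
```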
Let's take the instruction `rmmovq %rsp, 0x123456789abcd(%rdx)` as an example:
- `rmmovq` has initial byte `40`.
- The source register `%rsp` (register ID 4) is encoded in the `rA` field, and the base register `%rdx` (register ID 2) is encoded in the `rB` field, so the register specifier byte is `42`.
- The displacement is padded to the 8-byte constant word `00 01 23 45 67 89 ab cd`, which is written in byte-reversed (little-endian) order: `cd ab 89 67 45 23 01 00`.

So the final result is `4042cdab896745230100`.
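The worked example can be double-checked with a small Python sketch. The function `encode_rmmovq` and the `REG_IDS` table are hypothetical helpers written for this note (the table lists only the first eight registers); the sketch assumes the three-part layout described above: initial byte, register specifier byte, then the 8-byte constant in little-endian order.

```python
# Register IDs for the first eight Y86-64 program registers.
REG_IDS = {
    "%rax": 0x0, "%rcx": 0x1, "%rdx": 0x2, "%rbx": 0x3,
    "%rsp": 0x4, "%rbp": 0x5, "%rsi": 0x6, "%rdi": 0x7,
}


def encode_rmmovq(src, disp, base):
    """Encode `rmmovq src, disp(base)` as 10 bytes (illustrative helper)."""
    first = bytes([0x40])                                # initial byte of rmmovq
    regs = bytes([(REG_IDS[src] << 4) | REG_IDS[base]])  # rA in high half, rB in low half
    const = (disp & ((1 << 64) - 1)).to_bytes(8, "little")  # byte-reversed constant word
    return first + regs + const


encoded = encode_rmmovq("%rsp", 0x123456789abcd, "%rdx")
print(encoded.hex())  # 4042cdab896745230100
```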
One important property of any instruction set is that the byte encodings must have a unique interpretation. An arbitrary sequence of bytes either encodes a unique instruction sequence or is not a legal byte sequence.
- This property ensures that a processor can execute an object-code program without any ambiguity about the meaning of the code. However, if we do not know the starting position of a code sequence, we cannot reliably determine how to split the sequence into individual instructions.
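To make the point about starting positions concrete, here is a hedged Python sketch that splits a byte sequence into instructions using only the code part of each initial byte. The `LENGTHS` table follows the instruction lengths in the textbook's encoding figure, which these notes do not reproduce. The function name `split_instructions` is an assumption for this example, and the sketch does not validate function codes or register fields.

```python
# Encoded length in bytes, keyed by the code part (high 4 bits) of the initial byte.
LENGTHS = {
    0x0: 1,   # halt
    0x1: 1,   # nop
    0x2: 2,   # rrmovq / cmovXX
    0x3: 10,  # irmovq
    0x4: 10,  # rmmovq
    0x5: 10,  # mrmovq
    0x6: 2,   # OPq
    0x7: 9,   # jXX
    0x8: 9,   # call
    0x9: 1,   # ret
    0xA: 2,   # pushq
    0xB: 2,   # popq
}


def split_instructions(code, start=0):
    """Split `code` into per-instruction byte strings, beginning at `start`."""
    chunks = []
    pc = start
    while pc < len(code):
        icode = code[pc] >> 4
        if icode not in LENGTHS:
            raise ValueError(f"invalid instruction code {icode:#x} at offset {pc}")
        length = LENGTHS[icode]
        if pc + length > len(code):
            raise ValueError(f"truncated instruction at offset {pc}")
        chunks.append(code[pc:pc + length])
        pc += length
    return chunks


# irmovq $15, %rbx ; rrmovq %rbx, %rcx ; ret
program = bytes.fromhex("30f30f00000000000000" "2031" "90")
for chunk in split_instructions(program):
    print(chunk.hex())
```

Starting from offset 0 the split is unambiguous. Starting from offset 1 instead, the first byte seen is `f3`, which is not a valid initial byte, so the sketch raises an error. Other wrong offsets may silently produce a different, equally "valid" instruction sequence.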
4. Exceptions
The programmer-visible state includes a status code `Stat` describing the overall state of the executing program.
The possible values for this code are as follows:
The `ADR` code indicates that the processor attempted to read from or write to an invalid memory address, either while fetching an instruction or while reading or writing data.
- We limit the maximum address, and any access to an address beyond this limit will trigger an `ADR` exception.
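The address check can be sketched as follows. The status names and numeric values follow the textbook's status-code table (only `HLT` and `ADR` are named in these notes). The limit `MAX_ADDR`, the dictionary-based memory, and the function `read_word` are all assumptions made for illustration, since the actual limit depends on the implementation.

```python
from enum import IntEnum


class Stat(IntEnum):
    AOK = 1  # normal operation
    HLT = 2  # halt instruction encountered
    ADR = 3  # invalid address encountered
    INS = 4  # invalid instruction encountered


MAX_ADDR = 0x1000  # hypothetical address limit, chosen only for this sketch


def read_word(memory, addr):
    """Read an 8-byte little-endian word; return (status, value).

    `memory` maps byte addresses to byte values; missing bytes read as 0.
    """
    if addr < 0 or addr + 8 > MAX_ADDR:
        return Stat.ADR, None           # access beyond the limit
    data = bytes(memory.get(addr + i, 0) for i in range(8))
    return Stat.AOK, int.from_bytes(data, "little")


print(read_word({0: 0x0D}, 0))    # in range: status AOK and the value 13
print(read_word({}, MAX_ADDR))    # out of range: status ADR
```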
5. Y86-64 Programs
Take the following C program as an example:
1 | long sum(long *start, long count) |
The x86-64 and Y86-64 assembly versions are shown below:
1 | # x86-64 ver |
1 | # Y86-64 ver |
The Y86-64 code differs from the x86-64 code in the following ways:
- The Y86-64 code loads constants into registers (lines 2-3), since it cannot use immediate data in arithmetic instructions.
- The Y86-64 code requires two instructions (lines 8-9) to read a value from memory and add it to a register.
The following shows a complete program file written in Y86-64 assembly code:
1 | .pos 0 # Execution begins at address 0 |
Words beginning with '.' are assembler directives telling the assembler to adjust the address at which it is generating code or to insert some words of data.
- The directive `.pos 0` indicates that the assembler should begin generating code starting at address 0.
- The next instruction, `irmovq stack, %rsp`, initializes the stack pointer. The label `stack` is declared at the end of the program, to indicate address 0x200 using a `.pos` directive. Our stack will therefore start at this address and grow toward lower addresses.
- The label `array` denotes the start of the data array, which is aligned on an 8-byte boundary using the `.align` directive. The array stores 4 words.
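To illustrate what "grow toward lower addresses" means for a stack that starts at 0x200, here is a minimal Python sketch of push and pop. The `state` dictionary, the word-granular memory, and the helper names `pushq` and `popq` (borrowed from the instruction names) are simplifications assumed for this example.

```python
STACK_TOP = 0x200        # address given to the `stack` label in the program
MASK64 = (1 << 64) - 1


def pushq(state, value):
    """Decrement the stack pointer by 8, then store the word there."""
    state["rsp"] -= 8
    state["mem"][state["rsp"]] = value & MASK64


def popq(state):
    """Load the word at the stack pointer, then increment it by 8."""
    value = state["mem"][state["rsp"]]
    state["rsp"] += 8
    return value


state = {"rsp": STACK_TOP, "mem": {}}
pushq(state, 0x13)            # e.g. a return address pushed by call
print(hex(state["rsp"]))      # the first push moves %rsp down to 0x1f8
pushq(state, 42)
print(hex(state["rsp"]))      # the second push moves it down to 0x1f0
print(popq(state), hex(state["rsp"]))
```

Because every push moves the stack pointer down, the program must leave enough room below 0x200 so the stack does not overwrite code or data.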
Since our only tool for creating Y86-64 code is an assembler, the programmer must perform tasks we ordinarily delegate to the compiler, linker, and run-time system.
The `YIS` tool simulates the execution of a Y86-64 machine-code program. If we run our object code on `YIS`, we get the following output:
1 | Stopped in 34 steps at PC = 0x13. Status 'HLT', CC Z=1 S=0 O=0 |
The first line of the simulation output summarizes the execution and the resulting values of the PC and program status.
The original values of the registers are shown on the left, and the final values on the right.