3.10.Combining Control and Data in Machine-Level Programs

\(3.10.\)Combining Control and Data in Machine-Level Programs

1.Understanding Pointers

  Pointers serve as a uniform way to generate references to elements within different data structures.

  They are some key principles of pointers:

  1. Every pointer has an associated type. This type indicates what kind of object the pointer points to. The void * pointer represents a generic pointer.

  2. Every pointer has a value. This value is an address of some object of the designated type.

  3. Pointers are created with the & operator. This operator can be applied to any C expression that is categorized as an lvalue, meaning an expression that can appear on the left side of an assignment.

    • The operation is often realized by leaq.
  4. Pointers are dereferenced with the `*`` operator. It's implemented by a memory reference.

  5. Casting the type of pointer don't change its value but lead to other changes. One effect is to change any scaling of pointer arithmetic.

    • For example, if we have a char pointer with value p, then (int*)p + 7 computes p + 28, while (int*)(p + 7) computes p + 7.
  6. Pointers can also point to functions, and the value of a function pointer is the address of the first instruction in the machine-code representation of the function.

    • Take the following function as an example:
1
int fun(int x, int *p);

    and we declare a fucntions pointer fp:

1
2
int (*fp)(int, int *);
fp = fun;

    Then we can use fp to invoke the function:

1
2
int y = 1;
int result = fp(3, &y);

The parentheses around *fp are required, or it will be recognized as (int *) fp(int, int *), which is a function prototype.

2.Thwarting Buffer Overflow Attacks

  \(a.\)Buffer overflow & attacks

  Buffer overflow is a specific type of out-of-bound memory reference. A typical case of buffer overflow is that: some character array is allocated on the stack to hold a string, but the size of the string exceeds the space allocated for the array.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
char *gets(char *s)
{
int c;
char *dest = s;
while ((c = getchar()) != '\n' && c != EOF)
*dest++ = c;
if (c == EOF && dest == s)
/* No characters read */
return NULL;
*dest++ = '\0'; /* Terminate string */
return s;
}

/* Read input line and write it back */
void echo()
{
char buf[8]; /* Way too small! */
gets(buf);
puts(buf);
}

  The problem with gets is that it has no way to determine whether sufficient space has been allocated to hold the entire string. In our echo example, any string longer than seven characters will cause an out-of-bounds write. This will overwrite some of the information stored on the stack, and as the string gets longer, the following information will get corrupted.

  • Buffer overflow is used to get a program to perform a function that it would otherwise be unwilling to do.

    • Typically, the program is fed with a string that contains the byte encoding of some executable code, called the exploit code, plus some extra bytes that overwrite the return address with a pointer to the exploit code. The effect of executing the ret instruction is then to jump to the exploit code.

  \(b.\)Stack randomization

  In order to insert exploit code into a system, the attacker needs to inject both the code as well as a pointer to this code as part of the attack string. Generating this pointer requires knowing the stack address where the string will be located.

  • The idea of stack randomization is to make the position of the stack vary from one run of a program to another. This is implemented by allocating a random amount of space between 0 and n bytes on the stack at the start of a program.

    • for example, by using the allocation function alloca, which allocates space for a specified number of bytes on the stack. This allocated space is not used by the program, but it causes all subsequent stack locations to vary from one execution of a program to another.

  Stack randomization is one of a larger class of techniques known as address-space layout randomization, or ASLR. With ASLR, different parts of the program, including program code, library code, stack, global variables, and heap data, are loaded into different regions of memory each time a program is run.

  \(c.\)Stack corruption detection

  A second line of defense is to be able to detect when a stack has been corrupted. It' hard to prevent the behavious, however, the program can attempt to detect when such a write has occurred before it can have any harmful effects.

  • The GCC use a mechanisim known as a stack protector into the generated code to detect buffer overruns.

    • The idea is to store a special canary value(guard value) in the stack frame between any local buffer and the rest of the stack state. The value is generated randomly each time the program runs:

    • Before restoring the register state and returning from the function, the program checks if the canary has been altered by some operation of this function or one that it has called. If so, the program aborts with an error.

  Recent version of GCC inserts this type of overflow detection automatically. We have to give the command-line option -fno-stack-protector.

  The following code uses the overflow detection techniques:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
echo:
subq $24, %rsp # Allocate 24 bytes on stack
movq %fs:40, %rax # Retrieve canary
movq %rax, 8(%rsp) # Store on stack
xorl %eax, %eax # Zero out register
movq %rsp, %rdi # Compute buf as %rsp
call gets # Call gets
movq %rsp, %rdi # Compute buf as %rsp
call puts # Call puts
movq 8(%rsp), %rax # Retrieve canary
xorq %fs:40, %rax # Compare to stored value
je .L9 # If =, goto ok
call __stack_chk_fail # Stack corrupted!
.L9:
addq $24, %rsp # Deallocate stack space
ret

   The instruction argument %fs:40 is an indication that the canary value is read from memory using segmented addressing.

  • By storing the canary in a special segment, it can be marked as "read only", so that an attacker cannot overwrite the stored canary value.

  • Before restoring the register state and returning, the function compares the value stored at the stack location with the canary value (via the xorq instruction on line 11). A nonzero value indicates that the canary on the stack has been modified, and so the code will call an error routine.

  \(d.\)Ordering rearrangement

  \(e.g.\)Let's discuss the following C program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
int len(char *s) {
return strlen(s);
}

void iptoa(char *s, long *p) {
long val = *p;
sprintf(s, "%ld", val);
}

int intlen(long x) {
long v;
char buf[12];
v = x;
iptoa(buf, &v);
return len(buf);
}

  And the following show portions of the code for intlen, compiled both with and without stack protector:

  \(a.\)

1
2
3
4
5
6
7
8
# int intlen(long x)
# x in %rdi
intlen:
subq $40, %rsp
movq %rdi, 24(%rsp)
leaq 24(%rsp), %rsi
movq %rsp, %rdi
call iptoa

  \(b.\)

1
2
3
4
5
6
7
8
9
10
11
# int intlen(long x)
# x in %rdi
intlen:
subq $56, %rsp
movq %fs:40, %rax
movq %rax, 40(%rsp)
xorl %eax, %eax
movq %rdi, 8(%rsp)
leaq 8(%rsp), %rsi
leaq 16(%rsp), %rdi
call iptoa

  In the protected code, local variable v is positioned closer to the top of the stack than buf, and so an overrun of buf will not corrupt the value of v.

  \(e.\)Limiting executable code regions

  A final step is to eliminate the ability of an attacker to insert executable code into a system. One method is to limit which memory regions hold executable code. In typical programs, only the portion of memory holding the code generated by the compiler need be executable. The other portions can be restricted to allow just reading and writing.

  • Many systems allow control over three forms of access: read (reading data from memory), write (storing data into memory), and execute (treating the memory contents as machine-level code).

  • AMD introduced an NX (for "no-execute") bit into the memory protection for its 64-bit processors, separating the read and execute access modes. With this feature, the stack can be marked as being readable and writable, but not executable, and the checking of whether a page is executable is performed in hardware, with no penalty in efficiency.

3.Variable-Size Stack Frames

  Some functions require a variable amount of space that must be allocated for their stack frames, such as a local array declared with variable size:

1
2
3
4
5
6
7
8
long vframe(long n, long idx, long *q) {
long i;
long *p[n];
p[0] = &i;
for (i = 1; i < n; i++)
p[i] = q;
return *p[idx];
}

  To manage a variable-size stack frame, x86-64 code uses %rbp to serve as a frame pointer(sometimes referred to as a base pointer). When using a frame pointer, the stack frame is organized as below:

1
2
3
4
5
6
7
8
9
10
11
# long vframe(long n, long idx, long *q)
# n in %rdi, idx in %rsi, q in %rdx
# Only portions of code shown

vframe:
pushq %rbp # Save old %rbp
movq %rsp, %rbp # Set frame pointer
...

leave # Restore %rbp and %rsp
ret # Return
  • The code must save the previous version of %rbp on the stack, since it is a callee-saved register. It then keeps %rbp pointing to this position throughout the execution of the function, and it references fixed-length local variables, such as i, at offsets relative to %rbp.

    • So the code start by pushing the current value of %rbp onto the stack and setting %rbp to point to this stack position
  • At the end of the function, the frame pointer is restored to its previous value using the leave instruction. This instruction takes no arguments. It is equivalent to executing the following two instructions:

1
2
3
movq %rbp, %rsp     # Set stack pointer to beginning of frame
popq %rbp # Restore saved %rbp and set stack ptr
# to end of caller��s frame