Chapter 4: The ARM Stack and Memory Concepts
4.1 What is a Stack?
Imagine a stack of plates in a cafeteria. When you add a plate, you put it on top. When you take a plate, you take it from the top. This is exactly how a stack in computer programming works. A stack is a simple way to store and retrieve data in a computer's memory.
A stack is a data structure that follows the Last-In-First-Out (LIFO) principle. This means:
The last item you put in is the first item you can take out.
You can only access the top item at any time.
In computer terms:
Adding an item is called "pushing" onto the stack.
Removing an item is called "popping" from the stack.
The stack is crucial for several reasons:
Temporary Storage: It's a quick way to store and retrieve data.
Function Calls: It helps manage function calls and returns.
Local Variables: It's where local variables in functions are typically stored.
4.2 What is Memory?
Memory: The physical devices a computer uses to store data, either temporarily or permanently. This includes RAM (Random Access Memory), which holds data temporarily while your computer is running.
What is a Memory Address?
Memory Address: A unique identifier for each location in memory. Think of memory as a series of numbered boxes, where each box can store a piece of data. Each box has a number (its address) that identifies it.
SP and Memory: The value stored in the SP register is a memory address that points to the top of the stack in the larger memory (RAM).
Why Does the Stack Grow Downwards?
In most systems, including ARM64, the stack grows downwards in memory, meaning it starts at a high memory address and grows towards lower addresses.
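Reason: This is largely a convention. Placing the stack at a high address and other data (such as the heap) at lower addresses lets the two regions grow towards each other, so neither needs a fixed size decided in advance. On ARM64, the procedure call standard specifies a downward-growing stack, so all code agrees on the direction.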
Multiple Processes and Memory
Processes: When you run an application, it becomes a process that uses some portion of the memory.
Isolation: Each process is isolated in its own memory space for security and stability. One process cannot directly access another process's memory. This isolation is managed by the operating system.
Stack Pointer (SP): A special CPU register that always holds the address of the current top of the stack.
Frame Pointer (FP) or x29: A register that points to the start of the current stack frame, making it easier to access function parameters and local variables.
4.3 Pushing and Popping Single Registers
Pushing a Single Register
Let's imagine the stack as a stack of boxes. The Stack Pointer (SP) always points to the top box.
When we push a register onto the stack, we're putting its value into a new box on top.
To push x0, we use the instruction 'str x0, [sp, #-16]!'. This does three things:
It makes space for a new box (by subtracting 16 from SP)
It puts the value from x0 into this new box
It moves SP to point to this new box
Let's break this down:
'str' is the store instruction STR (Store Register)
'x0' is the register we're pushing onto the stack
'[sp, #-16]!' means:
Subtract 16 from SP
Use this new address to store the value
The '!' updates SP with this new address
Visual representation:
Before:
After:
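The push described above fits in a single instruction. The sketch below assumes GNU assembler syntax and uses an illustrative value of 42 in x0 (the value is an assumption, not taken from the text); A stands for whatever address SP holds beforehand.

```asm
    mov x0, #42              // example value (assumed for illustration)

    // Before: SP = A
    str x0, [sp, #-16]!      // pre-index: SP = SP - 16, then store x0 at [SP]
    // After:  SP = A - 16, and the 8 bytes at [SP] hold 42
```

The '!' is what makes the new address stick: without it, the store would use SP - 16 as the address but leave SP unchanged.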
Popping a Single Register
When we pop from the stack, we're taking the value from the top box and putting it into a register.
To pop into x0, we use the instruction 'ldr x0, [sp], #16'. This does three things:
It takes the value from the box SP is pointing to
It puts this value into x0
It moves SP down to the next box (by adding 16 to SP)
Breaking this down:
'ldr' is the load instruction LDR (Load Register)
'x0' is the register we're loading into
'[sp], #16' means:
Use the current SP address to load the value
After loading, add 16 to SP
Visual representation:
Before:
After:
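As a minimal sketch of the pop, assuming GNU assembler syntax and that a value was pushed earlier with the matching 16-byte adjustment (A is the address SP held before that push):

```asm
    // Before: SP = A - 16, and [SP] holds the value pushed earlier
    ldr x0, [sp], #16        // post-index: load x0 from [SP], then SP = SP + 16
    // After:  SP = A, and x0 holds the popped value
```

The popped value is not erased from memory; it simply sits below SP, where the next push will overwrite it.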
Why 16 bytes? ARM64 requires SP to be 16-byte aligned whenever it is used to access memory. Even though a 64-bit register only needs 8 bytes, we adjust SP by 16 to maintain alignment; the other 8 bytes of the slot are simply left unused.
4.4 Pushing and Popping Multiple Registers
Sometimes, we want to push or pop two registers at once. ARM64 has special instructions for this: STP (Store Pair) for pushing, and LDP (Load Pair) for popping.
Pushing Two Registers
To push two registers at once, we use 'stp x0, x1, [sp, #-16]!'.
This does four things:
It makes space for a new box (by subtracting 16 from SP)
It puts the value from x0 into the first half of this box
It puts the value from x1 into the second half of this box
It moves SP to point to this new box
Visual representation:
Before:
After:
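A minimal sketch of the two-register push, assuming GNU assembler syntax; A stands for the address SP holds beforehand.

```asm
    // Before: SP = A
    stp x0, x1, [sp, #-16]!  // SP = SP - 16; x0 is stored at [SP], x1 at [SP + 8]
    // After:  SP = A - 16; the 16-byte slot holds x0 (low half) and x1 (high half)
```

The first register named goes to the lower address, so "first half" in the description above means the 8 bytes SP points to.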
Popping Two Registers
To pop two registers at once, we use 'ldp x0, x1, [sp], #16'.
This does three things:
It takes the value from the first half of the box SP is pointing to and puts it in x0
It takes the value from the second half of the box and puts it in x1
It moves SP down to the next box (by adding 16 to SP)
Visual representation:
Before:
After:
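The matching pop can be sketched as follows, again assuming GNU assembler syntax and that the pair was pushed with the 16-byte pre-index form:

```asm
    // Before: SP = A - 16; [SP] holds the saved x0, [SP + 8] the saved x1
    ldp x0, x1, [sp], #16    // load x0 from [SP] and x1 from [SP + 8], then SP = SP + 16
    // After:  SP = A; x0 and x1 hold their saved values
```

Listing the registers in the same order as in the push restores each value to the register it came from.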
Remember: Even though we're dealing with two registers, we still only move SP by 16 bytes, because two 8-byte registers fill one 16-byte slot exactly. This keeps SP properly aligned, which ARM64 requires.
Let's look at an example.
File: stack_operations.s
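The original listing is not reproduced here. The following is a reconstruction from the walkthrough below, not the author's exact file: the register values (10, 20, 30, then 100, 200, 300) and the order of operations come from the steps, while the exit sequence assumes a Linux AArch64 target (syscall number 93).

```asm
// stack_operations.s (reconstructed sketch)
.global _start

.text
_start:
    // Initial setup
    mov x0, #10
    mov x1, #20
    mov x2, #30

    // Save the initial SP value so it can be compared at the end
    mov x3, sp

    // Push x0 and x1 as a pair, then x2 on its own
    stp x0, x1, [sp, #-16]!
    str x2, [sp, #-16]!

    // Overwrite the registers so the pops visibly restore them
    mov x0, #100
    mov x1, #200
    mov x2, #300

    // Pop in reverse order of the pushes
    ldr x2, [sp], #16
    ldp x0, x1, [sp], #16

    // Exit (assumes Linux AArch64: x8 = 93 is the exit syscall)
    mov x0, #0
    mov x8, #93
    svc #0
```

After the final pop, SP should equal the value saved in x3, which is a quick way to confirm that pushes and pops are balanced.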
Now, let's go through this example step-by-step:
Assemble and link the program:
Start gdbserver:
In a new terminal, start GDB:
Examine the next 5 instructions:
Step through the initial setup:
You should see x0 = 10, x1 = 20, x2 = 30
Save the initial SP value:
x3 now contains the initial SP value
Push x0 and x1 onto the stack:
SP should have decreased by 16, and you should see 10 and 20 on the stack
Push x2 onto the stack:
SP should have decreased by another 16, and you should see 30 at the top of the stack
Modify the registers:
You should see x0 = 100, x1 = 200, x2 = 300
Pop x2 from the stack:
x2 should be back to 30, and SP should have increased by 16
Pop x1 and x0 from the stack:
x0 should be 10, x1 should be 20, and SP should be back to its original value
4.5 What are Calling Conventions?
Imagine you're writing a letter to a friend. You both need to agree on a language to use, where to write the address, and how to sign off. This way, you both understand the letter.
In programming, calling conventions are similar. They're like rules that programmers agree on for how functions should work together.
Why Do We Need Calling Conventions?
Teamwork: Different people can write different parts of a program, and they'll still work together.
Using Libraries: We can use pre-written code (libraries) easily because they follow the same rules.
Consistency: It makes programs more organized and easier to understand.
Basic Ideas in ARM Calling Conventions
Passing Information to Functions:
Think of registers like small boxes where we can put information.
When we call a function, we put the information it needs (called arguments) in these boxes.
In ARM64, the first eight arguments go in the boxes named X0 through X7, in order.
Getting Results from Functions:
After a function finishes its job, it needs to give us back a result.
In ARM, functions usually put their answer in the X0 box.
Remembering Where to Go Back:
When we call a function, the program needs to remember where to go back to.
ARM uses a special box called LR (Link Register) to remember this.
A Simple Example
Let's look at a very basic example:
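The original listing is not reproduced here, so what follows is a minimal sketch consistent with the description below. The caller is labelled _start so the sketch can run without a C runtime (the text refers to it as main), and the exit sequence assumes Linux AArch64.

```asm
.global _start

.text
_start:
    mov x0, #5        // the argument goes in x0
    bl add_one        // call add_one; the return address is stored in LR (x30)
    // Back here, x0 holds the result (6)

    mov x8, #93       // assumed exit syscall; x0 (the result) becomes the exit status
    svc #0

add_one:
    add x0, x0, #1    // compute the result in x0
    ret               // return to the address in LR
```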
The calling convention elements here are:
Argument Passing:
We put the argument (5) in register X0 before calling the function.
This is part of the calling convention: the first argument goes in X0.
Function Call:
We use 'bl add_one' to call the function.
'bl' (Branch with Link) is part of the calling convention. It not only jumps to the function but also stores the return address in the LR (Link Register).
Return Value:
The function puts its result back in X0.
This is part of the calling convention: functions return their result in X0.
Return from Function:
We use 'ret' to return from the function.
This is part of the calling convention. 'ret' knows to use the address stored in LR to return to the caller.
Preservation of Registers:
In this simple example, we don't see it, but the calling convention also specifies which registers a function must preserve (not change) and which it can freely use.
These rules form the basic ARM calling convention. They ensure that the caller (main) and the callee (add_one) agree on:
Where to put function arguments (X0)
Where to find the return value (X0)
How to call and return from functions (bl and ret)
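Register preservation, which the simple example does not show, matters as soon as a function calls another function: 'bl' overwrites LR, so the caller's return address has to be saved first. The sketch below uses a hypothetical add_two function (not part of the original example) that saves x29 and LR on the stack with the pair instructions from section 4.4.

```asm
// Hypothetical helper: adds 2 to x0 by calling add_one twice
add_two:
    stp x29, x30, [sp, #-16]!   // save FP and LR; the bl instructions below overwrite LR
    mov x29, sp                 // point FP at this function's frame
    bl add_one
    bl add_one
    ldp x29, x30, [sp], #16     // restore FP and LR
    ret                         // returns to add_two's caller, using the restored LR

add_one:
    add x0, x0, #1
    ret
```

The same push and pop pattern applies to the callee-saved registers (x19 through x28 under the standard ARM64 procedure call rules) whenever a function wants to use them.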
Example: calling_convention_example.s
Create the assembly file:
Enter the following code:
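The original listing is not reproduced here. The sketch below is reconstructed from the step-by-step walkthrough later in this section (arguments 5 and 3, a call to add_numbers, then an exit syscall); the exact exit sequence is an assumption and presumes Linux AArch64.

```asm
// calling_convention_example.s (reconstructed sketch)
.global _start

.text
_start:
    mov x0, #5          // first argument
    mov x1, #3          // second argument
    bl add_numbers      // call; return address goes into x30 (LR)

    // Exit (assumed: Linux AArch64 exit syscall, status 0)
    mov x0, #0
    mov x8, #93
    svc #0

add_numbers:
    add x0, x0, x1      // result = x0 + x1, returned in x0
    ret                 // jump back to the address in x30
```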
Save and exit nano.
Assemble the program:
Link the program:
Start gdbserver:
In a new terminal, start GDB:
In GDB, connect to the remote target:
Now, let's step through the program:
At the start of _start:
This should show the first 5 instructions of our program.
Step through each instruction:
After each stepi, examine the relevant registers to see how they change.
To exit GDB:
Let's go through this step-by-step, explaining each part in detail:
Initial state (at _start):
After first instruction (mov x0, #5):
x0 now contains 5
After second instruction (mov x1, #3):
x1 now contains 3
Before bl instruction:
Note that x30 (Link Register) is 0
After bl instruction (in add_numbers):
We've jumped to add_numbers
x30 now contains 0x400084, which is the address to return to
Inside add_numbers:
After add instruction:
x0 now contains 8 (5 + 3)
The ret instruction:
This instruction uses the value in x30 to know where to return
It jumps to the address stored in x30 (0x400084)
After ret (back in _start):
We're now back in _start, at address 0x400084
x0 still contains 8, the result of our addition
Current position in _start:
We're about to set up the exit syscall
Example 2:
Filename: simple_output.s
Save this content in a file named simple_output.s
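The original listing is not reproduced here. The sketch below is one program that matches the walkthrough that follows (x1 holds the address of a message, and the write call returns 14). The message text is an assumption chosen to be 14 bytes long, and the syscall numbers assume Linux AArch64 (64 = write, 93 = exit).

```asm
// simple_output.s (reconstructed sketch)
.global _start

.data
message:
    .ascii "Hello, World!\n"   // assumed text: 13 characters plus newline = 14 bytes

.text
_start:
    mov x0, #1          // file descriptor 1 = stdout
    ldr x1, =message    // address of the message
    mov x2, #14         // number of bytes to write
    mov x8, #64         // write syscall
    svc #0              // on return, x0 holds the number of bytes written

    mov x0, #0          // exit status
    mov x8, #93         // exit syscall
    svc #0
```

Note that the kernel follows the same idea as the function calling convention: arguments go in x0, x1, x2, and the result comes back in x0.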
Assemble the program:
Link the program:
Start gdbserver:
In a new terminal, start GDB:
In GDB, connect to the remote target:
Now we can step through the program:
Examine and step through each instruction:
Repeat this for each instruction, using info registers to check register values after each step.
The program setup:
The crucial part in GDB:
This shows the registers just before the system call. x1 contains the address of our message.
The system call:
This step executed the svc #0 instruction, which made the system call.
After the system call:
x0 now contains 14, which is the number of bytes written.
The output: