Chapter 4: The ARM Stack and Memory Concepts
4.1 What is a Stack?
Imagine a stack of plates in a cafeteria. When you add a plate, you put it on top. When you take a plate, you take it from the top. This is exactly how a stack in computer programming works. A stack is a simple way to store and retrieve data in a computer's memory.
A stack is a data structure that follows the Last-In-First-Out (LIFO) principle. This means:
The last item you put in is the first item you can take out.
You can only access the top item at any time.
In computer terms:
Adding an item is called "pushing" onto the stack.
Removing an item is called "popping" from the stack.
The stack is crucial for several reasons:
Temporary Storage: It's a quick way to store and retrieve data.
Function Calls: It helps manage function calls and returns.
Local Variables: It's where local variables in functions are typically stored.
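The LIFO rule is easy to try out in a high-level language before looking at ARM64 instructions. The sketch below uses a Python list as a stand-in for the stack; the plate names are purely illustrative.

```python
# A Python list used as a stack: append() pushes, pop() removes the top item.
stack = []
stack.append("plate 1")   # push
stack.append("plate 2")   # push
stack.append("plate 3")   # push

top = stack.pop()          # pop returns the most recently pushed item
print(top)                 # plate 3
print(stack)               # ['plate 1', 'plate 2']
```

The last plate pushed is the first one popped, and the earlier plates stay in place underneath.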
4.2 What is Memory?
Memory: It refers to the physical devices used to store data temporarily or permanently in a computer. This includes RAM (Random Access Memory), which is used for temporary storage while your computer is running.
What is a Memory Address?
Memory Address: It's like a unique identifier for each location in your memory. Think of memory as a series of numbered boxes, where each box can store a piece of data. Each box has a number (address) that identifies it.
SP and Memory: The value stored in the SP register is a memory address that points to the top of the stack in the larger memory (RAM).
Why Does the Stack Grow Downwards?
In most systems, including ARM64, the stack grows downwards in memory, meaning it starts at a high memory address and grows towards lower addresses.
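Reason: This is largely a historical convention. Placing the stack at the high end of a memory region and the heap at the low end lets the two grow toward each other, so neither needs a fixed size decided in advance.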
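To make "growing downwards" concrete, here is a small sketch that tracks SP across three pushes. The starting address STACK_TOP is a made-up value for illustration (the operating system picks the real one), and the 16-byte slot size matches the examples used later in this chapter.

```python
# Hypothetical starting address; real values are chosen by the operating system.
STACK_TOP = 0x7FFFF000
SLOT = 16  # bytes reserved per push in this chapter's examples

sp = STACK_TOP
addresses = []
for _ in range(3):
    sp -= SLOT            # each push moves SP toward lower addresses
    addresses.append(sp)

print([hex(a) for a in addresses])
# ['0x7fffeff0', '0x7fffefe0', '0x7fffefd0']
```

Each push leaves SP at a lower address than before, and every value is still a multiple of 16.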
Multiple Processes and Memory
Processes: When you run an application, it becomes a process that uses some portion of the memory.
Isolation: Each process is isolated in its own memory space for security and stability. One process cannot directly access another process's memory. This isolation is managed by the operating system.
Stack Pointer (SP): A special CPU register that always holds the address of the current top of the stack.
Frame Pointer (FP) or x29: A register that points to the start of the current stack frame, making it easier to access function parameters and local variables.
4.3 Pushing and Popping Single Registers
Pushing a Single Register
Let's imagine the stack as a stack of boxes. The Stack Pointer (SP) always points to the top box.
When we push a register onto the stack, we're putting its value into a new box on top.
str x0, [sp, #-16]!
This does three things:
It makes space for a new box (by subtracting 16 from SP)
It puts the value from x0 into this new box
It moves SP to point to this new box
Let's break this down:
'str' is the store instruction STR (Store Register)
'x0' is the register we're pushing onto the stack
'[sp, #-16]!' means:
Subtract 16 from SP
Use this new address to store the value
The '!' updates SP with this new address
Visual representation:
Before:
SP -> [ Other stuff ]
After:
SP -> [ x0's value ] (the new 16-byte box; its upper 8 bytes are unused)
[ Other stuff ]
Popping a Single Register
When we pop from the stack, we're taking the value from the top box and putting it into a register.
ldr x0, [sp], #16
This does three things:
It takes the value from the box SP is pointing to
It puts this value into x0
It moves SP back to the box below in the diagram, which sits at a higher address (by adding 16 to SP)
Breaking this down:
'ldr' is the load instruction LDR (Load Register)
'x0' is the register we're loading into
'[sp], #16' means:
Use the current SP address to load the value
After loading, add 16 to SP
Visual representation:
Before:
SP -> [ Some value ]
[ Other stuff ]
After:
[ Some value ] (x0 now has this value)
SP -> [ Other stuff ]
Why 16 bytes? On ARM64, SP must be 16-byte aligned whenever it is used to access memory, and a misaligned SP can raise an alignment fault. Even though a 64-bit register only needs 8 bytes, we adjust SP by 16 to maintain that alignment.
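The two addressing modes used above can be modelled in a few lines of Python. This is only a sketch: the dictionaries memory and regs, the helper names, and the starting SP of 0x1000 are all invented for illustration, and real hardware does this in a single instruction.

```python
memory = {}                       # address -> 64-bit value; a stand-in for RAM
regs = {"sp": 0x1000, "x0": 42}   # hypothetical starting values

def str_pre_index(reg, offset):
    """Model of: str reg, [sp, #offset]!"""
    regs["sp"] += offset              # the '!' writes the new address back to SP
    assert regs["sp"] % 16 == 0       # SP must stay 16-byte aligned
    memory[regs["sp"]] = regs[reg]    # the value is stored at the new SP

def ldr_post_index(reg, offset):
    """Model of: ldr reg, [sp], #offset"""
    regs[reg] = memory[regs["sp"]]    # load from the current SP first
    regs["sp"] += offset              # then adjust SP
    assert regs["sp"] % 16 == 0

str_pre_index("x0", -16)   # push x0
regs["x0"] = 0             # overwrite x0 so the pop is visible
ldr_post_index("x0", 16)   # pop x0
print(regs)                # {'sp': 4096, 'x0': 42}
```

After the push and pop, x0 holds its original value again and SP is back where it started (4096 is 0x1000).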
4.4 Pushing and Popping Multiple Registers
Sometimes, we want to push or pop two registers at once. ARM64 has special instructions for this: STP (Store Pair) for pushing, and LDP (Load Pair) for popping.
Pushing Two Registers
To push two registers at once, we use:
stp x0, x1, [sp, #-16]!
This does four things:
It makes space for a new box (by subtracting 16 from SP)
It puts the value from x0 into the first half of this box
It puts the value from x1 into the second half of this box
It moves SP to point to this new box
Visual representation:
Before:
SP -> [ Other stuff ]
After:
SP -> [ x0's value | x1's value ]
[ Other stuff ]
Popping Two Registers
To pop two registers at once, we use:
ldp x0, x1, [sp], #16
This does three things:
It takes the value from the first half of the box SP is pointing to and puts it in x0
It takes the value from the second half of the box and puts it in x1
It moves SP back to the box below in the diagram, which sits at a higher address (by adding 16 to SP)
Visual representation:
Before:
SP -> [ Value A | Value B ]
[ Other stuff ]
After:
[ Value A | Value B ] (x0 now has Value A, x1 has Value B)
SP -> [ Other stuff ]
Remember: Even though we're dealing with two registers, we still only move SP by 16 bytes, because two 8-byte registers fill one 16-byte box exactly. This keeps SP properly aligned, which is important for how ARM64 works.
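The "first half" and "second half" of the box correspond to the addresses SP and SP + 8. The sketch below models this with a small byte buffer. The buffer size, the state dictionary, and the helper names are invented for illustration, and little-endian byte order is assumed (the usual configuration on ARM64 Linux).

```python
import struct

RAM_SIZE = 64
ram = bytearray(RAM_SIZE)     # a tiny stand-in for memory
state = {"sp": RAM_SIZE}      # hypothetical: the stack starts at the top of the buffer

def stp_pre_index(first, second, offset):
    """Model of: stp Xa, Xb, [sp, #offset]!"""
    state["sp"] += offset
    # first goes to [sp], second goes to [sp + 8]
    struct.pack_into("<QQ", ram, state["sp"], first, second)

def ldp_post_index(offset):
    """Model of: ldp Xa, Xb, [sp], #offset (returns the pair Xa, Xb)"""
    pair = struct.unpack_from("<QQ", ram, state["sp"])
    state["sp"] += offset
    return pair

stp_pre_index(10, 20, -16)
first_half = struct.unpack_from("<Q", ram, state["sp"])[0]
second_half = struct.unpack_from("<Q", ram, state["sp"] + 8)[0]
x0, x1 = ldp_post_index(16)
print(first_half, second_half, x0, x1, state["sp"])   # 10 20 10 20 64
```

Both registers fit in one 16-byte box, so a single 16-byte adjustment of SP is enough.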
Let's look at an example.
File: stack_operations.s
.global _start
.section .text
_start:
// 1. Initial setup
mov x0, #10
mov x1, #20
mov x2, #30
// 2. Save initial SP
mov x3, sp
// 3. Push x0 and x1 onto the stack
stp x0, x1, [sp, #-16]!
// 4. Push x2 onto the stack
str x2, [sp, #-16]!
// 5. Modify registers
mov x0, #100
mov x1, #200
mov x2, #300
// 6. Pop x2 from the stack
ldr x2, [sp], #16
// 7. Pop x0 and x1 from the stack (same register order as the stp in step 3)
ldp x0, x1, [sp], #16
// 8. Exit
mov x8, #93
mov x0, #0
svc #0

Now, let's go through this example step-by-step:
Assemble and link the program:
as stack_operations.s -o stack_operations.o
ld stack_operations.o -o stack_operations
Start gdbserver:
gdbserver :1234 ./stack_operations
In a new terminal, start GDB:
gdb
(gdb) file stack_operations
(gdb) target remote localhost:1234
Examine the next 5 instructions:
(gdb) x/5i $pc
Step through the initial setup:
(gdb) stepi
(gdb) info registers x0
(gdb) stepi
(gdb) info registers x1
(gdb) stepi
(gdb) info registers x2
You should see x0 = 10, x1 = 20, x2 = 30
Save the initial SP value:
(gdb) stepi
(gdb) info registers x3
x3 now contains the initial SP value
Push x0 and x1 onto the stack:
(gdb) stepi
(gdb) info registers sp
(gdb) x/2xg $sp
SP should have decreased by 16, and you should see 10 and 20 on the stack (displayed in hexadecimal as 0xa and 0x14, padded with leading zeros)
Push x2 onto the stack:
(gdb) stepi
(gdb) info registers sp
(gdb) x/3xg $sp
SP should have decreased by another 16, and you should see 30 (0x1e) at the top of the stack, followed by the unused upper half of that box and then 10 (0xa)
Modify the registers:
(gdb) stepi
(gdb) stepi
(gdb) stepi
(gdb) info registers x0 x1 x2
You should see x0 = 100, x1 = 200, x2 = 300
Pop x2 from the stack:
(gdb) stepi
(gdb) info registers x2 sp
x2 should be back to 30, and SP should have increased by 16
Pop x1 and x0 from the stack:
(gdb) stepi
(gdb) info registers x0 x1 sp
x0 should be 10, x1 should be 20, and SP should be back to its original value
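One detail worth checking in this walkthrough is the register order in the final ldp. The first register named receives the value at SP and the second receives the value at SP + 8, so the order must match the stp that saved them. The sketch below replays the program with a Python list standing in for the stack, one entry per 16-byte box. The function name run_example and its pop_order parameter are invented for illustration.

```python
# Each list entry models one 16-byte stack box as a (low half, high half) pair.
def run_example(pop_order):
    x = {"x0": 10, "x1": 20, "x2": 30}
    stack = []
    stack.append((x["x0"], x["x1"]))     # stp x0, x1, [sp, #-16]!
    stack.append((x["x2"], None))        # str x2, [sp, #-16]!  (high half unused)
    x.update(x0=100, x1=200, x2=300)     # modify the registers
    x["x2"], _ = stack.pop()             # ldr x2, [sp], #16
    first, second = pop_order
    x[first], x[second] = stack.pop()    # ldp <first>, <second>, [sp], #16
    return x

print(run_example(("x0", "x1")))   # {'x0': 10, 'x1': 20, 'x2': 30}
print(run_example(("x1", "x0")))   # {'x0': 20, 'x1': 10, 'x2': 30}
```

Only the first call restores every register to its original value. Reversing the order in ldp swaps x0 and x1.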
4.5 What are Calling Conventions?
Imagine you're writing a letter to a friend. You both need to agree on a language to use, where to write the address, and how to sign off. This way, you both understand the letter.
In programming, calling conventions are similar. They're like rules that programmers agree on for how functions should work together.
Why Do We Need Calling Conventions?
Teamwork: Different people can write different parts of a program, and they'll still work together.
Using Libraries: We can use pre-written code (libraries) easily because they follow the same rules.
Consistency: It makes programs more organized and easier to understand.
Basic Ideas in ARM Calling Conventions
Passing Information to Functions:
Think of registers like small boxes where we can put information.
When we call a function, we put the information it needs (called arguments) in these boxes.
In ARM64, the first eight arguments go in the boxes named X0 through X7, in order.
Getting Results from Functions:
After a function finishes its job, it needs to give us back a result.
In ARM, functions usually put their answer in the X0 box.
Remembering Where to Go Back:
When we call a function, the program needs to remember where to go back to.
ARM uses a special box called LR (Link Register), which is register X30, to remember this.
A Simple Example
Let's look at a very basic example:
main:
mov x0, #5 // Put the number 5 in box X0
bl add_one // Call the add_one function
// When we come back, X0 will have the result
// (a complete program would exit here, otherwise execution falls through into add_one)
add_one:
add x0, x0, #1 // Add 1 to whatever is in X0
ret // Go back to where we came from
The calling convention elements here are:
Argument Passing:
We put the argument (5) in register X0 before calling the function.
This is part of the calling convention: the first argument goes in X0.
Function Call:
We use 'bl add_one' to call the function.
'bl' (Branch with Link) is part of the calling convention. It not only jumps to the function but also stores the return address in the LR (Link Register).
Return Value:
The function puts its result back in X0.
This is part of the calling convention: functions return their result in X0.
Return from Function:
We use 'ret' to return from the function.
This is part of the calling convention. 'ret' knows to use the address stored in LR to return to the caller.
Preservation of Registers:
In this simple example, we don't see it, but the calling convention also specifies which registers a function must preserve (not change) and which it can freely use.
These rules form the basic ARM calling convention. They ensure that the caller (main) and the callee (add_one) agree on:
Where to put function arguments (X0)
Where to find the return value (X0)
How to call and return from functions (bl and ret)
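The way bl and ret cooperate through LR can be modelled with a tiny interpreter. This is a sketch only: the addresses, the tuple encoding of instructions, and the "halt" pseudo-instruction are invented for illustration. The one real hardware fact it relies on is that every ARM64 instruction is 4 bytes long, so the return address is the address of the bl plus 4.

```python
# Hypothetical program: main at 0x1000, add_one at 0x1010.
PROGRAM = {
    0x1000: ("mov", "x0", 5),    # mov x0, #5
    0x1004: ("bl", 0x1010),      # bl add_one
    0x1008: ("halt",),           # stand-in for the rest of main
    0x1010: ("add", "x0", 1),    # add x0, x0, #1
    0x1014: ("ret",),            # ret
}

def run_program(program, start):
    regs = {"x0": 0, "lr": 0, "pc": start}
    trace = []                               # addresses in execution order
    while True:
        pc = regs["pc"]
        op = program[pc]
        trace.append(pc)
        if op[0] == "mov":
            regs[op[1]] = op[2]
            regs["pc"] = pc + 4
        elif op[0] == "add":
            regs[op[1]] += op[2]
            regs["pc"] = pc + 4
        elif op[0] == "bl":
            regs["lr"] = pc + 4              # remember where to come back to
            regs["pc"] = op[1]               # jump to the function
        elif op[0] == "ret":
            regs["pc"] = regs["lr"]          # jump back to the saved address
        else:                                # "halt"
            return regs, trace

regs, trace = run_program(PROGRAM, 0x1000)
print(regs["x0"], hex(regs["lr"]))           # 6 0x1008
print([hex(a) for a in trace])
# ['0x1000', '0x1004', '0x1010', '0x1014', '0x1008']
```

The trace shows execution leaving main at the bl, running add_one, and resuming at the instruction right after the bl.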
Example: calling_convention_example.s
Create the assembly file:
nano calling_convention_example.s
Enter the following code:
.global _start
.section .text
_start:
// Prepare arguments
mov x0, #5
mov x1, #3
// Call add_numbers function
bl add_numbers
// Exit (the result in x0 is overwritten by the exit status below)
mov x8, #93
mov x0, #0
svc #0
add_numbers:
// Add the two numbers
add x0, x0, x1
// Return (result is already in x0)
ret
Save and exit nano.
Assemble the program:
as calling_convention_example.s -o calling_convention_example.o
Link the program:
ld calling_convention_example.o -o calling_convention_example
Start gdbserver:
gdbserver :1234 ./calling_convention_example
In a new terminal, start GDB:
gdb
In GDB, connect to the remote target:
(gdb) file calling_convention_example
(gdb) target remote localhost:1234

Now, let's step through the program:
At the start of _start:
(gdb) x/5i $pc
This should show the first 5 instructions of our program.
Step through each instruction:
(gdb) stepi
(gdb) info registers x0
(gdb) stepi
(gdb) info registers x1
(gdb) stepi
(gdb) info registers x30
(gdb) stepi
(gdb) info registers x0
(gdb) stepi
(gdb) info registers pc
After each stepi, examine the relevant registers to see how they change.
To exit GDB:
(gdb) quit
Let's go through this step-by-step, explaining each part in detail:
Initial state (at _start):
(gdb) x/5i $pc
=> 0x400078 <_start>: mov x0, #0x5
0x40007c <_start+4>: mov x1, #0x3
0x400080 <_start+8>: bl 0x400090 <add_numbers>
0x400084 <_start+12>: mov x8, #0x5d
0x400088 <_start+16>: mov x0, #0x0
After first instruction (mov x0, #5):
(gdb) stepi
(gdb) info register x0
x0 0x5 5
x0 now contains 5
After second instruction (mov x1, #3):
(gdb) stepi
(gdb) info register x0 x1
x0 0x5 5
x1 0x3 3
x1 now contains 3
Before bl instruction:
(gdb) info register x0 x1 x30
x0 0x5 5
x1 0x3 3
x30 0x0 0
Note that x30 (Link Register) is 0
After bl instruction (in add_numbers):
(gdb) stepi
(gdb) info register x0 x1 x30
x0 0x5 5
x1 0x3 3
x30 0x400084 4194436
We've jumped to add_numbers
x30 now contains 0x400084, which is the address to return to
Inside add_numbers:
(gdb) x/5i $pc
=> 0x400090 <add_numbers>: add x0, x0, x1
0x400094 <add_numbers+4>: ret
After add instruction:
(gdb) stepi
(gdb) info register x0 x1 x30
x0 0x8 8
x1 0x3 3
x30 0x400084 4194436
x0 now contains 8 (5 + 3)
The ret instruction:
This instruction uses the value in x30 to know where to return
It jumps to the address stored in x30 (0x400084)
After ret (back in _start):
(gdb) stepi
(gdb) info register x0 x1 x30
x0 0x8 8
x1 0x3 3
x30 0x400084 4194436
We're now back in _start, at address 0x400084
x0 still contains 8, the result of our addition
Current position in _start:
(gdb) x/5i $pc
=> 0x400084 <_start+12>: mov x8, #0x5d
0x400088 <_start+16>: mov x0, #0x0
0x40008c <_start+20>: svc #0x0
We're about to set up the exit syscall
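The value GDB showed in x30 can be checked by hand. The bl sits at 0x400080 in the listing above and every ARM64 instruction is 4 bytes long, so the return address is simply the next instruction. A quick sketch of that arithmetic (the constant names are illustrative):

```python
BL_ADDRESS = 0x400080      # address of 'bl add_numbers' in the GDB listing
INSTRUCTION_SIZE = 4       # every ARM64 instruction is 4 bytes

return_address = BL_ADDRESS + INSTRUCTION_SIZE
print(hex(return_address), return_address)   # 0x400084 4194436
```

Both forms match the two columns GDB printed for x30.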
Example 2:
Filename: simple_output.s
.global _start
.section .text
_start:
// Setup for write syscall
mov x0, #1 // File descriptor 1 is stdout
ldr x1, =message // Load address of the message
mov x2, #14 // Message length
mov x8, #64 // Syscall number for write
// Make the syscall
svc #0
// Exit
mov x8, #93 // Syscall number for exit
mov x0, #0 // Exit status
svc #0
.section .data
message:
.ascii "Hello, ARM64!\n"

Save this content in a file named simple_output.s
Assemble the program:
as simple_output.s -o simple_output.o
Link the program:
ld simple_output.o -o simple_output
Start gdbserver:
gdbserver :1234 ./simple_output
In a new terminal, start GDB:
gdb
In GDB, connect to the remote target:
(gdb) file simple_output
(gdb) target remote localhost:1234

Now we can step through the program:
Examine and step through each instruction:
(gdb) x/i $pc
(gdb) stepi
Repeat this for each instruction, using info registers to check register values after each step.
The program setup:
mov x0, #1 // File descriptor 1 (stdout)
ldr x1, =message // Address of the message
mov x2, #14 // Message length
mov x8, #64 // Syscall number for write
The crucial part in GDB:
(gdb) stepi
0x00000000004000c0 in _start ()
(gdb) info register x0 x1 x8 x30
x0 0x1 1
x1 0x4100d8 4260056
x8 0x40 64
x30 0x0 0
This shows the registers just before the system call. x1 contains the address of our message.
The system call:
(gdb) stepi
0x00000000004000c4 in _start ()
This step executed the svc #0 instruction, which made the system call.
After the system call:
(gdb) info register x0 x1 x8 x30
x0 0xe 14
x1 0x4100d8 4260056
x8 0x40 64
x30 0x0 0
x0 now contains 14, which is the number of bytes written.
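The same behaviour is visible from a high-level language, since Python's os.write is a thin wrapper around the write system call and returns the number of bytes written. In this sketch the message goes to a pipe rather than stdout, purely so that the result can be read back and checked.

```python
import os

message = b"Hello, ARM64!\n"

# Write to a pipe instead of stdout so the bytes can be read back and checked.
read_fd, write_fd = os.pipe()
written = os.write(write_fd, message)   # plays the role of x0=fd, x1=buffer, x2=length, svc #0
os.close(write_fd)
received = os.read(read_fd, 64)
os.close(read_fd)

print(len(message), written, received == message)   # 14 14 True
```

The return value 14 is the message length, including the trailing newline, just as x0 showed in GDB.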
The output:
"Hello, ARM64!" appears in the terminal running gdbserver, because the program runs there and the system call wrote to its stdout.