1. Assembly
To begin the machine learning compilers project, we explored the basics of assembly language and compiler behavior to build a solid foundation.
1.1 Hello Assembly
We started with a simple C program:
#include <stdio.h>
void hello_assembly()
{
printf( "Hello Assembly Language!\n");
}
Our goal was to compile this program using both GCC and Clang, and to analyze the differences in the generated assembly code to understand how compiler implementations can vary at the machine level.
Task 1.1.1
To generate assembly code from the hello_assembly.c program using both compilers, we first identified the appropriate commands:
GCC compiler
gcc -S hello_assembly.c
Clang compiler
clang -S hello_assembly.c
Task 1.1.2
After compiling the C program, we analyzed the generated assembly code by:
Locating the “Hello Assembly Language!” string.
Identifying the instructions inserted by the compiler to conform to the procedure call standard (PCS).
Identifying the function call to the C standard library (libc) that prints the string.
Task 1.1.2.1 GCC
The GCC-generated file:
1 .arch armv8-a
2 .file "hello_assembly.c"
3 .text
4 .section .rodata
5 .align 3
6 .LC0:
7 .string "Hello Assembly Language!"
8 .text
9 .align 2
10 .global hello_assembly
11 .type hello_assembly, %function
12 hello_assembly:
13 .LFB0:
14 .cfi_startproc
15 stp x29, x30, [sp, -16]!
16 .cfi_def_cfa_offset 16
17 .cfi_offset 29, -16
18 .cfi_offset 30, -8
19 mov x29, sp
20 adrp x0, .LC0
21 add x0, x0, :lo12:.LC0
22 bl puts
23 nop
24 ldp x29, x30, [sp], 16
25 .cfi_restore 30
26 .cfi_restore 29
27 .cfi_def_cfa_offset 0
28 ret
29 .cfi_endproc
30 .LFE0:
31 .size hello_assembly, .-hello_assembly
32 .ident "GCC: (GNU) 14.2.1 20250110 (Red Hat 14.2.1-7)"
33 .section .note.GNU-stack,"",@progbits
The string “Hello Assembly Language!” appears at line 7.
The instructions for the procedure call standard are:
stp x29, x30, [sp, -16]!
mov x29, sp
...
ldp x29, x30, [sp], 16
ret
The function call used to print the string is:
bl puts
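The effect of the prologue and epilogue can be modeled in a few lines of C++. The following is a minimal sketch, not the compiler's implementation: the `Regs` struct, the function names, and the use of a byte array as the stack are assumptions made for illustration. It shows that `stp ... [sp, -16]!` first decrements `sp` and then stores the frame pointer (x29) and link register (x30), while `ldp ... [sp], 16` loads both back and then increments `sp`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of the registers touched by the prologue and epilogue.
struct Regs {
    std::uint64_t x29;  // frame pointer (fp)
    std::uint64_t x30;  // link register (lr)
    std::uint64_t sp;   // stack pointer, modeled as a byte offset into `stack`
};

// stp x29, x30, [sp, -16]!   (pre-index: update sp first, then store the pair)
void stp_pre(Regs& r, std::uint8_t* stack) {
    assert(r.sp >= 16);
    r.sp -= 16;
    std::memcpy(stack + r.sp, &r.x29, 8);
    std::memcpy(stack + r.sp + 8, &r.x30, 8);
}

// ldp x29, x30, [sp], 16     (post-index: load the pair, then update sp)
void ldp_post(Regs& r, const std::uint8_t* stack) {
    std::memcpy(&r.x29, stack + r.sp, 8);
    std::memcpy(&r.x30, stack + r.sp + 8, 8);
    r.sp += 16;
}
```

Because `bl` overwrites x30 with its own return address, the saved copy on the stack is what lets the final `ret` return to the original caller.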
Task 1.1.2.2 Clang
The Clang-generated file:
1 .text
2 .file "hello_assembly.c"
3 .globl hello_assembly // -- Begin function hello_assembly
4 .p2align 2
5 .type hello_assembly,@function
6 hello_assembly: // @hello_assembly
7 .cfi_startproc
8 // %bb.0:
9 stp x29, x30, [sp, #-16]! // 16-byte Folded Spill
10 .cfi_def_cfa_offset 16
11 mov x29, sp
12 .cfi_def_cfa w29, 16
13 .cfi_offset w30, -8
14 .cfi_offset w29, -16
15 adrp x0, .L.str
16 add x0, x0, :lo12:.L.str
17 bl printf
18 .cfi_def_cfa wsp, 16
19 ldp x29, x30, [sp], #16 // 16-byte Folded Reload
20 .cfi_def_cfa_offset 0
21 .cfi_restore w30
22 .cfi_restore w29
23 ret
24 .Lfunc_end0:
25 .size hello_assembly, .Lfunc_end0-hello_assembly
26 .cfi_endproc
27 // -- End function
28 .type .L.str,@object // @.str
29 .section .rodata.str1.1,"aMS",@progbits,1
30 .L.str:
31 .asciz "Hello Assembly Language!\n"
32 .size .L.str, 26
33
34 .ident "clang version 19.1.7 (Fedora 19.1.7-3.fc41)"
35 .section ".note.GNU-stack","",@progbits
36 .addrsig
37 .addrsig_sym printf
The string “Hello Assembly Language!” appears at line 31.
The instructions for the procedure call standard are:
stp x29, x30, [sp, #-16]!
mov x29, sp
...
ldp x29, x30, [sp], #16
ret
The function call used to print the string is:
bl printf
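This analysis illustrates that while both compilers conform to the same procedure call standard, they differ in the details. GCC places the string in .rodata before the function, drops the trailing newline, and calls puts. Clang places the string in .rodata.str1.1 after the function, keeps the newline, and calls printf.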
Task 1.1.3
After analyzing the generated assembly code, we implemented a simple C++ driver that calls the hello_assembly function:
#include <iostream>

extern "C" void hello_assembly();

int main()
{
    std::cout << "Calling hello_assembly ...\n";

    // Call to hello_assembly
    hello_assembly();

    std::cout << "... returned from function call!\n";
    return 0;
}
To compile the driver along with the GCC-generated assembly code and execute the program, we used the following commands:
g++ driver_hello_assembly.cpp gcc_hello_assembly.s -o hello_assembly
./hello_assembly
The output of the program:
Calling hello_assembly ...
Hello Assembly Language!
... returned from function call!
1.2 Assembly Function
Next, we worked with the add_values.s file:
.text
.type add_values, %function
.global add_values
add_values:
stp fp, lr, [sp, #-16]!
mov fp, sp
ldr w3, [x0]
ldr w4, [x1]
add w5, w3, w4
str w5, [x2]
ldp fp, lr, [sp], #16
ret
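For orientation, the function can be read as the following C++ counterpart. This is a sketch for illustration only: the name `add_values_ref` is hypothetical and not part of the project. Under the AArch64 procedure call standard the three pointer arguments arrive in x0, x1, and x2, which is why the assembly loads from `[x0]` and `[x1]` and stores to `[x2]`. The addition is done on unsigned values because `add w5, w3, w4` wraps modulo 2^32, whereas signed overflow is undefined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical C++ counterpart of add_values: *c = *a + *b with 32-bit wraparound.
void add_values_ref(const std::int32_t* a, const std::int32_t* b, std::int32_t* c) {
    assert(a != nullptr && b != nullptr && c != nullptr);
    const std::uint32_t lhs = static_cast<std::uint32_t>(*a);  // ldr w3, [x0]
    const std::uint32_t rhs = static_cast<std::uint32_t>(*b);  // ldr w4, [x1]
    const std::uint32_t sum = lhs + rhs;                       // add w5, w3, w4
    *c = static_cast<std::int32_t>(sum);                       // str w5, [x2]
}
```

Calling it with pointers to the values 4 and 7 stores 11 through the third pointer.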
Task 1.2.1
As a first step, we assembled the file into an object file using:
as add_values.s -o add_values.o
Task 1.2.2
From add_values.o, we generated several dumps to understand its structure:
hexdump add_values.o > hexdump_add_values.hex
0000000 457f 464c 0102 0001 0000 0000 0000 0000
0000010 0001 00b7 0001 0000 0000 0000 0000 0000
0000020 0000 0000 0000 0000 0130 0000 0000 0000
0000030 0000 0000 0040 0000 0000 0040 0007 0006
0000040 7bfd a9bf 03fd 9100 0003 b940 0024 b940
0000050 0065 0b04 0045 b900 7bfd a8c1 03c0 d65f
0000060 0000 0000 0000 0000 0000 0000 0000 0000
0000070 0000 0000 0000 0000 0000 0000 0003 0001
0000080 0000 0000 0000 0000 0000 0000 0000 0000
0000090 0000 0000 0003 0002 0000 0000 0000 0000
00000a0 0000 0000 0000 0000 0000 0000 0003 0003
00000b0 0000 0000 0000 0000 0000 0000 0000 0000
00000c0 0001 0000 0000 0001 0000 0000 0000 0000
00000d0 0000 0000 0000 0000 0004 0000 0012 0001
00000e0 0000 0000 0000 0000 0000 0000 0000 0000
00000f0 2400 0078 6461 5f64 6176 756c 7365 0000
0000100 732e 6d79 6174 0062 732e 7274 6174 0062
0000110 732e 7368 7274 6174 0062 742e 7865 0074
0000120 642e 7461 0061 622e 7373 0000 0000 0000
0000130 0000 0000 0000 0000 0000 0000 0000 0000
*
0000170 001b 0000 0001 0000 0006 0000 0000 0000
0000180 0000 0000 0000 0000 0040 0000 0000 0000
0000190 0020 0000 0000 0000 0000 0000 0000 0000
00001a0 0004 0000 0000 0000 0000 0000 0000 0000
00001b0 0021 0000 0001 0000 0003 0000 0000 0000
00001c0 0000 0000 0000 0000 0060 0000 0000 0000
00001d0 0000 0000 0000 0000 0000 0000 0000 0000
00001e0 0001 0000 0000 0000 0000 0000 0000 0000
00001f0 0027 0000 0008 0000 0003 0000 0000 0000
0000200 0000 0000 0000 0000 0060 0000 0000 0000
0000210 0000 0000 0000 0000 0000 0000 0000 0000
0000220 0001 0000 0000 0000 0000 0000 0000 0000
0000230 0001 0000 0002 0000 0000 0000 0000 0000
0000240 0000 0000 0000 0000 0060 0000 0000 0000
0000250 0090 0000 0000 0000 0005 0000 0005 0000
0000260 0008 0000 0000 0000 0018 0000 0000 0000
0000270 0009 0000 0003 0000 0000 0000 0000 0000
0000280 0000 0000 0000 0000 00f0 0000 0000 0000
0000290 000f 0000 0000 0000 0000 0000 0000 0000
00002a0 0001 0000 0000 0000 0000 0000 0000 0000
00002b0 0011 0000 0003 0000 0000 0000 0000 0000
00002c0 0000 0000 0000 0000 00ff 0000 0000 0000
00002d0 002c 0000 0000 0000 0000 0000 0000 0000
00002e0 0001 0000 0000 0000 0000 0000 0000 0000
00002f0
readelf -S add_values.o > sec_headers_add_values.relf
1 There are 7 section headers, starting at offset 0x130:
2
3 Section Headers:
4 [Nr] Name Type Address Offset
5 Size EntSize Flags Link Info Align
6 [ 0] NULL 0000000000000000 00000000
7 0000000000000000 0000000000000000 0 0 0
8 [ 1] .text PROGBITS 0000000000000000 00000040
9 0000000000000020 0000000000000000 AX 0 0 4
10 [ 2] .data PROGBITS 0000000000000000 00000060
11 0000000000000000 0000000000000000 WA 0 0 1
12 [ 3] .bss NOBITS 0000000000000000 00000060
13 0000000000000000 0000000000000000 WA 0 0 1
14 [ 4] .symtab SYMTAB 0000000000000000 00000060
15 0000000000000090 0000000000000018 5 5 8
16 [ 5] .strtab STRTAB 0000000000000000 000000f0
17 000000000000000f 0000000000000000 0 0 1
18 [ 6] .shstrtab STRTAB 0000000000000000 000000ff
19 000000000000002c 0000000000000000 0 0 1
20 Key to Flags:
21 W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
22 L (link order), O (extra OS processing required), G (group), T (TLS),
23 C (compressed), x (unknown), o (OS specific), E (exclude),
24 D (mbind), p (processor specific)
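The first line of the readelf output ("7 section headers, starting at offset 0x130") can be cross-checked against the first 64 bytes of the hexdump, which hold the ELF64 file header. The sketch below is illustrative and makes two assumptions: the helper names are hypothetical, and the hexdump words are taken to be 16-bit little-endian values, as the leading `457f 464c` (bytes 7f 45 4c 46, i.e. "\x7fELF") suggests. The field offsets 0x28 (e_shoff), 0x3a (e_shentsize), and 0x3c (e_shnum) come from the ELF64 header layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// First 64 bytes of add_values.o, copied from the hexdump as 16-bit words.
constexpr std::array<std::uint16_t, 32> kHeaderWords = {
    0x457f, 0x464c, 0x0102, 0x0001, 0x0000, 0x0000, 0x0000, 0x0000,
    0x0001, 0x00b7, 0x0001, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000,
    0x0000, 0x0000, 0x0000, 0x0000, 0x0130, 0x0000, 0x0000, 0x0000,
    0x0000, 0x0000, 0x0040, 0x0000, 0x0000, 0x0040, 0x0007, 0x0006};

// Read the 16-bit field that starts at an even byte offset.
std::uint16_t read_u16(std::size_t byte_offset) {
    assert(byte_offset % 2 == 0 && byte_offset / 2 < kHeaderWords.size());
    return kHeaderWords[byte_offset / 2];
}

// Read a 64-bit little-endian field from four consecutive 16-bit words.
std::uint64_t read_u64(std::size_t byte_offset) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        value |= static_cast<std::uint64_t>(read_u16(byte_offset + 2 * i)) << (16 * i);
    }
    return value;
}

std::uint64_t section_header_offset() { return read_u64(0x28); }  // e_shoff
std::uint16_t section_header_size()   { return read_u16(0x3a); }  // e_shentsize
std::uint16_t section_header_count()  { return read_u16(0x3c); }  // e_shnum
```

With these values the section header table spans 7 entries of 0x40 bytes starting at 0x130, so it ends at 0x2f0, which is the final offset printed by hexdump.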
objdump --syms -S -d add_values.o > dis_add_values.dis

add_values.o: file format elf64-littleaarch64

SYMBOL TABLE:
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 g F .text 0000000000000000 add_values

Disassembly of section .text:

0000000000000000 <add_values>:
 0: a9bf7bfd stp x29, x30, [sp, #-16]!
 4: 910003fd mov x29, sp
 8: b9400003 ldr w3, [x0]
 c: b9400024 ldr w4, [x1]
 10: 0b040065 add w5, w3, w4
 14: b9000045 str w5, [x2]
 18: a8c17bfd ldp x29, x30, [sp], #16
 1c: d65f03c0 ret
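The encodings in the disassembly are the same bytes that appear in the hexdump rows at offsets 0x40 and 0x50, only grouped differently: hexdump prints 16-bit words, objdump prints 32-bit instruction words. The following sketch (the function and constant names are hypothetical, and little-endian byte order is assumed) joins each pair of hexdump words, low half first, to recover the instruction words.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combine pairs of 16-bit hexdump words (low half first) into 32-bit words.
std::vector<std::uint32_t> to_instruction_words(const std::vector<std::uint16_t>& halves) {
    assert(halves.size() % 2 == 0);
    std::vector<std::uint32_t> words;
    for (std::size_t i = 0; i < halves.size(); i += 2) {
        words.push_back((static_cast<std::uint32_t>(halves[i + 1]) << 16) | halves[i]);
    }
    return words;
}

// The .text bytes at file offsets 0x40 to 0x5f, as printed by hexdump.
const std::vector<std::uint16_t> kTextHalves = {
    0x7bfd, 0xa9bf, 0x03fd, 0x9100, 0x0003, 0xb940, 0x0024, 0xb940,
    0x0065, 0x0b04, 0x0045, 0xb900, 0x7bfd, 0xa8c1, 0x03c0, 0xd65f};
```

For example, the pair `7bfd a9bf` becomes 0xa9bf7bfd, the encoding objdump shows for the `stp` instruction at offset 0.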
Task 1.2.3
The next step was to determine the size of the .text section and understand its content. This information is available in line 9 of the section headers file:
[ 1] .text PROGBITS 0000000000000000 00000040
0000000000000020 0000000000000000 AX 0 0 4
The .text section has a size of 0000000000000020, or 0x20 bytes, which corresponds to 32 bytes in decimal. Since each AArch64 instruction is 4 bytes, the function add_values consists of exactly 8 instructions.
We confirmed this observation by inspecting the disassembled output:
0: a9bf7bfd stp x29, x30, [sp, #-16]!
4: 910003fd mov x29, sp
8: b9400003 ldr w3, [x0]
c: b9400024 ldr w4, [x1]
10: 0b040065 add w5, w3, w4
14: b9000045 str w5, [x2]
18: a8c17bfd ldp x29, x30, [sp], #16
1c: d65f03c0 ret
The instructions start at 0x00 and proceed in 4-byte increments, ending at 0x1c with the ret instruction. This confirms that there are exactly 8 instructions, which matches the .text section size.
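The arithmetic behind this check can be written out as a short sketch. The function name is hypothetical; the only facts used are the section size of 0x20 bytes and the fixed 4-byte instruction width.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kInstructionBytes = 4;  // fixed AArch64 instruction width

// Byte offsets at which instructions start in a code section of the given size.
std::vector<std::uint64_t> instruction_offsets(std::uint64_t section_size) {
    assert(section_size % kInstructionBytes == 0);
    std::vector<std::uint64_t> offsets;
    for (std::uint64_t off = 0; off < section_size; off += kInstructionBytes) {
        offsets.push_back(off);
    }
    return offsets;
}
```

For a section size of 0x20 this yields 8 offsets, from 0x0 to 0x1c, the same offsets that appear in the left column of the disassembly.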
Task 1.2.4
Similar to the first task, we tested the functionality of the add_values function by implementing a C++ driver:
#include <cstdint>   // int32_t
#include <iostream>

extern "C" void add_values(
    int32_t * a,
    int32_t * b,
    int32_t * c
);

int main()
{
    std::cout << "Calling assembly 'add_value' function...\n";

    // Test Data
    int32_t l_value_1 = 4;
    int32_t * l_ptr_1 = &l_value_1;

    int32_t l_value_2 = 7;
    int32_t * l_ptr_2 = &l_value_2;

    int32_t l_value_o;
    int32_t * l_ptr_o = &l_value_o;

    // Call to add_values
    add_values( l_ptr_1, l_ptr_2, l_ptr_o );

    std::cout << "l_data_1 / l_value_2 / l_value_o\n"
              << l_value_1 << " / "
              << l_value_2 << " / "
              << l_value_o
              << std::endl;

    return 0;
}
We compiled and linked the driver with the assembly file using:
g++ driver_add_values.cpp add_values.s -o driver_add_values
After executing these commands, we received the following results:
Calling assembly 'add_value' function ...
l_data_1 / l_value_2 / l_value_o
4 / 7 / 11
This confirms that add_values correctly adds the two input values and stores the result at the specified memory location.
Task 1.2.5
To better understand the contents of the general-purpose registers during a call to add_values, we used the GNU Project Debugger (GDB) to step through the function execution.
We launched GDB with the compiled executable:
gdb ./driver_add_values
lay next
To activate the layout that displays both the assembly instructions and the register contents, we ran lay next. After pressing Enter, GDB displayed the desired view.
Next, we set a breakpoint at the add_values function and began the debugging process:
break add_values
run
After starting the debugging process, we used nexti to step through each instruction one at a time: