OPENING AND READING FILES
x86 System Call Table
Before diving into code implementation, let's understand the system calls required.
%eax | Name | Source | %ebx | %ecx | %edx | %esi | %edi |
---|---|---|---|---|---|---|---|
1 | sys_exit | kernel/exit.c | int | - | - | - | - |
2 | sys_fork | arch/i386/kernel/process.c | struct pt_regs | - | - | - | - |
3 | sys_read | fs/read_write.c | unsigned int | char * | size_t | - | - |
4 | sys_write | fs/read_write.c | unsigned int | const char * | size_t | - | - |
5 | sys_open | fs/open.c | const char * | int | int | - | - |
… | … | … | … | … | … | … | … |
I'll focus on the open and read system calls for file manipulation.
I can find the complete x86 system call table online. Additionally, I can use the terminal command man syscalls to access the system call manual.
Opening a File
If I use the terminal command man 2 open (also available online), I'll find the following function:
int open(const char *pathname, int flags);
So, I need the following information to open a file:
- The open system call requires eax to be set to 5;
- It takes a file path in ebx as a const char *;
- I need to provide a flag as an argument in ecx;
Flags are used to specify the mode in which the file should be opened. For example, O_RDONLY for read-only mode, O_WRONLY for write-only mode, and O_RDWR for read-write mode. I can check that in the terminal using man 2 open and scrolling down to the flags section.
One small thing is that the man page defines the flags as C macros, but in assembly I need to use the actual values. I can go to the fcntl.h file and find the actual value of O_RDONLY, which is 00000000 (octal) = 0.
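section .data
pathname db "path/to/file.txt", 0 ; The kernel expects a null-terminated path
section .text
global main
main:
mov eax, 5 ; Open system call
mov ebx, pathname ; File path
mov ecx, 0 ; Flags (O_RDONLY)
int 80h ; Call the system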
- The result of the system call will be my file descriptor, which will be stored in eax;
- In GDB, I can use x/10x [pointer] to see the memory content in hexadecimal format and x/10s [pointer] to see it in string format;
A file descriptor is a small integer that the operating system assigns to a file when a process opens it. It is used to reference the file in subsequent operations. The first three file descriptors are normally reserved for the standard input, output, and error streams (0, 1, and 2, respectively). On success, the open system call returns the lowest unused descriptor, so with the three standard streams open it will be 3 or greater. On failure, eax holds a negative error code instead.
Reading from the File
If I use the terminal command man 2 read (also available online), I'll find the following function:
ssize_t read(int fd, void *buf, size_t count);
Since I now know the file descriptor from the previous example (eax = 3), I can read from the file using the read system call.
So, I need the following information to read from a file:
- I copy eax to ebx to pass the file descriptor, and then set eax to 3 for the read system call;
- I will need a buffer to store the data read from the file, with its address in ecx;
- The size of the buffer should be provided in edx as a size_t;
section .data
pathname db "path/to/file.txt", 0 ; Null-terminated file path
section .bss
buffer resb 1024 ; Reserve 1024 bytes for the buffer because I don't know the file size
section .text
global main
main:
mov eax, 5 ; Open system call
mov ebx, pathname ; File path
mov ecx, 0 ; Flags
int 80h ; Call the system
mov ebx, eax ; Store the file descriptor in ebx
mov eax, 3 ; Read system call
mov ecx, buffer ; Buffer to store the data
mov edx, 1024 ; Size of the buffer
int 80h ; Call the system
mov eax, 1 ; Exit system call
mov ebx, 0 ; Exit code (success)
int 80h ; Call the system
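The result of the system call will be the number of bytes read, which will be stored in eax. If eax is 0, it means that the end of the file has been reached. If eax is negative, it means that an error occurred, and its absolute value is the error code. For example, -9 (-EBADF) means that the file descriptor is invalid, and a failed open would leave -2 (-ENOENT) when the file does not exist. The -1 described in man 2 read is what the C library wrapper returns after storing the error code in errno.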