Initial commit - Begun rewrite of website generator
160
pages/2013-08-21-Loading-Elf.md
Normal file
@@ -0,0 +1,160 @@
layout: post
title: "Loading elf"
subtitle: "there's DWARF in my ELF."
tags: [osdev]

### Elf header format

ELF files all start with a header which identifies the file and explains
where to find everything. For 32-bit files it has the following structure.
The
[ELF specification](http://www.skyfree.org/linux/references/ELF_Format.pdf)
gives an excellent description of the meaning and use of each field.

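    typedef struct
    {
      uint8_t identity[16];   // Magic number, class, byte order, ...
      uint16_t type;          // Object file type (2 = executable)
      uint16_t machine;       // Target architecture
      uint32_t version;
      uint32_t entry;         // Virtual address of the entry point
      uint32_t ph_offset;     // File offset of the program header table
      uint32_t sh_offset;     // File offset of the section header table
      uint32_t flags;         // Processor-specific flags
      uint16_t header_size;   // Size of this header
      uint16_t ph_size;       // Size of one program header
      uint16_t ph_num;        // Number of program headers
      uint16_t sh_size;       // Size of one section header
      uint16_t sh_num;        // Number of section headers
      uint16_t strtab_index;  // Section header index of the name string table
    }__attribute__((packed)) elf_header;
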
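Since the struct is laid directly over the bytes of the file, its layout has to match the specification exactly. A minimal sketch of how that could be checked at compile time, assuming a C11 compiler that supports `_Static_assert` and the GCC/Clang `packed` attribute (the struct is repeated only so the example compiles on its own; the expected numbers follow from adding up the field sizes):

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the header struct above, repeated here so the
 * example is self-contained. */
typedef struct
{
  uint8_t identity[16];
  uint16_t type;
  uint16_t machine;
  uint32_t version;
  uint32_t entry;
  uint32_t ph_offset;
  uint32_t sh_offset;
  uint32_t flags;
  uint16_t header_size;
  uint16_t ph_size;
  uint16_t ph_num;
  uint16_t sh_size;
  uint16_t sh_num;
  uint16_t strtab_index;
} __attribute__((packed)) elf_header;

/* 16 identity bytes + 2*2 + 5*4 + 6*2 = 52 bytes for a 32-bit ELF header */
_Static_assert(sizeof(elf_header) == 52, "ELF32 header should be 52 bytes");
_Static_assert(offsetof(elf_header, entry) == 24, "entry at offset 24");
_Static_assert(offsetof(elf_header, ph_offset) == 28, "ph_offset at offset 28");
_Static_assert(offsetof(elf_header, ph_num) == 44, "ph_num at offset 44");
```

If a field is added, removed or given the wrong width, the build fails instead of the loader silently reading garbage.
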
The first thing we should do is check whether we actually got an
executable ELF file. (In the following code, I'll assume the entire ELF
file is located somewhere in memory and that this location is passed to
the `load_elf()` function.)

To check if the file is an ELF executable we can look at the
`identity` field. The first four bytes of this field should always be
`0x7F`, `'E'`, `'L'`, `'F'`. If that's correct, we can look at the `type`
field. For an executable standalone program, this should be `2`.

    int load_elf(uint8_t *data)
    {
      elf_header *elf = (elf_header *)data;
      // Bail out unless this is an ELF file of type executable
      if(is_elf(elf) != ELF_TYPE_EXECUTABLE)
        return -1;
      ...

`is_elf` looks as follows. Note the use of `strncmp`, which I can do
because I link [newlib into my kernel](/blog/2013/08/Catching-Up/).

    int is_elf(elf_header *elf)
    {
      // Returns the ELF type field, or -1 if the magic number is wrong
      int iself = -1;

      if((elf->identity[0] == 0x7f) &&
        !strncmp((char *)&elf->identity[1], "ELF", 3))
      {
        iself = 0;
      }

      if(iself != -1)
        iself = elf->type;

      return iself;
    }

This should be pretty straightforward. Let's continue.

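As an aside, the identity bytes hold more than the magic number. According to the ELF specification, byte 4 is the file class (`1` for 32-bit) and byte 5 is the data encoding (`1` for little-endian). A loader that only understands 32-bit little-endian files might want to reject everything else up front. The following is a sketch of such a check, not what the kernel in this post does; the function and macro names are made up for the example, and it uses `memcmp` rather than `strncmp`.

```c
#include <stdint.h>
#include <string.h>

/* Indices into the identity bytes and the values expected for a 32-bit
 * little-endian file, as listed in the ELF specification. The macro names
 * are invented for this sketch. */
#define ID_CLASS 4           /* identity[4]: file class */
#define ID_DATA 5            /* identity[5]: data encoding */
#define CLASS_32BIT 1
#define DATA_LITTLE_ENDIAN 1

/* Returns 1 if the identity bytes describe a 32-bit little-endian ELF
 * file, 0 otherwise. */
int identity_is_elf32_le(const uint8_t *identity)
{
  static const uint8_t magic[4] = {0x7f, 'E', 'L', 'F'};

  if(memcmp(identity, magic, sizeof(magic)) != 0)
    return 0;
  if(identity[ID_CLASS] != CLASS_32BIT)
    return 0;
  if(identity[ID_DATA] != DATA_LITTLE_ENDIAN)
    return 0;
  return 1;
}
```

It would be called with `elf->identity` before looking at any of the other header fields, since those are only meaningful once the class and byte order are known.
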
To load a simple ELF program, we only need to look at the program
headers, which are located in a table at offset `ph_offset` in the file.

    typedef struct
    {
      uint32_t type;              // Segment type (e.g. loadable)
      uint32_t offset;            // Offset of the segment in the file
      uint32_t virtual_address;   // Where the segment goes in memory
      uint32_t physical_address;
      uint32_t file_size;         // Bytes of the segment stored in the file
      uint32_t mem_size;          // Bytes the segment occupies in memory
      uint32_t flags;             // Read/write/execute permissions
      uint32_t align;
    }__attribute__((packed)) elf_phead;

Each program header tells us about one segment of the file, and we
use them to find out which parts of the ELF image should be loaded where
in memory. So, the next step is to go through all program headers
looking for loadable segments and load them into memory.

      ...
      // The program header table is an array of ph_num entries
      elf_phead *phead = (elf_phead *)&data[elf->ph_offset];
      uint32_t i;
      for(i = 0; i < elf->ph_num; i++)
      {
        if(phead[i].type == ELF_PT_LOAD)
        {
          load_elf_segment(data, &phead[i]);
        }
      }
      return 0;
    }

This would also be a good time to update the memory manager's information
about the executable. You might want to keep track of the start and end
of code and data, for example.

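One way to collect that information is to walk the program headers once more and record the lowest and highest address any loadable segment touches. The sketch below is only an illustration: `image_extent`, `find_image_extent` and `PT_LOAD_TYPE` are names invented here, the program header struct is repeated so the example compiles on its own, and the value `1` for loadable segments is taken from the ELF specification.

```c
#include <stdint.h>

#define PT_LOAD_TYPE 1 /* program header type of a loadable segment */

/* Same layout as the program header struct above. */
typedef struct
{
  uint32_t type;
  uint32_t offset;
  uint32_t virtual_address;
  uint32_t physical_address;
  uint32_t file_size;
  uint32_t mem_size;
  uint32_t flags;
  uint32_t align;
} __attribute__((packed)) elf_phead;

/* Address range [start, end) covered by the loaded image. */
typedef struct
{
  uint32_t start;
  uint32_t end;
} image_extent;

/* Returns the range spanned by all loadable segments, or {0, 0} if there
 * are none. */
image_extent find_image_extent(const elf_phead *phead, uint32_t num)
{
  image_extent ext = {UINT32_MAX, 0};
  int found = 0;
  uint32_t i;

  for(i = 0; i < num; i++)
  {
    if(phead[i].type != PT_LOAD_TYPE)
      continue;

    uint32_t seg_start = phead[i].virtual_address;
    uint32_t seg_end = seg_start + phead[i].mem_size;

    if(seg_start < ext.start) ext.start = seg_start;
    if(seg_end > ext.end) ext.end = seg_end;
    found = 1;
  }

  if(!found)
    ext.start = 0;
  return ext;
}
```

Splitting this into separate code and data ranges could be done the same way by also looking at the write flag of each segment.
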
Anyway, `load_elf_segment()` looks like this:

    void load_elf_segment(uint8_t *data, elf_phead *phead)
    {

      uint32_t memsize = phead->mem_size; // Size in memory
      uint32_t filesize = phead->file_size; // Size in file
      uint32_t mempos = phead->virtual_address; // Address in memory
      uint32_t filepos = phead->offset; // Offset in file

      uint32_t flags = MM_FLAG_READ;
      if(phead->flags & ELF_PT_W) flags |= MM_FLAG_WRITE;

      new_area(current->proc, mempos, mempos + memsize,
        flags, MM_TYPE_DATA);

      if(memsize == 0) return;

      // Copy what the file provides, then zero the remainder
      memcpy((void *)mempos, &data[filepos], filesize);
      memset((void *)(mempos + filesize), 0, memsize - filesize);
    }

Let's go through it.

First we define some helper variables.

Next we check if the segment we're loading should be writable.

Then we request a new memory area from the [process memory
manager](/blog/2013/06/Even-More-Memory/).

Finally, we copy as much data as is provided in the file and fill the
rest of the new area with zeros. A segment can be larger in memory than
in the file, which is how uninitialized data gets its space.

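The copy-and-zero step can be tried out on its own, outside the kernel. The sketch below assumes a plain destination buffer instead of a freshly mapped memory area, and `copy_segment` is a name made up for this example. It also adds one check that the function above leaves out: a well-formed file never has a file size larger than the memory size, but a malformed one could, and then `memsize - filesize` would wrap around to a huge unsigned value.

```c
#include <stdint.h>
#include <string.h>

/* Copies file_size bytes of a segment from the file image into dest and
 * zero-fills the rest up to mem_size.
 * Returns 0 on success, -1 if the sizes are inconsistent. */
int copy_segment(uint8_t *dest, const uint8_t *file, uint32_t file_offset,
                 uint32_t file_size, uint32_t mem_size)
{
  /* Guard against unsigned wrap-around in mem_size - file_size */
  if(file_size > mem_size)
    return -1;

  memcpy(dest, &file[file_offset], file_size);
  memset(dest + file_size, 0, mem_size - file_size);
  return 0;
}
```

With an 8-byte "file", a call like `copy_segment(mem, file, 2, 3, 6)` copies three bytes starting at file offset 2 and zeroes the remaining three bytes of the destination.
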
And that's really all you need to do to load an ELF executable.
The only thing left is to jump to `elf->entry` and you're going.

### Improvements
Of course, in the normal case the entire executable file won't already
be sitting in memory, though it might be for e.g. an `init` program or
similar that your bootloader loads as a module to your kernel. Instead, you
should read the parts you want through your filesystem as you go along.

Or maybe you shouldn't. It doesn't make sense to load a huge program
into memory all at once. What if it encounters an error and exits with
99% of the code unexecuted?

Perhaps the process memory manager could be told where to find certain
parts of the program, and load them only when needed?

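To make that last idea a bit more concrete, here is a rough sketch of what such bookkeeping might look like. Everything in it is hypothetical: `lazy_region` and `fill_page` are invented names, the region is assumed to start on a page boundary, pages are assumed to be 4096 bytes, and the segment's bytes are reached through a plain pointer where a real kernel would read from the filesystem. The idea is that the memory manager remembers where each region's contents come from, and a page fault handler fills in a single page when it is first touched.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Describes where the contents of a memory region come from.
 * Hypothetical structure, assuming start is page aligned. */
typedef struct
{
  uint32_t start;           /* First virtual address of the region */
  uint32_t mem_size;        /* Size of the region in memory */
  const uint8_t *file_data; /* The segment's bytes in the file */
  uint32_t file_size;       /* How many bytes the file provides */
} lazy_region;

/* Fills the buffer page (PAGE_SIZE bytes) with the contents belonging to
 * the page that contains fault_address.
 * Returns 0 on success, -1 if the address is outside the region. */
int fill_page(const lazy_region *r, uint32_t fault_address, uint8_t *page)
{
  if(fault_address < r->start || fault_address - r->start >= r->mem_size)
    return -1;

  /* Offset of the faulting page from the start of the region */
  uint32_t page_start = fault_address & ~(PAGE_SIZE - 1);
  uint32_t offset = page_start - r->start;

  /* Bytes of this page that are backed by the file; the rest is zero */
  uint32_t from_file = 0;
  if(offset < r->file_size)
  {
    from_file = r->file_size - offset;
    if(from_file > PAGE_SIZE)
      from_file = PAGE_SIZE;
  }

  memset(page, 0, PAGE_SIZE);
  memcpy(page, r->file_data + offset, from_file);
  return 0;
}
```

A program that exits early would then only ever cause the pages it actually touched to be read in.
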
### Git
The methods described in this post have been implemented in git commit
[a4ca835d1d](https://github.com/thomasloven/os5/tree/a4ca835d1db61faf214b4b617d38a335ef05d142).