Some ideas to increase detection complexity

SPTH
Valhalla #1
July 2011


0) Introduction

Here you'll find a few small ideas and thoughts about making detection of computer viruses harder. Thanks a lot to herm1t and hh86 for the discussion and for asking the right questions.

1) Improving tau-obfuscation?

The idea of tau-obfuscation is to perform a time-intensive calculation before decrypting/executing the virus code, with the result that realistic AV emulators have to give up (as they can't scan one file for too long). This technique has already been covered by Beaucamps & Filiol[1] and Z0MBiE[2].

A simple example:

	encrypted_code=[ENCRYPTED CODE];
	key=sum(factors(VERY_BIG_INTEGER_NUMBER));
	eval(decrypt(encrypted_code, key));

  1. Philippe Beaucamps & Eric Filiol, "On the possibility of practically obfuscating programs: Towards a unified perspective of code protection", Journal in Computer Virology, April 2007.
  2. Z0MBiE, "'DELAYED CODE' technology (version 1.1)", 2000, http://vxheavens.com/lib/vzo23.html
  3. SPTH, "Matlab.MicrophoneFever2", Valhalla Magazine, July 2011.

2) Reverse Engineering vs. Meta-Language in Body

Metamorphic viruses/worms need the information of their structure coded in a metalanguage to work with it later (change it and write it back to native code).

One way is to get it by reverse engineering (disassembling) the code.

Biological organisms also need the information about their structure coded in a metalanguage to work with it later (due to the lack of a "copy function").

They could also use a mechanism of reverse engineering the structures in the cell to get this information.

They don't do this, because it's way too complicated. Instead, they store the whole information within the cell in the form of the metalanguage (DNA), and therefore they can start directly at this step.

For computer viruses, the metalanguage structure must not appear in plain text, and simple encryption is vulnerable to statistical attacks.

Instead, one could write the zero-form at runtime to memory:

        mov     edi, Alloc_memory_for_metalanguage
        mov     dword[edi], 'AABBCCDD'
        mov     dword[edi+4], 'EEFFGGHH'
 

Advantage: This writing process is an excellent source for metamorphic mutations. It increases the variability of the organism a lot, and with that also the detection complexity.

We can be funny and add simple encryption to written memory:

        mov     edi, Alloc_memory_for_metalanguage
        mov     dword[edi], 'XXYYZZAA'
        mov     dword[edi+4], 'BBCCDDEE'
        ...
 
        for(int i=0; i<Metalanguage_size; i++)
        {
                mov byte[edi+i], (byte[edi+i]+23)%26;
        }
        ...
 

Now - an emulation can kill us? No, just use tau-obfuscation :)

PS: Conway's Game of Life is known to be Turing-complete. In 2010, Andrew Wade wrote the first self-replicator in that "universe". The self-replicator has its own structure stored in a dynamic tape (DNA) and uses a glider-stream (biosynthesis?) to gain the information. (http://conwaylife.com/forums/viewtopic.php?f=2&t=399)

3) Code Integration -> Code Merging

Code integration is certainly the most complex infection technique for computer viruses so far. It was first used in ZMist by Z0MBiE for Win32 executables in 2001[4][5], and later in 2007 by herm1t in his Linux.Lacrimae[6][7].

The idea is to fully disassemble the host and virus, and integrate the viruscode into the hostcode:

     ***************             #####################
     *             *             ##                 ##
     *      H      *             ##  jmp Vir1       ##
     *             *             ##  Host1:         ##
     *      O      *             ##        H        ##
     *             *             ##  jmp Host2      ##
     *      S      *             ##  Vir3:          ##
     *             *             ##        R        ##
     *      T      *             ##  jmp Host1      ##
     *             *             ##  Host2:         ##
     ***************             ##        O        ##
                       - - - >   ##  jmp Host3      ##
     +++++++++++++++             ##  Vir1:          ##
     +             +             ##        V        ##
     +      V      +             ##  jmp Vir2       ##
     +             +             ##  Host3:         ##
     +      I      +             ##        S        ##
     +             +             ##  jmp Host4      ##
     +      R      +             ##  Vir2:          ##
     +             +             ##        I        ##
     +++++++++++++++             ##  jmp Vir3       ##
                                 ##  Host4:         ##
                                 ##        T        ##
                                 ##                 ##
                                 #####################

This is a successful technique. However, we can try to take it one step further.

Instead of merely inserting the virus between the hostcode, we can actually use the hostcode as viruscode, by creating a second codeflow.

Let's say, we want to include a simple

        invoke MessageBox, 0x0, VMSG1, VMSG2, 0x0
 

into a given hostcode:

include 'E:\Programme\FASM\INCLUDE\win32ax.inc'

.data
FileName db 'info.txt',0
hCreateFileFile dd 0x0

.code
start:
        push    0x0
        push    FILE_ATTRIBUTE_NORMAL
        push    OPEN_ALWAYS
        push    0x0
        push    0x0
        push    (GENERIC_READ or GENERIC_WRITE)
        push    FileName
        stdcall dword[CreateFileA]
        mov     dword[hCreateFileFile], eax
ret
.end start
 

To get enough instructions that we can use, we can expand the hostcode:

include 'E:\Programme\FASM\INCLUDE\win32ax.inc'

.data
FileName db 'info.txt',0
hCreateFileFile dd 0x0

.code
start:
        push    0x0
        mov     eax, FILE_ATTRIBUTE_NORMAL
        push    eax
        push    OPEN_ALWAYS
        push    0x0
        push    0x0
        mov     eax, (GENERIC_READ or GENERIC_WRITE)
        push    eax
        mov     eax, FileName
        push    eax
        mov     eax, CreateFileA
        stdcall dword[eax]
        mov     ebx, hCreateFileFile
        mov     dword[ebx], eax
ret

.end start      
 

And now let's merge our MessageBox with this hostcode.

include 'E:\Programme\FASM\INCLUDE\win32ax.inc'

.data
FileName db 'info.txt',0
hCreateFileFile dd 0x0

VMSG1 db 'Hello',0
VMSG2 db 'VXers!',0

.code
start:
        xor     ecx, ecx                ; Set ZF
        jmp     VirInstr1
        HostInstr0:
        push    0x0
        mov     eax, FILE_ATTRIBUTE_NORMAL
        jnz     HostInstr1
        VirInstr3:
        add     eax, (VMSG1-FileName)
        xor     ecx, ecx                ; Set ZF
        jmp     VirInstr4
        HostInstr1:
        VirInstr6:
        push    eax
        jz      VirInstr7
        push    OPEN_ALWAYS
        VirInstr7:
        push    0x0
        jz      VirInstr8
        VirInstr1:
        push    0x0
        jz      VirInstr2
        mov     eax, (GENERIC_READ or GENERIC_WRITE)
        VirInstr4:
        push    eax
        jz      VirInstr5
        VirInstr2:
        mov     eax, FileName
        jz      VirInstr3
        push    eax
        jnz     HostInstr4
        VirInstr10:
        inc     ecx                     ; Clear ZF
        jmp     HostInstr0
        HostInstr4:
        mov     eax, CreateFileA
        VirInstr9:
        stdcall dword[eax]
        jz      VirInstr10
        jnz     HostInstr2
        VirInstr5:
        add     eax, (VMSG2-VMSG1)
        xor     ecx, ecx                ; Set ZF
        jmp     VirInstr6
        HostInstr2:
        mov     ebx, hCreateFileFile
        jnz     HostInstr3
        VirInstr8:
        add     eax, (MessageBox-VMSG2)
        xor     ecx, ecx                ; Set ZF
        jmp     VirInstr9
        HostInstr3:
        mov     dword[ebx], eax
ret

.end start    
 

We use the instructions given by the hostcode, and combine them with conditional jumps. The only instructions that are not merged are some re-adjustments of addresses (MessageBox, VMSG1, VMSG2) - in fact this could be done by merging too, but the result would be more complex.

Besides making the code hard to recognize (even for the human eye), it provides a lot of freedom that can be used to alter the code in every generation: which instructions are expanded; which registers are used for expansion; how the codeflow of the virus runs; ...

In my opinion: absolutely worth bringing to reality! :)

  4. Z0MBiE, "Automated reverse engineering: Mistfall engine.", 2000, http://vxheavens.com/lib/vzo21.html
  5. Peter Ferrie & Péter Ször, "Zmist Opportunities", Virus Bulletin, March 2001, http://vxheavens.com/lib/apf47.html
  6. herm1t, "Code integration on Linux: Cooking the PIE", EOF-DR-RRLF, 2008.
  7. Peter Ferrie, "Crimea river", Virus Bulletin, February 2008, http://vxheavens.com/lib/apf12.html

4) Overlapping Code for mutations

Overlapping code consists of code segments that behave differently depending on the address at which execution enters them. For instance:

         00402000 > $ B8 31C04040    MOV EAX,4040C031

What happens if we jump to 00402001?

         00402001   > 31C0           XOR EAX,EAX
         00402003   . 40             INC EAX
         00402004   . 40             INC EAX
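
The two listings above are the same five bytes read from different starting offsets. As a minimal sketch of that decoding, the toy decoder below handles only the three opcode forms that occur in this example (MOV r32,imm32; XOR in register form; INC r32) and assumes 32-bit mode. The names `decode`, `REGS` and `CODE` are made up for this illustration and are not part of any real disassembler.

```python
import struct

REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def decode(code, offset):
    """Linearly decode a tiny subset of 32-bit x86, starting at `offset`."""
    out = []
    i = offset
    while i < len(code):
        op = code[i]
        if 0xB8 <= op <= 0xBF and i + 5 <= len(code):
            # B8+r imm32: MOV r32, imm32 (immediate is little-endian)
            imm = struct.unpack_from("<I", code, i + 1)[0]
            out.append(f"MOV {REGS[op - 0xB8]},{imm:08X}")
            i += 5
        elif op == 0x31 and i + 2 <= len(code) and code[i + 1] >> 6 == 3:
            # 31 /r with mod=11: XOR r/m32, r32 (register-to-register form)
            modrm = code[i + 1]
            out.append(f"XOR {REGS[modrm & 7]},{REGS[(modrm >> 3) & 7]}")
            i += 2
        elif 0x40 <= op <= 0x47:
            # 40+r: INC r32
            out.append(f"INC {REGS[op - 0x40]}")
            i += 1
        else:
            # Anything outside the toy subset is emitted as a raw byte
            out.append(f"DB {op:02X}")
            i += 1
    return out

CODE = bytes.fromhex("B831C04040")
print(decode(CODE, 0))  # ['MOV EAX,4040C031']
print(decode(CODE, 1))  # ['XOR EAX,EAX', 'INC EAX', 'INC EAX']
```

Starting at offset 0, the bytes 31 C0 40 40 are consumed as the immediate of the MOV; starting at offset 1, the same bytes are decoded as instructions of their own.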

This can be used in a vast variety of ways for obfuscation (in 1994, Stormbringer wrote a virus that just consists of jump instructions, using overlapping code[8]) or code protection[9].

Certainly, this can be used in mutation engines too, where it gives additional variability.

Some examples:

Our code:

         00402000 > $ 31C0           XOR EAX,EAX
         00402002   . 40             INC EAX
         00402003   . 40             INC EAX

Overlapped Code:

         00402000 > $ 68 11204000    PUSH overlap_.00402011
         00402005   . 68 0C204000    PUSH overlap_.0040200C
         0040200A   . 81F7 31C040C3  XOR EDI,C340C031
         00402010   . C3             RETN
         00402011   . 40             INC EAX

or

         00402000 > $ B8 31C04040    MOV EAX,4040C031
         00402005   . 3D 31C04040    CMP EAX,4040C031
         0040200A   .^74 F5          JE SHORT overlap_.00402001

or

         00402000 > $ EB 02          JMP SHORT overlap_.00402004
         00402002   . 81FE 31C04040  CMP ESI,4040C031

There are over 9,000 other ways to write the original instructions down using overlapping code. One may consider this when planning the next mutation engine.

  8. Stormbringer, "Jump", 40hex #14, 1994.
  9. Matthias Jacob & Mariusz H. Jakubowski & Ramarathnam Venkatesan, "Towards Integral Binary Execution: Implementing Oblivious Hashing Using Overlapped Instruction Encodings", 2007.

Second Part To Hell
July 2011