Development of Secure Software Systems

CSci 4271 Lab 3

Today's lab will focus on buffer overflow attacks, based on a version of the Badly Coded Print Server code that you were introduced to in class. Your end goal is to attack the program, taking over its control flow. But first you need to find a vulnerability that allows that attack (this is auditing, which you started on earlier), then understand what control you have over the program's memory, and plan the attack accordingly.

In the online lab we'll randomly split you into breakout groups of 2-3 students: please work together, discuss, and learn from the other student(s) in your group. Use the "Ask for Help" button to ask questions or show off what you've done. We also recommend working in groups in the in-person lab, but there you can choose your own groups and physically raise your hand to ask a question. You may still find it useful to use Zoom or tmate for screen sharing in person while respecting social distancing.

Your attack should overwrite a return address to return control to a function named attack_function, which is otherwise not reachable during the program run. Go through the program first to understand its control flow, then identify the vulnerable functions that you can potentially attack. It is often a good plan to first check that you can drive the program to the point where the vulnerable function segfaults. Then adjust the crashing input so that it causes a jump to the attacker's target instead. Transferring control to attack_function serves as a proxy for transferring control to shellcode that would be used in a more complete attack: if you can make the control flow go to some "shellcode" that's already in the program, the same technique would also allow transfers anywhere else the attacker wanted.

The bclpr.c program that you will run today has been slightly modified from the one we previously looked at in class, so that you can try attacks out by running it and not just read the code. This version doesn't need to be installed using root access, and we've confirmed that the bugs are still exploitable when compiled to x86-64 (the original version from several years ago was x86-32).

You can copy the program source code to your working directory using a command similar to the ones in previous labs:

cp /web/classes/Spring-2021/csci4271/labs/03/bclpr.c .

Because this is a simulation of a program that would be installed in a system-wide location if you controlled the full computer, we need to do a bit more to simulate "installing" it so that it will run correctly. You need a location that you have access to but that will be unique even on a shared computer, so we recommend that you compile and install the program to a location similar to /tmp/bclpr-goldy007, but where goldy007 is replaced with your UMN ID. /tmp is a directory for temporary files that everyone has access to, while using your UMN ID makes the path unique. You'll need to make this change both on line 23 of the source, which defines the INSTALL_PREFIX macro, and in the commands below that create the directories the program will use:

gcc -no-pie -z execstack -g -Wall -Wno-format-security -fno-stack-protector bclpr.c -o bclpr
mkdir -p /tmp/bclpr-goldy007/spool/lp0
mkdir -p /tmp/bclpr-goldy007/printouts/lp0
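
The same directory layout can also be created from a script. The following is a minimal Python sketch, not part of the lab materials. It assumes that your login name on the lab machine matches your UMN ID (pass the ID explicitly if it does not), and the function name make_install_dirs is made up for illustration. It only mirrors the two mkdir -p commands above; you still need to edit INSTALL_PREFIX in bclpr.c to the same path.

```python
import getpass
import os


def make_install_dirs(umn_id=None, base="/tmp"):
    """Create the spool and printout directories, like the two mkdir -p commands.

    umn_id defaults to the current login name, which is an assumption about
    how accounts are named on the lab machines.
    """
    if umn_id is None:
        umn_id = getpass.getuser()
    prefix = os.path.join(base, "bclpr-" + umn_id)
    for sub in ("spool/lp0", "printouts/lp0"):
        # exist_ok=True gives the same "no error if it exists" behavior as -p
        os.makedirs(os.path.join(prefix, sub), exist_ok=True)
    return prefix


if __name__ == "__main__":
    # This path must match INSTALL_PREFIX in your copy of bclpr.c.
    print(make_install_dirs())
```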

This is a simplified simulation of a print server program, which takes the pathname of the file that you want to print. For instance, if you have a text file named hello.txt you can use the command ./bclpr hello.txt to simulate printing the file. The directory /tmp/bclpr-goldy007/printouts/lp0 (but again with your UMN ID) holds the output of the simulated printer.

You can tell that your attack has been successful if you see the following message printed:

...You have successfully completed the attack...

This code contains several buffer-overflow vulnerabilities, as well as at least one non-buffer-overflow vulnerability that could also be used to hijack control flow. But since we have limited time in lab, we recommend you start with the vulnerability that looks like it would be easiest to exploit.

If you want to have the fun of attacking the vulnerability with no further hints, you should stop reading now. If you feel like some more specific hints would help you learn more during lab time, you can go on to the next section.

Hints for a specific vulnerability

The function cleanup_spoolfile should have looked suspicious in your auditing, since it combines several different pieces of text into a longer string in a fixed-size buffer. The different functions it uses have different policies about checking for overflows. Which of the parts of the data are under the control of an attacker (person running the program)? Which of the string functions check or don't check for overflows?
To cause the overflow, you need to print a file with a long filename, not counting enclosing directories. A single component of a pathname can only be 255 bytes long in Linux, but if you look at all the sizes involved, this is still enough to cause an overflow. Using the same kinds of GDB steps we've used before, you can figure out how long the attacker-controlled filename will need to be in order to overwrite the return address of the function.
There are two further complications related to null bytes, because the overwrite here uses a null-terminated string. The attacker controls the length of the overwrite, but the attacker-controlled bytes can't contain a null byte because that would be a terminator. Also, the overflowing string function will write a null byte after the end of all the attacker-controlled bytes. In one sample binary we compiled, the normal return address is 0x00000000004028bf, whereas the address of the attack function we want to jump to is 0x000000000040265f. In your binary the addresses will probably be slightly different. Because x86 addresses are little-endian, a partial overwrite will replace bytes starting from the low part of the address. How many bytes should the attacker try to overwrite to turn the first address into the second?
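
To experiment with these constraints before touching the real program, here is a small Python sketch. It is not from the lab materials, and the function names build_filename and simulate_overwrite are made up for illustration. It builds a candidate filename from padding plus some chosen address bytes, and models what a null-terminated overwrite does to an 8-byte little-endian return address. The padding length is left as a parameter, since you need to find it with GDB for your own binary, and the addresses used are the sample ones quoted above.

```python
import struct

NAME_MAX = 255  # longest single pathname component on Linux, in bytes


def build_filename(pad_len, addr_bytes, fill=b"A"):
    """Return padding followed by the attacker-chosen address bytes."""
    name = fill * pad_len + addr_bytes
    if len(name) > NAME_MAX:
        raise ValueError("filename component longer than 255 bytes")
    if b"\x00" in name or b"/" in name:
        raise ValueError("a filename cannot contain a null byte or '/'")
    return name


def simulate_overwrite(old_addr, payload):
    """Model a string copy landing on a saved 64-bit return address.

    The payload bytes replace the lowest-addressed bytes first, and the
    string function then writes one terminating null byte after them.
    """
    if b"\x00" in payload:
        raise ValueError("payload cannot contain a null byte")
    old = struct.pack("<Q", old_addr)      # bytes in memory order
    written = payload + b"\x00"
    new = (written + old[len(written):])[:8]
    return struct.unpack("<Q", new)[0]


def memory_order(addr):
    """Format an address as the bytes appear in memory, lowest first."""
    return " ".join("{:02x}".format(b) for b in struct.pack("<Q", addr))


if __name__ == "__main__":
    normal, target = 0x4028BF, 0x40265F    # sample addresses from the text
    print("normal:", memory_order(normal))
    print("target:", memory_order(target))
    # Overwriting only one byte also drops a null onto the second byte:
    print(hex(simulate_overwrite(normal, b"\x5f")))  # prints 0x40005f
```

Trying payloads of different lengths against the sample addresses shows where the terminating null byte ends up in each case. The filename here is a bytes object rather than text; on Linux, open(name, "wb").close() accepts a bytes path if you want to create the file from the same script.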

Hints for another specific vulnerability

You may have also noticed that this program has a call to a function in the printf family where the format string is under external control, which is a format-string vulnerability. We haven't discussed how to attack these in as much detail, but you might start looking at it if you have time left after your first attack. You might want to start by just providing a format string with a lot of %lx format specifiers to print the contents of the stack in the area where the function is taking its format parameters. A format string attack will be easiest if you can find one of these values that you can also control.
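
A probing format string of this kind is tedious to type by hand. A minimal Python sketch that generates one follows. The function name probe_format and the numbering labels are illustrative choices, not something the lab prescribes, and how you deliver the string to the vulnerable call depends on where you found it in bclpr.c.

```python
def probe_format(count):
    """Build a format string with `count` labeled %lx specifiers.

    Each specifier is prefixed with its position so the printed stack
    words are easy to match up, e.g. "1:%lx.2:%lx.3:%lx".
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return ".".join("{}:%lx".format(i) for i in range(1, count + 1))


if __name__ == "__main__":
    print(probe_format(20))
```

If one of the printed words changes when you change your input, that position is a likely candidate for a value you control.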