Development of Secure Software Systems

CSci 4271 Lab 4

Today's lab is a series of 7 shorter problems that form an obstacle course of challenges related to buffer and integer overflow. The 7 problems are embedded into a single binary in a way that might be reminiscent of the 2021 Bomb Lab, if you did that activity.

Each stage of the obstacle course is a C function that reads a line from the input file into a buffer that might be too small. To make the memory layout more predictable, the vulnerable buffer is always in a structure, and the amount of the overflow is limited to at most the size of the structure. (This also means you won't be able to overwrite a return address with your overflow.) Another restriction is that the data used in the overflow is limited to printable ASCII characters, the ones with byte values 0x20 through 0x7e. But you can control the exact amount of the overflow based on how long you make the corresponding line in the input file, and the overflow doesn't include any terminator character (no \0 terminator, and no newline).

Speaking of ASCII characters, you will probably find it helpful for the challenges to go back and forth between decimal or hexadecimal byte values and the printable characters they represent. Here's a table:

Dec Hex    Dec Hex    Dec Hex  Dec Hex  Dec Hex  Dec Hex   Dec Hex   Dec Hex  
  0 00 NUL  16 10 DLE  32 20    48 30 0  64 40 @  80 50 P   96 60 `  112 70 p
  1 01 SOH  17 11 DC1  33 21 !  49 31 1  65 41 A  81 51 Q   97 61 a  113 71 q
  2 02 STX  18 12 DC2  34 22 "  50 32 2  66 42 B  82 52 R   98 62 b  114 72 r
  3 03 ETX  19 13 DC3  35 23 #  51 33 3  67 43 C  83 53 S   99 63 c  115 73 s
  4 04 EOT  20 14 DC4  36 24 $  52 34 4  68 44 D  84 54 T  100 64 d  116 74 t
  5 05 ENQ  21 15 NAK  37 25 %  53 35 5  69 45 E  85 55 U  101 65 e  117 75 u
  6 06 ACK  22 16 SYN  38 26 &  54 36 6  70 46 F  86 56 V  102 66 f  118 76 v
  7 07 BEL  23 17 ETB  39 27 '  55 37 7  71 47 G  87 57 W  103 67 g  119 77 w
  8 08 BS   24 18 CAN  40 28 (  56 38 8  72 48 H  88 58 X  104 68 h  120 78 x
  9 09 HT   25 19 EM   41 29 )  57 39 9  73 49 I  89 59 Y  105 69 i  121 79 y
 10 0A LF   26 1A SUB  42 2A *  58 3A :  74 4A J  90 5A Z  106 6A j  122 7A z
 11 0B VT   27 1B ESC  43 2B +  59 3B ;  75 4B K  91 5B [  107 6B k  123 7B {
 12 0C FF   28 1C FS   44 2C ,  60 3C <  76 4C L  92 5C \  108 6C l  124 7C |
 13 0D CR   29 1D GS   45 2D -  61 3D =  77 4D M  93 5D ]  109 6D m  125 7D }
 14 0E SO   30 1E RS   46 2E .  62 3E >  78 4E N  94 5E ^  110 6E n  126 7E ~
 15 0F SI   31 1F US   47 2F /  63 3F ?  79 4F O  95 5F _  111 6F o  127 7F DEL

There are also many ASCII tables and tools for conversion that you can find online: a previous semester's TA recommends this one.

There are seven stages, so your final input file will consist of seven lines, one for each stage. As a starting point, we've given you a template file that has 7 lines that are similar to the input you'll need to supply, but which don't cause any of the attacks to succeed. Your job is to modify each line in the file into an input that will cause the corresponding stage of the obstacle course to succeed. We've tried to arrange the stages in order of increasing difficulty, though you don't have to do them in order. Stages 1-5 print some extra information when they fail (and stage 1 even when it succeeds), which you may find helpful to understand what's going on. A debugger shouldn't be necessary, but you can use one if you'd like. The final stage 7 is a bit special in that you can easily make the program crash or behave weirdly with a failed attack, and you probably will want to use GDB or another disassembler for it.

This lab also puts more stress on a conceptual challenge that was also present in last week's lab, namely thinking about how different interpretations of data in files and in memory interact. The basic structure of both files and memory for the purpose of this lab is a sequence of bytes, each of which is 8 bits that can represent a number between 0 and 255, a two-hex-digit number in hexadecimal, or one ASCII text character. When those bytes are written into memory, they might be interpreted in other ways, like a 4-byte int or an 8-byte pointer (because x86-64 is little-endian, these interpretations might feel backwards from what you'd first expect). Because the attacks in this lab are limited to printable characters, it is possible to write your attack input just using a plain text editor. But the other perspectives on how the bytes in your attack are interpreted are important for understanding how the attacks work. So you may find it helpful to look at your attack input file with the hd command, edit it with a hex editor, and/or explore different interpretations for the overwritten data by using the x (examine) command in GDB.

Remember that the purpose of the lab is to help you understand the underlying concepts, not just to solve the puzzles, so if you get an attack to succeed by random guessing, take a moment to try to understand why the attack works before going on to the next one.

You can start by copying the program binary, source code, and solution template into your current directory:

cp /web/classes/Spring-2023/csci4271/labs/04/overflow-course{,.c} .
cp /web/classes/Spring-2023/csci4271/labs/04/template.in my-soln.in

You can run the program as follows:

./overflow-course my-soln.in

You can start work on your first attack by looking for the function stage1 in the source code.