Machine Architecture and Organization (sec 010)

CSCI 2021 lab0x5

Debugging A Puzzle Program

This lab focuses on debugging a simpler version of the binary bomb (hands-on assignment 3), using the GDB debugger. GDB lets you step through the code line by line, print the contents of variables and registers, and set breakpoints in the program. The "bomb" works by reading your input from a text file: you type your input(s) for each "phase" on separate lines in the provided text file, input.txt. You can then run the executable "puzzle" against your input using the command:

./puzzle input.txt

This will most likely print a failure message until you have debugged the executable enough to work out the correct inputs. That debugging is done with GDB.

To make GDB less daunting, here are some basic commands that are useful when working with the debugger.

GDB Tips and Tricks

To run gdb on a file you can simply call gdb on the executable itself:

gdb <executable>

For this lab, our executable is called "puzzle"

Optionally, you can run gdb with the "-tui" flag before the executable. This opens a visual display of the code that indicates which line of the program you are on. TUI mode often "breaks" in the sense that the display becomes unreadable. To redraw it, press ctrl-l and it will look normal again. The -tui flag is STRONGLY recommended.

At first you will see the words "no source available" on the screen. To show the assembly code, type

(gdb) layout asm

Before you run the program you must set the arguments of the program (gdb in parentheses is not part of the input):

(gdb) set args <arguments>

Once again, for this lab, we have provided an empty text file called input.txt that you will use in place of the word arguments in the above command.

Now that the program has its arguments, you start it with the run command. This runs your program with the arguments you supplied:

(gdb) run

If the program is stuck in a loop, or if you want to end the process early, use the kill command in gdb, which stops the current process:

(gdb) kill

But before you go ahead and run your program, you will want to set a breakpoint using the break command. You can set breakpoints at any line or function. A breakpoint halts the program when execution reaches it, and from there you can step forward one line or instruction at a time to find where things go wrong. Breakpoints can be created and removed as follows:

Creating Breakpoints

(gdb) break <function name>

(gdb) break <filename.c>:<lineNumber>

Removing Breakpoints

(gdb) delete <#> (where <#> is the breakpoint's number id)

(gdb) delete (with no argument, deletes all of the breakpoints)

Once the breakpoint has been reached, you can step forward one machine instruction at a time using the si (step instruction) command:

(gdb) si

One last command that is very useful is the finish command.

(gdb) finish

When stepping through the assembly with the si command, you will often enter other functions that are called. In this lab, this happens with the C function printf, which the compiler sometimes replaces with a call to the "puts" function in the assembly. You will see this very often in this lab. Because the inner workings of puts are not relevant, you can type "finish" into gdb any time you are in a function you want to get out of. This runs the current function to completion and returns you to the instruction just after the call.
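To see where those puts calls come from, here is a small sketch. The function name announce and its messages are made up for illustration and are not taken from the puzzle. When printf is given a constant string that ends in a newline and has no format specifiers, and its return value is unused, compilers such as gcc commonly emit a call to puts instead.

```c
#include <stdio.h>

/* Hypothetical example, not code from the puzzle program.
   Each printf below has a constant string ending in '\n', no format
   specifiers, and an ignored return value, which is the pattern that
   gcc commonly compiles into a call to puts. */
int announce(int ok) {
    if (ok) {
        printf("Phase passed.\n");
    } else {
        printf("Phase failed.\n");
    }
    return ok;
}
```

If you compile a file containing this with gcc and disassemble announce in gdb, you will likely see calls to puts rather than printf, though the exact result depends on your compiler and flags.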

Warmup: source-level GDB

If you already feel comfortable using GDB for programs with source code from previous labs and assignments, you can skip this part of the lab and go straight to the binary-level puzzle. But if you feel like you could use a review, you can start with a bomb-like puzzle with source code. You can copy this source code to your directory and compile it like this:

cp /web/classes/Fall-2018/csci2021-010/labs/0x5/puzzle-w-src.c .
gcc -Wall -g puzzle-w-src.c -o puzzle-w-src

This puzzle has only one "phase", meaning that it reads just two numbers in from one line of an input file. As with the binary puzzle, your job is to figure out an input that will make it print its success message. How can you efficiently figure that out, without having to understand everything the program does? Specifically, we'd like you to think about how you can find an input using GDB. If you already have an idea about how you might do this, go ahead. Or in the rest of this section we'll walk through one way you might approach it.

You might try looking over the code first. We're using a naming convention in this lab that the first phase (and in this case the only one) is in a function named p1. You can see that the p1 function takes two integers as arguments, so you should try making an input line with two numbers on it. Try making such a file and running the program; you will probably see the failure message instead of the success one. Looking back at the source, it seems like the first number you supply is transformed by a function named "mangle", and then that result has to be equal to the second number for the phase to succeed. But looking at the mangle function, there is no obvious short summary of what it is doing: it applies a lot of different bitwise operators that mix up all the bits in a 16-bit value. Just by looking at it, it's not obvious what the results will be. But if we can figure out what result it will produce for any one input, we can use that input and output to make a working input file.
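To make that structure concrete, here is a minimal sketch of the same shape. It is NOT the code in puzzle-w-src.c: the names mangle_sketch and p1_sketch and the particular shifts and XORs are invented for illustration, assuming only what is described above (a 16-bit bit-mixing function whose result must equal the second number).

```c
#include <stdint.h>

/* Hypothetical stand-in for mangle(); the real function differs.
   It scrambles a 16-bit value with shifts, XORs, and a byte swap. */
uint16_t mangle_sketch(uint16_t v) {
    v ^= (uint16_t)(v << 3);              /* mix low bits upward */
    v ^= (uint16_t)(v >> 5);              /* mix high bits downward */
    v = (uint16_t)((v << 8) | (v >> 8));  /* swap the two bytes */
    return v;
}

/* Hypothetical stand-in for p1(): returns 1 when the phase would
   succeed, that is, when y equals the mangled form of x. */
int p1_sketch(int x, int y) {
    int mangled = mangle_sketch((uint16_t)x);
    return mangled == y;
}
```

With this particular sketch, mangle_sketch(33) returns 8193, so the pair 33 8193 would pass while 33 100 would not. The point is that you only need one input and its output, not an understanding of the bit mixing.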

If you are used to doing debugging with printf, you might be tempted to use that approach here: if you just added a printf to print the value of mangled in p1, that would print the information you need. However, you can't do printf debugging in cases where you don't have source code, so let's use GDB to get a similar effect without modifying or recompiling the code.

Suppose that mangle-input.txt is a text file with two numbers on one line. We want to debug the execution of the program when running on that input, so we can run GDB with the command:

gdb --args ./puzzle-w-src mangle-input.txt

The equivalent of printf debugging in GDB is to use GDB's print (or just p) command with a variable. But just like it matters where in your program you put the call to printf, you have to let the program execute up to the point where the value you want is available, which in this case would be inside the p1 function, but after the call to mangle and the assignment to mangled have happened. One convenient way to do this is with the combination of a breakpoint and the next command. Specifically, set a breakpoint at the starting point of the p1 function with the command break p1 (break can also be abbreviated b). Then run the program from the beginning up to that point with run. If this much works correctly, you should see an output from GDB that looks like:

Breakpoint 1, p1 (x=33, y=100) at puzzle-w-src.c:47
47	  int mangled = mangle(x);

The parameters x and y should be the numbers in your input file. We'd like to know the value in the variable mangled that this line of the code will compute, but the breakpoint has stopped before that line executes, so if we were to try to print it now we wouldn't see the right value. We want to let the execution of the program proceed a little bit further. We could use step, but that would go into the details of mangle that would get complicated. In this case we don't actually need to look at how mangle works, we just want to pick up watching after it has finished. The next command in GDB is just for situations like this: it goes one line forward, but only within the current function, not going into other functions that are called. If you give the command next (or n), you'll see GDB goes to the point before the next line in p1:

48	  if (mangled == y) {

From here you can then say print mangled. It probably won't print the same value as you have in y, but you can see that the value it prints is the one you need to have in y in order for the phase to succeed. So now you can quit GDB, put that number in as the second number in your input file, and check that the puzzle program now succeeds.

The other phases of the lab are similar, and in fact use simpler code, but they have the challenge that you can't look at the source code or use source-level commands in GDB. Instead you need to use a combination of looking at the machine code and GDB features to figure out what inputs will work.

Completing The Lab

To get the puzzle program to use in lab, run the command:

cp /web/classes/Fall-2018/csci2021-010/labs/0x5/{puzzle,input.txt} .

The program takes a text file, input.txt, as its argument. You will have to debug the program to find the solution to each phase and then place those solutions into that text file.

When you input your solutions in the text file, make sure you do it one line at a time, with no extra spaces at the end of the line, or the file could be misread. If you figure out all of the solutions and receive the final output of the puzzle program, you have completed the lab and are on your way to debugging the fabled binary bomb.
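As an illustration of why a trailing space can matter, here is a sketch of one way a program might compare an input line against an expected answer. This is an assumption for illustration only: the function line_matches is hypothetical, and the puzzle program may read and check its input differently.

```c
#include <string.h>

/* Hypothetical checker, not taken from the puzzle program.
   Compares the text of `line` up to (not including) its newline
   against `expected`, requiring an exact match. */
int line_matches(const char *line, const char *expected) {
    size_t len = strcspn(line, "\n");  /* length before the newline */
    return len == strlen(expected) && strncmp(line, expected, len) == 0;
}
```

Under this kind of exact comparison, the line "42" followed by a newline matches the expected answer "42", but "42 " with a trailing space does not, even though the two look the same in most editors.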