I was assigned a pretty simple flag checker problem to reverse engineer for one of my classes. I'm not going to release much about the challenge because it will probably be used in future course work. I knew I wasn't going to learn much doing that challenge, so I decided to move the goalposts and teach myself a super cool tool called PIN. PIN is a closed-source tool from Intel that dynamically instruments binaries using magic with fairly simple C++ code. This is Alien technology.
I set this up by downloading the tarball from Intel, unpacking it in /opt, making a symbolic link (ln -s) to /opt/pin-dir, then running make all.
Pintools are .so files used by the pin program to instrument a binary by dynamically decompiling it, inserting some code of your own, then recompiling it, in a process pretty similar to the Just In Time (JIT) compilation seen in languages like Java. There's a bunch of sample tools provided to us in other subdirectories of the source/tools/ directory. Let's revisit this in a bit once we introduce our target.
Let's introduce our target: a stupid simple flag checker. Sometimes, these flag checkers will evaluate a flag character by character using a bunch of chained AND (&&) statements, like so:
if (flag[0] == 'c' && flag[1] == 't' && flag[2] == 'f') {
    printf("WIN!\n");
} else {
    exit(1);
}
The goal for this kind of challenge is to identify the input that, once run through the program, activates some sort of condition. You have to reverse engineer the operations that the program is doing on the data and effectively undo them. In this case, the flag is of course ctf, but this gets much more difficult when the input gets longer and is being mutated through the course of the program.
C guarantees that if the first condition checked in a chain of AND statements evaluates to false, like flag[0] being set to m instead of c, the rest of the statements are never evaluated. This behavior is called "short-circuiting": the computer will not evaluate any statements in an AND conditional after the first false is reached, instead skipping to the next portion of the code. Let's take a look under the hood.
When compiled, this kind of behavior will often result in a series of CMP + JNZ instructions, or CoMPare + Jump if Not Zero. The CMP instruction will take a byte from memory and compare it to a static value. If they are the same, the zero flag (ZF) will be set, meaning that the JNZ instruction will not take its jump and execution falls through to the next comparison. But, if they are different, the JNZ instruction will jump to the code for the else statement. This means that the if statement in the code above will probably compile into something that looks like this:
0010184f 80 7d fd 62 CMP byte ptr [RBP + local_b],'c'
00101853 0f 85 d2 JNZ LAB_0010192b
00 00 00
00101859 80 7d e9 37 CMP byte ptr [RBP + local_1f],'t'
0010185d 0f 85 c8 JNZ LAB_0010192b
00 00 00
00101863 80 7d eb 56 CMP byte ptr [RBP + local_1d],'f'
00101867 0f 85 be JNZ LAB_0010192b
00 00 00
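To make the side channel concrete, here is a rough Python model of that CMP + JNZ chain. It is only a sketch: run_checker is a made-up name, the secret is the ctf example from above, and it assumes one JNZ per compared character. The point is that the number of JNZ instructions executed grows with the number of correct leading characters.

```python
def run_checker(candidate: str, secret: str = "ctf") -> int:
    """Model a chain of CMP + JNZ pairs; return how many JNZs were executed."""
    jnz_executed = 0
    for got, expected in zip(candidate, secret):
        zero_flag = (got == expected)  # CMP sets ZF when the two bytes match
        jnz_executed += 1              # the JNZ itself executes either way
        if not zero_flag:              # ZF clear: jump taken, remaining checks skipped
            break
    return jnz_executed


for guess in ("...", "c..", "ct.", "ctf"):
    print(guess, run_checker(guess))
```

In this model a completely wrong guess executes one JNZ and each correct leading character adds one more. The last character is the exception: "ct." and "ctf" both execute three, because there is no further comparison to reach.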
What I would like to do is come up with a way to reliably count the number of instructions that are being executed by the program given a certain input, so we can know when more code is being executed. Let's create a really simple PinTool to count every instruction:
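#include "pin.H"
#include <iostream>
#include <fstream>

std::ofstream outFile;
uint64_t insCount = 0;

// Analysis routine: runs every time an instrumented instruction executes
VOID CountInstruction() { insCount++; }

// Instrumentation routine: Pin calls this once for each instruction it
// translates, so the counting itself has to live in the inserted call
VOID Instruction(INS ins, VOID *v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountInstruction, IARG_END);
}

// Runs when the target exits: write the total and close the log
VOID Fini(INT32 code, VOID *v) {
    outFile << insCount << std::endl;
    outFile.close();
}

// Main function to set up PIN instrumentation
int main(int argc, char *argv[]) {
    if (PIN_Init(argc, argv)) {
        std::cerr << "PIN Initialization Failed!" << std::endl;
        return 1;
    }
    outFile.open("inscount.log");
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}
Let's break this down, starting out with the include statements:
#include "pin.H"
#include <iostream>
#include <fstream>
The pin.H file is all you need to start with PIN. Let's look at our global variables:
std::ofstream outFile;
uint64_t insCount = 0;
Now let's look at some functions we're using.
VOID CountInstruction() { insCount++; }
VOID Instruction(INS ins, VOID *v) {
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountInstruction, IARG_END);
}
VOID Fini(INT32 code, VOID *v) {
    outFile << insCount << std::endl;
    outFile.close();
}
Instruction() is called by Pin once for each instruction it translates, and it inserts a call to CountInstruction() in front of that instruction. CountInstruction() is what increments that global insCount variable every time the instruction actually executes. Fini() will write out the count and close the file. Finally, here's the main:
int main(int argc, char *argv[]) {
    if (PIN_Init(argc, argv)) {
        std::cerr << "PIN Initialization Failed!" << std::endl;
        return 1;
    }
    outFile.open("inscount.log");
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}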
This chunk:
- Initializes PIN instrumentation using the command-line arguments passed in with argc and argv.
- Opens the inscount.log file.
- Registers Instruction() with INS_AddInstrumentFunction(), meaning that every instruction in the binary will have machine code inserted before it executes that increments the insCount variable.
- Sets Fini() as the function to run and clean up everything using PIN_AddFiniFunction().
- Starts the program and instrumentation with PIN_StartProgram().
Compiling and running this returns an accurate and consistent count of how many instructions have been executed:
> /opt/pin-dir/pin -t /opt/pin-dir/source/tools/MyPinTool/obj-intel64/MyPinTool.so -- ./target abc
> cat ./inscount.log
180813
However, in the source code of this target binary, I know there are a bunch of printf statements that depend on user input. This can make life difficult because printf has relatively complicated logic compared to something as simple as puts. So, let's try and limit the scope of which instructions get instrumented so we can effectively gauge how deep into the flag checker we are. Here's some more code I cooked up:
#include "pin.H"
#include <iostream>
#include <fstream>

std::ofstream outFile;
ADDRINT baseAddr;  // reference address that instruction offsets are measured from

// Analysis routine: logs the address of a JNZ each time it executes
VOID JNZ_Executed(ADDRINT addr) {
    outFile << "JNZ executed at address: " << std::hex << addr << std::endl;
}

// Called for every image Pin loads; we only care about the target binary
VOID ImageLoad(IMG img, VOID *v) {
    if (IMG_IsMainExecutable(img)) {
        // IMG_EntryAddress() is the image's entry point, used as our fixed point
        baseAddr = IMG_EntryAddress(img);
        outFile << "Base address: " << std::hex << baseAddr << std::endl;
    }
}

// Instrumentation routine: only hook JNZs inside the flag checker's range
VOID Instruction(INS ins, VOID *v) {
    ADDRINT relativeAddr = INS_Address(ins) - baseAddr;
    if (relativeAddr >= 0x7d0 && relativeAddr <= 0x940) {
        if (INS_IsBranch(ins) && INS_HasFallThrough(ins)) {
            OPCODE opc = INS_Opcode(ins);
            if (opc == XED_ICLASS_JNZ) {
                INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)JNZ_Executed, IARG_INST_PTR, IARG_END);
            }
        }
    }
}

VOID Fini(INT32 code, VOID *v) {
    outFile.close();
}

// Main function to set up PIN instrumentation
int main(int argc, char *argv[]) {
    if (PIN_Init(argc, argv)) {
        std::cerr << "PIN Initialization Failed!" << std::endl;
        return 1;
    }
    outFile.open("jnz_trace.log");
    IMG_AddInstrumentFunction(ImageLoad, 0);
    INS_AddInstrumentFunction(Instruction, 0);
    PIN_AddFiniFunction(Fini, 0);
    PIN_StartProgram();  // never returns
    return 0;
}
This pintool has a couple of significant changes from the previous iteration:
ImageLoad(): When pin loads the target binary, it calls ImageLoad(), which takes a reference address for the binary and stores it in baseAddr. Because of ASLR and regular RAM allocation behavior, we won't be able to limit the range in which we instrument instructions by relying on raw addresses. Instead, we need to calculate them based off of an offset from a fixed point. Note that IMG_EntryAddress() returns the address of the binary's entry point, which is not necessarily the lowest address the image is mapped at, so every offset here is relative to the entry point.
Instruction(): For each instruction, we calculate the offset from that reference address. I reverse engineered the binary to see that the instructions that I want to instrument are at bytes 0x1017d0 -> 0x101940. Because the .text section of the target binary started at 0x101000, we can guess that the target instructions live from 0x7d0 -> 0x940 bytes past the base address. We then check if the current instruction is a conditional control flow instruction with INS_IsBranch and INS_HasFallThrough. Finally, we check the opcode for that instruction and if it matches JNZ, we insert a call to the JNZ_Executed function, which logs the address of the instruction each time it executes.
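As a quick illustration of why offsets survive ASLR while raw addresses don't, here is a small Python sketch. The first base address and the 0x7d3 offset come from the sample log further down; the second base address and the runtime_address helper are made up purely for the example.

```python
TARGET_OFFSET = 0x7D3  # distance from the stored base to one JNZ (from the sample log)


def runtime_address(base: int, offset: int) -> int:
    """Where an instruction lands at run time for a given base address."""
    return base + offset


# First base is from the sample log; the second is a hypothetical other run
for base in (0x5B4504620080, 0x7F12A0001080):
    addr = runtime_address(base, TARGET_OFFSET)
    # The raw address differs between runs, but addr - base stays the same
    print(hex(addr), hex(addr - base))
```

The raw address changes with every base, but the subtraction always lands back on the same offset, which is what lets the range check in Instruction() work from run to run.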
Here's what the log file looks like after execution:
Base address: 5b4504620080
JNZ executed at address: 5b4504620853
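If you want to sanity-check a trace by hand, a few lines of Python will pull the offsets back out of it. This is just a sketch that assumes the two line formats shown above; parse_trace and SAMPLE_LOG are names invented for the example.

```python
SAMPLE_LOG = """Base address: 5b4504620080
JNZ executed at address: 5b4504620853
"""


def parse_trace(text):
    """Return (base address, list of JNZ offsets relative to that base)."""
    base = None
    offsets = []
    for line in text.splitlines():
        label, _, value = line.rpartition(": ")
        if label == "Base address":
            base = int(value, 16)
        elif label == "JNZ executed at address" and base is not None:
            offsets.append(int(value, 16) - base)
    return base, offsets


base, offsets = parse_trace(SAMPLE_LOG)
print(len(SAMPLE_LOG.splitlines()), [hex(o) for o in offsets])
```

For the log above this reports two lines and a single JNZ at offset 0x7d3, which sits inside the 0x7d0 -> 0x940 window.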
Now that we have this Pintool built, let's wrap this in a simple Python script to derive the flag! I'm sure there's a better way to do this, but if it ain't broke:
import string
import subprocess

flag_str = ""
LEN_FLAG = 27
NORMAL_ENTRIES_IN_LOG_FILE = 2
flag = "." * LEN_FLAG
found_chars = 0

# /opt/pin-dir/pin -t /opt/pin-dir/source/tools/MyPinTool/obj-intel64/MyPinTool.so -- ./basic ...........................;
def count_entries_from_mypintool(payload):
    subprocess.run(['/opt/pin-dir/pin', '-t', '/opt/pin-dir/source/tools/MyPinTool/obj-intel64/MyPinTool.so', '--', './basic', payload], capture_output=True, text=True)
    with open("./jnz_trace.log") as f:
        lines = f.readlines()
    return len(lines)

possible_chars = string.ascii_letters + string.digits + "_-}{"

for i in range(LEN_FLAG):
    for char in possible_chars:
        altered_flag = list(flag)
        altered_flag[i] = char
        altered_flag = "".join(altered_flag)
        count = count_entries_from_mypintool(altered_flag)
        print(count, altered_flag, end='\r')
        if count > (NORMAL_ENTRIES_IN_LOG_FILE + found_chars):
            found_chars += 1
            flag = altered_flag
            print(flag)
            break
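If you want to poke at the search loop without Pin installed, here is a self-contained sketch that swaps the real Pin run for a simulated one. Everything in it is an assumption for illustration: SECRET is a made-up flag, count_log_entries stands in for count_entries_from_mypintool, and the model assumes exactly one JNZ per character.

```python
import string

SECRET = "ctf{demo}"  # hypothetical flag, used only by this simulation
BASELINE = 2          # log lines for an all-wrong guess: base address + first JNZ


def count_log_entries(candidate: str) -> int:
    """Stand-in for the Pin run: one base-address line plus one line per JNZ executed."""
    lines = 1
    for got, expected in zip(candidate, SECRET):
        lines += 1            # every comparison reached executes its JNZ
        if got != expected:   # mismatch: jump taken, later checks never run
            break
    return lines


def recover_flag(length: int) -> str:
    alphabet = string.ascii_letters + string.digits + "_-}{"
    flag = ["."] * length
    found = 0
    for i in range(length):
        for char in alphabet:
            trial = flag.copy()
            trial[i] = char
            # A longer log means the checker got one comparison further
            if count_log_entries("".join(trial)) > BASELINE + found:
                flag, found = trial, found + 1
                break
    return "".join(flag)


print(recover_flag(len(SECRET)))
```

One thing this toy model makes obvious: the final character never adds a log line, because there is no later JNZ to reach, so the simulation recovers everything except the last position. Against a real target, you would likely need to check the program's output for that last character.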
Very very hacky, but very very cool. Here it is at work:
Looks like we’ve got 4 characters after about 5 minutes.
Music I listened to this week
I Wanna Be Your Dog - Joan Jett & the Blackhearts
She Blinded Me With Science - Thomas Dolby
You Never Know - Lightnin' Luke
Meat Grinder - MF DOOM