Debugging guide
LLVM produces pretty good output for debugging. If that output isn't enough to pin down a bug, the first step is to get a backtrace in gdb:
Starting program: /export/home/rtc1032/workspace/fracture/Debug+Asserts/bin/fracture-cl ./samples/intel/fib_elf_32
...
> dis fib
...
Program received signal SIGSEGV, Segmentation fault.
llvm::MachineInstr::getDebugLoc (this=0x4080484c7) at /opt/llvm-trunk/include/llvm/CodeGen/MachineInstr.h:244
244 DebugLoc getDebugLoc() const { return debugLoc; }
(gdb) bt
#0 llvm::MachineInstr::getDebugLoc (this=0x4080484c7) at /opt/llvm-trunk/include/llvm/CodeGen/MachineInstr.h:244
#1 0x0000000000637b1d in fracture::Disassembler::printInstructions (this=0x23a86a0, Out=..., Address=134513863, Size=0, PrintTypes=false) at Disassembler.cpp:275
#2 0x0000000000616505 in runDisassembleCommand (CommandLine=...) at fracture-cl.cpp:380
#3 0x00000000006262d5 in CmdExprAST::Codegen (this=0x23c6580) at CmdExprAST.cpp:66
#4 0x0000000000622034 in Commands::handleCommandLine (this=0x2384ae8) at Commands.cpp:325
#5 0x00000000006220e2 in Commands::runShell (this=0x2384ae8, Prompt=<value optimized out>) at Commands.cpp:353
#6 0x00000000006151f3 in main (argc=2, argv=0x7fffffffe858) at fracture-cl.cpp:795
Notice #0 is inside LLVM, and #1 is inside fracture. Usually the problem is inside fracture and how it uses LLVM. Going to that line, we see:
while (BI->instr_rbegin()->getDebugLoc().getLine() < Address) {
Above it:
MachineFunction *MF = disassemble(Address);
MachineFunction::iterator BI = MF->begin(), BE = MF->end();
// Skip to first basic block with instruction in desired address
// Out << BI->instr_rbegin()->getDebugLoc().getLine() << "\n";
Something is wrong with how DebugLoc objects are being set on the MachineInstrs inside the disassembler. Going to the disassemble function, we see:
// Recover MachineInstr representation
// Note: Location stores offset of instruction, which is really a perverse
// misuse of this field.
Type *Int64 = Type::getInt64Ty(*MC->getContext());
uint64_t AddrMask = dwarf::DW_TAG_lexical_block;
Value *Elts[] = {
  ConstantInt::get(Int64, AddrMask),
  ConstantInt::get(Int64, 0),
  NULL,
  NULL
};
MDNode *Scope = MDNode::get(*MC->getContext(), Elts);
DebugLoc *Location = new DebugLoc(DebugLoc::get(Address, 0, Scope, NULL));
MachineInstrBuilder MIB = BuildMI(Block, *Location, *MCID);
This is not inside an if block, so it is hard to see how this might not be set, but higher up in the function there’s an error condition that sets the value to NULL and kills the processing:
if (!(DA->getInstruction(*Inst, InstSize, *CurSectionMemory, Address, Infos, Errs))) {
  printError("Unknown instruction encountered, instruction decode failed!");
  Instructions[Address] = NULL;
  Block->push_back(NULL);
  // TODO: Replace with default size for each target.
  return 1;
  // outs() << format("%8" PRIx64 ":\t", SectAddr + Index);
  // Dism->rawBytesToString(StringRef(Bytes.data() + Index, Size));
  // outs() << " unkn\n";
}
Instructions[Address] = Inst;
This kind of thing is usually a likely culprit, but if it were, we would be seeing the error message it prints, so we rule it out for now. There's also a debug output line there, but it doesn't work anymore, so we add our own to the end of the function:
outs() << Inst->getOpcode() << " " << Location->getLine() << " " << Address << "\n";
Note you can use "errs()" too. This yields output like the following:
2355 0 134513919
So, for some reason, Location's getLine is coming back as 0 even though Address is 134513919. Looking at the data representation in DebugLoc.h, we see why:
/// LineCol - This 32-bit value encodes the line and column number for the
/// location, encoded as 24-bits for line and 8 bits for col. A value of 0
/// for either means unknown.
uint32_t LineCol;
That sucks, because the offset is higher than the largest value the 24-bit line part can hold (16777215). We can either find an alternate way to carry the offset, or cram it in there by putting the high 8 bits into the col value with some bit magic. The (temporary?) fix is to change the code like so:
MDNode *Scope = MDNode::get(*MC->getContext(), Elts);
unsigned ColVal = (Address & 0xFF000000) >> 24;
unsigned LineVal = Address & 0xFFFFFF;
outs() << ColVal << " " << LineVal << "\t";
DebugLoc *Location = new DebugLoc(DebugLoc::get(LineVal, ColVal, Scope, NULL));
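To make the bit arithmetic concrete, here is a minimal standalone sketch of the same split, plus the reverse step that rebuilds the address. It does not use LLVM at all: `fitsInLine`, `splitAddress`, `joinAddress`, and `SplitAddress` are hypothetical names made up for illustration, and the sketch assumes the address fits in 32 bits (24 bits of line plus 8 bits of col is all the room there is).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers modelling the 24-bit line / 8-bit col layout
// described in DebugLoc.h. This is not LLVM code.
constexpr uint32_t kLineBits = 24;
constexpr uint32_t kMaxLine = (1u << kLineBits) - 1; // 16777215
constexpr uint32_t kMaxCol = 0xFF;

struct SplitAddress {
  uint32_t Line; // low 24 bits of the address
  uint32_t Col;  // high 8 bits of the address
};

// True if the address can be stored in the line field alone.
constexpr bool fitsInLine(uint32_t Address) { return Address <= kMaxLine; }

// Split a 32-bit address the same way the temporary fix does.
constexpr SplitAddress splitAddress(uint32_t Address) {
  return SplitAddress{Address & kMaxLine,
                      (Address & 0xFF000000u) >> kLineBits};
}

// Rebuild the original address from the two fields.
inline uint32_t joinAddress(uint32_t Line, uint32_t Col) {
  assert(Line <= kMaxLine && Col <= kMaxCol && "field out of range");
  return (Col << kLineBits) | Line;
}
```

For the address printed earlier, 134513919 (0x080484FF), `fitsInLine` is false, `splitAddress` gives a col of 8 and a line of 296191, and `joinAddress(296191, 8)` gives 134513919 again. Anything wider than 32 bits would still lose its upper bits under this scheme.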
Note that there is a way to add metadata to LLVM IR code, but it’s unclear how to do this cleanly in the DAG engines and carry the information backward through the system. Unfortunately, this bug is going to cascade through our system as there are a lot of places that depend on accurate addresses (control flow graphs, branch calculations, etc).
Note that the object does have ConstantInt/Int64 objects, so it might be possible to represent them cleanly. Not clear how to do it.
Unfortunately, even with this, we still get a crash in the same spot. The line field now holds only the low 24 bits, so the value from getLine never reaches Address, the loop runs past the last basic block, and BI gets dereferenced when it is already equal to BE. The fix for this particular error is to stop as soon as we hit the BI == BE condition:
while (BI != BE && BI->instr_rbegin()->getDebugLoc().getLine() < Address) {
  ++BI;
}
if (BI == BE) {
  printError("Could not disassemble, reached end of function's basic blocks"
             " when looking for first instruction.");
  return 0;
}
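The same guard can be tried outside of fracture. The sketch below is a standalone illustration only: `Block` and `findFirstBlock` are hypothetical stand-ins for the basic block list and the search loop, with a plain address field in place of the DebugLoc lookup.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a basic block: only the address of its last
// instruction matters for this search.
struct Block {
  uint64_t LastInstrAddress;
};

// Returns the index of the first block whose last instruction is at or past
// Address, or -1 if the search runs off the end of the list.
inline int findFirstBlock(const std::vector<Block> &Blocks, uint64_t Address) {
  auto BI = Blocks.begin(), BE = Blocks.end();
  // Test BI != BE first so the iterator is never dereferenced at the end.
  while (BI != BE && BI->LastInstrAddress < Address)
    ++BI;
  if (BI == BE)
    return -1; // the caller reports the error instead of crashing
  return static_cast<int>(BI - Blocks.begin());
}
```

Because `&&` short-circuits, the dereference on the right is skipped once `BI == BE`, which is what turns the segfault into a reportable error.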
Now we get an error message that makes more sense, and I filed a bug report for this limitation.
To fix the larger bug, one of two options exists:
1. Every time you pull an address with getLine, fold the getCol result back into it (shift it left by 24 bits and OR it with the line) to get the original address.
2. Find an alternate way to represent DebugLoc in our code so we don’t have to