JTAG debugging experience for ARMTZ

Recently we ran into the following issue: when the HiKey board boots and hands over from the trusted firmware to UEFI, the entire system hangs. Because execution has not yet reached the kernel, there is almost no way to debug it with the usual tools. We therefore bought a Bus Blaster and decided to use JTAG to examine the internal state of the processor. In this article I share my experience with JTAG as a beginner.

What can we do when we have no clue where the bug is? We halt the CPU some time after the board boots and dump the register values. First, we notice that the processor is at exception level EL2, which means that execution is in the normal world, in either UEFI or a hypervisor. The PC is also corrupted (pointing to 0x40XXXXXXXXXXXXXX), which suggests that an exception has occurred. The link register holds the return address, which here is the address right after the faulty instruction. Sometimes the CPSR also provides useful information, such as the ISA selection (ARM vs. Thumb).
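To make the register reading above concrete, here is a minimal Python sketch of how the mode bits of a dumped CPSR value can be decoded, together with a rough plausibility check for the PC. The function names are my own, and the 48-bit virtual address width is an assumption; the check also ignores features such as top-byte-ignore, so treat it as a hint rather than a verdict.

```python
def decode_cpsr(cpsr: int) -> dict:
    """Decode the execution state, ISA and exception level from a CPSR value."""
    aarch32 = bool((cpsr >> 4) & 1)        # M[4]: 0 = AArch64, 1 = AArch32
    if aarch32:
        thumb = bool((cpsr >> 5) & 1)      # T bit: 1 = Thumb, 0 = ARM
        return {"state": "AArch32", "isa": "Thumb" if thumb else "ARM"}
    return {
        "state": "AArch64",
        "isa": "A64",
        "el": (cpsr >> 2) & 0b11,          # M[3:2]: current exception level
        "sp_elx": bool(cpsr & 1),          # M[0]: 1 = SP_ELx, 0 = SP_EL0
    }


def is_canonical_va(addr: int, va_bits: int = 48) -> bool:
    """Return True if the bits above va_bits are all zeros or all ones.

    va_bits = 48 is an assumption; the real width depends on the
    translation configuration of the target.
    """
    top = addr >> va_bits
    return top == 0 or top == (1 << (64 - va_bits)) - 1
```

For example, `decode_cpsr(0x600003C9)` reports AArch64 at EL2, and `is_canonical_va(0x4000000000000000)` returns `False`, which matches the intuition that a PC of the form 0x40XXXXXXXXXXXXXX cannot be a legitimate code address.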

After narrowing down the possible faulty region, we set a breakpoint a few bytes before that instruction and single-stepped until we hit the faulty instruction. The faulty instruction tries to push a value onto the stack; however, the SP is not aligned, which triggers the exception. To understand what the code is trying to do, we wanted to go back to the source code, but this raised another problem: the module is loaded dynamically, so there is no direct way to tell which module this piece of code belongs to. We therefore dumped a few instructions around the faulty instruction and searched the source files for the same sequence of consecutive instructions. After matching the binary code back to the source file, we finally understood the problem. UEFI tries to reserve some space for the stack, and if the available space is not enough, it places the stack somewhere else. The firmware designer may have tested only one of these scenarios, and the problem occurred on the other path.
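The matching step can be partly automated. The sketch below is not what we actually ran; it assumes the built modules are available as `.efi` images under some build directory, and that the dumped instructions contain no bytes patched at load time. It packs the dumped 32-bit instruction words as they appear in memory and looks for that byte sequence in every image. The function names and the `*.efi` pattern are my own choices. A small helper for the 16-byte SP alignment rule is included as well; that rule applies when SP is used as the base of a memory access and alignment checking is enabled.

```python
import struct
from pathlib import Path


def encode_instructions(words):
    """Pack 32-bit A64 instruction words as little-endian bytes, as stored in memory."""
    return b"".join(struct.pack("<I", w) for w in words)


def find_modules(words, build_dir, pattern="*.efi"):
    """Return (path, offset) for every image under build_dir containing the sequence."""
    needle = encode_instructions(words)
    hits = []
    for path in sorted(Path(build_dir).rglob(pattern)):
        offset = path.read_bytes().find(needle)   # -1 if the sequence is absent
        if offset != -1:
            hits.append((str(path), offset))
    return hits


def sp_is_aligned(sp):
    """AArch64 expects SP to be 16-byte aligned when it is used as a base address."""
    return sp % 16 == 0
```

Given the words dumped over JTAG, a call such as `find_modules([0xA9BF7BFD, 0x910003FD], "Build")` (where `Build` is a placeholder directory name) lists the candidate modules. A longer instruction sequence gives fewer false matches, since short sequences like a function prologue appear in almost every module.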
