Dailydave mailing list archives
x86_RE_lib
From: Joel Eriksson <je () bitnux com>
Date: Fri, 3 Feb 2006 11:35:49 +0100
On Mon, Jan 30, 2006 at 02:11:26PM -0800, halvar () gmx de wrote:
> For those on this list that enjoy reading very ugly python code, check
> www.sabre-security.com/x86_RE_lib.zip -- depending on your viewpoint, it's
> one of the following things:
>
> 1) The proof that python code is not necessarily more readable than perl
> 2) An example of how not to document or write python code
> 3) An 'idea-dump' into which I dump a lot of prototype code
> 4) A collection of relatively useful IDAPython utility functions for
>    generating flowgraphs, inlining functions into these flowgraphs, doing
>    (very limited, local, register-only) dataflow analysis, detecting
>    vtables, building specialized graphs for uninitialized variable attacks.
>
> Cheers,
> Halvar

Hi Halvar,

Thanks for making this available. I'm just getting started with IDAPython, so being able to look at how some IDA API functions are used in practice saved me time. :)

I've run into some problems though, and I'm not sure whether they are caused by a bug in IDA / IDAPython / the plugin API or by your code.

The actual problem I'm trying to solve at the moment is eliminating a large number of anti-debugging / anti-binary-modification code sequences from a certain program. The code sequence looks like this:

- Variable code that calculates an addr, a size and a value.
- A loop that looks like this:

    loop:
      mov <tmp_reg>, [addr_reg+optional_and_variable_offset]
      xor|add|sub <value_reg>, <tmp_reg>
      sub <addr_reg>, constant_that_varies_from_1_to_4
      dec <size_reg>
      or|test <size_reg>, <size_reg>    (this line is not always included)
      jnz loop

- Code that indirectly causes a crash if the value calculated in value_reg is incorrect. Sometimes it uses the value when calculating an address that is dereferenced / called, sometimes when calculating an array index, and sometimes it just checks whether the calculated value equals a specified value; if it doesn't, it messes up the stack and ret's into an invalid addr, to make it harder to determine where the check that caused the crash was located.

Throughout all of the above, jmps over junk are sometimes inserted randomly, probably to confuse linear disassemblers, although that is no problem for IDA, which uses the recursive traversal approach.

Eliminating one check at a time doesn't work so well. If I modify code at, or insert a breakpoint at, addr X and succeed in locating and eliminating the check that indirectly causes the crash when addr X has been modified, I usually end up crashing somewhere else, because another check detects that the code that checks addr X has been modified, and so on.
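
As a rough sketch of how the loop described above could be recognised mechanically, independent of IDA: the function below assumes each instruction has already been reduced to a hypothetical (mnemonic, op1, op2) tuple of plain strings, with "" for a missing operand and the step constant printed in decimal. Neither the function name nor the tuple layout comes from x86_RE_lib or IDAPython; both are assumptions made for illustration.

```python
import re

# Mnemonics that may fold tmp_reg into value_reg.
ACCUMULATE = ("xor", "add", "sub")

def matches_checksum_loop(block):
    """Return True if block has the shape of the loop described above.

    block is a list of hypothetical (mnemonic, op1, op2) string tuples.
    """
    insns = list(block)
    # The "or|test size_reg, size_reg" line is optional; if present it
    # sits between "dec" and "jnz" and must use the same register as "dec".
    if len(insns) == 6:
        flag = insns.pop(4)
        if flag[0] not in ("or", "test"):
            return False
        if flag[1] != flag[2] or flag[1] != insns[3][1]:
            return False
    if len(insns) != 5:
        return False
    load, accum, step, count, branch = insns
    # mov tmp_reg, [addr_reg + optional offset]
    mem = re.match(r"\[(\w+)(?:[+-]\w+)?\]$", load[2])
    if load[0] != "mov" or mem is None:
        return False
    tmp_reg, addr_reg = load[1], mem.group(1)
    # xor|add|sub value_reg, tmp_reg
    if accum[0] not in ACCUMULATE or accum[2] != tmp_reg:
        return False
    # sub addr_reg, constant in 1..4
    if step[0] != "sub" or step[1] != addr_reg:
        return False
    if step[2] not in ("1", "2", "3", "4"):
        return False
    # dec size_reg ; jnz loop
    return count[0] == "dec" and branch[0] == "jnz"
```

The matcher only looks at one straight-line block, so a loop body that has been split by a jmp over junk would have to be joined into a single block before it can match.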

Single-stepping would be a possibility, but takes ages, and for some reason hardware breakpoints won't work for me in either GDB or IDA Pro when debugging this particular program. The program is multi-threaded btw, and both GDB and IDA Pro on Linux seem to handle this quite poorly.

My idea was to write an IDAPython script for finding code sequences that match the loop, since that is the only part that is fairly static, and then manually eliminate each check by using the x86emu plugin to calculate the correct values and patching the code. Pseudo-code of how I intended to do this:

- For each function:
  - Split up the function into basic blocks
  - Merge blocks that are only separated by jmp fwd; junk; fwd: ...
  - Print the addr of blocks that match the loop described above

Since I was not that familiar with the IDA API, I looked at your code to see how you split up functions into basic blocks and how to get the instruction mnemonic and the value of operands. At first I wrote my own Python classes for instructions, basic blocks and functions, but ended up with IDA hanging and sometimes the Python interpreter crashing when splitting up certain functions into basic blocks. I assumed I might have made a mistake somewhere, so I tried writing a function that uses the get_basic_block() and get_short_crefs_from() functions that you defined instead.

The code looks like this (will probably look weird without a monospaced font :):

    # The helpers below are called unqualified, so pull them into this namespace.
    from x86_RE_lib import *

    def get_func_flow(ea):
        """Return a dict of basic blocks in function at ea."""
        func = get_func(ea)
        ea = func.startEA
        flow = {}
        flow[ea] = get_basic_block(ea)
        list = [ ea ]
        # Worklist traversal: visit each block start address once.
        while len(list) > 0:
            ea = list.pop(0)
            curr = flow[ea]
            # curr[-1] is the last instruction of the block.
            if curr[-1][1] != "call":
                next = get_crefs_from(curr[-1][0])
            else:
                next = get_short_crefs_from(curr[-1][0])
            for ea in next:
                if not flow.has_key(ea):
                    flow[ea] = get_basic_block(ea)
                    list.append(ea)
        return flow

    def strip_tags_from_flow(flow):
        """Remove IDA:s color coding from operand strings."""
        for block in flow.values():
            for line in block:
                for i in range(2,5):
                    line[i] = tag_remove(line[i])
                    if line[i] == None:
                        line[i] = ""

    def print_flow(flow):
        """Print the basic blocks in flow, sorted by address."""
        keys = flow.keys()
        keys.sort()
        for ea in keys:
            print "[ %x ]" % (ea)
            for line in flow[ea]:
                print disasm_line_to_string(line)
            print

Then I tested it on different functions with the following:

    flow = get_func_flow(get_screen_ea())
    strip_tags_from_flow(flow)
    print_flow(flow)

It works fine for most functions, but for certain rather complicated functions it hangs IDA completely or crashes IDAPython (it gets an exception and reloads). Finally I tested using your create_flowgraph_from() function directly, and I still get the same problem.

Unfortunately I cannot send you the actual code that I'm analyzing, but perhaps a flowgraph generated by IDA of one of the functions that causes the problem can give you an idea of what the problem might be:

https://sec.bitnux.com/badfunc.jpg

"165 nodes, 1686 edge segments, 207 crossings" ... Yikes. :)

If you like, I could also send you a complete listing of this function, after removing strings and function names.

I hope to get the opportunity to buy BinNavi soon btw, it would be very useful for some of the projects I'm working on.
:)

--
Best Regards,
Joel Eriksson
-------------------------------------------------
Cellphone: +46-70 228 64 16
Home: +46-18-30 35 55
Security Research & Systems Development at Bitnux
PGP Key Server pgp.mit.edu, PGP Key ID 0x08811B44
DF38 5806 0EFB 196E E4B6 34B5 4C01 73BB 0881 1B44
-------------------------------------------------
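
The block-merging step from the pseudo-code in the message above (joining blocks that are only separated by jmp fwd; junk; fwd: ...) can be sketched without any IDA calls. This is a minimal sketch under stated assumptions: the function name, the entry parameter and the {addr: (mnemonics, successors)} layout are hypothetical and are not how x86_RE_lib represents basic blocks.

```python
def merge_jmp_blocks(blocks, entry):
    """Join each block ending in an unconditional jmp with its target.

    blocks maps a block address to (insns, succs): a list of mnemonic
    strings and a list of successor addresses (hypothetical layout).
    A target is absorbed only if the jumping block is its sole
    predecessor and it is not the function entry. Returns a new dict.
    """
    blocks = dict(blocks)  # shallow copy; the caller's dict is left alone
    changed = True
    while changed:
        changed = False
        # Recompute predecessor lists after every merge.
        preds = {}
        for addr, (insns, succs) in blocks.items():
            for succ in succs:
                preds.setdefault(succ, []).append(addr)
        for addr in sorted(blocks):
            insns, succs = blocks[addr]
            if not insns or insns[-1] != "jmp" or len(succs) != 1:
                continue
            target = succs[0]
            if target in (addr, entry) or target not in blocks:
                continue
            if preds.get(target) != [addr]:
                continue
            # Drop the jmp itself and append the target's instructions.
            t_insns, t_succs = blocks.pop(target)
            blocks[addr] = (insns[:-1] + t_insns, list(t_succs))
            changed = True
            break
    return blocks
```

Restarting the scan after each merge keeps the predecessor lists accurate, at the cost of being quadratic in the number of blocks; for functions of a few hundred blocks that should not matter much.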