PyPatches Documentation
PyPatches is a library that aims to make binary patching easy! There are some other libraries and tools that try to do the same thing, of course. You can find a list here.
Patch Strategies
- Replace Code: Assemble code directly over the existing binary
- Nop Code: Make code not do something
- Invert/always/never branch: Make branch always/never/go the opposite way it does right now
- Replace Function at Callsites: Change calls to a function to call elsewhere -- either replace all or just some calls
- Replace Function: Make a function do something else
- Add data: add global data to a program
- Hook extern: change the contents of a got/plt entry
Dependencies
PyPatches uses the following libraries and packages, each for their intended purpose:
- archinfo: For architecture information about the binary's architecture
- cle: For loading, segment/section information, symbols
- angr: For program analysis, identification of branches and functions
- lief: For ELF parsing and modification
Design
Patches in patches are a somewhat abstract concept and refer moreso to the idea of
a "single change to a binary" than to a specific swap of bytes in a specific location.
For example: "replace 1 byte at 0x401000 with 0x90" is a patch, and so is "compile this
C code and add it somewhere I can jump to it at label my_code" and so is "add this
long string somewhere at label my_string".
Patches doesn't try to be too clever: it very much believes patching is a low level and application specific process, so it doesn't want to get in your way. C, asm, and raw bytes will always be equally supported. For C, patches uses squishy, my LLVM-based shellcode compiler to allow nearly arbitrary C code to be compiled into a big blob and jumped to.
Patching Process
When a patch is applied with patcher.apply_xxx(), nothing immediately changes. The
patch is basically "queued" for application when patcher.save is called later to write
changes to disk. There are two reasons for this:
- Patches must work together! We rarely just want to change one thing!
- It is easier from a programming perspective :)
Patches are created without context, but they are applied with context. This can look a little weird at first, because a set of patches might look something like:
d = DataPatch(
b"hello\n\x00", read=True, write=True, exec=False, label="hellostr"
)
p = AddCodePatch(
code=(
"mov rdi, {hellostr_addr}\n"
"call {puts_plt}\n"
),
dummy_transformer=lambda asm: sub(r"\{[a-z_]+\}", "0", asm),
build_transformer=lambda tinfo, code: code.format(
hellostr_addr=tinfo.data_offsets.get("hellostr"),
puts_plt=tinfo.lief_binary.get_symbol("puts").value
)
)
That is pretty odd looking, but there is a method to the madness.
- The code is "transformed" with the
dummy_transformer(which, if you don't provide one, will just return the asm which may be OK for your purposes!). This is used to figure out how big the code is (a fudge factor of 2 is added for safety). - The patcher "dummy transforms" all the code it needs to add, and modifies the binary with new segments large enough to hold all the new code and data it will be adding.
- The patcher "transforms" the code again with the
build_transformer, this time providing the binary context which contains the offsets of all the data and code we added, indexed by label, the binary information from LIEF (and angr, not pictured) and some other extra info. This allows us to perform any relocating or fixing up we need to do so our code will call the right locations and such in the final binary!
Examples
There are some examples here, but the most up to date examples will be in the test directory.
API Reference
You can get the raw unfiltered API reference here.