pepy
====
pepy (pronounced p-pie) is a Python binding to the pe-parse parser.

pepy supports Python versions 3.6 and above.

The easiest way to use pepy is to install it via pip:

```bash
$ pip3 install pepy
```

## Building

If you can build pe-parse and have a working Python environment (headers and libraries) you can build pepy.

1. Build pepy:
   * `python3 setup.py build`
2. Install pepy:
   * `python3 setup.py install`

**Building on Windows:** Python 3.x is typically installed as _python.exe_, **NOT** _python3.exe_.

## Using

### Parsed object

There are a number of objects involved in pepy. The main one is the **parsed** object. This object is returned by the *parse* method.

```python
import pepy

p = pepy.parse("/path/to/exe")
```

The **parsed** object has a number of methods:

* `get_entry_point`: Return the entry point address
* `get_machine_as_str`: Return the machine as a human readable string
* `get_subsystem_as_str`: Return the subsystem as a human readable string
* `get_bytes`: Return the first N bytes at a given address
* `get_sections`: Return a list of section objects
* `get_imports`: Return a list of import objects
* `get_exports`: Return a list of export objects
* `get_relocations`: Return a list of relocation objects
* `get_resources`: Return a list of resource objects

The **parsed** object has a number of attributes:

* `signature`
* `machine`
* `numberofsections`
* `timedatestamp`
* `numberofsymbols`
* `characteristics`
* `magic`
* `majorlinkerver`
* `minorlinkerver`
* `codesize`
* `initdatasize`
* `uninitdatasize`
* `entrypointaddr`
* `baseofcode`
* `baseofdata`
* `imagebase`
* `sectionalignement`
* `filealignment`
* `majorosver`
* `minorosver`
* `win32ver`
* `imagesize`
* `headersize`
* `checksum`
* `subsystem`
* `dllcharacteristics`
* `stackreservesize`
* `stackcommitsize`
* `heapreservesize`
* `heapcommitsize`
* `loaderflags`
* `rvasandsize`

Example:

```python
import time
import pepy

p = pepy.parse("/path/to/exe")
print("Timedatestamp: %s" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.timedatestamp)))
ep = p.get_entry_point()
print("Entry point: 0x%x" % ep)
```

The `get_sections`, `get_imports`, `get_exports`, `get_relocations` and `get_resources` methods each return a list of objects. The type of object depends upon the method called. `get_sections` returns a list of `section` objects, `get_imports` returns a list of `import` objects, etc.

### Section Object

The `section` object has the following attributes:

* `base`
* `length`
* `virtaddr`
* `virtsize`
* `numrelocs`
* `numlinenums`
* `characteristics`
* `data`

### Import Object

The `import` object has the following attributes:

* `sym`
* `name`
* `addr`

### Export Object

The `export` object has the following attributes:

* `mod`
* `func`
* `addr`

### Relocation Object

The `relocation` object has the following attributes:

* `type`
* `addr`

### Resource Object

The `resource` object has the following attributes:

* `type_str`
* `name_str`
* `lang_str`
* `type`
* `name`
* `lang`
* `codepage`
* `RVA`
* `size`
* `data`

The `resource` object has the following methods:

* `type_as_str`

Resources are stored in a directory structure. The first three levels of the directory are called `type`, `name` and `lang`. Each of these levels can have either a pre-defined value or a custom string. The pre-defined values are stored in the `type`, `name` and `lang` attributes. If a custom string is found it will be stored in the `type_str`, `name_str` and `lang_str` attributes. The `type_as_str` method can be used to convert a pre-defined type value to a string representation.
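
The fallback between custom strings and pre-defined values described above can be sketched as a small helper. This is only an illustration: `resource_labels` is a hypothetical function, not part of pepy, and the `SimpleNamespace` object is a stand-in with made-up values so the snippet runs without a binary. It also assumes that an unused `*_str` attribute is empty (falsy).

```python
from types import SimpleNamespace


def resource_labels(resource):
    """Return (type, name, lang) labels for a resource-like object.

    Hypothetical helper, not part of pepy: for each level, prefer the
    custom string and fall back to the pre-defined value in hex.
    """
    def label(custom, predefined):
        # Assumption: an absent custom string is empty (falsy).
        return custom if custom else hex(predefined)

    return (
        label(resource.type_str, resource.type),
        label(resource.name_str, resource.name),
        label(resource.lang_str, resource.lang),
    )


# Stand-in object with illustrative values; a real one would come
# from pepy.parse(path).get_resources().
example = SimpleNamespace(
    type_str="", type=0x10,
    name_str="MAINICON", name=0,
    lang_str="", lang=0x409,
)
print(resource_labels(example))
```

With a real binary, the same helper could be applied to each item returned by `get_resources()`.
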
The following code shows how to iterate through resources:

```python
import pepy
from hashlib import md5
import sys

p = pepy.parse(sys.argv[1])
resources = p.get_resources()
print("Resources: (%i)" % len(resources))
for resource in resources:
    print("[+] MD5: (%i) %s" % (len(resource.data), md5(resource.data).hexdigest()))
    if resource.type_str:
        print("\tType string: %s" % resource.type_str)
    else:
        print("\tType: %s (%s)" % (hex(resource.type), resource.type_as_str()))
    if resource.name_str:
        print("\tName string: %s" % resource.name_str)
    else:
        print("\tName: %s" % hex(resource.name))
    if resource.lang_str:
        print("\tLang string: %s" % resource.lang_str)
    else:
        print("\tLang: %s" % hex(resource.lang))
    print("\tCodepage: %s" % hex(resource.codepage))
    print("\tRVA: %s" % hex(resource.RVA))
    print("\tSize: %s" % hex(resource.size))
```

Note that some binaries (particularly packed) may have corrupt resource entries. In these cases you may find that `len(resource.data)` is 0 but `resource.size` is greater than 0. The `size` attribute is the size of the data as declared by the resource data entry.

## Authors

pe-parse was designed and implemented by Andrew Ruef (andrew@trailofbits.com).

pepy was written by Wesley Shields (wxs@atarininja.org).