pepy
pepy (pronounced p-pie) is a Python binding to the pe-parse parser.
Building
If you can build pe-parse and have a working python environment (headers and libraries) you can build pepy.
Python 2.7
- Build pepy:
- python setup.py build
- Install pepy:
- python setup.py install
Building on Windows: If you get a build error of 'Unable to find vcvarsall.bat', you must set the VS90COMNTOOLS environment variable to the appropriate path before running setup.py, as per this SO article:
While running setup.py for package installations, Python 2.7 searches for an installed Visual Studio 2008. You can trick Python to use a newer Visual Studio by setting the correct path in VS90COMNTOOLS environment variable before calling setup.py.
Execute the following command based on the version of Visual Studio installed:
- Visual Studio 2010 (VS10):
SET VS90COMNTOOLS=%VS100COMNTOOLS%
- Visual Studio 2012 (VS11):
SET VS90COMNTOOLS=%VS110COMNTOOLS%
- Visual Studio 2013 (VS12):
SET VS90COMNTOOLS=%VS120COMNTOOLS%
- Visual Studio 2015/2017 (VS14):
SET VS90COMNTOOLS=%VS140COMNTOOLS%
Python 3.x
- Build pepy:
- python3 setup.py build
- Install pepy:
- python3 setup.py install
Building on Windows: Python 3.x is typically installed as python.exe NOT python3.exe.
Using
Parsed object
There are a number of objects involved in pepy. The main one is the parsed object. This object is returned by the parse method.
import pepy
p = pepy.parse("/path/to/exe")
The parsed object has a number of methods:
- get_entry_point: Return the entry point address
- get_machine_as_str: Return the machine as a human readable string
- get_subsystem_as_str: Return the subsystem as a human readable string
- get_bytes: Return the first N bytes at a given address
- get_sections: Return a list of section objects
- get_imports: Return a list of import objects
- get_exports: Return a list of export objects
- get_relocations: Return a list of relocation objects
- get_resources: Return a list of resource objects
The parsed object has a number of attributes:
- signature
- machine
- numberofsections
- timedatestamp
- numberofsymbols
- characteristics
- magic
- majorlinkerver
- minorlinkerver
- codesize
- initdatasize
- uninitdatasize
- entrypointaddr
- baseofcode
- baseofdata
- imagebase
- sectionalignement
- filealignment
- majorosver
- minorosver
- win32ver
- imagesize
- headersize
- checksum
- subsystem
- dllcharacteristics
- stackreservesize
- stackcommitsize
- heapreservesize
- heapcommitsize
- loaderflags
- rvasandsize
Example:
import time
import pepy
p = pepy.parse("/path/to/exe")
# print() with a single argument works on both Python 2.7 and Python 3.x.
print("Timedatestamp: %s" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.timedatestamp)))
ep = p.get_entry_point()
print("Entry point: 0x%x" % ep)
The get_sections, get_imports, get_exports, get_relocations and get_resources methods each return a list of objects. The type of object depends upon the method called. get_sections returns a list of section objects, get_imports returns a list of import objects, etc.
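As a rough illustration of the paragraph above, the sketch below counts how many objects each list-returning method yields. The count_entries helper and the FakeParsed class are hypothetical names introduced here so the sketch runs without a PE file; they are not part of pepy.

```python
# The five methods documented as returning lists of objects.
LIST_METHODS = ("get_sections", "get_imports", "get_exports",
                "get_relocations", "get_resources")


def count_entries(parsed):
    """Map each list-returning method name to the number of objects it returns."""
    return dict((name, len(getattr(parsed, name)())) for name in LIST_METHODS)


class FakeParsed(object):
    """Hypothetical stand-in for a parsed object, with made-up contents."""

    def get_sections(self):
        return ["s1", "s2", "s3"]

    def get_imports(self):
        return ["i1", "i2"]

    def get_exports(self):
        return []

    def get_relocations(self):
        return ["r1"]

    def get_resources(self):
        return []


counts = count_entries(FakeParsed())
for name in LIST_METHODS:
    print("%s: %i" % (name, counts[name]))
```

With pepy installed, the same helper should accept the object returned by pepy.parse("/path/to/exe") in place of FakeParsed().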
Section Object
The section object has the following attributes:
- base
- length
- virtaddr
- virtsize
- numrelocs
- numlinenums
- characteristics
- data
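A minimal sketch of how the section attributes might be summarized, assuming characteristics holds the raw section flags bit field from the PE header. The flag values come from the PE/COFF specification; section_permissions, describe_section and the FakeSection stand-in are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Memory-permission flags for sections, as defined by the PE/COFF specification.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(characteristics):
    """Render the memory-permission bits as an 'rwx'-style string."""
    flags = ((IMAGE_SCN_MEM_READ, "r"),
             (IMAGE_SCN_MEM_WRITE, "w"),
             (IMAGE_SCN_MEM_EXECUTE, "x"))
    return "".join(ch if characteristics & bit else "-" for bit, ch in flags)


def describe_section(section):
    """Format one line from the documented section attributes."""
    return "virtaddr=0x%x virtsize=0x%x length=0x%x perms=%s" % (
        section.virtaddr, section.virtsize, section.length,
        section_permissions(section.characteristics))


# Hypothetical stand-in with made-up values; real objects come from get_sections().
FakeSection = namedtuple("FakeSection", "virtaddr virtsize length characteristics")
text = FakeSection(0x1000, 0x2a4, 0x400, 0x60000020)
print(describe_section(text))
```

With pepy installed, describe_section could be applied to each item of p.get_sections() instead of the stand-in.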
Import Object
The import object has the following attributes:
- sym
- name
- addr
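One way to use these attributes is to group imported symbols by module. The sketch below assumes that name holds the module (DLL) name and sym holds the imported symbol, which this document does not state explicitly. group_imports and the FakeImport stand-in are hypothetical.

```python
from collections import defaultdict, namedtuple


def group_imports(imports):
    """Group (sym, addr) pairs by module, assuming name=module and sym=symbol."""
    by_module = defaultdict(list)
    for imp in imports:
        by_module[imp.name].append((imp.sym, imp.addr))
    return dict(by_module)


# Hypothetical stand-ins with made-up values; real objects come from get_imports().
FakeImport = namedtuple("FakeImport", "sym name addr")
sample = [
    FakeImport("CreateFileW", "KERNEL32.dll", 0x402000),
    FakeImport("MessageBoxW", "USER32.dll", 0x402010),
    FakeImport("CloseHandle", "KERNEL32.dll", 0x402004),
]

grouped = group_imports(sample)
for module in sorted(grouped):
    print(module)
    for sym, addr in grouped[module]:
        print("\t0x%x %s" % (addr, sym))
```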
Export Object
The export object has the following attributes:
- mod
- func
- addr
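As a small sketch, exports can be listed in address order using only the three attributes above. exports_by_address and the FakeExport stand-in are hypothetical names, and the sample values are made up.

```python
from collections import namedtuple


def exports_by_address(exports):
    """Return the export objects sorted by their addr attribute."""
    return sorted(exports, key=lambda exp: exp.addr)


# Hypothetical stand-ins with made-up values; real objects come from get_exports().
FakeExport = namedtuple("FakeExport", "mod func addr")
sample = [
    FakeExport("demo.dll", "Stop", 0x10002000),
    FakeExport("demo.dll", "Start", 0x10001000),
]

for exp in exports_by_address(sample):
    print("0x%x %s!%s" % (exp.addr, exp.mod, exp.func))
```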
Relocation Object
The relocation object has the following attributes:
- type
- addr
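A quick way to get an overview of a relocation list is to count entries per type. The sketch below assumes type is an integer relocation type; count_relocation_types and the FakeReloc stand-in are hypothetical, and the sample values are made up.

```python
from collections import Counter, namedtuple


def count_relocation_types(relocations):
    """Count how many relocation objects share each type value."""
    return Counter(reloc.type for reloc in relocations)


# Hypothetical stand-ins with made-up values; real objects come from get_relocations().
FakeReloc = namedtuple("FakeReloc", "type addr")
sample = [FakeReloc(3, 0x1004), FakeReloc(3, 0x1010), FakeReloc(0, 0x1ffe)]

for reloc_type, count in sorted(count_relocation_types(sample).items()):
    print("type %i: %i" % (reloc_type, count))
```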
Resource Object
The resource object has the following attributes:
- type_str
- name_str
- lang_str
- type
- name
- lang
- codepage
- RVA
- size
- data
The resource object has the following methods:
- type_as_str
Resources are stored in a directory structure. The first three levels of the directory are called type, name and lang. Each of these levels can have either a pre-defined value or a custom string. The pre-defined values are stored in the type, name and lang attributes. If a custom string is found it will be stored in the type_str, name_str and lang_str attributes. The type_as_str method can be used to convert a pre-defined type value to a string representation.
The following code shows how to iterate through resources:
import sys
from hashlib import md5
import pepy
p = pepy.parse(sys.argv[1])
resources = p.get_resources()
print("Resources: (%i)" % len(resources))
for resource in resources:
    print("[+] MD5: (%i) %s" % (len(resource.data), md5(resource.data).hexdigest()))
    # For each level, prefer the custom string and fall back to the pre-defined value.
    if resource.type_str:
        print("\tType string: %s" % resource.type_str)
    else:
        print("\tType: %s (%s)" % (hex(resource.type), resource.type_as_str()))
    if resource.name_str:
        print("\tName string: %s" % resource.name_str)
    else:
        print("\tName: %s" % hex(resource.name))
    if resource.lang_str:
        print("\tLang string: %s" % resource.lang_str)
    else:
        print("\tLang: %s" % hex(resource.lang))
    print("\tCodepage: %s" % hex(resource.codepage))
    print("\tRVA: %s" % hex(resource.RVA))
    print("\tSize: %s" % hex(resource.size))
Note that some binaries (particularly packed ones) may have corrupt resource entries. In these cases you may find that len(resource.data) is 0 but resource.size is greater than 0. The size attribute is the size of the data as declared by the resource data entry.
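The check described above can be sketched as a small filter. find_truncated_resources and the FakeResource stand-in are hypothetical names; only the RVA, size and data attributes listed earlier are used, and the sample values are made up.

```python
from collections import namedtuple


def find_truncated_resources(resources):
    """Return resources that declare a non-zero size but carry no data."""
    return [res for res in resources if res.size > 0 and len(res.data) == 0]


# Hypothetical stand-ins with made-up values; real objects come from get_resources().
FakeResource = namedtuple("FakeResource", "RVA size data")
sample = [
    FakeResource(0x5000, 4, b"\x00\x01\x02\x03"),  # size matches the data
    FakeResource(0x9000, 0x200, b""),              # declared size, but no data
]

for res in find_truncated_resources(sample):
    print("Suspicious resource at RVA 0x%x: declared size 0x%x, no data"
          % (res.RVA, res.size))
```

With pepy installed, the same filter could be applied to p.get_resources() to flag entries worth inspecting by hand.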
Authors
pe-parse was designed and implemented by Andrew Ruef (andrew@trailofbits.com). pepy was written by Wesley Shields (wxs@atarininja.org).