pepy
====
pepy (pronounced p-pie) is a Python binding to the pe-parse parser.

pepy supports Python versions 3.6 and above.

The easiest way to use pepy is to install it via pip:

```bash
$ pip3 install pepy
```

## Building

If you can build pe-parse and have a working Python environment (headers and
libraries), you can build pepy.

1. Build pepy:
   * `python3 setup.py build`
2. Install pepy:
   * `python3 setup.py install`

**Building on Windows:** Python 3.x is typically installed as _python.exe_,
**NOT** _python3.exe_.

## Using

### Parsed object

There are a number of objects involved in pepy. The main one is the **parsed**
object. This object is returned by the *parse* method.

```python
import pepy
p = pepy.parse("/path/to/exe")
```

The **parsed** object has a number of methods:

* `get_entry_point`: Return the entry point address
* `get_machine_as_str`: Return the machine as a human-readable string
* `get_subsystem_as_str`: Return the subsystem as a human-readable string
* `get_bytes`: Return the first N bytes at a given address
* `get_sections`: Return a list of section objects
* `get_imports`: Return a list of import objects
* `get_exports`: Return a list of export objects
* `get_relocations`: Return a list of relocation objects
* `get_resources`: Return a list of resource objects

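
As a rough illustration of how these methods combine, the sketch below reads a few bytes at the entry point. It assumes that `get_bytes` takes an address followed by a byte count and returns a bytes-like value; that argument order is inferred from the description above and is not confirmed here. `entry_point_preview` and `FakeParsed` are hypothetical names used only for this example.

```python
def entry_point_preview(parsed, count=8):
    """Return (entry point address, hex string of the first `count` bytes there)."""
    ep = parsed.get_entry_point()
    # Assumption: get_bytes(address, length) returns a bytes-like object.
    raw = bytes(parsed.get_bytes(ep, count))
    return ep, raw.hex()


class FakeParsed:
    """Stand-in for a pepy parsed object so the sketch runs without a PE file."""

    def get_entry_point(self):
        return 0x401000

    def get_bytes(self, addr, length):
        # Made-up bytes, truncated to the requested length.
        return bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x10, 0x53, 0x56])[:length]


ep, preview = entry_point_preview(FakeParsed(), 4)
print("Entry point: 0x%x, first bytes: %s" % (ep, preview))
```

With pepy installed, the object returned by `pepy.parse("/path/to/exe")` would take the place of `FakeParsed()`.
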
The **parsed** object has a number of attributes:

* `signature`
* `machine`
* `numberofsections`
* `timedatestamp`
* `numberofsymbols`
* `characteristics`
* `magic`
* `majorlinkerver`
* `minorlinkerver`
* `codesize`
* `initdatasize`
* `uninitdatasize`
* `entrypointaddr`
* `baseofcode`
* `baseofdata`
* `imagebase`
* `sectionalignement`
* `filealignment`
* `majorosver`
* `minorosver`
* `win32ver`
* `imagesize`
* `headersize`
* `checksum`
* `subsystem`
* `dllcharacteristics`
* `stackreservesize`
* `stackcommitsize`
* `heapreservesize`
* `heapcommitsize`
* `loaderflags`
* `rvasandsize`

Example:

```python
import time
import pepy

p = pepy.parse("/path/to/exe")
print("Timedatestamp: %s" % time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.timedatestamp)))
ep = p.get_entry_point()
print("Entry point: 0x%x" % ep)
```

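
The example above formats the timestamp in local time, so the same binary prints differently depending on the machine's time zone. The PE header timestamp counts seconds since the Unix epoch, so a UTC rendering gives reproducible output. The following is a minimal sketch of that variant; `format_pe_timestamp` is a hypothetical helper, not part of pepy.

```python
from datetime import datetime, timezone


def format_pe_timestamp(timedatestamp):
    """Format a PE header timestamp (seconds since the Unix epoch) in UTC."""
    stamp = datetime.fromtimestamp(timedatestamp, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S UTC")


# 1_000_000_000 is an arbitrary sample value, not taken from a real binary.
print("Timedatestamp: %s" % format_pe_timestamp(1_000_000_000))
```

In real use the argument would be `p.timedatestamp` from a parsed object.
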
The `get_sections`, `get_imports`, `get_exports`, `get_relocations` and
`get_resources` methods each return a list of objects. The type of object
depends upon the method called. `get_sections` returns a list of `section`
objects, `get_imports` returns a list of `import` objects, etc.

### Section Object

The `section` object has the following attributes:

* `base`
* `length`
* `virtaddr`
* `virtsize`
* `numrelocs`
* `numlinenums`
* `characteristics`
* `data`

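
To show how these attributes might be read together, here is a small sketch that summarises a section and decodes its memory permissions. It assumes `characteristics` holds the raw section flags from the PE section header and that `data` is a bytes-like value; the flag constants come from the PE/COFF specification, while `section_permissions` and `describe_section` are hypothetical helpers.

```python
from types import SimpleNamespace

# Section flag values from the PE/COFF specification.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(characteristics):
    """Render the memory-permission flags as an 'rwx'-style string."""
    flags = (
        (IMAGE_SCN_MEM_READ, "r"),
        (IMAGE_SCN_MEM_WRITE, "w"),
        (IMAGE_SCN_MEM_EXECUTE, "x"),
    )
    return "".join(ch if characteristics & bit else "-" for bit, ch in flags)


def describe_section(section):
    """Summarise one section using only the attributes listed above."""
    return "virtaddr=0x%x virtsize=0x%x raw=%d bytes perms=%s" % (
        section.virtaddr,
        section.virtsize,
        len(section.data),
        section_permissions(section.characteristics),
    )


# Stand-in object with made-up values; real ones come from p.get_sections().
fake = SimpleNamespace(virtaddr=0x1000, virtsize=0x2400,
                       data=b"\x90" * 16, characteristics=0x60000020)
print(describe_section(fake))
```
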
### Import Object

The `import` object has the following attributes:

* `sym`
* `name`
* `addr`

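
One common use of the import list is grouping symbols by the module they come from. The sketch below does that under the assumption that `name` holds the module (DLL) name and `sym` the imported symbol; the list above does not spell out these meanings, so check them against real output. `group_imports` is a hypothetical helper.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_imports(imports):
    """Map each module name to the sorted list of symbols imported from it."""
    grouped = defaultdict(list)
    for imp in imports:
        # Assumption: imp.name is the module (DLL) and imp.sym the symbol.
        grouped[imp.name].append(imp.sym)
    return {module: sorted(symbols) for module, symbols in grouped.items()}


# Made-up stand-ins; real objects come from p.get_imports().
fake_imports = [
    SimpleNamespace(name="KERNEL32.dll", sym="WriteFile", addr=0x402008),
    SimpleNamespace(name="USER32.dll", sym="MessageBoxA", addr=0x402010),
    SimpleNamespace(name="KERNEL32.dll", sym="CreateFileA", addr=0x402004),
]
for module, symbols in sorted(group_imports(fake_imports).items()):
    print("%s: %s" % (module, ", ".join(symbols)))
```
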
### Export Object

The `export` object has the following attributes:

* `mod`
* `func`
* `addr`

### Relocation Object

The `relocation` object has the following attributes:

* `type`
* `addr`

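
A quick way to get an overview of a relocation list is to count entries per `type` value. This is a minimal sketch that assumes `type` is a hashable value such as an integer; `count_relocation_types` is a hypothetical helper.

```python
from collections import Counter
from types import SimpleNamespace


def count_relocation_types(relocations):
    """Count relocation entries per value of their `type` attribute."""
    return Counter(reloc.type for reloc in relocations)


# Made-up stand-ins; real objects come from p.get_relocations().
fake_relocs = [
    SimpleNamespace(type=3, addr=0x1004),
    SimpleNamespace(type=3, addr=0x1010),
    SimpleNamespace(type=0, addr=0x2000),
]
for reloc_type, count in sorted(count_relocation_types(fake_relocs).items()):
    print("type %d: %d entries" % (reloc_type, count))
```
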
### Resource Object

The `resource` object has the following attributes:

* `type_str`
* `name_str`
* `lang_str`
* `type`
* `name`
* `lang`
* `codepage`
* `RVA`
* `size`
* `data`

The `resource` object has the following methods:

* `type_as_str`

Resources are stored in a directory structure. The first three levels of the
directory are called `type`, `name` and `lang`. Each of these levels can have
either a pre-defined value or a custom string. The pre-defined values are
stored in the `type`, `name` and `lang` attributes. If a custom string is
found it will be stored in the `type_str`, `name_str` and `lang_str`
attributes. The `type_as_str` method can be used to convert a pre-defined
type value to a string representation.

The following code shows how to iterate through resources:

```python
import pepy

from hashlib import md5
import sys

p = pepy.parse(sys.argv[1])
resources = p.get_resources()
print("Resources: (%i)" % len(resources))
for resource in resources:
    print("[+] MD5: (%i) %s" % (len(resource.data), md5(resource.data).hexdigest()))
    if resource.type_str:
        print("\tType string: %s" % resource.type_str)
    else:
        print("\tType: %s (%s)" % (hex(resource.type), resource.type_as_str()))
    if resource.name_str:
        print("\tName string: %s" % resource.name_str)
    else:
        print("\tName: %s" % hex(resource.name))
    if resource.lang_str:
        print("\tLang string: %s" % resource.lang_str)
    else:
        print("\tLang: %s" % hex(resource.lang))
    print("\tCodepage: %s" % hex(resource.codepage))
    print("\tRVA: %s" % hex(resource.RVA))
    print("\tSize: %s" % hex(resource.size))
```

Note that some binaries (particularly packed ones) may have corrupt resource
entries. In these cases you may find that `len(resource.data)` is 0 but
`resource.size` is greater than 0. The `size` attribute is the size of the data
as declared by the resource data entry.

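
The mismatch described above can be checked for directly. The sketch below filters a resource list for entries that declare a size but carry no data; `find_truncated_resources` is a hypothetical helper, and the stand-in objects only mimic the `size` and `data` attributes.

```python
from types import SimpleNamespace


def find_truncated_resources(resources):
    """Return resources that declare a size but carry no data."""
    return [res for res in resources if res.size > 0 and len(res.data) == 0]


# Made-up stand-ins; real objects come from p.get_resources().
fake_resources = [
    SimpleNamespace(size=0x40, data=b"\x00" * 0x40),  # consistent entry
    SimpleNamespace(size=0x200, data=b""),            # declared size, no data
]
suspect = find_truncated_resources(fake_resources)
print("Suspect resources: %i of %i" % (len(suspect), len(fake_resources)))
```
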
## Authors

pe-parse was designed and implemented by Andrew Ruef (andrew@trailofbits.com).

pepy was written by Wesley Shields (wxs@atarininja.org).