* Put all of peparse in the peparse namespace.
* Fixes duplicate symbol problems when using the library inside other applications, namely Python.
* Closes #25
I've noticed this in one (otherwise valid) EFI image. What happens is
the section specifies an invalid PointerToRawData, which the bounded
buffer abstraction catches and returns NULL. However, the SizeOfRawData
is still in the structure (and probably invalid too).
I saw two ways to fix this. If sectionData ends up being NULL we can set
SizeOfRawData to 0, but that would be truncating what is otherwise
specified in the file.
The other option is to teach dump-prog and pepy about this and adjust
accordingly. This involves checking whether the data is a NULL pointer in
dump-prog when printing sections. In pepy it required roughly the same
check.
I went with option 2.
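
As a rough illustration of option 2 (not the actual dump-prog or pepy
code), a consumer can keep the SizeOfRawData from the file and simply
skip the data when it is missing. The Section class and
describe_section() below are hypothetical stand-ins, with None playing
the role of the NULL pointer:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Section:
    # Hypothetical stand-in for a parsed section; not the real pepy type.
    name: str
    size_of_raw_data: int      # value exactly as written in the section header
    data: Optional[bytes]      # None when PointerToRawData was invalid


def describe_section(sec: Section) -> str:
    """Format one section, tolerating missing data."""
    header = f"{sec.name}: SizeOfRawData=0x{sec.size_of_raw_data:x}"
    if sec.data is None:
        # Report the size from the file, but never dereference the data.
        return header + " (no data: invalid PointerToRawData)"
    return header + f" first bytes: {sec.data[:4].hex()}"


print(describe_section(Section(".text", 0x200, b"\x55\x8b\xec\x83")))
print(describe_section(Section(".bad", 0x1000, None)))
```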
This was causing a problem where resources with strings would accumulate
the strings of previous resources in the directory.
For example, here is the output of test.py on
3f0961b7942f12bc96848509c04da2b6:
Resources: (4)
[+] MD5: (191649) 33a6345b919c7c733da9d33ee4ac64eb
Type string: BINARY
Name string: 1.165.3106.0_TO_1.165.3138.0_MPASDLTA.VDM._P
Lang: 0x0
Codepage: 0x4e4
RVA: 0x51dc
Size: 0x2eca1
First 10 bytes: 0x4d50535091ec0200c263
[+] MD5: (293587) e4c9b9aa65e0b236cb180fa489502700
Type string: BINARY
Name string: 1.165.3106.0_TO_1.165.3138.0_MPASDLTA.VDM._P1.165.3106.0_TO_1.165.3138.0_MPAVDLTA.VDM._P
The second resource has the first resource's name string in it.
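
This message does not say what caused the accumulation. One common way
to get exactly this symptom is a string buffer that lives outside the
per-resource loop and is never cleared. The sketch below only
illustrates that pattern and its fix; the function names are made up
and it is not the parser's real code:

```python
def names_buggy(raw_names):
    # The buffer is created once and appended to for every resource,
    # so each name carries all of the previous names in front of it.
    buf = ""
    out = []
    for raw in raw_names:
        buf += raw
        out.append(buf)
    return out


def names_fixed(raw_names):
    out = []
    for raw in raw_names:
        buf = ""        # fresh buffer for every resource
        buf += raw
        out.append(buf)
    return out


names = ["MPASDLTA.VDM._P", "MPAVDLTA.VDM._P"]
print(names_buggy(names))
print(names_fixed(names))
```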
Teach the parser to properly handle PE32+ binaries.
The major differences are:
- Fields in the OptionalHeader which are not relative are now 64 bits.
- Base addresses should all be 64 bits.
- The BaseOfData field is not available on PE32+.
There is now a 16-bit field tacked on to the end of nt_header_32 called
OptionalMagic. This is a duplicate of the Magic field in optional_header_32
and optional_header_64, but is stored in nt_header_32 to make it easier
to determine which optional header is being used.
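
To show why the magic value matters, here is a small sketch that reads
ImageBase straight from raw optional header bytes. The magic values
(0x10b for PE32, 0x20b for PE32+) and field offsets come from the PE
format; image_base() itself is a hypothetical helper and not part of
the library:

```python
import struct

PE32_MAGIC = 0x10B       # IMAGE_NT_OPTIONAL_HDR32_MAGIC
PE32PLUS_MAGIC = 0x20B   # IMAGE_NT_OPTIONAL_HDR64_MAGIC


def image_base(optional_header: bytes) -> int:
    """Return ImageBase, picking the layout based on the Magic field."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    if magic == PE32_MAGIC:
        # PE32: BaseOfData sits at offset 24, 32-bit ImageBase at offset 28.
        (base,) = struct.unpack_from("<I", optional_header, 28)
    elif magic == PE32PLUS_MAGIC:
        # PE32+: no BaseOfData, so a 64-bit ImageBase starts at offset 24.
        (base,) = struct.unpack_from("<Q", optional_header, 24)
    else:
        raise ValueError(f"unsupported optional header magic 0x{magic:x}")
    return base
```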
I also added support for better error reporting. Now when something fails
to parse you can use a couple of functions to find out what happened and
where it happened:
- GetPEErr(): Return the error as an integer.
- GetPEErrString(): Return the error as a string.
- GetPEErrLoc(): Return the function and line number of the error.
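
The general pattern behind these three functions (record a code and a
location at the point of failure, query them afterwards) can be
sketched as follows. Everything here is a hypothetical Python stand-in,
including the error codes and strings; the real functions are the C++
ones listed above:

```python
import inspect

# Assumed error table, for illustration only.
_ERR_STRINGS = {0: "No error", 1: "Unable to read data"}
_err_code = 0
_err_loc = ""


def set_error(code):
    """Remember the error code and the caller's function and line."""
    global _err_code, _err_loc
    caller = inspect.currentframe().f_back
    _err_code = code
    _err_loc = f"{caller.f_code.co_name}:{caller.f_lineno}"


def get_err():
    return _err_code


def get_err_string():
    return _ERR_STRINGS.get(_err_code, "Unknown error")


def get_err_loc():
    return _err_loc


def parse_header(buf):
    # Toy parser: fails when fewer than two bytes are available.
    if len(buf) < 2:
        set_error(1)
        return None
    return buf[:2]


if parse_header(b"") is None:
    print(get_err(), get_err_string(), get_err_loc())
```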
Made some changes to pepy to account for these changes. The interface
into pepy is otherwise identical. The only externally visible changes are that
pepy.parse() will now return the error string and location when parsing
fails and the baseofdata attribute will throw an exception if the binary
is PE32+.
to_string.h is now included from parse.h, so remove it from dump.cpp.
While here do a bunch of cleanups to make printing consistent. Use '0x'
where appropriate and ensure exceptions are punctuated correctly.
Instead of constantly defining and redefining the macros to read values
just define them once. There are now the three main ones (READ_WORD,
READ_DWORD and READ_BYTE) along with READ_DWORD_PTR and READ_DWORD_NULL.
Each macro takes a pointer to a bounded_buffer (what to read from), an
offset (where to read), and a structure and member (what to read into).
Use READ_DWORD_PTR when you have a pointer to a structure. Use
READ_DWORD_NULL when a failed read should return NULL; all the other
macros return false on failure.
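
The idea the macros wrap, a bounds-checked little-endian read at an
offset that reports failure instead of reading past the end, looks
roughly like this. BoundedBuffer and its methods are hypothetical
Python stand-ins for the C++ bounded_buffer, with None signalling a
failed read:

```python
import struct
from typing import Optional


class BoundedBuffer:
    """Hypothetical stand-in for the library's bounded_buffer."""

    def __init__(self, data: bytes):
        self._data = data

    def _read(self, fmt: str, offset: int) -> Optional[int]:
        size = struct.calcsize(fmt)
        if offset < 0 or offset + size > len(self._data):
            return None  # out of bounds: report failure, do not read
        return struct.unpack_from(fmt, self._data, offset)[0]

    def read_byte(self, offset: int) -> Optional[int]:
        return self._read("<B", offset)

    def read_word(self, offset: int) -> Optional[int]:
        return self._read("<H", offset)

    def read_dword(self, offset: int) -> Optional[int]:
        return self._read("<I", offset)
```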
Fixes #7.
I have a UPX packed sample that corrupted the resource directory. These changes
allow the resources to be properly parsed.
They add an RVA and size to the resource struct. This is the address and size
of the resource as it is declared in the directory. If the address is invalid,
or the size is invalid (i.e., it goes off the end of the .rsrc section), create
a zero-length buffer for the data. Otherwise, return the actual data.
This allows consumers of the rsrc to figure out if the resource is corrupt
or not by comparing the length of the buffer to the size element. If the
size is greater than 0 but the buffer is empty then it's invalid.
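
A minimal sketch of both halves of this, assuming the .rsrc section is
available as a bytes object along with its starting RVA. The function
names are hypothetical and this is not the parser's actual code:

```python
def resource_data(section: bytes, section_rva: int, rva: int, size: int) -> bytes:
    """Return the resource bytes, or a zero-length buffer if invalid."""
    start = rva - section_rva
    if start < 0 or start >= len(section):
        return b""                      # address is outside the section
    if start + size > len(section):
        return b""                      # size runs off the end of the section
    return section[start:start + size]


def looks_corrupt(declared_size: int, data: bytes) -> bool:
    """Consumer-side check: a declared size with no data means corruption."""
    return declared_size > 0 and len(data) == 0


rsrc = bytes(range(16))
good = resource_data(rsrc, 0x5000, 0x5004, 4)
bad = resource_data(rsrc, 0x5000, 0x500C, 8)
print(looks_corrupt(4, good), looks_corrupt(8, bad))
```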
Also, it should never happen but just to be safe make pepy catch NULL
buffers (in pepy_data_converter) and return an empty bytearray.
I had initially written this in such a way that it would break if there
were multiple entries anywhere other than the first table. This change
now works across more complex samples that I have tested against.
While here, I did a little moving around and had to create a structure
that isn't used other than to know how far to move the offset when
parsing. This is because the struct into which I am parsing the data
keeps track of other things along the way, so its size is incorrect.
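
The same separation can be sketched in Python: one fixed layout that
only describes the on-disk entry (two DWORDs, 8 bytes, per the PE
format) and is used to advance the offset, and a richer record that
carries extra bookkeeping. ParsedEntry, its depth field, and
parse_entries() are hypothetical names for illustration:

```python
import struct
from dataclasses import dataclass

# On-disk layout of one resource directory entry: two little-endian DWORDs.
ENTRY_ON_DISK = struct.Struct("<II")


@dataclass
class ParsedEntry:
    # In-memory record; it holds more than the 8 on-disk bytes, so its
    # size must not be used to step through the file.
    ident: int
    offset_to_data: int
    depth: int


def parse_entries(buf: bytes, offset: int, count: int, depth: int = 0):
    entries = []
    for _ in range(count):
        ident, data_off = ENTRY_ON_DISK.unpack_from(buf, offset)
        entries.append(ParsedEntry(ident, data_off, depth))
        offset += ENTRY_ON_DISK.size    # advance by the on-disk size only
    return entries
```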
While here, change parse_resource() to be parse_resource_table() as it
is more accurate to what it really does.
When iterating through the bytearray it would cause a Python crash if
the byte value was 0x78. I have a test sample where the first 8 bytes
at the entry point are 0xe8 0xa6 0x4e 0x0 0x0 0xe9 0x78 0xfe. If I don't
do this dance it crashes when trying to get the byte at index 6 (0x78)
out of the array.