pe-parse

mirror of https://github.com/QuasarApp/pe-parse.git synced 2025-04-29 22:04:33 +00:00

Author	SHA1	Message	Date
Wesley Shields	77b72f3cc9	Implement PE32+ and error reporting. Teach the parser to properly handle PE32+ binaries. The major differences are: - Fields in the OptionalHeader which are not relative are now 64 bits. - Base addresses should all be 64 bits. - The BaseOfData field is not available on PE32+. There is now a 16-bit field tacked on to the end of nt_header_32 called OptionalMagic. This is a duplicate of the Magic field in optional_header_32 and optional_header_64, but is stored in nt_header_32 to make it easier to determine which optional header is being used. I also added support for better error reporting. Now when something fails to parse you can use a couple of functions to find out what happened and where it happened: - GetPEErr(): Return the error as an integer. - GetPEErrString(): Return the error as a string. - GetPEErrLoc(): Return the function and line number of the error. Made some changes to pepy to account for these changes. The interface into pepy is identical. The only externally visible changes are that pepy.parse() will now return the error string and location when parsing fails, and the baseofdata attribute will throw an exception if the binary is PE32+. to_string.h is now included from parse.h, so remove it from dump.cpp. While here do a bunch of cleanups to make printing consistent. Use '0x' where appropriate and ensure exceptions are punctuated correctly.	2014-03-07 13:18:24 -05:00
Wesley Shields	ec5c49eaff	Make resource parsing more resilient. I have a UPX packed sample that corrupted the resource directory. These changes allow the resources to be properly parsed. They add an RVA and size to the resource struct. This is the address and size of the resource as it is declared in the directory. If the address is invalid, create a zero-length buffer for the data. If the size is invalid (i.e., it goes off the end of the .rsrc section), create a zero-length buffer for the data. Otherwise, return the actual data. This allows consumers of the rsrc to figure out if the resource is corrupt or not by comparing the length of the buffer to the size element. If the size is greater than 0 but the buffer is empty then it's invalid. Also, it should never happen, but just to be safe make pepy catch NULL buffers (in pepy_data_converter) and return an empty bytearray.	2013-12-30 16:45:50 -05:00
Wesley Shields	913b3c16d1	Catch if PyInt_FromLong() returns NULL.	2013-12-24 14:43:09 -05:00
Wesley Shields	a6af4cbd18	Implement resource parsing. While here, fix a memory leak in pepy as I was not decrementing the reference counter on self->data in section_dealloc().	2013-12-24 12:41:59 -05:00
Wesley Shields	dae8606469	Bugfix to get_bytes and add section.data. If get_bytes does not fill the list, get a slice of what was filled and use that to convert to a bytearray. I still want to find a way to just use a bytearray from the start. Luckily with the rest of this commit I don't have a need to call get_bytes() on sections anymore. Sections now have a data attribute which is a bytearray of the data that makes up that section. This way you can just use section.data attribute to get the entire contents and operate on it as you wish. Make test.py use section.data to generate an MD5 of the section. It now also prints the first 10 bytes of each section (if there are bytes).	2013-12-14 22:26:58 -05:00
Wesley Shields	23ebc6e799	Whitespace.	2013-12-12 16:19:07 -05:00
Wesley Shields	9494d96300	Switch to a bytearray for get_bytes(). It probably isn't the best way to do it but I couldn't get anything to work when trying to generate a bytearray object directly. As a workaround I first put each byte into a list and then convert the list to a bytearray.	2013-12-12 16:14:53 -05:00
Wesley Shields	b867946050	Implement relocations. This still needs testing.	2013-11-30 22:44:30 -05:00
Wesley Shields	5e86f97c96	Implement a bunch of parsed attributes. These are all the things the dump-prog pulls out already.	2013-11-30 22:21:10 -05:00
Wesley Shields	5fb0afd098	Simplify things a bit. Instead of having 2 macros for each object simplify by having 1 set of macros that can work across all objects except the parsed object. I could make this work for the parsed object by making the parsed object store PyObject pointers to the parsed values instead of creating them on the fly while getting an attribute.	2013-11-30 21:54:38 -05:00
Wesley Shields	7abab7bd2e	Implement imports and exports. Might as well do some general cleanup too: Rename the len attribute of a section to length. The section, import and export callbacks return 0 on success and anything else on failure. Whitespace fixes. Fix a bunch of copy/paste mistakes in the test script.	2013-11-30 21:36:05 -05:00
Wesley Shields	2083f6f358	Sections are now their own type. Do not return a list of dictionaries from get_sections(). Now it returns a list of section objects, which expose the information via attributes.	2013-11-29 23:32:32 -05:00
Wesley Shields	b4ad87819e	Support section, symbols and characteristics. While here, make it easier to extend by providing macros to eliminate the mundane that goes into writing getset members.	2013-11-29 21:56:12 -05:00
Wesley Shields	912a892e47	Switch from using members to getsetters. This means I don't have to store anything in the pepy_parsed object (PyObject pointers or native C types). Use a macro to get things out of the parsed structures and into python objects.	2013-11-29 19:04:45 -05:00
Wesley Shields	3c7d1c1052	Turns out I like using native types. Switch back to using native types. This is less memory for me to manage.	2013-11-29 16:29:45 -05:00
Wesley Shields	53fb7e7d2c	Fix crash, convert back to PyObject pointers. There was some weird memory corruption caused by how pepy_parsed_init() was parsing arguments. The result was that accessing attributes or methods which didn't exist would periodically cause segfaults. This code was leftover from an earlier way of doing things and doesn't need to be done this way. Just parse straight to a C style string instead of this crap. Also implement support for signature and machine. Also, add Py_TPFLAGS_BASETYPE as you should.	2013-11-29 16:20:44 -05:00
Wesley Shields	860fbff4e4	Don't store parsed values in python objects. Convert the PyObject pointers used inside pepy_parsed into their corresponding native types and use those. Teach the members array to return them accordingly. While here might as well add support for signature and machine values. Also, convert test.py to have shorter output by not using pprint.	2013-11-29 14:28:39 -05:00
Wesley Shields	ed77443f31	Implement timedatestamp member. While here, DECREF the string used in init. Also, make a note that I really want to use a bytearray instead of a list for get_bytes().	2013-11-29 14:11:01 -05:00
Wesley Shields	6d8a39ad72	Add a bunch of constants. These are useful for checking values I'll be adding support for later. Example: import pepy; print hex(pepy.MZ_MAGIC)	2013-11-27 16:17:22 -05:00
Wesley Shields	20869810cf	Silence warnings in pepy.cpp.	2013-11-27 16:16:55 -05:00
Wesley Shields	a928a15b8b	Initial commit of pepy (pronounced p-pie). This is a set of python bindings to pe-parse. It is nowhere near feature complete yet but I'll keep working on it.	2013-11-27 15:52:24 -05:00
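
Commit ec5c49eaff above says a consumer can spot a corrupt resource by comparing the length of the returned buffer with the size declared in the directory. A minimal sketch of that check follows. The Resource class and its field names (rva, size, data) are hypothetical stand-ins for illustration, not pepy's actual attribute names.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for a parsed resource entry."""
    rva: int     # address of the resource as declared in the directory
    size: int    # size of the resource as declared in the directory
    data: bytes  # actual bytes; zero-length if the RVA or size was invalid


def is_corrupt(res: Resource) -> bool:
    """A declared size above zero with an empty buffer marks corruption."""
    return res.size > 0 and len(res.data) == 0


good = Resource(rva=0x4000, size=4, data=b"\x00\x01\x02\x03")
bad = Resource(rva=0xFFFF0000, size=4, data=b"")
empty = Resource(rva=0x4000, size=0, data=b"")
```

Under this rule a resource that legitimately declares a size of 0 is not flagged, since the empty buffer matches the declared size.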

21 Commits
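
Commit 77b72f3cc9 distinguishes PE32 from PE32+ by the Magic field of the optional header. The sketch below shows that check in isolation, reading raw bytes with Python's struct module rather than going through pe-parse or pepy. The function name optional_header_kind and the helper make_stub are hypothetical; the offsets and magic values (0x10B for PE32, 0x20B for PE32+) come from the PE format itself.

```python
import struct

PE32_MAGIC = 0x10B    # optional header magic for PE32
PE32P_MAGIC = 0x20B   # optional header magic for PE32+


def optional_header_kind(image: bytes) -> str:
    """Return 'PE32' or 'PE32+' based on the optional header magic."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3C holds the file offset of the NT headers.
    (nt_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[nt_offset:nt_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    magic_offset = nt_offset + 4 + 20
    if magic_offset + 2 > len(image):
        raise ValueError("truncated optional header")
    (magic,) = struct.unpack_from("<H", image, magic_offset)
    if magic == PE32_MAGIC:
        return "PE32"
    if magic == PE32P_MAGIC:
        return "PE32+"
    raise ValueError("unknown optional header magic: 0x%x" % magic)


def make_stub(magic: int) -> bytes:
    """Build a minimal synthetic header for demonstration (hypothetical)."""
    nt_offset = 0x80
    image = bytearray(nt_offset + 4 + 20 + 2)
    image[:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, nt_offset)
    image[nt_offset:nt_offset + 4] = b"PE\0\0"
    struct.pack_into("<H", image, nt_offset + 24, magic)
    return bytes(image)
```

A caller would pass the first few hundred bytes of a file, for example optional_header_kind(open(path, "rb").read(1024)), and branch on the result before interpreting fields such as BaseOfData, which exists only in PE32.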