octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-11-11 23:45:47 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310)	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	30a0c070e4	Add QPDFObjectHandle::getJSON()	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	077d3d4512	Add QPDFObjectHandle::wrapInArray() Wrap an object in an array if it is not already an array.	2018-12-18 16:45:48 -05:00
Jay Berkenbilt	38c9ed23c3	Treat content stream parsing errors as an error, not a warning If parsing content streams is treated as a warning, there is no way for a caller to know if a parsing operation has failed. This is very dangerous and will likely result in data loss when token filters or parser callbacks are in use.	2018-06-22 10:44:08 -04:00
Jay Berkenbilt	ddd78c1b7f	Fix QPDFObjectHandle::shallowCopy It's not really a shallow copy. It just doesn't cross indirect object boundaries. The old implementation had a bug that would cause multiple shallow copies of the same object to share memory, which was not the intention.	2018-06-21 20:34:45 -04:00
Jay Berkenbilt	952a665a4e	Better support for creating Unicode strings	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	4cded10821	Add QPDFObjectHandle::Rectangle type Provide a convenient way of accessing rectangles.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	b4d6cf6836	Limit depth of nesting in direct objects (fixes #202) This fixes CVE-2018-9918.	2018-04-15 16:11:22 -04:00
Jay Berkenbilt	e4e2e26d99	Properly handle pages with no contents (fixes #194) Remove calls to assertPageObject(). All cases in the library that called assertPageObject() work fine if you don't call assertPageObject() because nothing assumes anything that was being checked by that call. Removing the calls enables more files to be successfully processed.	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	d0e99f195a	More robust handling of type errors Give objects descriptions and context so it is possible to issue warnings instead of fatal errors for attempts to access objects of the wrong type.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	21b7481b0e	Push members of QPDFObjectHandle into a Members object As in other cases, this is to enable adding new member variables in the future without breaking ABI compatibility.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	e410b0fe0d	Simplify TokenFilter interface Expose Pl_QPDFTokenizer, and have it do more of the work of managing the token filter's pipeline.	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5708b5d0aa	Add additional interface for filtering page contents	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	9910104442	Implement TokenFilter and refactor Pl_QPDFTokenizer Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	b8723e97f4	Add coalesce contents capability	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fcd611b61e	Refactor parseContentStream	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	05ff619b09	Remove redundant method Remove a redundant method that was equal to another one with additional arguments. This breaks binary compatibility, but there are other ABI breaking changes in the upcoming release, so now is the time to do it.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	55ee55394c	Use inline image token in content parser	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	d31a7b76e7	Improve message for stream decoding error Tweak the message so that we inform the user that we are mitigating data loss.	2017-09-12 16:03:48 -04:00
Jay Berkenbilt	728dc9e6d8	Fix error caught by clang	2017-08-26 21:51:17 -04:00
Jay Berkenbilt	ad527a64f9	Parse iteratively to avoid stack overflow (fixes #146)	2017-08-25 21:56:45 -04:00
Jay Berkenbilt	e452d9dca6	Spell check	2017-08-22 14:22:20 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132)	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggressive prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	12db09898e	Don't interpret word tokens in content streams (fixes #82)	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assume objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	603f222365	Fix infinite loop while reporting an error (fixes #101) This is CVE-2017-9210. The description string for an error message included unparsing an object, which is too complex of a thing to try to do while throwing an exception. There was only one example of this in the entire codebase, so it is not a pervasive problem. Fixing this eliminated one class of infinite loop errors.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	c729e07d55	Avoid resolving arguments to R When checking two objects preceding R while parsing, ensure that the objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R from causing an infinite loop in object resolution.	2015-02-21 17:51:08 -05:00
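
Commit d71f05ca07 above describes making every potentially lossy integer conversion explicit, with a range check that raises an exception instead of overflowing silently. The following is a minimal sketch of that general idea, not code taken from qpdf: the helper name checked_int_cast and the choice of std::range_error are assumptions made purely for illustration.

```cpp
#include <limits>
#include <stdexcept>
#include <string>
#include <type_traits>

// Hypothetical helper (not qpdf's API): convert between integer types,
// throwing std::range_error when the value does not fit in the target
// type instead of silently truncating or wrapping.
template <typename To, typename From>
To checked_int_cast(From value)
{
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "checked_int_cast requires integer types");
    bool fits;
    if (std::is_signed<From>::value && value < From(0)) {
        // Negative input: only a signed target with a low enough minimum works.
        fits = std::is_signed<To>::value &&
            static_cast<long long>(value) >=
                static_cast<long long>(std::numeric_limits<To>::min());
    } else {
        // Non-negative input: compare magnitudes in the widest unsigned type.
        fits = static_cast<unsigned long long>(value) <=
            static_cast<unsigned long long>(std::numeric_limits<To>::max());
    }
    if (!fits) {
        throw std::range_error(
            "integer conversion of " + std::to_string(value) +
            " would lose data");
    }
    return static_cast<To>(value);
}
```

With a helper of this shape, a call such as checked_int_cast<int>(some_size_t_value) fails at the point of conversion when the value exceeds the range of int, which matches the commit's stated goal of raising an error "in the right spot" rather than failing later because of a silent overflow.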

99 Commits