octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-11-14 00:34:03 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	c76536dd9a	Implement JSON v2 output	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	1bc8abfdd3	Implement JSON v2 for Stream Not fully exercised in this commit	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	16f4f94cd9	Prepare code for JSON v2 Update getJSON() methods and calls to them	2022-05-07 11:12:01 -04:00
Jay Berkenbilt	21d6e3231f	Make use of the new Pipeline methods in some places	2022-05-03 18:31:23 -04:00
Jay Berkenbilt	59f3e09edf	Make Pipeline::write take an unsigned char const* (API change)	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	c365a26e9d	Revert "Remove QPDFObjectHandle::replaceOrRemoveKey" This reverts commit `dc059560e7`. I changed my mind. There's no harm in leaving it deprecated for a release cycle.	2022-04-30 14:15:07 -04:00
Jay Berkenbilt	dc059560e7	Remove QPDFObjectHandle::replaceOrRemoveKey See ChangeLog for rationale for not deprecating it as originally planned.	2022-04-30 13:39:45 -04:00
Jay Berkenbilt	4f24617e1e	Code clean up: use range-style for loops wherever possible Where not possible, use "auto" to get the iterator type. Editorial note: I have avoid this change for a long time because of not wanting to make gratuitous changes to version history, which can obscure when certain changes were made, but with having recently touched every single file to apply automatic code formatting and with making several broad changes to the API, I decided it was time to take the plunge and get rid of the older (pre-C++11) verbose iterator syntax. The new code is just easier to read and understand, and in many cases, it will be more effecient as fewer temporary copies are being made. m-holger, if you're reading, you can see that I've finally come around. :-)	2022-04-30 13:27:18 -04:00
Jay Berkenbilt	7f023701dd	Formatting: remove space in range-style for loops Change .clang-format and commit automated changes from a fresh run of format-code	2022-04-30 13:26:43 -04:00
Jay Berkenbilt	d8fdf632a9	Use replaceKeyAndGet in a few places in existing code	2022-04-29 20:28:02 -04:00
Jay Berkenbilt	e80fad86e9	Add new QPDFObjectHandle methods for more fluent programming	2022-04-29 20:09:10 -04:00
Jay Berkenbilt	4be2f36049	Deprecate replaceOrRemoveKey -- it's the same as replaceKey	2022-04-24 09:31:32 -04:00
Jay Berkenbilt	4925f0d18c	Have dictionary/streams mutators take const& where possible	2022-04-24 09:05:50 -04:00
Jay Berkenbilt	68e721981a	Add new QPDF::warn that takes most of QPDFExc's arguments	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	75fe4f60c3	Use anonymous namespaces for file-private classes	2022-04-16 13:35:27 -04:00
Jay Berkenbilt	cdd0b4fb7d	Use = default and = delete where possible in classes	2022-04-16 11:39:14 -04:00
Jay Berkenbilt	2a7d2b63c2	Make ABI-breaking changes that don't modify API at all * Merge overloaded functions by adding default values * Remove non-const methods that are identical to const methods	2022-04-16 10:41:46 -04:00
Jay Berkenbilt	a68703b07e	Replace PointerHolder with std::shared_ptr in library sources only (patrepl and cleanpatch are my own utilities) patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/.hh patrepl s/PointerHolder/std::shared_ptr/g libqpdf/.cc patrepl s/make_pointer_holder/std::make_shared/g libqpdf/.cc patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/.cc patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, */.cc */.hh git restore include/qpdf/PointerHolder.hh cleanpatch ./format-code	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	77e889495f	Update some code manually to get better formatting results Add comments to force line breaks, parenthesize function arguments that are contatenated strings, etc. -- these kinds of changes improve clang-format's results and also cause emacs cc-mode to match clang-format. After this type of change, most of the time, when clang-format and emacs disagree, clang-format is better.	2022-04-05 14:56:19 -04:00
Jay Berkenbilt	12f1eb15ca	Programmatically apply new formatting to code Run this: for i in */.cc */.c */.h */.hh; do clang-format < $i >\| $i.new && mv $i.new $i done	2022-04-04 08:10:40 -04:00
m-holger	4ff837f099	Fix tests for Form XObjects Remove test for type == /XObject in QPDFObjectHandle::isFormXObject as type value is optional (as per spec 8.10.2). Replace code to test for /Form in QPDFJob::shouldRemoveUnreferencedResources with a call to isFormXObject.	2022-02-10 19:47:37 -05:00
Jay Berkenbilt	cb769c62e5	WHITESPACE ONLY -- expand tabs in source code This comment expands all tabs using an 8-character tab-width. You should ignore this commit when using git blame or use git blame -w. In the early days, I used to use tabs where possible for indentation, since emacs did this automatically. In recent years, I have switched to only using spaces, which means qpdf source code has been a mixture of spaces and tabs. I have avoided cleaning this up because of not wanting gratuitous whitespaces change to cloud the output of git blame, but I changed my mind after discussing with users who view qpdf source code in editors/IDEs that have other tab widths by default and in light of the fact that I am planning to start applying automatic code formatting soon.	2022-02-08 11:51:15 -05:00
Jay Berkenbilt	c62e8e2b28	Update for clean compile with POINTERHOLDER_TRANSITION=2	2022-02-07 17:38:22 -05:00
m-holger	8371060340	Add method QPDFObjectHandle::getKeyIfDict	2022-02-06 11:21:15 -05:00
Jay Berkenbilt	7fb22740e1	Add operator ""_qpdf for creating QPDFObjectHandle literals	2022-02-05 11:29:25 -05:00
m-holger	e58b1174c7	Add new QPDFObjectHandle::getValueAs... accessors	2022-02-05 11:24:35 -05:00
Jay Berkenbilt	9044a24097	PointerHolder: deprecate getPointer() and getRefcount() Use get() and use_count() instead. Add #define NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these only. This commit also removes all deprecated PointerHolder API calls from qpdf's code except in PointerHolder's test suite, which must continue to test the deprecated APIs.	2022-02-04 13:12:37 -05:00
m-holger	8eca9d8fd9	Fix QPDFObjectHandle::isOrHasName Ensure isOrHasName returns true if object is an array and the name is present anywhere in the array.	2022-01-27 09:35:39 -06:00
m-holger	07db3200cb	Remove some if statements and simplify some boolean expressions Use QPDFObjectHandle::isNameAndEquals, isDictionaryOfType and isStreamOfType.	2022-01-27 07:31:12 -06:00
m-holger	710d2e54f0	Allow testing for subtype without specifying type in isDictionaryOfType etc Accept empty string as type parameter in QPDFObjectHandle::isDictionaryOfType and isStreamOfType to allow for dictionaries with optional type.	2022-01-27 07:31:12 -06:00
m-holger	8593b9fdf7	Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType	2022-01-22 08:10:28 -06:00
Jay Berkenbilt	3340dbe976	Use a specific error code for type warnings and clarify docs	2021-12-10 11:15:49 -05:00
Jay Berkenbilt	9b28933647	Check object ownership when adding When adding a QPDFObjectHandle to an array or dictionary, if possible, check if the new object belongs to the same QPDF. This makes it much easier to find incorrect code than waiting for the situation to be detected when the file is written.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	0b77f2cf26	Revert non-binary-compatible handleWarning change -- see TODO (ABI)	2021-03-04 15:59:46 -05:00
Jay Berkenbilt	f68e25c7f2	Don't use handleWarning, which is being reverted	2021-03-04 15:59:45 -05:00
Jay Berkenbilt	552303a94a	Check for reserved after dereference	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	d7ffdfa994	Add optional conflict detection to mergeResources Also improve behavior around direct vs. indirect resources.	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	a15ec6967d	Enhancements to ParserCallbacks	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	a4d6589ff2	Have QPDFObjectHandle notice when replaceObject was called This results in a performance penalty of 1% to 2% when replaceObject and swapObjects are never called and a somewhat larger penalty if they are called, but it's worth it to avoid very confusing behavior as discussed in depth in qpdf#507.	2021-02-25 07:32:46 -05:00
Jay Berkenbilt	ec6719fd25	Always call dereference() before querying obj pointer	2021-02-25 07:31:26 -05:00
Jay Berkenbilt	7b3cbacf5d	Change from QPDF{Array,Dict}Items to aitems() and ditems()	2021-02-22 11:05:39 -05:00
Jay Berkenbilt	92fbc6fdf5	QPDFObjectHandle::copyStream	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	901f1a788c	Enhance QPDFMatrix API	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	05eb5826d8	Fix isPagesObject and isPageObject There are lots of things with /Kids that are not pages. Repair the pages tree, then do a reliable check.	2021-02-20 19:42:41 -05:00
Jay Berkenbilt	35dd11f356	Allow --rotate=0	2021-02-20 16:29:34 -05:00
Jay Berkenbilt	a773f4c71d	Add QPDFObjectHandle::parse for strings with context	2021-02-15 11:33:03 -05:00
Jay Berkenbilt	efbb21673c	Add functional versions of QPDFObjectHandle::replaceStreamData Also fix a bug in checking consistency of length for stream data providers. Length should not be checked or recorded if the provider says it failed to generate the data.	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	07f40bd254	QUtil::double_to_string: trim trailing zeroes with option to disable	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	4ae93a73c5	Improve memory safety of dict/array iterators	2021-01-31 07:16:03 -05:00
Jay Berkenbilt	de0b11fc47	Add C++ iterator API around array and dictionary objects	2021-01-30 15:15:23 -05:00
Jay Berkenbilt	35e7859bc7	Make QPDFObjectHandle::is* return false for uninitialized objects	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	6226b69dba	Add warn() to QPDF's public API	2021-01-16 18:41:53 -05:00
Jay Berkenbilt	3be58f49e5	Make more QPDFPageObjectHelper methods work with form XObject	2021-01-02 14:08:53 -05:00
Jay Berkenbilt	bedf35d6a5	Bug fix: avoid extraneous pipeline finish calls with multiple contents Avoid calling finish() multiple times on the pipeline passed to pipeContentStreams. This commit also fixes a bug in which qpdf was not exiting with the proper exit status if warnings found while splitting pages; this was exposed by a test case that changed.	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	a139d2b36d	Add several methods for working with form XObjects (fixes #436 ) Make some more methods in QPDFPageObjectHelper work with form XObjects, provide forEach methods to walk through nested form XObjects, possibly recursively. This should make it easier to work with form XObjects from user code.	2021-01-02 12:29:31 -05:00
Jay Berkenbilt	63ea46193d	QPDFPageObjectHelper: getPageImages -> getImages	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	e7a8554563	QPDFPageObjectHelper::getPageImages: support form XObjects	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	1562d34c09	Add QPDFObjectHandle::isFormXObject	2021-01-01 07:36:10 -05:00
Jay Berkenbilt	12ecd2019a	Add QPDFObjectHandle::setFilterOnWrite	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	cc8895078a	Add QPDFObjectHandle::makeDirect(bool allow_streams)	2020-12-26 08:48:18 -05:00
Jay Berkenbilt	bd79138c84	Treat direct page as runtime rather than logic error (fuzz issue 27393)	2020-11-11 09:50:43 -05:00
Jay Berkenbilt	b30deaeeab	Avoid merging adjacent tokens when concatenating contents (fixes #444 )	2020-10-23 08:00:04 -04:00
Jay Berkenbilt	92d3cbecd4	Fix warnings reported by -Wshadow=local (fixes #431 )	2020-04-16 12:41:43 -04:00
Jay Berkenbilt	893d38b87e	Allow propagation of errors and retry through StreamDataProvider StreamDataProvider::provideStreamData now has a rich enough API for it to effectively proxy to pipeStreamData.	2020-04-05 20:07:13 -04:00
Jay Berkenbilt	6a4117add9	Avoid potential segfault in warning methods	2020-04-03 21:39:20 -04:00
Jay Berkenbilt	38afdcea7b	Add QPDFObjectHandle::unsafeShallowCopy	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	89f19b7099	Performance: remove Members indirection for QPDFObjectHandle	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	278710fbe8	Refactor QPDFPageObjectHelper::removeUnreferencedResources() Refactor removeUnreferencedResources to prepare for filtering form XObjects.	2020-03-31 17:39:20 -04:00
Masamichi Hosoda	5a842792b6	Parse Contents in signature dictionary without encryption Various PDF digital signing tools do not encrypt /Contents value in signature dictionary. Adobe Acrobat Reader DC can handle a PDF with the /Contents value not encrypted. Write Contents in signature dictionary without encryption Tests ensure that string /Contents are not handled specially when not found in sig dicts.	2019-10-22 16:20:21 -04:00
Masamichi Hosoda	cdc46d78f4	Add QPDFObject::getParsedOffset()	2019-10-22 16:19:06 -04:00
Jay Berkenbilt	685250d7d6	Correct reversed Rectangle coordinates (fixes #363 )	2019-09-19 21:25:34 -04:00
Jay Berkenbilt	8b1e307741	Warn for duplicated dictionary keys (fixes #345 )	2019-09-19 20:22:34 -04:00
Jay Berkenbilt	94e86e2528	Fix fuzz issue 16301	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	3f1ab64066	Pass offset and length to ParserCallbacks::handleObject	2019-08-22 22:54:29 -04:00
Jay Berkenbilt	4b2e72c4cd	Test for direct, rather than resolved nulls in parser Just because we know an indirect reference is null, doesn't mean we shouldn't keep it indirect.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	225cd9dac2	Protect against coding error of re-entrant parsing	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	42d396f1dd	Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332 )	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310 )	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	30a0c070e4	Add QPDFObjectHandle::getJSON()	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	077d3d4512	Add QPDFObjectHandle::wrapInArray() Wrap an object in an array if it is not already an array.	2018-12-18 16:45:48 -05:00
Jay Berkenbilt	38c9ed23c3	Treat content stream parsing errors as an error, not a warning If parsing content streams is treated as a warning, there is no way for a caller to know if a parsing operation has failed. This is very dangerous and will likely result in data loss when token filters are parser callbacks are in use.	2018-06-22 10:44:08 -04:00
Jay Berkenbilt	ddd78c1b7f	Fix QPDFObjectHandle::shallowCopy It's not really a shallow copy. It just doesn't cross indirect object boundaries. The old implementation had a bug that would cause multiple shallow copies of the same object to share memory, which was not the intention.	2018-06-21 20:34:45 -04:00
Jay Berkenbilt	952a665a4e	Better support for creating Unicode strings	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	4cded10821	Add QPDFObjectHandle::Rectangle type Provide a convenient way of accessing rectangles.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	b4d6cf6836	Limit depth of nesting in direct objects (fixes #202 ) This fixes CVE-2018-9918.	2018-04-15 16:11:22 -04:00
Jay Berkenbilt	e4e2e26d99	Properly handle pages with no contents (fixes #194 ) Remove calls to assertPageObject(). All cases in the library that called assertPageObject() work fine if you don't call assertPageObject() because nothing assumes anything that was being checked by that call. Removing the calls enables more files to be successfully processed.	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	d0e99f195a	More robust handling of type errors Give objects descriptions and context so it is possible to issue warnings instead of fatal errors for attempts to access objects of the wrong type.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	21b7481b0e	Push members of QPDFObjectHandle into a Members object As in other cases, this is to enable adding new member variables in the future without breaking ABI compatibility.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	e410b0fe0d	Simplify TokenFilter interface Expose Pl_QPDFTokenizer, and have it do more of the work of managing the token filter's pipeline.	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5708b5d0aa	Add additional interface for filtering page contents	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	9910104442	Implement TokenFilter and refactor Pl_QPDFTokenizer Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	b8723e97f4	Add coalesce contents capability	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fcd611b61e	Refactor parseContentStream	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	05ff619b09	Remove redundant method Remove a redundant method that was equal to another one with additional arguments. This breaks binary compatibility, but there are other ABI breaking changes in the upcoming release, so now is the time to do it.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	55ee55394c	Use inline image token in content parser	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	d31a7b76e7	Improve message for stream decoding error Tweak the message so that we inform the user that we are mitigating data loss.	2017-09-12 16:03:48 -04:00
Jay Berkenbilt	728dc9e6d8	Fix error caught by clang	2017-08-26 21:51:17 -04:00
Jay Berkenbilt	ad527a64f9	Parse iteratively to avoid stack overflow (fixes #146 )	2017-08-25 21:56:45 -04:00
Jay Berkenbilt	e452d9dca6	Spell check	2017-08-22 14:22:20 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132 )	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73 ) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggress prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	12db09898e	Don't interpret word tokens in content streams (fixes #82 )	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99 ) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assumes objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	603f222365	Fix infinite loop while reporting an error (fixes #101 ) This is CVE-2017-9210. The description string for an error message included unparsing an object, which is too complex of a thing to try to do while throwing an exception. There was only one example of this in the entire codebase, so it is not a pervasive problem. Fixing this eliminated one class of infinite loop errors.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	c729e07d55	Avoid resolving arguments to R When checking two objects preceding R while parsing, ensure that the objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R from causing an infinite loop in object resolution.	2015-02-21 17:51:08 -05:00
Jay Berkenbilt	caab1b0e16	Handle pages with no /Contents from getPageContents() The spec allows /Contents to be omitted for pages that are blank, but QPDFObjectHandle::getPageContents() was throwing an exception in this case.	2014-11-14 13:43:34 -05:00
Jay Berkenbilt	ac9c1f0d56	Security: replace operator[] with at For std::string and std::vector, replace operator[] with at. This was done using an automated process. See README.hardening for details.	2013-10-18 10:45:14 -04:00
Jay Berkenbilt	212812d837	Fix errors reported by Coverity Thanks to Jiri Popelka from Red Hat for sending the output of a Coverity run over qpdf.	2013-07-07 15:36:51 -04:00
Jay Berkenbilt	5039da0b91	Add QPDFObjectHandle::getObjGen() This is safer than getObjectID() and getGeneration() for many uses.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	2d02b3cc3d	Add explicit int to double cast	2013-04-04 14:13:31 -04:00
Jay Berkenbilt	29f5830325	Fix getTypeCode and getTypeName work for indirect objects Remove const qualifier from getTypeCode and get getTypeName methods of QPDFObjectHandle, make them work properly for indirect objects, and exercise them much better in the test suite.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	119f2a4b68	Add method to terminate content stream parsing	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	30027481f7	Remove all old-style casts from C++ code	2013-03-04 16:45:16 -05:00
Jay Berkenbilt	a5d8783f67	Improve qpdf --check Fix exit status for case of errors without warnings, continue after errors when possible, add test case for parsing a file with content stream errors on some but not all pages.	2013-01-25 11:08:50 -05:00
Jay Berkenbilt	bfda717749	Cosmetic changes to be closer to Adobe terminology Change object type Keyword to Operator, and place the order of the object types in object_type_e in the same order as they are mentioned in the PDF specification. Note that this change only breaks backward compatibility with code that has not yet been released.	2013-01-23 09:38:05 -05:00
Jay Berkenbilt	913eb5ac35	Add getTypeCode() and getTypeName() Add virtual methods to QPDFObject, wrappers to QPDFObjectHandle, and implementations to all the QPDF_Object types.	2013-01-22 10:01:45 -05:00
Jay Berkenbilt	f81152311e	Add QPDFObjectHandle::parseContentStream method This method allows parsing of the PDF objects in a content stream or array of content streams.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	1d88955fa6	Added new QPDFObjectHandle types Keyword and InlineImage These object types are to facilitate content stream parsing.	2013-01-20 15:35:39 -05:00
Tobias Hoffmann	9c00874e77	added QPDFObjectHandle::replaceStreamData(std::string data).	2012-07-25 03:02:46 +02:00
Jay Berkenbilt	6bbea4baa0	Implement QPDFObjectHandle::parse Move object parsing code from QPDF to QPDFObjectHandle and parameterize the parts of it that are specific to a QPDF object. Provide a version that can't handle indirect objects and that can be called on an arbitrary string. A side effect of this change is that the offset used when reporting invalid stream length has changed, but since the new value seems like a better value than the old one, the test suite has been updated rather than making the code backward compatible. This only effects the offset reported for invalid streams that lack /Length or have an invalid /Length key. Updated some test code and exmaples to use QPDFObjectHandle::parse. Supporting changes include adding a BufferInputSource constructor that takes a string.	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	b501251291	qpdf: push inherited attributes to page when showing images from qpdf command-line tool	2012-07-15 16:22:28 -04:00
Jay Berkenbilt	e7b8f297ba	Support copying objects from another QPDF object This includes QPDF::copyForeignObject and supporting foreign objects as arguments to addPage*.	2012-07-11 15:54:33 -04:00
Jay Berkenbilt	8a217eb3a2	Add concept of reserved objects QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved provide a mechanism to add objects to a PDF file when there are circular references. This is a prerequisite to copying objects from one PDF to another.	2012-07-10 23:34:32 -04:00
Tobias Hoffmann	8720446b23	Added assertNumber and assertScalar to QPDFObjectHandle	2012-07-07 18:55:08 -04:00
Tobias Hoffmann	a8266ccb0e	Added public assert{Type} methods to QPDFObjectHandle	2012-07-07 18:53:38 -04:00
Jay Berkenbilt	e2dedde4bd	Don't require stream data provider to know length in advance Breaking API change: length parameter has disappeared from the StreamDataProvider version of QPDFObjectHandle::replaceStreamData since it is no longer necessary to compute it in advance. This breaking change is justified by the fact that removing the length parameter provides the caller an opportunity to simplify the calling code.	2012-07-07 17:33:45 -04:00
Jay Berkenbilt	5f59c32f87	Add a few minor enhancements to recent work Test coverage case for new newStream method Expose decimal_places argument for double-based newReal All enhancements suggested by Tobias.	2012-06-27 10:43:27 -04:00
Tobias Hoffmann	43c404b45a	Add QPDFObjectHandle::newStream(QPDF *, std::string const&) This makes the code simpler than having to create a buffer of a fixed size and copy the string to it.	2012-06-27 10:19:57 -04:00

1 2 3 4 5 ...

276 Commits