octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-06-26 07:12:45 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	30f109e244	Read xref table without PCRE Also accept more errors than before.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ca5b1d267a	Improve stream length recovery Eliminate PCRE and find endobj not preceded by endstream. Be more lax about placement of endstream and endobj.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	3082e4e606	Find xref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	90840be594	Find lindict without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	03aa9679ac	Find starxref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	1765c6ec20	Find header without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ef8ae5449d	Allow QPDFTokenizer::readToken to return bad tokens Sometimes we want to ignore bad tokens rather than having them throw an exception. A coverage case is commented out here and added in a later commit.	2017-08-10 19:01:41 -04:00
Jay Berkenbilt	c5dc6d8067	Remove unused PointerHolder interface Also fix a bug resulting from incorrect use of PointerHolder because of this unused parameter.	2017-08-10 19:01:38 -04:00
Jay Berkenbilt	ff6971fb1c	Call PointerHolder constructor properly (fixes #135 ) Passed arguments to the constructor in the wrong order.	2017-08-09 22:00:49 -04:00
Jay Berkenbilt	49825e5cb6	Add --split-pages option (fixes #30 )	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	a60eb552d3	Split bug tests into separate chunk	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	1ec59c299d	Refactor write_output	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	909daf9543	Move page spec processing earlier	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	24f28f0768	Split qpdf.cc's main into reasonably sized functions main() had gotten absurdly long. Split it into reasonable chunks. This refactoring is in preparation for handling splitting output into single pages.	2017-08-05 08:24:05 -04:00
Jay Berkenbilt	c88eaae2f2	Fix off-by-one error in --pages argument parsing (fixes #129 )	2017-08-02 21:08:43 -04:00
Jay Berkenbilt	2d5b854468	Allow reading command-line args from files (fixes #16 )	2017-07-29 22:23:21 -04:00
Jay Berkenbilt	5993c3e83c	Detect input file = output file (fixes #29 )	2017-07-29 20:58:01 -04:00
Jay Berkenbilt	885b8781cc	Allow --check to coexist with and precede other operations (fixes #42 )	2017-07-29 19:56:21 -04:00
Jay Berkenbilt	b43a0ac237	When recover stream length, indicate the length (fixes #44 )	2017-07-29 19:15:06 -04:00
Jay Berkenbilt	f37d399d82	Add newline-before-endstream option (fixes #103 )	2017-07-29 12:21:38 -04:00
Jay Berkenbilt	6a7d53ad2b	Handle zlib data errors better (fixes #106 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	07d6f770b2	Better recovery of bad stream start (fixes #104 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73 ) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	3a1ff5ded9	Add option to preserve unreferenced objects	2017-07-28 19:19:11 -04:00
Jay Berkenbilt	a94a729fee	Explicitly check root dictionary type Very badly corrupted files may not have a retrievable root dictionary. Handle that as a special case so that a more helpful error message can be provided.	2017-07-28 18:03:30 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggress prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	428d96dfe1	Convert many more errors to warnings	2017-07-27 22:57:55 -04:00
Jay Berkenbilt	a4fd4b91c6	Convert stream filtering errors to warnings	2017-07-27 18:43:07 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	ac3c81a8ed	Include tests for other infinite loop bugs fixes #117 fixes #118 fixes #119 fixes #120 Several other infinite loop bugs were fixed by previous changes. Include their test files in the test suite.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	701b518d5c	Detect recursion loops resolving objects (fixes #51 ) During parsing of an object, sometimes parts of the object have to be resolved. An example is stream lengths. If such an object directly or indirectly points to the object being parsed, it can cause an infinite loop. Guard against all cases of re-entrant resolution of objects.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99 ) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assumes objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	315092dd98	Avoid xref reconstruction infinite loop (fixes #100 ) This is CVE-2017-9209.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	603f222365	Fix infinite loop while reporting an error (fixes #101 ) This is CVE-2017-9210. The description string for an error message included unparsing an object, which is too complex of a thing to try to do while throwing an exception. There was only one example of this in the entire codebase, so it is not a pervasive problem. Fixing this eliminated one class of infinite loop errors.	2017-07-26 06:24:07 -04:00
Thorsten Schöning	e80b6e3341	Support paths with spaces	2016-01-24 11:52:09 -05:00
Thorsten Schöning	eff935ab60	Use absolute paths for large file tests Working with absolute paths makes debugging easier, but some called scripts always need / as dir separator or won't work.	2016-01-24 11:52:09 -05:00
Thorsten Schöning	adbaa54ad4	Fix non-portable use of /dev/null /dev/null is not portable, so use File::Spec instead, which provides portable "paths" and especially "nul" on Windows. I changed all places with hard coded /dev/null to be sure, while I think it only is a problem in direct system calls, because the other executed commands go to sh.exe from MSYS which itself should port /dev/null to NUL. The test still pass, so shouldn't have made any harm...	2016-01-24 11:52:09 -05:00
Thorsten Schöning	951dbc3b7f	Fix expr syntax, support spaces in paths expr needs ARG + ARG quote paths to support support spaces	2016-01-24 11:52:09 -05:00
Thorsten Schöning	3c1555a622	Explicitly invoke shell scripts with sh Shebang doesn't work well on Windows.	2016-01-24 11:52:09 -05:00
Jay Berkenbilt	b62cbe2508	Tolerate some mangled xref tables If xref table entries lack the spec-required trailing whitespace or contain a small amount of extra space, handle them anyway.	2015-10-31 18:56:43 -04:00
Jay Berkenbilt	b8bdef0ad1	Implement deterministic ID For non-encrypted files, determinstic ID generation uses file contents instead of timestamp and file name. At a small runtime cost, this enables generation of the same /ID if the same inputs are converted in the same way multiple times.	2015-10-31 18:56:42 -04:00
Jay Berkenbilt	f77acbdbba	Copyright 2015	2015-05-24 17:26:49 -04:00
Jay Berkenbilt	b356b9dfa2	fix-qdf: handle object streams with > 255 objects fix-qdf was previously hard-coding the number of bytes for the f2 field of the xref stream entry. This addresses issue #37. Thanks aluebcke for reporting.	2015-05-24 16:52:42 -04:00
Jay Berkenbilt	a11549a566	Detect loops in /Pages structure Pushing inherited objects to pages and getting all pages were both prone to stack overflow infinite loops if there were loops in the Pages dictionary. There is a general weakness in the code in that any part of the code that traverses the Pages structure would be prone to this and would have to implement its own loop detection. A more robust fix may provide some general method for handling the Pages structure, but it's probably not worth doing. Note: addition of *Internal2 private functions was done rather than changing signatures of existing methods to avoid breaking compatibility.	2015-02-21 19:47:11 -05:00
Jay Berkenbilt	c729e07d55	Avoid resolving arguments to R When checking two objects preceding R while parsing, ensure that the objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R from causing an infinite loop in object resolution.	2015-02-21 17:51:08 -05:00
Jay Berkenbilt	d8900c2255	Handle page tree node with no /Type Original reported here: https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413 The PDF specification says that the /Type key for nodes in the pages dictionary (both /Page and /Pages) is required, but some PDF files omit them. Use the presence of other keys to determine the type of pages tree node this is if the type key is not found.	2014-12-29 10:17:21 -05:00
Jay Berkenbilt	caab1b0e16	Handle pages with no /Contents from getPageContents() The spec allows /Contents to be omitted for pages that are blank, but QPDFObjectHandle::getPageContents() was throwing an exception in this case.	2014-11-14 13:43:34 -05:00
Jay Berkenbilt	9f8aba1db7	Handle indirect stream filter/decode parameters QPDFWriter was trying to make /Filter and /DecodeParms direct in all cases, but there are some cases where /DecodeParms may refer to a stream, which can't be direct. QPDFWriter doesn't actually need /DecodeParms to be direct in that case because it won't be able to filter the stream. Until we can handle this type of stream, just don't make /Filter and /DecodeParms direct if we can't filter the stream anyway. Fixes #34	2014-06-07 16:31:03 -04:00
Jay Berkenbilt	225b018290	Update Copyright to 2014	2014-01-14 15:40:02 -05:00
Jay Berkenbilt	c9a9fe9c2f	Avoid traversing same object twice when copying objects This is a performance fix. The output is unchanged. Fixes #28.	2013-12-26 11:51:50 -05:00

1 2 3 4 5

247 Commits