This takes pages from the file in groups of n, with a default of 1.
This partially addresses the enhancement requested in issue #505 but
doesn't implement the entire suggestion.
Also fix a bug in checking consistency of length for stream data
providers. Length should not be checked or recorded if the provider
says it failed to generate the data.
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
When removing unreferenced resources, notice if a page (recursively)
contains a form XObject with unreferenced resources, and count any
such resources as referenced by the page.
Keep a std::pair internal to the iterators so that operator* can
return a reference and operator-> can work, and each can work without
copying pairs of objects around.
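As a rough illustration (a generic sketch, not qpdf's actual iterator
classes), keeping the pair as a member lets dereferencing hand out a
reference instead of a temporary:

    #include <string>
    #include <utility>

    // Hypothetical iterator sketch: the key/value pair lives inside the
    // iterator, so operator* can return a reference and operator-> can
    // return a pointer without copying the pair on every access.
    class TreeIterator
    {
      public:
        using value_type = std::pair<std::string, int>;

        value_type const& operator*() const { return this->ivalue; }
        value_type const* operator->() const { return &this->ivalue; }

        TreeIterator& operator++()
        {
            // advance within the tree, then refresh the cached pair
            updateIValue();
            return *this;
        }

      private:
        void updateIValue(); // repopulate ivalue from the current position
        value_type ivalue;   // pair kept internal to the iterator
    };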
Create a computationally and memory efficient implementation of name
and number trees that does binary searches as intended by the data
structure rather than loading into a map, which can use a great deal
of memory and can be very slow.
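For context, the keys in a name tree node are stored in sorted order,
so a lookup can binary search one node's /Names array instead of
materializing the whole tree into a map. A minimal sketch of the
per-node search (hypothetical helper using plain standard types):

    #include <string>
    #include <utility>
    #include <vector>

    // Binary search over one leaf node's sorted (key, value) entries, as
    // the name tree data structure intends. Only the nodes along the
    // search path ever need to be examined, so memory use stays small.
    bool
    lookupInLeaf(std::vector<std::pair<std::string, int>> const& names,
                 std::string const& key, int& value)
    {
        size_t lo = 0;
        size_t hi = names.size();
        while (lo < hi) {
            size_t mid = lo + ((hi - lo) / 2);
            if (names[mid].first < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        if ((lo < names.size()) && (names[lo].first == key)) {
            value = names[lo].second;
            return true;
        }
        return false;
    }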
Avoid calling finish() multiple times on the pipeline passed to
pipeContentStreams. This commit also fixes a bug in which qpdf was not
exiting with the proper exit status if warnings were found while splitting
pages; this was exposed by a test case that changed.
Make some more methods in QPDFPageObjectHelper work with form
XObjects, and provide forEach methods to walk through nested form
XObjects, possibly recursively. This should make it easier to work
with form XObjects from user code.
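For a sense of what such a walk involves, here is a rough sketch of a
recursive traversal over the /XObject entries of a resource dictionary
using basic QPDFObjectHandle calls. walkFormXObjects is a hypothetical
helper, not one of the new methods, and real code would also need to
guard against loops in the resource structure:

    #include <qpdf/QPDFObjectHandle.hh>

    #include <functional>
    #include <string>

    // Visit every form XObject reachable from a resource dictionary,
    // recursing into each form XObject's own /Resources.
    void
    walkFormXObjects(QPDFObjectHandle resources,
                     std::function<void(QPDFObjectHandle)> action)
    {
        if (!(resources.isDictionary() && resources.hasKey("/XObject"))) {
            return;
        }
        QPDFObjectHandle xobjects = resources.getKey("/XObject");
        if (!xobjects.isDictionary()) {
            return;
        }
        for (auto const& key : xobjects.getKeys()) {
            QPDFObjectHandle xobj = xobjects.getKey(key);
            if (xobj.isStream() &&
                xobj.getDict().getKey("/Subtype").isName() &&
                (xobj.getDict().getKey("/Subtype").getName() == "/Form")) {
                action(xobj);
                walkFormXObjects(xobj.getDict().getKey("/Resources"), action);
            }
        }
    }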
Also removes preclusion of stream references in stream parameters of
filterable streams and reduces write times by about 8% by eliminating
an extra traversal of the objects.
This reverts an incorrect fix to #449 and codes it properly. The real
problem was that we were looking at the local dictionaries rather than
the foreign dictionaries when saving the foreign stream data. In the
case of direct objects, these happened to be the same, but in the case
of indirect objects, the object references could be pointing anywhere
since object numbers don't match up between the old and new files.
Specifically, if a stream had its stream data replaced and had
indirect /Filter or /DecodeParms, it would result in non-silent loss
of data and/or internal error.
Wildcard expansion is different in Windows from non-Windows and
sometimes requires special link options to work. Add tests that fail
if we link incorrectly.
Issue #399 mentioned a use case for which qpdf has support, but the
fact that it is supported was not documented or covered in the test
suite, making it vulnerable to accidental breakage.
If the value of /CS in the inline image dictionary is a key in the
page's /Resources -> /ColorSpace dictionary, properly resolve it by
referencing the actual color space, and not just the name, in the
external image dictionary.
Allow exit status-based checking of whether a file is encrypted or
requires a password without necessarily supplying the correct
password. Useful for scripting.
For wildcard expansion to work properly with the msvc binary, it is
necessary to link with setargv.obj or wsetargv.obj, depending on
whether wmain is in use.
Various PDF digital signing tools do not encrypt the /Contents value
in the signature dictionary. Adobe Acrobat Reader DC can handle a PDF
with the /Contents value left unencrypted.
Write /Contents in the signature dictionary without encryption.
Tests ensure that /Contents strings are not handled specially when
they do not appear in signature dictionaries.
It seems better not to compress signature dictionaries. Various PDF
digital signing tools, including Adobe Acrobat Reader DC, do not
compress signature dictionaries.
Table 8.93, "Entries in a signature dictionary," in the PDF 1.5
reference states that /ByteRange in the signature dictionary shall be
used to describe a digest that does not include the signature value
(/Contents) itself.
The byte ranges cannot be determined if the dictionary is compressed.
Table 8.93, "Entries in a signature dictionary," in the PDF 1.5
reference states that the value of the /Contents entry is a
hexadecimal string representation when /ByteRange is specified.
This commit makes QPDF always use a hexadecimal string representation
instead of a literal string for it.
It's detected in QPDFWriter instead of at parse time because I can't
figure out how to construct a test case in a reasonable time. This
commit moves the fuzz file into the regular test suite for a QTC
coverage case.
For some reason, qpdf from the beginning was replacing indirect
references to null with literal null in arrays even after removing the
old behavior of flattening scalar references. This seems like a bad
idea.
This message used to appear only for PDF >= 1.2. A name that is
invalid under PDF 1.2 rules is still valid for PDF 1.0 and 1.1.
However, since QPDFWriter may write a newer
version, it's better to detect and warn in all cases. Therefore make
the warning more informative.
This change works around STL problems with Embarcadero C++ Builder
version 10.2, but std::vector is more common than std::list in qpdf,
and this is a relatively new API, so an API change is tolerable.
Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com>
for the fix.
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
* Several assertions in linearization were not always true; change
them to run time errors
* Handle a few cases of uninitialized objects
* Handle pages with no contents when doing form operations
* Handle invalid page tree nodes when traversing pages
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.
There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
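As a generic sketch of the kind of range-checked conversion described
above (not the actual helpers added to qpdf), a narrowing conversion
can round-trip the value and compare signs, throwing if either check
fails:

    #include <stdexcept>

    // Generic illustration of a range-checked integer narrowing
    // conversion: convert only if the value is representable in the
    // target type, otherwise raise an exception rather than silently
    // truncating or changing sign.
    template <typename To, typename From>
    To
    checked_cast(From value)
    {
        To result = static_cast<To>(value);
        // Round-tripping back to the source type detects truncation;
        // the sign comparison detects signed/unsigned mismatches that
        // happen to round-trip cleanly.
        if ((static_cast<From>(result) != value) ||
            ((result < To(0)) != (value < From(0)))) {
            throw std::range_error("integer conversion out of range");
        }
        return result;
    }

With something like this, a silent wraparound such as converting a
size_t larger than INT_MAX to int becomes a detectable error instead.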
Bounding box X coordinates could be truncated, causing them to be off
by a fraction of a point. This was most likely not visible, but it was
still wrong.
On read, ignore /DecodeParms when it is an empty list; on write, delete it.
Some files have been found that include an empty list for
/DecodeParms, but this is not technically compliant with the spec, and
the only sensible interpretation is to treat it as if there are no
decode parameters.
The preservation of outlines didn't provide very useful behavior
anyway as it copied all outlines but most didn't work. This
implementation also caused a very significant performance hit and so
is being reverted until a proper solution can be coded. The eventual
solution will not be compatible with the reverted solution anyway, so
it's best not to leave this in.
Embarcadero C++Builder doesn't support more than 50 files open at the
same time for legacy 32-bit apps, which makes a test fail when it
tries to open more than that many files. This changes the number of
open files for that test to far fewer so that the test succeeds.
Alternatively, one could reduce the hard-coded limit of 200 in QPDF
itself, which I didn't do for now because it requires updating the
manuals etc. and needs to be discussed with the author of QPDF. I
guess chances are better to get the test changed upstream.
This fixes #288: https://github.com/qpdf/qpdf/issues/288
There have been issues reported where exceptions are not thrown
properly across shared library/DLL boundaries, so add a test
specifically to ensure that exceptions are caught as thrown.
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
When qpdf can't optimize an image because of an unsupported color
space, state this specifically. Recognize that many valid colorspaces
are not represented as name objects.
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
When generating appearance streams for variable text annotations,
properly handle the cases of there being no appearance dictionary, no
appearance stream, or an appearance stream with no BMC..EMC marker.
With the exception of form field annotations when /NeedAppearances is
true, remove annotations that don't have appearance streams when
flattening. There is no reason to keep these when flattening since
they are invisible. This may include unchecked checkboxes, unshown
popup windows, etc.
Allow fine control over how passwords are encoded for writing, and
allow the password for reading to be given as a hexadecimal-encoded
string. Allow suppression of password recovery as a means to ensure
that the password you specify is actually the right one.
Previously, setting encryption permissions for R >= 3 set permission
bits in groups corresponding to menu options in Acrobat 5. The new API
allows the bits to be set individually.
Explicitly abandon removal of unreferenced resources if there are any
lexical errors in the page's contents. This case always generated a
warning, but it now also prevents removal of unreferenced resources,
thus strongly decreasing the likelihood of data loss.
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
Some of the images were supposed to have no filter, but somewhere
along the line, they ended up with /FlateDecode, most likely because
qpdf rewrote the file without having --compress-streams=n specified.
If this error is repeated, it will cause a test failure.
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form XObjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
XObjects.
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearances is false (or, equivalently, absent). If
flattening form fields, also remove /AcroForm from the document
catalog.
Unparse is admittedly strange, but I'd rather be strange and
consistent, and everything else in the qpdf library uses unparse to
serialize. (If you're reading this, the convention of using "unparse"
comes from the "clu" programming language.)
Some files in the test suite trigger antivirus warnings. These are
not infected files with malicious intent. They are test files to
ensure that qpdf does not crash when it encounters the files. This
change enables those files to be obfuscated in the source repository
so that checking out qpdf from version control or extracting the
source code doesn't trigger antivirus warnings.
If we are unable to filter a page's content streams, don't attempt to
remove objects from the page's resource dictionary. Also provide a
command line option to suppress resource removal in case we ever need
this as a workaround for some bug or broken PDF files.
If a failure to parse content streams is treated as a warning, there
is no way for a caller to know that a parsing operation has failed.
This is very dangerous and will likely result in data loss when token
filters or parser callbacks are in use.
It's not really a shallow copy. It just doesn't cross indirect object
boundaries. The old implementation had a bug that would cause multiple
shallow copies of the same object to share memory, which was not the
intention.
Remove calls to assertPageObject(). All cases in the library that
called assertPageObject() work fine if you don't call
assertPageObject() because nothing assumes anything that was being
checked by that call. Removing the calls enables more files to be
successfully processed.
Prior to this fix, if there was a loop detected in following /Prev
pointers in xref streams/tables, it would cause qpdf to lose data.
Note that this condition causes many PDF readers to hang or fail.
The QPDF_String::getUTF8Val() method was not treating strings that
weren't explicitly Unicode as PDF Doc Encoded. This only affects
characters in the range 0x80 through 0xa0.
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a
TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a
general filter that passes data through a TokenFilter.
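As a rough sketch of what a token filter looks like (the method and
type names here follow qpdf's public QPDFObjectHandle::TokenFilter
interface as I understand it and should be treated as assumptions, not
as part of this change), a filter overrides handleToken and forwards
whatever it wants to keep:

    #include <qpdf/QPDFObjectHandle.hh>
    #include <qpdf/QPDFTokenizer.hh>

    // Sketch of a token filter that drops comment tokens and passes
    // everything else through unchanged. The override/pass-through
    // structure is the point; this is not the ContentNormalizer that
    // Pl_QPDFTokenizer uses internally.
    class StripComments: public QPDFObjectHandle::TokenFilter
    {
      public:
        virtual ~StripComments() = default;
        virtual void
        handleToken(QPDFTokenizer::Token const& token) override
        {
            if (token.getType() != QPDFTokenizer::tt_comment) {
                // forward the token unmodified to the downstream output
                writeToken(token);
            }
        }
    };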
Adding a trailing newline in content normalization damages files whose
contents are split across streams in the middle of tokens. Let
QPDFWriter add the newline with the indicator to ignore the newline,
which it already does. This changes the way some qdf files look.
Significant enhancements to the lexer to improve EOF handling and to
support comments and spaces as tokens. Various other minor issues were
fixed as well.
This tokenizes outer parts of the file, page content streams, and
object streams. It is for exercising the tokenizer in isolation and is
being introduced before reworking the lexical layer of qpdf.
Add options to enable the raw encryption key to be directly shown or
specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the
idea and contribution of one implementation of this idea.
Make sure to link from the source tree before linking from the system.
In many environments, this is necessary to allow a newly built qpdf to
link properly instead of trying to link or resolve libraries from an
older installed version.
If the stream isn't filterable but we call getStreamData, throw a
regular exception instead of a logic error so that normal error
handling and reporting mechanisms will be used.
While scanning the file looking for objects, limit the length of
tokens we allow. This prevents us from getting caught up in reading a
file character by character while digging through large streams.
Files written in PCLm mode have to be created in a very specific way.
qpdf doesn't know how to create PCLm files from scratch. All it knows
how to do is to write an already valid file in a suitable way.
Therefore there is no command-line support for PCLm.
There is no need for a --precheck-streams option. We can do the
precheck without imposing any penalty, only re-encoding the stream if
it fails the first time.
This commit adds several API methods that enable control over which
types of filters QPDF will attempt to decode. It also adds support for
/RunLengthDecode and /DCTDecode filters for both encoding and
decoding.
Very badly corrupted files may not have a retrievable root dictionary.
Handle that as a special case so that a more helpful error message can
be provided.
When requested, QPDFWriter will do more aggressive prechecking of streams
to make sure it can actually succeed in decoding them before
attempting to do so. This will allow preservation of raw data even
when the raw data is corrupted relative to the specified filters.
QPDFObjectHandle::parseInternal now issues warnings instead of
throwing exceptions for all error conditions that it finds (except
internal logic errors) and has stronger recovery for things like
invalid tokens and malformed dictionaries. This should improve qpdf's
ability to recover from a wide range of broken files that currently
cause it to fail.
fixes #117
fixes #118
fixes #119
fixes #120
Several other infinite loop bugs were fixed by previous changes.
Include their test files in the test suite.
During parsing of an object, sometimes parts of the object have to be
resolved. An example is stream lengths. If such an object directly or
indirectly points to the object being parsed, it can cause an infinite
loop. Guard against all cases of re-entrant resolution of objects.
This is CVE-2017-9208.
The QPDF library uses object ID 0 internally as a sentinel to
represent a direct object, but prior to this fix, it was not handling
0 0 obj or 0 0 R as a special case. Creating an object in the file
with 0 0 obj could cause various infinite loops. The PDF spec doesn't
allow for object 0. Having qpdf handle object 0 might be a better fix,
but changing all the places in the code that assume objid == 0 means
direct would be risky.
This is CVE-2017-9210.
The description string for an error message included unparsing an
object, which is too complex of a thing to try to do while throwing an
exception. There was only one example of this in the entire codebase,
so it is not a pervasive problem. Fixing this eliminated one class of
infinite loop errors.
/dev/null is not portable, so use File::Spec instead, which provides
portable "paths" and in particular "nul" on Windows. I changed all
places with hard-coded /dev/null to be safe, though I think it is only
a problem in direct system calls, because the other commands are run
through sh.exe from MSYS, which should itself map /dev/null to NUL.
The tests still pass, so this shouldn't have done any harm.
For non-encrypted files, deterministic ID generation uses file contents
instead of timestamp and file name. At a small runtime cost, this
enables generation of the same /ID if the same inputs are converted in
the same way multiple times.
fix-qdf was previously hard-coding the number of bytes for the f2
field of the xref stream entry. This addresses issue #37. Thanks to
aluebcke for reporting.
Pushing inherited objects to pages and getting all pages were both
prone to stack overflow infinite loops if there were loops in the
Pages dictionary. There is a general weakness in the code in that any
part of the code that traverses the Pages structure would be prone to
this and would have to implement its own loop detection. A more robust
fix may provide some general method for handling the Pages structure,
but it's probably not worth doing.
Note: addition of *Internal2 private functions was done rather than
changing signatures of existing methods to avoid breaking
compatibility.
When checking the two objects preceding R while parsing, ensure that
the objects are direct. This prevents constructs like 1 0 obj
containing 1 0 R 0 R from causing an infinite loop in object
resolution.
Original reported here:
https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413
The PDF specification says that the /Type key for nodes in the pages
dictionary (both /Page and /Pages) is required, but some PDF files
omit them. Use the presence of other keys to determine the type of a
pages tree node when the /Type key is not found.
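A minimal sketch of that kind of heuristic (a hypothetical helper, not
qpdf's actual code): trust /Type when it is present and a name;
otherwise fall back to the presence of /Kids to decide whether the
node is an intermediate /Pages node or a leaf /Page.

    #include <qpdf/QPDFObjectHandle.hh>

    #include <string>

    // Guess the page tree node type when /Type is missing or malformed.
    std::string
    guessPageTreeNodeType(QPDFObjectHandle node)
    {
        if (node.isDictionary() && node.getKey("/Type").isName()) {
            // /Type is present and well-formed; use it as given
            return node.getKey("/Type").getName();
        }
        // Intermediate nodes carry /Kids; leaf pages do not.
        return (node.isDictionary() && node.hasKey("/Kids")) ? "/Pages"
                                                             : "/Page";
    }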