octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-12-23 11:28:56 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	f8eee83515	Expose QPDFArgParser::usage	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8dcf6da259	QPDFJob: remove non-check from doFinalChecks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c216854607	Add basic framework for QPDFJob code generation	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	bd89aac360	QPDFJob increment: move arg parsing into QPDFJob Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with millions of public member variables, but now qpdf.cc is minimal and just calls stable library functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	12396702af	QPDFJob: reorder functions, no other changes	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2394dd8519	QPDFJob increment: static functions to member functions Convert remaining static functions that take QPDFJob& as a parameter to member functions. Utility functions that don't take QPDFJob& remain static functions and can probably just stay that way since they keep extra complexity out of QPDFJob.hh.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e2975b9ed0	QPDFJob: de-templatize do_process and do_process_once	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2f631997f2	QPDFJob increment: remove std::cout, std::cerr, whoami Remove remaining temporary duplication of hard-coded values and direct access to std::cout, std::cerr, and whoami in favor of parameters in QPDFJob. This moves a few more static methods into QPDFJob member functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1ddf5b4b4b	QPDFJob increment: get rid of exit, handle verbose Remove all calls to exit() from QPDFJob. Handle code that runs in verbose mode to enable it to make use of output streams and message prefix (whoami) from QPDFJob. This removes temporarily duplicated exit code logic and most access to whoami/std::cout outside of QPDFJob proper.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0910e767ad	QPDFJob increment: basic QPDFJob structure Move most of the methods called from qpdf.cc after argument parsing into QPDFJob. In this increment, enough QPDFJob API has been added to handle the branch of QPDFJob::run() that creates output with an appropriate division between qpdf.cc and QPDFJob. There are temporary bits of code to enable everything to compile and pass the test suite, including some duplication and hard-coded values.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	52817f0a45	Implement QPDFArgParser based on ArgParser from qpdf.cc	2022-01-30 13:11:02 -05:00
m-holger	0f9086e509	Fix doc typos	2022-01-30 12:09:54 -06:00
m-holger	8eca9d8fd9	Fix QPDFObjectHandle::isOrHasName Ensure isOrHasName returns true if object is an array and the name is present anywhere in the array.	2022-01-27 09:35:39 -06:00
m-holger	07db3200cb	Remove some if statements and simplify some boolean expressions Use QPDFObjectHandle::isNameAndEquals, isDictionaryOfType and isStreamOfType.	2022-01-27 07:31:12 -06:00
m-holger	710d2e54f0	Allow testing for subtype without specifying type in isDictionaryOfType etc Accept empty string as type parameter in QPDFObjectHandle::isDictionaryOfType and isStreamOfType to allow for dictionaries with optional type.	2022-01-27 07:31:12 -06:00
m-holger	1b1b471ca9	Make a few whitespace fixes from last commit Commit by ejb@ql.org using m-holger as author so git annotate gives proper credit for changes.	2022-01-22 09:14:53 -05:00
m-holger	8593b9fdf7	Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType	2022-01-22 08:10:28 -06:00
Jay Berkenbilt	370710657a	Add missing characters from PDF doc encoding (fixes #606 )	2022-01-11 15:55:19 -05:00
Jay Berkenbilt	77c31305fe	Fix signed/unsigned char warning (fixes #604 )	2022-01-11 06:51:31 -05:00
Jay Berkenbilt	af91b5b584	Add QUtil::file_can_be_opened	2021-12-29 13:41:02 -05:00
Jay Berkenbilt	04745320d6	Prepare 10.5.0 release	2021-12-20 14:51:46 -05:00
Jay Berkenbilt	d866f48081	Change names of qpdf_object_type_e enumerations They have to be ot_* rather than qpdf_ot_* for compatibility. * Different enumerated types are not assignment-compatible in C++, at least with strict compiler settings * While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will only work if the enumerated type names are the same.	2021-12-20 14:51:45 -05:00
Jay Berkenbilt	ea73bf72e0	Further improvements to handling binary strings	2021-12-19 14:30:45 -05:00
Jay Berkenbilt	ddbe59179e	C API: simplify new error handling and improve documentation	2021-12-17 15:59:47 -05:00
m-holger	f6293bd94c	C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597 )	2021-12-17 14:24:43 -05:00
Jay Berkenbilt	feafcc4e88	C API: add several stream functions (fixes #596 )	2021-12-17 13:28:11 -05:00
Jay Berkenbilt	fee7489ee4	Add Pl_Buffer::getMallocBuffer	2021-12-17 12:38:52 -05:00
Jay Berkenbilt	9bb6f570ec	C API: add functions for working with pages (fixes #594 )	2021-12-16 15:07:48 -05:00
Jay Berkenbilt	245ca28066	Use value rather than reference captures where possible	2021-12-16 11:47:07 -05:00
Jay Berkenbilt	af2a71aa2c	Handle bitstream overflow errors more gracefully (fixes #581 ) * Make it a runtime error, not a logic error * Include additional information * Capture it properly in checkLinearization	2021-12-10 15:37:35 -05:00
Jay Berkenbilt	1c62c2a342	C API: expose functions for indirect objects (fixes #588 )	2021-12-10 14:57:35 -05:00
Jay Berkenbilt	72c10d8617	C API: overhaul error handling * Handle error conditions that occur when using the object handle interfaces. In the past, some exceptions were not correctly converted to errors or warnings. * Add more detailed information to qpdf-c.h * Make it possible to work more explicitly with uninitialized objects	2021-12-10 12:16:02 -05:00
Jay Berkenbilt	3340dbe976	Use a specific error code for type warnings and clarify docs	2021-12-10 11:15:49 -05:00
Jay Berkenbilt	b2b2a175c4	Add missing unit test for register progress reporter in C API It was exercised in the pdf-linearize example but not in qpdf-ctest.	2021-12-10 09:11:56 -05:00
Jay Berkenbilt	1faa21502f	Refactor trap_errors to use std::function	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	e3cc171d02	C API: qpdf_oh_is_initialized	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	bef2c2222a	C API: qpdf_get_last_string_length	2021-12-09 10:33:31 -05:00
m-holger	b4fc9eb700	C-API expose new_object as qpdf_oh_new_object	2021-12-02 13:59:58 -05:00
Jay Berkenbilt	720ce9e8f3	Improve testing and error handling around operating before processing	2021-11-29 07:42:36 -05:00
Jay Berkenbilt	ac17308cf6	Initialize QPDF::Members::file (fixes #584 )	2021-11-29 07:16:34 -05:00
m-holger	4630b8567c	Ensure qpdf_oh handles returned by C-API functions are unique. Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array. Update some doc comments in qpdf-c.h.	2021-11-19 13:31:59 +00:00
Jay Berkenbilt	ce7db05d22	Prepare 10.4.0 release	2021-11-16 15:44:09 -05:00
Jay Berkenbilt	750aca5b94	First increment of improving handling of weak crypto (fixes #358 )	2021-11-11 12:24:15 -05:00
Jay Berkenbilt	f45dacf4cb	Make recovery logic flexible about where objects end (fixes #573 ) Don't assume endobj is at the beginning of the line. This means we are looking at tokens for every line, but the odds of n n obj appearing in the middle of the object are likely much lower than endobj not being at the beginning of the line or missing entirely. This will probably have a negative impact on recovery time for very large files. Hopefully it will be worth it.	2021-11-07 15:27:22 -05:00
Jay Berkenbilt	3794f8e2ad	Support OpenSSL 3 (fixes #568 )	2021-11-04 18:24:54 -04:00
Jay Berkenbilt	a84a0b2487	Add range check in QPDFNumberTreeObjectHelper (fuzz issue 37740)	2021-11-04 14:03:24 -04:00
Jay Berkenbilt	4a648b9a00	Fix bug in merging resources /DR from foreign AcroForm (fixes #548 ) When making resources indirect in from_dr, the code was using the wrong owning QPDF, forgetting that from_dr had already been copied using CopyForeignObject.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	9b28933647	Check object ownership when adding When adding a QPDFObjectHandle to an array or dictionary, if possible, check if the new object belongs to the same QPDF. This makes it much easier to find incorrect code than waiting for the situation to be detected when the file is written.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	33a47d5c3c	Make QPDF::findPage public (fixes #516 ) This was originally not public because I wanted to get rid of the pages cache, but I recently realized there were deep reasons not to do that, and the author of pikepdf wanted this, so I decided to make it public.	2021-11-03 09:43:17 -04:00
Jay Berkenbilt	532a4f3d60	Detect recoverable but invalid zlib data streams (fixes #562 )	2021-11-03 09:43:17 -04:00
Fredrik Fornwall	e0775238b8	Fix QPDFEFStreamObjectHelper::{get,set}Subtype The /Subtype entry that specifies the mime type of an embedded file is inside the embedded file stream dictionary directly, not in the parameter dictionary. See Table 45 and 46 in the PDF 1.7 specification: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112	2021-09-10 10:02:24 -04:00
Jay Berkenbilt	3cacb27a90	Performance fix on preserveObjectStreams	2021-05-09 07:51:14 -04:00
Jay Berkenbilt	bddebdb0ea	Prepare 10.3.2 release	2021-05-08 10:41:14 -04:00
Jay Berkenbilt	30ac51bc78	Exclude unreferenced objects in object streams (fixes #520 )	2021-05-08 09:42:09 -04:00
Zdenek Dohnal	16c19e9424	libqpdf/Pl_AES_PDF.cc: remove duplicated if branch Check for this->encrypt seems to be moved to plugged crypto implementations, so it can be removed from Pl_AES_PDF.cc.	2021-04-29 09:42:38 -04:00
Jay Berkenbilt	36c7c20819	Fix timezone portability issue (fixes #515 )	2021-04-17 18:12:55 -04:00
Jay Berkenbilt	8971443e46	QPDF::addPage*: handle duplicate pages more robustly	2021-04-05 10:58:10 -04:00
Jay Berkenbilt	ec48820c3c	Fix loop detection in NNTree	2021-04-05 07:59:02 -04:00
Jay Berkenbilt	258675fc99	Move ABI comment to the right place	2021-04-03 11:43:08 -04:00
Jay Berkenbilt	a77f58142d	Remove some assertions that are not necessarily true (fixes #514 ) Operations that add the same object to multiple places in the pages tree are throwing exceptions and then later causing assertion failures. The assert calls shouldn't be there.	2021-03-21 19:35:23 -04:00
Jay Berkenbilt	3f05429cc5	Prepare 10.3.1 release	2021-03-11 12:59:41 -05:00
Jay Berkenbilt	85884c363c	Allow /DR to be direct in /AcroForm Also handle direct annotation, though this is much less likely.	2021-03-11 11:43:38 -05:00
Jay Berkenbilt	dc65b88457	Prepare 10.3.0 release	2021-03-05 06:15:48 -05:00
Jay Berkenbilt	cb6e53136f	QPDFAcroFormDocumentHelper: add missing analyze calls	2021-03-04 18:11:44 -05:00
Jay Berkenbilt	0b77f2cf26	Revert non-binary-compatible handleWarning change -- see TODO (ABI)	2021-03-04 15:59:46 -05:00
Jay Berkenbilt	f68e25c7f2	Don't use handleWarning, which is being reverted	2021-03-04 15:59:45 -05:00
Jay Berkenbilt	9fb174b9e9	Major rework of handling form fields when copying pages (fixes #509 )	2021-03-04 15:08:37 -05:00
Jay Berkenbilt	887f35efaa	When resolving font from /DR, copy it into resources	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	a2124f992c	Add QPDFMatrix::operator==	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	552303a94a	Check for reserved after dereference	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	d7ffdfa994	Add optional conflict detection to mergeResources Also improve behavior around direct vs. indirect resources.	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	e17585c2d2	Remove unreferenced: ignore names that are not Fonts or XObjects Converted ResourceFinder to ParserCallbacks so we can better detect the name that precedes various operators and use the operators to sort the names into resource types. This enables us to be smarter about detecting unreferenced resources in pages and also sets the stage for reconciling differences in /DR across documents.	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	a15ec6967d	Enhancements to ParserCallbacks	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	1bb209a9bf	Add QPDF::numWarnings	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	37fcc5ff71	Create ResourceFinder from NameWatcher in QPDFPageObjectHelper	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	b444ab3352	Fix typos in coverage cases	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	fa2516df71	Fix behavior for finding /Q, /DA, and /DR for form fields If not found in the field hierarchy, /Q and /DA are supposed to be looked up in the document-level form dictionary. /DR is supposed to only come from the document dictionary.	2021-03-03 17:05:19 -05:00
Jay Berkenbilt	a4d6589ff2	Have QPDFObjectHandle notice when replaceObject was called This results in a performance penalty of 1% to 2% when replaceObject and swapObjects are never called and a somewhat larger penalty if they are called, but it's worth it to avoid very confusing behavior as discussed in depth in qpdf#507.	2021-02-25 07:32:46 -05:00
Jay Berkenbilt	ec6719fd25	Always call dereference() before querying obj pointer	2021-02-25 07:31:26 -05:00
Jay Berkenbilt	b5e937397c	Prepare 10.2.0 release	2021-02-23 10:41:58 -05:00
Jay Berkenbilt	1886673d7e	Spell check	2021-02-23 10:38:05 -05:00
Jay Berkenbilt	9e00be7ffa	Remove warning that gives false positives in some normal cases	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	be3a8c0e7a	Keep only referenced form fields in --pages	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	83216e640c	Preserve form fields when splitting pages (fixes #340 )	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	1f35ec9988	Add methods for copying form fields	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	8e8c0d8290	Add new placeFormXObject that takes a matrix reference	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	61d41e2e88	Add copyAnnotations, use with overlay/underlay (fixes #395 )	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	7b3cbacf5d	Change from QPDF{Array,Dict}Items to aitems() and ditems()	2021-02-22 11:05:39 -05:00
Jay Berkenbilt	a9ae8cadc6	Add transformAnnotations and fix flattenRotations to use it	2021-02-21 17:13:09 -05:00
Jay Berkenbilt	a76decd2d5	Add QPDFObjGen::unparse	2021-02-21 16:21:52 -05:00
Jay Berkenbilt	7540d2082a	Explicitly override inherited rotate in flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	e899926e0d	Use QPDFMatrix inside flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	92fbc6fdf5	QPDFObjectHandle::copyStream	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	60afe4142e	Refactor: separate copyStreamData from replaceForeignIndirectObjects	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	15269f36d8	addFormField: update cache rather than invalidating	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	901f1a788c	Enhance QPDFMatrix API	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	05eb5826d8	Fix isPagesObject and isPageObject There are lots of things with /Kids that are not pages. Repair the pages tree, then do a reliable check.	2021-02-20 19:42:41 -05:00
Jay Berkenbilt	35dd11f356	Allow --rotate=0	2021-02-20 16:29:34 -05:00
Jay Berkenbilt	71e8627285	Add const versions of QPDFMatrix::transform*	2021-02-19 18:35:19 -05:00
Jay Berkenbilt	de8929a41c	Add QPDFAcroFormDocumentHelper::addFormField	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	5cec6b4c3d	Add QPDFPageObjectHelper::getMatrixForFormXObjectPlacement	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	0765872295	Form field for non-widget just returns null	2021-02-18 10:25:07 -05:00
Jay Berkenbilt	0b1623d07d	Add QUtil::path_basename	2021-02-18 09:59:03 -05:00
Jay Berkenbilt	a773f4c71d	Add QPDFObjectHandle::parse for strings with context	2021-02-15 11:33:03 -05:00
Jay Berkenbilt	7eb903d9aa	Use functional replaceStreamData	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	efbb21673c	Add functional versions of QPDFObjectHandle::replaceStreamData Also fix a bug in checking consistency of length for stream data providers. Length should not be checked or recorded if the provider says it failed to generate the data.	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	e2593e2efe	Move QPDFMatrix into the public API	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	07f40bd254	QUtil::double_to_string: trim trailing zeroes with option to disable	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	8fbc8579f2	Allow zone information to be omitted from timestamp strings	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	df067c9ab6	Add autoconf test for localtime_r	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	1b3f84f967	Require C++14 instead of C++11	2021-02-10 16:27:58 -05:00
Jay Berkenbilt	9fcf61b2f6	Fix loop in QPDFOutlineDocumentHelper (fuzz issue 30507)	2021-02-10 16:27:44 -05:00
Jay Berkenbilt	4d1f2fdcac	Update to new name/number tree API	2021-02-10 15:46:20 -05:00
Jay Berkenbilt	1f4771cd0d	Minor clean up of Windows headers	2021-02-10 07:36:18 -05:00
Jay Berkenbilt	ad34b9c278	Implement helpers for file attachments	2021-02-10 06:57:37 -05:00
Jay Berkenbilt	bf0e6eb302	Add QUtil methods for dealing with PDF timestamp strings	2021-02-09 17:50:24 -05:00
Jay Berkenbilt	bfbeec5497	Make newly created name/number trees indirect objects	2021-02-08 06:49:56 -05:00
Jay Berkenbilt	553ac7f353	Add QUtil::pipe_file and QUtil::file_provider	2021-02-07 19:41:34 -05:00
Jay Berkenbilt	e076c9bf08	Remove erroneous handling of /EFF for stream decryption I thought /EFF was supposed to be used as a default for decrypting embedded file streams, but actually it's supposed to be advice to a conforming writer about handling new ones. This makes sense since the findAttachmentStreams code, which is not actually needed, was never right.	2021-02-06 17:08:41 -05:00
Jay Berkenbilt	ac2b3b96e1	Make wrong object stream type a warning	2021-02-06 14:29:11 -05:00
Jay Berkenbilt	faa2e3ddfd	Handle older PDFs whose form XObjects inherit resources (fixes #494 ) When removing unreferenced resources, notice if a page (recursively) contains a form XObject with unreferenced resources, and count any such resources as referenced by the page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	81025e4998	Refactor removal of unreferenced resources Refactor in preparation for resolving unresolved resources in form xobjects from page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	9c9ce64eec	Handle strings in inline image dictionaries We need to use token.getRawValue, not token.getValue	2021-01-31 07:50:03 -05:00
Jay Berkenbilt	178f995fc2	Recover from exceptions during filtering for inline images	2021-01-31 07:49:08 -05:00
Jay Berkenbilt	4ae93a73c5	Improve memory safety of dict/array iterators	2021-01-31 07:16:03 -05:00
Jay Berkenbilt	de0b11fc47	Add C++ iterator API around array and dictionary objects	2021-01-30 15:15:23 -05:00
Jay Berkenbilt	35e7859bc7	Make QPDFObjectHandle::is* return false for uninitialized objects	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	50decc9bb8	name/number tree: explicitly declare default destructors	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	8ed3e8c79b	NNTree: rework iterators to be more memory efficient Keep a std::pair internal to the iterators so that operator* can return a reference and operator-> can work, and each can work without copying pairs of objects around.	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	e7e20772ed	name/number trees: remove	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	5816fb44b8	name/number trees: insertAfter	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	16a9bb3f6f	name/number trees: newEmpty, increment/decrement end()	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	b5614f611d	Implement repair and insert for name/number trees	2021-01-24 19:31:45 -05:00
Jay Berkenbilt	04edfe9fad	QPDFObjectHandle::newUnicodeString to use UTF-16 only when needed Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of encoding the string.	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	63e5cb533d	Use new QPDF{Name,Number}TreeObjectHelper API	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	d61ffb65d0	Add new constructors for name/number tree helpers Add constructors that take a QPDF object so we can issue warnings and create new indirect objects.	2021-01-24 03:27:26 -05:00
Jay Berkenbilt	ba814703fb	Use QPDFNameTreeObjectHelper's iterator directly	2021-01-24 03:25:11 -05:00
Jay Berkenbilt	5f0708418a	Add iterators to name/number tree helpers	2021-01-24 03:22:59 -05:00
Jay Berkenbilt	4a1cce0a47	Reimplement name and number tree object helpers Create a computationally and memory efficient implementation of name and number trees that does binary searches as intended by the data structure rather than loading into a map, which can use a great deal of memory and can be very slow.	2021-01-24 03:22:51 -05:00
Jay Berkenbilt	6226b69dba	Add warn() to QPDF's public API	2021-01-16 18:41:53 -05:00
Jay Berkenbilt	fc88837d4b	Treat /EmbeddedFiles as a proper name tree If we ever had an encrypted file with different filters for attachments and either the /EmbeddedFiles name tree was deep or some of the file specs didn't have /Type, we would have overlooked those as attachment streams. The code now properly handles /EmbeddedFiles as a name tree.	2021-01-11 10:50:44 -05:00
Jay Berkenbilt	6fe7b704c7	Warn rather than segv on access after closing input source (fixes #495 )	2021-01-06 10:11:34 -05:00
Jay Berkenbilt	0fed040392	Prepare version 10.1.0	2021-01-04 16:59:55 -05:00
Jay Berkenbilt	18340b8835	Spell check	2021-01-04 16:26:58 -05:00
Jay Berkenbilt	dc92574c10	Fix some pipelines to be safe if downstream write fails (fuzz issue 28262)	2021-01-04 15:17:35 -05:00
Jay Berkenbilt	ba6b6aacf1	Fix outdated comment	2021-01-03 15:59:49 -05:00
Jay Berkenbilt	3be58f49e5	Make more QPDFPageObjectHelper methods work with form XObject	2021-01-02 14:08:53 -05:00
Jay Berkenbilt	98da4fd835	Externalize inline images now includes form XObjects	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	bedf35d6a5	Bug fix: avoid extraneous pipeline finish calls with multiple contents Avoid calling finish() multiple times on the pipeline passed to pipeContentStreams. This commit also fixes a bug in which qpdf was not exiting with the proper exit status if warnings found while splitting pages; this was exposed by a test case that changed.	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	a139d2b36d	Add several methods for working with form XObjects (fixes #436 ) Make some more methods in QPDFPageObjectHelper work with form XObjects, provide forEach methods to walk through nested form XObjects, possibly recursively. This should make it easier to work with form XObjects from user code.	2021-01-02 12:29:31 -05:00
Jay Berkenbilt	6154221edb	QPDFPageObjectHelper: filterPageContents -> filterContents + form XObject	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	63ea46193d	QPDFPageObjectHelper: getPageImages -> getImages	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	e7a8554563	QPDFPageObjectHelper::getPageImages: support form XObjects	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	1562d34c09	Add QPDFObjectHandle::isFormXObject	2021-01-01 07:36:10 -05:00
Jay Berkenbilt	c9271335fa	Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation	2020-12-30 13:03:55 -05:00
Jay Berkenbilt	12ecd2019a	Add QPDFObjectHandle::setFilterOnWrite	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	3f9191a344	Add ostream << for QPDFObjGen	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	858c7b89bc	Let optimize filter stream parameters instead of making them direct Also removes preclusion of stream references in stream parameters of filterable streams and reduces write times by about 8% by eliminating an extra traversal of the objects.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	1a62cce940	Restructure optimize to allow skipping parameters of filtered streams	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	09027344b9	Refactor: separate code that determines whether to filter a stream	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	39bfa01307	Implement user-provided stream filters Refactor QPDF_Stream to use stream filter classes to handle supported stream filters as well.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	cc8895078a	Add QPDFObjectHandle::makeDirect(bool allow_streams)	2020-12-26 08:48:18 -05:00
Jay Berkenbilt	573b6eb8b1	Provide qpdf write progress reporting from C API (fixes #487 )	2020-12-20 14:43:24 -05:00
Jay Berkenbilt	2050977099	Add QPDFObjectHandle manipulation to C API	2020-11-28 19:48:07 -05:00
Jay Berkenbilt	78b9d6bfd4	Prepare 10.0.4 release	2020-11-21 13:50:02 -05:00
Jay Berkenbilt	bd79138c84	Treat direct page as runtime rather than logic error (fuzz issue 27393)	2020-11-11 09:50:43 -05:00
Jay Berkenbilt	47f4ebcdac	Ignore unused field in xref entry, avoiding range error (fixes #482 )	2020-11-04 07:46:46 -05:00
Jay Berkenbilt	fbe40b800d	Prepare 10.0.3 release	2020-10-31 13:47:03 -04:00
Jay Berkenbilt	6971f78ff6	Fix stack overflow on direct root (fuzz issue 26761)	2020-10-31 13:10:39 -04:00
Jay Berkenbilt	ffe6af6f77	Add comments explaining the foreign object copying code These are the comments I would have liked to have been able to read while fixing #449 and #478.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	96767fb104	Fix foreign stream copying bug (fixes #478 ) This reverts an incorrect fix to #449 and codes it properly. The real problem was that we were looking at the local dictionaries rather than the foreign dictionaries when saving the foreign stream data. In the case of direct objects, these happened to be the same, but in the case of indirect objects, the object references could be pointing anywhere since object numbers don't match up between the old and new files.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	da7540794a	Prepare 10.0.2 release	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	09bd1fafb1	Improve efficiency of number to string conversion	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	bcea54fcaa	Revert removal of unreadCh change for performance Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR). Update comments and code to reflect this.	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	b30deaeeab	Avoid merging adjacent tokens when concatenating contents (fixes #444 )	2020-10-23 08:00:04 -04:00
Jay Berkenbilt	8a11feacc3	Avoid leak by resolving object streams more than once (fuzz issue 23642)	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	30bb4c64ee	Minor code cleanup * Return rather than exiting from realmain in qpdf.cc * Remove extraneous blank line * Don't assign temporary to const reference	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	232f5fc9f3	Handle jpeg library fuzz false positives The jpeg library has some assembly code that is missed by the compiler instrumentation used by memory sanitization. There is a runtime environment variable that is used to work around this issue.	2020-10-22 06:31:52 -04:00
Jay Berkenbilt	c1684eae91	Check for overflow in page labels (fuzz issue 23599)	2020-10-22 05:49:24 -04:00
Jay Berkenbilt	7f4a4df919	Add range_check method to QIntC	2020-10-22 05:48:40 -04:00
Jay Berkenbilt	24196c08cb	Fix loop detection error (fuzz issue 23172)	2020-10-22 05:48:35 -04:00
Jay Berkenbilt	956c8f6432	Obscure bug fix copying foreign streams in special cases (fixes #449 ) Specifically, if a stream had its stream data replaced and had indirect /Filter or /DecodeParms, it would result in non-silent loss of data and/or internal error.	2020-10-21 19:23:23 -04:00
Jay Berkenbilt	98f6c00dad	Protect numeric conversion against user's locale (fixes #459 )	2020-10-21 16:42:51 -04:00
Jay Berkenbilt	bed165c9fc	Stop using InputSource::unreadCh	2020-10-18 07:43:05 -04:00
Dean Scarff	153060a0c5	Check integer overflow in resolveObjectsInStream Fixes a crash found by fuzzing.	2020-10-16 20:09:24 -04:00
Dean Scarff	9a3791c53b	Properly detect OPENSSL_IS_BORINGSSL OPENSSL_IS_BORINGSSL is not actually set by configure, so it will be undefined until a BoringSSL header is included. Hence the #ifdef logic in QPDFCrypto_openssl.h would usually never apply. This still worked because evp.h transitively included BoringSSL's cipher.h and digest.h, but the latter are the correct (documented) headers. By re-ordering the includes, we can ensure the macro is defined when we use it. Also: fix case in the header guards.	2020-10-16 20:04:36 -04:00
Dean Scarff	2ff84aa2c9	Include detailed OpenSSL error messages Fixes qpdf/qpdf#450	2020-10-16 19:58:11 -04:00
James R. Barlow	3fc7c99d02	Replace memchr with manual memory search On large files with predominantly \n line endings, memchr(..'\r'..) seems to waste a considerable amount of time searching for a line ending candidate that we don't need. On the Adobe PDF Reference Manual 1.7, this commit is 8x faster at QPDF::processMemoryFile().	2020-10-16 19:57:29 -04:00
oltolm	3221022fc9	fix WindowsCryptProvider fixes #432	2020-10-16 19:56:33 -04:00
Jay Berkenbilt	ff65e272a8	Fix printf formatting for newer msvc Use autoconf rather than ifdefs to determine what format string to use for long long.	2020-10-16 07:02:23 -04:00
Jay Berkenbilt	88b8f8ec86	Remove redundant check found by lgtm.com	2020-10-15 14:47:43 -04:00
Jay Berkenbilt	26514ab731	Write linearization errors to stderr (fixes #438 )	2020-04-29 17:33:34 -04:00
Jay Berkenbilt	92d3cbecd4	Fix warnings reported by -Wshadow=local (fixes #431 )	2020-04-16 12:41:43 -04:00
Jay Berkenbilt	578c5ac66c	Use more references when iterating When possible, use `for (auto&` or `for (auto const&` when iterating using C++-11 style iterators.	2020-04-10 13:30:33 -04:00
Jay Berkenbilt	821a701851	Prepare 10.0.1 release	2020-04-09 11:48:26 -04:00
Jay Berkenbilt	1a7d3700a6	Fix unnecessary copies in auto iter (fixes #426 ) Also switch to colon-style iteration in some cases. Thanks to Dean Scarff for drawing this to my attention after detecting some unnecessary copies with https://clang.llvm.org/extra/clang-tidy/checks/performance-for-range-copy.html	2020-04-08 20:45:26 -04:00
Jay Berkenbilt	4977a7efa5	Bug fix: getStreamData should on unfilterable stream (fixes #425 )	2020-04-08 18:52:04 -04:00
Jay Berkenbilt	1e629c278a	Prepare 10.0.0 release	2020-04-06 11:30:15 -04:00
Jay Berkenbilt	c996f4ac33	Don't include <cwchar> if not building with wchar	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	77198d5310	Delegate random number generation to crypto provider (fixes #418 )	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	52749b85df	Make random data provider code thread-safe This uses C++-11 thread-safe static initializers now.	2020-04-06 10:00:43 -04:00
Jay Berkenbilt	619d294e9d	Remove QUtil::srandom	2020-04-06 09:49:02 -04:00
Dean Scarff	0f2507234f	Add OpenSSL/BoringSSL crypto provider Fixes qpdf/qpdf#417	2020-04-06 09:01:55 -04:00
Jay Berkenbilt	893d38b87e	Allow propagation of errors and retry through StreamDataProvider StreamDataProvider::provideStreamData now has a rich enough API for it to effectively proxy to pipeStreamData.	2020-04-05 20:07:13 -04:00
Jay Berkenbilt	7246404177	JSON: implement pattern keys in schema	2020-04-04 18:06:32 -04:00
Dean Scarff	c5c1a028cd	Use deterministic assignments for unique_id Fixes qpdf/qpdf#419	2020-04-04 08:29:28 -04:00
Jay Berkenbilt	2100b4ce15	Allow qpdf to be built on systems without wchar_t (fixes #406 )	2020-04-03 21:39:44 -04:00
Jay Berkenbilt	6a4117add9	Avoid potential segfault in warning methods	2020-04-03 21:39:20 -04:00
Jay Berkenbilt	4f3b89991b	placeFormXObject: allow control of shrink/expand (fixes #409 )	2020-04-03 21:39:17 -04:00
Jay Berkenbilt	b76b73b229	C API: accept any non-zero value as TRUE	2020-04-03 17:33:44 -04:00
Jay Berkenbilt	54726930df	Remove redundant methods in QUtil This was being saved until we had to break ABI.	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	5806e5c60c	QPDFPageObjectHelper::placeFormXObject: use std::string const& (fixes #374 )	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	97de12343b	Performance: remove Members indirection for Pipeline	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	bfda941519	Use an unordered map for SparseOHArray for efficiency This was added in C++11.	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	ee271fd2f2	Use auto for iterating over sparse array	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	70665cb381	Internally use unsafeShallowCopy where we can	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	38afdcea7b	Add QPDFObjectHandle::unsafeShallowCopy	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	07afb668b1	Performance: remove indirection through Members for QPDFObject	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	89f19b7099	Performance: remove Members indirection for QPDFObjectHandle	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	dac65a21fb	Look in form XObjects when removing unreferenced resources (fixes #373 ) If a page contains a form XObject, also filter the form XObject and remove its unreferenced resources.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	278710fbe8	Refactor QPDFPageObjectHelper::removeUnreferencedResources() Refactor removeUnreferencedResources to prepare for filtering form XObjects.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	bb6768b8f0	Include header for wcslen (fixes #405 )	2020-02-29 08:43:33 -05:00
Jay Berkenbilt	bb3137296d	Handle root /Pages pointing to other than page tree root (fixes #398 )	2020-02-22 11:10:31 -05:00
Jay Berkenbilt	52a2e95dd5	Prepare 9.1.1 release	2020-01-26 18:49:04 -05:00
Jay Berkenbilt	57c01ef81f	In qdf mode, don't write extra XRef streams (fixes #386 ) fix-qdf assumes there is exactly one XRef stream and that it is at the end of the file.	2020-01-26 16:50:57 -05:00
Jay Berkenbilt	bbc2f8ffae	Bug fix: handle ColorSpace lookup for inline images (fixes #392 ) If the value of /CS in the inline image dictionary is a key in the page's /Resource -> /ColorSpace dictionary, properly resolve it by referencing the proper colorspace, and not just the name, in the external image dictionary.	2020-01-26 15:29:10 -05:00
Cloudmersive	a8b6ff5763	Fix for Windows unable to acquire crypt context with new keyset (fixes #387 ) Fix is based on guidance https://support.microsoft.com/en-us/help/238187/cryptacquirecontext-use-and-troubleshooting and is the proper fix for #285/#286	2020-01-14 18:45:54 -05:00
Jay Berkenbilt	a44b5a34a0	Pull wmain -> main code from qpdf.cc into QUtil.cc	2020-01-14 11:40:51 -05:00
Jay Berkenbilt	ab4061f1ee	Add error detection for read_lines_from_file(FILE*)	2020-01-14 11:07:09 -05:00
Jay Berkenbilt	211a7f57be	QUtil::read_lines_from_file: optional EOL preservation	2020-01-13 11:26:18 -05:00
Jay Berkenbilt	9a398504ca	Refactor QUtil::read_lines_from_file This commit adds the preserve_eol flags but doesn't implement EOL preservation yet.	2020-01-13 09:19:53 -05:00
Jay Berkenbilt	9b0c6022d7	Prepare 9.1.0 release	2019-11-16 22:29:54 -05:00
Jay Berkenbilt	5e6dfc938e	Prepare 9.1.rc1 release	2019-11-09 22:00:53 -05:00
Jay Berkenbilt	c4478e5249	Allow odd/even modifiers in numeric range (fixes #364 )	2019-11-09 13:23:12 -05:00
Jay Berkenbilt	5508f74603	Allow /P in encryption dictionary to be positive (fixes #382 ) Even though this is disallowed by the spec, files like this have been encountered in the wild.	2019-11-09 12:33:15 -05:00
Jay Berkenbilt	127a957aee	Allow runtime inspection/override of crypto provider	2019-11-09 09:53:42 -05:00
Jay Berkenbilt	88bedb41fe	Implement gnutls crypto provider (fixes #218 ) Thanks to Zdenek Dohnal <zdohnal@redhat.com> for contributing the code used for the gnutls crypto provider.	2019-11-09 09:53:38 -05:00
Jay Berkenbilt	cc14523440	Update autoconf to support crypto selection	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d0a53cd3ea	Fix typos in configure.ac	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c03ced09c0	Isolate source files used for native crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d1ffe46c04	AES_PDF: move CBC logic from pipeline to AES_PDF implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c8cda4f965	AES_PDF: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	bb427bd117	SHA2: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	eadc222ff9	Rename SHA2 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	4287fcc002	RC4: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	0cdcd10228	Rename RC4 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	ce8f9b6608	MD5: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	5c3e856e9f	Rename MD5 implementation (non-bisectable) Just rename MD5 -> MD5_native in place so that git annotate will show the lines as having originated there.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	2de41856a0	QPDFCryptoProvider: initial implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	700f5b961e	Remove int type checks -- subsumed by C++-11	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	653ce3550d	Require C++-11 Includes updates to m4/ax_cxx_compile_stdcxx.m4 to make it work with msvc, which supports C++-11 with no flags but doesn't set __cplusplus to a recent value.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	9094fb1f8e	Fix two additional fuzz test cases	2019-11-03 18:59:12 -05:00
Masamichi Hosoda	5a842792b6	Parse Contents in signature dictionary without encryption Various PDF digital signing tools do not encrypt /Contents value in signature dictionary. Adobe Acrobat Reader DC can handle a PDF with the /Contents value not encrypted. Write Contents in signature dictionary without encryption Tests ensure that string /Contents are not handled specially when not found in sig dicts.	2019-10-22 16:20:21 -04:00
Masamichi Hosoda	cdc46d78f4	Add QPDFObject::getParsedOffset()	2019-10-22 16:19:06 -04:00
Masamichi Hosoda	50b329ee9f	Add QPDFWriter::getWrittenXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5cf4090aee	Add QPDFWriter::getRenumberedObjGen()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	46ac3e21b3	Add QPDF::getXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	06b818dcd3	Exclude signature dictionary from compressible objects It seems better not to compress signature dictionaries. Various PDF digital signing tools, including Adobe Acrobat Reader DC, do not compress signature dictionaries. Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that /ByteRange in the signature dictionary shall be used to describe a digest that does not include the signature value (/Contents) itself. The byte ranges cannot be determined if the dictionary is compressed.	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5e0ba12687	Fix /Contents value representation in a signature dictionary Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that the value of Contents entry is a hexadecimal string representation when ByteRange is specified. This commit makes QPDF always use hexadecimal string representation instead of literal strings for it.	2019-10-22 16:16:16 -04:00
Jay Berkenbilt	3094955dee	Prepare 9.0.2 release	2019-10-12 19:37:40 -04:00
Jay Berkenbilt	4ea940b03c	Prepare 9.0.1 release	2019-09-20 07:38:18 -04:00
Jay Berkenbilt	685250d7d6	Correct reversed Rectangle coordinates (fixes #363 )	2019-09-19 21:25:34 -04:00
Jay Berkenbilt	48b7de2cc3	Fix typo in comment	2019-09-19 21:04:32 -04:00
Jay Berkenbilt	8b1e307741	Warn for duplicated dictionary keys (fixes #345 )	2019-09-19 20:22:34 -04:00
Jay Berkenbilt	bb83e65193	Fix fuzz issue 16953 (overflow checking in xref stream index)	2019-09-17 19:48:47 -04:00
Jay Berkenbilt	17d431dfd5	Fix integer type warnings for big-endian systems	2019-09-17 19:14:27 -04:00
Jay Berkenbilt	5462dfce31	Prepare 9.0.0 release	2019-08-31 20:07:36 -04:00
Jay Berkenbilt	babd12c9b2	Add methods QPDF::anyWarnings and QPDF::closeInputSource	2019-08-31 15:51:20 -04:00
Jay Berkenbilt	4fa7b1eb60	Add remove_file and rename_file to QUtil	2019-08-31 15:51:04 -04:00
Jay Berkenbilt	0e51a9aca6	Don't encrypt trailer, fixes fuzz issue 15983 Ordinarily the trailer doesn't contain any strings, so this is usually a non-issue, but if the trailer contains strings, linearizing and encrypting with object streams would include encrypted strings in the trailer, which would blow out the padding because encrypted strings are longer than their cleartext counterparts.	2019-08-28 23:06:32 -04:00
Jay Berkenbilt	47a38a942d	Detect stream in object stream, fixing fuzz 16214 It's detected in QPDFWriter instead of at parse time because I can't figure out how to construct a test case in a reasonable time. This commit moves the fuzz file into the regular test suite for a QTC coverage case.	2019-08-28 12:49:04 -04:00
Jay Berkenbilt	ba5fb69164	Make popping pipeline stack safer Use destructors to pop the pipeline stack, and ensure that code that pops the stack is actually popping the intended thing.	2019-08-27 22:27:47 -04:00
Jay Berkenbilt	dadf8307c8	Fix fuzz issues 15316 and 15390	2019-08-27 20:39:06 -04:00
Jay Berkenbilt	456c285b02	Fix fuzz issue 16172 (overflow checking in OffsetInputSource)	2019-08-27 13:08:07 -04:00
Jay Berkenbilt	ad8081daf5	Fix fuzz issue 15442 (overflow checking in BufferInputSource)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	9a095c5c76	Seek in two stages to avoid overflow When seeking to a position based on a value read from the input, we are prone to integer overflow (fuzz issue 15442). Seek in two stages to move the overflow check into the input source code.	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	ac5e6de2e8	Fix fuzz issue 15387 (overflow checking xref size)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	6bc4cc3d48	Fix fuzz issue 15475	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	94e86e2528	Fix fuzz issue 16301	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	5da146c8b5	Track separately whether password was user/owner (fixes #159 )	2019-08-24 11:01:19 -04:00
Jay Berkenbilt	5a0aef55a0	Split long line	2019-08-24 10:58:51 -04:00
Jay Berkenbilt	2794bfb1a6	Add flags to control zlib compression level (fixes #113 )	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	dac0598b94	Add ability to set zlib compression level globally	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	3f1ab64066	Pass offset and length to ParserCallbacks::handleObject	2019-08-22 22:54:29 -04:00
Jay Berkenbilt	4b2e72c4cd	Test for direct, rather than resolved nulls in parser Just because we know an indirect reference is null, doesn't mean we shouldn't keep it indirect.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	3f3dbe22ea	Remove array null flattening For some reason, qpdf from the beginning was replacing indirect references to null with literal null in arrays even after removing the old behavior of flattening scalar references. This seems like a bad idea.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	225cd9dac2	Protect against coding error of re-entrant parsing	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	ae5bd7102d	Accept extraneous space before xref (fixes #341 )	2019-08-19 22:24:53 -04:00
Jay Berkenbilt	8a9086a689	Accept extraneous space after stream keyword (fixes #329 )	2019-08-19 21:43:44 -04:00
Jay Berkenbilt	43f91f58b8	Improve invalid name token warning message This message used to only appear for PDF >= 1.2. The invalid name is valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer version, it's better to detect and warn in all cases. Therefore make the warning more informative.	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	42d396f1dd	Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332 )	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	d9dd99eca3	Attempt to repair /Type key in pages nodes (fixes #349 )	2019-08-18 18:54:37 -04:00
Jay Berkenbilt	522d2b2227	Improve efficiency of fixDanglingReferences	2019-08-18 09:00:40 -04:00
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e5f504b6c5	Use SparseOHArray in QPDF_Array	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Thorsten Schöning	8f06da7534	Change list to vector for outline helpers (fixes #297 ) This change works around STL problems with Embarcadero C++ Builder version 10.2, but std::vector is more common than std::list in qpdf, and this is a relatively new API, so an API change is tolerable. Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com> for the fix.	2019-07-03 20:08:47 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	201e8798d7	Convert previously overlooked static cast to QIntC	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	04f45cf652	Treat all linearization errors as warnings This also reverts the addition of a new checkLinearization that distinguishes errors from warnings. There's no practical distinction between what was considered an error and what was considered a warning.	2019-06-23 13:45:45 -04:00
Jay Berkenbilt	c5ed1b8075	Handle invalid encryption Length (fixes #333 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	551dfbf697	Allow set*EncryptionParameters before filename is set (fixes #336 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	7bd38a3eb3	Provide error message in Windows crypto code (fixes #286 ) Thanks to github user zdenop for supplying some additional error-handling code.	2019-06-22 17:12:01 -04:00
Jay Berkenbilt	6c39aa8763	In shippable code, favor smart pointers (fixes #235 ) Use PointerHolder in several places where manual memory allocation and deallocation were being used. This helps to protect against memory leaks when exceptions are thrown in surprising places.	2019-06-22 16:57:52 -04:00
Jay Berkenbilt	85a3f95a89	qpdf: exit 3 for linearization warnings without errors (fixes #50 )	2019-06-22 16:57:51 -04:00
Jay Berkenbilt	1bde5c68a3	Add QUtil::read_file_into_memory This code was essentially duplicated between test_driver and standalone_fuzz_target_runner.	2019-06-22 10:14:25 -04:00
Jay Berkenbilt	658b5bb3be	QPDFWriter: clean up overloaded functions In a small number of cases, it makes sense to replace an overloaded function with a function that takes a default argument. We can do this now because we've already broken binary compatibility since the last release.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	79f6b4823b	Convert remaining public classes to use Members pattern Have classes contain only a single private member of type PointerHolder<Members>. This makes it safe to change the structure of the Members class without breaking binary compatibility. Many of the classes already follow this pattern quite successfully. This brings in the rest of the classes that are part of the public API.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	45dac410b5	Remove broken QPDFTokenizer::expectInlineImage	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	25dd3c6750	Remove QPDF::copyForeignObject with unused parameter	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	c6cfd64503	Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242 )	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	a35d4ce9cc	Fix bounds error in utf16_to_utf8 conversion	2019-06-21 17:40:24 -04:00
Jay Berkenbilt	63a643a3c7	Remove implicit conversion from int/pointer to bool This fixes cases of warning C4800 from msvc	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	f40ffc9d63	Pl_Flate: constructor's out_bufsize is now unsigned int This is the type we need for the underlying zlib implementation.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	42306e2ff8	QUtil: add unsigned int/string functions	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	2155815234	configure: determine wordsize automatically Based on sizeof(size_t). Assumes 64 if not 32.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	713d961990	Appearance streams: some floating point values were truncated Bounding box X coordinates could be truncated, causing them to be off by a fraction of a point. This was most likely not visible, but it was still wrong.	2019-06-20 21:32:30 -04:00
Jay Berkenbilt	eb7948876b	Fix problems found in fuzz corpus	2019-06-15 17:24:24 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	cd830968ef	Eliminate one potential integer overflow There are more to handle, but this resolves an issue already caught by oss-fuzz.	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	31bde2f9d7	Handle empty DecodeParms array (fixes #331 ) On read, ignore /DecodeParms when empty list; on write, delete it. Some files have been found that include an empty list for /DecodeParms, but this is not technically compliant with the spec, and the only sensible interpretation is to treat it as if there are no decode parameters.	2019-06-09 17:19:49 -04:00
Jay Berkenbilt	b1a78be1a8	Prepare 8.4.2 release	2019-05-18 08:56:37 -04:00
Jay Berkenbilt	b3f0dbff62	Fix Windows memory error (fixes #330 )	2019-05-16 14:26:51 -04:00
Jay Berkenbilt	a323f6f49f	Prepare 8.4.1 release	2019-04-27 20:44:20 -04:00
Jay Berkenbilt	81205e007b	Spell check	2019-04-21 13:09:11 -04:00
Jay Berkenbilt	011695dfdf	Support Unicode in filenames (fixes #298 )	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310 )	2019-04-20 21:00:43 -04:00
Thorsten Schöning	2c704b99a1	Undefined functions because of missing std:: or header. (#295 ) * [bcc32 Error] QPDF.cc(375): E2268 Call to undefined function 'atof' Full parser context QPDF.cc(358): parsing: void QPDF::parse(const char ) [bcc32 Error] QPDFTokenizer.cc(183): E2268 Call to undefined function 'strtol' Full parser context QPDFTokenizer.cc(163): parsing: void QPDFTokenizer::resolveLiteral() * [bcc32 Error] pdf-split-pages.cc(52): E2268 Call to undefined function 'exit' Full parser context pdf-split-pages.cc(50): parsing: void usage() * PR #295: Including "cstdlib" should be replaced with "stdlib.h" to be more consistent. At the same time I changed the order of the surrounding includes to reflect alphabetical order, because in some files this has already been the case.	2019-03-12 10:05:29 -04:00
Thorsten Schöning	71b7ed9f4f	"_setmode" and "_stricmp" are not available on Borland C++Builder, neither the classic one nor newer ones based on CLANG.	2019-03-11 16:58:55 -04:00
Jay Berkenbilt	da7c2c0ee9	Fix json serialization for {x | -1 < x < 1} (fixes #308 ) JSON serialization was preserving the value as presented, but JSON doesn't accept decimal values without a 0 before the decimal point.	2019-03-11 16:22:59 -04:00
Jay Berkenbilt	03074ca5a0	Prepare 8.4.0 release	2019-02-01 22:25:25 -05:00
Jay Berkenbilt	fec5bb124c	Spell check	2019-01-31 21:41:29 -05:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	5211bcb5ea	Externalize inline images (fixes #278 )	2019-01-31 10:38:13 -05:00
Jay Berkenbilt	1eb35a355f	Exclude space after ID in image data	2019-01-31 10:38:10 -05:00
Jay Berkenbilt	2b6c79bcae	Improve locating inline image's EI We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	b776dcd2d3	Clean up some private functions	2019-01-29 22:14:20 -05:00
Jay Berkenbilt	8a9cfd2605	Handle direct page objects (fixes #164 )	2019-01-29 17:01:36 -05:00
Jay Berkenbilt	2d0885bc11	Clarify documentation for copyForeignObject regarding pages Make explicit that copyForeignObject can be used on page objects and will copy them properly but not update the pages tree.	2019-01-28 21:53:55 -05:00
Jay Berkenbilt	2712869cf9	Fix logic for when to compress object and xref streams (fixes #271 )	2019-01-28 21:43:06 -05:00
Jay Berkenbilt	52f9d326a5	Resolve duplicated page objects (fixes #268 ) When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.	2019-01-28 20:29:58 -05:00
Jay Berkenbilt	623f5b664e	Convert pages to form XObjects Support conversion of pages to form XObjects and placement of form XObjects on pages.	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	68ccd87c9e	Move rectangle transformation into QPDFMatrix	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	2d32f4db8f	Handle fallback font size in text appearances If we end up using our fallback font size when generating appearances for text fields, reflect that in the Tf operator used in the appearance stream.	2019-01-21 07:38:21 -05:00
Jay Berkenbilt	9cb599875b	Improve text objects used in text appearance streams	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	930eade6d3	Fix omissions in text appearance generation When generating appearance streams for variable text annotations, properly handle the cases of there being no appearance dictionary, no appearance stream, or an appearance stream with no BMC..EMC marker.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	65ef0bf313	When flattening, remove annotations with no appearance stream With the exception of form field annotations when /NeedAppearances is true, remove annotations that don't have appearance streams when flattening. There is no reason to keep these when flattening since they are invisible. This may include unchecked checkboxes, unshown popup windows, etc.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	c18ee440a3	mingw workaround for QPDFExc destructor mingw doesn't like it when you don't inline empty virtual destructors.	2019-01-19 10:14:07 -05:00
Jay Berkenbilt	e87d149918	Add QUtil::possible_repaired_encodings	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6ec22f117d	Modernize encryption API for more granularity Setting encryption permissions for R >= 3 set permission bits in groups corresponding to menu options in Acrobat 5. The new API allows the bits to be set individually.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4630377731	Add status-reporting transcoders to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	8f389f14c0	QUtil::analyze_encoding	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6817ca585a	Bidirectional transcoding for win, mac, pdf, utf8, utf16	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	698485468a	Move remaining existing transcoding to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	5cfcd4f361	Additional checks for unreferenced resources Explicitly abandon removal of unreferenced resources if there are any lexical errors in the page's contents. This case always generated a warning, but it now also prevents removal of unreferenced resources, thus strongly decreasing the likelihood of data loss.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4bc434000c	Copy subdictionaries when removing resources (fixes #276 ) When removing unreferenced resources, the code was copying the overall resource dictionaries but not the subdictionaries being modified. This was a "typo" in the code -- the comment clearly stated the need to do this, but the code replaced the dictionary with itself rather than with a shallow copy of itself.	2019-01-17 09:40:05 -05:00
Jay Berkenbilt	654c0e8caf	Allow adding the same page more than once in --pages (fixes #272 )	2019-01-12 10:01:47 -05:00
Jay Berkenbilt	4ecd1df6f2	Add configure option AVOID_WINDOWS_HANDLE If set, we avoid using Windows I/O HANDLE, which is disallowed in some versions of the Windows SDK, such as for Windows phones. QUtil::same_file will always return false in this case. Only applies to Windows builds.	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	d24a120c7f	Add QPDF::setImmediateCopyFrom	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	b653929c93	Update version to 8.3.0	2019-01-07 11:16:54 -05:00
Jay Berkenbilt	aa602fd107	Fix integer overflow in large file test	2019-01-07 08:49:14 -05:00
Jay Berkenbilt	c3cee5f154	Exercise out of scope original pdf for copyForeignObject	2019-01-07 07:38:03 -05:00
Jay Berkenbilt	fddbcab0e7	Mostly don't require original QPDF for copyForeignObject (fixes #219 ) The original QPDF is only required now when the source QPDFObjectHandle is a stream that gets its stream data from a QPDFObjectHandle::StreamDataProvider.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	fbbb0ee016	Make a static version of QPDF::pipeStreamData This is in preparation of being able to pipe a stream's data without keeping a copy of its containing qpdf object.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	7588cac295	Create an application-scope unique ID for each QPDF object Use this instead of QPDF* as a map key for object_copiers.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	e27ac682e0	Move encryption parameters into a class	2019-01-06 09:58:16 -05:00
Jay Berkenbilt	a70fbaaf50	Honor other base encodings when generating appearances	2019-01-05 23:01:59 -05:00
Jay Berkenbilt	b341d742db	Add WinAnsi and MacRoman encoding	2019-01-05 23:01:44 -05:00
Jay Berkenbilt	3ef1b77304	Refactor QUtil::utf8_to_ascii	2019-01-05 22:59:29 -05:00
Jay Berkenbilt	089ce5902e	Move utf8_to_utf16 into QUtil	2019-01-05 22:59:27 -05:00
Jay Berkenbilt	ae18bfd142	Refactor string transcoding in QPDF_String	2019-01-05 22:56:58 -05:00
Jay Berkenbilt	2e342ee5bb	Spell check	2019-01-04 21:33:14 -05:00
Jay Berkenbilt	16fd6e64f9	Add QPDFWriter::getFinalVersion (fixes #266 )	2019-01-04 12:37:22 -05:00
Jay Berkenbilt	837dcf8fc2	Don't call assert while checking linearization data (fixes #209 , #231 ) Instead of calling assert for problems found during checking linearization data, throw an exception which is later caught and issued as an error. Ideally we would handle errors more robustly, but this is still a significant improvement.	2019-01-04 11:55:42 -05:00
Jay Berkenbilt	a01359189b	Fix dangling references (fixes #240 ) On certain operations, such as iterating through all objects and adding new indirect objects, walk through the entire object structure and explicitly resolve any indirect references to non-existent objects. That prevents new objects from springing into existence and causing the previously dangling references to point to them.	2019-01-04 10:29:29 -05:00
Jay Berkenbilt	158156d506	Add basic appearance stream generation	2019-01-04 08:00:19 -05:00
Jay Berkenbilt	02281632cc	Add QUtil::utf8_to_ascii	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	b55567a0fa	Add special case setV code for button fields	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	e3144ac417	Add form fields to json output Also add some additional methods for detecting form field types to assist in the json creation and for later use.	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	ca94ac68d9	Honor flags when flattening annotations	2019-01-03 11:59:55 -05:00
Jay Berkenbilt	06d6438ddf	Minor fixes	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	3e74916c5a	Fix seg fault on empty xref stream (fixes #263 ) Thanks to @p-cher for supplying a patch.	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	3b8ce4f12a	Annotation flattening including form fields Flatten annotations by integrating their appearance streams into the content stream of the containing page. In the case of form fields, only flatten if /NeedAppearances is false (or equivalently absent). If flattening form fields, also remove /AcroForm from the document catalog.	2019-01-01 08:14:15 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	104fd6da52	Add matrix and annotation appearance stream handling Generate page content fragment for rendering appearance streams including all matrix calculation.	2019-01-01 08:07:21 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	daeb5a85b6	Transformation matrix	2018-12-31 18:23:47 -05:00
Jay Berkenbilt	3440ea7d3c	JSON::serialize -> unparse Unparse is admittedly strange, but I'd rather be strange and consistent, and everything else in the qpdf library uses unparse to serialize. (If you're reading this, the convention of using "unparse" comes from the "clu" programming language.)	2018-12-25 11:52:21 -05:00


1256 Commits