octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-11-15 17:17:08 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	558f043d91	QPDFJob: TRUE -> true	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	fcdbc8a102	Move doFinalChecks to QPDFJob::checkConfiguration	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c4e56fa5f4	QPDFJob: make createsOutput callable before run()	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	564dc03607	QPDFJob: start real API Create QPDFJob_options.cc to hold API implementation functions. Reorganize a little in preparation for moving public member variables private and creating the real QPDFJob API that will be used by callers as well as the argv/json initialization methods.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1d099ab743	QPDFJob: placeholder for initializeFromJson	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1c8d53465f	Incorporate job schema generation into generate_auto_job	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9cd693a5b	QPDFJob: allocate QPDFArgParser on stack The previous commits have removed all references to memory from QPDFArgParser from QPDFJob. This commit removes the constraint that QPDFArgParser remain in scope. This is a prerequisite to allowing JSON as an alternative way to initialize QPDFJob and to initialize it directly using a public API.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	d526d4c17f	QPDFJob: convert Under/Overlay to use shared pointers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	88891a75a2	QPDFJob: convert Under/Overlay ranges to strings	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e48bfce930	QPDFJob: convert PageSpec to use shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e4905983d2	QPDFJob: convert outfilename to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e5edfc786f	QPDFJob: convert infilename to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	ee7824cf28	QPDFJob: convert encryption_file args to shared pointers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	021db6f226	QPDFJob: convert password to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1a8c2eb93b	QPDFJob: use std::shared_ptr over PointerHolder where possible Also fix QPDFArgParser	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	76c4f78b5c	Add QUtil::make_shared_cstr Replace most of the calls to QUtil::copy_string with this instead.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	67f9d0b7d5	cli.rst: remove () from end of short help This is used to generate a schema for the job json, which can't contain `)"` because it breaks the R"(...)" syntax in C++. While C++ accepts R"anything(...)anything" to avoid this, as of this writing, MSVC 2019 doesn't understand that. For now, just avoid it by removing parentheses from the end of short help.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8dea480c9f	Allow optional fields in json "schema" checks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	ec85e56c3f	Add missing help topic for inspection	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1db0a7ffce	JSONHandler: rework dictionary and array handlers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	acf8d18b6e	Editorial changes to cli.rst	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	cf8405d91e	Fix json schema for objects to include dictionary key	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2e58541493	Use JSON::parse to initialize schema for json mode	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	37105710ee	Implement JSONHandler for recursively processing JSON	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	a6df6fdaf7	CLI doc: use tables where helpful	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e8e8f6f43c	Add JSON::parse	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9af421ef7	Add missing \f support for JSON string encoder	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	aa0a379b37	Add JSON::isDictionary and JSON::isArray	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	5c5e5ca29b	Document how to add a command-line argument	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c8729398dd	Generate help content from manual This is a massive rewrite of the help text and cli.rst section of the manual. All command-line flags now have their own help and are specifically indexed. qpdf --help is completely redone.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b4bd124be4	QPDFArgParser: support adding/printing help information	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	5303130cf9	Fix comment on duplicated top-level json keys	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	53ba65eb59	QPDFArgParser: handle optional choices including help Handle optional choices in addition to required choices. Refactor the way help options are added to completion to make it work with optional help choices.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	a301cc5373	Minor code cleanup	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	3ab25d595b	Fix doc typos caught by m-holger -- thanks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	4577df4b5d	QPDFJob increment: generate option table initialization	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f1d805badc	Add QPDFArgParser::copyFromOtherTable	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c3e9b64e7f	QPDFJob increment: generate handler declarations	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	6e70d99b58	QPDFJob increment: generate choices variables in init	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	cb684ec4d3	QPDFJob increment: generate table names	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f8eee83515	Expose QPDFArgParser::usage	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8dcf6da259	QPDFJob: remove non-check from doFinalChecks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c216854607	Add basic framework for QPDFJob code generation	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	bd89aac360	QPDFJob increment: move arg parsing into QPDFJob Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with millions of public member variables, but now qpdf.cc is minimal and just calls stable library functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	12396702af	QPDFJob: reorder functions, no other changes	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2394dd8519	QPDFJob increment: static functions to member functions Convert remaining static functions that take QPDFJob& as a parameter to member functions. Utility functions that don't take QPDFJob& remain static functions and can probably just stay that way since they keep extra complexity out of QPDFJob.hh.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e2975b9ed0	QPDFJob: de-templatize do_process and do_process_once	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2f631997f2	QPDFJob increment: remove std::cout, std::cerr, whoami Remove remaining temporary duplication of hard-coded values and direct access to std::cout, std::cerr, and whoami in favor of parameters in QPDFJob. This moves a few more static methods into QPDFJob member functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1ddf5b4b4b	QPDFJob increment: get rid of exit, handle verbose Remove all calls to exit() from QPDFJob. Handle code that runs in verbose mode to enable it to make use of output streams and message prefix (whoami) from QPDFJob. This removes temporarily duplicated exit code logic and most access to whoami/std::cout outside of QPDFJob proper.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0910e767ad	QPDFJob increment: basic QPDFJob structure Move most of the methods called from qpdf.cc after argument parsing into QPDFJob. In this increment, enough QPDFJob API has been added to handle the branch of QPDFJob::run() that creates output with an appropriate division between qpdf.cc and QPDFJob. There are temporary bits of code to enable everything to compile and pass the test suite, including some duplication and hard-coded values.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	52817f0a45	Implement QPDFArgParser based on ArgParser from qpdf.cc	2022-01-30 13:11:02 -05:00
m-holger	0f9086e509	Fix doc typos	2022-01-30 12:09:54 -06:00
m-holger	8eca9d8fd9	Fix QPDFObjectHandle::isOrHasName Ensure isOrHasName returns true if object is an array and the name is present anywhere in the array.	2022-01-27 09:35:39 -06:00
m-holger	07db3200cb	Remove some if statements and simplify some boolean expressions Use QPDFObjectHandle::isNameAndEquals, isDictionaryOfType and isStreamOfType.	2022-01-27 07:31:12 -06:00
m-holger	710d2e54f0	Allow testing for subtype without specifying type in isDictionaryOfType etc Accept empty string as type parameter in QPDFObjectHandle::isDictionaryOfType and isStreamOfType to allow for dictionaries with optional type.	2022-01-27 07:31:12 -06:00
m-holger	1b1b471ca9	Make a few whitespace fixes from last commit Commit by ejb@ql.org using m-holger as author so git annotate gives proper credit for changes.	2022-01-22 09:14:53 -05:00
m-holger	8593b9fdf7	Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType	2022-01-22 08:10:28 -06:00
Jay Berkenbilt	370710657a	Add missing characters from PDF doc encoding (fixes #606)	2022-01-11 15:55:19 -05:00
Jay Berkenbilt	77c31305fe	Fix signed/unsigned char warning (fixes #604)	2022-01-11 06:51:31 -05:00
Jay Berkenbilt	af91b5b584	Add QUtil::file_can_be_opened	2021-12-29 13:41:02 -05:00
Jay Berkenbilt	04745320d6	Prepare 10.5.0 release	2021-12-20 14:51:46 -05:00
Jay Berkenbilt	d866f48081	Change names of qpdf_object_type_e enumerations They have to be ot_* rather than qpdf_ot_* for compatibility. * Different enumerated types are not assignment-compatible in C++, at least with strict compiler settings * While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will only work if the enumerated type names are the same.	2021-12-20 14:51:45 -05:00
Jay Berkenbilt	ea73bf72e0	Further improvements to handling binary strings	2021-12-19 14:30:45 -05:00
Jay Berkenbilt	ddbe59179e	C API: simplify new error handling and improve documentation	2021-12-17 15:59:47 -05:00
m-holger	f6293bd94c	C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597)	2021-12-17 14:24:43 -05:00
Jay Berkenbilt	feafcc4e88	C API: add several stream functions (fixes #596)	2021-12-17 13:28:11 -05:00
Jay Berkenbilt	fee7489ee4	Add Pl_Buffer::getMallocBuffer	2021-12-17 12:38:52 -05:00
Jay Berkenbilt	9bb6f570ec	C API: add functions for working with pages (fixes #594)	2021-12-16 15:07:48 -05:00
Jay Berkenbilt	245ca28066	Use value rather than reference captures where possible	2021-12-16 11:47:07 -05:00
Jay Berkenbilt	af2a71aa2c	Handle bitstream overflow errors more gracefully (fixes #581) * Make it a runtime error, not a logic error * Include additional information * Capture it properly in checkLinearization	2021-12-10 15:37:35 -05:00
Jay Berkenbilt	1c62c2a342	C API: expose functions for indirect objects (fixes #588)	2021-12-10 14:57:35 -05:00
Jay Berkenbilt	72c10d8617	C API: overhaul error handling * Handle error conditions that occur when using the object handle interfaces. In the past, some exceptions were not correctly converted to errors or warnings. * Add more detailed information to qpdf-c.h * Make it possible to work more explicitly with uninitialized objects	2021-12-10 12:16:02 -05:00
Jay Berkenbilt	3340dbe976	Use a specific error code for type warnings and clarify docs	2021-12-10 11:15:49 -05:00
Jay Berkenbilt	b2b2a175c4	Add missing unit test for register progress reporter in C API It was exercised in the pdf-linearize example but not in qpdf-ctest.	2021-12-10 09:11:56 -05:00
Jay Berkenbilt	1faa21502f	Refactor trap_errors to use std::function	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	e3cc171d02	C API: qpdf_oh_is_initialized	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	bef2c2222a	C API: qpdf_get_last_string_length	2021-12-09 10:33:31 -05:00
m-holger	b4fc9eb700	C-API expose new_object as qpdf_oh_new_object	2021-12-02 13:59:58 -05:00
Jay Berkenbilt	720ce9e8f3	Improve testing and error handling around operating before processing	2021-11-29 07:42:36 -05:00
Jay Berkenbilt	ac17308cf6	Initialize QPDF::Members::file (fixes #584)	2021-11-29 07:16:34 -05:00
m-holger	4630b8567c	Ensure qpdf_oh handles returned by C-API functions are unique. Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array. Update some doc comments in qpdf-c.h.	2021-11-19 13:31:59 +00:00
Jay Berkenbilt	ce7db05d22	Prepare 10.4.0 release	2021-11-16 15:44:09 -05:00
Jay Berkenbilt	750aca5b94	First increment of improving handling of weak crypto (fixes #358)	2021-11-11 12:24:15 -05:00
Jay Berkenbilt	f45dacf4cb	Make recovery logic flexible about where objects end (fixes #573) Don't assume endobj is at the beginning of the line. This means we are looking at tokens for every line, but the odds of n n obj appearing in the middle of the object are likely much lower than endobj not being at the beginning of the line or missing entirely. This will probably have a negative impact on recovery time for very large files. Hopefully it will be worth it.	2021-11-07 15:27:22 -05:00
Jay Berkenbilt	3794f8e2ad	Support OpenSSL 3 (fixes #568)	2021-11-04 18:24:54 -04:00
Jay Berkenbilt	a84a0b2487	Add range check in QPDFNumberTreeObjectHelper (fuzz issue 37740)	2021-11-04 14:03:24 -04:00
Jay Berkenbilt	4a648b9a00	Fix bug in merging resources /DR from foreign AcroForm (fixes #548) When making resources indirect in from_dr, the code was using the wrong owning QPDF, forgetting that from_dr had already been copied using CopyForeignObject.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	9b28933647	Check object ownership when adding When adding a QPDFObjectHandle to an array or dictionary, if possible, check if the new object belongs to the same QPDF. This makes it much easier to find incorrect code than waiting for the situation to be detected when the file is written.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	33a47d5c3c	Make QPDF::findPage public (fixes #516) This was originally not public because I wanted to get rid of the pages cache, but I recently realized there were deep reasons not to do that, and the author of pikepdf wanted this, so I decided to make it public.	2021-11-03 09:43:17 -04:00
Jay Berkenbilt	532a4f3d60	Detect recoverable but invalid zlib data streams (fixes #562)	2021-11-03 09:43:17 -04:00
Fredrik Fornwall	e0775238b8	Fix QPDFEFStreamObjectHelper::{get,set}Subtype The /Subtype entry that specifies the mime type of an embedded file is inside the embedded file stream dictionary directly, not in the parameter dictionary. See Table 45 and 46 in the PDF 1.7 specification: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112	2021-09-10 10:02:24 -04:00
Jay Berkenbilt	3cacb27a90	Performance fix on preserveObjectStreams	2021-05-09 07:51:14 -04:00
Jay Berkenbilt	bddebdb0ea	Prepare 10.3.2 release	2021-05-08 10:41:14 -04:00
Jay Berkenbilt	30ac51bc78	Exclude unreferenced objects in object streams (fixes #520)	2021-05-08 09:42:09 -04:00
Zdenek Dohnal	16c19e9424	libqpdf/Pl_AES_PDF.cc: remove duplicated if branch Check for this->encrypt seems to be moved to plugged crypto implementations, so it can be removed from Pl_AES_PDF.cc.	2021-04-29 09:42:38 -04:00
Jay Berkenbilt	36c7c20819	Fix timezone portability issue (fixes #515)	2021-04-17 18:12:55 -04:00
Jay Berkenbilt	8971443e46	QPDF::addPage*: handle duplicate pages more robustly	2021-04-05 10:58:10 -04:00
Jay Berkenbilt	ec48820c3c	Fix loop detection in NNTree	2021-04-05 07:59:02 -04:00
Jay Berkenbilt	258675fc99	Move ABI comment to the right place	2021-04-03 11:43:08 -04:00
Jay Berkenbilt	a77f58142d	Remove some assertions that are not necessarily true (fixes #514) Operations that add the same object to multiple places in the pages tree are throwing exceptions and then later causing assertion failures. The assert calls shouldn't be there.	2021-03-21 19:35:23 -04:00
Jay Berkenbilt	3f05429cc5	Prepare 10.3.1 release	2021-03-11 12:59:41 -05:00
Jay Berkenbilt	85884c363c	Allow /DR to be direct in /AcroForm Also handle direct annotation, though this is much less likely.	2021-03-11 11:43:38 -05:00
Jay Berkenbilt	dc65b88457	Prepare 10.3.0 release	2021-03-05 06:15:48 -05:00
Jay Berkenbilt	cb6e53136f	QPDFAcroFormDocumentHelper: add missing analyze calls	2021-03-04 18:11:44 -05:00
Jay Berkenbilt	0b77f2cf26	Revert non-binary-compatible handleWarning change -- see TODO (ABI)	2021-03-04 15:59:46 -05:00
Jay Berkenbilt	f68e25c7f2	Don't use handleWarning, which is being reverted	2021-03-04 15:59:45 -05:00
Jay Berkenbilt	9fb174b9e9	Major rework of handling form fields when copying pages (fixes #509)	2021-03-04 15:08:37 -05:00
Jay Berkenbilt	887f35efaa	When resolving font from /DR, copy it into resources	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	a2124f992c	Add QPDFMatrix::operator==	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	552303a94a	Check for reserved after dereference	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	d7ffdfa994	Add optional conflict detection to mergeResources Also improve behavior around direct vs. indirect resources.	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	e17585c2d2	Remove unreferenced: ignore names that are not Fonts or XObjects Converted ResourceFinder to ParserCallbacks so we can better detect the name that precedes various operators and use the operators to sort the names into resource types. This enables us to be smarter about detecting unreferenced resources in pages and also sets the stage for reconciling differences in /DR across documents.	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	a15ec6967d	Enhancements to ParserCallbacks	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	1bb209a9bf	Add QPDF::numWarnings	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	37fcc5ff71	Create ResourceFinder from NameWatcher in QPDFPageObjectHelper	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	b444ab3352	Fix typos in coverage cases	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	fa2516df71	Fix behavior for finding /Q, /DA, and /DR for form fields If not found in the field hierarchy, /Q and /DA are supposed to be looked up in the document-level form dictionary. /DR is supposed to only come from the document dictionary.	2021-03-03 17:05:19 -05:00
Jay Berkenbilt	a4d6589ff2	Have QPDFObjectHandle notice when replaceObject was called This results in a performance penalty of 1% to 2% when replaceObject and swapObjects are never called and a somewhat larger penalty if they are called, but it's worth it to avoid very confusing behavior as discussed in depth in qpdf#507.	2021-02-25 07:32:46 -05:00
Jay Berkenbilt	ec6719fd25	Always call dereference() before querying obj pointer	2021-02-25 07:31:26 -05:00
Jay Berkenbilt	b5e937397c	Prepare 10.2.0 release	2021-02-23 10:41:58 -05:00
Jay Berkenbilt	1886673d7e	Spell check	2021-02-23 10:38:05 -05:00
Jay Berkenbilt	9e00be7ffa	Remove warning that gives false positives in some normal cases	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	be3a8c0e7a	Keep only referenced form fields in --pages	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	83216e640c	Preserve form fields when splitting pages (fixes #340)	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	1f35ec9988	Add methods for copying form fields	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	8e8c0d8290	Add new placeFormXObject that takes a matrix reference	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	61d41e2e88	Add copyAnnotations, use with overlay/underlay (fixes #395)	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	7b3cbacf5d	Change from QPDF{Array,Dict}Items to aitems() and ditems()	2021-02-22 11:05:39 -05:00
Jay Berkenbilt	a9ae8cadc6	Add transformAnnotations and fix flattenRotations to use it	2021-02-21 17:13:09 -05:00
Jay Berkenbilt	a76decd2d5	Add QPDFObjGen::unparse	2021-02-21 16:21:52 -05:00
Jay Berkenbilt	7540d2082a	Explicitly override inherited rotate in flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	e899926e0d	Use QPDFMatrix inside flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	92fbc6fdf5	QPDFObjectHandle::copyStream	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	60afe4142e	Refactor: separate copyStreamData from replaceForeignIndirectObjects	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	15269f36d8	addFormField: update cache rather than invalidating	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	901f1a788c	Enhance QPDFMatrix API	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	05eb5826d8	Fix isPagesObject and isPageObject There are lots of things with /Kids that are not pages. Repair the pages tree, then do a reliable check.	2021-02-20 19:42:41 -05:00
Jay Berkenbilt	35dd11f356	Allow --rotate=0	2021-02-20 16:29:34 -05:00
Jay Berkenbilt	71e8627285	Add const versions of QPDFMatrix::transform*	2021-02-19 18:35:19 -05:00
Jay Berkenbilt	de8929a41c	Add QPDFAcroFormDocumentHelper::addFormField	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	5cec6b4c3d	Add QPDFPageObjectHelper::getMatrixForFormXObjectPlacement	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	0765872295	Form field for non-widget just returns null	2021-02-18 10:25:07 -05:00
Jay Berkenbilt	0b1623d07d	Add QUtil::path_basename	2021-02-18 09:59:03 -05:00
Jay Berkenbilt	a773f4c71d	Add QPDFObjectHandle::parse for strings with context	2021-02-15 11:33:03 -05:00
Jay Berkenbilt	7eb903d9aa	Use functional replaceStreamData	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	efbb21673c	Add functional versions of QPDFObjectHandle::replaceStreamData Also fix a bug in checking consistency of length for stream data providers. Length should not be checked or recorded if the provider says it failed to generate the data.	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	e2593e2efe	Move QPDFMatrix into the public API	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	07f40bd254	QUtil::double_to_string: trim trailing zeroes with option to disable	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	8fbc8579f2	Allow zone information to be omitted from timestamp strings	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	df067c9ab6	Add autoconf test for localtime_r	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	1b3f84f967	Require C++14 instead of C++11	2021-02-10 16:27:58 -05:00
Jay Berkenbilt	9fcf61b2f6	Fix loop in QPDFOutlineDocumentHelper (fuzz issue 30507)	2021-02-10 16:27:44 -05:00
Jay Berkenbilt	4d1f2fdcac	Update to new name/number tree API	2021-02-10 15:46:20 -05:00
Jay Berkenbilt	1f4771cd0d	Minor clean up of Windows headers	2021-02-10 07:36:18 -05:00
Jay Berkenbilt	ad34b9c278	Implement helpers for file attachments	2021-02-10 06:57:37 -05:00
Jay Berkenbilt	bf0e6eb302	Add QUtil methods for dealing with PDF timestamp strings	2021-02-09 17:50:24 -05:00
Jay Berkenbilt	bfbeec5497	Make newly created name/number trees indirect objects	2021-02-08 06:49:56 -05:00
Jay Berkenbilt	553ac7f353	Add QUtil::pipe_file and QUtil::file_provider	2021-02-07 19:41:34 -05:00
Jay Berkenbilt	e076c9bf08	Remove erroneous handling of /EFF for stream decryption I thought /EFF was supposed to be used as a default for decrypting embedded file streams, but actually it's supposed to be advice to a conforming writer about handling new ones. This makes sense since the findAttachmentStreams code, which is not actually needed, was never right.	2021-02-06 17:08:41 -05:00
Jay Berkenbilt	ac2b3b96e1	Make wrong object stream type a warning	2021-02-06 14:29:11 -05:00
Jay Berkenbilt	faa2e3ddfd	Handle older PDFs whose form XObjects inherit resources (fixes #494) When removing unreferenced resources, notice if a page (recursively) contains a form XObject with unreferenced resources, and count any such resources as referenced by the page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	81025e4998	Refactor removal of unreferenced resources Refactor in preparation for resolving unresolved resources in form xobjects from page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	9c9ce64eec	Handle strings in inline image dictionaries We need to use token.getRawValue, not token.getValue	2021-01-31 07:50:03 -05:00
Jay Berkenbilt	178f995fc2	Recover from exceptions during filtering for inline images	2021-01-31 07:49:08 -05:00
Jay Berkenbilt	4ae93a73c5	Improve memory safety of dict/array iterators	2021-01-31 07:16:03 -05:00
Jay Berkenbilt	de0b11fc47	Add C++ iterator API around array and dictionary objects	2021-01-30 15:15:23 -05:00
Jay Berkenbilt	35e7859bc7	Make QPDFObjectHandle::is* return false for uninitialized objects	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	50decc9bb8	name/number tree: explicitly declare default destructors	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	8ed3e8c79b	NNTree: rework iterators to be more memory efficient Keep a std::pair internal to the iterators so that operator* can return a reference and operator-> can work, and each can work without copying pairs of objects around.	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	e7e20772ed	name/number trees: remove	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	5816fb44b8	name/number trees: insertAfter	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	16a9bb3f6f	name/number trees: newEmpty, increment/decrement end()	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	b5614f611d	Implement repair and insert for name/number trees	2021-01-24 19:31:45 -05:00
Jay Berkenbilt	04edfe9fad	QPDFObjectHandle::newUnicodeString to use UTF-16 only when needed Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of encoding the string.	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	63e5cb533d	Use new QPDF{Name,Number}TreeObjectHelper API	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	d61ffb65d0	Add new constructors for name/number tree helpers Add constructors that take a QPDF object so we can issue warnings and create new indirect objects.	2021-01-24 03:27:26 -05:00
Jay Berkenbilt	ba814703fb	Use QPDFNameTreeObjectHelper's iterator directly	2021-01-24 03:25:11 -05:00
Jay Berkenbilt	5f0708418a	Add iterators to name/number tree helpers	2021-01-24 03:22:59 -05:00
Jay Berkenbilt	4a1cce0a47	Reimplement name and number tree object helpers Create a computationally and memory efficient implementation of name and number trees that does binary searches as intended by the data structure rather than loading into a map, which can use a great deal of memory and can be very slow.	2021-01-24 03:22:51 -05:00
Jay Berkenbilt	6226b69dba	Add warn() to QPDF's public API	2021-01-16 18:41:53 -05:00
Jay Berkenbilt	fc88837d4b	Treat /EmbeddedFiles as a proper name tree If we ever had an encrypted file with different filters for attachments and either the /EmbeddedFiles name tree was deep or some of the file specs didn't have /Type, we would have overlooked those as attachment streams. The code now properly handles /EmbeddedFiles as a name tree.	2021-01-11 10:50:44 -05:00
Jay Berkenbilt	6fe7b704c7	Warn rather than segv on access after closing input source (fixes #495)	2021-01-06 10:11:34 -05:00
Jay Berkenbilt	0fed040392	Prepare version 10.1.0	2021-01-04 16:59:55 -05:00
Jay Berkenbilt	18340b8835	Spell check	2021-01-04 16:26:58 -05:00
Jay Berkenbilt	dc92574c10	Fix some pipelines to be safe if downstream write fails (fuzz issue 28262)	2021-01-04 15:17:35 -05:00
Jay Berkenbilt	ba6b6aacf1	Fix outdated comment	2021-01-03 15:59:49 -05:00
Jay Berkenbilt	3be58f49e5	Make more QPDFPageObjectHelper methods work with form XObject	2021-01-02 14:08:53 -05:00
Jay Berkenbilt	98da4fd835	Externalize inline images now includes form XObjects	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	bedf35d6a5	Bug fix: avoid extraneous pipeline finish calls with multiple contents Avoid calling finish() multiple times on the pipeline passed to pipeContentStreams. This commit also fixes a bug in which qpdf was not exiting with the proper exit status if warnings found while splitting pages; this was exposed by a test case that changed.	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	a139d2b36d	Add several methods for working with form XObjects (fixes #436) Make some more methods in QPDFPageObjectHelper work with form XObjects, provide forEach methods to walk through nested form XObjects, possibly recursively. This should make it easier to work with form XObjects from user code.	2021-01-02 12:29:31 -05:00
Jay Berkenbilt	6154221edb	QPDFPageObjectHelper: filterPageContents -> filterContents + form XObject	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	63ea46193d	QPDFPageObjectHelper: getPageImages -> getImages	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	e7a8554563	QPDFPageObjectHelper::getPageImages: support form XObjects	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	1562d34c09	Add QPDFObjectHandle::isFormXObject	2021-01-01 07:36:10 -05:00
Jay Berkenbilt	c9271335fa	Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation	2020-12-30 13:03:55 -05:00
Jay Berkenbilt	12ecd2019a	Add QPDFObjectHandle::setFilterOnWrite	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	3f9191a344	Add ostream << for QPDFObjGen	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	858c7b89bc	Let optimize filter stream parameters instead of making them direct Also removes preclusion of stream references in stream parameters of filterable streams and reduces write times by about 8% by eliminating an extra traversal of the objects.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	1a62cce940	Restructure optimize to allow skipping parameters of filtered streams	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	09027344b9	Refactor: separate code that determines whether to filter a stream	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	39bfa01307	Implement user-provided stream filters Refactor QPDF_Stream to use stream filter classes to handle supported stream filters as well.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	cc8895078a	Add QPDFObjectHandle::makeDirect(bool allow_streams)	2020-12-26 08:48:18 -05:00
Jay Berkenbilt	573b6eb8b1	Provide qpdf write progress reporting from C API (fixes #487)	2020-12-20 14:43:24 -05:00
Jay Berkenbilt	2050977099	Add QPDFObjectHandle manipulation to C API	2020-11-28 19:48:07 -05:00
Jay Berkenbilt	78b9d6bfd4	Prepare 10.0.4 release	2020-11-21 13:50:02 -05:00
Jay Berkenbilt	bd79138c84	Treat direct page as runtime rather than logic error (fuzz issue 27393)	2020-11-11 09:50:43 -05:00
Jay Berkenbilt	47f4ebcdac	Ignore unused field in xref entry, avoiding range error (fixes #482)	2020-11-04 07:46:46 -05:00
Jay Berkenbilt	fbe40b800d	Prepare 10.0.3 release	2020-10-31 13:47:03 -04:00
Jay Berkenbilt	6971f78ff6	Fix stack overflow on direct root (fuzz issue 26761)	2020-10-31 13:10:39 -04:00
Jay Berkenbilt	ffe6af6f77	Add comments explaining the foreign object copying code These are the comments I would have liked to have been able to read while fixing #449 and #478.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	96767fb104	Fix foreign stream copying bug (fixes #478) This reverts an incorrect fix to #449 and codes it properly. The real problem was that we were looking at the local dictionaries rather than the foreign dictionaries when saving the foreign stream data. In the case of direct objects, these happened to be the same, but in the case of indirect objects, the object references could be pointing anywhere since object numbers don't match up between the old and new files.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	da7540794a	Prepare 10.0.2 release	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	09bd1fafb1	Improve efficiency of number to string conversion	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	bcea54fcaa	Revert removal of unreadCh change for performance Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR). Update comments and code to reflect this.	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	b30deaeeab	Avoid merging adjacent tokens when concatenating contents (fixes #444)	2020-10-23 08:00:04 -04:00
Jay Berkenbilt	8a11feacc3	Avoid leak by resolving object streams more than once (fuzz issue 23642)	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	30bb4c64ee	Minor code cleanup * Return rather than exiting from realmain in qpdf.cc * Remove extraneous blank line * Don't assign temporary to const reference	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	232f5fc9f3	Handle jpeg library fuzz false positives The jpeg library has some assembly code that is missed by the compiler instrumentation used by memory sanitization. There is a runtime environment variable that is used to work around this issue.	2020-10-22 06:31:52 -04:00
Jay Berkenbilt	c1684eae91	Check for overflow in page labels (fuzz issue 23599)	2020-10-22 05:49:24 -04:00
Jay Berkenbilt	7f4a4df919	Add range_check method to QIntC	2020-10-22 05:48:40 -04:00
Jay Berkenbilt	24196c08cb	Fix loop detection error (fuzz issue 23172)	2020-10-22 05:48:35 -04:00
Jay Berkenbilt	956c8f6432	Obscure bug fix copying foreign streams in special cases (fixes #449 ) Specifically, if a stream had its stream data replaced and had indirect /Filter or /DecodeParms, it would result in non-silent loss of data and/or internal error.	2020-10-21 19:23:23 -04:00
Jay Berkenbilt	98f6c00dad	Protect numeric conversion against user's locale (fixes #459 )	2020-10-21 16:42:51 -04:00
Jay Berkenbilt	bed165c9fc	Stop using InputSource::unreadCh	2020-10-18 07:43:05 -04:00
Dean Scarff	153060a0c5	Check integer overflow in resolveObjectsInStream Fixes a crash found by fuzzing.	2020-10-16 20:09:24 -04:00
Dean Scarff	9a3791c53b	Properly detect OPENSSL_IS_BORINGSSL OPENSSL_IS_BORINGSSL is not actually set by configure, so it will be undefined until a BoringSSL header is included. Hence the #ifdef logic in QPDFCrypto_openssl.h would usually never apply. This still worked because evp.h transitively included BoringSSL's cipher.h and digest.h, but the latter are the correct (documented) headers. By re-ordering the includes, we can ensure the macro is defined when we use it. Also: fix case in the header guards.	2020-10-16 20:04:36 -04:00
Dean Scarff	2ff84aa2c9	Include detailed OpenSSL error messages Fixes qpdf/qpdf#450	2020-10-16 19:58:11 -04:00
James R. Barlow	3fc7c99d02	Replace memchr with manual memory search On large files with predominantly \n line endings, memchr(..'\r'..) seems to waste a considerable amount of time searching for a line ending candidate that we don't need. On the Adobe PDF Reference Manual 1.7, this commit is 8x faster at QPDF::processMemoryFile().	2020-10-16 19:57:29 -04:00
oltolm	3221022fc9	fix WindowsCryptProvider fixes #432	2020-10-16 19:56:33 -04:00
Jay Berkenbilt	ff65e272a8	Fix printf formatting for newer msvc Use autoconf rather than ifdefs to determine what format string to use for long long.	2020-10-16 07:02:23 -04:00
Jay Berkenbilt	88b8f8ec86	Remove redundant check found by lgtm.com	2020-10-15 14:47:43 -04:00
Jay Berkenbilt	26514ab731	Write linearization errors to stderr (fixes #438 )	2020-04-29 17:33:34 -04:00
Jay Berkenbilt	92d3cbecd4	Fix warnings reported by -Wshadow=local (fixes #431 )	2020-04-16 12:41:43 -04:00
Jay Berkenbilt	578c5ac66c	Use more references when iterating When possible, use `for (auto&` or `for (auto const&` when iterating using C++-11 style iterators.	2020-04-10 13:30:33 -04:00
Jay Berkenbilt	821a701851	Prepare 10.0.1 release	2020-04-09 11:48:26 -04:00
Jay Berkenbilt	1a7d3700a6	Fix unnecessary copies in auto iter (fixes #426 ) Also switch to colon-style iteration in some cases. Thanks to Dean Scarff for drawing this to my attention after detecting some unnecessary copies with https://clang.llvm.org/extra/clang-tidy/checks/performance-for-range-copy.html	2020-04-08 20:45:26 -04:00
Jay Berkenbilt	4977a7efa5	Bug fix: getStreamData should on unfilterable stream (fixes #425 )	2020-04-08 18:52:04 -04:00
Jay Berkenbilt	1e629c278a	Prepare 10.0.0 release	2020-04-06 11:30:15 -04:00
Jay Berkenbilt	c996f4ac33	Don't include <cwchar> if not building with wchar	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	77198d5310	Delegate random number generation to crypto provider (fixes #418 )	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	52749b85df	Make random data provider code thread-safe This uses C++-11 thread-safe static initializers now.	2020-04-06 10:00:43 -04:00
Jay Berkenbilt	619d294e9d	Remove QUtil::srandom	2020-04-06 09:49:02 -04:00
Dean Scarff	0f2507234f	Add OpenSSL/BoringSSL crypto provider Fixes qpdf/qpdf#417	2020-04-06 09:01:55 -04:00
Jay Berkenbilt	893d38b87e	Allow propagation of errors and retry through StreamDataProvider StreamDataProvider::provideStreamData now has a rich enough API for it to effectively proxy to pipeStreamData.	2020-04-05 20:07:13 -04:00
Jay Berkenbilt	7246404177	JSON: implement pattern keys in schema	2020-04-04 18:06:32 -04:00
Dean Scarff	c5c1a028cd	Use deterministic assignments for unique_id Fixes qpdf/qpdf#419	2020-04-04 08:29:28 -04:00
Jay Berkenbilt	2100b4ce15	Allow qpdf to be built on systems without wchar_t (fixes #406 )	2020-04-03 21:39:44 -04:00
Jay Berkenbilt	6a4117add9	Avoid potential segfault in warning methods	2020-04-03 21:39:20 -04:00
Jay Berkenbilt	4f3b89991b	placeFormXObject: allow control of shrink/expand (fixes #409 )	2020-04-03 21:39:17 -04:00
Jay Berkenbilt	b76b73b229	C API: accept any non-zero value as TRUE	2020-04-03 17:33:44 -04:00
Jay Berkenbilt	54726930df	Remove redundant methods in QUtil This was being saved until we had to break ABI.	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	5806e5c60c	QPDFPageObjectHelper::placeFormXObject: use std::string const& (fixes #374 )	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	97de12343b	Performance: remove Members indirection for Pipeline	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	bfda941519	Use an unordered map for SparseOHArray for efficiency This was added in C++11.	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	ee271fd2f2	Use auto for iterating over sparse array	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	70665cb381	Internally use unsafeShallowCopy where we can	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	38afdcea7b	Add QPDFObjectHandle::unsafeShallowCopy	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	07afb668b1	Performance: remove indirection through Members for QPDFObject	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	89f19b7099	Performance: remove Members indirection for QPDFObjectHandle	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	dac65a21fb	Look in form XObjects when removing unreferenced resources (fixes #373 ) If a page contains a form XObject, also filter the form XObject and remove its unreferenced resources.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	278710fbe8	Refactor QPDFPageObjectHelper::removeUnreferencedResources() Refactor removeUnreferencedResources to prepare for filtering form XObjects.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	bb6768b8f0	Include header for wcslen (fixes #405 )	2020-02-29 08:43:33 -05:00
Jay Berkenbilt	bb3137296d	Handle root /Pages pointing to other than page tree root (fixes #398 )	2020-02-22 11:10:31 -05:00
Jay Berkenbilt	52a2e95dd5	Prepare 9.1.1 release	2020-01-26 18:49:04 -05:00
Jay Berkenbilt	57c01ef81f	In qdf mode, don't write extra XRef streams (fixes #386 ) fix-qdf assumes there is exactly one XRef stream and that it is at the end of the file.	2020-01-26 16:50:57 -05:00
Jay Berkenbilt	bbc2f8ffae	Bug fix: handle ColorSpace lookup for inline images (fixes #392 ) If the value of /CS in the inline image dictionary was a key in the page's /Resource -> /ColorSpace dictionary, properly resolve it by referencing the proper colorspace, and not just the name, in the external image dictionary.	2020-01-26 15:29:10 -05:00
Cloudmersive	a8b6ff5763	Fix for Windows unable to acquire crypt context with new keyset (fixes #387 ) Fix is based on guidance https://support.microsoft.com/en-us/help/238187/cryptacquirecontext-use-and-troubleshooting and is the proper fix for #285/#286	2020-01-14 18:45:54 -05:00
Jay Berkenbilt	a44b5a34a0	Pull wmain -> main code from qpdf.cc into QUtil.cc	2020-01-14 11:40:51 -05:00
Jay Berkenbilt	ab4061f1ee	Add error detection for read_lines_from_file(FILE*)	2020-01-14 11:07:09 -05:00
Jay Berkenbilt	211a7f57be	QUtil::read_lines_from_file: optional EOL preservation	2020-01-13 11:26:18 -05:00
Jay Berkenbilt	9a398504ca	Refactor QUtil::read_lines_from_file This commit adds the preserve_eol flags but doesn't implement EOL preservation yet.	2020-01-13 09:19:53 -05:00
Jay Berkenbilt	9b0c6022d7	Prepare 9.1.0 release	2019-11-16 22:29:54 -05:00
Jay Berkenbilt	5e6dfc938e	Prepare 9.1.rc1 release	2019-11-09 22:00:53 -05:00
Jay Berkenbilt	c4478e5249	Allow odd/even modifiers in numeric range (fixes #364 )	2019-11-09 13:23:12 -05:00
Jay Berkenbilt	5508f74603	Allow /P in encryption dictionary to be positive (fixes #382 ) Even though this is disallowed by the spec, files like this have been encountered in the wild.	2019-11-09 12:33:15 -05:00
Jay Berkenbilt	127a957aee	Allow runtime inspection/override of crypto provider	2019-11-09 09:53:42 -05:00
Jay Berkenbilt	88bedb41fe	Implement gnutls crypto provider (fixes #218 ) Thanks to Zdenek Dohnal <zdohnal@redhat.com> for contributing the code used for the gnutls crypto provider.	2019-11-09 09:53:38 -05:00
Jay Berkenbilt	cc14523440	Update autoconf to support crypto selection	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d0a53cd3ea	Fix typos in configure.ac	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c03ced09c0	Isolate source files used for native crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d1ffe46c04	AES_PDF: move CBC logic from pipeline to AES_PDF implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c8cda4f965	AES_PDF: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	bb427bd117	SHA2: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	eadc222ff9	Rename SHA2 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	4287fcc002	RC4: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	0cdcd10228	Rename RC4 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	ce8f9b6608	MD5: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	5c3e856e9f	Rename MD5 implementation (non-bisectable) Just rename MD5 -> MD5_native in place so that git annotate will show the lines as having originated there.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	2de41856a0	QPDFCryptoProvider: initial implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	700f5b961e	Remove int type checks -- subsumed by C++-11	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	653ce3550d	Require C++-11 Includes updates to m4/ax_cxx_compile_stdcxx.m4 to make it work with msvc, which supports C++-11 with no flags but doesn't set __cplusplus to a recent value.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	9094fb1f8e	Fix two additional fuzz test cases	2019-11-03 18:59:12 -05:00
Masamichi Hosoda	5a842792b6	Parse Contents in signature dictionary without encryption Various PDF digital signing tools do not encrypt /Contents value in signature dictionary. Adobe Acrobat Reader DC can handle a PDF with the /Contents value not encrypted. Write Contents in signature dictionary without encryption Tests ensure that string /Contents are not handled specially when not found in sig dicts.	2019-10-22 16:20:21 -04:00
Masamichi Hosoda	cdc46d78f4	Add QPDFObject::getParsedOffset()	2019-10-22 16:19:06 -04:00
Masamichi Hosoda	50b329ee9f	Add QPDFWriter::getWrittenXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5cf4090aee	Add QPDFWriter::getRenumberedObjGen()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	46ac3e21b3	Add QPDF::getXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	06b818dcd3	Exclude signature dictionary from compressible objects It seems better not to compress signature dictionaries. Various PDF digital signing tools, including Adobe Acrobat Reader DC, do not compress signature dictionaries. Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that /ByteRange in the signature dictionary shall be used to describe a digest that does not include the signature value (/Contents) itself. The byte ranges cannot be determined if the dictionary is compressed.	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5e0ba12687	Fix /Contents value representation in a signature dictionary Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that the value of Contents entry is a hexadecimal string representation when ByteRange is specified. This commit makes QPDF always use hexadecimal string representation instead of literal strings for it.	2019-10-22 16:16:16 -04:00
Jay Berkenbilt	3094955dee	Prepare 9.0.2 release	2019-10-12 19:37:40 -04:00
Jay Berkenbilt	4ea940b03c	Prepare 9.0.1 release	2019-09-20 07:38:18 -04:00
Jay Berkenbilt	685250d7d6	Correct reversed Rectangle coordinates (fixes #363 )	2019-09-19 21:25:34 -04:00
Jay Berkenbilt	48b7de2cc3	Fix typo in comment	2019-09-19 21:04:32 -04:00
Jay Berkenbilt	8b1e307741	Warn for duplicated dictionary keys (fixes #345 )	2019-09-19 20:22:34 -04:00
Jay Berkenbilt	bb83e65193	Fix fuzz issue 16953 (overflow checking in xref stream index)	2019-09-17 19:48:47 -04:00
Jay Berkenbilt	17d431dfd5	Fix integer type warnings for big-endian systems	2019-09-17 19:14:27 -04:00
Jay Berkenbilt	5462dfce31	Prepare 9.0.0 release	2019-08-31 20:07:36 -04:00
Jay Berkenbilt	babd12c9b2	Add methods QPDF::anyWarnings and QPDF::closeInputSource	2019-08-31 15:51:20 -04:00
Jay Berkenbilt	4fa7b1eb60	Add remove_file and rename_file to QUtil	2019-08-31 15:51:04 -04:00
Jay Berkenbilt	0e51a9aca6	Don't encrypt trailer, fixes fuzz issue 15983 Ordinarily the trailer doesn't contain any strings, so this is usually a non-issue, but if the trailer contains strings, linearizing and encrypting with object streams would include encrypted strings in the trailer, which would blow out the padding because encrypted strings are longer than their cleartext counterparts.	2019-08-28 23:06:32 -04:00
Jay Berkenbilt	47a38a942d	Detect stream in object stream, fixing fuzz 16214 It's detected in QPDFWriter instead of at parse time because I can't figure out how to construct a test case in a reasonable time. This commit moves the fuzz file into the regular test suite for a QTC coverage case.	2019-08-28 12:49:04 -04:00
Jay Berkenbilt	ba5fb69164	Make popping pipeline stack safer Use destructors to pop the pipeline stack, and ensure that code that pops the stack is actually popping the intended thing.	2019-08-27 22:27:47 -04:00
Jay Berkenbilt	dadf8307c8	Fix fuzz issues 15316 and 15390	2019-08-27 20:39:06 -04:00
Jay Berkenbilt	456c285b02	Fix fuzz issue 16172 (overflow checking in OffsetInputSource)	2019-08-27 13:08:07 -04:00
Jay Berkenbilt	ad8081daf5	Fix fuzz issue 15442 (overflow checking in BufferInputSource)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	9a095c5c76	Seek in two stages to avoid overflow When seeking to a position based on a value read from the input, we are prone to integer overflow (fuzz issue 15442). Seek in two stages to move the overflow check into the input source code.	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	ac5e6de2e8	Fix fuzz issue 15387 (overflow checking xref size)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	6bc4cc3d48	Fix fuzz issue 15475	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	94e86e2528	Fix fuzz issue 16301	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	5da146c8b5	Track separately whether password was user/owner (fixes #159 )	2019-08-24 11:01:19 -04:00
Jay Berkenbilt	5a0aef55a0	Split long line	2019-08-24 10:58:51 -04:00
Jay Berkenbilt	2794bfb1a6	Add flags to control zlib compression level (fixes #113 )	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	dac0598b94	Add ability to set zlib compression level globally	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	3f1ab64066	Pass offset and length to ParserCallbacks::handleObject	2019-08-22 22:54:29 -04:00
Jay Berkenbilt	4b2e72c4cd	Test for direct, rather than resolved nulls in parser Just because we know an indirect reference is null, doesn't mean we shouldn't keep it indirect.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	3f3dbe22ea	Remove array null flattening For some reason, qpdf from the beginning was replacing indirect references to null with literal null in arrays even after removing the old behavior of flattening scalar references. This seems like a bad idea.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	225cd9dac2	Protect against coding error of re-entrant parsing	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	ae5bd7102d	Accept extraneous space before xref (fixes #341 )	2019-08-19 22:24:53 -04:00
Jay Berkenbilt	8a9086a689	Accept extraneous space after stream keyword (fixes #329 )	2019-08-19 21:43:44 -04:00
Jay Berkenbilt	43f91f58b8	Improve invalid name token warning message This message used to only appear for PDF >= 1.2. The invalid name is valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer version, it's better to detect and warn in all cases. Therefore make the warning more informative.	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	42d396f1dd	Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332 )	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	d9dd99eca3	Attempt to repair /Type key in pages nodes (fixes #349 )	2019-08-18 18:54:37 -04:00
Jay Berkenbilt	522d2b2227	Improve efficiency of fixDanglingReferences	2019-08-18 09:00:40 -04:00
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e5f504b6c5	Use SparseOHArray in QPDF_Array	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Thorsten Schöning	8f06da7534	Change list to vector for outline helpers (fixes #297 ) This change works around STL problems with Embarcadero C++ Builder version 10.2, but std::vector is more common than std::list in qpdf, and this is a relatively new API, so an API change is tolerable. Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com> for the fix.	2019-07-03 20:08:47 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	201e8798d7	Convert previously overlooked static cast to QIntC	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	04f45cf652	Treat all linearization errors as warnings This also reverts the addition of a new checkLinearization that distinguishes errors from warnings. There's no practical distinction between what was considered an error and what was considered a warning.	2019-06-23 13:45:45 -04:00
Jay Berkenbilt	c5ed1b8075	Handle invalid encryption Length (fixes #333 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	551dfbf697	Allow set*EncryptionParameters before filename is set (fixes #336 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	7bd38a3eb3	Provide error message in Windows crypto code (fixes #286 ) Thanks to github user zdenop for supplying some additional error-handling code.	2019-06-22 17:12:01 -04:00
Jay Berkenbilt	6c39aa8763	In shippable code, favor smart pointers (fixes #235 ) Use PointerHolder in several places where manual memory allocation and deallocation were being used. This helps to protect against memory leaks when exceptions are thrown in surprising places.	2019-06-22 16:57:52 -04:00
Jay Berkenbilt	85a3f95a89	qpdf: exit 3 for linearization warnings without errors (fixes #50 )	2019-06-22 16:57:51 -04:00
Jay Berkenbilt	1bde5c68a3	Add QUtil::read_file_into_memory This code was essentially duplicated between test_driver and standalone_fuzz_target_runner.	2019-06-22 10:14:25 -04:00
Jay Berkenbilt	658b5bb3be	QPDFWriter: clean up overloaded functions In a small number of cases, it makes sense to replace an overloaded function with a function that takes a default argument. We can do this now because we've already broken binary compatibility since the last release.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	79f6b4823b	Convert remaining public classes to use Members pattern Have classes contain only a single private member of type PointerHolder<Members>. This makes it safe to change the structure of the Members class without breaking binary compatibility. Many of the classes already follow this pattern quite successfully. This brings in the rest of the classes that are part of the public API.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	45dac410b5	Remove broken QPDFTokenizer::expectInlineImage	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	25dd3c6750	Remove QPDF::copyForeignObject with unused parameter	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	c6cfd64503	Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242 )	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	a35d4ce9cc	Fix bounds error in utf16_to_utf8 conversion	2019-06-21 17:40:24 -04:00
Jay Berkenbilt	63a643a3c7	Remove implicit conversion from int/pointer to bool This fixes cases of warning C4800 from msvc	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	f40ffc9d63	Pl_Flate: constructor's out_bufsize is now unsigned int This is the type we need for the underlying zlib implementation.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	42306e2ff8	QUtil: add unsigned int/string functions	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	2155815234	configure: determine wordsize automatically Based on sizeof(size_t). Assumes 64 if not 32.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	713d961990	Appearance streams: some floating point values were truncated Bounding box X coordinates could be truncated, causing them to be off by a fraction of a point. This was most likely not visible, but it was still wrong.	2019-06-20 21:32:30 -04:00
Jay Berkenbilt	eb7948876b	Fix problems found in fuzz corpus	2019-06-15 17:24:24 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	cd830968ef	Eliminate one potential integer overflow There are more to handle, but this resolves an issue already caught by oss-fuzz.	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	31bde2f9d7	Handle empty /DecodeParms array (fixes #331 ) On read, ignore /DecodeParms when empty list; on write, delete it. Some files have been found that include an empty list for /DecodeParms, but this is not technically compliant with the spec, and the only sensible interpretation is to treat it as if there are no decode parameters.	2019-06-09 17:19:49 -04:00
Jay Berkenbilt	b1a78be1a8	Prepare 8.4.2 release	2019-05-18 08:56:37 -04:00
Jay Berkenbilt	b3f0dbff62	Fix Windows memory error (fixes #330 )	2019-05-16 14:26:51 -04:00
Jay Berkenbilt	a323f6f49f	Prepare 8.4.1 release	2019-04-27 20:44:20 -04:00
Jay Berkenbilt	81205e007b	Spell check	2019-04-21 13:09:11 -04:00
Jay Berkenbilt	011695dfdf	Support Unicode in filenames (fixes #298 )	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310 )	2019-04-20 21:00:43 -04:00
Thorsten Schöning	2c704b99a1	Undefined functions because of missing std:: or header. (#295 ) * [bcc32 Error] QPDF.cc(375): E2268 Call to undefined function 'atof' Full parser context QPDF.cc(358): parsing: void QPDF::parse(const char ) [bcc32 Error] QPDFTokenizer.cc(183): E2268 Call to undefined function 'strtol' Full parser context QPDFTokenizer.cc(163): parsing: void QPDFTokenizer::resolveLiteral() * [bcc32 Error] pdf-split-pages.cc(52): E2268 Call to undefined function 'exit' Full parser context pdf-split-pages.cc(50): parsing: void usage() * PR #295: Including "cstdlib" should be replaced with "stdlib.h" to be more consistent. At the same time I changed the order of the surrounding includes to reflect alphabetical order, because in some files this has already been the case.	2019-03-12 10:05:29 -04:00
Thorsten Schöning	71b7ed9f4f	"_setmode" and "_stricmp" are not available on Borland C++Builder, neither the classic one nor newer ones based on CLANG.	2019-03-11 16:58:55 -04:00
Jay Berkenbilt	da7c2c0ee9	Fix json serialization for {x | -1 < x < 1} (fixes #308 ) JSON serialization was preserving the value as presented, but JSON doesn't accept decimal values without a 0 before the decimal point.	2019-03-11 16:22:59 -04:00
Jay Berkenbilt	03074ca5a0	Prepare 8.4.0 release	2019-02-01 22:25:25 -05:00
Jay Berkenbilt	fec5bb124c	Spell check	2019-01-31 21:41:29 -05:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	5211bcb5ea	Externalize inline images (fixes #278 )	2019-01-31 10:38:13 -05:00
Jay Berkenbilt	1eb35a355f	Exclude space after ID in image data	2019-01-31 10:38:10 -05:00
Jay Berkenbilt	2b6c79bcae	Improve locating inline image's EI We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	b776dcd2d3	Clean up some private functions	2019-01-29 22:14:20 -05:00
Jay Berkenbilt	8a9cfd2605	Handle direct page objects (fixes #164 )	2019-01-29 17:01:36 -05:00
Jay Berkenbilt	2d0885bc11	Clarify documentation for copyForeignObject regarding pages Make explicit that copyForeignObject can be used on page objects and will copy them properly but not update the pages tree.	2019-01-28 21:53:55 -05:00
Jay Berkenbilt	2712869cf9	Fix logic for when to compress object and xref streams (fixes #271 )	2019-01-28 21:43:06 -05:00
Jay Berkenbilt	52f9d326a5	Resolve duplicated page objects (fixes #268 ) When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.	2019-01-28 20:29:58 -05:00
Jay Berkenbilt	623f5b664e	Convert pages to form XObjects Support conversion of pages to form XObjects and placement of form XObjects on pages.	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	68ccd87c9e	Move rectangle transformation into QPDFMatrix	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	2d32f4db8f	Handle fallback font size in text appearances If we end up using our fallback font size when generating appearances for text fields, reflect that in the Tf operator used in the appearance stream.	2019-01-21 07:38:21 -05:00
Jay Berkenbilt	9cb599875b	Improve text objects used in text appearance streams	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	930eade6d3	Fix omissions in text appearance generation When generating appearance streams for variable text annotations, properly handle the cases of there being no appearance dictionary, no appearance stream, or an appearance stream with no BMC..EMC marker.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	65ef0bf313	When flattening, remove annotations with no appearance stream With the exception of form field annotations when /NeedAppearances is true, remove annotations that don't have appearance streams when flattening. There is no reason to keep these when flattening since they are invisible. This may include unchecked checkboxes, unshown popup windows, etc.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	c18ee440a3	mingw workaround for QPDFExc destructor mingw doesn't like it when you don't inline empty virtual destructors.	2019-01-19 10:14:07 -05:00
Jay Berkenbilt	e87d149918	Add QUtil::possible_repaired_encodings	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6ec22f117d	Modernize encryption API for more granularity Setting encryption permissions for R >= 3 set permission bits in groups corresponding to menu options in Acrobat 5. The new API allows the bits to be set individually.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4630377731	Add status-reporting transcoders to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	8f389f14c0	QUtil::analyze_encoding	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6817ca585a	Bidirectional transcoding for win, mac, pdf, utf8, utf16	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	698485468a	Move remaining existing transcoding to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	5cfcd4f361	Additional checks for unreferenced resources Explicitly abandon removal of unreferenced resources if there are any lexical errors in the page's contents. This case always generated a warning, but it now also prevents removal of unreferenced resources, thus strongly decreasing the likelihood of data loss.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4bc434000c	Copy subdictionaries when removing resources (fixes #276 ) When removing unreferenced resources, the code was copying the overall resource dictionaries but not the subdictionaries being modified. This was a "typo" in the code -- the comment clearly stated the need to do this, but the code replaced the dictionary with itself rather than with a shallow copy of itself.	2019-01-17 09:40:05 -05:00
Jay Berkenbilt	654c0e8caf	Allow adding the same page more than once in --pages (fixes #272 )	2019-01-12 10:01:47 -05:00
Jay Berkenbilt	4ecd1df6f2	Add configure option AVOID_WINDOWS_HANDLE If set, we avoid using Windows I/O HANDLE, which is disallowed in some versions of the Windows SDK, such as for Windows phones. QUtil::same_file will always return false in this case. Only applies to Windows builds.	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	d24a120c7f	Add QPDF::setImmediateCopyFrom	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	b653929c93	Update version to 8.3.0	2019-01-07 11:16:54 -05:00
Jay Berkenbilt	aa602fd107	Fix integer overflow in large file test	2019-01-07 08:49:14 -05:00
Jay Berkenbilt	c3cee5f154	Exercise out of scope original pdf for copyForeignObject	2019-01-07 07:38:03 -05:00
Jay Berkenbilt	fddbcab0e7	Mostly don't require original QPDF for copyForeignObject (fixes #219 ) The original QPDF is only required now when the source QPDFObjectHandle is a stream that gets its stream data from a QPDFObjectHandle::StreamDataProvider.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	fbbb0ee016	Make a static version of QPDF::pipeStreamData This is in preparation of being able to pipe a stream's data without keeping a copy of its containing qpdf object.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	7588cac295	Create an application-scope unique ID for each QPDF object Use this instead of QPDF* as a map key for object_copiers.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	e27ac682e0	Move encryption parameters into a class	2019-01-06 09:58:16 -05:00
Jay Berkenbilt	a70fbaaf50	Honor other base encodings when generating appearances	2019-01-05 23:01:59 -05:00
Jay Berkenbilt	b341d742db	Add WinAnsi and MacRoman encoding	2019-01-05 23:01:44 -05:00
Jay Berkenbilt	3ef1b77304	Refactor QUtil::utf8_to_ascii	2019-01-05 22:59:29 -05:00
Jay Berkenbilt	089ce5902e	Move utf8_to_utf16 into QUtil	2019-01-05 22:59:27 -05:00
Jay Berkenbilt	ae18bfd142	Refactor string transcoding in QPDF_String	2019-01-05 22:56:58 -05:00
Jay Berkenbilt	2e342ee5bb	Spell check	2019-01-04 21:33:14 -05:00
Jay Berkenbilt	16fd6e64f9	Add QPDFWriter::getFinalVersion (fixes #266 )	2019-01-04 12:37:22 -05:00
Jay Berkenbilt	837dcf8fc2	Don't call assert while checking linearization data (fixes #209 , #231 ) Instead of calling assert for problems found during checking linearization data, throw an exception which is later caught and issued as an error. Ideally we would handle errors more robustly, but this is still a significant improvement.	2019-01-04 11:55:42 -05:00
Jay Berkenbilt	a01359189b	Fix dangling references (fixes #240 ) On certain operations, such as iterating through all objects and adding new indirect objects, walk through the entire object structure and explicitly resolve any indirect references to non-existent objects. That prevents new objects from springing into existence and causing the previously dangling references to point to them.	2019-01-04 10:29:29 -05:00
Jay Berkenbilt	158156d506	Add basic appearance stream generation	2019-01-04 08:00:19 -05:00
Jay Berkenbilt	02281632cc	Add QUtil::utf8_to_ascii	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	b55567a0fa	Add special case setV code for button fields	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	e3144ac417	Add form fields to json output Also add some additional methods for detecting form field types to assist in the json creation and for later use.	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	ca94ac68d9	Honor flags when flattening annotations	2019-01-03 11:59:55 -05:00
Jay Berkenbilt	06d6438ddf	Minor fixes	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	3e74916c5a	Fix seg fault on empty xref stream (fixes #263 ) Thanks to @p-cher for supplying a patch.	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	3b8ce4f12a	Annotation flattening including form fields Flatten annotations by integrating their appearance streams into the content stream of the containing page. In the case of form fields, only flatten if /NeedAppearance is false (or equivalently absent). If flattening form fields, also remove /AcroForm from the document catalog.	2019-01-01 08:14:15 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	104fd6da52	Add matrix and annotation appearance stream handling Generate page content fragment for rendering appearance streams including all matrix calculation.	2019-01-01 08:07:21 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	daeb5a85b6	Transformation matrix	2018-12-31 18:23:47 -05:00
Jay Berkenbilt	3440ea7d3c	JSON::serialize -> unparse Unparse is admittedly strange, but I'd rather be strange and consistent, and everything else in the qpdf library uses unparse to serialize. (If you're reading this, the convention of using "unparse" comes from the "clu" programming language.)	2018-12-25 11:52:21 -05:00
Jay Berkenbilt	fa3664357b	Move numrange code from qpdf.cc to QUtil.cc Also move tests to libtests.	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	d5d179f441	Add document and object helpers for outlines (bookmarks)	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	30a0c070e4	Add QPDFObjectHandle::getJSON()	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	651179b5da	Add simple JSON serializer	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	0776c00129	Add QPDFNameTreeObjectHelper	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	cc500eda91	Minor cleanup	2018-12-21 17:25:31 -05:00
Jay Berkenbilt	6ef9e31233	Add QPDFPageLabelDocumentHelper	2018-12-18 16:59:24 -05:00
Jay Berkenbilt	f38df27aa3	Add QPDFNumberTreeObjectHelper	2018-12-18 16:46:10 -05:00
Jay Berkenbilt	077d3d4512	Add QPDFObjectHandle::wrapInArray() Wrap an object in an array if it is not already an array.	2018-12-18 16:45:48 -05:00
Jay Berkenbilt	d1368a3851	Commit automatically generated files	2018-10-11 17:27:54 -04:00
Jay Berkenbilt	6ee761fc86	Prepare 8.2.1 release	2018-08-18 10:56:19 -04:00
Jay Berkenbilt	5e9e17e62a	Prepare 8.2.0 release	2018-08-16 11:53:10 -04:00
Jay Berkenbilt	693cdaac35	Missing header for std::max	2018-08-16 11:53:10 -04:00
Jay Berkenbilt	b4ce557be5	Fix error in QPDFSystemError.cc	2018-08-14 11:39:07 -04:00
Jay Berkenbilt	b4bdc42b4f	New exception class QPDFSystemError (fixes #221 )	2018-08-13 20:01:51 -04:00
Jay Berkenbilt	5d9d80beba	Fix fallback logic for encryption (fixes #229 )	2018-08-12 22:32:40 -04:00
Jay Berkenbilt	60fe8061cb	Fix one more identifier (fixes #236 )	2018-08-12 22:01:51 -04:00
Jay Berkenbilt	a2f62935b3	Catch exceptions as const references (fixes #236 ) This fix allows qpdf to compile/test cleanly with gcc 8.	2018-08-12 21:57:52 -04:00
Jay Berkenbilt	3d6615b276	Pl_Buffer: reduce memory growth (fixes #228 ) Rather than keeping a list of buffers for every write, accumulate bytes in a single buffer, doubling the size of the buffer when needed to accommodate new data. This is not the best possible implementation, but the change was implemented in this way to avoid changing the shape of Pl_Buffer and thus breaking backward compatibility.	2018-08-12 17:45:43 -04:00
Jay Berkenbilt	3873f5fd9b	Protect headers with compliant identifiers (fixes #233 )	2018-08-12 14:10:32 -04:00
Jay Berkenbilt	932799baab	Fix memory access error A previous fix introduced a potential memory overrun under certain rare conditions. The test suite now once again passes with address sanitizer.	2018-08-12 13:16:17 -04:00
Jay Berkenbilt	b6e414b10b	Remove some extraneous null pointer checks (fixes #234 ) There were a few places in the code that were checking that a pointer wasn't null before deleting it, even though C++ has always allowed delete 0. Most of the code did not perform these checks.	2018-08-12 12:58:39 -04:00
Jay Berkenbilt	4a4736c695	Fix EOL handling inside strings (fixes #226 ) CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is to be ignored after backslash.	2018-08-05 20:48:35 -04:00
Jay Berkenbilt	1619cad1e8	Return correct method for string encryption (fixes #227 )	2018-08-05 16:58:21 -04:00
Jay Berkenbilt	e1cd5891af	Fix infinite loop on small files with progress reporting (fixes #230 ) Turns out you can keep adding zero to a number over and over again and it just doesn't get any bigger. Who would have known?	2018-08-05 15:43:34 -04:00
Jay Berkenbilt	4f4c627b77	ClosedFileInputSource: add method to keep file open During periods of intensive operation on a specific file, this method can reduce the overhead of repeated open/close operations.	2018-08-04 19:52:46 -04:00
Jay Berkenbilt	1bd2a2e79b	Prepare 8.1.0 release	2018-06-23 07:50:11 -04:00
Jay Berkenbilt	3aad28aed0	Bug fix: honor encryption key length with R=3 (fixes #212 )	2018-06-22 19:24:26 -04:00
Jay Berkenbilt	a433ed24f9	Add progress reporting for QPDFWriter (fixes #200 )	2018-06-22 16:14:54 -04:00
Jay Berkenbilt	2a82f6e1e0	Add method to get count of objects in QPDF	2018-06-22 15:53:40 -04:00
Jay Berkenbilt	c81836076f	Correct incorrect comment	2018-06-22 13:13:09 -04:00
Jay Berkenbilt	4ccc8b1a44	Add ClosedFileInputSource ClosedFileInputSource is an input source that keeps the file closed when not reading it.	2018-06-22 12:52:45 -04:00
Jay Berkenbilt	c71dc6888c	Don't prune resource dictionaries on errors or by request If we are unable to filter a page's content streams, don't attempt to remove objects from the page's resource dictionary. Also provide a command line option to suppress resource removal in case we ever need this as a workaround for some bug or broken PDF files.	2018-06-22 10:45:31 -04:00
Jay Berkenbilt	38c9ed23c3	Treat content stream parsing errors as an error, not a warning If parsing content streams is treated as a warning, there is no way for a caller to know if a parsing operation has failed. This is very dangerous and will likely result in data loss when token filters or parser callbacks are in use.	2018-06-22 10:44:08 -04:00
Jay Berkenbilt	6c89d4b35b	When splitting files, remove unreferenced objects (fixes #203 )	2018-06-21 21:03:30 -04:00
Jay Berkenbilt	ddd78c1b7f	Fix QPDFObjectHandle::shallowCopy It's not really a shallow copy. It just doesn't cross indirect object boundaries. The old implementation had a bug that would cause multiple shallow copies of the same object to share memory, which was not the intention.	2018-06-21 20:34:45 -04:00
Jay Berkenbilt	397b097c46	Allow setting a form field's value	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	952a665a4e	Better support for creating Unicode strings	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	e44c395c51	QUtil::toUTF16	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	0b05111db8	Implement helper class for interactive forms	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	2e7ee23bf6	Add QPDFPageDocumentHelper and QPDFPageObjectHelper This is the beginning of higher-level API support using helper classes. The goal is to be able to add more helpers without continuing to pollute QPDF's and QPDFObjectHandle's public interfaces.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	4cded10821	Add QPDFObjectHandle::Rectangle type Provide a convenient way of accessing rectangles.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	078cf9bf90	newline before endstream fix for object streams (fixes #205 )	2018-05-12 13:17:43 -04:00
Jay Berkenbilt	15ed9f8565	Fix small logic error in Token constructor (fixes #206 ) The special case around name token was not reachable. This would only affect constructors of name tokens that were represented in non-canonical form such as with a hex substitution for a printable character. The error was harmless but still a bug.	2018-05-05 17:47:56 -04:00
Jay Berkenbilt	b4d6cf6836	Limit depth of nesting in direct objects (fixes #202 ) This fixes CVE-2018-9918.	2018-04-15 16:11:22 -04:00
Jay Berkenbilt	f8c8e4dcc0	Prepare 8.0.2 release	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	e4e2e26d99	Properly handle pages with no contents (fixes #194 ) Remove calls to assertPageObject(). All cases in the library that called assertPageObject() work fine if you don't call assertPageObject() because nothing assumes anything that was being checked by that call. Removing the calls enables more files to be successfully processed.	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	1a4dcb4aaf	Pl_Buffer starts in a ready state	2018-03-06 11:31:03 -05:00
Jay Berkenbilt	ee44aef8d0	Treat loop in xref tables as damage (fixes #192 ) Prior to this fix, if there was a loop detected in following /Prev pointers in xref streams/tables, it would cause qpdf to lose data. Note that this condition causes many PDF readers to hang or fail.	2018-03-05 14:26:58 -05:00
Jay Berkenbilt	6fe1e9de40	Prepare 8.0.1 release	2018-03-04 07:16:20 -05:00
Jay Berkenbilt	7b9f23a99a	Ignore zlib data check errors (fixes #191 )	2018-03-03 11:35:01 -05:00
Jay Berkenbilt	3e8b643ae3	Release 8.0.0	2018-02-25 16:00:11 -05:00
Jay Berkenbilt	111ec50950	8.0.rc3	2018-02-25 14:17:59 -05:00
Jay Berkenbilt	d3d3970cf6	8.0.rc2	2018-02-25 13:50:22 -05:00
Jay Berkenbilt	a16d703f4d	Update version to 8.0.rc1 This is for testing the release process, particularly as it pertains to AppImage creation.	2018-02-25 09:03:27 -05:00
Jay Berkenbilt	82cae01a76	Bump version number and soname Bump to an alpha release. This version is not being widely released but is being used to push the new shared library version through the debian packaging system and to test out github releases.	2018-02-20 21:31:38 -05:00
Jay Berkenbilt	4bb3046f0b	Properly handle strings with PDF Doc Encoding (fixes #179 ) The QPDF_String::getUTF8Val() method was not treating strings that weren't explicitly Unicode as PDF Doc Encoded. This only affects characters in the range 0x80 through 0xa0.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	2780a1871d	Add C API for checking PDF files	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	d0e99f195a	More robust handling of type errors Give objects descriptions and context so it is possible to issue warnings instead of fatal errors for attempts to access objects of the wrong type.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	c2e16827b6	Replace "file position" with "offset" in error messages Sometimes it's an offset in an object stream or a content stream, so file position is confusing in some cases.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	52e024f701	Include omitted object description in error message	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	cb3b705cf9	Include filename in object stream parse error	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	21b7481b0e	Push members of QPDFObjectHandle into a Members object As in other cases, this is to enable adding new member variables in the future without breaking ABI compatibility.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	e410b0fe0d	Simplify TokenFilter interface Expose Pl_QPDFTokenizer, and have it do more of the work of managing the token filter's pipeline.	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	1fdd86a049	Move Pl_QPDFTokenizer to public interface	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5708b5d0aa	Add additional interface for filtering page contents	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	fd02944e19	Clean up comment	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5136238f2a	Detect and report bad tokens in content normalization	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	9910104442	Implement TokenFilter and refactor Pl_QPDFTokenizer Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	b8723e97f4	Add coalesce contents capability	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	25988e8d10	Bug fix: content normalizer should not add trailing newline Adding a trailing newline in content normalization damages files whose contents are split across streams in the middle of tokens. Let QPDFWriter add the newline with the indicator to ignore the newline, which it already does. This changes the way some qdf files look.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fcd611b61e	Refactor parseContentStream	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	05ff619b09	Remove redundant method Remove a redundant method that was equal to another one with additional arguments. This breaks binary compatibility, but there are other ABI breaking changes in the upcoming release, so now is the time to do it.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	55ee55394c	Use inline image token in content parser	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	ba453ba4ff	Use space tokens in tokenizer filter	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	ec538792fa	Use inline image token type in tokenizer filter	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fefe25030e	Inline image token type	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	2699ecf13e	Push QPDFTokenizer members into a nested structure This is for protection against future ABI breaking changes.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	d97474868d	Lexer enhancements: EOF, comment, space Significant enhancements to the lexer to improve EOF handling and to support comments and spaces as tokens. Various other minor issues were fixed as well.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	ebd5ed63de	Add option to save pass 1 of linearization This is useful only for debugging the linearization code.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	2ebdd6929e	Prepare 7.1.1 release	2018-02-04 18:31:42 -05:00
Jay Berkenbilt	e3167c1a60	Fix linearization for files with nonstandard ID length	2018-02-04 18:16:23 -05:00
Jay Berkenbilt	3b2a3cdd77	Fix setLineBuf for bsd (fixes #177 ) Use 0 instead of NULL in a cast.	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	d5bfd49cb2	Remove use of std::abs (fixes #172 ) Different compilers want different choices of headers for std::abs. It's easier just to not use it.	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	34a9b835b0	Fix indentation	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	7e5e1a7158	Fix offset in error message	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	633fb414af	Pl_QPDFTokenizer: Use unsigned_char_pointer instead of copy	2018-01-28 18:34:43 -05:00
Jay Berkenbilt	13d9756a45	Minor fixes to tokenizer	2018-01-28 18:34:43 -05:00
Jay Berkenbilt	2e4ca7ecf4	Update version numbers for 7.1.0	2018-01-14 20:09:20 -05:00
Jay Berkenbilt	04e47deaf9	Fixes for clang	2018-01-14 19:18:04 -05:00
Jay Berkenbilt	569d74d36b	Allow raw encryption key to be specified Add options to enable the raw encryption key to be directly shown or specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the idea and contribution of one implementation of this idea.	2018-01-14 10:21:05 -05:00
Jay Berkenbilt	3e306ae64c	Add QUtil::hex_decode	2018-01-14 09:04:13 -05:00
Jay Berkenbilt	791e0db762	Allow trailing . in numeric token (fixes #165 )	2018-01-13 20:05:40 -05:00
Jay Berkenbilt	ec0087e3ce	Support TIFF Predictor (fixes #171 )	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	53971d50be	Add Pl_TIFFPredictor	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	d9c9049708	Add signed support to BitStream and BitWriter	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	661ed1d28e	Minor fixes to Pl_PNGFilter Fix comment, remove restriction that doesn't actually matter.	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	be27d47bdc	Use better error for getStreamData failure If the stream isn't filterable but we call getStreamData, throw a regular exception instead of a logic error so that normal error handling and reporting mechanisms will be used.	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	4edfe1f41d	Add tests for new PNG filters	2017-12-25 18:20:52 -05:00
Jay Berkenbilt	a3a55be9cd	Correct errors in PNG filters and make use from library	2017-12-25 14:24:48 -05:00
Casey Rojas	9a48720246	Initial implementation of other PNG decode filters Initial implementation provided by Casey Rojas <crojas@infotechfl.com> Some problems are fixed in a subsequent commit.	2017-12-24 22:59:51 -05:00
Jay Berkenbilt	0f1ce8e646	Prepare 7.0.0 release	2017-09-16 13:22:15 -04:00
Jay Berkenbilt	249e95f608	Fix test failure on MSVC	2017-09-15 23:09:04 -04:00
Jay Berkenbilt	6898bc8d98	Spell check	2017-09-15 23:09:04 -04:00
Jay Berkenbilt	f2ffb6968a	Fix Windows compilation errors	2017-09-15 21:44:57 -04:00
Jay Berkenbilt	d31a7b76e7	Improve message for stream decoding error Tweak the message so that we inform the user that we are mitigating data loss.	2017-09-12 16:03:48 -04:00
Jay Berkenbilt	eaacf94005	Update C API with new QPDFWriter methods	2017-09-12 14:30:39 -04:00
Jay Berkenbilt	40ecba4172	Pl_DCT: Use custom source and destination managers (fixes #153 ) Avoid calling jpeg_mem_src and jpeg_mem_dest. The custom destination manager writes to the pipeline in smaller chunks to avoid having the whole image in memory at once. The source manager works directly with the Buffer object. Using custom managers avoids use of memory source and destination managers, which are not present in older versions of libjpeg still in use by some Linux distributions.	2017-09-07 22:59:11 -04:00
Jay Berkenbilt	3ef1be9783	PNGFilter: Better range checking for columns	2017-08-31 07:26:58 -04:00
Jay Berkenbilt	1868a10f8b	Replace all atoi calls with QUtil::string_to_int The latter catches underflow/overflow.	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	742190bd98	Pl_PNGFilter: disallow columns = 0	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	6d46346eb9	Detect integer overflow/underflow	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	e999bbae43	Fix memory leak with bad jpeg data	2017-08-28 22:16:45 -04:00
Jay Berkenbilt	c6872d2c70	Clean up circular references in QPDF_Stream	2017-08-28 22:16:31 -04:00
Jay Berkenbilt	728dc9e6d8	Fix error caught by clang	2017-08-26 21:51:17 -04:00
Jay Berkenbilt	dea704f0ab	Pad keys to avoid memory errors (fixes #147 )	2017-08-26 21:35:59 -04:00
Jay Berkenbilt	021c229331	Fix Pl_Flate memory leak on error (fixes #148 )	2017-08-25 22:26:53 -04:00
Jay Berkenbilt	ad527a64f9	Parse iteratively to avoid stack overflow (fixes #146 )	2017-08-25 21:56:45 -04:00
Jay Berkenbilt	85f05cc57f	Detect xref pointer infinite loop (fixes #149 )	2017-08-25 19:58:31 -04:00
Jay Berkenbilt	1e52d33822	Bump soname to 18 and version to 7.0.b1	2017-08-22 16:50:48 -04:00
Jay Berkenbilt	e452d9dca6	Spell check	2017-08-22 14:22:20 -04:00
Jay Berkenbilt	6219111ed7	Update references to README files Most of the README files have been renamed. Refer to the new names.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	83ec09f66c	Do memory checks Slightly improve memory cleanup in Pl_DCT. Make it easier to test with valgrind.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	fabff0f3ec	Limit token length during xref recovery While scanning the file looking for objects, limit the length of tokens we allow. This prevents us from getting caught up in reading a file character by character while digging through large streams.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	caf5e39c2e	Fix compiler warnings for clang/mac OS X	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	6884ad2ead	Fix logic error in recovery A stray semicolon caused a condition to be incorrectly applied during stream length recovery.	2017-08-22 07:19:41 -04:00
Jay Berkenbilt	ce435222b2	Push QPDFWriter member variables into a nested class	2017-08-21 22:04:07 -04:00
Jay Berkenbilt	a8c93bd324	Push QPDF member variables into a nested class Pushing member variables into a nested class enables addition of new member variables without breaking binary compatibility.	2017-08-21 21:35:11 -04:00
Jay Berkenbilt	198856a825	Improve pclm parameter settings	2017-08-21 21:05:48 -04:00
Jay Berkenbilt	8ab52fa558	Combine writePCLm with writeStandard Reduce code duplication	2017-08-21 21:05:48 -04:00
Jay Berkenbilt	9f60a864a0	Combine PCLm header into writeHeader	2017-08-21 21:05:47 -04:00
Jay Berkenbilt	adbcfcff2d	Remove duplicated coverage cases Remove duplicated coverage cases from Sahil's code so existing test suite passes.	2017-08-21 18:55:02 -04:00
Sahil Arora	b19210fa7d	QPDFWriter: Add setPCLm() and writePCLm() methods * Add support for PCLm using setPCLm() and writePCLm() methods in QPDFWriter.hh and QPDFWriter.cc * Add a function writePCLmHeader() for PCLm header in QPDFWriter	2017-08-21 18:55:02 -04:00
Jay Berkenbilt	ddc6cf0cf6	Precheck streams by default There is no need for a --precheck-streams option. We can do the precheck without imposing any penalty, only re-encoding the stream if it fails the first time.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	ae90d2c485	Implement Pl_DCT pipeline Additional testing is added in later commits to be supported by additional changes in the library.	2017-08-21 17:44:02 -04:00
Jay Berkenbilt	2d2f619665	Implement Pl_RunLength pipeline	2017-08-19 14:50:55 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132 )	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	8249a26d69	Fix infinite loop in QPDFWriter (fixes #143 )	2017-08-12 08:36:36 -04:00
Jay Berkenbilt	36b3fe5af7	Fix --newline-before-endstream option (fixes #133 ) Add a newline unconditionally before endstream even if a newline was already written as part of the stream data.	2017-08-11 20:57:05 -04:00
Jay Berkenbilt	46611f0710	Prevent a division by zero error (fixes #141 ) Bad /W in an xref stream could cause a division by zero error. Now this is handled as a special case.	2017-08-11 20:11:19 -04:00
Jay Berkenbilt	8fe0b06cd8	Pad encryption parameters that are too short (fixes #96 )	2017-08-11 19:53:56 -04:00
Jay Berkenbilt	e7d0019bf4	Generate libqpdf.map from autoconf Rather than checking consistency of libqpdf.map, generate it.	2017-08-11 04:56:22 -04:00
Jay Berkenbilt	6247aaa57c	Fix libqpdf.map and prevent future breakage The build now checks to make sure libqpdf.map has the right library version number in it.	2017-08-10 21:53:19 -04:00
Jay Berkenbilt	9a96e233b0	Remove PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	30f109e244	Read xref table without PCRE Also accept more errors than before.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	98a843c2a2	Reconstruct xref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ca5b1d267a	Improve stream length recovery Eliminate PCRE and find endobj not preceded by endstream. Be more lax about placement of endstream and endobj.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	3082e4e606	Find xref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	90840be594	Find lindict without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	03aa9679ac	Find startxref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	1765c6ec20	Find header without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	296b679d6e	Implement findFirst and findLast in InputSource Preparing to refactor some pattern searching code to use these instead of their own memchr loops. This should simplify the code that replaces PCRE.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ef8ae5449d	Allow QPDFTokenizer::readToken to return bad tokens Sometimes we want to ignore bad tokens rather than having them throw an exception. A coverage case is commented out here and added in a later commit.	2017-08-10 19:01:41 -04:00
Jay Berkenbilt	8fe261d8b4	QUtil::strcasecmp	2017-08-05 10:22:33 -04:00
Pranjal Bhor	6f88fd36ab	Include missing header in QPDFTokenizer.cc (fixes #125 ) Required for strtol()	2017-07-30 08:47:05 -04:00
Jay Berkenbilt	2d5b854468	Allow reading command-line args from files (fixes #16 )	2017-07-29 22:23:21 -04:00
Jay Berkenbilt	5993c3e83c	Detect input file = output file (fixes #29 )	2017-07-29 20:58:01 -04:00
Jay Berkenbilt	570db9b60b	Catch more exceptions while resolving objects	2017-07-29 19:31:12 -04:00
Jay Berkenbilt	b43a0ac237	When recovering stream length, indicate the length (fixes #44 )	2017-07-29 19:15:06 -04:00
Jay Berkenbilt	f37d399d82	Add newline-before-endstream option (fixes #103 )	2017-07-29 12:21:38 -04:00
Jay Berkenbilt	6a7d53ad2b	Handle zlib data errors better (fixes #106 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	07d6f770b2	Better recovery of bad stream start (fixes #104 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73 ) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	a136824243	Fix exception catch	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	ba2bae4acc	Use 1.2 as the version if we can't read it from the header The code was using 1.0, but we use /FlateDecode, which didn't appear until 1.2.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	3a1ff5ded9	Add option to preserve unreferenced objects	2017-07-28 19:19:11 -04:00
Jay Berkenbilt	a94a729fee	Explicitly check root dictionary type Very badly corrupted files may not have a retrievable root dictionary. Handle that as a special case so that a more helpful error message can be provided.	2017-07-28 18:03:30 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggressive prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	428d96dfe1	Convert many more errors to warnings	2017-07-27 22:57:55 -04:00
Jay Berkenbilt	a4fd4b91c6	Convert stream filtering errors to warnings	2017-07-27 18:43:07 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	dd8dad74f4	Move lexer helper functions to QUtil	2017-07-27 13:59:56 -04:00
Jay Berkenbilt	0a745021e7	Remove PCRE from QPDFTokenizer	2017-07-27 13:59:56 -04:00
slurdge	8740b380fe	Make windows includes lowercase (fixes #123 ) For cross compiling.	2017-07-26 06:39:09 -04:00
Jay Berkenbilt	12db09898e	Don't interpret word tokens in content streams (fixes #82 )	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	701b518d5c	Detect recursion loops resolving objects (fixes #51 ) During parsing of an object, sometimes parts of the object have to be resolved. An example is stream lengths. If such an object directly or indirectly points to the object being parsed, it can cause an infinite loop. Guard against all cases of re-entrant resolution of objects.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99 ) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assume objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	315092dd98	Avoid xref reconstruction infinite loop (fixes #100 ) This is CVE-2017-9209.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	603f222365	Fix infinite loop while reporting an error (fixes #101 ) This is CVE-2017-9210. The description string for an error message included unparsing an object, which is too complex of a thing to try to do while throwing an exception. There was only one example of this in the entire codebase, so it is not a pervasive problem. Fixing this eliminated one class of infinite loop errors.	2017-07-26 06:24:07 -04:00
Thorsten Schöning	b3c08f4f8d	C++-Builder supports 64 Bit file functions The 64 Bit file functions are supported by C++-Builder as well and need to be used, else fseek will error out on larger files than 4 GB like used in the large file test.	2016-01-24 12:07:20 -05:00
Jay Berkenbilt	b7302a9b72	Prepare 6.0.0 release	2015-11-10 12:48:52 -05:00
Jay Berkenbilt	1f4a67912c	Bump library soname Also update maintainer documentation on binary compatibility testing.	2015-11-10 12:42:37 -05:00
Jay Berkenbilt	e0e9d64674	Remove some ABI compatibility private methods Since we have to bump soname, remove some private methods that were just there for binary compatibility	2015-11-10 12:22:40 -05:00
Jay Berkenbilt	e5abc789a2	Prepare 5.2.0 release	2015-11-01 16:40:01 -05:00
Jay Berkenbilt	b62cbe2508	Tolerate some mangled xref tables If xref table entries lack the spec-required trailing whitespace or contain a small amount of extra space, handle them anyway.	2015-10-31 18:56:43 -04:00
Jay Berkenbilt	f0b85a1eb1	Remove trailing whitespace	2015-10-31 18:56:43 -04:00
Jay Berkenbilt	b029555909	Bump soname minor revision for ABI additions	2015-10-31 18:56:43 -04:00
Jay Berkenbilt	b8bdef0ad1	Implement deterministic ID For non-encrypted files, deterministic ID generation uses file contents instead of timestamp and file name. At a small runtime cost, this enables generation of the same /ID if the same inputs are converted in the same way multiple times.	2015-10-31 18:56:42 -04:00
Jay Berkenbilt	94e55394ed	Prepare 5.1.3 release	2015-05-24 17:26:49 -04:00
Jay Berkenbilt	cf43882e9f	Handle Microsoft crypt provider without prior keys As reported in issue #40, a call to CryptAcquireContext in SecureRandomDataProvider fails in a fresh windows install prior to any user keys being created in AppData\Roaming\Microsoft\Crypto\RSA. Thanks michalrames.	2015-05-24 16:52:42 -04:00
Jay Berkenbilt	a11549a566	Detect loops in /Pages structure Pushing inherited objects to pages and getting all pages were both prone to stack overflow infinite loops if there were loops in the Pages dictionary. There is a general weakness in the code in that any part of the code that traverses the Pages structure would be prone to this and would have to implement its own loop detection. A more robust fix may provide some general method for handling the Pages structure, but it's probably not worth doing. Note: addition of *Internal2 private functions was done rather than changing signatures of existing methods to avoid breaking compatibility.	2015-02-21 19:47:11 -05:00
Jay Berkenbilt	28a9df5119	Avoid buffer overrun copying digest Converting a password to an encryption key is supposed to copy up to a certain number of bytes from a digest. Make sure never to copy more than the size of the digest.	2015-02-21 17:51:08 -05:00
Jay Berkenbilt	c729e07d55	Avoid resolving arguments to R When checking two objects preceding R while parsing, ensure that the objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R from causing an infinite loop in object resolution.	2015-02-21 17:51:08 -05:00
Jay Berkenbilt	d8900c2255	Handle page tree node with no /Type Original reported here: https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413 The PDF specification says that the /Type key for nodes in the pages dictionary (both /Page and /Pages) is required, but some PDF files omit them. Use the presence of other keys to determine the type of pages tree node this is if the type key is not found.	2014-12-29 10:17:21 -05:00
Jay Berkenbilt	caab1b0e16	Handle pages with no /Contents from getPageContents() The spec allows /Contents to be omitted for pages that are blank, but QPDFObjectHandle::getPageContents() was throwing an exception in this case.	2014-11-14 13:43:34 -05:00
Jay Berkenbilt	4071db59aa	Prepare 5.1.2 release	2014-06-07 17:16:52 -04:00
Jay Berkenbilt	9f8aba1db7	Handle indirect stream filter/decode parameters QPDFWriter was trying to make /Filter and /DecodeParms direct in all cases, but there are some cases where /DecodeParms may refer to a stream, which can't be direct. QPDFWriter doesn't actually need /DecodeParms to be direct in that case because it won't be able to filter the stream. Until we can handle this type of stream, just don't make /Filter and /DecodeParms direct if we can't filter the stream anyway. Fixes #34	2014-06-07 16:31:03 -04:00
Jay Berkenbilt	b0a96ce6aa	Fix calculation of xref stream stream columns Fix problem: if the last object in the first part of a linearized file had an offset that was below 65536 by less than the size of the hint stream, the xref stream was invalid and the resulting file is not usable.	2014-02-22 22:13:31 -05:00
Jay Berkenbilt	247d70efee	Prepare 5.1.1 release	2014-01-14 15:45:35 -05:00
Jay Berkenbilt	c9a9fe9c2f	Avoid traversing same object twice when copying objects This is a performance fix. The output is unchanged. Fixes #28.	2013-12-26 11:51:50 -05:00
Jay Berkenbilt	0b6127558d	Prepare 5.1.0 release	2013-12-17 15:26:07 -05:00
Jay Berkenbilt	6067608d93	Remove needless #ifdef _WIN32 from getWhoami	2013-12-16 16:21:28 -05:00
Jay Berkenbilt	235d8f28f8	Increase random data provider support Add a method to get the current random data provider, and document and test the method for resetting it.	2013-12-16 16:21:28 -05:00
Jay Berkenbilt	b802ca47e9	Comments about incremental update support Also remove some trivial, non-functional code.	2013-12-14 15:17:36 -05:00
Jay Berkenbilt	30287d2d65	Allow OS-provided secure random to be disabled	2013-12-14 15:17:36 -05:00
Jay Berkenbilt	5e3bad2f86	Refactor random data generation Add new RandomDataProvider object and implement existing random number generation in terms of that. This enables end users to supply their own random data providers.	2013-12-14 15:17:35 -05:00
Jay Berkenbilt	e9a319fb95	Allow arbitrary whitespace, not just newline, after xref Fixes #27.	2013-12-14 15:17:23 -05:00
Jay Berkenbilt	7393a03868	Update lastOffset when reading	2013-12-14 15:17:07 -05:00
Jay Berkenbilt	478c05fcab	Allow -DNO_GET_ENVIRONMENT to avoid GetEnvironmentVariable If NO_GET_ENVIRONMENT is #defined at compile time on Windows, do not call GetEnvironmentVariable. QUtil::get_env will always return false. This option is not available through configure. This was added to support a specific user's requirements to avoid calling GetEnvironmentVariable from the Windows API. Nothing in qpdf outside the test coverage system in qtest relies on QUtil::get_env.	2013-11-30 15:58:32 -05:00
Jay Berkenbilt	dc9df97466	Include <algorithm> for std::min, std::max	2013-11-29 10:48:16 -05:00
Jay Berkenbilt	e1bd72b46c	Prepare for 5.0.1 release	2013-10-18 13:51:30 -04:00
Jay Berkenbilt	a237e92445	Warn when -accessibility=n will be ignored Also accept -accessibility=n with 256 bit keys even though it will be ignored.	2013-10-18 10:45:15 -04:00
Jay Berkenbilt	ac9c1f0d56	Security: replace operator[] with at For std::string and std::vector, replace operator[] with at. This was done using an automated process. See README.hardening for details.	2013-10-18 10:45:14 -04:00
Jay Berkenbilt	4229457068	Security: use a secure random number generator If not available, give an error. The user may also configure qpdf to use an insecure random number generator.	2013-10-18 10:45:12 -04:00
Jay Berkenbilt	e19eb579b2	Replace some assertions with std::logic_error Ideally, the library should never call assert outside of test code, but it does in several places. For some cases where the assertion might conceivably fail because of a problem with the input data, replace assertions with exceptions so that they can be trapped by the calling application. This commit surely misses some cases and replaced some cases unnecessarily, but it should still be an improvement.	2013-10-09 20:57:14 -04:00
Jay Berkenbilt	0bfe902489	Security: avoid pre-allocating vectors based on file data In places where std::vector<T>(size_t) was used, either validate that the size parameter is sane or refactor code to avoid the need to pre-allocate the vector.	2013-10-09 20:57:14 -04:00
Jay Berkenbilt	10bceb552f	Security: sanitize /W in xref stream The /W array was not sanitized, possibly causing an integer overflow in a multiplication. An analysis of the code suggests that there were no possible exploits based on this since the problems were in checking expected values but bounds checks were performed on actual values.	2013-10-09 20:57:07 -04:00
Jay Berkenbilt	3eb4b066ab	Security: better bounds checks for linearization data The faulty code was only used during explicit checks of linearization data. Those checks are not part of normal reading or writing of PDF files.	2013-10-09 19:50:09 -04:00
Jay Berkenbilt	b097d7a81b	Security: handle empty name in normalizeName	2013-10-09 19:50:09 -04:00
Jay Berkenbilt	eb1b1264b4	Security: fix potential multiplication overflow Better sanity check inputs to bit stream reader	2013-10-09 19:50:09 -04:00
Jay Berkenbilt	c2e91d8ec3	Security: keep cur_byte pointing into bytes array	2013-10-09 19:50:07 -04:00
Jay Berkenbilt	66e63b8667	Prepare 5.0.0 release	2013-07-10 12:29:13 -04:00
Jay Berkenbilt	cee2592ed1	Change API/ABI and withdraw 4.2.0 4.2.0 was binary incompatible in spite of there being no deletions or changes to any public methods. As such, we have to bump the ABI and are fixing some API breakage while we're at it. Previous 4.3.0 target is now 5.1.0.	2013-07-10 11:30:13 -04:00
Jay Berkenbilt	f31e526d67	Prepare 4.2.0 release	2013-07-07 19:43:16 -04:00
Jay Berkenbilt	b84f57e56d	Ignore broken DecodeParms for stream with no filters	2013-07-07 19:43:16 -04:00
Jay Berkenbilt	88bacb6449	Fix QPDFObjGen constructor implementation	2013-07-07 19:43:01 -04:00
Jay Berkenbilt	212812d837	Fix errors reported by Coverity Thanks to Jiri Popelka from Red Hat for sending the output of a Coverity run over qpdf.	2013-07-07 15:36:51 -04:00
Jay Berkenbilt	a85007cb0d	Handle more broken files Space rather than newline after xref, missing /ID in trailer for encrypted file. This enables qpdf to handle some files that xpdf can handle. Adobe reader can't necessarily handle them.	2013-06-15 12:40:01 -04:00
Jay Berkenbilt	16051788ed	Handle /Outlines dictionary being a direct object Even though this case is not valid according to the spec, it has been seen, and caused an internal error.	2013-06-14 21:36:04 -04:00
Jay Berkenbilt	eae8370cd9	Add optional /Length key in crypt filter dictionary	2013-06-14 20:42:39 -04:00
Jay Berkenbilt	a3576a7359	Bug fix: handle generation > 0 when generating object streams Rework QPDFWriter to always track old object IDs and QPDFObjGen instead of int, thus not discarding the generation number. Switch to QPDF::getCompressibleObjGen() to properly handle the case of an old object eligible for compression that has a generation of other than zero.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	96eb965115	Use QPDFObjectHandle::getObjGen() where appropriate In internal code and examples, replace calls to getObjectID() and getGeneration() with calls to getObjGen() where possible.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	5039da0b91	Add QPDFObjectHandle::getObjGen() This is safer than getObjectID() and getGeneration() for many uses.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	d88231e01e	Promote QPDF::ObjGen to top-level object QPDFObjGen	2013-06-14 14:58:08 -04:00
Jay Berkenbilt	690d6031db	Remove duplicated comment	2013-06-08 18:58:31 -04:00
Jay Berkenbilt	f02c5f5e12	Final preparation for 4.1.0 release	2013-04-14 15:03:51 -04:00
Jay Berkenbilt	403bb68d33	Run spelling checker	2013-04-14 14:36:25 -04:00
Jay Berkenbilt	2d02b3cc3d	Add explicit int to double cast	2013-04-04 14:13:31 -04:00
Jay Berkenbilt	8e636ea680	Protect gcc diagnostic pragmas with gcc version Versions prior to 4.6 didn't allow gcc diagnostic pragmas with push and pop or allow them to appear anywhere in the file. Just let the warning be there for those versions.	2013-03-27 17:36:28 -04:00
Jay Berkenbilt	29f5830325	Fix getTypeCode and getTypeName for indirect objects Remove const qualifier from the getTypeCode and getTypeName methods of QPDFObjectHandle, make them work properly for indirect objects, and exercise them much better in the test suite.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	119f2a4b68	Add method to terminate content stream parsing	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	fd64959398	Favor strerror_s and fopen_s on MSVC Make remaining calls to fopen and strerror use strerror_s and fopen_s on MSVC.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	ac4deac187	Call QUtil::safe_fopen in place of fopen fopen was previously called wrapped by QUtil::fopen_wrapper, but QUtil::safe_fopen does this itself, which is less cumbersome.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	7ccc9bd9d5	Remove all calls to strcpy	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	a51ae10b8d	Remove all calls to sprintf	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	66c3c8fdf7	Use portable versions of some UNIX-specific calls Remove needless calls to open, close, and fileno; call remove instead of unlink.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	6b9297882e	Mark secure CRT warnings with comment Put a specific comment marker next to every piece of code that MSVC gives warning 4996 for. This warning is generated for calls to functions that Microsoft considers insecure or deprecated. This change is in preparation for fixing all these cases even though none of them are actually incorrect or insecure as used in qpdf. The comment marker makes them easier to find so they can be fixed in subsequent commits.	2013-03-05 13:33:32 -05:00
Jay Berkenbilt	8be8277613	Rewrite QUtil::int_to_string and QUtil::double_to_string Make them safer by avoiding any internal limits and replacing sprintf with std::ostringstream.	2013-03-04 16:45:16 -05:00
Jay Berkenbilt	ed19516aa7	Fix unused local variable warnings	2013-03-04 16:45:16 -05:00
Jay Berkenbilt	30027481f7	Remove all old-style casts from C++ code	2013-03-04 16:45:16 -05:00
Jay Berkenbilt	32b62035ce	Replace many calls to sprintf with QUtil::hex_encode Add QUtil::hex_encode to encode binary data as a hexadecimal string, and use it in place of sprintf where possible.	2013-03-04 16:45:15 -05:00
Jay Berkenbilt	6c7bf114dc	Bug fix: properly handle overridden compressed objects When caching objects in an object stream, only cache objects that still resolve to that stream. See Changelog mod from this commit for details.	2013-02-23 17:51:17 -05:00
Jay Berkenbilt	a5d8783f67	Improve qpdf --check Fix exit status for case of errors without warnings, continue after errors when possible, add test case for parsing a file with content stream errors on some but not all pages.	2013-01-25 11:08:50 -05:00
Jay Berkenbilt	bfda717749	Cosmetic changes to be closer to Adobe terminology Change object type Keyword to Operator, and place the order of the object types in object_type_e in the same order as they are mentioned in the PDF specification. Note that this change only breaks backward compatibility with code that has not yet been released.	2013-01-23 09:38:05 -05:00
Jay Berkenbilt	913eb5ac35	Add getTypeCode() and getTypeName() Add virtual methods to QPDFObject, wrappers to QPDFObjectHandle, and implementations to all the QPDF_Object types.	2013-01-22 10:01:45 -05:00
Jay Berkenbilt	f81152311e	Add QPDFObjectHandle::parseContentStream method This method allows parsing of the PDF objects in a content stream or array of content streams.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	1d88955fa6	Added new QPDFObjectHandle types Keyword and InlineImage These object types are to facilitate content stream parsing.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	a844c2a3ab	Set version to 4.1.a0 Next released version will be 4.1.0 since new APIs are being added.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	8708fd373d	Prepare 4.0.1 release	2013-01-17 09:51:04 -05:00
Jay Berkenbilt	80fa4e01a1	Set version number to 4.0.0+	2013-01-03 16:42:10 -05:00
Jay Berkenbilt	0e9949afde	Update versions for 4.0.0 release	2012-12-31 11:43:27 -05:00
Jay Berkenbilt	3e96148aa5	Fix spelling errors Fixed spelling errors in previously published commits and update spelling dictionary	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	f8306913ba	Update "C" API with functions for new features	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	9eb5982fa3	Avoid modifying trailer when writing When preparing the trailer for writing to the new file, trim a copy of the trailer instead of the original file's trailer.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	0ea70e5dae	Update shared library major version to 10 The upcoming 3.1 release contains non-compatible API changes, though they only affect parts of the interface that are extremely unlikely to have been used outside of qpdf itself. The methods and data types affected were used for communication between QPDFWriter and QPDF and would have had no real use in end user code.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	9a23c3dcb6	Remove /Crypt from stream filters unconditionally When writing a new stream, always remove /Crypt even if we are not otherwise able to filter the stream.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	4237a29c94	Refactor Dictionary writing code Original code was written before we could shallow copy objects, so all the filtering was done by suppressing the output of certain keys and replacing them with other keys. Now we can simplify the code greatly by modifying shallow copies of dictionaries in place.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	e57c25814e	Support for encryption with /V=5 and /R=5 and /R=6 Read and write support is implemented for /V=5 with /R=5 as well as /R=6. /R=5 is the deprecated encryption method used by Acrobat IX. /R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	93ac1695a4	Support files with only attachments encrypted Test cases added in a future commit since they depend on /R=6 support.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	4eccb9d87b	Add random number functions to QUtil	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	16a23368e7	Fix infinite loop trimming passwords with ( in them	2012-12-31 10:32:31 -05:00
Jay Berkenbilt	0873e42300	SHA2 pipeline with support for 256, 384, and 512 bits Implemented pipeline around sph sha calls using standard test vectors for full-byte values. Did not test or support partial byte values.	2012-12-31 05:36:51 -05:00
Jay Berkenbilt	c9da66a018	Incorporate sha2 code from sphlib 3.0 Changes from upstream are limited to change #include paths so that I can place header files and included "c" files in a subdirectory. I didn't keep the unit tests from sphlib but instead verified them by running them manually. I will implement the same tests using the Pl_SHA2 pipeline except that sphlib's sha2 implementation supports partial bytes, which I will not exercise in qpdf or our tests.	2012-12-31 05:36:51 -05:00
Jay Berkenbilt	3680922ae5	Allow specification of AES initialization vector	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	9b42f526df	Update AES classes to work with 256-bit keys	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	774584163f	Add ExtensionLevel support to version handling All version operations are now fully aware of extension levels.	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	3101955ac0	Add V5 parameters to EncryptionData	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	68447bb556	change EncryptionData	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	04c203ae06	Eliminate flattenScalarReferences	2012-12-31 05:36:48 -05:00
Jay Berkenbilt	b4e7d6ed32	Improve memory safety of finding PDF header	2012-12-25 15:13:44 -05:00
Jay Berkenbilt	7f84239cad	Find PDF header anywhere in the first 1024 bytes	2012-12-25 14:43:37 -05:00
Jay Berkenbilt	f256670eba	Ignore objects with offset 0	2012-11-20 13:57:37 -05:00
Jay Berkenbilt	041397fdab	Allow reading from InputSource and writing to Pipeline Allowing users to subclass InputSource and Pipeline to read and write from/to arbitrary sources provides the maximum flexibility for users who want to read and write from other than files or memory.	2012-09-23 17:42:26 -04:00
Jay Berkenbilt	8c99e4a6c0	Indicate pre-release version	2012-09-23 17:41:08 -04:00
Jay Berkenbilt	b4dc0f072a	Prepare 3.0.2 release	2012-09-06 15:47:58 -04:00
Jay Berkenbilt	7e4a079674	Update libtool data for API changes	2012-09-06 15:31:35 -04:00
Jay Berkenbilt	c1627d0438	Add QPDFWriter::setExtraHeaderText	2012-09-06 15:31:12 -04:00
Jay Berkenbilt	fc4c82a950	Reset state in QPDF::calculateLinearizationData This makes it possible to use two different writers to write linearized files from the same QPDF object.	2012-09-06 15:28:16 -04:00
Jay Berkenbilt	8d2b29ef98	Fix segmentation fault with use of QPDFWriter::setOutputMemory	2012-09-06 14:39:06 -04:00
Jay Berkenbilt	59432b5c70	Prepare 3.0.1 release	2012-08-11 13:41:18 -04:00
Jay Berkenbilt	29e9c34fe3	Bug fix: let EOF resolve literal token Previously only whitespace and comments did it. This fix is needed for object streams whose last object is a literal (name, integer, real, string) not terminated by space or newline.	2012-08-11 09:29:04 -04:00
Jay Berkenbilt	137dc7acb9	Refactor: move resolution of literal to its own method	2012-08-11 09:22:59 -04:00
Jay Berkenbilt	511e68758c	Update version to 3.0.0	2012-08-02 06:52:33 -04:00
Jay Berkenbilt	32051283b9	Fix spelling errors	2012-07-29 14:44:12 -04:00
Jay Berkenbilt	2280c4f6d1	Update documentation and version numbers 3.0.rc1	2012-07-28 22:03:36 -04:00
Tobias Hoffmann	9c00874e77	added QPDFObjectHandle::replaceStreamData(std::string data).	2012-07-25 03:02:46 +02:00
Jay Berkenbilt	316328704b	Windows compilation fixes	2012-07-21 20:51:56 -04:00
Jay Berkenbilt	6bbea4baa0	Implement QPDFObjectHandle::parse Move object parsing code from QPDF to QPDFObjectHandle and parameterize the parts of it that are specific to a QPDF object. Provide a version that can't handle indirect objects and that can be called on an arbitrary string. A side effect of this change is that the offset used when reporting invalid stream length has changed, but since the new value seems like a better value than the old one, the test suite has been updated rather than making the code backward compatible. This only affects the offset reported for invalid streams that lack /Length or have an invalid /Length key. Updated some test code and examples to use QPDFObjectHandle::parse. Supporting changes include adding a BufferInputSource constructor that takes a string.	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	f3e267fce2	Move readToken from QPDF to QPDFTokenizer	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	15eaed5c52	Refactor: pull *InputSource out of QPDF InputSource, FileInputSource, and BufferInputSource are now top-level classes instead of privately nested inside QPDF.	2012-07-21 09:06:06 -04:00
Jay Berkenbilt	8657c6f004	Prevent seeking before beginning of BufferInputSource	2012-07-18 09:50:05 -04:00
Jay Berkenbilt	a101533e0a	Add command line option to copy encryption from other file Add --copy-encryption and --encryption-file-password options to qpdf. Also strengthen test suite for copying encryption. The strengthened test suite would have caught the failure to preserve AES and the failure to update the file version, which was invalidating the encrypted data.	2012-07-15 21:15:24 -04:00
Jay Berkenbilt	b26ce88ea1	Minor fixes to copyEncryptionParameters These fixes were to code added yesterday; the problems would not have impacted any previously released code. These are all changes related to the possibility that copyEncryptionParameters may be called on behalf of a different QPDF than the one being written.	2012-07-15 21:14:02 -04:00
Jay Berkenbilt	db95960ac1	Bug fix: preserve AES when copying encryption parameters	2012-07-15 19:07:59 -04:00
Jay Berkenbilt	b501251291	qpdf: push inherited attributes to page when showing images from qpdf command-line tool	2012-07-15 16:22:28 -04:00
Jay Berkenbilt	0575d77d77	Add public QPDFWriter::copyEncryptionParameters Method to copy encryption parameters from another file. Adapted from existing code to copy encryption parameters from the original file.	2012-07-14 09:14:41 -04:00
Jay Berkenbilt	1c944e4c89	Have QPDFWriter detect foreign objects while writing Throw an exception that directs the user to QPDF::copyForeignObject.	2012-07-14 08:07:23 -04:00
Jay Berkenbilt	e7b8f297ba	Support copying objects from another QPDF object This includes QPDF::copyForeignObject and supporting foreign objects as arguments to addPage*.	2012-07-11 15:54:33 -04:00
Jay Berkenbilt	8a217eb3a2	Add concept of reserved objects QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved provide a mechanism to add objects to a PDF file when there are circular references. This is a prerequisite to copying objects from one PDF to another.	2012-07-10 23:34:32 -04:00
Jay Berkenbilt	1dc25c0217	Fix: make unparse virtual for Null and Real	2012-07-08 16:01:12 -04:00
Tobias Hoffmann	8720446b23	Added assertNumber and assertScalar to QPDFObjectHandle	2012-07-07 18:55:08 -04:00
Tobias Hoffmann	a8266ccb0e	Added public assert{Type} methods to QPDFObjectHandle	2012-07-07 18:53:38 -04:00
Tobias Hoffmann	39bbaa86e3	Build this->all_pages while traversing with pushInheritedAttributesToPage	2012-07-07 17:45:10 -04:00
Jay Berkenbilt	e2dedde4bd	Don't require stream data provider to know length in advance Breaking API change: length parameter has disappeared from the StreamDataProvider version of QPDFObjectHandle::replaceStreamData since it is no longer necessary to compute it in advance. This breaking change is justified by the fact that removing the length parameter provides the caller an opportunity to simplify the calling code.	2012-07-07 17:33:45 -04:00
Jay Berkenbilt	8705e2e8fc	Add QPDFWriter method to output to FILE*	2012-07-05 21:24:04 -04:00
Tobias Hoffmann	abb53ac369	Limited inheritance to the attributes explicitly listed in the PDF spec Previous versions of qpdf incorrectly passed arbitrary objects from /Pages objects down to individual pages in direct contradiction with the PDF specification. These are now left in /Pages. When intermediate /Pages nodes are being discarded as when the /Pages tree is being flattened, a warning is issued when unknown keys are encountered.	2012-07-04 23:04:55 -04:00
Tobias Hoffmann	7770a1b036	Added public method QPDF::pushInheritedAttributesToPage Refactored optimizePagesTree to pushInheritedAttributesToPage and made public	2012-07-04 16:24:03 -04:00
Jay Berkenbilt	5f59c32f87	Add a few minor enhancements to recent work Test coverage case for new newStream method Expose decimal_places argument for double-based newReal All enhancements suggested by Tobias.	2012-06-27 10:43:27 -04:00
Tobias Hoffmann	f07e3370f0	Add Pl_Concatenate filter	2012-06-27 10:20:38 -04:00
Tobias Hoffmann	43c404b45a	Add QPDFObjectHandle::newStream(QPDF *, std::string const&) This makes the code simpler than having to create a buffer of a fixed size and copy the string to it.	2012-06-27 10:19:57 -04:00
Tobias Hoffmann	75054c0b94	Add QPDFObjectHandle::newReal(double)	2012-06-27 10:19:01 -04:00
Jay Berkenbilt	2266c6232b	Rework InputSource::readLine to make it much more efficient This rework makes xref reconstruction run much faster and use much less memory.	2012-06-27 06:48:06 -04:00
Jay Berkenbilt	736bafbb9c	Rename seek functions in QUtil	2012-06-26 23:10:10 -04:00
Jay Berkenbilt	0802ba275f	Visual C++ and mingw32 fixes for large files	2012-06-26 23:05:59 -04:00
Jay Berkenbilt	5e3167e856	Set version to 3.0.a0	2012-06-25 21:35:30 -04:00
Jay Berkenbilt	1a3e88ca09	Fix large file support for 32-bit Linux	2012-06-25 10:51:44 -04:00
Jay Berkenbilt	c16db4106c	Increase padding in linearized files With QPDF allowing integers to contain 64-bit quantities, this change is necessary to be able to linearize files whose sizes might be larger than 10 digits.	2012-06-24 15:56:59 -04:00
Jay Berkenbilt	8318d81ada	Fix and test support for files >= 4 GB	2012-06-24 15:56:50 -04:00
Jay Berkenbilt	781c313058	Change QPDF_Integer from int to long long This makes it possible to store offsets that are larger than 2 GB in the trailer dictionary.	2012-06-24 15:20:01 -04:00
Jay Berkenbilt	4f305488d8	Improve the FILE* version of QPDF::processFile	2012-06-23 18:23:06 -04:00
Tobias Hoffmann	7f95ad5b92	Fixed missing throw	2012-06-23 18:17:01 -04:00
Jay Berkenbilt	bf059a6001	Replace the 8-bit characters with \x.. in QPDFWriter.cc This just makes it safer to pull up this file in an editor.	2012-06-23 09:05:06 -04:00
Jay Berkenbilt	6c0af0844c	Switch some code to use empty newArray/newDictionary	2012-06-22 10:09:42 -04:00
Jay Berkenbilt	b6bdc0f595	Add factory methods for creating empty arrays and dictionaries. Also updated pdf_from_scratch test driver to use the new factories, and made some cosmetic improvements and documentation updates for the emptyPDF() method.	2012-06-22 09:46:33 -04:00
Jay Berkenbilt	a0768e4190	Add QPDF::emptyPDF() and pdf_from_scratch test code	2012-06-21 23:09:05 -04:00
Jay Berkenbilt	81e8752362	Use qpdf_offset_t in place of off_t in public APIs. off_t is used internally only when needed to talk to standard libraries. This requires that the "long long" type be supported by the compiler.	2012-06-21 21:23:24 -04:00
Jay Berkenbilt	d1ebe30ff6	Add QPDFObjectHandle::shallowCopy()	2012-06-21 16:15:09 -04:00
Jay Berkenbilt	9689f4cdcf	Use getRoot() instead of looking it up in the trailer	2012-06-21 16:15:09 -04:00
Jay Berkenbilt	11d33a45fa	Iterate over /Info's keys, not trailer's keys, to seed /ID Thanks Tobias Hoffmann for noticing the error.	2012-06-21 15:52:53 -04:00
Jay Berkenbilt	3844aedd93	Add testing for page APIs	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	eb802cfa8c	Implement page manipulation APIs	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	e01ae1968b	Split page handling APIs into a separate source file	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	df493c352f	Refactor optimizePagesTree Split optimizePagesTree into a simpler top-level routine and a recursive internal routine.	2012-06-21 15:01:02 -04:00
Tobias Hoffmann	5d3f93be29	Added first version of pages API.	2012-06-21 15:01:02 -04:00
Tobias Hoffmann	47a846a7e0	Added method to clear pages cache.	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	3b413ca87c	Fix typo in comment	2012-06-21 15:00:58 -04:00
Jay Berkenbilt	f59ff6fcc2	fix include order for off_t	2012-06-21 14:11:22 -04:00
Jay Berkenbilt	fbe68d63f0	fix doc comment	2012-06-21 10:59:33 -04:00
Jay Berkenbilt	bc1c4bb578	Add QPDF::processFile that takes an open FILE*	2012-06-21 08:00:35 -04:00
Tobias Hoffmann	db7474e0fa	Added additional array mutators Added methods to append to arrays, insert items into arrays, and replace array contents with a vector of items.	2012-06-20 15:29:44 -04:00
Jay Berkenbilt	b2e6818935	Fix wording error in error message	2012-06-20 15:29:42 -04:00
Jay Berkenbilt	5d4cad9c02	ABI change: fix use of off_t, size_t, and integer types Significantly improve the code's use of off_t for file offsets, size_t for memory sizes, and integer types in cases where there has to be compatibility with external interfaces. Rework sections of the code that would have prevented qpdf from working on files larger than 2 (or maybe 4) GB in size.	2012-06-20 15:20:26 -04:00
Jay Berkenbilt	24e2b2b76f	Fix gcc 4.7 warnings about C++11	2012-06-20 15:18:14 -04:00
Jay Berkenbilt	92c94e7df2	Add symbol versioning For ELF systems, turn on versioned symbols by default, and add a configure option to enable or disable them.	2012-06-20 15:18:12 -04:00
Jay Berkenbilt	01bcda8974	fix PCRE calls to remove use of deprecated API pcre_info -> pcre_fullinfo. Closes issue 3489349. Thanks Tim Harder.	2012-04-06 21:47:46 -04:00
Jay Berkenbilt	8e9fe21316	Update for 2.3.1	2011-12-28 17:19:40 -05:00
Jay Berkenbilt	92f0207de8	fix MSVC 2010 issues	2011-12-28 16:40:33 -05:00
Jay Berkenbilt	11314a9551	Don't declare any PCRE objects static.	2011-12-28 14:32:33 -05:00
Jay Berkenbilt	1d1d21d3fe	ready for 2.3.0 release	2011-08-11 15:34:41 -04:00

1696 Commits