octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-11-11 23:45:47 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	2794bfb1a6	Add flags to control zlib compression level (fixes #113 )	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	dac0598b94	Add ability to set zlib compression level globally	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	3f1ab64066	Pass offset and length to ParserCallbacks::handleObject	2019-08-22 22:54:29 -04:00
Jay Berkenbilt	4b2e72c4cd	Test for direct, rather than resolved nulls in parser Just because we know an indirect reference is null, doesn't mean we shouldn't keep it indirect.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	3f3dbe22ea	Remove array null flattening For some reason, qpdf from the beginning was replacing indirect references to null with literal null in arrays even after removing the old behavior of flattening scalar references. This seems like a bad idea.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	225cd9dac2	Protect against coding error of re-entrant parsing	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	ae5bd7102d	Accept extraneous space before xref (fixes #341 )	2019-08-19 22:24:53 -04:00
Jay Berkenbilt	8a9086a689	Accept extraneous space after stream keyword (fixes #329 )	2019-08-19 21:43:44 -04:00
Jay Berkenbilt	43f91f58b8	Improve invalid name token warning message This message used to only appear for PDF >= 1.2. The invalid name is valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer version, it's better to detect and warn in all cases. Therefore make the warning more informative.	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	42d396f1dd	Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332 )	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	d9dd99eca3	Attempt to repair /Type key in pages nodes (fixes #349 )	2019-08-18 18:54:37 -04:00
Jay Berkenbilt	522d2b2227	Improve efficiency of fixDanglingReferences	2019-08-18 09:00:40 -04:00
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e5f504b6c5	Use SparseOHArray in QPDF_Array	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Thorsten Schöning	8f06da7534	Change list to vector for outline helpers (fixes #297 ) This change works around STL problems with Embarcadero C++ Builder version 10.2, but std::vector is more common than std::list in qpdf, and this is a relatively new API, so an API change is tolerable. Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com> for the fix.	2019-07-03 20:08:47 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	201e8798d7	Convert previously overlooked static cast to QIntC	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	04f45cf652	Treat all linearization errors as warnings This also reverts the addition of a new checkLinearization that distinguishes errors from warnings. There's no practical distinction between what was considered an error and what was considered a warning.	2019-06-23 13:45:45 -04:00
Jay Berkenbilt	c5ed1b8075	Handle invalid encryption Length (fixes #333 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	551dfbf697	Allow set*EncryptionParameters before filename is set (fixes #336 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	7bd38a3eb3	Provide error message in Windows crypto code (fixes #286 ) Thanks to github user zdenop for supplying some additional error-handling code.	2019-06-22 17:12:01 -04:00
Jay Berkenbilt	6c39aa8763	In shippable code, favor smart pointers (fixes #235 ) Use PointerHolder in several places where manual memory allocation and deallocation were being used. This helps to protect against memory leaks when exceptions are thrown in surprising places.	2019-06-22 16:57:52 -04:00
Jay Berkenbilt	85a3f95a89	qpdf: exit 3 for linearization warnings without errors (fixes #50 )	2019-06-22 16:57:51 -04:00
Jay Berkenbilt	1bde5c68a3	Add QUtil::read_file_into_memory This code was essentially duplicated between test_driver and standalone_fuzz_target_runner.	2019-06-22 10:14:25 -04:00
Jay Berkenbilt	658b5bb3be	QPDFWriter: clean up overloaded functions In a small number of cases, it makes sense to replace an overloaded function with a function that takes a default argument. We can do this now because we've already broken binary compatibility since the last release.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	79f6b4823b	Convert remaining public classes to use Members pattern Have classes contain only a single private member of type PointerHolder<Members>. This makes it safe to change the structure of the Members class without breaking binary compatibility. Many of the classes already follow this pattern quite successfully. This brings in the rest of the classes that are part of the public API.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	45dac410b5	Remove broken QPDFTokenizer::expectInlineImage	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	25dd3c6750	Remove QPDF::copyForeignObject with unused parameter	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	c6cfd64503	Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242 )	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	a35d4ce9cc	Fix bounds error in utf16_to_utf8 conversion	2019-06-21 17:40:24 -04:00
Jay Berkenbilt	63a643a3c7	Remove implicit conversion from int/pointer to bool This fixes cases of warning C4800 from msvc	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	f40ffc9d63	Pl_Flate: constructor's out_bufsize is now unsigned int This is the type we need for the underlying zlib implementation.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	42306e2ff8	QUtil: add unsigned int/string functions	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	2155815234	configure: determine wordsize automatically Based on sizeof(size_t). Assumes 64 if not 32.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	713d961990	Appearance streams: some floating point values were truncated Bounding box X coordinates could be truncated, causing them to be off by a fraction of a point. This was most likely not visible, but it was still wrong.	2019-06-20 21:32:30 -04:00
Jay Berkenbilt	eb7948876b	Fix problems found in fuzz corpus	2019-06-15 17:24:24 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	cd830968ef	Eliminate one potential integer overflow There are more to handle, but this resolves an issue already caught by oss-fuzz.	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	31bde2f9d7	Handle empty DecodeParams array (fixes #331 ) On read, ignore /DecodeParms when empty list; on write, delete it. Some files have been found that include an empty list for /DecodeParms, but this is not technically compliant with the spec, and the only sensible interpretation is to treat it as if there are no decode parameters.	2019-06-09 17:19:49 -04:00
Jay Berkenbilt	b1a78be1a8	Prepare 8.4.2 release	2019-05-18 08:56:37 -04:00
Jay Berkenbilt	b3f0dbff62	Fix Windows memory error (fixes #330 )	2019-05-16 14:26:51 -04:00
Jay Berkenbilt	a323f6f49f	Prepare 8.4.1 release	2019-04-27 20:44:20 -04:00
Jay Berkenbilt	81205e007b	Spell check	2019-04-21 13:09:11 -04:00
Jay Berkenbilt	011695dfdf	Support Unicode in filenames (fixes #298 )	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310 )	2019-04-20 21:00:43 -04:00
Thorsten Schöning	2c704b99a1	Undefined functions because of missing std:: or header. (#295 ) * [bcc32 Error] QPDF.cc(375): E2268 Call to undefined function 'atof' Full parser context QPDF.cc(358): parsing: void QPDF::parse(const char ) [bcc32 Error] QPDFTokenizer.cc(183): E2268 Call to undefined function 'strtol' Full parser context QPDFTokenizer.cc(163): parsing: void QPDFTokenizer::resolveLiteral() * [bcc32 Error] pdf-split-pages.cc(52): E2268 Call to undefined function 'exit' Full parser context pdf-split-pages.cc(50): parsing: void usage() * PR #295: Including "cstdlib" should be replaced with "stdlib.h" to be more consistent. At the same time I changed the order of the surrounding includes to reflect alphabetical order, because in some files this was already the case.	2019-03-12 10:05:29 -04:00
Thorsten Schöning	71b7ed9f4f	"_setmode" and "_stricmp" are not available on Borland C++Builder, neither the classic one nor newer ones based on CLANG.	2019-03-11 16:58:55 -04:00
Jay Berkenbilt	da7c2c0ee9	Fix json serialization for {x | -1 < x < 1} (fixes #308 ) JSON serialization was preserving the value as presented, but JSON doesn't accept decimal values without a 0 before the decimal point.	2019-03-11 16:22:59 -04:00
Jay Berkenbilt	03074ca5a0	Prepare 8.4.0 release	2019-02-01 22:25:25 -05:00
Jay Berkenbilt	fec5bb124c	Spell check	2019-01-31 21:41:29 -05:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	5211bcb5ea	Externalize inline images (fixes #278 )	2019-01-31 10:38:13 -05:00
Jay Berkenbilt	1eb35a355f	Exclude space after ID in image data	2019-01-31 10:38:10 -05:00
Jay Berkenbilt	2b6c79bcae	Improve locating inline image's EI We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	b776dcd2d3	Clean up some private functions	2019-01-29 22:14:20 -05:00
Jay Berkenbilt	8a9cfd2605	Handle direct page objects (fixes #164 )	2019-01-29 17:01:36 -05:00
Jay Berkenbilt	2d0885bc11	Clarify documentation for copyForeignObject regarding pages Make explicit that copyForeignObject can be used on page objects and will copy them properly but not update the pages tree.	2019-01-28 21:53:55 -05:00
Jay Berkenbilt	2712869cf9	Fix logic for when to compress object and xref streams (fixes #271 )	2019-01-28 21:43:06 -05:00
Jay Berkenbilt	52f9d326a5	Resolve duplicated page objects (fixes #268 ) When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.	2019-01-28 20:29:58 -05:00
Jay Berkenbilt	623f5b664e	Convert pages to form XObjects Support conversion of pages to form XObjects and placement of form XObjects on pages.	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	68ccd87c9e	Move rectangle transformation into QPDFMatrix	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	2d32f4db8f	Handle fallback font size in text appearances If we end up using our fallback font size when generating appearances for text fields, reflect that in the Tf operator used in the appearance stream.	2019-01-21 07:38:21 -05:00
Jay Berkenbilt	9cb599875b	Improve text objects used in text appearance streams	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	930eade6d3	Fix omissions in text appearance generation When generating appearance streams for variable text annotations, properly handle the cases of there being no appearance dictionary, no appearance stream, or an appearance stream with no BMC..EMC marker.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	65ef0bf313	When flattening, remove annotations with no appearance stream With the exception of form field annotations when /NeedAppearances is true, remove annotations that don't have appearance streams when flattening. There is no reason to keep these when flattening since they are invisible. This may include unchecked checkboxes, unshown popup windows, etc.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	c18ee440a3	mingw workaround for QPDFExc destructor mingw doesn't like it when you don't inline empty virtual destructors.	2019-01-19 10:14:07 -05:00
Jay Berkenbilt	e87d149918	Add QUtil::possible_repaired_encodings	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6ec22f117d	Modernize encryption API for more granularity Setting encryption permissions for R >= 3 set permission bits in groups corresponding to menu options in Acrobat 5. The new API allows the bits to be set individually.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4630377731	Add status-reporting transcoders to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	8f389f14c0	QUtil::analyze_encoding	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6817ca585a	Bidirectional transcoding for win, mac, pdf, utf8, utf16	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	698485468a	Move remaining existing transcoding to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	5cfcd4f361	Additional checks for unreferenced resources Explicitly abandon removal of unreferenced resources if there are any lexical errors in the page's contents. This case always generated a warning, but it now also prevents removal of unreferenced resources, thus strongly decreasing the likelihood of data loss.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4bc434000c	Copy subdictionaries when removing resources (fixes #276 ) When removing unreferenced resources, the code was copying the overall resource dictionaries but not the subdictionaries being modified. This was a "typo" in the code -- the comment clearly stated the need to do this, but the code replaced the dictionary with itself rather than with a shallow copy of itself.	2019-01-17 09:40:05 -05:00
Jay Berkenbilt	654c0e8caf	Allow adding the same page more than once in --pages (fixes #272 )	2019-01-12 10:01:47 -05:00
Jay Berkenbilt	4ecd1df6f2	Add configure option AVOID_WINDOWS_HANDLE If set, we avoid using Windows I/O HANDLE, which is disallowed in some versions of the Windows SDK, such as for Windows phones. QUtil::same_file will always return false in this case. Only applies to Windows builds.	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	d24a120c7f	Add QPDF::setImmediateCopyFrom	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	b653929c93	Update version to 8.3.0	2019-01-07 11:16:54 -05:00
Jay Berkenbilt	aa602fd107	Fix integer overflow in large file test	2019-01-07 08:49:14 -05:00
Jay Berkenbilt	c3cee5f154	Exercise out of scope original pdf for copyForeignObject	2019-01-07 07:38:03 -05:00
Jay Berkenbilt	fddbcab0e7	Mostly don't require original QPDF for copyForeignObject (fixes #219 ) The original QPDF is only required now when the source QPDFObjectHandle is a stream that gets its stream data from a QPDFObjectHandle::StreamDataProvider.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	fbbb0ee016	Make a static version of QPDF::pipeStreamData This is in preparation of being able to pipe a stream's data without keeping a copy of its containing qpdf object.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	7588cac295	Create an application-scope unique ID for each QPDF object Use this instead of QPDF* as a map key for object_copiers.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	e27ac682e0	Move encryption parameters into a class	2019-01-06 09:58:16 -05:00
Jay Berkenbilt	a70fbaaf50	Honor other base encodings when generating appearances	2019-01-05 23:01:59 -05:00
Jay Berkenbilt	b341d742db	Add WinAnsi and MacRoman encoding	2019-01-05 23:01:44 -05:00
Jay Berkenbilt	3ef1b77304	Refactor QUtil::utf8_to_ascii	2019-01-05 22:59:29 -05:00
Jay Berkenbilt	089ce5902e	Move utf8_to_utf16 into QUtil	2019-01-05 22:59:27 -05:00
Jay Berkenbilt	ae18bfd142	Refactor string transcoding in QPDF_String	2019-01-05 22:56:58 -05:00
Jay Berkenbilt	2e342ee5bb	Spell check	2019-01-04 21:33:14 -05:00
Jay Berkenbilt	16fd6e64f9	Add QPDFWriter::getFinalVersion (fixes #266 )	2019-01-04 12:37:22 -05:00
Jay Berkenbilt	837dcf8fc2	Don't call assert while checking linearization data (fixes #209 , #231 ) Instead of calling assert for problems found during checking linearization data, throw an exception which is later caught and issued as an error. Ideally we would handle errors more robustly, but this is still a significant improvement.	2019-01-04 11:55:42 -05:00
Jay Berkenbilt	a01359189b	Fix dangling references (fixes #240 ) On certain operations, such as iterating through all objects and adding new indirect objects, walk through the entire object structure and explicitly resolve any indirect references to non-existent objects. That prevents new objects from springing into existence and causing the previously dangling references to point to them.	2019-01-04 10:29:29 -05:00
Jay Berkenbilt	158156d506	Add basic appearance stream generation	2019-01-04 08:00:19 -05:00
Jay Berkenbilt	02281632cc	Add QUtil::utf8_to_ascii	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	b55567a0fa	Add special case setV code for button fields	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	e3144ac417	Add form fields to json output Also add some additional methods for detecting form field types to assist in the json creation and for later use.	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	ca94ac68d9	Honor flags when flattening annotations	2019-01-03 11:59:55 -05:00
Jay Berkenbilt	06d6438ddf	Minor fixes	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	3e74916c5a	Fix seg fault on empty xref stream (fixes #263 ) Thanks to @p-cher for supplying a patch.	2019-01-03 09:17:43 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	3b8ce4f12a	Annotation flattening including form fields Flatten annotations by integrating their appearance streams into the content stream of the containing page. In the case of form fields, only flatten if /NeedAppearances is false (or equivalently absent). If flattening form fields, also remove /AcroForm from the document catalog.	2019-01-01 08:14:15 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	104fd6da52	Add matrix and annotation appearance stream handling Generate page content fragment for rendering appearance streams including all matrix calculation.	2019-01-01 08:07:21 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	daeb5a85b6	Transformation matrix	2018-12-31 18:23:47 -05:00
Jay Berkenbilt	3440ea7d3c	JSON::serialize -> unparse Unparse is admittedly strange, but I'd rather be strange and consistent, and everything else in the qpdf library uses unparse to serialize. (If you're reading this, the convention of using "unparse" comes from the "clu" programming language.)	2018-12-25 11:52:21 -05:00
Jay Berkenbilt	fa3664357b	Move numrange code from qpdf.cc to QUtil.cc Also move tests to libtests.	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	d5d179f441	Add document and object helpers for outlines (bookmarks)	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	30a0c070e4	Add QPDFObjectHandle::getJSON()	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	651179b5da	Add simple JSON serializer	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	0776c00129	Add QPDFNameTreeObjectHelper	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	cc500eda91	Minor cleanup	2018-12-21 17:25:31 -05:00
Jay Berkenbilt	6ef9e31233	Add QPDFPageLabelDocumentHelper	2018-12-18 16:59:24 -05:00
Jay Berkenbilt	f38df27aa3	Add QPDFNumberTreeObjectHelper	2018-12-18 16:46:10 -05:00
Jay Berkenbilt	077d3d4512	Add QPDFObjectHandle::wrapInArray() Wrap an object in an array if it is not already an array.	2018-12-18 16:45:48 -05:00
Jay Berkenbilt	d1368a3851	Commit automatically generated files	2018-10-11 17:27:54 -04:00
Jay Berkenbilt	6ee761fc86	Prepare 8.2.1 release	2018-08-18 10:56:19 -04:00
Jay Berkenbilt	5e9e17e62a	Prepare 8.2.0 release	2018-08-16 11:53:10 -04:00
Jay Berkenbilt	693cdaac35	Missing header for std::max	2018-08-16 11:53:10 -04:00
Jay Berkenbilt	b4ce557be5	Fix error in QPDFSystemError.cc	2018-08-14 11:39:07 -04:00
Jay Berkenbilt	b4bdc42b4f	New exception class QPDFSystemError (fixes #221 )	2018-08-13 20:01:51 -04:00
Jay Berkenbilt	5d9d80beba	Fix fallback logic for encryption (fixes #229 )	2018-08-12 22:32:40 -04:00
Jay Berkenbilt	60fe8061cb	Fix one more identifier (fixes #236 )	2018-08-12 22:01:51 -04:00
Jay Berkenbilt	a2f62935b3	Catch exceptions as const references (fixes #236 ) This fix allows qpdf to compile/test cleanly with gcc 8.	2018-08-12 21:57:52 -04:00
Jay Berkenbilt	3d6615b276	Pl_Buffer: reduce memory growth (fixes #228 ) Rather than keeping a list of buffers for every write, accumulate bytes in a single buffer, doubling the size of the buffer when needed to accommodate new data. This is not the best possible implementation, but the change was implemented in this way to avoid changing the shape of Pl_Buffer and thus breaking backward compatibility.	2018-08-12 17:45:43 -04:00
Jay Berkenbilt	3873f5fd9b	Protect headers with compliant identifiers (fixes #233 )	2018-08-12 14:10:32 -04:00
Jay Berkenbilt	932799baab	Fix memory access error A previous fix introduced a potential memory overrun under certain rare conditions. The test suite now once again passes with address sanitizer.	2018-08-12 13:16:17 -04:00
Jay Berkenbilt	b6e414b10b	Remove some extraneous null pointer checks (fixes #234 ) There were a few places in the code that were checking that a pointer wasn't null before deleting it, even though C++ has always allowed delete 0. Most of the code did not perform these checks.	2018-08-12 12:58:39 -04:00
Jay Berkenbilt	4a4736c695	Fix EOL handling inside strings (fixes #226 ) CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is to be ignored after backslash.	2018-08-05 20:48:35 -04:00
Jay Berkenbilt	1619cad1e8	Return correct method for string encryption (fixes #227 )	2018-08-05 16:58:21 -04:00
Jay Berkenbilt	e1cd5891af	Fix infinite loop on small files with progress reporting (fixes #230 ) Turns out you can keep adding zero to a number over and over again and it just doesn't get any bigger. Who would have known?	2018-08-05 15:43:34 -04:00
Jay Berkenbilt	4f4c627b77	ClosedFileInputSource: add method to keep file open During periods of intensive operation on a specific file, this method can reduce the overhead of repeated open/close operations.	2018-08-04 19:52:46 -04:00
Jay Berkenbilt	1bd2a2e79b	Prepare 8.1.0 release	2018-06-23 07:50:11 -04:00
Jay Berkenbilt	3aad28aed0	Bug fix: honor encryption key length with R=3 (fixes #212 )	2018-06-22 19:24:26 -04:00
Jay Berkenbilt	a433ed24f9	Add progress reporting for QPDFWriter (fixes #200 )	2018-06-22 16:14:54 -04:00
Jay Berkenbilt	2a82f6e1e0	Add method to get count of objects in QPDF	2018-06-22 15:53:40 -04:00
Jay Berkenbilt	c81836076f	Correct incorrect comment	2018-06-22 13:13:09 -04:00
Jay Berkenbilt	4ccc8b1a44	Add ClosedFileInputSource ClosedFileInputSource is an input source that keeps the file closed when not reading it.	2018-06-22 12:52:45 -04:00
Jay Berkenbilt	c71dc6888c	Don't prune resource dictionaries on errors or by request If we are unable to filter a page's content streams, don't attempt to remove objects from the page's resource dictionary. Also provide a command line option to suppress resource removal in case we ever need this as a workaround for some bug or broken PDF files.	2018-06-22 10:45:31 -04:00
Jay Berkenbilt	38c9ed23c3	Treat content stream parsing errors as an error, not a warning If parsing content streams is treated as a warning, there is no way for a caller to know if a parsing operation has failed. This is very dangerous and will likely result in data loss when token filters or parser callbacks are in use.	2018-06-22 10:44:08 -04:00
Jay Berkenbilt	6c89d4b35b	When splitting files, remove unreferenced objects (fixes #203 )	2018-06-21 21:03:30 -04:00
Jay Berkenbilt	ddd78c1b7f	Fix QPDFObjectHandle::shallowCopy It's not really a shallow copy. It just doesn't cross indirect object boundaries. The old implementation had a bug that would cause multiple shallow copies of the same object to share memory, which was not the intention.	2018-06-21 20:34:45 -04:00
Jay Berkenbilt	397b097c46	Allow setting a form field's value	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	952a665a4e	Better support for creating Unicode strings	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	e44c395c51	QUtil::toUTF16	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	0b05111db8	Implement helper class for interactive forms	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	2e7ee23bf6	Add QPDFPageDocumentHelper and QPDFPageObjectHelper This is the beginning of higher-level API support using helper classes. The goal is to be able to add more helpers without continuing to pollute QPDF's and QPDFObjectHandle's public interfaces.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	4cded10821	Add QPDFObjectHandle::Rectangle type Provide a convenient way of accessing rectangles.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	078cf9bf90	newline before endstream fix for object streams (fixes #205 )	2018-05-12 13:17:43 -04:00
Jay Berkenbilt	15ed9f8565	Fix small logic error in Token construct (fixes #206 ) The special case around name token was not reachable. This would only affect constructors of name tokens that were represented in non-canonical form such as with a hex substitution for a printable character. The error was harmless but still a bug.	2018-05-05 17:47:56 -04:00
Jay Berkenbilt	b4d6cf6836	Limit depth of nesting in direct objects (fixes #202 ) This fixes CVE-2018-9918.	2018-04-15 16:11:22 -04:00
Jay Berkenbilt	f8c8e4dcc0	Prepare 8.0.2 release	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	e4e2e26d99	Properly handle pages with no contents (fixes #194 ) Remove calls to assertPageObject(). All cases in the library that called assertPageObject() work fine if you don't call assertPageObject() because nothing assumes anything that was being checked by that call. Removing the calls enables more files to be successfully processed.	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	1a4dcb4aaf	Pl_Buffer starts in a ready state	2018-03-06 11:31:03 -05:00
Jay Berkenbilt	ee44aef8d0	Treat loop in xref tables as damage (fixes #192 ) Prior to this fix, if there was a loop detected in following /Prev pointers in xref streams/tables, it would cause qpdf to lose data. Note that this condition causes many PDF readers to hang or fail.	2018-03-05 14:26:58 -05:00
Jay Berkenbilt	6fe1e9de40	Prepare 8.0.1 release	2018-03-04 07:16:20 -05:00
Jay Berkenbilt	7b9f23a99a	Ignore zlib data check errors (fixes #191 )	2018-03-03 11:35:01 -05:00
Jay Berkenbilt	3e8b643ae3	Release 8.0.0	2018-02-25 16:00:11 -05:00
Jay Berkenbilt	111ec50950	8.0.rc3	2018-02-25 14:17:59 -05:00
Jay Berkenbilt	d3d3970cf6	8.0.rc2	2018-02-25 13:50:22 -05:00
Jay Berkenbilt	a16d703f4d	Update version to 8.0.rc1 This is for testing the release process, particularly as it pertains to AppImage creation.	2018-02-25 09:03:27 -05:00
Jay Berkenbilt	82cae01a76	Bump version number and soname Bump to an alpha release. This version is not being widely released but is being used to push the new shared library version through the debian packaging system and to test out github releases.	2018-02-20 21:31:38 -05:00
Jay Berkenbilt	4bb3046f0b	Properly handle strings with PDF Doc Encoding (fixes #179 ) The QPDF_String::getUTF8Val() method was not treating strings that weren't explicitly Unicode as PDF Doc Encoded. This only affects characters in the range 0x80 through 0xa0.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	2780a1871d	Add C API for checking PDF files	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	d0e99f195a	More robust handling of type errors Give objects descriptions and context so it is possible to issue warnings instead of fatal errors for attempts to access objects of the wrong type.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	c2e16827b6	Replace "file position" with "offset" in error messages Sometimes it's an offset in an object stream or a content stream, so file position is confusing in some cases.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	52e024f701	Include omitted object description in error message	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	cb3b705cf9	Include filename in object stream parse error	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	21b7481b0e	Push members of QPDFObjectHandle into a Members object As in other cases, this is to enable adding new member variables in the future without breaking ABI compatibility.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	e410b0fe0d	Simplify TokenFilter interface Expose Pl_QPDFTokenizer, and have it do more of the work of managing the token filter's pipeline.	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	1fdd86a049	Move Pl_QPDFTokenizer to public interface	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5708b5d0aa	Add additional interface for filtering page contents	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	fd02944e19	Clean up comment	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5136238f2a	Detect and report bad tokens in content normalization	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	9910104442	Implement TokenFilter and refactor Pl_QPDFTokenizer Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	b8723e97f4	Add coalesce contents capability	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	25988e8d10	Bug fix: content normalizer should not add trailing newline Adding a trailing newline in content normalization damages files whose contents are split across streams in the middle of tokens. Let QPDFWriter add the newline with the indicator to ignore the newline, which it already does. This changes the way some qdf files look.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fcd611b61e	Refactor parseContentStream	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	05ff619b09	Remove redundant method Remove a redundant method that was equal to another one with additional arguments. This breaks binary compatibility, but there are other ABI breaking changes in the upcoming release, so now is the time to do it.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	55ee55394c	Use inline image token in content parser	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	ba453ba4ff	Use space tokens in tokenizer filter	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	ec538792fa	Use inline image token type in tokenizer filter	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fefe25030e	Inline image token type	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	2699ecf13e	Push QPDFTokenizer members into a nested structure This is for protection against future ABI breaking changes.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	d97474868d	Lexer enhancements: EOF, comment, space Significant enhancements to the lexer to improve EOF handling and to support comments and spaces as tokens. Various other minor issues were fixed as well.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	ebd5ed63de	Add option to save pass 1 of linearization This is useful only for debugging the linearization code.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	2ebdd6929e	Prepare 7.1.1 release	2018-02-04 18:31:42 -05:00
Jay Berkenbilt	e3167c1a60	Fix linearization for files with nonstandard ID length	2018-02-04 18:16:23 -05:00
Jay Berkenbilt	3b2a3cdd77	Fix setLineBuf for bsd (fixes #177 ) Use 0 instead of NULL in a cast.	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	d5bfd49cb2	Remove use of std::abs (fixes #172 ) Different compilers want different choices of headers for std::abs. It's easier to just not use it.	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	34a9b835b0	Fix indentation	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	7e5e1a7158	Fix offset in error message	2018-02-04 14:19:00 -05:00
Jay Berkenbilt	633fb414af	Pl_QPDFTokenizer: Use unsigned_char_pointer instead of copy	2018-01-28 18:34:43 -05:00
Jay Berkenbilt	13d9756a45	Minor fixes to tokenizer	2018-01-28 18:34:43 -05:00
Jay Berkenbilt	2e4ca7ecf4	Update version numbers for 7.1.0	2018-01-14 20:09:20 -05:00
Jay Berkenbilt	04e47deaf9	Fixes for clang	2018-01-14 19:18:04 -05:00
Jay Berkenbilt	569d74d36b	Allow raw encryption key to be specified Add options to enable the raw encryption key to be directly shown or specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the idea and contribution of one implementation of this idea.	2018-01-14 10:21:05 -05:00
Jay Berkenbilt	3e306ae64c	Add QUtil::hex_decode	2018-01-14 09:04:13 -05:00
Jay Berkenbilt	791e0db762	Allow trailing . in numeric token (fixes #165 )	2018-01-13 20:05:40 -05:00
Jay Berkenbilt	ec0087e3ce	Support TIFF Predictor (fixes #171 )	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	53971d50be	Add Pl_TIFFPredictor	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	d9c9049708	Add signed support to BitStream and BitWriter	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	661ed1d28e	Minor fixes to Pl_PNGFilter Fix comment, remove restriction that doesn't actually matter.	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	be27d47bdc	Use better error for getStreamData failure If the stream isn't filterable but we call getStreamData, throw a regular exception instead of a logic error so that normal error handling and reporting mechanisms will be used.	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	4edfe1f41d	Add tests for new PNG filters	2017-12-25 18:20:52 -05:00
Jay Berkenbilt	a3a55be9cd	Correct errors in PNG filters and make use from library	2017-12-25 14:24:48 -05:00
Casey Rojas	9a48720246	Initial implementation of other PNG decode filters Initial implementation provided by Casey Rojas <crojas@infotechfl.com> Some problems are fixed in a subsequent commit.	2017-12-24 22:59:51 -05:00
Jay Berkenbilt	0f1ce8e646	Prepare 7.0.0 release	2017-09-16 13:22:15 -04:00
Jay Berkenbilt	249e95f608	Fix test failure on MSVC	2017-09-15 23:09:04 -04:00
Jay Berkenbilt	6898bc8d98	Spell check	2017-09-15 23:09:04 -04:00
Jay Berkenbilt	f2ffb6968a	Fix Windows compilation errors	2017-09-15 21:44:57 -04:00
Jay Berkenbilt	d31a7b76e7	Improve message for stream decoding error Tweak the message so that we inform the user that we are mitigating data loss.	2017-09-12 16:03:48 -04:00
Jay Berkenbilt	eaacf94005	Update C API with new QPDFWriter methods	2017-09-12 14:30:39 -04:00
Jay Berkenbilt	40ecba4172	Pl_DCT: Use custom source and destination managers (fixes #153 ) Avoid calling jpeg_mem_src and jpeg_mem_dest. The custom destination manager writes to the pipeline in smaller chunks to avoid having the whole image in memory at once. The source manager works directly with the Buffer object. Using custom managers avoids use of memory source and destination managers, which are not present in older versions of libjpeg still in use by some Linux distributions.	2017-09-07 22:59:11 -04:00
Jay Berkenbilt	3ef1be9783	PNGFilter: Better range checking for columns	2017-08-31 07:26:58 -04:00
Jay Berkenbilt	1868a10f8b	Replace all atoi calls with QUtil::string_to_int The latter catches underflow/overflow.	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	742190bd98	Pl_PNGFilter: disallow columns = 0	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	6d46346eb9	Detect integer overflow/underflow	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	e999bbae43	Fix memory leak with bad jpeg data	2017-08-28 22:16:45 -04:00
Jay Berkenbilt	c6872d2c70	Clean up circular references in QPDF_Stream	2017-08-28 22:16:31 -04:00
Jay Berkenbilt	728dc9e6d8	Fix error caught by clang	2017-08-26 21:51:17 -04:00
Jay Berkenbilt	dea704f0ab	Pad keys to avoid memory errors (fixes #147 )	2017-08-26 21:35:59 -04:00
Jay Berkenbilt	021c229331	Fix Pl_Flate memory leak on error (fixes #148 )	2017-08-25 22:26:53 -04:00
Jay Berkenbilt	ad527a64f9	Parse iteratively to avoid stack overflow (fixes #146 )	2017-08-25 21:56:45 -04:00
Jay Berkenbilt	85f05cc57f	Detect xref pointer infinite loop (fixes #149 )	2017-08-25 19:58:31 -04:00
Jay Berkenbilt	1e52d33822	Bump soname to 18 and version to 7.0.b1	2017-08-22 16:50:48 -04:00
Jay Berkenbilt	e452d9dca6	Spell check	2017-08-22 14:22:20 -04:00
Jay Berkenbilt	6219111ed7	Update references to README files Most of the README files have been renamed. Refer to the new names.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	83ec09f66c	Do memory checks Slightly improve memory cleanup in Pl_DCT Make it easier to test with valgrind	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	fabff0f3ec	Limit token length during xref recovery While scanning the file looking for objects, limit the length of tokens we allow. This prevents us from getting caught up in reading a file character by character while digging through large streams.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	caf5e39c2e	Fix compiler warnings for clang/mac OS X	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	6884ad2ead	Fix logic error in recovery A stray semicolon caused a condition to be incorrectly applied during stream length recovery.	2017-08-22 07:19:41 -04:00
Jay Berkenbilt	ce435222b2	Push QPDFWriter member variables into a nested class	2017-08-21 22:04:07 -04:00
Jay Berkenbilt	a8c93bd324	Push QPDF member variables into a nested class Pushing member variables into a nested class enables addition of new member variables without breaking binary compatibility.	2017-08-21 21:35:11 -04:00
Jay Berkenbilt	198856a825	Improve pclm parameter settings	2017-08-21 21:05:48 -04:00
Jay Berkenbilt	8ab52fa558	Combine writePCLm with writeStandard Reduce code duplication	2017-08-21 21:05:48 -04:00
Jay Berkenbilt	9f60a864a0	Combine PCLm header into writeHeader	2017-08-21 21:05:47 -04:00
Jay Berkenbilt	adbcfcff2d	Remove duplicated coverage cases Remove duplicated coverage cases from Sahil's code so existing test suite passes.	2017-08-21 18:55:02 -04:00
Sahil Arora	b19210fa7d	QPDFWriter: Add setPCLm() and writePCLm() methods * Add support for PCLm using setPCLm() and writePCLm() methods in QPDFWriter.hh and QPDFWriter.cc * Add a function writePCLmHeader() for PCLm header in QPDFWriter	2017-08-21 18:55:02 -04:00
Jay Berkenbilt	ddc6cf0cf6	Precheck streams by default There is no need for a --precheck-streams option. We can do the precheck without imposing any penalty, only re-encoding the stream if it fails the first time.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	ae90d2c485	Implement Pl_DCT pipeline Additional testing is added in later commits to be supported by additional changes in the library.	2017-08-21 17:44:02 -04:00
Jay Berkenbilt	2d2f619665	Implement Pl_RunLength pipeline	2017-08-19 14:50:55 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132 )	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	8249a26d69	Fix infinite loop in QPDFWriter (fixes #143 )	2017-08-12 08:36:36 -04:00
Jay Berkenbilt	36b3fe5af7	Fix --newline-before-endstream option (fixes #133 ) Add a newline unconditionally before endstream even if a newline was already written as part of the stream data.	2017-08-11 20:57:05 -04:00
Jay Berkenbilt	46611f0710	Prevent a division by zero error (fixes #141 ) Bad /W in an xref stream could cause a division by zero error. Now this is handled as a special case.	2017-08-11 20:11:19 -04:00
Jay Berkenbilt	8fe0b06cd8	Pad encryption parameters that are too short (fixes #96 )	2017-08-11 19:53:56 -04:00
Jay Berkenbilt	e7d0019bf4	Generate libqpdf.map from autoconf Rather than checking consistency of libqpdf.map, generate it.	2017-08-11 04:56:22 -04:00
Jay Berkenbilt	6247aaa57c	Fix libqpdf.map and prevent future breakage The build now checks to make sure libqpdf.map has the right library version number in it.	2017-08-10 21:53:19 -04:00
Jay Berkenbilt	9a96e233b0	Remove PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	30f109e244	Read xref table without PCRE Also accept more errors than before.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	98a843c2a2	Reconstruct xref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ca5b1d267a	Improve stream length recovery Eliminate PCRE and find endobj not preceded by endstream. Be more lax about placement of endstream and endobj.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	3082e4e606	Find xref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	90840be594	Find lindict without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	03aa9679ac	Find startxref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	1765c6ec20	Find header without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	296b679d6e	Implement findFirst and findLast in InputSource Preparing to refactor some pattern searching code to use these instead of their own memchr loops. This should simplify the code that replaces PCRE.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ef8ae5449d	Allow QPDFTokenizer::readToken to return bad tokens Sometimes we want to ignore bad tokens rather than having them throw an exception. A coverage case is commented out here and added in a later commit.	2017-08-10 19:01:41 -04:00
Jay Berkenbilt	8fe261d8b4	QUtil::strcasecmp	2017-08-05 10:22:33 -04:00
Pranjal Bhor	6f88fd36ab	Include missing header in QPDFTokenizer.cc (fixes #125 ) Required for strtol()	2017-07-30 08:47:05 -04:00
Jay Berkenbilt	2d5b854468	Allow reading command-line args from files (fixes #16 )	2017-07-29 22:23:21 -04:00
Jay Berkenbilt	5993c3e83c	Detect input file = output file (fixes #29 )	2017-07-29 20:58:01 -04:00
Jay Berkenbilt	570db9b60b	Catch more exceptions while resolving objects	2017-07-29 19:31:12 -04:00
Jay Berkenbilt	b43a0ac237	When recovering stream length, indicate the length (fixes #44 )	2017-07-29 19:15:06 -04:00
Jay Berkenbilt	f37d399d82	Add newline-before-endstream option (fixes #103 )	2017-07-29 12:21:38 -04:00
Jay Berkenbilt	6a7d53ad2b	Handle zlib data errors better (fixes #106 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	07d6f770b2	Better recovery of bad stream start (fixes #104 )	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73 ) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	a136824243	Fix exception catch	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	ba2bae4acc	Use 1.2 as the version if we can't read it from the header The code was using 1.0, but we use /FlateDecode, which didn't appear until 1.2.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	3a1ff5ded9	Add option to preserve unreferenced objects	2017-07-28 19:19:11 -04:00
Jay Berkenbilt	a94a729fee	Explicitly check root dictionary type Very badly corrupted files may not have a retrievable root dictionary. Handle that as a special case so that a more helpful error message can be provided.	2017-07-28 18:03:30 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggressive prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	428d96dfe1	Convert many more errors to warnings	2017-07-27 22:57:55 -04:00
Jay Berkenbilt	a4fd4b91c6	Convert stream filtering errors to warnings	2017-07-27 18:43:07 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	dd8dad74f4	Move lexer helper functions to QUtil	2017-07-27 13:59:56 -04:00
Jay Berkenbilt	0a745021e7	Remove PCRE from QPDFTokenizer	2017-07-27 13:59:56 -04:00
slurdge	8740b380fe	Make windows includes lowercase (fixes #123 ) For cross compiling.	2017-07-26 06:39:09 -04:00
Jay Berkenbilt	12db09898e	Don't interpret word tokens in content streams (fixes #82 )	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	701b518d5c	Detect recursion loops resolving objects (fixes #51 ) During parsing of an object, sometimes parts of the object have to be resolved. An example is stream lengths. If such an object directly or indirectly points to the object being parsed, it can cause an infinite loop. Guard against all cases of re-entrant resolution of objects.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99 ) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assume objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	315092dd98	Avoid xref reconstruction infinite loop (fixes #100 ) This is CVE-2017-9209.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	603f222365	Fix infinite loop while reporting an error (fixes #101 ) This is CVE-2017-9210. The description string for an error message included unparsing an object, which is too complex of a thing to try to do while throwing an exception. There was only one example of this in the entire codebase, so it is not a pervasive problem. Fixing this eliminated one class of infinite loop errors.	2017-07-26 06:24:07 -04:00
Thorsten Schöning	b3c08f4f8d	C++-Builder supports 64 Bit file functions The 64 Bit file functions are supported by C++-Builder as well and need to be used, else fseek will error out on files larger than 4 GB, such as those used in the large file test.	2016-01-24 12:07:20 -05:00

875 Commits