octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-11-16 17:45:09 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	85a3f95a89	qpdf: exit 3 for linearization warnings without errors (fixes #50)	2019-06-22 16:57:51 -04:00
Jay Berkenbilt	1bde5c68a3	Add QUtil::read_file_into_memory This code was essentially duplicated between test_driver and standalone_fuzz_target_runner.	2019-06-22 10:14:25 -04:00
Jay Berkenbilt	658b5bb3be	QPDFWriter: clean up overloaded functions In a small number of cases, it makes sense to replace an overloaded function with a function that takes a default argument. We can do this now because we've already broken binary compatibility since the last release.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	79f6b4823b	Convert remaining public classes to use Members pattern Have classes contain only a single private member of type PointerHolder<Members>. This makes it safe to change the structure of the Members class without breaking binary compatibility. Many of the classes already follow this pattern quite successfully. This brings in the rest of the classes that are part of the public API.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	864a546af6	Build with -fvisibility=hidden when supported	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	45dac410b5	Remove broken QPDFTokenizer::expectInlineImage	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	25dd3c6750	Remove QPDF::copyForeignObject with unused parameter	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	c6cfd64503	Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242)	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	f40ffc9d63	Pl_Flate: constructor's out_bufsize is now unsigned int This is the type we need for the underlying zlib implementation.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	42306e2ff8	QUtil: add unsigned int/string functions	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	a66828caff	New safe type converters in QIntC	2019-06-21 13:17:21 -04:00
Marco Scarpetta	b405e5e1c9	Fix typo (#334)	2019-06-12 14:21:33 -04:00
jbarlow83	2efec4ce7b	Fix C++ exception handling when -fvisibility=hidden (#302) Fix C++ exception handling when -fvisibility=hidden Ensure that QPDFExc and QPDFSystemError are marked visible, so that their typeinfo will not be suppressed when -fvisibility=hidden. Details: https://gcc.gnu.org/wiki/Visibility	2019-03-11 18:28:29 -04:00
Thorsten Schöning	2a852f08b6	[bcc32 Error] QPDF.hh(803): E2247 'QPDF::Members::resolving' is not accessible Full parser context QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh QPDF.hh(48): class QPDF QPDF.hh(1380): decision to instantiate: QPDF::ResolveRecorder::ResolveRecorder(QPDF ,const QPDFObjGen &) --- Resetting parser context for instantiation... QPDF.hh(799): parsing: QPDF::ResolveRecorder::ResolveRecorder(QPDF ,const QPDFObjGen &)	2019-03-11 17:07:01 -04:00
Thorsten Schöning	1449d82ae4	[bcc32 Error] QPDFObjectHandle.hh(911): E2247 'QPDFObjectHandle::Members::obj' is not accessible Full parser context Pl_QPDFTokenizer.cc(1): #include ..\..\..\..\src\include\qpdf\Pl_QPDFTokenizer.hh Pl_QPDFTokenizer.hh(29): #include ..\..\..\..\src\include\qpdf/QPDFObjectHandle.hh QPDFObjectHandle.hh(51): class QPDFObjectHandle QPDFObjectHandle.hh(1052): decision to instantiate: PointerHolder<QPDFObject> QPDFObjectHandle::ObjAccessor::getObject(QPDFObjectHandle &) --- Resetting parser context for instantiation... QPDFObjectHandle.hh(909): parsing: PointerHolder<QPDFObject> QPDFObjectHandle::ObjAccessor::getObject(QPDFObjectHandle &)	2019-03-11 17:07:01 -04:00
Thorsten Schöning	86287acfd9	[bcc32 Error] QPDF.hh(223): E2303 Type name expected Full parser context QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh QPDF.hh(47): class QPDF	2019-03-11 16:57:16 -04:00
Thorsten Schöning	9b3314042a	[bcc32 Error] QPDF.hh(203): E2316 'vector' is not a member of 'std' Full parser context QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh QPDF.hh(46): class QPDF	2019-03-11 16:57:16 -04:00
Jay Berkenbilt	fec5bb124c	Spell check	2019-01-31 21:41:29 -05:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	5211bcb5ea	Externalize inline images (fixes #278)	2019-01-31 10:38:13 -05:00
Jay Berkenbilt	1eb35a355f	Exclude space after ID in image data	2019-01-31 10:38:10 -05:00
Jay Berkenbilt	2b6c79bcae	Improve locating inline image's EI We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	b776dcd2d3	Clean up some private functions	2019-01-29 22:14:20 -05:00
Jay Berkenbilt	2d0885bc11	Clarify documentation for copyForeignObject regarding pages Make explicit that copyForeignObject can be used on page objects and will copy them properly but not update the pages tree.	2019-01-28 21:53:55 -05:00
Jay Berkenbilt	52f9d326a5	Resolve duplicated page objects (fixes #268) When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.	2019-01-28 20:29:58 -05:00
Jay Berkenbilt	623f5b664e	Convert pages to form XObjects Support conversion of pages to form XObjects and placement of form XObjects on pages.	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	e1271361c5	Add documentation for features since 8.3.0	2019-01-19 15:58:51 -05:00
Jay Berkenbilt	c18ee440a3	mingw workaround for QPDFExc destructor mingw doesn't like it when you don't inline empty virtual destructors.	2019-01-19 10:14:07 -05:00
Jay Berkenbilt	e87d149918	Add QUtil::possible_repaired_encodings	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6ec22f117d	Modernize encryption API for more granularity Setting encryption permissions for R >= 3 set permission bits in groups corresponding to menu options in Acrobat 5. The new API allows the bits to be set individually.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4630377731	Add status-reporting transcoders to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	8f389f14c0	QUtil::analyze_encoding	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	698485468a	Move remaining existing transcoding to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	654c0e8caf	Allow adding the same page more than once in --pages (fixes #272)	2019-01-12 10:01:47 -05:00
Jay Berkenbilt	5f128b9a27	Fix version number in comment	2019-01-11 07:46:53 -05:00
Jay Berkenbilt	d24a120c7f	Add QPDF::setImmediateCopyFrom	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	3472f6c984	Update copyrights for 2019	2019-01-07 07:54:55 -05:00
Jay Berkenbilt	fddbcab0e7	Mostly don't require original QPDF for copyForeignObject (fixes #219) The original QPDF is only required now when the source QPDFObjectHandle is a stream that gets its stream data from a QPDFObjectHandle::StreamDataProvider.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	fbbb0ee016	Make a static version of QPDF::pipeStreamData This is in preparation of being able to pipe a stream's data without keeping a copy of its containing qpdf object.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	7588cac295	Create an application-scope unique ID for each QPDF object Use this instead of QPDF* as a map key for object_copiers.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	e27ac682e0	Move encryption parameters into a class	2019-01-06 09:58:16 -05:00
Jay Berkenbilt	a70fbaaf50	Honor other base encodings when generating appearances	2019-01-05 23:01:59 -05:00
Jay Berkenbilt	b341d742db	Add WinAnsi and MacRoman encoding	2019-01-05 23:01:44 -05:00
Jay Berkenbilt	089ce5902e	Move utf8_to_utf16 into QUtil	2019-01-05 22:59:27 -05:00
Jay Berkenbilt	2e342ee5bb	Spell check	2019-01-04 21:33:14 -05:00
Jay Berkenbilt	ee2aad4381	Add CLI flags for image optimization	2019-01-04 21:33:14 -05:00
Jay Berkenbilt	16fd6e64f9	Add QPDFWriter::getFinalVersion (fixes #266)	2019-01-04 12:37:22 -05:00
Jay Berkenbilt	837dcf8fc2	Don't call assert while checking linearization data (fixes #209, #231) Instead of calling assert for problems found during checking linearization data, throw an exception which is later caught and issued as an error. Ideally we would handle errors more robustly, but this is still a significant improvement.	2019-01-04 11:55:42 -05:00
Jay Berkenbilt	a01359189b	Fix dangling references (fixes #240) On certain operations, such as iterating through all objects and adding new indirect objects, walk through the entire object structure and explicitly resolve any indirect references to non-existent objects. That prevents new objects from springing into existence and causing the previously dangling references to point to them.	2019-01-04 10:29:29 -05:00
Jay Berkenbilt	158156d506	Add basic appearance stream generation	2019-01-04 08:00:19 -05:00
Jay Berkenbilt	02281632cc	Add QUtil::utf8_to_ascii	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	b55567a0fa	Add special case setV code for button fields	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	e3144ac417	Add form fields to json output Also add some additional methods for detecting form field types to assist in the json creation and for later use.	2019-01-03 23:18:13 -05:00
Jay Berkenbilt	ca94ac68d9	Honor flags when flattening annotations	2019-01-03 11:59:55 -05:00
Jay Berkenbilt	f78ea057ca	Switch annotation flattening to use the form xobjects Instead of directly putting the contents of the annotation appearance streams into the page's content stream, add commands to render the form xobjects directly. This is a more robust way to do it than the original solution as it works properly with patterns and avoids problems with resource name clashes between the pages and the form xobjects.	2019-01-02 21:49:47 -05:00
Jay Berkenbilt	3b8ce4f12a	Annotation flattening including form fields Flatten annotations by integrating their appearance streams into the content stream of the containing page. In the case of form fields, only flatten if /NeedAppearance is false (or equivalently absent). If flattening form fields, also remove /AcroForm from the document catalog.	2019-01-01 08:14:15 -05:00
Jay Berkenbilt	95d6b17a89	Add QPDFObjectHandle::mergeDictionary()	2019-01-01 08:12:56 -05:00
Jay Berkenbilt	104fd6da52	Add matrix and annotation appearance stream handling Generate page content fragment for rendering appearance streams including all matrix calculation.	2019-01-01 08:07:21 -05:00
Jay Berkenbilt	5059ec0d35	Add Matrix class under QPDFObjectHandle	2018-12-31 23:02:43 -05:00
Jay Berkenbilt	3440ea7d3c	JSON::serialize -> unparse Unparse is admittedly strange, but I'd rather be strange and consistent, and everything else in the qpdf library uses unparse to serialize. (If you're reading this, the convention of using "unparse" comes from the "clu" programming language.)	2018-12-25 11:52:21 -05:00
Jay Berkenbilt	fa3664357b	Move numrange code from qpdf.cc to QUtil.cc Also move tests to libtests.	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	d5d179f441	Add document and object helpers for outlines (bookmarks)	2018-12-21 19:11:57 -05:00
Jay Berkenbilt	30a0c070e4	Add QPDFObjectHandle::getJSON()	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	651179b5da	Add simple JSON serializer	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	0776c00129	Add QPDFNameTreeObjectHelper	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	d2f3975948	Add missing virtual destructors to all helper classes	2018-12-21 18:34:56 -05:00
Jay Berkenbilt	6ef9e31233	Add QPDFPageLabelDocumentHelper	2018-12-18 16:59:24 -05:00
Jay Berkenbilt	f38df27aa3	Add QPDFNumberTreeObjectHelper	2018-12-18 16:46:10 -05:00
Jay Berkenbilt	077d3d4512	Add QPDFObjectHandle::wrapInArray() Wrap an object in an array if it is not already an array.	2018-12-18 16:45:48 -05:00
Jay Berkenbilt	9caf005d89	Fix typo in header file	2018-12-18 16:27:36 -05:00
Jay Berkenbilt	b4bdc42b4f	New exception class QPDFSystemError (fixes #221)	2018-08-13 20:01:51 -04:00
Jay Berkenbilt	3873f5fd9b	Protect headers with compliant identifiers (fixes #233)	2018-08-12 14:10:32 -04:00
Jay Berkenbilt	4a4736c695	Fix EOL handling inside strings (fixes #226) CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is to be ignored after backslash.	2018-08-05 20:48:35 -04:00
Jay Berkenbilt	651b51f056	Add QPDF_DLL to public destructors (fixes #220) A few public destructors were missing QPDF_DLL, which could cause some Windows applications to fail to link.	2018-08-04 20:08:06 -04:00
Jay Berkenbilt	4f4c627b77	ClosedFileInputSource: add method to keep file open During periods of intensive operation on a specific file, this method can reduce the overhead of repeated open/close operations.	2018-08-04 19:52:46 -04:00
Jay Berkenbilt	5db39a681a	Windows fixes	2018-06-22 17:01:18 -04:00
Jay Berkenbilt	d34ab8a936	spell check	2018-06-22 16:14:54 -04:00
Jay Berkenbilt	a433ed24f9	Add progress reporting for QPDFWriter (fixes #200)	2018-06-22 16:14:54 -04:00
Jay Berkenbilt	2a82f6e1e0	Add method to get count of objects in QPDF	2018-06-22 15:53:40 -04:00
Jay Berkenbilt	4ccc8b1a44	Add ClosedFileInputSource ClosedFileInputSource is an input source that keeps the file closed when not reading it.	2018-06-22 12:52:45 -04:00
Jay Berkenbilt	6c89d4b35b	When splitting files, remove unreferenced objects (fixes #203)	2018-06-21 21:03:30 -04:00
Jay Berkenbilt	ddd78c1b7f	Fix QPDFObjectHandle::shallowCopy It's not really a shallow copy. It just doesn't cross indirect object boundaries. The old implementation had a bug that would cause multiple shallow copies of the same object to share memory, which was not the intention.	2018-06-21 20:34:45 -04:00
Jay Berkenbilt	397b097c46	Allow setting a form field's value	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	952a665a4e	Better support for creating Unicode strings	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	e44c395c51	QUtil::toUTF16	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	0b05111db8	Implement helper class for interactive forms	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	2e7ee23bf6	Add QPDFPageDocumentHelper and QPDFPageObjectHelper This is the beginning of higher-level API support using helper classes. The goal is to be able to add more helpers without continuing to pollute QPDF's and QPDFObjectHandle's public interfaces.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	4cded10821	Add QPDFObjectHandle::Rectangle type Provide a convenient way of accessing rectangles.	2018-06-21 15:57:13 -04:00
Jay Berkenbilt	e4e2e26d99	Properly handle pages with no contents (fixes #194) Remove calls to assertPageObject(). All cases in the library that called assertPageObject() work fine if you don't call assertPageObject() because nothing assumes anything that was being checked by that call. Removing the calls enables more files to be successfully processed.	2018-03-06 11:34:07 -05:00
Jay Berkenbilt	4bb3046f0b	Properly handle strings with PDF Doc Encoding (fixes #179) The QPDF_String::getUTF8Val() method was not treating strings that weren't explicitly Unicode as PDF Doc Encoded. This only affects characters in the range 0x80 through 0xa0.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	2780a1871d	Add C API for checking PDF files	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	d0e99f195a	More robust handling of type errors Give objects descriptions and context so it is possible to issue warnings instead of fatal errors for attempts to access objects of the wrong type.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	21b7481b0e	Push members of QPDFObjectHandle into a Members object As in other cases, this is to enable adding new member variables in the future without breaking ABI compatibility.	2018-02-18 21:06:27 -05:00
Jay Berkenbilt	e410b0fe0d	Simplify TokenFilter interface Expose Pl_QPDFTokenizer, and have it do more of the work of managing the token filter's pipeline.	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	1fdd86a049	Move Pl_QPDFTokenizer to public interface	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	5708b5d0aa	Add additional interface for filtering page contents	2018-02-18 21:05:47 -05:00
Jay Berkenbilt	9910104442	Implement TokenFilter and refactor Pl_QPDFTokenizer Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a general filter that passes data through a TokenFilter.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	b8723e97f4	Add coalesce contents capability	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fcd611b61e	Refactor parseContentStream	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	05ff619b09	Remove redundant method Remove a redundant method that was equal to another one with additional arguments. This breaks binary compatibility, but there are other ABI breaking changes in the upcoming release, so now is the time to do it.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	fefe25030e	Inline image token type	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	2699ecf13e	Push QPDFTokenizer members into a nested structure This is for protection against future ABI breaking changes.	2018-02-18 21:05:46 -05:00
Jay Berkenbilt	d97474868d	Lexer enhancements: EOF, comment, space Significant enhancements to the lexer to improve EOF handling and to support comments and spaces as tokens. Various other minor issues were fixed as well.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	ebd5ed63de	Add option to save pass 1 of linearization This is useful only for debugging the linearization code.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	e3167c1a60	Fix linearization for files with nonstandard ID length	2018-02-04 18:16:23 -05:00
Jay Berkenbilt	aa2cfad61a	Clarify some comments	2018-01-28 18:29:47 -05:00
Jay Berkenbilt	569d74d36b	Allow raw encryption key to be specified Add options to enable the raw encryption key to be directly shown or specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the idea and contribution of one implementation of this idea.	2018-01-14 10:21:05 -05:00
Jay Berkenbilt	3e306ae64c	Add QUtil::hex_decode	2018-01-14 09:04:13 -05:00
Jay Berkenbilt	68572df2bf	Update copyright to 2018	2018-01-13 20:25:58 -05:00
Jay Berkenbilt	07c8bb2843	Additionally license under Apache License version 2.0 The Apache License version 2.0 is now the primary license for qpdf. However, users may, at their option, continue to use Artistic version 2.0.	2017-09-14 12:59:25 -04:00
Jay Berkenbilt	d31a7b76e7	Improve message for stream decoding error Tweak the message so that we inform the user that we are mitigating data loss.	2017-09-12 16:03:48 -04:00
Jay Berkenbilt	eaacf94005	Update C API with new QPDFWriter methods	2017-09-12 14:30:39 -04:00
Jay Berkenbilt	6d46346eb9	Detect integer overflow/underflow	2017-08-29 12:28:32 -04:00
Jay Berkenbilt	e999bbae43	Fix memory leak with bad jpeg data	2017-08-28 22:16:45 -04:00
Jay Berkenbilt	c6872d2c70	Clean up circular references in QPDF_Stream	2017-08-28 22:16:31 -04:00
Jay Berkenbilt	728dc9e6d8	Fix error caught by clang	2017-08-26 21:51:17 -04:00
Jay Berkenbilt	ad527a64f9	Parse iteratively to avoid stack overflow (fixes #146)	2017-08-25 21:56:45 -04:00
Jay Berkenbilt	e452d9dca6	Spell check	2017-08-22 14:22:20 -04:00
Jay Berkenbilt	fabff0f3ec	Limit token length during xref recovery While scanning the file looking for objects, limit the length of tokens we allow. This prevents us from getting caught up in reading a file character by character while digging through large streams.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	ce435222b2	Push QPDFWriter member variables into a nested class	2017-08-21 22:04:07 -04:00
Jay Berkenbilt	a8c93bd324	Push QPDF member variables into a nested class Pushing member variables into a nested class enables addition of new member variables without breaking binary compatibility.	2017-08-21 21:35:11 -04:00
Jay Berkenbilt	8288a4eb3a	Update copyright to 2017	2017-08-21 21:18:47 -04:00
Jay Berkenbilt	8ab52fa558	Combine writePCLm with writeStandard Reduce code duplication	2017-08-21 21:05:48 -04:00
Jay Berkenbilt	9f60a864a0	Combine PCLm header into writeHeader	2017-08-21 21:05:47 -04:00
Jay Berkenbilt	4b908ade70	Update header documentation and ChangeLog entry for PCLm	2017-08-21 21:05:44 -04:00
Sahil Arora	b19210fa7d	QPDFWriter: Add setPCLm() and writePCLm() methods * Add support for PCLm using setPCLm() and writePCLm() methods in QPDFWriter.hh and QPDFWriter.cc * Add a function writePCLmHeader() for PCLm header in QPDFWriter	2017-08-21 18:55:02 -04:00
Jay Berkenbilt	ddc6cf0cf6	Precheck streams by default There is no need for a --precheck-streams option. We can do the precheck without imposing any penalty, only re-encoding the stream if it fails the first time.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	ae90d2c485	Implement Pl_DCT pipeline Additional testing is added in later commits to be supported by additional changes in the library.	2017-08-21 17:44:02 -04:00
Jay Berkenbilt	2d2f619665	Implement Pl_RunLength pipeline	2017-08-19 14:50:55 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132)	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	30f109e244	Read xref table without PCRE Also accept more errors than before.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ca5b1d267a	Improve stream length recovery Eliminate PCRE and find endobj not preceded by endstream. Be more lax about placement of endstream and endobj.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	03aa9679ac	Find startxref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	1765c6ec20	Find header without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	296b679d6e	Implement findFirst and findLast in InputSource Preparing to refactor some pattern searching code to use these instead of their own memchr loops. This should simplify the code that replaces PCRE.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	ef8ae5449d	Allow QPDFTokenizer::readToken to return bad tokens Sometimes we want to ignore bad tokens rather than having them throw an exception. A coverage case is commented out here and added in a later commit.	2017-08-10 19:01:41 -04:00
Jay Berkenbilt	c5dc6d8067	Remove unused PointerHolder interface Also fix a bug resulting from incorrect use of PointerHolder because of this unused parameter.	2017-08-10 19:01:38 -04:00
Jay Berkenbilt	8fe261d8b4	QUtil::strcasecmp	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	2d5b854468	Allow reading command-line args from files (fixes #16)	2017-07-29 22:23:21 -04:00
Jay Berkenbilt	5993c3e83c	Detect input file = output file (fixes #29)	2017-07-29 20:58:01 -04:00
Jay Berkenbilt	f37d399d82	Add newline-before-endstream option (fixes #103)	2017-07-29 12:21:38 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	4647acbe3c	Clarify documentation on copyForeignObject (fixes #69) Be explicit about the need to keep the source QPDF object around.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	3a1ff5ded9	Add option to preserve unreferenced objects	2017-07-28 19:19:11 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggressive prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	a4fd4b91c6	Convert stream filtering errors to warnings	2017-07-27 18:43:07 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	dd8dad74f4	Move lexer helper functions to QUtil	2017-07-27 13:59:56 -04:00
Jay Berkenbilt	701b518d5c	Detect recursion loops resolving objects (fixes #51) During parsing of an object, sometimes parts of the object have to be resolved. An example is stream lengths. If such an object directly or indirectly points to the object being parsed, it can cause an infinite loop. Guard against all cases of re-entrant resolution of objects.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	315092dd98	Avoid xref reconstruction infinite loop (fixes #100) This is CVE-2017-9209.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	bd6c845619	Fix typo in comment	2017-07-26 06:24:07 -04:00
Thorsten Schöning	7c08aa4280	Include QPDFExc.hh for use in std::list	2016-01-24 12:07:03 -05:00
Thorsten Schöning	e0201c12cc	Include QPDFObjectHandle for use in std::list QPDFObjectHandle was used as forward declaration, but C++-Builder 10 Seattle can't use it in std::list in such cases because the type is undefined.	2016-01-24 12:04:25 -05:00
Jay Berkenbilt	e0e9d64674	Remove some ABI compatibility private methods Since we have to bump soname, remove some private methods that were just there for binary compatibility	2015-11-10 12:22:40 -05:00
Jay Berkenbilt	0496ab1a6e	Fix spelling errors	2015-10-31 18:56:43 -04:00
Jay Berkenbilt	b8bdef0ad1	Implement deterministic ID For non-encrypted files, deterministic ID generation uses file contents instead of timestamp and file name. At a small runtime cost, this enables generation of the same /ID if the same inputs are converted in the same way multiple times.	2015-10-31 18:56:42 -04:00
Jay Berkenbilt	f77acbdbba	Copyright 2015	2015-05-24 17:26:49 -04:00
Jay Berkenbilt	857bb208d3	include time.h in QUtil.hh QUtil.hh needs time.h to get time_t on some platforms. Thanks Peter Korsgaard <peter@korsgaard.com>	2015-05-24 16:26:05 -04:00
Jay Berkenbilt	a11549a566	Detect loops in /Pages structure Pushing inherited objects to pages and getting all pages were both prone to stack overflow infinite loops if there were loops in the Pages dictionary. There is a general weakness in the code in that any part of the code that traverses the Pages structure would be prone to this and would have to implement its own loop detection. A more robust fix may provide some general method for handling the Pages structure, but it's probably not worth doing. Note: addition of *Internal2 private functions was done rather than changing signatures of existing methods to avoid breaking compatibility.	2015-02-21 19:47:11 -05:00
Jay Berkenbilt	225b018290	Update Copyright to 2014	2014-01-14 15:40:02 -05:00
Jay Berkenbilt	235d8f28f8	Increase random data provider support Add a method to get the current random data provider, and document and test the method for resetting it.	2013-12-16 16:21:28 -05:00
Jay Berkenbilt	5e3bad2f86	Refactor random data generation Add new RandomDataProvider object and implement existing random number generation in terms of that. This enables end users to supply their own random data providers.	2013-12-14 15:17:35 -05:00
Jay Berkenbilt	f010e07c0c	Add missing #include of <string>	2013-10-28 20:59:58 -04:00
Jay Berkenbilt	4229457068	Security: use a secure random number generator If not available, give an error. The user may also configure qpdf to use an insecure random number generator.	2013-10-18 10:45:12 -04:00
Jay Berkenbilt	cee2592ed1	Change API/ABI and withdraw 4.2.0 4.2.0 was binary incompatible in spite of there being no deletions or changes to any public methods. As such, we have to bump the ABI and are fixing some API breakage while we're at it. Previous 4.3.0 target is now 5.1.0.	2013-07-10 11:30:13 -04:00
Jay Berkenbilt	a3576a7359	Bug fix: handle generation > 0 when generating object streams Rework QPDFWriter to always track old object IDs and QPDFObjGen instead of int, thus not discarding the generation number. Switch to QPDF::getCompressibleObjGen() to properly handle the case of an old object eligible for compression that has a generation of other than zero.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	96eb965115	Use QPDFObjectHandle::getObjGen() where appropriate In internal code and examples, replace calls to getObjectID() and getGeneration() with calls to getObjGen() where possible.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	5039da0b91	Add QPDFObjectHandle::getObjGen() This is safer than getObjectID() and getGeneration() for many uses.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	d88231e01e	Promote QPDF::ObjGen to top-level object QPDFObjGen	2013-06-14 14:58:08 -04:00
Jay Berkenbilt	3803e9cc4a	Export terminateParsing in the DLL Windows fix: QPDFObject::ParserCallbacks::terminateParsing() was not declared with QPDF_DLL.	2013-03-11 12:37:32 -04:00
Jay Berkenbilt	9d4f52c014	Clarify documentation on encrypted files Explicitly state how QPDF handles empty passwords when writing files. Apparently some libraries treat the empty string as the owner password as an instruction to generate a random password.	2013-03-11 12:37:32 -04:00
Jay Berkenbilt	29f5830325	Fix getTypeCode and getTypeName work for indirect objects Remove const qualifier from getTypeCode and getTypeName methods of QPDFObjectHandle, make them work properly for indirect objects, and exercise them much better in the test suite.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	119f2a4b68	Add method to terminate content stream parsing	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	ac4deac187	Call QUtil::safe_fopen in place of fopen fopen was previously called wrapped by QUtil::fopen_wrapper, but QUtil::safe_fopen does this itself, which is less cumbersome.	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	a51ae10b8d	Remove all calls to sprintf	2013-03-05 13:35:46 -05:00
Jay Berkenbilt	30027481f7	Remove all old-style casts from C++ code	2013-03-04 16:45:16 -05:00
Jay Berkenbilt	32b62035ce	Replace many calls to sprintf with QUtil::hex_encode Add QUtil::hex_encode to encode binary data as a hexadecimal string, and use it in place of sprintf where possible.	2013-03-04 16:45:15 -05:00
Jay Berkenbilt	bfda717749	Cosmetic changes to be closer to Adobe terminology Change object type Keyword to Operator, and place the order of the object types in object_type_e in the same order as they are mentioned in the PDF specification. Note that this change only breaks backward compatibility with code that has not yet been released.	2013-01-23 09:38:05 -05:00
Jay Berkenbilt	913eb5ac35	Add getTypeCode() and getTypeName() Add virtual methods to QPDFObject, wrappers to QPDFObjectHandle, and implementations to all the QPDF_Object types.	2013-01-22 10:01:45 -05:00
Jay Berkenbilt	f81152311e	Add QPDFObjectHandle::parseContentStream method This method allows parsing of the PDF objects in a content stream or array of content streams.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	1d88955fa6	Added new QPDFObjectHandle types Keyword and InlineImage These object types are to facilitate content stream parsing.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	a04a835849	Clarify methods to get user password With newer encryption formats, it is no longer possible to recover the user password using the owner password.	2013-01-03 20:45:53 -05:00
Jay Berkenbilt	f8306913ba	Update "C" API with functions for new features	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	9eb5982fa3	Avoid modifying trailer when writing When preparing the trailer for writing to the new file, trim a copy of the trailer instead of the original file's trailer.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	8843e499b8	Update copyright year to 2013 Also add copyright notice to a few public headers that were missing one.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	4237a29c94	Refactor Dictionary writing code Original code was written before we could shallow copy objects, so all the filtering was done by suppressing the output of certain keys and replacing them with other keys. Now we can simplify the code greatly by modifying shallow copies of dictionaries in place.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	e57c25814e	Support for encryption with /V=5 and /R=5 and /R=6 Read and write support is implemented for /V=5 with /R=5 as well as /R=6. /R=5 is the deprecated encryption method used by Acrobat IX. /R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	93ac1695a4	Support files with only attachments encrypted Test cases added in a future commit since they depend on /R=6 support.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	4eccb9d87b	Add random number functions to QUtil	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	8f5de08c2a	Comment about non-const Pipeline data	2012-12-31 10:32:31 -05:00
Jay Berkenbilt	774584163f	Add ExtensionLevel support to version handling All version operations are now fully aware of extension levels.	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	3101955ac0	Add V5 parameters to EncryptionData	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	68447bb556	change EncryptionData	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	04c203ae06	Eliminate flattenScalarReferences	2012-12-31 05:36:48 -05:00
Jay Berkenbilt	041397fdab	Allow reading from InputSource and writing to Pipeline Allowing users to subclass InputSource and Pipeline to read and write from/to arbitrary sources provides the maximum flexibility for users who want to read and write from other than files or memory.	2012-09-23 17:42:26 -04:00
Jay Berkenbilt	c1627d0438	Add QPDFWriter::setExtraHeaderText	2012-09-06 15:31:12 -04:00
Jay Berkenbilt	137dc7acb9	Refactor: move resolution of literal to its own method	2012-08-11 09:22:59 -04:00
Jay Berkenbilt	32051283b9	Fix spelling errors	2012-07-29 14:44:12 -04:00
Jay Berkenbilt	f83bddf882	Update copyright to 2012	2012-07-28 22:03:36 -04:00
Tobias Hoffmann	9c00874e77	added QPDFObjectHandle::replaceStreamData(std::string data).	2012-07-25 03:02:46 +02:00
Jay Berkenbilt	316328704b	Windows compilation fixes	2012-07-21 20:51:56 -04:00
Jay Berkenbilt	6bbea4baa0	Implement QPDFObjectHandle::parse Move object parsing code from QPDF to QPDFObjectHandle and parameterize the parts of it that are specific to a QPDF object. Provide a version that can't handle indirect objects and that can be called on an arbitrary string. A side effect of this change is that the offset used when reporting invalid stream length has changed, but since the new value seems like a better value than the old one, the test suite has been updated rather than making the code backward compatible. This only affects the offset reported for invalid streams that lack /Length or have an invalid /Length key. Updated some test code and examples to use QPDFObjectHandle::parse. Supporting changes include adding a BufferInputSource constructor that takes a string.	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	f3e267fce2	Move readToken from QPDF to QPDFTokenizer	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	15eaed5c52	Refactor: pull *InputSource out of QPDF InputSource, FileInputSource, and BufferInputSource are now top-level classes instead of privately nested inside QPDF.	2012-07-21 09:06:06 -04:00
Jay Berkenbilt	a101533e0a	Add command line option to copy encryption from other file Add --copy-encryption and --encryption-file-password options to qpdf. Also strengthen test suite for copying encryption. The strengthened test suite would have caught the failure to preserve AES and the failure to update the file version, which was invalidating the encrypted data.	2012-07-15 21:15:24 -04:00
Jay Berkenbilt	0575d77d77	Add public QPDFWriter::copyEncryptionParameters Method to copy encryption parameters from another file. Adapted from existing code to copy encryption parameters from the original file.	2012-07-14 09:14:41 -04:00
Jay Berkenbilt	11b194a1d0	Update getPageImages() comment to mention pushInheritedAttributesToPage()	2012-07-11 15:56:50 -04:00
Jay Berkenbilt	e7b8f297ba	Support copying objects from another QPDF object This includes QPDF::copyForeignObject and supporting foreign objects as arguments to addPage*.	2012-07-11 15:54:33 -04:00
Jay Berkenbilt	8a217eb3a2	Add concept of reserved objects QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved provide a mechanism to add objects to a PDF file when there are circular references. This is a prerequisite to copying objects from one PDF to another.	2012-07-10 23:34:32 -04:00
Tobias Hoffmann	8720446b23	Added assertNumber and assertScalar to QPDFObjectHandle	2012-07-07 18:55:08 -04:00
Tobias Hoffmann	a8266ccb0e	Added public assert{Type} methods to QPDFObjectHandle	2012-07-07 18:53:38 -04:00
Tobias Hoffmann	39bbaa86e3	Build this->all_pages while traversing with pushInheritedAttributesToPage	2012-07-07 17:45:10 -04:00
Jay Berkenbilt	e2dedde4bd	Don't require stream data provider to know length in advance Breaking API change: length parameter has disappeared from the StreamDataProvider version of QPDFObjectHandle::replaceStreamData since it is no longer necessary to compute it in advance. This breaking change is justified by the fact that removing the length parameter provides the caller an opportunity to simplify the calling code.	2012-07-07 17:33:45 -04:00
Jay Berkenbilt	8705e2e8fc	Add QPDFWriter method to output to FILE*	2012-07-05 21:24:04 -04:00
Tobias Hoffmann	abb53ac369	Limited inheritance to the attributes explicitly listed in the PDF spec Previous versions of qpdf incorrectly passed arbitrary objects from /Pages objects down to individual pages in direct contradiction with the PDF specification. These are now left in /Pages. When intermediate /Pages nodes are being discarded as when the /Pages tree is being flattened, a warning is issued when unknown keys are encountered.	2012-07-04 23:04:55 -04:00
Tobias Hoffmann	7770a1b036	Added public method QPDF::pushInheritedAttributesToPage Refactored optimizePagesTree to pushInheritedAttributesToPage and made public	2012-07-04 16:24:03 -04:00
Tobias Hoffmann	235188df85	Fixed wording error in comment	2012-07-04 14:51:53 -04:00
Jay Berkenbilt	5f59c32f87	Add a few minor enhancements to recent work Test coverage case for new newStream method Expose decimal_places argument for double-based newReal All enhancements suggested by Tobias.	2012-06-27 10:43:27 -04:00
Tobias Hoffmann	f07e3370f0	Add Pl_Concatenate filter	2012-06-27 10:20:38 -04:00
Tobias Hoffmann	43c404b45a	Add QPDFObjectHandle::newStream(QPDF *, std::string const&) This makes the code simpler than having to create a buffer of a fixed size and copy the string to it.	2012-06-27 10:19:57 -04:00
Tobias Hoffmann	75054c0b94	Add QPDFObjectHandle::newReal(double)	2012-06-27 10:19:01 -04:00
Jay Berkenbilt	2266c6232b	Rework InputSource::readLine to make it much more efficient This rework makes xref reconstruction run much faster and use much less memory.	2012-06-27 06:48:06 -04:00
Jay Berkenbilt	736bafbb9c	Rename seek functions in QUtil	2012-06-26 23:10:10 -04:00
Jay Berkenbilt	8318d81ada	Fix and test support for files >= 4 GB	2012-06-24 15:56:50 -04:00
Jay Berkenbilt	781c313058	Change QPDF_Integer from int to long long This makes it possible to store offsets that are larger than 2 GB in the trailer dictionary.	2012-06-24 15:20:01 -04:00
Jay Berkenbilt	4f305488d8	Improve the FILE* version of QPDF::processFile	2012-06-23 18:23:06 -04:00
Jay Berkenbilt	ffb96ee17e	Add pdf-from-scratch example	2012-06-23 09:05:06 -04:00
Jay Berkenbilt	b6bdc0f595	Add factory methods for creating empty arrays and dictionaries. Also updated pdf_from_scratch test driver to use the new factories, and made some cosmetic improvements and documentation updates for the emptyPDF() method.	2012-06-22 09:46:33 -04:00
Jay Berkenbilt	a0768e4190	Add QPDF::emptyPDF() and pdf_from_scratch test code	2012-06-21 23:09:05 -04:00
Jay Berkenbilt	81e8752362	Use qpdf_offset_t in place of off_t in public APIs. off_t is used internally only when needed to talk to standard libraries. This requires that the "long long" type be supported by the compiler.	2012-06-21 21:23:24 -04:00
Jay Berkenbilt	d1ebe30ff6	Add QPDFObjectHandle::shallowCopy()	2012-06-21 16:15:09 -04:00
Jay Berkenbilt	eb802cfa8c	Implement page manipulation APIs	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	df493c352f	Refactor optimizePagesTree Split optimizePagesTree into a simpler top-level routine and a recursive internal routine.	2012-06-21 15:01:02 -04:00
Tobias Hoffmann	5d3f93be29	Added first version of pages API.	2012-06-21 15:01:02 -04:00
Tobias Hoffmann	405a549f8c	Make QPDFObjectHandle::assertPageObject() public. The method is helpful in other places, like the upcoming QPDF::addPage, too.	2012-06-21 15:01:02 -04:00
Tobias Hoffmann	47a846a7e0	Added method to clear pages cache.	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	f59ff6fcc2	fix include order for off_t	2012-06-21 14:11:22 -04:00
Jay Berkenbilt	bc1c4bb578	Add QPDF::processFile that takes an open FILE*	2012-06-21 08:00:35 -04:00
Tobias Hoffmann	db7474e0fa	Added additional array mutators Added methods to append to arrays, insert items into arrays, and replace array contents with a vector of items.	2012-06-20 15:29:44 -04:00
Jay Berkenbilt	5d4cad9c02	ABI change: fix use of off_t, size_t, and integer types Significantly improve the code's use of off_t for file offsets, size_t for memory sizes, and integer types in cases where there has to be compatibility with external interfaces. Rework sections of the code that would have prevented qpdf from working on files larger than 2 (or maybe 4) GB in size.	2012-06-20 15:20:26 -04:00
Jay Berkenbilt	b856379370	Portability issues: off_t, unlink New header qpdf/Types.h attempts to make sure size_t and off_t are defined on any platform and in a way that would work with large file support. Additionally, missing header files are included to get unlink.	2012-06-20 15:18:14 -04:00
Jay Berkenbilt	2757b9b05f	export new methods	2011-08-11 15:55:06 -04:00
Jay Berkenbilt	76b1659177	enhance PointerHolder so that it can explicitly be told to use delete [] instead of delete, thus making it useful to run valgrind over qpdf during its test suite	2011-08-11 11:57:37 -04:00

522 Commits