2
1
mirror of https://github.com/qpdf/qpdf.git synced 2024-11-02 19:49:43 +00:00
Commit Graph

614 Commits

Author SHA1 Message Date
Jay Berkenbilt
2b6c79bcae Improve locating inline image's EI
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
ec9e310c9e Refactor QPDFTokenizer's inline image handling
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
31372edce0 Inline image token value ends with EI, not delimiter
The inline image token erroneously included the delimiter that
followed EI. The ObjectHandle created from it was correct.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
b776dcd2d3 Clean up some private functions 2019-01-29 22:14:20 -05:00
Jay Berkenbilt
8a9cfd2605 Handle direct page objects (fixes #164) 2019-01-29 17:01:36 -05:00
Jay Berkenbilt
2d0885bc11 Clarify documentation for copyForeignObject regarding pages
Make explicit that copyForeignObject can be used on page objects and
will copy them properly but not update the pages tree.
2019-01-28 21:53:55 -05:00
Jay Berkenbilt
2712869cf9 Fix logic for when to compress object and xref streams (fixes #271) 2019-01-28 21:43:06 -05:00
Jay Berkenbilt
52f9d326a5 Resolve duplicated page objects (fixes #268)
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
2019-01-28 20:29:58 -05:00
Jay Berkenbilt
623f5b664e Convert pages to form XObjects
Support conversion of pages to form XObjects and placement of form
XObjects on pages.
2019-01-27 07:50:30 -05:00
Jay Berkenbilt
68ccd87c9e Move rectangle transformation into QPDFMatrix 2019-01-27 07:50:30 -05:00
Jay Berkenbilt
8cb245739c Add QPDFObjectHandle::getUniqueResourceName 2019-01-27 07:50:30 -05:00
Jay Berkenbilt
009767d97a Handle inheritable page attributes
Add getAttribute for handling inheritable page attributes, and fix
getPageImages and annotation flattening code to use it.
2019-01-25 22:30:05 -05:00
Jay Berkenbilt
2d32f4db8f Handle fallback font size in text appearances
If we end up using our fallback font size when generating appearances
for text fields, reflect that in the Tf operator used in the
appearance stream.
2019-01-21 07:38:21 -05:00
Jay Berkenbilt
9cb599875b Improve text objects used in text appearance streams 2019-01-20 23:05:58 -05:00
Jay Berkenbilt
930eade6d3 Fix omissions in text appearance generation
When generating appearance streams for variable text annotations,
properly handle the cases of there being no appearance dictionary, no
appearance stream, or an appearance stream with no BMC..EMC marker.
2019-01-20 23:05:58 -05:00
Jay Berkenbilt
65ef0bf313 When flattening, remove annotations with no appearance stream
With the exception of form field annotations when /NeedAppearances is
true, remove annotations that don't have appearance streams when
flattening. There is no reason to keep these when flattening since
they are invisible. This may include unchecked checkboxes, unshown
popup windows, etc.
2019-01-20 23:05:58 -05:00
Jay Berkenbilt
c18ee440a3 mingw workaround for QPDFExc destructor
mingw doesn't like it when you don't inline empty virtual destructors.
2019-01-19 10:14:07 -05:00
Jay Berkenbilt
e87d149918 Add QUtil::possible_repaired_encodings 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
6ec22f117d Modernize encryption API for more granularity
Setting encryption permissions for R >= 3 set permission bits in
groups corresponding to menu options in Acrobat 5. The new API allows
the bits to be set individually.
2019-01-17 11:43:56 -05:00
Jay Berkenbilt
4630377731 Add status-reporting transcoders to QUtil 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
8f389f14c0 QUtil::analyze_encoding 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
6817ca585a Bidirectional transcoding for win, mac, pdf, utf8, utf16 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
698485468a Move remaining existing transcoding to QUtil 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
5cfcd4f361 Additional checks for unreferenced resources
Explicitly abandon removal of unreferenced resources if there are any
lexical errors in the page's contents. This case always generated a
warning, but it now also prevents removal of unreferenced resources,
this strongly decreasing the likelihood of data loss.
2019-01-17 11:43:56 -05:00
Jay Berkenbilt
4bc434000c Copy subdictionaries when removing resources (fixes #276)
When removing unreferenced resources, the code was copying the overall
resource dictionaries but not the subdictionaries being modified. This
was a "typo" in the code -- the comment clearly stated the need to do
this, but the code replaced the dictionary with itself rather than
with a shallow copy of itself.
2019-01-17 09:40:05 -05:00
Jay Berkenbilt
654c0e8caf Allow adding the same page more than once in --pages (fixes #272) 2019-01-12 10:01:47 -05:00
Jay Berkenbilt
4ecd1df6f2 Add configure option AVOID_WINDOWS_HANDLE
If set, we avoid using Windows I/O HANDLE, which is disallowed in some
versions of the Windows SDK, such as for Windows phones.
QUtil::same_file will always return false in this case. Only applies
to Windows builds.
2019-01-10 22:35:08 -05:00
Jay Berkenbilt
d24a120c7f Add QPDF::setImmediateCopyFrom 2019-01-10 22:35:08 -05:00
Jay Berkenbilt
b653929c93 Update version to 8.3.0 2019-01-07 11:16:54 -05:00
Jay Berkenbilt
aa602fd107 Fix integer overflow in large file test 2019-01-07 08:49:14 -05:00
Jay Berkenbilt
c3cee5f154 Exercise out of scope original pdf for copyForeignObject 2019-01-07 07:38:03 -05:00
Jay Berkenbilt
fddbcab0e7 Mostly don't require original QPDF for copyForeignObject (fixes #219)
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
fbbb0ee016 Make a static version of QPDF::pipeStreamData
This is in preparation of being able to pipe a stream's data without
keeping a copy of its containing qpdf object.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
7588cac295 Create an application-scope unique ID for each QPDF object
Use this instead of QPDF* as a map key for object_copiers.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
e27ac682e0 Move encryption parameters into a class 2019-01-06 09:58:16 -05:00
Jay Berkenbilt
a70fbaaf50 Honor other base encodings when generating appearances 2019-01-05 23:01:59 -05:00
Jay Berkenbilt
b341d742db Add WinAnsi and MacRoman encoding 2019-01-05 23:01:44 -05:00
Jay Berkenbilt
3ef1b77304 Refactor QUtil::utf8_to_ascii 2019-01-05 22:59:29 -05:00
Jay Berkenbilt
089ce5902e Move utf8_to_utf16 into QUtil 2019-01-05 22:59:27 -05:00
Jay Berkenbilt
ae18bfd142 Refactor string transcoding in QPDF_String 2019-01-05 22:56:58 -05:00
Jay Berkenbilt
2e342ee5bb Spell check 2019-01-04 21:33:14 -05:00
Jay Berkenbilt
16fd6e64f9 Add QPDFWriter::getFinalVersion (fixes #266) 2019-01-04 12:37:22 -05:00
Jay Berkenbilt
837dcf8fc2 Don't call assert while checking linearization data (fixes #209, #231)
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
2019-01-04 11:55:42 -05:00
Jay Berkenbilt
a01359189b Fix dangling references (fixes #240)
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
2019-01-04 10:29:29 -05:00
Jay Berkenbilt
158156d506 Add basic appearance stream generation 2019-01-04 08:00:19 -05:00
Jay Berkenbilt
02281632cc Add QUtil::utf8_to_ascii 2019-01-03 23:18:13 -05:00
Jay Berkenbilt
b55567a0fa Add special case setV code for button fields 2019-01-03 23:18:13 -05:00
Jay Berkenbilt
e3144ac417 Add form fields to json output
Also add some additional methods for detecting form field types to
assist in the json creation and for later use.
2019-01-03 23:18:13 -05:00
Jay Berkenbilt
ca94ac68d9 Honor flags when flattening annotations 2019-01-03 11:59:55 -05:00
Jay Berkenbilt
06d6438ddf Minor fixes 2019-01-03 09:17:43 -05:00
Jay Berkenbilt
3e74916c5a Fix seg fault on empty xref stream (fixes #263)
Thanks to @p-cher for supplying a patch.
2019-01-03 09:17:43 -05:00
Jay Berkenbilt
f78ea057ca Switch annotation flattening to use the form xobjects
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form xobjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
xobjects.
2019-01-02 21:49:47 -05:00
Jay Berkenbilt
3b8ce4f12a Annotation flattening including form fields
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearance is false (or equivalently absent). If
flattening form fields, also remove /AcroForm from the document
catalog.
2019-01-01 08:14:15 -05:00
Jay Berkenbilt
95d6b17a89 Add QPDFObjectHandle::mergeDictionary() 2019-01-01 08:12:56 -05:00
Jay Berkenbilt
104fd6da52 Add matrix and annotation appearance stream handling
Generate page content fragment for rendering appearance streams
including all matrix calculation.
2019-01-01 08:07:21 -05:00
Jay Berkenbilt
5059ec0d35 Add Matrix class under QPDFObjectHandle 2018-12-31 23:02:43 -05:00
Jay Berkenbilt
daeb5a85b6 Transformation matrix 2018-12-31 18:23:47 -05:00
Jay Berkenbilt
3440ea7d3c JSON::serialize -> unparse
Unparse is admittedly strange, but I'd rather be strange and
consistent, and everything else in the qpdf library uses unparse to
serialize. (If you're reading this, the convention of using "unparse"
comes from the "clu" programming language.)
2018-12-25 11:52:21 -05:00
Jay Berkenbilt
fa3664357b Move numrange code from qpdf.cc to QUtil.cc
Also move tests to libtests.
2018-12-21 19:11:57 -05:00
Jay Berkenbilt
d5d179f441 Add document and object helpers for outlines (bookmarks) 2018-12-21 19:11:57 -05:00
Jay Berkenbilt
30a0c070e4 Add QPDFObjectHandle::getJSON() 2018-12-21 18:34:56 -05:00
Jay Berkenbilt
651179b5da Add simple JSON serializer 2018-12-21 18:34:56 -05:00
Jay Berkenbilt
0776c00129 Add QPDFNameTreeObjectHelper 2018-12-21 18:34:56 -05:00
Jay Berkenbilt
cc500eda91 Minor cleanup 2018-12-21 17:25:31 -05:00
Jay Berkenbilt
6ef9e31233 Add QPDFPageLabelDocumentHelper 2018-12-18 16:59:24 -05:00
Jay Berkenbilt
f38df27aa3 Add QPDFNumberTreeObjectHelper 2018-12-18 16:46:10 -05:00
Jay Berkenbilt
077d3d4512 Add QPDFObjectHandle::wrapInArray()
Wrap an object in an array if it is not already an array.
2018-12-18 16:45:48 -05:00
Jay Berkenbilt
d1368a3851 Commit automatically generated files 2018-10-11 17:27:54 -04:00
Jay Berkenbilt
6ee761fc86 Prepare 8.2.1 release 2018-08-18 10:56:19 -04:00
Jay Berkenbilt
5e9e17e62a Prepare 8.2.0 release 2018-08-16 11:53:10 -04:00
Jay Berkenbilt
693cdaac35 Missing header for std::max 2018-08-16 11:53:10 -04:00
Jay Berkenbilt
b4ce557be5 Fix error in QPDFSystemError.cc 2018-08-14 11:39:07 -04:00
Jay Berkenbilt
b4bdc42b4f New exception class QPDFSystemError (fixes #221) 2018-08-13 20:01:51 -04:00
Jay Berkenbilt
5d9d80beba Fix fallback logic for encryption (fixes #229) 2018-08-12 22:32:40 -04:00
Jay Berkenbilt
60fe8061cb Fix one more identifier (fixes #236) 2018-08-12 22:01:51 -04:00
Jay Berkenbilt
a2f62935b3 Catch exceptions as const references (fixes #236)
This fix allows qpdf to compile/test cleanly with gcc 8.
2018-08-12 21:57:52 -04:00
Jay Berkenbilt
3d6615b276 Pl_Buffer: reduce memory growth (fixes #228)
Rather than keeping a list of buffers for every write, accumulate
bytes in a single buffer, doubling the size of the buffer when needed
to accommodate new data.

This is not the best possible implementation, but the change was
implemented in this way to avoid changing the shape of Pl_Buffer and
thus breaking backward compatibility.
2018-08-12 17:45:43 -04:00
Jay Berkenbilt
3873f5fd9b Protect headers with compliant identifiers (fixes #233) 2018-08-12 14:10:32 -04:00
Jay Berkenbilt
932799baab Fix memory access error
A previous fix introduced a potentially memory overrun under certain
rare conditions. The test suite now once again passes with address
sanitizer.
2018-08-12 13:16:17 -04:00
Jay Berkenbilt
b6e414b10b Remove some extraneous null pointer checks (fixes #234)
There were a few places in the code that were checking that a pointer
wasn't null before deleting it, even though C++ has always allowed
delete 0. Most of the code did not perform these checks.
2018-08-12 12:58:39 -04:00
Jay Berkenbilt
4a4736c695 Fix EOL handling inside strings (fixes #226)
CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is
to be ignored after backslash.
2018-08-05 20:48:35 -04:00
Jay Berkenbilt
1619cad1e8 Return correct method for string encryption (fixes #227) 2018-08-05 16:58:21 -04:00
Jay Berkenbilt
e1cd5891af Fix infinite loop on small files with progress reporting (fixes #230)
Turns out you can keep adding zero to a number over and over again and
it just doesn't get any bigger. Who would have known?
2018-08-05 15:43:34 -04:00
Jay Berkenbilt
4f4c627b77 ClosedFileInputSource: add method to keep file open
During periods of intensive operation on a specific file, this method
can reduce the overhead of repeated open/close operations.
2018-08-04 19:52:46 -04:00
Jay Berkenbilt
1bd2a2e79b Prepare 8.1.0 release 2018-06-23 07:50:11 -04:00
Jay Berkenbilt
3aad28aed0 Bug fix: honor encryption key length with R=3 (fixes #212) 2018-06-22 19:24:26 -04:00
Jay Berkenbilt
a433ed24f9 Add progress reporting for QPDFWriter (fixes #200) 2018-06-22 16:14:54 -04:00
Jay Berkenbilt
2a82f6e1e0 Add method to get count of objects in QPDF 2018-06-22 15:53:40 -04:00
Jay Berkenbilt
c81836076f Correct incorrect comment 2018-06-22 13:13:09 -04:00
Jay Berkenbilt
4ccc8b1a44 Add ClosedFileInputSource
ClosedFileInputSource is an input source that keeps the file closed
when not reading it.
2018-06-22 12:52:45 -04:00
Jay Berkenbilt
c71dc6888c Don't prune resource dictionaries on errors or by request
If we are unable to filter a page's content streams, don't attempt to
remove objects from the page's resource dictionary. Also provide a
command line option to suppress resource removal in case we ever need
this as a workaround for some bug or broken PDF files.
2018-06-22 10:45:31 -04:00
Jay Berkenbilt
38c9ed23c3 Treat content stream parsing errors as an error, not a warning
If parsing content streams is treated as a warning, there is no way
for a caller to know if a parsing operation has failed. This is very
dangerous and will likely result in data loss when token filters are
parser callbacks are in use.
2018-06-22 10:44:08 -04:00
Jay Berkenbilt
6c89d4b35b When splitting files, remove unreferenced objects (fixes #203) 2018-06-21 21:03:30 -04:00
Jay Berkenbilt
ddd78c1b7f Fix QPDFObjectHandle::shallowCopy
It's not really a shallow copy. It just doesn't cross indirect object
boundaries. The old implementation had a bug that would cause multiple
shallow copies of the same object to share memory, which was not the
intention.
2018-06-21 20:34:45 -04:00
Jay Berkenbilt
397b097c46 Allow setting a form field's value 2018-06-21 15:57:13 -04:00
Jay Berkenbilt
952a665a4e Better support for creating Unicode strings 2018-06-21 15:57:13 -04:00
Jay Berkenbilt
e44c395c51 QUtil::toUTF16 2018-06-21 15:57:13 -04:00
Jay Berkenbilt
0b05111db8 Implement helper class for interactive forms 2018-06-21 15:57:13 -04:00
Jay Berkenbilt
2e7ee23bf6 Add QPDFPageDocumentHelper and QPDFPageObjectHelper
This is the beginning of higher-level API support using helper
classes. The goal is to be able to add more helpers without continuing
to pollute QPDF's and QPDFObjectHandle's public interfaces.
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
4cded10821 Add QPDFObjectHandle::Rectangle type
Provide a convenient way of accessing rectangles.
2018-06-21 15:57:13 -04:00