2
1
mirror of https://github.com/qpdf/qpdf.git synced 2024-06-05 20:00:53 +00:00
Commit Graph

158 Commits

Author SHA1 Message Date
Jay Berkenbilt
39bfa01307 Implement user-provided stream filters
Refactor QPDF_Stream to use stream filter classes to handle supported
stream filters as well.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
cc8895078a Add QPDFObjectHandle::makeDirect(bool allow_streams) 2020-12-26 08:48:18 -05:00
Jay Berkenbilt
2050977099 Add QPDFObjectHandle manipulation to C API 2020-11-28 19:48:07 -05:00
Jay Berkenbilt
a7ef572c84 Small enhancement to --pages argument parsing 2020-11-09 11:12:34 -05:00
Jay Berkenbilt
b30deaeeab Avoid merging adjacent tokens when concatenating contents (fixes #444) 2020-10-23 08:00:04 -04:00
Jay Berkenbilt
893d38b87e Allow propagation of errors and retry through StreamDataProvider
StreamDataProvider::provideStreamData now has a rich enough API for it
to effectively proxy to pipeStreamData.
2020-04-05 20:07:13 -04:00
Jay Berkenbilt
67d5ed3a64 Implement remove-unreferenced-resources=auto 2020-04-04 13:19:49 -04:00
Jay Berkenbilt
dac65a21fb Look in form XObjects when removing unreferenced resources (fixes #373)
If a page contains a form XObject, also filter the form XObject and
remove its unreferenced resources.
2020-03-31 17:39:20 -04:00
Jay Berkenbilt
57c01ef81f In qdf mode, don't write extra XRef streams (fixes #386)
fix-qdf assumes there is exactly one XRef stream and that it is at the
end of the file.
2020-01-26 16:50:57 -05:00
Jay Berkenbilt
bbc2f8ffae Bug fix: handle ColorSpace lookup for inline images (fixes #392)
If the value of /CS in the inline image dictionary was is key in the
page's /Resource -> /ColorSpace dictionary, properly resolve it by
referencing the proper colorspace, and not just the name, in the
external image dictionary.
2020-01-26 15:29:10 -05:00
Masamichi Hosoda
5e0ba12687 Fix /Contents value representation in a signature dictionary
Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference
describes that the value of Contents entry is a hexadecimal string
representation when ByteRange is specified.

This commit makes QPDF always uses hexadecimal strings representation
instead of literal strings for it.
2019-10-22 16:16:16 -04:00
Jay Berkenbilt
8b1e307741 Warn for duplicated dictionary keys (fixes #345) 2019-09-19 20:22:34 -04:00
Jay Berkenbilt
47a38a942d Detect stream in object stream, fixing fuzz 16214
It's detected in QPDFWriter instead of at parse time because I can't
figure out how to construct a test case in a reasonable time. This
commit moves the fuzz file into the regular test suite for a QTC
coverage case.
2019-08-28 12:49:04 -04:00
Jay Berkenbilt
5da146c8b5 Track separately whether password was user/owner (fixes #159) 2019-08-24 11:01:19 -04:00
Jay Berkenbilt
3f3dbe22ea Remove array null flattening
For some reason, qpdf from the beginning was replacing indirect
references to null with literal null in arrays even after removing the
old behavior of flattening scalar references. This seems like a bad
idea.
2019-08-22 17:55:16 -04:00
Jay Berkenbilt
ae5bd7102d Accept extraneous space before xref (fixes #341) 2019-08-19 22:24:53 -04:00
Jay Berkenbilt
8a9086a689 Accept extraneous space after stream keyword (fixes #329) 2019-08-19 21:43:44 -04:00
Jay Berkenbilt
42d396f1dd Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332) 2019-08-19 19:48:27 -04:00
Jay Berkenbilt
45dac410b5 Remove broken QPDFTokenizer::expectInlineImage 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
3608afd5c5 Add new integer accessors to QPDFObjectHandle 2019-06-21 13:17:21 -04:00
Jay Berkenbilt
31bde2f9d7 Handle empty DecodeParams array for (fixes #331)
On read, ignore /DecodeParms when empty list; on write, delete it.
Some files have been found that include an empty list for
/DecodeParms, but this is not technically compliant with the spec, and
the only sensible interpretation is to treat it as if there are no
decode parameters.
2019-06-09 17:19:49 -04:00
Jay Berkenbilt
0a470d2daf Don't optimize non-8-bit images
Also add test cases for additional coverage on image optimization.
2019-01-31 21:29:28 -05:00
Jay Berkenbilt
eb49e07c0a Make inline image token exactly contain the image data
Do not include the trailing EI, and handle cases where EI is not
preceded by a delimiter. Such cases have been seen in the wild.
2019-01-31 20:28:44 -05:00
Jay Berkenbilt
5211bcb5ea Externalize inline images (fixes #278) 2019-01-31 10:38:13 -05:00
Jay Berkenbilt
2b6c79bcae Improve locating inline image's EI
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
ec9e310c9e Refactor QPDFTokenizer's inline image handling
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
8a9cfd2605 Handle direct page objects (fixes #164) 2019-01-29 17:01:36 -05:00
Jay Berkenbilt
52f9d326a5 Resolve duplicated page objects (fixes #268)
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
2019-01-28 20:29:58 -05:00
Jay Berkenbilt
426434c772 Add --overlay and --underlay to qpdf CLI (fixes #207) 2019-01-27 09:30:13 -05:00
Jay Berkenbilt
009767d97a Handle inheritable page attributes
Add getAttribute for handling inheritable page attributes, and fix
getPageImages and annotation flattening code to use it.
2019-01-25 22:30:05 -05:00
Jay Berkenbilt
2d32f4db8f Handle fallback font size in text appearances
If we end up using our fallback font size when generating appearances
for text fields, reflect that in the Tf operator used in the
appearance stream.
2019-01-21 07:38:21 -05:00
Jay Berkenbilt
930eade6d3 Fix omissions in text appearance generation
When generating appearance streams for variable text annotations,
properly handle the cases of there being no appearance dictionary, no
appearance stream, or an appearance stream with no BMC..EMC marker.
2019-01-20 23:05:58 -05:00
Jay Berkenbilt
65ef0bf313 When flattening, remove annotations with no appearance stream
With the exception of form field annotations when /NeedAppearances is
true, remove annotations that don't have appearance streams when
flattening. There is no reason to keep these when flattening since
they are invisible. This may include unchecked checkboxes, unshown
popup windows, etc.
2019-01-20 23:05:58 -05:00
Jay Berkenbilt
c2030d1f33 Implement password recovery suppression and password mode (fixes #215)
Allow fine control over how passwords are encoded for writing, and
allow password for reading to be given as a hexademical encoded
string. Allow suppression of password recovery as a means to ensure
that the password you specify is actually the right one.
2019-01-19 10:14:07 -05:00
Jay Berkenbilt
698485468a Move remaining existing transcoding to QUtil 2019-01-17 11:43:56 -05:00
Jay Berkenbilt
5cfcd4f361 Additional checks for unreferenced resources
Explicitly abandon removal of unreferenced resources if there are any
lexical errors in the page's contents. This case always generated a
warning, but it now also prevents removal of unreferenced resources,
this strongly decreasing the likelihood of data loss.
2019-01-17 11:43:56 -05:00
Jay Berkenbilt
654c0e8caf Allow adding the same page more than once in --pages (fixes #272) 2019-01-12 10:01:47 -05:00
Jay Berkenbilt
d24a120c7f Add QPDF::setImmediateCopyFrom 2019-01-10 22:35:08 -05:00
Jay Berkenbilt
c3cee5f154 Exercise out of scope original pdf for copyForeignObject 2019-01-07 07:38:03 -05:00
Jay Berkenbilt
fddbcab0e7 Mostly don't require original QPDF for copyForeignObject (fixes #219)
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
fbbb0ee016 Make a static version of QPDF::pipeStreamData
This is in preparation of being able to pipe a stream's data without
keeping a copy of its containing qpdf object.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
a70fbaaf50 Honor other base encodings when generating appearances 2019-01-05 23:01:59 -05:00
Jay Berkenbilt
2e342ee5bb Spell check 2019-01-04 21:33:14 -05:00
Jay Berkenbilt
ee2aad4381 Add CLI flags for image optimization 2019-01-04 21:33:14 -05:00
Jay Berkenbilt
a01359189b Fix dangling references (fixes #240)
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
2019-01-04 10:29:29 -05:00
Jay Berkenbilt
158156d506 Add basic appearance stream generation 2019-01-04 08:00:19 -05:00
Jay Berkenbilt
b55567a0fa Add special case setV code for button fields 2019-01-03 23:18:13 -05:00
Jay Berkenbilt
ca94ac68d9 Honor flags when flattening annotations 2019-01-03 11:59:55 -05:00
Jay Berkenbilt
f78ea057ca Switch annotation flattening to use the form xobjects
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form xobjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
xobjects.
2019-01-02 21:49:47 -05:00
Jay Berkenbilt
3b8ce4f12a Annotation flattening including form fields
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearance is false (or equivalently absent). If
flattening form fields, also remove /AcroForm from the document
catalog.
2019-01-01 08:14:15 -05:00