Specifically, if a stream had its stream data replaced and had
indirect /Filter or /DecodeParms, it would result in non-silent loss
of data and/or internal error.
Wildcard expansion is different in Windows from non-Windows and
sometimes requires special link options to work. Add tests that fail
if we link incorrectly.
Issue #399 mentioned a use case for which qpdf has support, but the
fact that it is supported was not documented or in the test suite,
making it vulerable to accidental breakage.
If the value of /CS in the inline image dictionary was is key in the
page's /Resource -> /ColorSpace dictionary, properly resolve it by
referencing the proper colorspace, and not just the name, in the
external image dictionary.
Allow exit status-based checking of whether a file is encrypted or
requires a password without necessarily supplying the correct
password. Useful for scripting.
Various PDF digital signing tools do not encrypt /Contents value in
signature dictionary. Adobe Acrobat Reader DC can handle a PDF with
the /Contents value not encrypted.
Write Contents in signature dictionary without encryption
Tests ensure that string /Contents are not handled specially when not
found in sig dicts.
It seems better not to compress signature dictionaries. Various PDF
digital signing tools, including Adobe Acrobat Reader DC, do not
compress signature dictionaries.
Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference
describes that /ByteRange in the signature dictionary shall be used to
describe a digest that does not include the signature value
(/Contents) itself.
The byte ranges cannot be determined if the dictionary is compressed.
Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference
describes that the value of Contents entry is a hexadecimal string
representation when ByteRange is specified.
This commit makes QPDF always uses hexadecimal strings representation
instead of literal strings for it.
It's detected in QPDFWriter instead of at parse time because I can't
figure out how to construct a test case in a reasonable time. This
commit moves the fuzz file into the regular test suite for a QTC
coverage case.
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
On read, ignore /DecodeParms when empty list; on write, delete it.
Some files have been found that include an empty list for
/DecodeParms, but this is not technically compliant with the spec, and
the only sensible interpretation is to treat it as if there are no
decode parameters.
The preservation of outlines didn't provide very useful behavior
anyway as it copied all outlines but most didn't work. This
implementation also caused a very significant performance hit and so
is being reverted until a proper solution can be coded. The eventual
solution will not be compatible with the reverted solution anyway, so
it's best not to leave this in.
Embarcadero C++Builder doesn't support more than 50 files open at the same time for legacy 32 Bit apps, which makes a test fail trying to open more than that many files. This changes the number of open files for that test to far less to make the test succeed. Alternatively one could reduce the hard coded number of 200 in QPDF itself, which I didn't do currently because it needs adoption of manuals etc. and is something which needs to be discussed with the author of QPDF. I guess chances are better to get the test changed upstream.
This fixes #288: https://github.com/qpdf/qpdf/issues/288
There have been issues reported where exceptions are not thrown
properly across shared library/DLL boundaries, so add a test
specifically to ensure that exceptions are caught as thrown.
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
When generating appearance streams for variable text annotations,
properly handle the cases of there being no appearance dictionary, no
appearance stream, or an appearance stream with no BMC..EMC marker.
With the exception of form field annotations when /NeedAppearances is
true, remove annotations that don't have appearance streams when
flattening. There is no reason to keep these when flattening since
they are invisible. This may include unchecked checkboxes, unshown
popup windows, etc.
Allow fine control over how passwords are encoded for writing, and
allow password for reading to be given as a hexademical encoded
string. Allow suppression of password recovery as a means to ensure
that the password you specify is actually the right one.
Explicitly abandon removal of unreferenced resources if there are any
lexical errors in the page's contents. This case always generated a
warning, but it now also prevents removal of unreferenced resources,
this strongly decreasing the likelihood of data loss.
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearance is false (or equivalently absent). If
flattening form fields, also remove /AcroForm from the document
catalog.