Original reported here:
https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413
The PDF specification says that the /Type key for nodes in the pages
dictionary (both /Page and /Pages) is required, but some PDF files
omit them. Use the presence of other keys to determine the type of
pages tree node this is if the type key is not found.
QPDFWriter was trying to make /Filter and /DecodeParms direct in all
cases, but there are some cases where /DecodeParms may refer to a
stream, which can't be direct. QPDFWriter doesn't actually need
/DecodeParms to be direct in that case because it won't be able to
filter the stream. Until we can handle this type of stream, just don't
make /Filter and /DecodeParms direct if we can't filter the stream
anyway.
Fixes #34
In compare image tests, use the gs device tiff24nc instead of tiff12nc
since the 4 bit per sample images created by tiff12nc could sometimes
trigger a bug in tiffcmp. Fixes #20.
In places where std::vector<T>(size_t) was used, either validate that
the size parameter is sane or refactor code to avoid the need to
pre-allocate the vector.
Space rather than newline after xref, missing /ID in trailer for
encrypted file. This enables qpdf to handle some files that xpdf can
handle. Adobe reader can't necessarily handle them.
Rework QPDFWriter to always track old object IDs and QPDFObjGen
instead of int, thus not discarding the generation number. Switch to
QPDF::getCompressibleObjGen() to properly handle the case of an old
object eligible for compression that has a generation of other than
zero.
Remove const qualifier from getTypeCode and get getTypeName methods of
QPDFObjectHandle, make them work properly for indirect objects, and
exercise them much better in the test suite.
Put a specific comment marker next to every piece of code that MSVC
gives warning 4996 for. This warning is generated for calls to
functions that Microsoft considers insecure or deprecated. This
change is in preparation for fixing all these cases even though none
of them are actually incorrect or insecure as used in qpdf. The
comment marker makes them easier to find so they can be fixed in
subsequent commits.
Change iteration to use size_t instead of int. The code should be
equivalent in all reasonable cases, but the original way this was
coded was causing a test failure with gcc 4.8.0 on ppc64. See
https://bugzilla.redhat.com/show_bug.cgi?id=915321 for additional
information.
Fix exit status for case of errors without warnings, continue after
errors when possible, add test case for parsing a file with content
stream errors on some but not all pages.
Change object type Keyword to Operator, and place the order of the
object types in object_type_e in the same order as they are mentioned
in the PDF specification.
Note that this change only breaks backward compatibility with code
that has not yet been released.
This fix eliminates a false test failure on some platforms and makes
the binary test work properly whether characters with the high bit
set, when treated as integers, are negative or not.
Original code was written before we could shallow copy objects, so all
the filtering was done by suppressing the output of certain keys and
replacing them with other keys. Now we can simplify the code greatly
by modifying shallow copies of dictionaries in place.
Read and write support is implemented for /V=5 with /R=5 as well as
/R=6. /R=5 is the deprecated encryption method used by Acrobat IX.
/R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.
This file used to exercise a zero offset test case when qpdf would
visit every object in the file. After the next commit, qpdf no longer
touches unreferenced objects, so a reference had to be added to
continue to have this file exercise the zero offset case.
For linearization tests where we are actually comparing the exact
output of the test with a known file, uncompress stream data so we can
see what's there. This makes looking at future changes a little
easier.