For non-encrypted files, determinstic ID generation uses file contents
instead of timestamp and file name. At a small runtime cost, this
enables generation of the same /ID if the same inputs are converted in
the same way multiple times.
As reported in issue #40, a call to CryptAcquireContext in
SecureRandomDataProvider fails in a fresh windows install prior to any
user keys being created in AppData\Roaming\Microsoft\Crypto\RSA.
Thanks michalrames.
Pushing inherited objects to pages and getting all pages were both
prone to stack overflow infinite loops if there were loops in the
Pages dictionary. There is a general weakness in the code in that any
part of the code that traverses the Pages structure would be prone to
this and would have to implement its own loop detection. A more robust
fix may provide some general method for handling the Pages structure,
but it's probably not worth doing.
Note: addition of *Internal2 private functions was done rather than
changing signatures of existing methods to avoid breaking
compatibility.
Converting a password to an encryption key is supposed to copy up to a
certain number of bytes from a digest. Make sure never to copy more
than the size of the digest.
When checking two objects preceding R while parsing, ensure that the
objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R
from causing an infinite loop in object resolution.
Original reported here:
https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413
The PDF specification says that the /Type key for nodes in the pages
dictionary (both /Page and /Pages) is required, but some PDF files
omit them. Use the presence of other keys to determine the type of
pages tree node this is if the type key is not found.
QPDFWriter was trying to make /Filter and /DecodeParms direct in all
cases, but there are some cases where /DecodeParms may refer to a
stream, which can't be direct. QPDFWriter doesn't actually need
/DecodeParms to be direct in that case because it won't be able to
filter the stream. Until we can handle this type of stream, just don't
make /Filter and /DecodeParms direct if we can't filter the stream
anyway.
Fixes #34
Fix problem: if the last object in the first part of a linearized file
had an offset that was below 65536 by less than the size of the hint
stream, the xref stream was invalid and the resulting file is not
usable.
Add new RandomDataProvider object and implement existing random number
generation in terms of that. This enables end users to supply their
own random data providers.
If NO_GET_ENVIRONMENT is #defined at compile time on Windows, do not
call GetEnvironmentVariable. QUtil::get_env will always return
false. This option is not available through configure. This was
added to support a specific user's requirements to avoid calling
GetEnvironmentVariable from the Windows API. Nothing in qpdf outside
the test coverage system in qtest relies on QUtil::get_env.
Ideally, the library should never call assert outside of test code,
but it does in several places. For some cases where the assertion
might conceivably fail because of a problem with the input data,
replace assertions with exceptions so that they can be trapped by the
calling application. This commit surely misses some cases and
replaced some cases unnecessarily, but it should still be an
improvement.
In places where std::vector<T>(size_t) was used, either validate that
the size parameter is sane or refactor code to avoid the need to
pre-allocate the vector.
The /W array was not sanitized, possibly causing an integer overflow
in a multiplication. An analysis of the code suggests that there were
no possible exploits based on this since the problems were in checking
expected values but bounds checks were performed on actual values.
4.2.0 was binary incompatible in spite of there being no deletions or
changes to any public methods. As such, we have to bump the ABI and
are fixing some API breakage while we're at it.
Previous 4.3.0 target is now 5.1.0.
Space rather than newline after xref, missing /ID in trailer for
encrypted file. This enables qpdf to handle some files that xpdf can
handle. Adobe reader can't necessarily handle them.
Rework QPDFWriter to always track old object IDs and QPDFObjGen
instead of int, thus not discarding the generation number. Switch to
QPDF::getCompressibleObjGen() to properly handle the case of an old
object eligible for compression that has a generation of other than
zero.
Versions prior to 4.6 didn't allow gcc diagnostic pragmas with push
and pop and to appear anywhere in the file. Just let the warning be
there for those versions.
Remove const qualifier from getTypeCode and get getTypeName methods of
QPDFObjectHandle, make them work properly for indirect objects, and
exercise them much better in the test suite.
Put a specific comment marker next to every piece of code that MSVC
gives warning 4996 for. This warning is generated for calls to
functions that Microsoft considers insecure or deprecated. This
change is in preparation for fixing all these cases even though none
of them are actually incorrect or insecure as used in qpdf. The
comment marker makes them easier to find so they can be fixed in
subsequent commits.
Fix exit status for case of errors without warnings, continue after
errors when possible, add test case for parsing a file with content
stream errors on some but not all pages.
Change object type Keyword to Operator, and place the order of the
object types in object_type_e in the same order as they are mentioned
in the PDF specification.
Note that this change only breaks backward compatibility with code
that has not yet been released.
The upcoming 3.1 release contains non-compatible API changes, though
they only affect parts of the interface that are extremely unlikely to
have been used outside of qpdf itself. The methods and data types
affected were used for communication between QPDFWriter and QPDF and
would have had no real use in end user code.
Original code was written before we could shallow copy objects, so all
the filtering was done by suppressing the output of certain keys and
replacing them with other keys. Now we can simplify the code greatly
by modifying shallow copies of dictionaries in place.
Read and write support is implemented for /V=5 with /R=5 as well as
/R=6. /R=5 is the deprecated encryption method used by Acrobat IX.
/R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.
Changes from upstream are limited to change #include paths so that I
can place header files and included "c" files in a subdirectory. I
didn't keep the unit tests from sphlib but instead verified them by
running them manually. I will implement the same tests using the
Pl_SHA2 pipeline except that sphlib's sha2 implementation supports
partial bytes, which I will not exercise in qpdf or our tests.