This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.
There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
There have been issues reported where exceptions are not thrown
properly across shared library/DLL boundaries, so add a test
specifically to ensure that exceptions are caught as thrown.
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form xobjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
xobjects.
Some files in the test suite trigger antivirus warnings. These are
not infected files with malicious intent. They are test files to
ensure that qpdf does not crash when it encounters the files. This
change enables those files to be obfuscated in the source repository
so that checking out qpdf from version control or extracting the
source code doesn't trigger antivirus warnings.
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a
TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a
general filter that passes data through a TokenFilter.
Files written in PCLm mode have to be created in a very specific way.
qpdf doesn't know how to create PCLm files from scratch. All it knows
how to do is to write an already valid file in a suitable way.
Therefore there is no command-line support for PCLm.
This commit adds several API methods that enable control over which
types of filters QPDF will attempt to decode. It also adds support for
/RunLengthDecode and /DCTDecode filters for both encoding and
decoding.
Remove const qualifier from getTypeCode and get getTypeName methods of
QPDFObjectHandle, make them work properly for indirect objects, and
exercise them much better in the test suite.
Put a specific comment marker next to every piece of code that MSVC
gives warning 4996 for. This warning is generated for calls to
functions that Microsoft considers insecure or deprecated. This
change is in preparation for fixing all these cases even though none
of them are actually incorrect or insecure as used in qpdf. The
comment marker makes them easier to find so they can be fixed in
subsequent commits.
This fix eliminates a false test failure on some platforms and makes
the binary test work properly whether characters with the high bit
set, when treated as integers, are negative or not.