This change works around STL problems with Embarcadero C++ Builder
version 10.2, but std::vector is more common than std::list in qpdf,
and this is a relatively new API, so an API change is tolerable.
Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com>
for the fix.
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
Use PointerHolder in several places where manually memory allocation
and deallocation were being used. This helps to protect against memory
leaks when exceptions are thrown in surprising places.
In a small number of cases, it makes sense to replace an overloaded
function with a function that takes a default argument. We can do this
now because we've already broken binary compatibility since the last
release.
Have classes contain only a single private member of type
PointerHolder<Members>. This makes it safe to change the structure of
the Members class without breaking binary compatibility. Many of the
classes already follow this pattern quite successfully. This brings in
the rest of the class that are part of the public API.
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.
There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
Fix C++ exception handling when -fvisibility=hidden
Ensure that QPDFExc and QPDFSystemError are marked visible, so that their typeinfo will not be
suppressed when -fvisibility=hidden.
Details:
https://gcc.gnu.org/wiki/Visibility
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
Setting encryption permissions for R >= 3 set permission bits in
groups corresponding to menu options in Acrobat 5. The new API allows
the bits to be set individually.
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form xobjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
xobjects.
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearance is false (or equivalently absent). If
flattening form fields, also remove /AcroForm from the document
catalog.
Unparse is admittedly strange, but I'd rather be strange and
consistent, and everything else in the qpdf library uses unparse to
serialize. (If you're reading this, the convention of using "unparse"
comes from the "clu" programming language.)