Removing an element from a set with iterator is constant time, and
std::set specifies that other operations on the set do not invalidate
existing iterators.
QPDFValueProxy wasn't a good name for it. We decided the evil of
having the header file be named QPDFObject_private.hh was less than
the evil of having the class be named something other than what it
should have been named.
This is in preparation for restoring a QPDFObject.hh to ease the
transition on qpdf_object_type_e.
This commit was created by
* Renaming QPDFObject.cc and QPDFObject.hh
* Replacing QPDFObject\b with QPDFValueProxy (where \b is word
boundary)
* Running format-code
* Manually resorting files in libqpdf/CMakeLists.txt
* Manually refilling the comment in QPDF.hh near class Resolver
Also change QPDF::replaceObject and QPDF::swapObjects such that the
QPDFObject assigned to an og in the obj_cache is never replaced; only
QPDFObject::value is updated.
Remove unnecessary parameters.
Remove code that is unnecessary as result of a prior call to QPDF::getAllPages.
Avoid clearing and rebuilding of m->all_pages.
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
* Replace --create-from-json=file with --json-input, which causes the
regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
stream data.
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.
In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
This was originally not public because I wanted to get rid fo the
pages cache, but I recently realized there were deep reasons not to do
that, and the author of pikepdf wanted this, so I decided to make it
public.
This results in a performance penalty of 1% to 2% when replaceObject
and swapObjects are never called and a somewhat larger penalty if they
are called, but it's worth it to avoid very confusing behavior as
discussed in depth in qpdf#507.
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.
There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
This is the beginning of higher-level API support using helper
classes. The goal is to be able to add more helpers without continuing
to pollute QPDF's and QPDFObjectHandle's public interfaces.