Comments about incremental update support

Also remove some trivial, non-functional code.
Jay Berkenbilt 2013-11-30 16:04:21 -05:00
parent cdff7a4966
commit b802ca47e9
2 changed files with 35 additions and 8 deletions

TODO

@@ -64,6 +64,40 @@
General
=======
* Provide support in QPDFWriter for writing incremental updates.
Provide support in qpdf for preserving incremental updates. The
goal should be that QDF mode should be fully functional for files
with incremental updates including fix_qdf.

Note that there's nothing that says an indirect object in one
update can't refer to an object that doesn't appear until a later
update. This means that QPDF has to treat indirect null objects
differently from how it does now. QPDF drops indirect null objects
that appear as members of arrays or dictionaries. For arrays, it's
handled in QPDFWriter where we make indirect nulls direct. This is
in a single if block, and nothing else in the code cares about it.
We could just remove that if block and not break anything except a
few test cases that exercise the current behavior. For
dictionaries, it's more complicated. In this case,
QPDF_Dictionary::getKeys() ignores all keys with null values, and
hasKey() returns false for keys that have null values. We would
probably want to make QPDF_Dictionary able to handle the special
case of keys that are indirect nulls and basically never have it
drop any keys that are indirect objects.

If we make a change to have qpdf preserve indirect references to
null objects, we have to note this in ChangeLog and in the release
notes since this will change output files. We did this before when
we stopped flattening scalar references, so this is probably not a
big deal. We also have to make sure that the testing for this
handles non-trivial cases of the targets of indirect nulls being
replaced by real objects in an update. I'm not sure how this plays
with linearization, if at all. For cases where incremental updates
are not being preserved as incremental updates and where the data
is being folded in (as is always the case with qpdf now), none of
this should make any difference in the actual semantics of the
files.
* When decrypting files with /R=6, hash_V5 is called more than once
with the same inputs. Caching the results or refactoring to reduce
the number of identical calls could improve performance for

@@ -980,10 +980,6 @@ QPDFWriter::enqueueObject(QPDFObjectHandle object)
" another file.");
}
-if (object.isNull())
-{
-// This is a place-holder object for an object stream
-}
QPDFObjGen og = object.getObjGen();
if (obj_renumber.count(og) == 0)
@@ -2014,10 +2010,7 @@ QPDFWriter::prepareFileForWrite()
// Do a traversal of the entire PDF file structure replacing all
// indirect objects that QPDFWriter wants to be direct. This
// includes stream lengths, stream filtering parameters, and
-// document extension level information. Also replace all
-// indirect null references with direct nulls. This way, the only
-// indirect nulls queued for output will be object stream place
-// holders.
+// document extension level information.
std::list<QPDFObjectHandle> queue;
queue.push_back(getTrimmedTrailer());
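
The dictionary behavior discussed in the TODO entry above can be illustrated with a small stand-alone sketch. This is a toy model only: ToyValue, ToyDict, and the functions below are hypothetical names invented for illustration and are not qpdf's classes or API. The first function mimics the behavior the TODO attributes to QPDF_Dictionary::getKeys() (every key with a null value is hidden); the second shows one possible alternative in which keys whose values are indirect references are never dropped, since a later update might supply the target object.

```cpp
// Toy model only: these types are hypothetical and are NOT qpdf's classes.
#include <cassert>
#include <map>
#include <set>
#include <string>

// For this sketch, only two properties of a dictionary value matter.
struct ToyValue
{
    bool is_null;     // the object (after resolving any reference) is null
    bool is_indirect; // the value is a reference such as "5 0 R"
};

typedef std::map<std::string, ToyValue> ToyDict;

// Behavior described in the TODO: hide every key whose value is null,
// whether the null is direct or reached through an indirect reference.
std::set<std::string> keysDroppingAllNulls(ToyDict const& d)
{
    std::set<std::string> result;
    for (ToyDict::const_iterator it = d.begin(); it != d.end(); ++it)
    {
        if (! it->second.is_null)
        {
            result.insert(it->first);
        }
    }
    return result;
}

// Assumed alternative: hide only direct nulls. Indirect values are
// always kept because a later update may define their target.
std::set<std::string> keysKeepingIndirect(ToyDict const& d)
{
    std::set<std::string> result;
    for (ToyDict::const_iterator it = d.begin(); it != d.end(); ++it)
    {
        if (it->second.is_indirect || (! it->second.is_null))
        {
            result.insert(it->first);
        }
    }
    return result;
}

// Build a dictionary with one real value, one indirect null, and one
// direct null.
ToyDict makeExample()
{
    ToyValue real = { false, false };
    ToyValue indirect_null = { true, true };
    ToyValue direct_null = { true, false };
    ToyDict d;
    d["/Real"] = real;
    d["/Later"] = indirect_null;
    d["/Gone"] = direct_null;
    return d;
}

// Check the difference between the two policies on the example.
void checkExample()
{
    ToyDict d = makeExample();
    assert(keysDroppingAllNulls(d).count("/Later") == 0);
    assert(keysKeepingIndirect(d).count("/Later") == 1);
    assert(keysKeepingIndirect(d).count("/Gone") == 0);
}
```

Under the second policy only the direct null disappears, which is the kind of output change the TODO says would need to be noted in ChangeLog and the release notes.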