mirror of
https://github.com/qpdf/qpdf.git
synced 2025-01-22 22:58:33 +00:00
TODO: abandon (again) and update notes about QPDFPagesTree
This commit is contained in:
parent
8a32515a62
commit
25aff0bd52
80
TODO
80
TODO
@ -9,7 +9,10 @@ Before Release:
|
||||
* Release qtest with updates to qtest-driver and copy back into qpdf
|
||||
|
||||
Next:
|
||||
* QPDFPagesTree -- avoid ever flattening the pages tree.
|
||||
* QPDF -- track whether the pages tree was modified (whether
|
||||
getAllPages was ever called. If so, consider generating a non-flat
|
||||
pages tree before creating output to better handle files with lots
|
||||
of pages.
|
||||
* JSON v2 fixes
|
||||
|
||||
Pending changes:
|
||||
@ -45,39 +48,6 @@ Pending changes:
|
||||
Soon: Break ground on "Document-level work"
|
||||
|
||||
|
||||
QPDFPagesTree
|
||||
=============
|
||||
|
||||
Partial work is on qpdf-pages-tree branch. QPDFPageTree is mostly
|
||||
implemented and mostly tested. There are not enough cases of different
|
||||
kinds of operations (pclm, linearize, json, etc.) with non-flat pages
|
||||
trees. Insertion is not implemented.
|
||||
|
||||
Page tree repair is silent (no warnings) and has a comment saying that
|
||||
we don't need warnings, but I think we should have warnings now that
|
||||
we have json v2. The reason is that page tree repair will change
|
||||
object numbers, and it's useful to know that.
|
||||
|
||||
I'm thinking we will want to keep a pages cache for efficient
|
||||
insertion. There's no reason we can't keep a vector of page objects up
|
||||
to date and just do a traversal the first time we do getAllPages just
|
||||
like we do now. The difference is that we would not flatten the pages
|
||||
tree. It would be useful to go through QPDF_pages and reimplement
|
||||
everything without calling flattenPagesTree. Then we can remove
|
||||
flattenPagesTree, which is private.
|
||||
|
||||
In its current state, QPDFPagesTree does not proactively fix /Type or
|
||||
correct page objects that are used multiple times. You have to
|
||||
traverse the pages tree to trigger this operation. It would be nice if
|
||||
we would do that somewhere but not do it more often than necessary so
|
||||
isPagesObject and isPageObject are reliable and can be made more
|
||||
reliable. Maybe add a validate or repair function? It should also make
|
||||
sure /Count and /Parent are correct.
|
||||
|
||||
refs/attic/QPDFPagesTree-old -- original, abandoned branch -- clean up
|
||||
when done.
|
||||
|
||||
|
||||
JSON v2 fixes
|
||||
=============
|
||||
|
||||
@ -676,7 +646,47 @@ A few important lessons (in README-maintainer)
|
||||
possible.
|
||||
|
||||
Also, it turns out that PointerHolder is more performant than
|
||||
std::shared_ptr.
|
||||
std::shared_ptr. (This was true at the time but subsequent
|
||||
implementations of std::shared_ptr became much more efficient.)
|
||||
|
||||
QPDFPagesTree
|
||||
=============
|
||||
|
||||
On a few occasions, I have considered implementing a QPDFPagesTree
|
||||
object that would allow the document's original page tree structure to
|
||||
be preserved. See comments at the top QPDF_pages.cc for why this was
|
||||
abandoned.
|
||||
|
||||
Partial work is in refs/attic/QPDFPagesTree. QPDFPageTree is mostly
|
||||
implemented and mostly tested. There are not enough cases of different
|
||||
kinds of operations (pclm, linearize, json, etc.) with non-flat pages
|
||||
trees. Insertion is not implemented. Insertion is potentially complex
|
||||
because of the issue of inherited objects. We will have to call
|
||||
pushInheritedAttributesToPage before adding any pages to the pages
|
||||
tree. The test suite is failing on that branch.
|
||||
|
||||
Some parts of page tree repair are silent (no warnings). All page tree
|
||||
repair should warn. The reason is that page tree repair will change
|
||||
object numbers, and knowing that is important when working with JSON
|
||||
output.
|
||||
|
||||
If we were to do this, we would still need keep a pages cache for
|
||||
efficient insertion. There's no reason we can't keep a vector of page
|
||||
objects up to date and just do a traversal the first time we do
|
||||
getAllPages just like we do now. The difference is that we would not
|
||||
flatten the pages tree. It would be useful to go through QPDF_pages
|
||||
and reimplement everything without calling flattenPagesTree. Then we
|
||||
can remove flattenPagesTree, which is private. That said, with the
|
||||
addition of creating non-flat pages trees, there is really no reason
|
||||
not to flatten the pages tree for internal use.
|
||||
|
||||
In its current state, QPDFPagesTree does not proactively fix /Type or
|
||||
correct page objects that are used multiple times. You have to
|
||||
traverse the pages tree to trigger this operation. It would be nice if
|
||||
we would do that somewhere but not do it more often than necessary so
|
||||
isPagesObject and isPageObject are reliable and can be made more
|
||||
reliable. Maybe add a validate or repair function? It should also make
|
||||
sure /Count and /Parent are correct.
|
||||
|
||||
Rejected Ideas
|
||||
==============
|
||||
|
@ -21,6 +21,8 @@
|
||||
// reference is returned, the current code is fine and does not need
|
||||
// to be replaced. A partial implementation of QPDFPagesTree is in
|
||||
// github in attic in case there is ever a reason to resurrect it.
|
||||
// There are additional notes in README-maintainer, which also refers
|
||||
// to this comment.
|
||||
|
||||
// The goal of this code is to ensure that the all_pages vector, which
|
||||
// users may have a reference to, and the pageobj_to_pages_pos map,
|
||||
|
Loading…
x
Reference in New Issue
Block a user