We were using SGML entities for various non-ASCII characters so they
could convert properly for both HTML and print, but this is no longer
necessary as we move from docbook to RST, so just replace them. Note
that the conversions done by sphinx automatically handle "smart
quotes", so it works to just use regular quotes in place of “
and ”.
With docbook, this was not converted properly in the PDF version, but
since we are moving out of docbook, we can just put the Unicode
character in the source.
The impact on the code would be extremely high, and using it would
clutter the code greatly because it would break chaining like
a.getKey("/B").getKey("/C"). There are better ways to deal with the
issue.
Operations that add the same object to multiple places in the pages
tree are throwing exceptions and then later causing assertion
failures. The assert calls shouldn't be there.
Converted ResourceFinder to ParserCallbacks so we can better detect
the name that precedes various operators and use the operators to sort
the names into resource types. This enables us to be smarter about
detecting unreferenced resources in pages and also sets the stage for
reconciling differences in /DR across documents.
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
Avoid calling finish() multiple times on the pipeline passed to
pipeContentStreams. This commit also fixes a bug in which qpdf was not
exiting with the proper exit status if warnings found while splitting
pages; this was exposed by a test case that changed.
Make some more methods in QPDFPageObjectHelper work with form
XObjects, provide forEach methods to walk through nested form
XObjects, possibly recursively. This should make it easier to work
with form XObjects from user code.