Update documentation on zlib compatibility

Jay Berkenbilt 2023-12-20 14:16:39 -05:00
parent 10fe5143f4
commit 6aa811e5cd
4 changed files with 115 additions and 22 deletions

@ -1,5 +1,12 @@
2023-12-20 Jay Berkenbilt <ejb@ql.org>
* Update code and tests so that qpdf's test suite no longer
depends on the output of any specific zlib implementation. This
makes it possible to get a fully passing test suite with any
API-compatible zlib library. CI tests with the default zlib as
well as zlib-ng (including verifying that zlib-ng is not the
default), but any zlib implementation should work. Fixes #774.
* Bug fix: with --compress-streams=n, don't compress object, XRef,
or linearization hint streams.

@ -275,7 +275,99 @@ Building docs from pull requests is also enabled.
## ZLIB COMPATIBILITY
XXX Write this
The qpdf test suite is designed to be independent of the output of any
particular version of zlib. There are several strategies to make this
work:
* `build-scripts/test-alt-zlib` runs in CI and runs the test suite
with a non-default zlib. Please refer to that code for an example of
how to do this in case you want to test locally.
* The test suite is full of cases that compare output PDF files with
expected PDF files in the test suite. If the file contains data that
was compressed by QPDFWriter, then the output file will depend on
the behavior of zlib. As such, using a simple comparison won't work.
There are several strategies used by the test suite.
* A new program called `qpdf-test-compare` is, in most cases, a
drop-in replacement for a simple file comparison. This program makes
sure the two files have exactly the same number of objects with the
same object and generation numbers, and that corresponding objects
are identical with the following allowances (consult its source
code for all the details):
* The `/Length` key is not compared in stream dictionaries.
* The second element of `/ID` is not compared.
* If the first and second elements of `/ID` are the same, then the
first element of `/ID` is also not compared.
* If a stream is compressed with `/FlateDecode`, the
_uncompressed_ stream data is compared. Otherwise, the raw
stream data is compared.
* Generated fields in the `/Encrypt` dictionary are not compared,
though password-protected files must have the same password.
* Differences in the contents of `/XRef` streams are ignored.
To use this, run `qpdf-test-compare actual.pdf expected.pdf`, and
expect the output to match `expected.pdf`. For example, if a test
used to be written like this:
```perl
$td->runtest("check output",
{$td->FILE => "a.pdf"},
{$td->FILE => "out.pdf"});
```
then write it like this instead:
```perl
$td->runtest("check output",
{$td->COMMAND => "qpdf-test-compare a.pdf out.pdf"},
{$td->FILE => "out.pdf", $td->EXIT_STATUS => 0});
```
You can look at `compare-for-test/qtest/compare.test` for
additional examples.
Here's what's going on:
* If the files "match" according to the rules of
`qpdf-test-compare`, the output of the program is the expected
file.
* If the files do not match, the output is the actual file. The
reason is that, when a code change intentionally changes the
expected file, the output of the comparison can be used to
replace the expected file (as long as it is definitely known to
be correct; no shortcuts here!). That way, it doesn't matter
which zlib you use to generate test files.
* As a special debugging tool, you can set the `QPDF_COMPARE_WHY`
environment variable to any value. In this case, if the files
don't match, the output is a description of the first thing in
the file that doesn't match. This is mostly useful for debugging
`qpdf-test-compare` itself, but it can also be helpful as a
sanity check that the differences are expected. If you are
trying to find out the _real_ differences, a suggestion is to
convert both files to qdf and compare them lexically.
* There are some cases where `qpdf-test-compare` can't be used. For
example, if you need to actually test one of the things that
`qpdf-test-compare` ignores, you'll need some other mechanism.
There are tests for deterministic ID creation and xref streams
that have to implement other mechanisms. Also, linearization hint
streams and the linearization dictionary in a linearized file
contain file offsets. Rather than ignoring those, it can be
helpful to create linearized files using `--compress-streams=n`.
In that case, `QPDFWriter` won't compress any data, so the PDF
will be independent of the output of any particular zlib
implementation.
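As a rough illustration of why a byte-for-byte comparison fails and why
comparing uncompressed stream data works, here is a minimal Python
sketch. It is not how `qpdf-test-compare` is implemented. The helper
name `flate_streams_match` and the sample content are hypothetical, and
two different compression levels stand in for two different zlib
implementations.

```python
import zlib


def flate_streams_match(actual: bytes, expected: bytes) -> bool:
    """Compare two Flate-compressed payloads by their uncompressed contents."""
    return zlib.decompress(actual) == zlib.decompress(expected)


# Sample stream content (hypothetical); repeated so it compresses well.
data = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 50

# Different levels emulate different zlib implementations: both results
# are valid Flate data for the same input, but the bytes are not the same.
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

print(fast == best)                     # False: raw compressed bytes differ
print(flate_streams_match(fast, best))  # True: uncompressed data is identical
```

The same reasoning explains why `/Length` is excluded from the
comparison: the compressed size can differ even when the uncompressed
data does not.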
You can find many examples of how tests were rewritten by looking at
the commits preceding the one that added this section of this README
file.
Note about `/ID`: many test cases use `--static-id` to have a
predictable `/ID` for testing. Many other test cases use
`--deterministic-id`. While `--static-id` is unaffected by file
contents, `--deterministic-id` is based on file contents and so is
dependent on zlib output if there is any newly compressed data. By
using `qpdf-test-compare`, it's actually not necessary to use either
`--static-id` or `--deterministic-id`. It may still be necessary to
use `--static-aes-iv` if comparing encrypted files, but since
`qpdf-test-compare` ignores `/Perms`, a wider range of encrypted files
can be compared using `qpdf-test-compare`.
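The difference between the two `/ID` options can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
`STATIC_ID` is a made-up placeholder, `content_based_id` uses a plain
MD5 of the bytes as a stand-in for "based on file contents", and neither
reproduces what qpdf actually does. Two compression levels again stand
in for two zlib implementations.

```python
import hashlib
import zlib

# Hypothetical placeholder; not the value qpdf writes with --static-id.
STATIC_ID = "00112233445566778899aabbccddeeff"


def static_id(file_bytes: bytes) -> str:
    """An ID that ignores the file contents entirely."""
    return STATIC_ID


def content_based_id(file_bytes: bytes) -> str:
    """An ID derived from the bytes of the file (simplified stand-in)."""
    return hashlib.md5(file_bytes).hexdigest()


page = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 50

# The same logical content, written with two different compressors.
file_a = b"%PDF-sketch\n" + zlib.compress(page, 1)
file_b = b"%PDF-sketch\n" + zlib.compress(page, 9)

print(static_id(file_a) == static_id(file_b))                # True
print(content_based_id(file_a) == content_based_id(file_b))  # False
```

Any ID computed from newly compressed bytes inherits their variability,
which is why a comparison that skips `/ID` is more robust than either
option.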
## HOW TO ADD A COMMAND-LINE ARGUMENT

TODO.md

@ -18,27 +18,6 @@ Contents
- [HISTORICAL NOTES](#historical-notes)
zlib-ng
=======
* Write ZLIB COMPATIBILITY section of README-maintainer.md.
* Note: deterministic IDs are affected by choice of zlib
```
cd /tmp
git clone https://github.com/zlib-ng/zlib-ng
cd zlib-ng
cmake -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/inst -DZLIB_COMPAT=ON
cmake --build build -j $(nproc)
(cd build; ctest --verbose)
cmake --install build
```
Then run qpdf's test suite with
```
LD_PRELOAD=/tmp/inst/lib/libz.so.1 ctest --verbose
```
Always
======

@ -39,6 +39,21 @@ Planned changes for future 12.x (subject to change):
.. x.y.z: not yet released
11.7.0: not yet released
- Bug fixes:
- With ``--compress-streams=n``, qpdf was still compressing cross
reference streams, linearization hint streams, and object
streams. This has been fixed.
- Build Enhancements:
- The qpdf test suite now passes when qpdf is linked with an
alternative ``zlib`` implementation. There are no dependencies
anywhere in the qpdf test suite on any particular ``zlib``
output. Consult the ``ZLIB COMPATIBILITY`` section of
``README-maintainer.md`` for a detailed explanation of how to
maintain this.
- Library Enhancements:
- Add C++ functions ``qpdf_c_wrap`` and ``qpdf_c_get_qpdf`` to the