From 352ce9b22ba32fa50e22258f7d61557ae0d53cc5 Mon Sep 17 00:00:00 2001
From: Jay Berkenbilt
Date: Tue, 18 Dec 2018 14:59:14 -0500
Subject: [PATCH] Preserve page labels (numbers) when splitting and merging

---
 ChangeLog                                 |   6 +
 manual/qpdf-manual.xml                    |  38 +--
 qpdf/qpdf.cc                              |  52 +++-
 qpdf/qpdf.testcov                         |   1 +
 qpdf/qtest/qpdf.test                      |  29 +-
 qpdf/qtest/qpdf/11-pages-with-labels.pdf  | Bin 0 -> 4215 bytes
 qpdf/qtest/qpdf/labels-split-01-06.pdf    | 324 ++++++++++++++++++++++
 qpdf/qtest/qpdf/labels-split-07-11.pdf    | 280 +++++++++++++++++++
 qpdf/qtest/qpdf/merge-implicit-ranges.pdf | 220 +++++++--------
 qpdf/qtest/qpdf/merge-multiple-labels.pdf | Bin 0 -> 3452 bytes
 qpdf/qtest/qpdf/merge-three-files-1.pdf   | Bin 8495 -> 8396 bytes
 qpdf/qtest/qpdf/merge-three-files-2.pdf   | Bin 6036 -> 6196 bytes
 12 files changed, 806 insertions(+), 144 deletions(-)
 create mode 100644 qpdf/qtest/qpdf/11-pages-with-labels.pdf
 create mode 100644 qpdf/qtest/qpdf/labels-split-01-06.pdf
 create mode 100644 qpdf/qtest/qpdf/labels-split-07-11.pdf
 create mode 100644 qpdf/qtest/qpdf/merge-multiple-labels.pdf

diff --git a/ChangeLog b/ChangeLog
index 75fad549..5f44bb05 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,11 @@
 2018-12-18 Jay Berkenbilt
 
+ * Preserve page labels when merging and splitting files. Prior
+ versions of qpdf simply preserved the page label information from
+ the first file, which usually wouldn't make any sense in the
+ merged file. Now any page that had a page number in any original
+ file will have the same page number after merging or splitting.
+
  * Add QPDFPageLabelDocumentHelper class. This is a document helper
  class that provides useful methods for dealing with page labels.
  It abstracts the fact that they are stored as number trees and
diff --git a/manual/qpdf-manual.xml b/manual/qpdf-manual.xml
index 8185b848..e327eb95 100644
--- a/manual/qpdf-manual.xml
+++ b/manual/qpdf-manual.xml
@@ -911,23 +911,23 @@ make
- Note that qpdf doesn't presently do anything special about other
- constructs in a PDF file that may know about pages, so semantics
- of splitting and merging vary across features. For example, the
- document's outlines (bookmarks) point to actual page objects, so
- if you select some pages and not others, bookmarks that point to
- pages that are in the output file will work, and remaining
- bookmarks will not work. On the other hand, page labels (page
- numbers specified in the file) are just sequential, so page labels
- will be messed up in the output file. A future version of
- qpdf may do a better job at handling these
- issues. (Note that the qpdf library already contains all of the
- APIs required in order to implement this in your own application
- if you need it.) In the mean time, you can always use
- as the primary input file to avoid
- copying all of that from the first file. For example, to take
- pages 1 through 5 from a infile.pdf while
- preserving all metadata associated with that file, you could use
+ Starting in qpdf version 8.3, when you split and merge files, any
+ page labels (page numbers) are preserved in the final file. It is
+ expected that more document features will be preserved by
+ splitting and merging. In the mean time, semantics of splitting
+ and merging vary across features. For example, the document's
+ outlines (bookmarks) point to actual page objects, so if you
+ select some pages and not others, bookmarks that point to pages
+ that are in the output file will work, and remaining bookmarks
+ will not work. A future version of qpdf may do
+ a better job at handling these issues. (Note that the qpdf library
+ already contains all of the APIs required in order to implement
+ this in your own application if you need it.) In the mean time,
+ you can always use as the primary input
+ file to avoid copying all of that from the first file. For
+ example, to take pages 1 through 5 from a
+ infile.pdf while preserving all metadata
+ associated with that file, you could use
  qpdf
@@ -946,8 +946,8 @@ make
  If, for some reason, you wanted to take the first page of an
  encrypted file called encrypted.pdf with
  password pass and repeat it twice in an output
- file, and if you wanted to drop metadata (like page numbers and
- outlines) but preserve encryption, you would use
+ file, and if you wanted to drop document-level metadata but
+ preserve encryption, you would use
  qpdf