diff --git a/README b/README index 1c27b6fa..acb79dd5 100644 --- a/README +++ b/README @@ -15,11 +15,20 @@ Prerequisites QPDF depends on external libraries "zlib" and "pcre". These are part of virtually all Linux distributions and are readily available; -download information appears in the documentation. You can also -download the external library distributions in source from from qpdf's -download site. For Windows, you can download pre-built binary -verisons of those libraries for some compilers; see README-windows.txt -for additional details. +download information appears in the documentation. For Windows, you +can download pre-built binary versions of those libraries for some +compilers; see README-windows.txt for additional details. + +QPDF requires a C++ compiler that works with STL. Your compiler must +also support "long long". Almost all modern compilers do. If you are +trying to port qpdf to a compiler that doesn't support long long, you +could change all occurrences of "long long" to "long" in the source +code, noting that this would break binary compatibility with other +builds of qpdf. Doing so would certainly prevent qpdf from working +with files larger than 2 GB, but remaining functionality would most +likely work fine. If you built qpdf this way and it passed its test +suite with large file support disabled, you could be confident that +you had an otherwise working qpdf. Licensing terms of embedded software @@ -49,20 +58,23 @@ For UNIX and UNIX-like systems, you can usually get by with just make make install -For more detailed general information, see the "INSTALL" file in this -directory. +Packagers may set DESTDIR, in which case make install will install +inside of DESTDIR, as is customary with many packages. For more +detailed general information, see the "INSTALL" file in this +directory. If you are already accustomed to building and installing +software that uses autoconf, there's nothing new for you in the +INSTALL file. 
+ Building on Windows =================== -QPDF is known to build and pass its test suite with mingw (gcc 4.4.0) -and Microsoft Visual C++ .NET 2008 Express. Either cygwin or MSYS -plus ActivateState Perl is required to build as well in order to get -make and other related tools. The MSVC works with either cygwin or -MSYS. The mingw build requires MSYS and will probably not work with -cygwin. - -For details on how to build under Windows, see README-windows.txt. +QPDF is known to build and pass its test suite with mingw (latest +version tested: gcc 4.6.2), mingw64 (latest version tested: 4.7.0) and +Microsoft Visual C++ 2010, both 32-bit and 64-bit versions. MSYS plus +ActiveState Perl is required to build as well in order to get make +and other related tools. See README-windows.txt for details on how to +build under Windows. Additional Notes on Build @@ -94,7 +106,10 @@ To learn about using the library, please read comments in the header files in include/qpdf, especially QPDF.hh, QPDFObjectHandle.hh, and QPDFWriter.hh. You can also study the code of qpdf/qpdf.cc, which exercises most of the public interface. There are additional example -programs in the examples directory. +programs in the examples directory. Reading all the source files in +the qpdf directory (including the qpdf command-line tool and some test +drivers) along with the code in the examples directory will give you a +complete picture of every aspect of the public interface. Additional Notes on Test Suite @@ -102,15 +117,21 @@ Additional Notes on Test Suite By default, slow tests are disabled. Slow tests include image comparison tests and large file tests. Image comparison tests can be -enabled by passing --enable-test-compare-images to ./configure. Large -file tests can be enabled by passing --with-large-file-test-path=path -to ./configure or by setting the LARGE_FILE_TEST_PATH environment -variable. Run ./configure --help for additional options. 
The test -suite provides nearly full coverage even without these tests. Unless -you are making deep changes to the library or testing this on a new -platform for the first time, there is no real reason to run these -tests. If you're just running the test suite to make sure that qpdf -works for your build, the default tests are adequate. +enabled by passing --enable-test-compare-images to ./configure. This +was on by default in qpdf versions prior to 3.0, but is now off by +default. Large file tests can be enabled by passing +--with-large-file-test-path=path to ./configure or by setting the +QPDF_LARGE_FILE_TEST_PATH environment variable. Run ./configure +--help for additional options. The test suite provides nearly full +coverage even without these tests. Unless you are making deep changes +to the library that would impact the contents of the generated PDF +files or testing this on a new platform for the first time, there is +no real reason to run these tests. If you're just running the test +suite to make sure that qpdf works for your build, the default tests +are adequate. The configure rules for these tests do nothing other +than setting variables in autoconf.mk, so you can feel free to turn +these on and off directly in autoconf.mk rather than rerunning +configure. If you are packaging qpdf for a distribution and preparing a build that is run by an autobuilder, you may want to add the diff --git a/README-what-to-download.txt b/README-what-to-download.txt index 67449e1f..287e5a84 100644 --- a/README-what-to-download.txt +++ b/README-what-to-download.txt @@ -5,24 +5,41 @@ file. For Windows, there are several additional files that you might want to download. - * qpdf--bin-mingw.zip + * qpdf--bin-mingw32.zip If you just want to use the qpdf commandline program or use the qpdf DLL's C-language interface, you can download this file. You can also download this version if you are using MINGW's gcc 4.4 and want to program using the C++ interface. 
- * qpdf--bin-msvc.zip + * qpdf--bin-mingw64.zip + + A 64-bit version built with mingw. Use this for 64-bit Windows + systems. The 32-bit version will also work on Windows 64-bit. + Both the 32-bit and the 64-bit version support files over 2 GB in + size, but you may find it easier to integrate this with your own + software if you use the 64-bit version. + + * qpdf--bin-msvc32.zip If you want to program using qpdf's C++ interface and you are using - Microsoft Visual C++ .NET 2008 (VC9), you can download this file. + Microsoft Visual C++ 2010 in 32-bit mode, you can download this + file. + + * qpdf--bin-msvc64.zip + + If you want to program using qpdf's C++ interface and you are using + Microsoft Visual C++ 2010 in 64-bit mode, you can download this + file. * qpdf-external-libs-bin.zip - If you want to build qpdf for Windows yourself with either MINGW's - gcc 4.4 or VC9, you can download this file and extract it inside - the qpdf source distribution. Please refer to README-windows.txt - in the qpdf source distribution for additional details. + If you want to build qpdf for Windows yourself with either MINGW or + MSVC 2010, you can download this file and extract it inside the + qpdf source distribution. Please refer to README-windows.txt in + the qpdf source distribution for additional details. Note that you + need the 2012-06-20 version or later to be able to build qpdf 3.0 + or newer. * qpdf-external-libs-src.zip diff --git a/README-windows.txt b/README-windows.txt index dda0caf2..33183880 100644 --- a/README-windows.txt +++ b/README-windows.txt @@ -1,21 +1,49 @@ Common Setup ============ -To be able to build qpdf and run its test suite, you must have either -Cygwin or MSYS from MinGW (>= 1.0.11) installed. If you want to build -with Microsoft Visual C++, either Cygwin or MSYS will do. If you want -to build with MinGW, you must use MSYS rather than Cygwin. +You may need to disable antivirus software to run qpdf's test suite. 
+ +To be able to build qpdf and run its test suite, you must have MSYS +from MinGW installed, and you must have ActiveState Perl. Here's what +I did on my system: + +Install ActiveState perl. + +Grab the latest mingw-get-inst. From the installation wizard, choose +to install developer kit, C, and C++ support. Once installed, you +will have an icon to start an msys shell. From the msys shell, run + +mingw-get install msys-unzip msys-zip mingw32-make + +Then replace perl and make with the appropriate versions: + +mv /bin/perl.exe /bin/msys-perl.exe +mv /bin/make.exe /bin/msys-make.exe +mv /mingw/bin/mingw32-make.exe /mingw/bin/make.exe + +Make sure perl --version shows ActiveState perl. + +To install MinGW-w64, first install msys and mingw32 as above. + +From the MinGW-w64 download page, go to "Toolchains targetting +Win64/Automated Builds" and find the latest mingw-w64 that runs under +i686-mingw. It will be called something like +mingw-w64-bin_i686-mingw_yyyymmdd.zip. The compiler binaries are +32-bit, which (of course) run on 64-bit Windows. Extract this under +C:\MinGW-w64, and add C:\MinGW-w64\bin and C:\MinGW-w64\lib\mingw to +the path. As of this writing, the image comparison tests confuse ghostscript in cygwin, but there's a chance they might work at some point. If you -want to run them, you need ghostscript and tiff utils as well. Then -omit --disable-test-compare-images from the configure statements given -below. The image comparison tests have not been tried under MSYS. +want to run them, you need ghostscript and tiff utils as well, and you +will need to add --enable-test-compare-images to the configure +statements given below. Jian Ma has generously provided a port of QPDF that works with Microsoft VC6. Several changes are required, but they are well documented in his port. You can find the VC6 port in the -contrib area of the qpdf download area. +contrib area of the qpdf download area. 
It may not always be +up-to-date with the latest official qpdf release. External Libraries @@ -24,13 +52,14 @@ External Libraries In order to build qpdf, you must have copies of zlib and pcre. The easy way to get them is to download them from the qpdf download area. There are packages called external-libs-bin.zip and -external-libs-src.zip. If you are building with MSVC 9 (.NET 2008) or -MINGW 4.4, you can just extract the external-libs-bin.zip zip file -into the top-level qpdf source tree. It will create a directory -called external-libs which contains header files and precompiled -libraries. Passing --enable-external-libs to ./configure (which is -done automatically if you follow the instructions below) is sufficient -to find them. +external-libs-src.zip. If you are building with MSVC 2010 or MINGW, +you can just extract the qpdf-external-libs-bin.zip zip file into the +top-level qpdf source tree. Note that you need the 2012-06-20 version +(at least) to build qpdf 3.0 or greater since this includes 64-bit +libraries. It will create a directory called external-libs which +contains header files and precompiled libraries. Passing +--enable-external-libs to ./configure (which is done automatically if +you follow the instructions below) is sufficient to find them. You can also obtain pcre and zlib directly on your own and install them. If you are using mingw, you can just set CPPFLAGS, LDFLAGS, and @@ -44,27 +73,42 @@ CPPFLAGS, LDFLAGS, LIBS in the generated autoconf.mk file. Note that you should use UNIX-like syntax (-I, -L, -l) even though this is not what cl takes on the command line. qpdf's build rules will fix it. +You can also download qpdf-external-libs-src.zip and follow the +instructions in the README.txt there for how to build external libs. + Building with MinGW =================== -QPDF is known to build and pass its test suite with MSYS-1.0.11 and -gcc 4.4.0 with C++ support. 
If you also have ActiveState Perl in your -path and the external-libs distribution described above, you can fully -configure, build, and test qpdf in this environment. You will most -likely not be able to build qpdf with mingw using cygwin. +QPDF is known to build and pass its test suite with mingw (latest +version tested: gcc 4.6.2), mingw64 (latest version tested: 4.7.0) and +Microsoft Visual C++ 2010, both 32-bit and 64-bit versions. MSYS plus +ActiveState Perl is required to build as well in order to get make +and other related tools. While it is possible that Cygwin could be +used to build native Windows versions of qpdf, this configuration has +not been tested recently. From your MSYS prompt, run - ./config-mingw + ./config-mingw32 + +or + + ./config-mingw64 and then make -Note that ./config-mingw just runs ./configure with specific -arguments, so you can look at it, make adjustments, and manually run -configure instead. +Note that ./config-mingw32 and ./config-mingw64 just run +./configure with specific arguments, so you can look at them, make +adjustments, and manually run configure instead. Note also that +config-mingw32 appends a definition of _FILE_OFFSET_BITS=64 to +qpdf-config.h since, as of the qpdf 3.0 release, the current versions +of the autoconf tools did not correctly detect that mingw requires +this to get large file support. This workaround is only required for +mingw32. The 64-bit version of mingw works "out of the box" with +large file support, as do both the 32-bit and 64-bit versions of MSVC. Add the absolute path to the libqpdf/build directory to your PATH. Make sure you can run the qpdf command by typing qpdf/build/qpdf and @@ -80,26 +124,42 @@ create install-mingw/qpdf-VERSION and populate it. The binary download of qpdf for Windows with mingw is created from this directory. +You can also take a look at make_windows_releases for reference. This +is how the distributed Windows executables are created. 
-Building with MSVC .NET 2008 Express -==================================== + +Building with MSVC 2010 +======================= These instructions would likely work with newer version of MSVC or with full version of MSVC. They may also work with .NET 2005. They -have only been tested with .NET 2008 Express (VC9.0). You may follow -these instructions from either Cygwin or from MSYS, though only MSYS -is regularly tested. +have only been tested with Visual C++ 2010. Earlier versions of qpdf +were built with MSVC 2008 Express. You should first set up your environment to be able to run MSVC from the command line. There is usually a batch file included with MSVC -that does this. From that cmd prompt, you can start your cygwin -shell. +that does this. Make sure that you start a command line environment +configured for whichever of 32-bit or 64-bit output you intend to +build. + +From that cmd prompt, you can start your msys shell by manually +running whatever command is associated with your msys shell icon. Configure as follows: - ./config-msvc + ./config-msvc 32 -and then +or + + ./config-msvc 64 + +Note that you must pass the 32/64 option that matches your command +line setup. The scripts do not presently figure this out. If you +used the wrong argument, it would probably just build the size you +have in your environment and then install the results in the wrong +place. + +Once configured, run make @@ -156,4 +216,5 @@ when the runtime is linked in statically, exceptions cannot be thrown across the DLL to EXE boundary. Since qpdf uses exception handling extensively for error handling, we have no choice but to redistribute the C++ runtime DLLs. Maybe this will be addressed in a future -version of the compilers. +version of the compilers. This has not been retested with the +toolchain versions used to create qpdf 3.0 distributions. 
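The README and README-windows.txt hunks above tie large file support to 64-bit integer types: the compiler must support "long long", and mingw32 needs _FILE_OFFSET_BITS=64. A minimal sketch of why the width matters is shown here. The helper name can_address_past_2gb is hypothetical and is not part of qpdf; it only checks whether a signed integer type has enough value bits to hold a byte offset of 2^31 or more.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of qpdf: returns true if the signed integer
// type T can represent a file offset at or beyond the 2 GB mark (2^31).
template <typename T>
bool can_address_past_2gb()
{
    // numeric_limits<T>::digits is the number of value bits (sign excluded).
    // A 32-bit signed type has 31 of them, so its maximum is 2^31 - 1.
    return std::numeric_limits<T>::digits > 31;
}
```

can_address_past_2gb<long long>() is true on any conforming C++11 compiler, while can_address_past_2gb<long>() is false wherever long is 32 bits wide, which includes both 32-bit and 64-bit Windows toolchains.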
diff --git a/README.maintainer b/README.maintainer index 9b89d88b..b844a2c6 100644 --- a/README.maintainer +++ b/README.maintainer @@ -94,6 +94,7 @@ Release Reminders * Remember to update the web page including putting new documentation in the "files" subdirectory of the website on sourceforge.net. + Linearize the PDF version of the manual when copying it there. * Create a tag in the version control system, and make backups of the actual releases. With git, use git tag -s to create a signed tag: diff --git a/TODO b/TODO index de99b5cd..3e27f0e5 100644 --- a/TODO +++ b/TODO @@ -1,89 +1,21 @@ -Next -==== - -*** ABI changes have been made. build.mk has been updated. - - * 64-bit windows build, remaining steps - - - new external-libs have been built and copied into - ~/Q/storage/releases/qpdf/external-libs. Release is done in - git. Just need to upload when ready. Remember to document that - this version is needed for > 2.3.1. - - - update README-windows.txt docs to indicate that MSVC 2010 is the - supported version and to update the information about mingw, - including the need for the _FILE_OFFSET_BITS workaround on the - 32-bit version. - - * Document that your compiler has to support long long. - - * Make sure that the release notes call attention to the one API - breaking change: removal of length from replaceStreamData. - - * Document thread safety: One individual QPDF or QPDFWriter object - can only be used by one thread at a time, but multiple threads can - simultaneously use separate objects. - - * Mention QPDFObjectHandle::parse in the documentation. - - * Manual: empty --empty as an input file name option - - * copyForeignObject, merge/split documentation: - - document details of --pages option in manual. Include nuances of - range parsing, such as backward ranges and "z". Discuss - implications of using --empty vs. using one of the source files as - the original file including Outlines (which basically work) and - page labels (which don't). 
Also mention trick of specifying two - different paths to the same file get duplication. - - Command line is - - --pages infile [ --password=pwd ] range ... -- - - The regular input referenced would be the one whose other data - would be preserved (like trailer, info, encryption, outlines, - etc.). It can be but doesn't have to be one of the files selected. - - Example: to grab pages 1-5 from file1 and 11-15 from file2 in - reverse: - - qpdf file1.pdf out.pdf --pages file1.pdf 1-5 file2.pdf 15-11 -- - - Use comments in qpdf.cc to guide internals documentation when - discussing implementation. Also see copyForeignObject as a source - for documentation. - - Document that makeIndirectObject doesn't handle foreign objects - automatically because copying a foreign object is a big enough deal - that it should be explicit. However addPages* does handle foreign - page objects automatically. - - * Document --copy-encryption and --encryption-file-password in - manual. Mention that the first half of /ID as well as all the - encryption parameters are copied. Maybe mention about StrF and - StrM with respect to AES here and also with encryption - preservation. - - -Soon -==== - - * See if I can support the new encryption formats mentioned in the - open bug on sourceforge. Check other sourceforge bugs. - - General ======= + * See if I can support the encryption format used with /R 5 /V 5, + even though a qpdf-announce subscriber with an adobe.com email + address mentioned that this is deprecated. There is also a new + encryption format coming in a future release, which may be better + to support. As of the qpdf 3.0 release, the specification was not + publicly available yet. + + * Consider the possibility of doing something locale-aware to support + non-ASCII passwords. Update documentation if this is done. + * Look for %PDF header somewhere within the first 1024 bytes of the file. Also accept headers of the form "%!PS−Adobe−N.n PDF−M.m". 
See Implementation notes 13 and 14 in appendix H of the PDF 1.7 specification. This is bug 3267974. - * Update qpdf docs about non-ascii passwords. See thread from - 2010-12-07,08 for details. - * Consider impact of article threads on page splitting/merging. Subramanyam provided a test file; see ../misc/article-threads.pdf. Email Q-Count: 431864 from 2009-11-03. Other things to consider: diff --git a/configure.ac b/configure.ac index 6423284d..8f1bb4d2 100644 --- a/configure.ac +++ b/configure.ac @@ -2,7 +2,7 @@ dnl Process this file with autoconf to produce a configure script. dnl This config.in requires autoconf 2.5 or greater. AC_PREREQ([2.68]) -AC_INIT([qpdf],[3.0.a0]) +AC_INIT([qpdf],[3.0.rc1]) AC_CONFIG_MACRO_DIR([m4]) AC_CONFIG_FILES([autoconf.mk]) diff --git a/libqpdf/QPDF.cc b/libqpdf/QPDF.cc index bee2f3ee..e08b3ecb 100644 --- a/libqpdf/QPDF.cc +++ b/libqpdf/QPDF.cc @@ -18,7 +18,7 @@ #include #include -std::string QPDF::qpdf_version = "3.0.a0"; +std::string QPDF::qpdf_version = "3.0.rc1"; static char const* EMPTY_PDF = "%PDF-1.3\n" diff --git a/manual/qpdf-manual.xml b/manual/qpdf-manual.xml index 6fa48759..bd102cdf 100644 --- a/manual/qpdf-manual.xml +++ b/manual/qpdf-manual.xml @@ -5,8 +5,8 @@ - - + + ]> @@ -26,6 +26,8 @@ QPDF is a program that does structural, content-preserving transformations on PDF files. QPDF's website is located at http://qpdf.sourceforge.net/. + QPDF's source code is hosted on github at https://github.com/qpdf/qpdf. QPDF has been released under the terms of - QPDF is not a PDF content creation library, a - PDF viewer, or a program capable of converting PDF into other - formats. In particular, QPDF knows nothing about the semantics of - PDF content streams. If you are looking for something that can do + With QPDF, it is possible to copy objects from one PDF file into + another and to manipulate the list of pages in a PDF file. This + makes it possible to merge and split PDF files. 
The QPDF library + also makes it possible for you to create PDF files from scratch. + In this mode, you are responsible for supplying all the contents of + the file, while the QPDF library takes care of all the syntactical + representation of the objects, creation of cross reference tables + and, if you use them, object streams, encryption, linearization, + and other syntactic details. You are still responsible for + generating PDF content on your own. + + + QPDF has been designed with very few external dependencies, and it + is intentionally very lightweight. QPDF is + not a PDF content creation library, a PDF + viewer, or a program capable of converting PDF into other formats. + In particular, QPDF knows nothing about the semantics of PDF + content streams. If you are looking for something that can do that, you should look elsewhere. However, once you have a valid PDF file, QPDF can be used to transform that file in ways perhaps - your original PDF creation can't handle. For example, programs - generate simple PDF files but can't password-protect them, + your original PDF creation can't handle. For example, many + programs generate simple PDF files but can't password-protect them, web-optimize them, or perform other transformations of that type. @@ -112,17 +128,34 @@ -u. + + + A C++ compiler that works well with STL and has the long + long type. Most modern C++ compilers should fit the + bill fine. QPDF is tested with gcc and Microsoft Visual C++. + + Part of qpdf's test suite does comparisons of the contents PDF - files by converting them images and comparing the images. You can - optionally disable this part of the test suite by running - configure with the - flag. If you leave - this enabled, the following additional requirements are required - by the test suite. Note that in no case are these items required - to use qpdf. + files by converting them to images and comparing the images. The + image comparison tests are disabled by default. 
Those tests are + not required for determining correctness of a qpdf build if you + have not modified the code since the test suite also contains + expected output files that are compared literally. The image + comparison tests provide an extra check to make sure that any + content transformations don't break the rendering of pages. + Transformations that affect the content streams themselves are off + by default and are only provided to help developers look into the + contents of PDF files. If you are making deep changes to the + library that cause changes in the contents of the files that qpdf + generates, then you should enable the image comparison tests. + Enable them by running configure with the + flag. If you enable + this, the following additional requirements are required by the + test suite. Note that in no case are these items required to use + qpdf. @@ -132,13 +165,12 @@ GhostScript version 8.60 or newer: http://pages.cs.wisc.edu/~ghost/ + url="http://www.ghostscript.com">http://www.ghostscript.com - This option is primarily intended for use by packagers of qpdf so - that they can avoid having the qpdf packages depend on tiff and - ghostscript software. + If you do not enable this, then you do not need to have tiff and + ghostscript. If Adobe Reader is installed as acroread, some @@ -158,7 +190,7 @@ To build the PDF version of the documentation, you need Apache fop (http://xml.apache.org/fop/) - version 0.94 of higher. + version 0.94 or higher. @@ -182,9 +214,9 @@ make Building on Windows is a little bit more complicated. For details, please see README-windows.txt in the source distribution. You can also download a binary distribution - for Windows. There is a port of qpdf in the - contrib area generously contributed by Jian - Ma. This is also discussed in more detail in + for Windows. There is a port of qpdf to Visual C++ version 6 in + the contrib area generously contributed by + Jian Ma. This is also discussed in more detail in README-windows.txt. 
@@ -215,7 +247,12 @@ make identical to the input file but may have been structurally reorganized. Also, orphaned objects will be removed from the file. Many transformations are available as controlled by the - options below. + options below. In place of , the + parameter may be specified. This causes + qpdf to use a dummy input file that contains zero pages. The only + normal use case for using would be if you + were going to add pages from another source, as discussed in . does not have to be seekable, even @@ -248,7 +285,35 @@ make - Causes generation of a linearized (web optimized) output file. + Causes generation of a linearized (web-optimized) output file. + + + + + + + + Encrypt the file using the same encryption parameters, + including user and owner password, as the specified file. Use + to specify a password + if one is needed to open this file. Note that copying the + encryption parameters from a file also copies the first half + of /ID from the file since this is part of + the encryption parameters. + + + + + + + + If the file specified with + requires a password, specify the password using this option. + Note that only one of the user or owner password is required. + Both passwords will be preserved since QPDF does not + distinguish between the two passwords. It is possible to + preserve encryption parameters, including the owner password, + from a file even if you don't know the file's owner password. @@ -271,6 +336,16 @@ make + + + + + Select specific pages from one or more input files. See for details on how to do page + selection (splitting and merging). + + + @@ -289,6 +364,25 @@ make restrictions or other restrictions placed on files by their producers. + + In all cases where qpdf allows specification of a password, care + must be taken if the password contains characters that fall + outside of the 7-bit US-ASCII character range to ensure that the + exact correct byte sequence is provided. 
It is possible that a + future version of qpdf may handle this more gracefully. For + example, if a file was encrypted using a password that was + encoded in ISO-8859-1 and your terminal is configured to use + UTF-8, the password you supply may not work properly. There are + various approaches to handling this. For example, if you are + using Linux and have the iconv executable (normally provided by + glibc) installed, you could pass to qpdf where + password is a password specified in + your terminal's locale. A detailed discussion of this is out of + scope for this manual, but just be aware of this issue if you have + trouble with a password that contains 8-bit characters. + Encryption Options @@ -474,6 +568,126 @@ make The default for each permission option is to be fully permissive. + + Page Selection Options + + Starting with qpdf 3.0, it is possible to split and merge PDF + files by selecting pages from one or more input files. Whatever + file is given as the primary input file is used as the starting + point, but its pages are replaced with pages as specified. + + + + Multiple input files may be specified. Each one is given as the + name of the input file, an optional password (if required to open + the file), and the range of pages. Note that + “” terminates parsing of page + selection flags. + + + For each file that pages should be taken from, specify the file, a + password needed to open the file (if needed), and a page range. + If the primary input file requires a password, that password + must be specified outside the option and + does not need to be repeated inside the . + The same file can be repeated multiple times. If a file that is + repeated has a password, the password only has to be given the + first time. All non-page data (info, outlines, page numbers, + etc.) are taken from the primary input file. To discard these, + use as the primary input. 
One subtlety + about specifying passwords is that specifying a password as + doesn't prevent you + from having to repeat that password if that file is also one of the + input files. If in doubt, it's never an error to specify the + password multiple times. + + + It is not presently possible to specify the same page from the + same file directly more than once, but you can make this work by + specifying two different paths to the same file (such as by + putting ./ somewhere in the path). This can + also be used if you want to repeat a page from one of the input + files in the output file. This may be made more convenient in a + future version of qpdf if there is enough demand for this feature. + + + The page range is a set of numbers separated by commas, ranges of + numbers separated by dashes, or combinations of those. The character + “z” represents the last page. Pages can appear in any + order. Ranges can appear with a high number followed by a low + number, which causes the pages to appear in reverse. Repeating a + number will cause an error, but you can use the workaround + discussed above should you really want to include the same page + twice. + + + Example page ranges: + + + + 1,3,5-9,15-12: pages 1, 3, 5, 6, 7, 8, + 9, 15, 14, 13, and 12. + + + + + z-1: all pages in the document in reverse + + + + + + Note that qpdf doesn't presently do anything special about other + constructs in a PDF file that may know about pages, so semantics + of splitting and merging vary across features. For example, the + document's outlines (bookmarks) point to actual page objects, so + if you select some pages and not others, bookmarks that point to + pages that are in the output file will work, and remaining + bookmarks will not work. On the other hand, page labels (page + numbers specified in the file) are just sequential, so page labels + will be messed up in the output file. A future version of + qpdf may do a better job at handling these + issues. 
(Note that the qpdf library already contains all of the + APIs required in order to implement this in your own application + if you need it.) In the meantime, you can always use + as the primary input file to avoid + copying all of that from the first file. For example, to take + pages 1 through 5 from infile.pdf while + preserving all metadata associated with that file, you could use + + qpdf + + If you wanted pages 1 through 5 from + infile.pdf but you wanted the rest of the + metadata to be dropped, you could instead run + + qpdf + + If you wanted to take pages 1–5 from + file1.pdf and pages 11–15 from + file2.pdf in reverse, you would run + + qpdf + + If, for some reason, you wanted to take the first page of an + encrypted file called encrypted.pdf with + password pass and repeat it twice in an output + file, and if you wanted to drop metadata (like page numbers and + outlines) but preserve encryption, you would use + + qpdf + + Note that we had to specify the password all three times because + giving a password as + doesn't count for page selection, and as far as qpdf is concerned, + encrypted.pdf and + ./encrypted.pdf are separate files. These + are all corner cases that most users should hopefully never have + to be bothered with. + + Advanced Transformation Options @@ -1053,6 +1267,14 @@ make your system understands how to read libtool .la files, this may not be necessary. + + The qpdf library is safe to use in a multithreaded program, but no + individual QPDF object instance (including + QPDF, QPDFObjectHandle, or + QPDFWriter) can be used in more than one thread at a + time. Multiple threads may simultaneously work with different + instances of these and all other QPDF objects. + Design and Library Notes @@ -1156,17 +1378,15 @@ make which objects are direct and which objects are indirect. - There is no public interface for creating instances of - QPDFObjectHandle. They can be created only inside the QPDF - library. 
This is generally done through a call to the private - method QPDF::readObject which uses - QPDFTokenizer to read an indirect object at - a given file position and return a - QPDFObjectHandle that encapsulates it. - There are also internal methods to create fabricated indirect - objects from existing direct objects or to change an indirect - object into a direct object, though these steps are not performed - except to support rewriting. + Instances of QPDFObjectHandle can be + directly created and modified using static factory methods in the + QPDFObjectHandle class. There are factory + methods for each type of object as well as a convenience method + QPDFObjectHandle::parse that creates an + object from a string representation of the object. Existing + instances of QPDFObjectHandle can also be + modified in several ways. See comments in + QPDFObjectHandle.hh for details. When the QPDF class creates a new object, @@ -1377,6 +1597,86 @@ make files. + + Adding and Removing Pages + + While qpdf's API has supported adding and modifying objects for + some time, version 3.0 introduces specific methods for adding and + removing pages. These are largely convenience routines that + handle two tricky issues: pushing inheritable resources from the + /Pages tree down to individual pages and + manipulation of the /Pages tree itself. For + details, see addPage and surrounding methods + in QPDF.hh. + + + + Reserving Object Numbers + + Version 3.0 of qpdf introduced the concept of reserved objects. + These are seldom needed for ordinary operations, but there are + cases in which you may want to add a series of indirect objects + with references to each other to a QPDF + object. This causes a problem because you can't determine the + object ID that a new indirect object will have until you add it to + the QPDF object with + QPDF::makeIndirectObject. 
The only way to + add two mutually referential objects to a + QPDF object prior to version 3.0 would be + to add the new objects first and then make them refer to each + other after adding them. Now it is possible to create a + reserved object using + QPDFObjectHandle::newReserved. This is an + indirect object that stays “unresolved” even if it is + queried for its type. So now, if you want to create a set of + mutually referential objects, you can create reservations for each + one of them and use those reservations to construct the + references. When finished, you can call + QPDF::replaceReserved to replace the reserved + objects with the real ones. This functionality will never be + needed by most applications, but it is used internally by QPDF + when copying objects from other PDF files, as discussed in . For an example of how to use + reserved objects, search for newReserved in + test_driver.cc in qpdf's sources. + + + + Copying Objects From Other PDF Files + + Version 3.0 of qpdf introduced the ability to copy objects into a + QPDF object from a different + QPDF object, which we refer to as + foreign objects. This allows arbitrary + merging of PDF files. The qpdf command-line + tool provides limited support for basic page selection, including + merging in pages from other files, but the library's API makes it + possible to implement arbitrarily complex merging operations. The + main method for copying foreign objects is + QPDF::copyForeignObject. This takes an + indirect object from another QPDF and + copies it recursively into this object while preserving all object + structure, including circular references. This means you can add + a direct object that you create from scratch to a + QPDF object with + QPDF::makeIndirectObject, and you can add an + indirect object from another file with + QPDF::copyForeignObject. The fact that + QPDF::makeIndirectObject does not + automatically detect a foreign object and copy it is an explicit + design decision. 
Copying a foreign object seems like a + sufficiently significant thing to do that it should be done + explicitly. + + + The other way to copy foreign objects is by passing a page from + one QPDF to another by calling + QPDF::addPage. In contrast to + QPDF::makeIndirectObject, this method + automatically distinguishes between indirect objects in the + current file, foreign objects, and direct objects. + + Writing PDF Files @@ -1892,8 +2192,8 @@ print "\n"; The specification recommends limiting the number of objects in object stream for efficiency in reading and decoding. Acrobat 6 - uses no more than objects per object stream for linearized files - and no more 200 objects per stream for non-linearized files. + uses no more than 100 objects per object stream for linearized + files and no more than 200 objects per stream for non-linearized files. QPDFWriter, in object stream generation mode, never puts more than 100 objects in an object stream. @@ -2085,6 +2385,119 @@ print "\n"; For a detailed list of changes, please see the file ChangeLog in the source distribution. + + + 3.0.rc1: July 29, 2012 + + + + + Acknowledgment: I would like to express gratitude for the + contributions of Tobias Hoffmann toward the release of qpdf + version 3.0. He is responsible for most of the implementation + and design of the new API for manipulating pages, and + contributed code and ideas for many of the improvements made + in version 3.0. Without his work, this release would + certainly not have happened as soon as it did, if at all. + + + + + Non-compatible API change: The version of + QPDFObjectHandle::replaceStreamData that + uses a StreamDataProvider no longer + requires (or accepts) a length parameter. + See for an explanation. + While care is taken to avoid non-compatible API changes in + general, an exception was made this time because the new + interface offers an opportunity to significantly simplify + calling code. + + + + + Support has been added for large files.
The test suite + verifies support for files larger than 4 gigabytes, and manual + testing has verified support for files larger than 10 + gigabytes. Large file support is available for both 32-bit + and 64-bit platforms as long as the compiler and underlying + platforms support it. + + + + + Support for page selection (splitting and merging PDF files) + has been added to the qpdf command-line + tool. See . + + + + + Options have been added to the qpdf + command-line tool for copying encryption parameters from + another file. See . + + + + + New methods have been added to the QPDF + object for adding and removing pages. See . + + + + + New methods have been added to the QPDF + object for copying objects from other PDF files. See + + + + + A new method QPDFObjectHandle::parse has + been added for constructing + QPDFObjectHandle objects from a string + description. + + + + + Methods have been added to QPDFWriter + to allow writing to an already open stdio FILE* in + addition to writing to standard output or a named file. + Methods have been added to QPDF to be + able to process a file from an already open stdio + FILE*. This makes it possible to read and write + PDF from secure temporary files that have been unlinked prior + to being fully read or written. + + + + + The QPDF::emptyPDF method can be used to allow + creation of PDF files from scratch. The example + examples/pdf-create.cc illustrates how it + can be used. + + + + + Several methods that take + PointerHolder<Buffer> can now + also accept std::string arguments. + + + + + Many new convenience methods have been added to the library, + most in QPDFObjectHandle. See + ChangeLog for a full list. + + + + + + 2.3.1: December 28, 2011 @@ -2728,4 +3141,47 @@ print "\n"; + + Upgrading to 3.0 + + For the most part, the API for qpdf version 3.0 is backward + compatible with versions 2.1 and later.
There are two exceptions: + + + + The method + QPDFObjectHandle::replaceStreamData that + uses a StreamDataProvider to provide the + stream data no longer takes a length + parameter. While it would have been easy enough to keep the + parameter for backward compatibility, in this case, the + parameter was removed since this provides the user an + opportunity to simplify the calling code. This method was + introduced in version 2.2. At the time, the + length parameter was required in order to + ensure that calls to the stream data provider returned the same + length for a specific stream every time they were invoked. In + particular, the linearization code depends on this. Instead, + qpdf 3.0 and newer check for that constraint explicitly. The + first time the stream data provider is called for a specific + stream, the actual length is saved, and subsequent calls are + required to return the same number of bytes. This means the + calling code no longer has to compute the length in advance, + which can be a significant simplification. If your code fails + to compile because of the extra argument and you don't want to + make other changes to your code, just omit the argument. + + + + + Many methods take long long instead of other + integer types. Most if not all existing code should compile + fine with this change since such parameters had always + previously been smaller types. This change was required to + support files larger than two gigabytes in size. + + + + + diff --git a/qpdf.spec b/qpdf.spec index a51b34b1..b0f6acd6 100644 --- a/qpdf.spec +++ b/qpdf.spec @@ -1,6 +1,6 @@ Summary: Command-line tools and library for transforming PDF files Name: qpdf -Version: 3.0.a0 +Version: 3.0.rc1 Release: 1%{?dist} License: Artistic Group: System Environment/Libraries diff --git a/qpdf/qpdf.cc b/qpdf/qpdf.cc index 61f7a6bd..56c348ce 100644 --- a/qpdf/qpdf.cc +++ b/qpdf/qpdf.cc @@ -160,11 +160,6 @@ repeated multiple times. 
All non-page data (info, outlines, page numbers,\n\ etc. are taken from the primary input file. To discard this, use --empty\n\ as the primary input.\n\ \n\ -It is not presently possible to specify the same page from the same\n\ -file directly more than once, but you can make this work by specifying\n\ -two different paths to the same file (such as by putting ./ somewhere\n\ -in the path).\n\ -\n\ The page range is a set of numbers separated by commas, ranges of\n\ numbers separated dashes, or combinations of those. The character\n\ \"z\" represents the last page. Pages can appear in any order. Ranges\n\