diff --git a/ChangeLog b/ChangeLog index 62177431..e4844784 100644 --- a/ChangeLog +++ b/ChangeLog @@ -8,6 +8,12 @@ recovery when objects are copied from other files and when "immediate copy from" is enabled. + * When copying foreign streams with immediateCopyFrom set, the + same type of recovery from streams with filtering errors is + performed as when dealing with streams in the original input. This + could happen, for example, if you are using the --pages option to + take pages from another file and that file has errors in it. + * Add a new version of QPDFObjectHandle::pipeStreamData whose return value indicates overall success or failure rather than whether nor not filtering was attempted. It should have always @@ -36,6 +42,12 @@ --preserve-unreferenced-resources is now a synonym for --remove-unreferenced-resources=no. + * Use std::atomic for unique ID generation internally within the + library. This eliminates the already extremely low chance of a + collision, improves thread safety, and removes a dependency on a + random number generator. Thanks to Dean Scarff for the + contribution. + 2020-04-03 Jay Berkenbilt * Allow qpdf to be built on systems without wchar_t. All "normal" @@ -50,6 +62,10 @@ maximally fill the destination rectangle. Prior to this change, placeFormXObject might shrink it but would never expand it. + * When calling the C API, accept any non-zero value as TRUE rather + than just 1. This appears to resolve issues on Windows when + calling some versions of the DLL directly from other languages. + 2020-04-02 Jay Berkenbilt * Add method QPDFObjectHandle::unsafeShallowCopy for copying only diff --git a/manual/qpdf-manual.xml b/manual/qpdf-manual.xml index c875baa2..3dc4e2f6 100644 --- a/manual/qpdf-manual.xml +++ b/manual/qpdf-manual.xml @@ -1944,21 +1944,51 @@ outfile.pdf + + + + + The option may be + auto, yes, or + no. The default is auto. 
+ + + Starting with qpdf 8.1, when splitting pages, qpdf is able to + attempt to remove images and fonts that are not used by a page + even if they are referenced in the page's resources + dictionary. When shared resources are in use, this behavior + can greatly reduce the file sizes of split pages, but the + analysis is very slow. In versions from 8.1 through 9.1.1, + qpdf did this analysis by default. Starting in qpdf 10.0.0, if + auto is used, qpdf does a quick analysis of + the file to determine whether the file is likely to have + unreferenced objects on pages, a pattern that frequently + occurs when resource dictionaries are shared across multiple + pages and rarely occurs otherwise. If it discovers this + pattern, then it will attempt to remove unreferenced + resources. Usually this means you get the slower splitting + speed only when it's actually going to create smaller files. + You can suppress removal of unreferenced resources altogether + by specifying no or force it to do the full + algorithm by specifying yes. + + + Other than cases in which you don't care about file size and + care a lot about runtime, there are few reasons to use this + option, especially now that auto mode is + supported. One reason to use this is if you suspect that qpdf + is removing resources it shouldn't be removing. If you + encounter that case, please report it as a bug at https://github.com/qpdf/qpdf/issues/. + + + - Starting with qpdf 8.1, when splitting pages, qpdf ordinarily - attempts to remove images and fonts that are not used by a - page even if they are referenced in the page's resources - dictionary. This option suppresses that behavior. There are - few reasons to use this option. One reason to use this is if - you suspect that qpdf is removing resources it shouldn't be - removing. If you encounter that case, please report it as a - bug. 
Another reason is that the new behavior can be much - slower for files that include a very large number of images or - other XObjects on a page. In that case, using this option will - return qpdf to the old behavior and speed. + This is a synonym for + --remove-unreferenced-resources=no. See also , which does @@ -4700,6 +4730,239 @@ print "\n"; ChangeLog in the source distribution. + + + 10.0.0: April 6, 2020 + + + + + Performance Enhancements + + + + + The qpdf library and executable should run much faster in + this version than in the last several releases. Several + internal library optimizations have been made, and there has + been improved behavior on page splitting as well. This + version of qpdf should outperform any of the 8.x or 9.x + versions. + + + + + + + CLI Enhancements + + + + + Add objectinfo key to the JSON output. + This will be a place to put computed metadata or other + information about PDF objects that are not immediately + evident in other ways or that seem useful for some other + reason. In this version, information is provided about each + object indicating whether it is a stream and, if so, what + its length and filters are. Without this, it was not + possible to tell conclusively from the JSON output alone + whether or not an object was a stream. Run qpdf + --json-help for details. + + + + + Add new option + --remove-unreferenced-resources which takes + auto, yes, or + no as arguments. The new + auto mode, which is the default, performs + a fast heuristic over a PDF file when splitting pages to + determine whether the expensive process of finding and + removing unreferenced resources is likely to be of benefit. + For most files, this new default will result in a + significant performance improvement for splitting pages. See + for a more + detailed discussion. + + + + + The --preserve-unreferenced-resources is + now just a synonym for + --remove-unreferenced-resources=no. 
+ + + + + If the QPDF_EXECUTABLE environment + variable is set when invoking qpdf + --bash-completion or qpdf + --zsh-completion, the completion command that it + outputs will refer to qpdf using the value of that variable + rather than what qpdf determines its + executable path to be. This can be useful when wrapping + qpdf with a script, working with a + version in the source tree, using an AppImage, or other + situations where there is some indirection. + + + + + + + Library Enhancements + + + + + Add a new version of + QPDFObjectHandle::StreamDataProvider::provideStreamData + that accepts the suppress_warnings and + will_retry options and allows a success + code to be returned. This makes it possible to implement a + StreamDataProvider that calls + pipeStreamData on another stream and to + pass the response back to the caller, which enables better + error handling on those proxied streams. + + + + + Update QPDFObjectHandle::pipeStreamData + to return an overall success code that goes beyond whether + or not filtered data was written successfully. This allows + better error handling of cases that were not filtering + errors. You have to call this explicitly. Methods in + previously existing APIs have the same semantics as before. + + + + + The + QPDFPageObjectHelper::placeFormXObject + method now allows separate control over whether it should be + willing to shrink or expand objects to fit them better into + the destination rectangle. The previous behavior was that + shrinking was allowed but expansion was not. The previous + behavior is still the default. + + + + + When calling the C API, any non-zero value passed to a + boolean parameter is treated as TRUE. + Previously only the value 1 was accepted. + This makes the C API behave more like most C interfaces and + is known to improve compatibility with some Windows + environments that dynamically load the DLL and call + functions from it. 
+ + + + + Add QPDFObjectHandle::unsafeShallowCopy + for copying only top-level dictionary keys or array items. + This is unsafe because it creates a situation in which + changing a lower-level item in one object may also change it + in another object, but for cases in which you + know you are only inserting or + replacing top-level items, it is much faster than + QPDFObjectHandle::shallowCopy. + + + + + Add QPDFObjectHandle::filterAsContents, + which filters a stream's data as a content stream. This is + useful for parsing the contents for form XObjects in the + same way as parsing page content streams. + + + + + + + Bug Fixes + + + + + When detecting and removing unreferenced resources during + page splitting, traverse into form XObjects and handle their + resources dictionaries as well. + + + + + The same error recovery is applied to streams in files other than + the primary input file when merging or splitting pages. + + + + + + + Build Changes + + + + + Allow qpdf to be built on stripped-down systems whose C/C++ + libraries lack the wchar_t type. + Search for wchar_t in qpdf's + README.md for details. This should be very rare, but it is + known to be helpful in some embedded environments. + + + + + + + 9.1.1: January 26, 2020 @@ -4804,8 +5067,6 @@ print "\n"; - - 9.1.0: November 17, 2019 @@ -4905,8 +5166,6 @@ print "\n"; - - 9.0.2: October 12, 2019 @@ -5272,7 +5531,7 @@ print "\n"; in dynamically linked code catching exceptions or subclassing, this could be the reason. If you see this, please report a bug at pikepdf. + url="https://github.com/qpdf/qpdf/issues/">https://github.com/qpdf/qpdf/issues/. 
diff --git a/qpdf/qpdf.cc b/qpdf/qpdf.cc index 190491bd..56925af0 100644 --- a/qpdf/qpdf.cc +++ b/qpdf/qpdf.cc @@ -1483,10 +1483,10 @@ ArgParser::argHelp() << "--normalize-content=[yn] enables or disables normalization of content streams\n" << "--object-streams=mode controls handing of object streams\n" << "--preserve-unreferenced preserve unreferenced objects\n" - << "--preserve-unreferenced-resources\n" - << " synonym for --remove-unreferenced-resources=no\n" << "--remove-unreferenced-resources={auto,yes,no}\n" << " whether to remove unreferenced page resources\n" + << "--preserve-unreferenced-resources\n" + << " synonym for --remove-unreferenced-resources=no\n" << "--newline-before-endstream always put a newline before endstream\n" << "--coalesce-contents force all pages' content to be a single stream\n" << "--flatten-annotations=option\n"
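
Note on the ChangeLog entry about using std::atomic for unique ID generation: the general idea can be sketched as below. This is only an illustration under stated assumptions, not qpdf's actual code. The function name `next_unique_id` is hypothetical, and the sketch assumes a single process-wide 64-bit counter is sufficient.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not qpdf's real implementation): hands out
// process-wide unique IDs without locks or a random number generator.
std::uint64_t next_unique_id()
{
    // Function-local static: initialized once, thread-safely, since C++11.
    static std::atomic<std::uint64_t> counter{0};
    // fetch_add returns the previous value, so the first ID handed out is 1.
    return counter.fetch_add(1) + 1;
}
```

Because each call performs a single atomic increment, two threads cannot receive the same value. A counter of this kind therefore avoids the small collision risk of a random-number-based scheme, which matches the motivation given in the ChangeLog entry.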