Commit Graph

969 Commits

Author SHA1 Message Date
Jay Berkenbilt 15272662f6 Fix typo in json output key name
moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 1bc8abfdd3 Implement JSON v2 for Stream
Not fully exercised in this commit
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 3246923cf2 Implement JSON v2 for String
Also refine the herustic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 16f4f94cd9 Prepare code for JSON v2
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt a9fbbd5dca Objectinfo json: write incrementally and in numeric order
This script was used on test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt 948de60990 Objects json: write incrementally and in numeric order
The following script was used to adjust test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt dc9b7287cd Top-level json: write incrementally
This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
print(json_dumps(newdata))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt 7f65a5c21f Test json against schema only on demand
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
2022-05-07 08:26:31 -04:00
Jay Berkenbilt a3c9980395 Add next to Pl_String and fix comments 2022-05-07 08:26:31 -04:00
Jay Berkenbilt e5f3910c3e Add new FileInputSource constructors 2022-05-04 12:07:11 -04:00
Jay Berkenbilt 8b25de24c9 Make "objects" and "pages" consistent in JSON output 2022-05-04 08:32:44 -04:00
Jay Berkenbilt f4206a0938 Add new Pl_String Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt 16139d97c8 Add new Pl_OStream Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt 21d6e3231f Make use of the new Pipeline methods in some places 2022-05-03 18:31:23 -04:00
Jay Berkenbilt 59f3e09edf Make Pipeline::write take an unsigned char const* (API change) 2022-05-03 18:31:22 -04:00
Jay Berkenbilt 62bf296a9c Make assert handling less error-prone
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt 8ccd3a8a89 Mark weak encryption with API changes (fixes #576) 2022-04-30 17:24:15 -04:00
Jay Berkenbilt cff26040d8 Using insecure crytpo from the CLI is now an error by default 2022-04-30 17:23:58 -04:00
Jay Berkenbilt 4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoid this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more effecient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt 7f023701dd Formatting: remove space in range-style for loops
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt 2878c186bf Use fluent appendItem 2022-04-30 10:54:16 -04:00
Jay Berkenbilt ab9d557cb0 Use fluent replaceKey 2022-04-29 20:39:54 -04:00
Jay Berkenbilt e80fad86e9 Add new QPDFObjectHandle methods for more fluent programming 2022-04-29 20:09:10 -04:00
Jay Berkenbilt d0b7cc8ac6 QPDFJob json: make removeAttachment take an array (fixes #693) 2022-04-24 13:06:19 -04:00
Jay Berkenbilt 08ba21cf49 Fix some bugs around null values in dictionaries
Make it so that a key with a null value is always treated as not being
present. This was inconsistent before.
2022-04-24 10:08:32 -04:00
Jay Berkenbilt 4be2f36049 Deprecate replaceOrRemoveKey -- it's the same as replaceKey 2022-04-24 09:31:32 -04:00
Jay Berkenbilt b8d0b0b638 Re-add accidentally removed qpdf.testcov 2022-04-24 09:18:04 -04:00
Jay Berkenbilt 68e721981a Add new QPDF::warn that takes most of QPDFExc's arguments 2022-04-23 18:25:43 -04:00
Jay Berkenbilt 22b35c4928 Expose QUtil::get_next_utf8_codepoint 2022-04-23 18:25:43 -04:00
Jay Berkenbilt ce5c3bcad8 QPDFJob: pass capture output streams through to underlying QPDF 2022-04-18 11:24:17 -04:00
Jay Berkenbilt 80ed3076a0 Remove deprecated name/number tree constructors
Remove the name/number tree object helper constructors that don't take
a QPDF&.
2022-04-16 13:13:15 -04:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt ce86307a1a Fix typo in error message 2022-04-10 16:54:23 -04:00
Jay Berkenbilt 5f4675bb24 Mark non-ABI symbols in exported class with QPDF_DLL_PRIVATE 2022-04-10 16:54:23 -04:00
Jay Berkenbilt 5525c93124 Use QPDF_DLL_CLASS with Pipeline and InputSource subclasses
This enables RTTI so we can use dynamic_cast on them across the shared
object boundary.
2022-04-10 16:52:57 -04:00
Jay Berkenbilt 07edf96440 Remove methods of private classes from ABI
Prior to the cmake conversion, several private classes had methods
that were exported into the shared library so they could be tested
with libtests. With cmake, we build libtests using an object library,
so this is no longer necessary. The methods that are disappearing from
the ABI were never exposed through public headers, so no code should
be using them. Removal had to wait until the window for ABI-breaking
changes was open.
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 128e41648f Remove PointerHolder.hh from other than public header files
Increase to POINTERHOLDER_TRANSITION=4
2022-04-09 17:33:29 -04:00
Jay Berkenbilt ba0ef7a124 Replace PointerHolder with std::shared_ptr in the rest of the code
Increase to POINTERHOLDER_TRANSITION=3

patrepl s/PointerHolder/std::shared_ptr/g **/*.cc **/*.hh
patrepl s/make_pointer_holder/std::make_shared/g **/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g **/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
git restore libtests/pointer_holder.cc
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt 820a3f04fd Remove "lt-" workarounds
The executables that libtool built invoked the underlying binary with
an "lt-" prefix. The code contained numerous workarounds for testing,
which can now be removed.
2022-03-18 19:53:18 -04:00
Jay Berkenbilt acdf5b2e7a Update process for ABI testing 2022-03-18 19:53:18 -04:00
Jay Berkenbilt f58d2a60d5 Update build-related documentation and comments 2022-03-18 19:53:18 -04:00
Jay Berkenbilt 70d0d0889b Remove old build files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt b8aff90997 Add cmake configuration files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt 6941923ca9 Improve large file test output 2022-03-18 19:53:18 -04:00
Jay Berkenbilt c0a231bf32 Reverse sense of compare images toggle for qpdf.test
Run compare images tests when QPDF_TEST_COMPARE_IMAGES is set rather
than when QPDF_SKIP_TEST_COMPARE_IMAGES is not set.
2022-03-18 19:53:18 -04:00
Jay Berkenbilt 17c0e38c8e Force assert to be defined in test code 2022-03-07 10:07:27 -05:00
Jay Berkenbilt 1d86b70eab Tweak include of config for ctest 2022-03-01 14:01:34 -05:00
Jay Berkenbilt 99393e6ab7 Shorten coverage case name
This is so it will fit on one line after a qtest upgrade allows us to
split lines.
2022-02-26 10:18:23 -05:00
Jay Berkenbilt f7ac591590 Recognize explicit UTF-8 strings (fixes #654) 2022-02-22 08:10:05 -05:00
Jay Berkenbilt 31b45b0fd4 Fix logic error with Tf when generating appearances (fixes #655) 2022-02-18 13:46:35 -05:00
Jay Berkenbilt 3e2109ab37 Remove special case for 0xad for 10.6.2. 2022-02-16 06:52:05 -05:00
Jay Berkenbilt e810fe678a Fix asymmetry between newUnicodeString and getUTF8Value 2022-02-15 19:22:35 -05:00
Jay Berkenbilt a478cbb6dc Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649)
The PDF spec only allows UTF-16BE, but most readers seem to accept
UTF-16LE as well, so now qpdf does too.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt fbd3e56da7 Ignore -- at the top level arg parser (fixes #652)
This was unintended behavior that was added back for backward
compatibility. It is intentionally undocumented.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt 19608ec151 Add missing spaces in usageExit 2022-02-15 16:11:33 -05:00
Jay Berkenbilt 956a272d62 Remove abs calls and pick correct floating point epsilon values (fixes #641) 2022-02-11 07:18:33 -05:00
Jay Berkenbilt d501e1c0d4 Only update output version from files used as input
If we're opening a PDF file to copy its encryption information or
attachments, its version doesn't need to influence the output version.
2022-02-08 13:49:22 -05:00
Jay Berkenbilt f91b21c7d4 Preserve input PDF version on pages/split-pages (fixes #610) 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cfd5147d92 Add QPDF::getVersionAsPDFVersion 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt c62e8e2b28 Update for clean compile with POINTERHOLDER_TRANSITION=2 2022-02-07 17:38:22 -05:00
m-holger 5901fcad4c C-API expose QPDFObjectHandle::getKeyIfDict 2022-02-06 11:21:15 -05:00
m-holger 8371060340 Add method QPDFObjectHandle::getKeyIfDict 2022-02-06 11:21:15 -05:00
m-holger 2ed5f49a79 C-API expose QPDFObjectHandle::getValueAs... accessors 2022-02-05 19:40:30 -05:00
Jay Berkenbilt 7fb22740e1 Add operator ""_qpdf for creating QPDFObjectHandle literals 2022-02-05 11:29:25 -05:00
Jay Berkenbilt b48a0ff0e8 Add qpdf_empty_pdf to C API 2022-02-05 11:29:25 -05:00
m-holger e58b1174c7 Add new QPDFObjectHandle::getValueAs... accessors 2022-02-05 11:24:35 -05:00
Jay Berkenbilt 9044a24097 PointerHolder: deprecate getPointer() and getRefcount()
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.

This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
m-holger 95e7d36b7a C-API add two binary UTF8 funtions
add qpdf_oh_new_binary_unicode_string and qpdf_oh_get_binary_utf8_value
2022-02-04 13:10:51 -05:00
m-holger 1925ffd467 Fix --check-linearization of non-linearized files (fixes #615) 2022-02-04 06:52:38 -05:00
Jay Berkenbilt 42bff9f458 QPDFJob: let initializeFromArgv just take argv, not argc
Let argv be a null-terminated array. There is already code that
assumes this, and it makes it easier to construct the arguments.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt b02d37bc0a Make QPDFArgParser accept const argv
This makes it much more convention to use the initializeFromArgv
functions since you can use string literals.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt bc4e2320e7 Add qpdfjob-c.h -- simple C API around parts of QPDFJob 2022-02-01 09:04:55 -05:00
Jay Berkenbilt 03e67a28fe Move QTC::TC for qpdf to QPDFJob
All the coverage cases that used to be in qpdf.cc are now in
QPDFJob*.cc. It doesn't really matter, but better to follow the
convention of starting with the class that includes the coverage call.
2022-02-01 09:04:55 -05:00
Jay Berkenbilt b42f3e1d15 Move more code from qpdf.cc into QPDFJob 2022-02-01 09:04:55 -05:00
Jay Berkenbilt 21b9290785 QPDFJob json: make bare arguments expect the empty string
Changing from bool requiring true to string requiring the empty string
is more consistent with the CLI and makes it possible to add an
optional parameter or choices later without breaking compatibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt ea96330bb6 QPDFJob json: flatten json structure
Flatten everything to make it easier to map command-line flags to
json. The old structure was an illusion anyway because there was no
mechanism to enforce that things were in the right place. This also
helps with future flexibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt 47f33cec25 QPDFJob: add test cases 2022-01-31 15:57:45 -05:00
Jay Berkenbilt e3506253f1 Add optional version to --json 2022-01-31 15:57:45 -05:00
Jay Berkenbilt caa00556cf Change filename or path to file in json and QPDFJob
Use "file" consistently for specifying a file path. We use "filename"
when adding attachments for a completely different purpose.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt 7097f29019 More editorial changes from m-holger + spell check 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0e909bab8e Improve top-level help information 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0364024781 Use QPDFUsage exception for cli, json, and QPDFJob errors 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 95d127641c QPDFJob: move more top-level trivial handlers into config 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 9373881cca Add QPDFJob::ConfigError exception 2022-01-30 13:11:03 -05:00
Jay Berkenbilt b9cd693a5b QPDFJob: allocate QPDFArgParser on stack
The previous commits have removed all references to memory from
QPDFArgParser from QPDFJob. This commit removes the constraint that
QPDFArgParser remain in scope. This is a prerequisite to allowing JSON
as an alternative way to initialize QPDFJob and to initialize it
directly using a public API.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt b9af421ef7 Add missing \f support for JSON string encoder 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 5c5e5ca29b Document how to add a command-line argument 2022-01-30 13:11:03 -05:00
Jay Berkenbilt a301cc5373 Minor code cleanup 2022-01-30 13:11:03 -05:00
Jay Berkenbilt bd89aac360 QPDFJob increment: move arg parsing into QPDFJob
Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with
millions of public member variables, but now qpdf.cc is minimal and
just calls stable library functions.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 23b64f8357 Remove qpdf.cc version check
Remove comparison of qpdf CLI version with library. With almost all
the functionality moving into the library, this check is no longer
meaningful.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 1ddf5b4b4b QPDFJob increment: get rid of exit, handle verbose
Remove all calls to exit() from QPDFJob. Handle code that runs in
verbose mode to enable it to make use of output streams and message
prefix (whoami) from QPDFJob. This removes temporarily duplicated exit
code logic and most access to whoami/std::cout outside of QPDFJob
proper.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0910e767ad QPDFJob increment: basic QPDFJob structure
Move most of the methods called from qpdf.cc after argument parsing
into QPDFJob. In this increment, enough QPDFJob API has been added to
handle the branch of QPDFJob::run() that creates output with an
appropriate division between qpdf.cc and QPDFJob.

There are temporary bits of code to enable everything to compile and
pass the test suite, including some duplication and hard-coded values.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 8c718b7e6f Prefix program name before exception message in qpdf CLI 2022-01-30 13:11:02 -05:00
Jay Berkenbilt c60b4ea55a Refactor arg parsing in qpdf.cc to use QPDFArgParser 2022-01-30 13:11:02 -05:00
Jay Berkenbilt 52817f0a45 Implement QPDFArgParser based on ArgParser from qpdf.cc 2022-01-30 13:11:02 -05:00
m-holger 8eca9d8fd9 Fix QPDFObjectHandle::isOrHasName
Ensure isOrHasName returns true if object is an array and the name is
present anywhere in the array.
2022-01-27 09:35:39 -06:00
m-holger 710d2e54f0 Allow testing for subtype without specifying type in isDictionaryOfType etc
Accept empty string as type parameter in
QPDFObjectHandle::isDictionaryOfType and isStreamOfType
to allow for dictionaries with optional type.
2022-01-27 07:31:12 -06:00
m-holger 1b1b471ca9 Make a few whitespace fixes from last commit
Commit by ejb@ql.org using m-holger as author so git annotate gives
proper credit for changes.
2022-01-22 09:14:53 -05:00
m-holger 8593b9fdf7 Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc
Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType
2022-01-22 08:10:28 -06:00
Jay Berkenbilt 370710657a Add missing characters from PDF doc encoding (fixes #606) 2022-01-11 15:55:19 -05:00
Jay Berkenbilt 0f1ffa1215 Move bash/zsh completion helpers to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt 4782b5904f Move filter-completion.pl to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt af91b5b584 Add QUtil::file_can_be_opened 2021-12-29 13:41:02 -05:00
Jay Berkenbilt ac0060ac38 Refactor arg parsing to allow help option with parameter 2021-12-29 13:35:05 -05:00
Jay Berkenbilt 04745320d6 Prepare 10.5.0 release 2021-12-20 14:51:46 -05:00
Jay Berkenbilt d866f48081 Change names of qpdf_object_type_e enumerations
They have to be ot_* rather than qpdf_ot_* for compatibility.

* Different enumerated types are not assignment-compatible in C++, at
  least with strict compiler settings
* While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to
  make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will
  only work if the enumerated type names are the same.
2021-12-20 14:51:45 -05:00
Jay Berkenbilt cf7b2b5700 test_driver: split runtest into separate functions
Too bad about git annotate but it was pretty crazy to have all those
test cases together like that.
2021-12-20 12:40:03 -05:00
Jay Berkenbilt ea73bf72e0 Further improvements to handling binary strings 2021-12-19 14:30:45 -05:00
Jay Berkenbilt d3501c4f3e Fix LGTM alerts 2021-12-18 16:25:53 -05:00
Jay Berkenbilt ddbe59179e C API: simplify new error handling and improve documentation 2021-12-17 15:59:47 -05:00
m-holger f6293bd94c C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597) 2021-12-17 14:24:43 -05:00
Jay Berkenbilt feafcc4e88 C API: add several stream functions (fixes #596) 2021-12-17 13:28:11 -05:00
Jay Berkenbilt 4024953682 Output C test n done at the end of each qpdf-ctest 2021-12-16 15:40:56 -05:00
Jay Berkenbilt 9bb6f570ec C API: add functions for working with pages (fixes #594) 2021-12-16 15:07:48 -05:00
Jay Berkenbilt f072be032f qpdf-ctest: outfile2 -> xarg 2021-12-16 11:51:16 -05:00
Jay Berkenbilt 08bcf6449c Clarify docs around @filename and leading/trailing space 2021-12-10 15:52:28 -05:00
Jay Berkenbilt af2a71aa2c Handle bitstream overflow errors more gracefully (fixes #581)
* Make it a runtime error, not a logic error
* Include additional information
* Capture it properly in checkLinearization
2021-12-10 15:37:35 -05:00
Jay Berkenbilt 1c62c2a342 C API: expose functions for indirect objects (fixes #588) 2021-12-10 14:57:35 -05:00
Jay Berkenbilt 8e0b153332 Expose QPDFObjectHandle::addTokenFilter (fixes #580) 2021-12-10 13:37:07 -05:00
Jay Berkenbilt 72c10d8617 C API: overhaul error handling
* Handle error conditions that occur when using the object handle
  interfaces. In the past, some exceptions were not correctly
  converted to errors or warnings.
* Add more detailed information to qpdf-c.h
* Make it possible to work more explicitly with uninitialized objects
2021-12-10 12:16:02 -05:00
Jay Berkenbilt 3340dbe976 Use a specific error code for type warnings and clarify docs 2021-12-10 11:15:49 -05:00
Jay Berkenbilt b2b2a175c4 Add missing unit test for register progress reporter in C API
It was exercised in the pdf-linearize example but not in qpdf-ctest.
2021-12-10 09:11:56 -05:00
Jay Berkenbilt 09f3737202 Split qpdf-ctest test 24 into multiple tests
Thanks for the nudge from m-holger!
2021-12-09 15:21:19 -05:00
Jay Berkenbilt e3cc171d02 C API: qpdf_oh_is_initialized 2021-12-09 10:33:31 -05:00
Jay Berkenbilt bef2c2222a C API: qpdf_get_last_string_length 2021-12-09 10:33:31 -05:00
m-holger 0c705a882b Minor documentation updates 2021-12-09 10:24:14 -05:00
m-holger b4fc9eb700 C-API expose new_object as qpdf_oh_new_object 2021-12-02 13:59:58 -05:00
Jay Berkenbilt 720ce9e8f3 Improve testing and error handling around operating before processing 2021-11-29 07:42:36 -05:00
Jay Berkenbilt b97a43e091 Add additional testing around improved array wrapping 2021-11-19 13:33:10 -05:00
m-holger 4630b8567c Ensure qpdf_oh handles returned by C-API functions are unique.
Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array.
Update some doc comments in qpdf-c.h.
2021-11-19 13:31:59 +00:00
Jay Berkenbilt ce7db05d22 Prepare 10.4.0 release 2021-11-16 15:44:09 -05:00
Jay Berkenbilt 750aca5b94 First increment of improving handling of weak crypto (fixes #358) 2021-11-11 12:24:15 -05:00
Jay Berkenbilt f45dacf4cb Make recovery logic flexible about where objects end (fixes #573)
Don't assume endobj is at the beginning of the line. This means we are
looking at tokens for every line, but the odds of n n obj appearing in
the middle of the object are likely much lower than endobj not being
at the beginning of the line or missing entirely. This will probably
have a negative impact on recovery time for very large files.
Hopefully it will be worth it.
2021-11-07 15:27:22 -05:00
Jay Berkenbilt 4a648b9a00 Fix bug in merging resources /DR from foreign AcroForm (fixes #548)
When making resources indirect in from_dr, the code was using the
wrong owning QPDF, forgetting that from_dr had already been copied
using CopyForeignObject.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt 9b28933647 Check object ownership when adding
When adding a QPDFObjectHandle to an array or dictionary, if possible,
check if the new object belongs to the same QPDF. This makes it much
easier to find incorrect code than waiting for the situation to be
detected when the file is written.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt 73752683c9 Fix overlay/underlay on page with no resources (fixes #527) 2021-11-03 16:00:05 -04:00
Jay Berkenbilt 33a47d5c3c Make QPDF::findPage public (fixes #516)
This was originally not public because I wanted to get rid fo the
pages cache, but I recently realized there were deep reasons not to do
that, and the author of pikepdf wanted this, so I decided to make it
public.
2021-11-03 09:43:17 -04:00
Jay Berkenbilt 532a4f3d60 Detect recoverable but invalid zlib data streams (fixes #562) 2021-11-03 09:43:17 -04:00
Jay Berkenbilt 7ed991343b Better diagnostics when --pages is not closed (fixes #555) 2021-11-02 16:22:37 -04:00
Fredrik Fornwall e0775238b8 Fix QPDFEFStreamObjectHelper::{get,set}Subtype
The /Subtype entry that specifies the mime type of an embedded file is
inside the embedded file stream dictionary directly, not it in the
parameter dictionary.

See Table 45 and 46 in the PDF 1.7 specification:
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112
2021-09-10 10:02:24 -04:00
Jay Berkenbilt df38fe8e48 Fix string bounds checking in completion code (fixes #441) 2021-05-13 13:06:58 -04:00
Jay Berkenbilt bddebdb0ea Prepare 10.3.2 release 2021-05-08 10:41:14 -04:00
Jay Berkenbilt 30ac51bc78 Exclude unreferenced objects in object streams (fixes #520) 2021-05-08 09:42:09 -04:00
Jay Berkenbilt 8971443e46 QPDF::addPage*: handle duplicate pages more robustly 2021-04-05 10:58:10 -04:00
Jay Berkenbilt ec48820c3c Fix loop detection in NNTree 2021-04-05 07:59:02 -04:00
Jay Berkenbilt 3f05429cc5 Prepare 10.3.1 release 2021-03-11 12:59:41 -05:00
Jay Berkenbilt 972e08af58 Protect against future bugs in fixCopiedAnnotations
I don't want additional, undiscovered bugs to fully block page
splitting/merging operations.
2021-03-11 12:49:27 -05:00
Jay Berkenbilt 85884c363c Allow /DR to be direct in /AcroForm
Also handle direct annotation, though this is much less likely.
2021-03-11 11:43:38 -05:00
Jay Berkenbilt dc65b88457 Prepare 10.3.0 release 2021-03-05 06:15:48 -05:00
Jay Berkenbilt addc0672d1 Tweak form copying to avoid gratuitous field renames
When copying a page from the original file to the output in --pages,
don't alter the fields or annotations for the first copy of each page.
2021-03-05 05:31:15 -05:00
Jay Berkenbilt cb6e53136f QPDFAcroFormDocumentHelper: add missing analyze calls 2021-03-04 18:11:44 -05:00
Jay Berkenbilt f68e25c7f2 Don't use handleWarning, which is being reverted 2021-03-04 15:59:45 -05:00
Jay Berkenbilt 9fb174b9e9 Major rework of handling form fields when copying pages (fixes #509) 2021-03-04 15:08:37 -05:00
Jay Berkenbilt 887f35efaa When resolving font from /DR, copy it into resources 2021-03-04 15:08:36 -05:00
Jay Berkenbilt d7ffdfa994 Add optional conflict detection to mergeResources
Also improve behavior around direct vs. indirect resources.
2021-03-04 15:08:36 -05:00
Jay Berkenbilt e17585c2d2 Remove unreferenced: ignore names that are not Fonts or XObjects
Converted ResourceFinder to ParserCallbacks so we can better detect
the name that precedes various operators and use the operators to sort
the names into resource types. This enables us to be smarter about
detecting unreferenced resources in pages and also sets the stage for
reconciling differences in /DR across documents.
2021-03-03 17:05:49 -05:00
Jay Berkenbilt b444ab3352 Fix typos in coverage cases 2021-03-03 17:05:49 -05:00
Jay Berkenbilt fa2516df71 Fix behavior for finding /Q, /DA, and /DR for form fields
If not found in the field hierarchy, /Q and /DA are supposed to be
looked up in the document-level form dictionary. /DR is supposed to
only come from the document dictionary.
2021-03-03 17:05:19 -05:00
Jay Berkenbilt a4d6589ff2 Have QPDFObjectHandle notice when replaceObject was called
This results in a performance penalty of 1% to 2% when replaceObject
and swapObjects are never called and a somewhat larger penalty if they
are called, but it's worth it to avoid very confusing behavior as
discussed in depth in qpdf#507.
2021-02-25 07:32:46 -05:00
Jay Berkenbilt b5e937397c Prepare 10.2.0 release 2021-02-23 10:41:58 -05:00
Jay Berkenbilt 1886673d7e Spell check 2021-02-23 10:38:05 -05:00
Jay Berkenbilt 9e00be7ffa Remove warning that gives false positives in some normal cases 2021-02-23 08:26:21 -05:00
Jay Berkenbilt 039eb4a253 Fix input file = output file test for split pages 2021-02-23 08:26:21 -05:00
Jay Berkenbilt be3a8c0e7a Keep only referenced form fields in --pages 2021-02-23 08:26:21 -05:00
Jay Berkenbilt 50037fb33d Fix test case to not leave stray files behind 2021-02-22 19:51:36 -05:00
Jay Berkenbilt 83216e640c Preserve form fields when splitting pages (fixes #340) 2021-02-22 18:42:06 -05:00
Jay Berkenbilt 8e8c0d8290 Add new placeFormXObject that takes a matrix reference 2021-02-22 18:42:06 -05:00
Jay Berkenbilt 61d41e2e88 Add copyAnnotations, use with overlay/underlay (fixes #395) 2021-02-22 18:42:06 -05:00
Jay Berkenbilt 7b3cbacf5d Change from QPDF{Array,Dict}Items to aitems() and ditems() 2021-02-22 11:05:39 -05:00
Jay Berkenbilt a9ae8cadc6 Add transformAnnotations and fix flattenRotations to use it 2021-02-21 17:13:09 -05:00
Jay Berkenbilt 7540d2082a Explicitly override inherited rotate in flattenRotations 2021-02-21 14:58:45 -05:00
Jay Berkenbilt 92fbc6fdf5 QPDFObjectHandle::copyStream 2021-02-21 06:36:30 -05:00
Jay Berkenbilt 35dd11f356 Allow --rotate=0 2021-02-20 16:29:34 -05:00
Jay Berkenbilt 0a52e60ece Use QUtil::path_basename 2021-02-18 09:59:03 -05:00
Jay Berkenbilt dfce581754 Add numeric argument to --collate
This takes pages from the file in groups of n with default = 1. This
partially fixes the enhancement in issue #505 but doesn't implement
the entire suggestion.
2021-02-17 20:07:45 -05:00
Jay Berkenbilt a773f4c71d Add QPDFObjectHandle::parse for strings with context 2021-02-15 11:33:03 -05:00
Jay Berkenbilt efbb21673c Add functional versions of QPDFObjectHandle::replaceStreamData
Also fix a bug in checking consistency of length for stream data
providers. Length should not be checked or recorded if the provider
says it failed to generate the data.
2021-02-14 14:42:24 -05:00
Jay Berkenbilt 07f40bd254 QUtil::double_to_string: trim trailing zeroes with option to disable 2021-02-13 02:30:00 -05:00
Jay Berkenbilt 2538d84413 Explicitly deprecate old name/number tree constructors
Use C++14 [[deprecated]] tag
2021-02-10 16:28:00 -05:00
Jay Berkenbilt accb891b4f Add attachment information to the json output 2021-02-10 15:46:18 -05:00
Jay Berkenbilt 832d792e4e Add CLI support for working with attachments 2021-02-10 10:03:27 -05:00
Jay Berkenbilt 1f4771cd0d Minor clean up of Windows headers 2021-02-10 07:36:18 -05:00
Jay Berkenbilt ad34b9c278 Implement helpers for file attachments 2021-02-10 06:57:37 -05:00
Jay Berkenbilt e076c9bf08 Remove erroneous handling of /EFF for stream decryption
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
2021-02-06 17:08:41 -05:00
Jay Berkenbilt ac2b3b96e1 Make wrong object stream type a warning 2021-02-06 14:29:11 -05:00
Jay Berkenbilt af557db4a4 Cosmetic fix to help 2021-02-06 13:45:43 -05:00
Jay Berkenbilt 3de67173de Better fix to insecure password check (fixes #501) 2021-02-04 20:44:05 -05:00
Jay Berkenbilt 63158cf546 Add --password-file=filename option (fixes #499) 2021-02-04 16:48:53 -05:00
Jay Berkenbilt 21b0f4acfc Require --allow-insecure to create certain encrypted files (fixes #501)
For now, --allow-insecure allows creation of files with the owner
passwords empty or matching the user password.
2021-02-04 15:57:13 -05:00
Jay Berkenbilt faa2e3ddfd Handle older PDFs whose form XObjects inherit resources (fixes #494)
When removing unreferenced resources, notice if a page (recursively)
contains a form XObject with unreferenced resources, and count any
such resources as referenced by the page.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt 5fdf37b1ba Handle warnings in --pages from other files
Warnings were not being handled per --no-warn or generating exit code 3.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt de0b11fc47 Add C++ iterator API around array and dictionary objects 2021-01-30 15:15:23 -05:00
Jay Berkenbilt 8ed3e8c79b NNTree: rework iterators to be more memory efficient
Keep a std::pair internal to the iterators so that operator* can
return a reference and operator-> can work, and each can work without
copying pairs of objects around.
2021-01-26 09:12:23 -05:00
Jay Berkenbilt e7e20772ed name/number trees: remove 2021-01-26 09:12:23 -05:00
Jay Berkenbilt 5816fb44b8 name/number trees: insertAfter 2021-01-25 15:39:10 -05:00
Jay Berkenbilt 16a9bb3f6f name/number trees: newEmpty, increment/decrement end() 2021-01-25 15:39:10 -05:00
Jay Berkenbilt b5614f611d Implement repair and insert for name/number trees 2021-01-24 19:31:45 -05:00
Jay Berkenbilt 04edfe9fad QPDFObjectHandle::newUnicodeString to uses UTF-16 only when needed
Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of
encoding the string.
2021-01-24 03:27:28 -05:00
Jay Berkenbilt 63e5cb533d Use new QPDF{Name,Number}TreeObjectHelper API 2021-01-24 03:27:28 -05:00
Jay Berkenbilt d61ffb65d0 Add new constructors for name/number tree helpers
Add constructors that take a QPDF object so we can issue warnings and
create new indirect objects.
2021-01-24 03:27:26 -05:00
Jay Berkenbilt 5f0708418a Add iterators to name/number tree helpers 2021-01-24 03:22:59 -05:00
Jay Berkenbilt 4a1cce0a47 Reimplement name and number tree object helpers
Create a computationally and memory efficient implementation of name
and number trees that does binary searches as intended by the data
structure rather than loading into a map, which can use a great deal
of memory and can be very slow.
2021-01-24 03:22:51 -05:00
Jay Berkenbilt 6fe7b704c7 Warn rather than segv on access after closing input source (fixes #495) 2021-01-06 10:11:34 -05:00
Jay Berkenbilt 0fed040392 Prepare version 10.1.0 2021-01-04 16:59:55 -05:00
Jay Berkenbilt bf8fd41fee Update copyright to 2021 2021-01-04 16:26:58 -05:00
Jay Berkenbilt 891751f618 Remove unreferenced resources only from relevant pages 2021-01-04 15:17:35 -05:00
Jay Berkenbilt a9bdeeb0e0 Fix zsh completion arguments (fixes #473) 2021-01-04 15:17:35 -05:00
Jay Berkenbilt 3be58f49e5 Make more QPDFPageObjectHelper methods work with form XObject 2021-01-02 14:08:53 -05:00
Jay Berkenbilt 98da4fd835 Externalize inline images now includes form XObjects 2021-01-02 14:08:17 -05:00
Jay Berkenbilt bedf35d6a5 Bug fix: avoid extraneous pipeline finish calls with multiple contents
Avoid calling finish() multiple times on the pipeline passed to
pipeContentStreams. This commit also fixes a bug in which qpdf was not
exiting with the proper exit status if warnings found while splitting
pages; this was exposed by a test case that changed.
2021-01-02 14:08:17 -05:00
Jay Berkenbilt a139d2b36d Add several methods for working with form XObjects (fixes #436)
Make some more methods in QPDFPageObjectHelper work with form
XObjects, provide forEach methods to walk through nested form
XObjects, possibly recursively. This should make it easier to work
with form XObjects from user code.
2021-01-02 12:29:31 -05:00
Jay Berkenbilt 63ea46193d QPDFPageObjectHelper: getPageImages -> getImages 2021-01-02 11:33:36 -05:00
Jay Berkenbilt e7a8554563 QPDFPageObjectHelper::getPageImages: support form XObjects 2021-01-02 11:33:36 -05:00
Jay Berkenbilt c9271335fa Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation 2020-12-30 13:03:55 -05:00
Jay Berkenbilt 12ecd2019a Add QPDFObjectHandle::setFilterOnWrite 2020-12-28 12:58:19 -05:00
Jay Berkenbilt 858c7b89bc Let optimize filter stream parameters instead of making them direct
Also removes preclusion of stream references in stream parameters of
filterable streams and reduces write times by about 8% by eliminating
an extra traversal of the objects.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt 39bfa01307 Implement user-provided stream filters
Refactor QPDF_Stream to use stream filter classes to handle supported
stream filters as well.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt cc8895078a Add QPDFObjectHandle::makeDirect(bool allow_streams) 2020-12-26 08:48:18 -05:00
Jay Berkenbilt 2050977099 Add QPDFObjectHandle manipulation to C API 2020-11-28 19:48:07 -05:00
Jay Berkenbilt 78b9d6bfd4 Prepare 10.0.4 release 2020-11-21 13:50:02 -05:00
Jay Berkenbilt a7ef572c84 Small enhancement to --pages argument parsing 2020-11-09 11:12:34 -05:00
Jay Berkenbilt 47f4ebcdac Ignore unused field in xref entry, avoiding range error (fixes #482) 2020-11-04 07:46:46 -05:00
Jay Berkenbilt 3e5aaa299a Typo in help message 2020-11-03 09:03:16 -05:00
Jay Berkenbilt fbe40b800d Prepare 10.0.3 release 2020-10-31 13:47:03 -04:00
Jay Berkenbilt 96767fb104 Fix foreign stream copying bug (fixes #478)
This reverts an incorrect fix to #449 and codes it properly. The real
problem was that we were looking at the local dictionaries rather than
the foreign dictionaries when saving the foreign stream data. In the
case of direct objects, these happened to be the same, but in the case
of indirect objects, the object references could be pointing anywhere
since object numbers don't match up between the old and new files.
2020-10-31 12:14:26 -04:00
Jay Berkenbilt f1ae55a430 Better indirect filter test case
The test suite now contains test cases that fail with both 10.0.1 and
10.0.2 and reproduce the internal error from #449.
2020-10-31 09:02:30 -04:00
Jay Berkenbilt da7540794a Prepare 10.0.2 release 2020-10-27 11:57:48 -04:00
Jay Berkenbilt f8e4b6161c With --no-warn, suppress warnings in split-pages
Warnings issued on the output QPDF object were not suppressing
warnings since that option was only set on the input QPDF object.
2020-10-23 16:27:51 -04:00
Jay Berkenbilt b30deaeeab Avoid merging adjacent tokens when concatenating contents (fixes #444) 2020-10-23 08:00:04 -04:00
Jay Berkenbilt 0dea276997 Fix fix-qdf for empty streams 2020-10-23 06:39:42 -04:00
Jay Berkenbilt 8a11feacc3 Avoid leak by resolving object streams more than once (fuzz issue 23642) 2020-10-22 15:39:36 -04:00
Jay Berkenbilt 30bb4c64ee Minor code cleanup
* Return rather than exiting from realmain in qpdf.cc
* Remove extraneous blank line
* Don't assign temporary to const reference
2020-10-22 15:39:36 -04:00
Jay Berkenbilt 956c8f6432 Obscure bug fix copying foreign streams in special cases (fixes #449)
Specifically, if a stream had its stream data replaced and had
indirect /Filter or /DecodeParms, it would result in non-silent loss
of data and/or internal error.
2020-10-21 19:23:23 -04:00
Jay Berkenbilt deeface146 Add automated test for shell wildcard expansion
Wildcard expansion is different in Windows from non-Windows and
sometimes requires special link options to work. Add tests that fail
if we link incorrectly.
2020-10-21 14:15:31 -04:00
Jay Berkenbilt 758e3e38f5 Add option --warning-exit-0 to exit 0 instead of 3 with warnings 2020-10-20 18:02:39 -04:00
Jay Berkenbilt 90217e6686 Fix another case of errors written to stdout (fixes #438) 2020-10-20 17:48:55 -04:00
Jay Berkenbilt ff65e272a8 Fix printf formatting for newer msvc
Use autoconf rather than ifdefs to determine what format string to use
for long long.
2020-10-16 07:02:23 -04:00
Jay Berkenbilt bbd45cd01c Clarify qpdf's exit statuses in the documentation 2020-10-15 15:03:14 -04:00
Jay Berkenbilt a1994a5343 Fix/clarify documentation on --rotate option (fixes #470)
Make clear that you almost always want + or - before an angle when
specifying rotation.
2020-10-15 14:53:06 -04:00
Jay Berkenbilt 92d3cbecd4 Fix warnings reported by -Wshadow=local (fixes #431) 2020-04-16 12:41:43 -04:00
Jay Berkenbilt 578c5ac66c Use more references when iterating
When possible, use `for (auto&` or `for (auto const&` when iterating
using C++-11 style iterators.
2020-04-10 13:30:33 -04:00
Jay Berkenbilt 821a701851 Prepare 10.0.1 release 2020-04-09 11:48:26 -04:00
Jay Berkenbilt 1a7d3700a6 Fix unnecessary copies in auto iter (fixes #426)
Also switch to colon-style iteration in some cases. Thanks to Dean
Scarff for drawing this to my attention after detecting some
unnecessary copies with
https://clang.llvm.org/extra/clang-tidy/checks/performance-for-range-copy.html
2020-04-08 20:45:26 -04:00
Jay Berkenbilt 4977a7efa5 Bug fix: getStreamData should on unfilterable stream (fixes #425) 2020-04-08 18:52:04 -04:00
Jay Berkenbilt 892937cbbe Fix errors in --remove-unreferenced-resources=auto 2020-04-06 12:14:27 -04:00
Jay Berkenbilt 1e629c278a Prepare 10.0.0 release 2020-04-06 11:30:15 -04:00
Jay Berkenbilt ce6cee3570 Spell check 2020-04-06 11:23:02 -04:00
Jay Berkenbilt 3d0de5b924 Fixes to ChangeLog and manual for 10.0.0 changes 2020-04-06 09:02:58 -04:00