Commit Graph

969 Commits

Author SHA1 Message Date
m-holger 9dea7d3080 Tune QPDF::getAllPagesInternal
Avoid calling getAllPagesInternal for each /Page object.
2022-08-01 13:29:14 +01:00
Jay Berkenbilt 12d065c751 Provide a simpler QPDF::writeJSON 2022-07-31 16:23:17 -04:00
Jay Berkenbilt 13cf35ce2f Use calledgetallpages and pushedinheritedpageresources 2022-07-31 16:23:17 -04:00
Jay Berkenbilt 69820847af Change the output of --json to use "qpdf" instead of "objects" 2022-07-31 15:17:01 -04:00
Jay Berkenbilt d01c4f8819 Change --json-output format
from "qpdf-v2" to "qpdf": [..., ...]
2022-07-31 10:32:55 -04:00
Jay Berkenbilt b3e6d445cb Tweak "AndGet" mutator functions again
Remove any ambiguity around whether old or new value is being
returned.
2022-07-24 15:42:23 -04:00
m-holger afd35f9a30 Overload StreamDataProvider::provideStreamData
Use 'QPDFObjGen const&' instead of 'int, int' in signature.
2022-07-24 16:02:35 +01:00
m-holger f0a8178091 Refactor QPDFObject creation and cloning
Move responsibility for creating shared pointers to objects and cloning from QPDFObjectHandle to QPDFObject.
2022-06-27 12:47:02 -04:00
Jay Berkenbilt 0c7c7e4ba4 Track whether certain page modifying methods have been called
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
2022-06-25 13:55:45 -04:00
Jay Berkenbilt 8a32515a62 Add warnings for some additional page tree repair 2022-06-25 13:25:35 -04:00
Jay Berkenbilt eae75dbe44 Add Pl_Function -- a generic function pipeline 2022-06-19 09:12:29 -04:00
Jay Berkenbilt bb0ea2f8e7 Add qpdfjob_register_progress_reporter 2022-06-19 08:46:58 -04:00
Jay Berkenbilt 87412eb05b Add QPDFJob::registerProgressReporter 2022-06-19 08:46:58 -04:00
Jay Berkenbilt 3a7ee7e938 Move C-based ProgressReporter helper into QPDFWriter 2022-06-19 08:46:58 -04:00
Jay Berkenbilt daef4e8fb8 Add more flexible functions to qpdfjob C API 2022-06-19 08:46:58 -04:00
Jay Berkenbilt e0720eaa78 Use the default logger for other writes to stdout/stderr
When there is no context for writing output or error messages, use the
default logger.
2022-06-18 10:38:50 -04:00
Jay Berkenbilt 83be2191b4 Use "save" logger when saving data to standard output
This includes the output PDF, streams from --show-object and
attachments from --save-attachment. This also enables --verbose and
--progress to work with saving to stdout.
2022-06-18 09:54:40 -04:00
Jay Berkenbilt 641e92c6a7 QPDF, QPDFJob: use QPDFLogger instead of custom output streams 2022-06-18 09:02:55 -04:00
Jay Berkenbilt f1f711963b Add and test QPDFLogger class 2022-06-18 09:02:55 -04:00
Jay Berkenbilt b7bbf12e85 In json mode, reveal recovered user password when otherwise unavailable 2022-05-30 20:03:08 -04:00
Jay Berkenbilt f049a77c59 Add additional information when listing attachments 2022-05-30 20:03:08 -04:00
Jay Berkenbilt 27a42c16c7 Change default decode level to "none" with --json-output 2022-05-21 17:51:34 -04:00
Jay Berkenbilt b0f1564376 Add another binary utf8 to JSON test 2022-05-21 17:39:35 -04:00
Jay Berkenbilt 752f43d4e4 Allow empty b: binary JSON strings 2022-05-21 17:36:32 -04:00
m-holger 6c69a747b9 Code clean up: use range-style for loops wherever possible
Remove variables obsoleted by commit 4f24617.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt 905f47a55f Add json to large file test 2022-05-21 09:43:45 -04:00
Jay Berkenbilt 9b2eb01e25 Exercise object description in tests 2022-05-20 14:23:32 -04:00
Jay Berkenbilt 6c2fb5b8f0 Add test for bad data and bad datafile 2022-05-20 13:33:30 -04:00
Jay Berkenbilt d065098089 Test --update-from-json 2022-05-20 11:10:12 -04:00
Jay Berkenbilt 6d4e3ba8a4 Test (and fix) handling of dangling references 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 35b1e1c493 Explicitly test ignoring unknown keys in JSON input 2022-05-20 09:16:25 -04:00
Jay Berkenbilt dc8df962d8 Make version default to latest for --json-output (like --json) 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 907df2c823 Round-trip tests with --json-stream-data=file 2022-05-20 09:16:25 -04:00
Jay Berkenbilt a83b7b0611 Tests with manually constructed qpdf json 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 7f8c4b183d Add tests for --json-input 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 1ec561daa4 Add more names and strings in good13
* native UTF-8 strings
* names whose PDF and canonical syntax differ in both dictionary key
  positions and other positions

For json, names are converted both as names and directly when used as
dictionary keys.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 6c5e590673 Rename all test files: _ to - 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 6f43bf8de3 Major rework -- see long comments
* Replace --create-from-json=file with --json-input, which causes the
  regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
  data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
  stream data.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 56f1b411fe Back out fluent QPDFObjectHandle methods. Keep the andGet methods.
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 7e7a9c4379 Parse objects; stream data is not yet handled 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 7fa5d1773b Implement top-level qpdf json parsing 2022-05-16 13:41:40 -04:00
Jay Berkenbilt 9a0e9a1a9e Remove offset from missing /Root error
The last offset is irrelevant to not being able to find /Root.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt 173b944ef8 Split qpdf.test into multiple test suites
This makes it a lot easier to run parts of the test suite.
2022-05-14 17:35:06 -04:00
Jay Berkenbilt 2a2f7f1bba Add maxobjectid to JSON 2022-05-08 13:45:20 -04:00
Jay Berkenbilt e9390aeaaa Add --to-json option 2022-05-08 13:45:20 -04:00
Jay Berkenbilt 2e87d593eb Test inline stream data with different decode levels 2022-05-08 13:45:20 -04:00
Jay Berkenbilt f08f398920 Test json v2 with invalid stream data 2022-05-08 13:45:20 -04:00
Jay Berkenbilt c76536dd9a Implement JSON v2 output 2022-05-08 13:45:20 -04:00
Jay Berkenbilt bdfc4da510 Apply script across future v2 test files
There is one unexpected pass in this commit. This script was applied
to the files changed in this commit:

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    data['version'] = 2
    objectinfo = {}
    if 'objectinfo' in data:
        objectinfo = data['objectinfo']
        del data['objectinfo']
    if 'objects' not in data:
        continue
    qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}}
    for k, v in data['objects'].items():
        is_stream = objectinfo.get(k, {}).get('stream', {}).get('is', False)
        if k.endswith(' R'):
            k = 'obj:' + k
        if is_stream:
            v = {'stream': {'dict': v}}
        else:
            v = {'value': v}
        qpdf['objects'][k] = v
    data['qpdf'] = qpdf
    del data['objects']
    print(json_dumps(data))
----------
2022-05-08 13:45:20 -04:00
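As a rough illustration of what the script in the commit above does to a single file, the sketch below applies the same per-object reshaping to a tiny hand-made document. The helper name `to_v2` and the sample data are invented for this example and are not taken from the qpdf test suite.

```python
def to_v2(data):
    """Reshape one parsed v1 JSON document the way the script above does."""
    data['version'] = 2
    # objectinfo is only consulted to learn which objects are streams
    objectinfo = data.pop('objectinfo', {})
    if 'objects' not in data:
        return data
    qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}}
    for k, v in data['objects'].items():
        is_stream = objectinfo.get(k, {}).get('stream', {}).get('is', False)
        if k.endswith(' R'):
            k = 'obj:' + k  # indirect object keys gain an "obj:" prefix
        v = {'stream': {'dict': v}} if is_stream else {'value': v}
        qpdf['objects'][k] = v
    data['qpdf'] = qpdf
    del data['objects']
    return data


# Hypothetical sample input: one plain object, one stream, and the trailer.
sample = {
    'version': 1,
    'objects': {
        '1 0 R': {'/Type': '/Catalog'},
        '2 0 R': {'/Length': 5},
        'trailer': {'/Root': '1 0 R'},
    },
    'objectinfo': {
        '1 0 R': {'stream': {'is': False}},
        '2 0 R': {'stream': {'is': True}},
    },
}
result = to_v2(sample)
```

In this sample, `1 0 R` ends up under `obj:1 0 R` wrapped in `value`, `2 0 R` is wrapped in `stream`/`dict` because `objectinfo` marks it as a stream, and `trailer` keeps its key.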
Jay Berkenbilt 8d348974aa Prepare test suite for json v2 2022-05-08 13:45:20 -04:00
Jay Berkenbilt 15272662f6 Fix typo in json output key name
moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 1bc8abfdd3 Implement JSON v2 for Stream
Not fully exercised in this commit
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 3246923cf2 Implement JSON v2 for String
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt 16f4f94cd9 Prepare code for JSON v2
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt a9fbbd5dca Objectinfo json: write incrementally and in numeric order
This script was used on test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt 948de60990 Objects json: write incrementally and in numeric order
The following script was used to adjust test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
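The point of the script in the commit above is that object keys such as "10 0 R" must be ordered by object number, not as strings. A minimal sketch of that ordering, using a made-up helper name (`sort_objects`) and made-up sample keys:

```python
import re


def sort_objects(objects):
    """Return a dict with 'N G R' keys in numeric order and 'trailer' last."""
    trailer = None
    to_sort = []
    for k, v in objects.items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append((int(m.group(1)), k, v))
    # Sort on the object number only; dicts preserve insertion order.
    ordered = {k: v for _, k, v in sorted(to_sort, key=lambda x: x[0])}
    if trailer is not None:
        ordered['trailer'] = trailer
    return ordered


sample = {'10 0 R': 'a', 'trailer': 't', '2 0 R': 'b', '1 0 R': 'c'}
numeric_order = list(sort_objects(sample))
string_order = sorted(k for k in sample if k != 'trailer')
```

Here `numeric_order` puts "2 0 R" before "10 0 R", whereas a plain string sort (`string_order`) would put "10 0 R" first.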
Jay Berkenbilt dc9b7287cd Top-level json: write incrementally
This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt 7f65a5c21f Test json against schema only on demand
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
2022-05-07 08:26:31 -04:00
Jay Berkenbilt a3c9980395 Add next to Pl_String and fix comments 2022-05-07 08:26:31 -04:00
Jay Berkenbilt e5f3910c3e Add new FileInputSource constructors 2022-05-04 12:07:11 -04:00
Jay Berkenbilt 8b25de24c9 Make "objects" and "pages" consistent in JSON output 2022-05-04 08:32:44 -04:00
Jay Berkenbilt f4206a0938 Add new Pl_String Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt 16139d97c8 Add new Pl_OStream Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt 21d6e3231f Make use of the new Pipeline methods in some places 2022-05-03 18:31:23 -04:00
Jay Berkenbilt 59f3e09edf Make Pipeline::write take an unsigned char const* (API change) 2022-05-03 18:31:22 -04:00
Jay Berkenbilt 62bf296a9c Make assert handling less error-prone
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt 8ccd3a8a89 Mark weak encryption with API changes (fixes #576) 2022-04-30 17:24:15 -04:00
Jay Berkenbilt cff26040d8 Using insecure crypto from the CLI is now an error by default 2022-04-30 17:23:58 -04:00
Jay Berkenbilt 4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoided this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more efficient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt 7f023701dd Formatting: remove space in range-style for loops
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt 2878c186bf Use fluent appendItem 2022-04-30 10:54:16 -04:00
Jay Berkenbilt ab9d557cb0 Use fluent replaceKey 2022-04-29 20:39:54 -04:00
Jay Berkenbilt e80fad86e9 Add new QPDFObjectHandle methods for more fluent programming 2022-04-29 20:09:10 -04:00
Jay Berkenbilt d0b7cc8ac6 QPDFJob json: make removeAttachment take an array (fixes #693) 2022-04-24 13:06:19 -04:00
Jay Berkenbilt 08ba21cf49 Fix some bugs around null values in dictionaries
Make it so that a key with a null value is always treated as not being
present. This was inconsistent before.
2022-04-24 10:08:32 -04:00
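The rule stated in the commit above (a key whose value is null counts as absent) can be modelled in a few lines. This is only a toy model in Python, with `None` standing in for the PDF null object; the class and method names are hypothetical and do not reflect qpdf's C++ implementation.

```python
class ToyPdfDict:
    """Toy dictionary in which a key with a null (None) value is not present."""

    def __init__(self, items=None):
        self._items = dict(items or {})

    def has_key(self, key):
        # A stored None is treated exactly like a missing key.
        return self._items.get(key) is not None

    def keys(self):
        return {k for k, v in self._items.items() if v is not None}


d = ToyPdfDict({'/Type': '/Page', '/Rotate': None})
```

With this model, `has_key` and `keys` agree with each other: `/Rotate` is reported as absent by both.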
Jay Berkenbilt 4be2f36049 Deprecate replaceOrRemoveKey -- it's the same as replaceKey 2022-04-24 09:31:32 -04:00
Jay Berkenbilt b8d0b0b638 Re-add accidentally removed qpdf.testcov 2022-04-24 09:18:04 -04:00
Jay Berkenbilt 68e721981a Add new QPDF::warn that takes most of QPDFExc's arguments 2022-04-23 18:25:43 -04:00
Jay Berkenbilt 22b35c4928 Expose QUtil::get_next_utf8_codepoint 2022-04-23 18:25:43 -04:00
Jay Berkenbilt ce5c3bcad8 QPDFJob: pass capture output streams through to underlying QPDF 2022-04-18 11:24:17 -04:00
Jay Berkenbilt 80ed3076a0 Remove deprecated name/number tree constructors
Remove the name/number tree object helper constructors that don't take
a QPDF&.
2022-04-16 13:13:15 -04:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt ce86307a1a Fix typo in error message 2022-04-10 16:54:23 -04:00
Jay Berkenbilt 5f4675bb24 Mark non-ABI symbols in exported class with QPDF_DLL_PRIVATE 2022-04-10 16:54:23 -04:00
Jay Berkenbilt 5525c93124 Use QPDF_DLL_CLASS with Pipeline and InputSource subclasses
This enables RTTI so we can use dynamic_cast on them across the shared
object boundary.
2022-04-10 16:52:57 -04:00
Jay Berkenbilt 07edf96440 Remove methods of private classes from ABI
Prior to the cmake conversion, several private classes had methods
that were exported into the shared library so they could be tested
with libtests. With cmake, we build libtests using an object library,
so this is no longer necessary. The methods that are disappearing from
the ABI were never exposed through public headers, so no code should
be using them. Removal had to wait until the window for ABI-breaking
changes was open.
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 128e41648f Remove PointerHolder.hh from other than public header files
Increase to POINTERHOLDER_TRANSITION=4
2022-04-09 17:33:29 -04:00
Jay Berkenbilt ba0ef7a124 Replace PointerHolder with std::shared_ptr in the rest of the code
Increase to POINTERHOLDER_TRANSITION=3

patrepl s/PointerHolder/std::shared_ptr/g **/*.cc **/*.hh
patrepl s/make_pointer_holder/std::make_shared/g **/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g **/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
git restore libtests/pointer_holder.cc
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt 820a3f04fd Remove "lt-" workarounds
The executables that libtool built invoked the underlying binary with
an "lt-" prefix. The code contained numerous workarounds for testing,
which can now be removed.
2022-03-18 19:53:18 -04:00
Jay Berkenbilt acdf5b2e7a Update process for ABI testing 2022-03-18 19:53:18 -04:00
Jay Berkenbilt f58d2a60d5 Update build-related documentation and comments 2022-03-18 19:53:18 -04:00
Jay Berkenbilt 70d0d0889b Remove old build files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt b8aff90997 Add cmake configuration files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt 6941923ca9 Improve large file test output 2022-03-18 19:53:18 -04:00
Jay Berkenbilt c0a231bf32 Reverse sense of compare images toggle for qpdf.test
Run compare images tests when QPDF_TEST_COMPARE_IMAGES is set rather
than when QPDF_SKIP_TEST_COMPARE_IMAGES is not set.
2022-03-18 19:53:18 -04:00
Jay Berkenbilt 17c0e38c8e Force assert to be defined in test code 2022-03-07 10:07:27 -05:00
Jay Berkenbilt 1d86b70eab Tweak include of config for ctest 2022-03-01 14:01:34 -05:00
Jay Berkenbilt 99393e6ab7 Shorten coverage case name
This is so it will fit on one line after a qtest upgrade allows us to
split lines.
2022-02-26 10:18:23 -05:00
Jay Berkenbilt f7ac591590 Recognize explicit UTF-8 strings (fixes #654) 2022-02-22 08:10:05 -05:00
Jay Berkenbilt 31b45b0fd4 Fix logic error with Tf when generating appearances (fixes #655) 2022-02-18 13:46:35 -05:00
Jay Berkenbilt 3e2109ab37 Remove special case for 0xad for 10.6.2. 2022-02-16 06:52:05 -05:00
Jay Berkenbilt e810fe678a Fix asymmetry between newUnicodeString and getUTF8Value 2022-02-15 19:22:35 -05:00
Jay Berkenbilt a478cbb6dc Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649)
The PDF spec only allows UTF-16BE, but most readers seem to accept
UTF-16LE as well, so now qpdf does too.
2022-02-15 16:13:12 -05:00
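One way to picture the behavior described in the commit above is a decoder that picks the byte order from the byte order mark. The sketch below is an illustration in Python, not qpdf's code; that the decision is made from the BOM is an assumption, and `decode_utf16_text_string` is a made-up name.

```python
def decode_utf16_text_string(raw: bytes):
    """Decode a BOM-prefixed UTF-16 string; return None if no BOM is present."""
    if raw.startswith(b'\xfe\xff'):
        # Big-endian BOM: the only form the PDF spec allows.
        return raw[2:].decode('utf-16-be')
    if raw.startswith(b'\xff\xfe'):
        # Little-endian BOM: accepted leniently, as most readers do.
        return raw[2:].decode('utf-16-le')
    return None


big_endian = decode_utf16_text_string(b'\xfe\xff\x00H\x00i')
little_endian = decode_utf16_text_string(b'\xff\xfeH\x00i\x00')
no_bom = decode_utf16_text_string(b'Hi')
```

Both BOM-prefixed inputs decode to the same text, and input without a BOM is left for other handling.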
Jay Berkenbilt fbd3e56da7 Ignore -- at the top level arg parser (fixes #652)
This was unintended behavior that was added back for backward
compatibility. It is intentionally undocumented.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt 19608ec151 Add missing spaces in usageExit 2022-02-15 16:11:33 -05:00
Jay Berkenbilt 956a272d62 Remove abs calls and pick correct floating point epsilon values (fixes #641) 2022-02-11 07:18:33 -05:00
Jay Berkenbilt d501e1c0d4 Only update output version from files used as input
If we're opening a PDF file to copy its encryption information or
attachments, its version doesn't need to influence the output version.
2022-02-08 13:49:22 -05:00
Jay Berkenbilt f91b21c7d4 Preserve input PDF version on pages/split-pages (fixes #610) 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cfd5147d92 Add QPDF::getVersionAsPDFVersion 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This commit expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespace changes to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt c62e8e2b28 Update for clean compile with POINTERHOLDER_TRANSITION=2 2022-02-07 17:38:22 -05:00
m-holger 5901fcad4c C-API expose QPDFObjectHandle::getKeyIfDict 2022-02-06 11:21:15 -05:00
m-holger 8371060340 Add method QPDFObjectHandle::getKeyIfDict 2022-02-06 11:21:15 -05:00
m-holger 2ed5f49a79 C-API expose QPDFObjectHandle::getValueAs... accessors 2022-02-05 19:40:30 -05:00
Jay Berkenbilt 7fb22740e1 Add operator ""_qpdf for creating QPDFObjectHandle literals 2022-02-05 11:29:25 -05:00
Jay Berkenbilt b48a0ff0e8 Add qpdf_empty_pdf to C API 2022-02-05 11:29:25 -05:00
m-holger e58b1174c7 Add new QPDFObjectHandle::getValueAs... accessors 2022-02-05 11:24:35 -05:00
Jay Berkenbilt 9044a24097 PointerHolder: deprecate getPointer() and getRefcount()
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.

This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
m-holger 95e7d36b7a C-API add two binary UTF8 functions
add qpdf_oh_new_binary_unicode_string and qpdf_oh_get_binary_utf8_value
2022-02-04 13:10:51 -05:00
m-holger 1925ffd467 Fix --check-linearization of non-linearized files (fixes #615) 2022-02-04 06:52:38 -05:00
Jay Berkenbilt 42bff9f458 QPDFJob: let initializeFromArgv just take argv, not argc
Let argv be a null-terminated array. There is already code that
assumes this, and it makes it easier to construct the arguments.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt b02d37bc0a Make QPDFArgParser accept const argv
This makes it much more convenient to use the initializeFromArgv
functions since you can use string literals.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt bc4e2320e7 Add qpdfjob-c.h -- simple C API around parts of QPDFJob 2022-02-01 09:04:55 -05:00
Jay Berkenbilt 03e67a28fe Move QTC::TC for qpdf to QPDFJob
All the coverage cases that used to be in qpdf.cc are now in
QPDFJob*.cc. It doesn't really matter, but better to follow the
convention of starting with the class that includes the coverage call.
2022-02-01 09:04:55 -05:00
Jay Berkenbilt b42f3e1d15 Move more code from qpdf.cc into QPDFJob 2022-02-01 09:04:55 -05:00
Jay Berkenbilt 21b9290785 QPDFJob json: make bare arguments expect the empty string
Changing from bool requiring true to string requiring the empty string
is more consistent with the CLI and makes it possible to add an
optional parameter or choices later without breaking compatibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt ea96330bb6 QPDFJob json: flatten json structure
Flatten everything to make it easier to map command-line flags to
json. The old structure was an illusion anyway because there was no
mechanism to enforce that things were in the right place. This also
helps with future flexibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt 47f33cec25 QPDFJob: add test cases 2022-01-31 15:57:45 -05:00
Jay Berkenbilt e3506253f1 Add optional version to --json 2022-01-31 15:57:45 -05:00
Jay Berkenbilt caa00556cf Change filename or path to file in json and QPDFJob
Use "file" consistently for specifying a file path. We use "filename"
when adding attachments for a completely different purpose.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt 7097f29019 More editorial changes from m-holger + spell check 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0e909bab8e Improve top-level help information 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0364024781 Use QPDFUsage exception for cli, json, and QPDFJob errors 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 95d127641c QPDFJob: move more top-level trivial handlers into config 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 9373881cca Add QPDFJob::ConfigError exception 2022-01-30 13:11:03 -05:00
Jay Berkenbilt b9cd693a5b QPDFJob: allocate QPDFArgParser on stack
The previous commits have removed all references to memory from
QPDFArgParser from QPDFJob. This commit removes the constraint that
QPDFArgParser remain in scope. This is a prerequisite to allowing JSON
as an alternative way to initialize QPDFJob and to initialize it
directly using a public API.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt b9af421ef7 Add missing \f support for JSON string encoder 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 5c5e5ca29b Document how to add a command-line argument 2022-01-30 13:11:03 -05:00
Jay Berkenbilt a301cc5373 Minor code cleanup 2022-01-30 13:11:03 -05:00
Jay Berkenbilt bd89aac360 QPDFJob increment: move arg parsing into QPDFJob
Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with
millions of public member variables, but now qpdf.cc is minimal and
just calls stable library functions.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 23b64f8357 Remove qpdf.cc version check
Remove comparison of qpdf CLI version with library. With almost all
the functionality moving into the library, this check is no longer
meaningful.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 1ddf5b4b4b QPDFJob increment: get rid of exit, handle verbose
Remove all calls to exit() from QPDFJob. Handle code that runs in
verbose mode to enable it to make use of output streams and message
prefix (whoami) from QPDFJob. This removes temporarily duplicated exit
code logic and most access to whoami/std::cout outside of QPDFJob
proper.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0910e767ad QPDFJob increment: basic QPDFJob structure
Move most of the methods called from qpdf.cc after argument parsing
into QPDFJob. In this increment, enough QPDFJob API has been added to
handle the branch of QPDFJob::run() that creates output with an
appropriate division between qpdf.cc and QPDFJob.

There are temporary bits of code to enable everything to compile and
pass the test suite, including some duplication and hard-coded values.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 8c718b7e6f Prefix program name before exception message in qpdf CLI 2022-01-30 13:11:02 -05:00
Jay Berkenbilt c60b4ea55a Refactor arg parsing in qpdf.cc to use QPDFArgParser 2022-01-30 13:11:02 -05:00
Jay Berkenbilt 52817f0a45 Implement QPDFArgParser based on ArgParser from qpdf.cc 2022-01-30 13:11:02 -05:00
m-holger 8eca9d8fd9 Fix QPDFObjectHandle::isOrHasName
Ensure isOrHasName returns true if object is an array and the name is
present anywhere in the array.
2022-01-27 09:35:39 -06:00
m-holger 710d2e54f0 Allow testing for subtype without specifying type in isDictionaryOfType etc
Accept empty string as type parameter in
QPDFObjectHandle::isDictionaryOfType and isStreamOfType
to allow for dictionaries with optional type.
2022-01-27 07:31:12 -06:00
m-holger 1b1b471ca9 Make a few whitespace fixes from last commit
Commit by ejb@ql.org using m-holger as author so git annotate gives
proper credit for changes.
2022-01-22 09:14:53 -05:00
m-holger 8593b9fdf7 Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc
Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType
2022-01-22 08:10:28 -06:00
Jay Berkenbilt 370710657a Add missing characters from PDF doc encoding (fixes #606) 2022-01-11 15:55:19 -05:00
Jay Berkenbilt 0f1ffa1215 Move bash/zsh completion helpers to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt 4782b5904f Move filter-completion.pl to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt af91b5b584 Add QUtil::file_can_be_opened 2021-12-29 13:41:02 -05:00
Jay Berkenbilt ac0060ac38 Refactor arg parsing to allow help option with parameter 2021-12-29 13:35:05 -05:00
Jay Berkenbilt 04745320d6 Prepare 10.5.0 release 2021-12-20 14:51:46 -05:00
Jay Berkenbilt d866f48081 Change names of qpdf_object_type_e enumerations
They have to be ot_* rather than qpdf_ot_* for compatibility.

* Different enumerated types are not assignment-compatible in C++, at
  least with strict compiler settings
* While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to
  make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will
  only work if the enumerated type names are the same.
2021-12-20 14:51:45 -05:00
Jay Berkenbilt cf7b2b5700 test_driver: split runtest into separate functions
Too bad about git annotate but it was pretty crazy to have all those
test cases together like that.
2021-12-20 12:40:03 -05:00
Jay Berkenbilt ea73bf72e0 Further improvements to handling binary strings 2021-12-19 14:30:45 -05:00
Jay Berkenbilt d3501c4f3e Fix LGTM alerts 2021-12-18 16:25:53 -05:00
Jay Berkenbilt ddbe59179e C API: simplify new error handling and improve documentation 2021-12-17 15:59:47 -05:00
m-holger f6293bd94c C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597) 2021-12-17 14:24:43 -05:00
Jay Berkenbilt feafcc4e88 C API: add several stream functions (fixes #596) 2021-12-17 13:28:11 -05:00
Jay Berkenbilt 4024953682 Output C test n done at the end of each qpdf-ctest 2021-12-16 15:40:56 -05:00
Jay Berkenbilt 9bb6f570ec C API: add functions for working with pages (fixes #594) 2021-12-16 15:07:48 -05:00
Jay Berkenbilt f072be032f qpdf-ctest: outfile2 -> xarg 2021-12-16 11:51:16 -05:00
Jay Berkenbilt 08bcf6449c Clarify docs around @filename and leading/trailing space 2021-12-10 15:52:28 -05:00
Jay Berkenbilt af2a71aa2c Handle bitstream overflow errors more gracefully (fixes #581)
* Make it a runtime error, not a logic error
* Include additional information
* Capture it properly in checkLinearization
2021-12-10 15:37:35 -05:00
Jay Berkenbilt 1c62c2a342 C API: expose functions for indirect objects (fixes #588) 2021-12-10 14:57:35 -05:00
Jay Berkenbilt 8e0b153332 Expose QPDFObjectHandle::addTokenFilter (fixes #580) 2021-12-10 13:37:07 -05:00
Jay Berkenbilt 72c10d8617 C API: overhaul error handling
* Handle error conditions that occur when using the object handle
  interfaces. In the past, some exceptions were not correctly
  converted to errors or warnings.
* Add more detailed information to qpdf-c.h
* Make it possible to work more explicitly with uninitialized objects
2021-12-10 12:16:02 -05:00
Jay Berkenbilt 3340dbe976 Use a specific error code for type warnings and clarify docs 2021-12-10 11:15:49 -05:00
Jay Berkenbilt b2b2a175c4 Add missing unit test for register progress reporter in C API
It was exercised in the pdf-linearize example but not in qpdf-ctest.
2021-12-10 09:11:56 -05:00
Jay Berkenbilt 09f3737202 Split qpdf-ctest test 24 into multiple tests
Thanks for the nudge from m-holger!
2021-12-09 15:21:19 -05:00
Jay Berkenbilt e3cc171d02 C API: qpdf_oh_is_initialized 2021-12-09 10:33:31 -05:00
Jay Berkenbilt bef2c2222a C API: qpdf_get_last_string_length 2021-12-09 10:33:31 -05:00
m-holger 0c705a882b Minor documentation updates 2021-12-09 10:24:14 -05:00
m-holger b4fc9eb700 C-API expose new_object as qpdf_oh_new_object 2021-12-02 13:59:58 -05:00
Jay Berkenbilt 720ce9e8f3 Improve testing and error handling around operating before processing 2021-11-29 07:42:36 -05:00
Jay Berkenbilt b97a43e091 Add additional testing around improved array wrapping 2021-11-19 13:33:10 -05:00
m-holger 4630b8567c Ensure qpdf_oh handles returned by C-API functions are unique.
Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array.
Update some doc comments in qpdf-c.h.
2021-11-19 13:31:59 +00:00
Jay Berkenbilt ce7db05d22 Prepare 10.4.0 release 2021-11-16 15:44:09 -05:00
Jay Berkenbilt 750aca5b94 First increment of improving handling of weak crypto (fixes #358) 2021-11-11 12:24:15 -05:00
Jay Berkenbilt f45dacf4cb Make recovery logic flexible about where objects end (fixes #573)
Don't assume endobj is at the beginning of the line. This means we are
looking at tokens for every line, but the odds of n n obj appearing in
the middle of the object are likely much lower than endobj not being
at the beginning of the line or missing entirely. This will probably
have a negative impact on recovery time for very large files.
Hopefully it will be worth it.
2021-11-07 15:27:22 -05:00
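The trade-off in the commit above can be sketched by comparing two scanners: one that only accepts "n n obj" at the start of a line, and one that looks at every token on every line. This is a simplified Python illustration with invented function names and sample data; qpdf's actual recovery code uses its own tokenizer.

```python
import re

# "n n obj" as a whitespace-delimited token sequence anywhere in a line.
OBJ_ANYWHERE = re.compile(r'(?<!\S)(\d+)\s+(\d+)\s+obj(?!\S)')
# "n n obj" only at the very beginning of a line.
OBJ_AT_START = re.compile(r'(\d+)\s+(\d+)\s+obj(?!\S)')


def starts_anywhere(text):
    """Find object headers at any position in any line."""
    return [(int(m.group(1)), int(m.group(2)))
            for line in text.splitlines()
            for m in OBJ_ANYWHERE.finditer(line)]


def starts_at_line_begin(text):
    """Find object headers only when they begin a line."""
    found = []
    for line in text.splitlines():
        m = OBJ_AT_START.match(line)
        if m:
            found.append((int(m.group(1)), int(m.group(2))))
    return found


# Hypothetical damaged input: endobj is followed by the next object
# header on the same line.
damaged = "1 0 obj\n<< /A 1 >> endobj 2 0 obj\n(hello)\nendobj\n"
flexible = starts_anywhere(damaged)
strict = starts_at_line_begin(damaged)
```

The strict scanner misses object 2 in this input, while the flexible one finds it at the cost of examining every token.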
Jay Berkenbilt 4a648b9a00 Fix bug in merging resources /DR from foreign AcroForm (fixes #548)
When making resources indirect in from_dr, the code was using the
wrong owning QPDF, forgetting that from_dr had already been copied
using CopyForeignObject.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt 9b28933647 Check object ownership when adding
When adding a QPDFObjectHandle to an array or dictionary, if possible,
check if the new object belongs to the same QPDF. This makes it much
easier to find incorrect code than waiting for the situation to be
detected when the file is written.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt 73752683c9 Fix overlay/underlay on page with no resources (fixes #527) 2021-11-03 16:00:05 -04:00
Jay Berkenbilt 33a47d5c3c Make QPDF::findPage public (fixes #516)
This was originally not public because I wanted to get rid of the
pages cache, but I recently realized there were deep reasons not to do
that, and the author of pikepdf wanted this, so I decided to make it
public.
2021-11-03 09:43:17 -04:00
Jay Berkenbilt 532a4f3d60 Detect recoverable but invalid zlib data streams (fixes #562) 2021-11-03 09:43:17 -04:00
Jay Berkenbilt 7ed991343b Better diagnostics when --pages is not closed (fixes #555) 2021-11-02 16:22:37 -04:00
Fredrik Fornwall e0775238b8 Fix QPDFEFStreamObjectHelper::{get,set}Subtype
The /Subtype entry that specifies the mime type of an embedded file is
inside the embedded file stream dictionary directly, not in the
parameter dictionary.

See Table 45 and 46 in the PDF 1.7 specification:
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112
2021-09-10 10:02:24 -04:00
Jay Berkenbilt df38fe8e48 Fix string bounds checking in completion code (fixes #441) 2021-05-13 13:06:58 -04:00
Jay Berkenbilt bddebdb0ea Prepare 10.3.2 release 2021-05-08 10:41:14 -04:00
Jay Berkenbilt 30ac51bc78 Exclude unreferenced objects in object streams (fixes #520) 2021-05-08 09:42:09 -04:00
Jay Berkenbilt 8971443e46 QPDF::addPage*: handle duplicate pages more robustly 2021-04-05 10:58:10 -04:00
Jay Berkenbilt ec48820c3c Fix loop detection in NNTree 2021-04-05 07:59:02 -04:00
Jay Berkenbilt 3f05429cc5 Prepare 10.3.1 release 2021-03-11 12:59:41 -05:00
Jay Berkenbilt 972e08af58 Protect against future bugs in fixCopiedAnnotations
I don't want additional, undiscovered bugs to fully block page
splitting/merging operations.
2021-03-11 12:49:27 -05:00
Jay Berkenbilt 85884c363c Allow /DR to be direct in /AcroForm
Also handle direct annotation, though this is much less likely.
2021-03-11 11:43:38 -05:00