m-holger
bd300be08d
Replace calls to QPDFObjectHandle::Factory::newIndirect where possible
2022-08-31 22:45:45 +01:00
Jay Berkenbilt
a078202c1b
Merge pull request #752 from jberkenbilt/report-mem-usage
...
Report mem usage
2022-08-31 15:50:17 -04:00
Jay Berkenbilt
7b3134ef94
Add ChangeLog for previous contribution
...
Also remove no-longer-needed #include
2022-08-31 15:06:37 -04:00
Jay Berkenbilt
433f1dae19
Add --report-mem-usage option for debugging/testing
2022-08-31 14:47:27 -04:00
Jay Berkenbilt
0a54247652
Add QUtil::get_max_memory_usage for testing
2022-08-31 14:47:27 -04:00
m-holger
9532dca3a5
Inline QPDFObjectHandle::setParsedOffset
...
Part of #729
2022-08-30 14:55:45 +01:00
m-holger
70d985f942
Optimise QPDFParser::parse for #311 problem
...
Avoid creating new null objects that later will be discarded and made
implicit.
Part of #729
2022-08-30 13:32:54 +01:00
m-holger
97a7ad1d80
Avoid setting descriptions / offsets for direct nulls in QPDFParser::parse
...
Part of #729
2022-08-30 13:07:48 +01:00
m-holger
7402c02c80
Combine stacks in QPDFParser::parse
...
Part of #729
2022-08-30 12:53:19 +01:00
m-holger
74162a2d48
Tune QPDFParser::parse
...
Replace SparseOHArray with std::vector<QPDFObjectHandle>.
Part of #729
2022-08-30 11:32:43 +01:00
m-holger
6fc982b71a
Move QPDFObjectHandle::setObjectDescriptionFromInput to QPDFParser
...
Part of #729
2022-08-30 06:42:46 +01:00
m-holger
8ad1ea34fe
Add private methods QPDFParser::warn
...
Part of #729
2022-08-30 06:04:34 +01:00
m-holger
6670c685ab
Move QPDFObjectHandle::parseInternal to new class QPDFParser
...
Part of #729
2022-08-30 05:56:23 +01:00
Jay Berkenbilt
0adfd74f8b
Merge pull request #747 from m-holger/new_stream
...
Add optional parameter allow_nullptr to QPDFObjectHandle::getOwningQPDF
2022-08-29 16:33:19 -04:00
Jay Berkenbilt
2b01a79e87
Fix header ordering in QTC (format code)
2022-08-29 11:55:02 -04:00
m-holger
c53d54b13d
Add optional parameter allow_nullptr to QPDFObjectHandle::getOwningQPDF
...
Also, inline method and add optional parameter error_msg.
2022-08-28 22:15:59 +01:00
m-holger
b0c1ae05a3
Fix commit b45420a
2022-08-27 12:43:49 +01:00
m-holger
fc4feb6f1a
Remove BufferInputSource::Members
2022-08-27 12:19:51 +01:00
m-holger
d6a447b654
Remove ClosedFileInputSource::Members
2022-08-27 12:13:39 +01:00
m-holger
69a5fb7047
Add methods InputSource::fastRead, fastUnRead and fastTell
...
Provide buffered input for QPDFTokenizer.
2022-08-26 23:55:56 +01:00
m-holger
13ef50cd27
Avoid virtual method call in FileInputSource::read
2022-08-25 15:08:03 +01:00
m-holger
a318b203be
Refactor FileInputSource::seek and FileInputSource::unreadCh
...
Avoid building error message each call "just in case".
2022-08-25 15:04:41 +01:00
m-holger
dc5c8b82eb
Remove FileInputSource::Members
2022-08-25 12:42:14 +01:00
m-holger
7108cd7b98
Remove redundant tests in QPDFTokenizer::readToken
2022-08-25 11:32:08 +01:00
m-holger
10fda01b07
In QPDFTokenizer::readToken move call to getToken out of loop
2022-08-25 11:31:45 +01:00
m-holger
e4073ee868
Remove unnecessary string copy in QPDFTokenizer::getToken
2022-08-25 11:31:09 +01:00
m-holger
b45420a980
Remove QPDFTokenizer::unread_char
2022-08-25 11:30:49 +01:00
m-holger
706106dabb
Refactor QPDFTokenizer::betweenTokens()
2022-08-25 11:30:35 +01:00
m-holger
6371b90ae3
Refactor QPDFTokenizer::presentEOF
2022-08-25 11:30:24 +01:00
m-holger
42ed58e446
Integrate booleans and null into state machine in QPDFTokenizer
2022-08-25 11:30:13 +01:00
m-holger
fe33b7ca18
Integrate numbers into state machine in QPDFTokenizer
2022-08-25 11:26:46 +01:00
m-holger
931fbb6156
Integrate names into state machine in QPDFTokenizer
2022-08-25 11:26:38 +01:00
m-holger
a3f3238f37
Split QPDFTokenizer::handleCharacter into individual methods
2022-08-25 11:26:05 +01:00
m-holger
6111a6a424
Refactor QPDFTokenizer::inCharCode
2022-08-25 10:55:45 +01:00
m-holger
e7889ec5dc
Refactor st_top case in QPDFTokenizer::handleCharacter
2022-08-25 10:51:51 +01:00
m-holger
e4fe0d5cf5
Refactor QPDFTokenizer::inHexstring
2022-08-25 10:50:06 +01:00
m-holger
a5d2e88775
Code tidy: replace if with case statement in QPDFTokenizer::inString
2022-08-25 10:43:29 +01:00
m-holger
7c32f6cc2e
Add state st_string_escape in QPDFTokenizer
2022-08-25 10:41:36 +01:00
m-holger
7c5778f999
Add state st_string_after_cr in QPDFTokenizer
2022-08-21 11:13:48 +01:00
m-holger
f29d0a6312
Add state st_char_code in QPDFTokenizer
2022-08-21 11:01:48 +01:00
m-holger
d26b537a7c
Add private method QPDFTokenizer::inString
2022-08-21 02:54:34 +01:00
m-holger
2697ba49bc
Add private method QPDFTokenizer::inHexstring
2022-08-21 02:46:31 +01:00
m-holger
f9530a5815
Code tidy: replace if with case statement in QPDFTokenizer::handleCharacter
2022-08-21 02:38:49 +01:00
m-holger
86ade3f9cd
Add private method QPDFTokenizer::handleCharacter
2022-08-21 02:26:27 +01:00
m-holger
91fb61eda5
Code tidy: replace if with case statement in QPDFTokenizer::presentCharacter
2022-08-21 00:54:41 +01:00
m-holger
cf945eeabf
Avoid shrinking QPDFTokenizer::val and QPDFTokenizer::raw_val
2022-08-20 19:43:00 +01:00
m-holger
45a6100cbb
Inline QUtil functions used by QPDFTokenizer
2022-08-18 15:23:35 +01:00
m-holger
c08bb0ec02
Remove QPDFTokenizer::Members
2022-08-18 13:13:19 +01:00
Jay Berkenbilt
cef6425bca
Disable QTC inside the library by default (fixes #714)
...
This results in measurable performance improvements to packaged binary
libqpdf distributions. QTC remains available for library users and is
still selectively enabled in CI.
2022-08-07 16:20:49 -04:00
Jay Berkenbilt
da71dc6f37
QTC: cache get_env results for improved performance
...
It turns out that QUtil::get_env is particularly expensive on Windows
if there is a large environment. This may be true on other platforms
as well.
2022-08-07 14:23:05 -04:00
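The caching idea in the commit above can be sketched as follows. This is a minimal Python illustration, not the library's C++ code: `cached_get_env` is a hypothetical name, and the sketch assumes it is acceptable for later changes to the environment to go unnoticed once a variable has been looked up.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_get_env(name):
    """Look up an environment variable once and remember the answer.

    Hypothetical helper for illustration only. The first call for a given
    name reads the process environment; later calls return the remembered
    value without touching the environment again.
    """
    return os.environ.get(name)
```

The trade-off is that a value changed after the first lookup is not seen, which is usually fine for switches that are read at startup and never modified.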
Jay Berkenbilt
32e30a3af2
Resolve QPDF{Name,Number} tree helper linker issues (fixes #745)
...
This is a guess...I'm not sure exactly why there are linker issues or
how to reproduce them.
2022-08-07 09:21:01 -04:00
Jay Berkenbilt
b90adb1c6c
Merge pull request #746 from m-holger/smart
...
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 08:41:50 -04:00
m-holger
7c6901bce5
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 10:33:25 +01:00
Jay Berkenbilt
3ec43f055a
Fix parsing comment
2022-08-06 14:24:08 -04:00
Jay Berkenbilt
a3037ca440
Merge pull request #739 from m-holger/getobject
...
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
2022-08-06 14:23:56 -04:00
m-holger
1553868c4a
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
...
For consistency with similar methods, e.g. replaceObject.
2022-08-01 19:22:37 +01:00
m-holger
407b0766b8
Inline QPDFObjectHandle::getObjGen etc
...
Also, make QPDFObjectHandle::isIndirect const.
2022-08-01 15:08:48 +01:00
m-holger
903a86643a
Fix code formatting of QPDF::pushInheritedAttributesToPageInternal
2022-08-01 13:54:51 +01:00
m-holger
0356bcecc5
Tidy QPDF::pushInheritedAttributesToPageInternal
...
Remove unnecessary parameters.
Remove code that is unnecessary as result of a prior call to QPDF::getAllPages.
Avoid clearing and rebuilding of m->all_pages.
2022-08-01 13:29:14 +01:00
m-holger
ff69773b35
Fix warnings in QPDF::getAllPagesInternal
2022-08-01 13:29:14 +01:00
m-holger
9dea7d3080
Tune QPDF::getAllPagesInternal
...
Avoid calling getAllPagesInternal for each /Page object.
2022-08-01 13:29:14 +01:00
m-holger
4ccca20db0
Remove redundant parameter from QPDF::getAllPagesInternal
2022-08-01 13:29:14 +01:00
Jay Berkenbilt
5d63730b93
Clean up documentation
2022-07-31 16:26:02 -04:00
Jay Berkenbilt
12d065c751
Provide a simpler QPDF::writeJSON
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
13cf35ce2f
Use calledgetallpages and pushedinheritedpageresources
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
5f4224f31a
Simplify --json-output
...
Now --json-output just changes defaults. Allow output file with --json.
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
80acfc3826
Fix --json-help to take a version parameter
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
69820847af
Change the output of --json to use "qpdf" instead of "objects"
2022-07-31 15:17:01 -04:00
Jay Berkenbilt
d01c4f8819
Change --json-output format
...
from "qpdf-v2" to "qpdf": [..., ...]
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
bb96499b61
Update docs and prepare QPDF::writeJSON for changes
...
Add additional parameters that will be needed to call QPDF::writeJSON
in partial mode.
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
0e3d4cdc97
Fix/clarify meaning of depth parameter to json write methods
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
4feb10fdaf
Merge pull request #734 from m-holger/nullptr
...
Code tidy : replace 0 with nullptr or true
2022-07-31 08:33:45 -04:00
m-holger
073808aa50
Code tidy : replace 0 with nullptr or true
2022-07-26 13:40:13 +01:00
Jay Berkenbilt
4674c04cb8
JSON schema: support multi-element array validation
2022-07-24 16:44:51 -04:00
Jay Berkenbilt
f8d1ab9462
JSON schema -- accept single item in place of array
...
When the schema wants a variable-length array, allow a single item as
well as allowing an array.
2022-07-24 16:17:03 -04:00
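The rule described in the commit above (a single item is accepted wherever the schema asks for a variable-length array) can be sketched like this. The schema conventions used here are simplifying assumptions for illustration, not the library's actual rule set: a one-element list means "array of that item", a dict means "object with these keys", and anything else is a leaf that accepts any value. The function name `matches` is hypothetical.

```python
def matches(value, schema):
    """Return True if value conforms to a simplified, assumed schema."""
    if isinstance(schema, list) and len(schema) == 1:
        # Variable-length array: accept a real array, or treat a lone
        # item as if it were a one-element array.
        items = value if isinstance(value, list) else [value]
        return all(matches(item, schema[0]) for item in items)
    if isinstance(schema, dict):
        # Object: every key present in the value must be known to the
        # schema and must itself conform (assumption of this sketch).
        return isinstance(value, dict) and all(
            key in schema and matches(child, schema[key])
            for key, child in value.items()
        )
    # Leaf node (for example a description string): accept anything.
    return True
```

With this rule, both `matches(["a", "b"], ["string"])` and `matches("a", ["string"])` succeed, so callers do not have to wrap a lone value in an array.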
Jay Berkenbilt
b3e6d445cb
Tweak "AndGet" mutator functions again
...
Remove any ambiguity around whether old or new value is being
returned.
2022-07-24 15:42:23 -04:00
m-holger
8b4afa428e
Revert making second parameter of QPDFObjGen::QPDFObjGen optional
...
Also, change test for QPDFObjGen::isIndirect to obj != 0.
Delete comment from commit afd35f9.
2022-07-24 16:55:10 +01:00
m-holger
afd35f9a30
Overload StreamDataProvider::provideStreamData
...
Use 'QPDFObjGen const&' instead of 'int, int' in signature.
2022-07-24 16:02:35 +01:00
m-holger
5d0469f1bc
QPDFObjGen : tidy QPDFJob
...
Use QPDFObjGen::unparse where appropriate.
2022-07-24 16:02:35 +01:00
m-holger
4b73d057fb
QPDFObjGen : tidy QPDF_Stream
...
Change method signatures to use QPDFObjGen.
Replace QPDF_Stream::objid and generation with QPDF_Stream::og.
2022-07-24 16:02:35 +01:00
m-holger
f7978db1f6
QPDFObjGen : tidy QPDF private methods
...
Change method signatures to use QPDFObjGen.
Use QPDFObjGen methods where possible.
Remove redundant QPDF::objGenToIndirect.
2022-07-24 16:02:35 +01:00
m-holger
3404ca8ac8
QPDFObjGen : tidy QPDFObjectHandle private methods
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
b123f79dfd
Replace QPDFObjectHandle::objid and generation with QPDFObjectHandle::og
2022-07-24 15:59:49 +01:00
m-holger
c0168cf88c
QPDFObjGen : tidy QPDF::readObjectAtOffset
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
eeb6162f76
Add optional parameter separator to QPDFObjGen::unparse
...
Also, revert inlining of unparse and operator << from commit 4c6640c in
order to avoid exposing QUtil.
2022-07-24 15:41:48 +01:00
Jay Berkenbilt
6f1041afb8
Clarify intent in readObjectAtOffset
...
Rather than using object id -1 to mean "don't care", use object ID 0,
and clarify the difference between that use and indication of a direct
object.
2022-07-24 09:40:11 -04:00
m-holger
4c6640cb45
Inline QPDFObjGen methods
...
ABI breaking change
2022-07-16 14:32:48 -04:00
Jay Berkenbilt
a603c1e395
Run format-code
2022-06-27 12:50:35 -04:00
m-holger
f0a8178091
Refactor QPDFObject creation and cloning
...
Move responsibility for creating shared pointers to objects and cloning from QPDFObjectHandle to QPDFObject.
2022-06-27 12:47:02 -04:00
m-holger
5aa8225f49
Refactor QPDFObjectTypeAccessor and QPDFObjectHandle::dereference
2022-06-27 10:39:04 -04:00
Jay Berkenbilt
0c7c7e4ba4
Track whether certain page modifying methods have been called
...
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
2022-06-25 13:55:45 -04:00
Jay Berkenbilt
25aff0bd52
TODO: abandon (again) and update notes about QPDFPagesTree
2022-06-25 13:26:53 -04:00
Jay Berkenbilt
8a32515a62
Add warnings for some additional page tree repair
2022-06-25 13:25:35 -04:00
Jay Berkenbilt
6c4537885e
Reformat code
2022-06-25 11:11:24 -04:00
m-holger
7836e19747
Code tidy: remove redundant calls to QPDFObjectHandle::isInitialized
2022-06-25 11:10:06 -04:00
m-holger
3b3bcab349
Remove QPDF_Stream::setStreamDescription
2022-06-25 08:26:46 -04:00
m-holger
9eda1fdc41
Remove redundant QPDF_Array::setDescription and QPDF_Dictionary::setDescription
2022-06-25 08:25:58 -04:00
m-holger
e9c1637353
Add private method QPDFObjectHandle::getObjGenAsStr
...
Also, use methods to access objid and generation.
2022-06-25 08:25:32 -04:00
m-holger
97f737a562
Code tidy: QPDFJob::doJSONPageLabels
...
Remove redundant variables pages and next.
2022-06-25 08:24:50 -04:00
Jay Berkenbilt
1eb2f208ec
Use Pl_Function in qpdflogger C API implementation
2022-06-19 09:12:59 -04:00
Jay Berkenbilt
eae75dbe44
Add Pl_Function -- a generic function pipeline
2022-06-19 09:12:29 -04:00
Jay Berkenbilt
bb0ea2f8e7
Add qpdfjob_register_progress_reporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
87412eb05b
Add QPDFJob::registerProgressReporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
3a7ee7e938
Move C-based ProgressReporter helper into QPDFWriter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
8130d50e3b
Add C API to QPDFLogger
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
daef4e8fb8
Add more flexible functions to qpdfjob C API
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
e0720eaa78
Use the default logger for other writes to stdout/stderr
...
When there is no context for writing output or error messages, use the
default logger.
2022-06-18 10:38:50 -04:00
Jay Berkenbilt
83be2191b4
Use "save" logger when saving data to standard output
...
This includes the output PDF, streams from --show-object and
attachments from --save-attachment. This also enables --verbose and
--progress to work with saving to stdout.
2022-06-18 09:54:40 -04:00
Jay Berkenbilt
641e92c6a7
QPDF, QPDFJob: use QPDFLogger instead of custom output streams
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f1f711963b
Add and test QPDFLogger class
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f588d74140
Add integer types to Pipeline::operator<<
2022-06-18 09:02:55 -04:00
m-holger
057bd659bc
Code tidy: remove redundant variable in QPDF::writeJSON
2022-06-05 18:46:21 -04:00
Jay Berkenbilt
0bd908b550
Update documentation for qpdf JSON v2
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
b7bbf12e85
In json mode, reveal recovered user password when otherwise unavailable
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
f049a77c59
Add additional information when listing attachments
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
04fc7c4bea
Add conversions to ISO-8601 date format
2022-05-30 20:03:08 -04:00
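As an illustration of the kind of conversion the commit above refers to, here is a small Python sketch that turns a full-form PDF date string such as `D:20220530200308-04'00'` into ISO-8601. The function name is hypothetical and the sketch makes two assumptions: only the complete form with seconds is handled, and a missing time zone is reported as UTC.

```python
import re

# D:YYYYMMDDHHmmSS followed by an optional "Z" or a +HH'mm' / -HH'mm' offset.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'(\d{2})'?)?$"
)


def pdf_date_to_iso8601(pdf_date):
    """Convert a full-form PDF date string to an ISO-8601 timestamp."""
    m = _PDF_DATE.match(pdf_date)
    if m is None:
        raise ValueError("not a full-form PDF date: " + repr(pdf_date))
    year, month, day, hour, minute, second, _zulu, sign, tz_h, tz_m = m.groups()
    stamp = f"{year}-{month}-{day}T{hour}:{minute}:{second}"
    if sign is None:
        # "Z" or no offset at all; treating the latter as UTC is an
        # assumption of this sketch.
        return stamp + "Z"
    return f"{stamp}{sign}{tz_h}:{tz_m}"
```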
Jay Berkenbilt
27a42c16c7
Change default decode level to "none" with --json-output
2022-05-21 17:51:34 -04:00
Jay Berkenbilt
752f43d4e4
Allow empty b: binary JSON strings
2022-05-21 17:36:32 -04:00
Jay Berkenbilt
05460d405c
Format code
2022-05-21 16:11:42 -04:00
m-holger
6c69a747b9
Code clean up: use range-style for loops wherever possible
...
Remove variables obsoleted by commit 4f24617.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt
c56a9ca7f6
JSON: Fix large file support
2022-05-21 09:43:45 -04:00
Jay Berkenbilt
47c093c48b
Replace std::regex with validators for better performance
2022-05-21 08:43:21 -04:00
Jay Berkenbilt
9b2eb01e25
Exercise object description in tests
2022-05-20 14:23:32 -04:00
Jay Berkenbilt
6c2fb5b8f0
Add test for bad data and bad datafile
2022-05-20 13:33:30 -04:00
Jay Berkenbilt
d065098089
Test --update-from-json
2022-05-20 11:10:12 -04:00
Jay Berkenbilt
ef955b04b5
Bug fix: don't clobber stream length with replaceDict
2022-05-20 11:09:45 -04:00
Jay Berkenbilt
3eb77a7004
JSON: detect duplicate dictionary keys while parsing
2022-05-20 10:13:15 -04:00
Jay Berkenbilt
6d4e3ba8a4
Test (and fix) handling of dangling references
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
5a2aa59479
Bug fix: isReserved() true for indirect reference to reserved object
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
35b1e1c493
Explicitly test ignoring unknown keys in JSON input
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
dc8df962d8
Make version default to latest for --json-output (like --json)
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
6c7326b290
JSON fix: correctly parse UTF-16 surrogate pairs
2022-05-20 09:16:25 -04:00
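For context on the surrogate-pair fix above: a JSON escape such as `\ud83d\ude00` encodes one code point outside the Basic Multilingual Plane as two UTF-16 code units. The standard arithmetic for combining them can be sketched as follows; the function name is hypothetical and this is not the parser's actual code.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: " + hex(high))
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: " + hex(low))
    # Each surrogate carries 10 bits; the result is offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

For example, the pair `0xD83D`, `0xDE00` yields `0x1F600`. A parser that decodes each `\uXXXX` escape independently would instead emit two invalid lone surrogates.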
Jay Berkenbilt
6f43bf8de3
Major rework -- see long comments
...
* Replace --create-from-json=file with --json-input, which causes the
  regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
  data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
  stream data.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
23fc6756f1
Add QUtil::FileCloser to the public API
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
0fe8d44762
Support stream data -- not tested
...
There are no automated tests yet, but committing work so far in
preparation for some refactoring.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
63c7eefe9d
replaceStreamData: accept uninitialized filter/decode_parms
...
These mean to leave the original values alone. This is needed for
reconstructing streams from JSON given that the stream data and stream
dictionary may appear in any order in the JSON.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
56f1b411fe
Back out fluent QPDFObjectHandle methods. Keep the andGet methods.
...
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
7e7a9c4379
Parse objects; stream data is not yet handled
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
9064542b5f
Add private methods for reserving specific objects
2022-05-20 07:54:09 -04:00
Jay Berkenbilt
7fa5d1773b
Implement top-level qpdf json parsing
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
8d42eb2632
Add scaffolding for QPDF JSON reactor
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
4fe2e06b47
Add --create-from-json and --update-from-json arguments
...
Also add stubs for top-level QPDF methods (createFromJSON,
updateFromJSON)
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
9a0e9a1a9e
Remove offset from missing /Root error
...
The last offset is irrelevant to not being able to find /Root.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
051ae7c282
Improve handling of replacing stream data with empty strings
...
When an empty string was passed to replaceStreamData, the code was
passing a null pointer to memcpy. Since a 0 size was also passed, this
was harmless, but it triggers sanitizer errors. The code properly
handles a null pointer as the buffer in other places.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
60ec94a7c3
Add QUtil::is_long_long
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
4c7cfd5cbc
JSON reactor: improve handling of nested containers
...
Call the parent container's item method before calling the child
item's start method so we can easily know the current nesting level
when nested items are added.
2022-05-14 17:35:06 -04:00
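The callback ordering described in the commit above can be sketched with a small event recorder. The event names and the `walk` function are hypothetical and do not reproduce the library's reactor interface; the point is only that the parent's "item" event is emitted before the nested container's "start" event, so a handler already knows the nesting level when the child begins.

```python
def walk(value, events, depth=0):
    """Record start/item/end events for a JSON container (dict or list)."""
    kind = "dict" if isinstance(value, dict) else "array"
    events.append(("start", kind, depth))
    children = value.items() if isinstance(value, dict) else enumerate(value)
    for key, child in children:
        # The parent's item event fires first ...
        events.append(("item", key, depth))
        # ... and only then does a nested container announce its start.
        if isinstance(child, (dict, list)):
            walk(child, events, depth + 1)
    events.append(("end", kind, depth))
```

Walking `{"a": [1]}` records the dictionary's item event for key `a` at depth 0 before the array's start event at depth 1.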
Jay Berkenbilt
2a2f7f1bba
Add maxobjectid to JSON
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
e9390aeaaa
Add --to-json option
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
c76536dd9a
Implement JSON v2 output
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
15272662f6
Fix typo in json output key name
...
moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
1bc8abfdd3
Implement JSON v2 for Stream
...
Not fully exercised in this commit
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
3246923cf2
Implement JSON v2 for String
...
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
16f4f94cd9
Prepare code for JSON v2
...
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt
a9fbbd5dca
Objectinfo json: write incrementally and in numeric order
...
This script was used on test data:
----------
#!/usr/bin/env python3
import json
import sys
import re
def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))
for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
948de60990
Objects json: write incrementally and in numeric order
...
The following script was used to adjust test data:
----------
#!/usr/bin/env python3
import json
import sys
import re
def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))
for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
f50274ef46
Pages json: write each page incrementally
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
dc9b7287cd
Top-level json: write incrementally
...
This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.
----------
#!/usr/bin/env python3
import json
import sys
def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))
for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7f65a5c21f
Test json against schema only on demand
...
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
a3c9980395
Add next to Pl_String and fix comments
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
b361c5ce19
Add --test-json-schema command-line option
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7604ac5cb2
QPDFJob: have doJSON write to a pipeline
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
0500d4347a
JSON: add blob type that generates base64-encoded binary data
2022-05-06 19:14:52 -04:00
Jay Berkenbilt
05fda4afa2
Change JSON parser to parse from an InputSource
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e5f3910c3e
Add new FileInputSource constructors
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e259635986
JSON: add write methods and implement unparse() in terms of those
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
8b25de24c9
Make "objects" and "pages" consistent in JSON output
2022-05-04 08:32:44 -04:00
Jay Berkenbilt
6b576797cd
Don't call pushInheritedAttributesToPage in json mode
...
We used to have to do that, but for quite some time, the code that
gets images has no longer required it.
2022-05-04 07:11:13 -04:00
Jay Berkenbilt
f4206a0938
Add new Pl_String Pipeline
2022-05-03 18:54:51 -04:00
Jay Berkenbilt
16139d97c8
Add new Pl_OStream Pipeline
2022-05-03 18:54:51 -04:00
Jay Berkenbilt
21d6e3231f
Make use of the new Pipeline methods in some places
2022-05-03 18:31:23 -04:00
Jay Berkenbilt
f1c6bb97db
Add new Pipeline convenience methods
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
59f3e09edf
Make Pipeline::write take an unsigned char const* (API change)
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
62bf296a9c
Make assert handling less error-prone
...
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
92b692466f
Remove remaining incorrect assert calls from implementation
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
3d9bac43da
Add internal Pl_Base64
...
Bidirectional base64; will be used by JSON v2.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
6724a362c3
Move generate_auto_job to the top-level CMakeLists.txt
2022-05-03 08:39:50 -04:00
Jay Berkenbilt
8d2a0eda5a
Add reactors to the JSON parser
2022-05-01 19:55:52 -04:00
Jay Berkenbilt
72e5c73419
Limit parser depth for json parser
2022-05-01 12:56:22 -04:00
Jay Berkenbilt
e34dbbfa18
Spell check
2022-05-01 12:56:22 -04:00
Jay Berkenbilt
8ccd3a8a89
Mark weak encryption with API changes (fixes #576)
2022-04-30 17:24:15 -04:00
Jay Berkenbilt
2213ed0c3d
Remove deprecated (pre-8.4.0) encryption APIs
2022-04-30 17:23:58 -04:00
Jay Berkenbilt
cff26040d8
Using insecure crypto from the CLI is now an error by default
2022-04-30 17:23:58 -04:00
Jay Berkenbilt
ce19471f18
Add comments around non-security-related uses of MD5
2022-04-30 14:15:07 -04:00
Jay Berkenbilt
c365a26e9d
Revert "Remove QPDFObjectHandle::replaceOrRemoveKey"
...
This reverts commit dc059560e7.
I changed my mind. There's no harm in leaving it deprecated for a
release cycle.
2022-04-30 14:15:07 -04:00
Jay Berkenbilt
dc059560e7
Remove QPDFObjectHandle::replaceOrRemoveKey
...
See ChangeLog for rationale for not deprecating it as originally
planned.
2022-04-30 13:39:45 -04:00
Jay Berkenbilt
4f24617e1e
Code clean up: use range-style for loops wherever possible
...
Where not possible, use "auto" to get the iterator type.
Editorial note: I have avoided this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more efficient as fewer temporary copies are
being made.
m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt
7f023701dd
Formatting: remove space in range-style for loops
...
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt
2878c186bf
Use fluent appendItem
2022-04-30 10:54:16 -04:00
Jay Berkenbilt
ab9d557cb0
Use fluent replaceKey
2022-04-29 20:39:54 -04:00
Jay Berkenbilt
d8fdf632a9
Use replaceKeyAndGet in a few places in existing code
2022-04-29 20:28:02 -04:00
Jay Berkenbilt
e80fad86e9
Add new QPDFObjectHandle methods for more fluent programming
2022-04-29 20:09:10 -04:00
Jay Berkenbilt
d0b7cc8ac6
QPDFJob json: make removeAttachment take an array (fixes #693)
2022-04-24 13:06:19 -04:00
Jay Berkenbilt
63c5a56f38
Fix build logic around generate_auto_job
...
It was being run at configuration time, not build time.
2022-04-24 13:06:16 -04:00
Jay Berkenbilt
08ba21cf49
Fix some bugs around null values in dictionaries
...
Make it so that a key with a null value is always treated as not being
present. This was inconsistent before.
2022-04-24 10:08:32 -04:00
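The rule stated in the commit above (a key whose value is null is treated as absent) can be sketched with a toy dictionary class. The class and method names below are hypothetical and only loosely echo the library's naming; this is an illustration of the semantics, not the C++ implementation. It also shows why a separate "replace or remove" operation becomes redundant once storing null means removal.

```python
class PdfDict:
    """Toy dictionary in which a null (None) value means 'key not present'."""

    def __init__(self):
        self._items = {}

    def replace_key(self, key, value):
        # Storing null is the same as removing the key, so no separate
        # replace-or-remove operation is needed.
        if value is None:
            self._items.pop(key, None)
        else:
            self._items[key] = value

    def has_key(self, key):
        return self._items.get(key) is not None

    def get_key(self, key):
        # Absent keys and null-valued keys are indistinguishable.
        return self._items.get(key)

    def keys(self):
        return {k for k, v in self._items.items() if v is not None}
```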
Jay Berkenbilt
4be2f36049
Deprecate replaceOrRemoveKey -- it's the same as replaceKey
2022-04-24 09:31:32 -04:00
Jay Berkenbilt
4925f0d18c
Have dictionary/streams mutators take const& where possible
2022-04-24 09:05:50 -04:00
Jay Berkenbilt
68e721981a
Add new QPDF::warn that takes most of QPDFExc's arguments
2022-04-23 18:25:43 -04:00
Jay Berkenbilt
22b35c4928
Expose QUtil::get_next_utf8_codepoint
2022-04-23 18:25:43 -04:00
Jay Berkenbilt
5bbb0d4c30
Replace switch statements with static map initializers
...
Character transcoding from Unicode to single-byte characters used
hard-coded switch statements because the code predated our adoption of
C++11. Now we have thread-safe, static initialization of map literals,
so use that instead.
2022-04-23 18:25:43 -04:00
Jay Berkenbilt
ce5c3bcad8
QPDFJob: pass capture output streams through to underlying QPDF
2022-04-18 11:24:17 -04:00
Jay Berkenbilt
75fe4f60c3
Use anonymous namespaces for file-private classes
2022-04-16 13:35:27 -04:00
Jay Berkenbilt
80ed3076a0
Remove deprecated name/number tree constructors
...
Remove the name/number tree object helper constructors that don't take
a QPDF&.
2022-04-16 13:13:15 -04:00
Jay Berkenbilt
496ca2e4dc
Remove QPDFAcroFormDocumentHelper::copyFieldsFromForeignPage
2022-04-16 13:12:07 -04:00
Jay Berkenbilt
6df6260751
Change default --json from 1 to latest
2022-04-16 12:57:33 -04:00
Jay Berkenbilt
cdd0b4fb7d
Use = default and = delete where possible in classes
2022-04-16 11:39:14 -04:00
Jay Berkenbilt
2a7d2b63c2
Make ABI-breaking changes that don't modify API at all
...
* Merge overloaded functions by adding default values
* Remove non-const methods that are identical to const methods
2022-04-16 10:41:46 -04:00
Jay Berkenbilt
ce86307a1a
Fix typo in error message
2022-04-10 16:54:23 -04:00
Jay Berkenbilt
90cfe80bac
Clean up/fix DLL.h
...
* Change DLL_EXPORT to libqpdf_EXPORTS (internal to the build). The
  new name is cmake's default, is more conventional, and is less
  likely to clash with other symbols.
* Add QPDF_DLL_PRIVATE for non-Windows
* Make logic around when to define QPDF_DLL et al more explicit
* Add detailed comments
2022-04-10 16:52:36 -04:00
Jay Berkenbilt
07edf96440
Remove methods of private classes from ABI
...
Prior to the cmake conversion, several private classes had methods
that were exported into the shared library so they could be tested
with libtests. With cmake, we build libtests using an object library,
so this is no longer necessary. The methods that are disappearing from
the ABI were never exposed through public headers, so no code should
be using them. Removal had to wait until the window for ABI-breaking
changes was open.
2022-04-09 17:33:29 -04:00
Jay Berkenbilt
128e41648f
Remove PointerHolder.hh from other than public header files
...
Increase to POINTERHOLDER_TRANSITION=4
2022-04-09 17:33:29 -04:00
Jay Berkenbilt
a68703b07e
Replace PointerHolder with std::shared_ptr in library sources only
...
(patrepl and cleanpatch are my own utilities)
patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/*.hh
patrepl s/PointerHolder/std::shared_ptr/g libqpdf/*.cc
patrepl s/make_pointer_holder/std::make_shared/g libqpdf/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt
08fb583449
Remove accidentally committed file
2022-04-09 14:37:00 -04:00
Jay Berkenbilt
59834db472
Add documentation for code formatting and contribution guidelines
2022-04-09 12:25:08 -04:00
Jay Berkenbilt
77e889495f
Update some code manually to get better formatting results
...
Add comments to force line breaks, parenthesize function arguments
that are concatenated strings, etc. -- these kinds of changes improve
clang-format's results and also cause emacs cc-mode to match
clang-format. After this type of change, most of the time, when
clang-format and emacs disagree, clang-format is better.
2022-04-05 14:56:19 -04:00
Jay Berkenbilt
12f1eb15ca
Programmatically apply new formatting to code
...
Run this:
for i in **/*.cc **/*.c **/*.h **/*.hh; do
    clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt
97fc98901c
Protect gnutls headers from clang-format rearranging them
2022-04-04 08:05:39 -04:00
Jay Berkenbilt
33caed4f17
Exclude formatting on embedded native crypto
2022-04-03 17:58:36 -04:00
Jay Berkenbilt
f8e97e0ed5
Put spaces around version constraint in pkg-config ( fixes #677 )
...
Also add a pkg-config runtime test that would have caught the error.
2022-03-23 10:52:40 -04:00
Jay Berkenbilt
6dcb26d21e
Fix test for whether atomic library is needed
...
Some platforms need it for atomic<long long> but not for atomic<int>.
2022-03-19 18:19:44 -04:00
Jay Berkenbilt
820a3f04fd
Remove "lt-" workarounds
...
The executables that libtool built invoked the underlying binary with
an "lt-" prefix. The code contained numerous workarounds for testing,
which can now be removed.
2022-03-18 19:53:18 -04:00
Jay Berkenbilt
acdf5b2e7a
Update process for ABI testing
2022-03-18 19:53:18 -04:00
Jay Berkenbilt
70d0d0889b
Remove old build files
2022-03-18 19:53:18 -04:00
Jay Berkenbilt
b8aff90997
Add cmake configuration files
2022-03-18 19:53:18 -04:00
Jay Berkenbilt
3331e8921c
Switch variables to cmake in qpdf-config.h
2022-03-18 19:53:18 -04:00
Jay Berkenbilt
f030789104
Rename bits_include.cc to qpdf/bits_functions.hh
...
It's better to just make it a .hh file to reduce confusion.
2022-03-07 18:01:27 -05:00
Jay Berkenbilt
6dd8465948
TODO: solidify plans for code formatting
2022-02-26 12:08:58 -05:00
Jay Berkenbilt
6aa58d51be
Rename bits.icc to bits_include.cc
2022-02-26 12:08:58 -05:00
Jay Berkenbilt
99393e6ab7
Shorten coverage case name
...
This is so it will fit on one line after a qtest upgrade allows us to
split lines.
2022-02-26 10:18:23 -05:00
Jay Berkenbilt
03bc6535bd
generate_auto_job: protect generated files from formatting
2022-02-26 09:17:51 -05:00
Jay Berkenbilt
ae17402c52
Move default values to constexpr
...
This was mainly to get comments about defaults out of constructor
initializer lists where they're fragile when a code formatter is being
used.
2022-02-26 08:16:12 -05:00
Jay Berkenbilt
36794a60cf
Allow \/ in a json string
2022-02-25 11:42:50 -05:00
Jay Berkenbilt
56b4d5a610
Use val.at instead of val[]
2022-02-22 08:40:49 -05:00
Jay Berkenbilt
f7ac591590
Recognize explicit UTF-8 strings ( fixes #654 )
2022-02-22 08:10:05 -05:00
Jay Berkenbilt
3b4b9efd21
Fix autogeneration of job.sums
2022-02-22 08:10:05 -05:00
Jay Berkenbilt
31b45b0fd4
Fix logic error with Tf when generating appearances ( fixes #655 )
2022-02-18 13:46:35 -05:00
Jay Berkenbilt
3e2109ab37
Remove special case for 0xad for 10.6.2.
2022-02-16 06:52:05 -05:00
Jay Berkenbilt
e810fe678a
Fix asymmetry between newUnicodeString and getUTF8Value
2022-02-15 19:22:35 -05:00
Jay Berkenbilt
a478cbb6dc
Silently/transparently recognize UTF-16LE as UTF-16 ( fixes #649 )
...
The PDF spec only allows UTF-16BE, but most readers seem to accept
UTF-16LE as well, so now qpdf does too.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt
fbd3e56da7
Ignore -- at the top level arg parser ( fixes #652 )
...
This was unintended behavior that was added back for backward
compatibility. It is intentionally undocumented.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt
1065bbb016
Handle odd PDFDoc codepoints in UTF-8 during transcoding ( fixes #650 )
...
There are codepoints in PDFDoc that are not valid UTF-8 but map to
valid UTF-8. We were handling those correctly with bidirectional
mapping.
However, if those same code points appeared in UTF-8, where they have
no meaning, they were left as fixed points when converting to PDFDoc,
where they do have meaning. This change recognizes them as errors.
2022-02-15 08:32:38 -05:00
m-holger
4ff837f099
Fix tests for Form XObjects
...
Remove test for type == /XObject in QPDFObjectHandle::isFormXObject
as type value is optional (as per spec 8.10.2).
Replace code to test for /Form in QPDFJob::shouldRemoveUnreferencedResources
with a call to isFormXObject.
2022-02-10 19:47:37 -05:00
Jay Berkenbilt
235c89e037
Fix one more PDF doc encoding error for 10.6 release ( fixes #637 )
2022-02-09 05:47:58 -05:00
Jay Berkenbilt
d501e1c0d4
Only update output version from files used as input
...
If we're opening a PDF file to copy its encryption information or
attachments, its version doesn't need to influence the output version.
2022-02-08 13:49:22 -05:00
Jay Berkenbilt
f91b21c7d4
Preserve input PDF version on pages/split-pages ( fixes #610 )
2022-02-08 12:34:14 -05:00
Jay Berkenbilt
cfd5147d92
Add QPDF::getVersionAsPDFVersion
2022-02-08 12:34:14 -05:00
Jay Berkenbilt
8082af09be
Add PDFVersion class
2022-02-08 12:34:14 -05:00
Jay Berkenbilt
cb769c62e5
WHITESPACE ONLY -- expand tabs in source code
...
This commit expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.
In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespace changes to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt
c62e8e2b28
Update for clean compile with POINTERHOLDER_TRANSITION=2
2022-02-07 17:38:22 -05:00
Jay Berkenbilt
3f22bea084
Use make_array_pointer_holder
...
This will be able to be replaced with QUtil::make_shared_array
2022-02-07 17:38:22 -05:00
Jay Berkenbilt
40f1946df8
Replace PointerHolder arrays with shared_ptr arrays where possible
...
Replace PointerHolder arrays wherever it can be done without breaking ABI.
2022-02-07 17:38:22 -05:00
Jay Berkenbilt
df2f5c6a36
Add QUtil::make_shared_array to help with PointerHolder transition
2022-02-07 14:08:46 -05:00
Jay Berkenbilt
cfaae47dc6
Add getBufferSharedPointer() to Pl_Buffer and QPDFWriter
2022-02-07 12:53:28 -05:00
m-holger
5901fcad4c
C-API expose QPDFObjectHandle::getKeyIfDict
2022-02-06 11:21:15 -05:00
m-holger
8371060340
Add method QPDFObjectHandle::getKeyIfDict
2022-02-06 11:21:15 -05:00
m-holger
2ed5f49a79
C-API expose QPDFObjectHandle::getValueAs... accessors
2022-02-05 19:40:30 -05:00
Jay Berkenbilt
af3f74de8c
Stop using std::iterator ( fixes #618 )
...
Create the typedefs directly in iterators rather than deriving from
the deprecated std::iterator class.
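A minimal sketch of the pattern described above, assuming a hypothetical
CountingIterator that is not part of qpdf: the five member typedefs are
declared directly in the class instead of being inherited from
std::iterator, and std::iterator_traits finds them the same way.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical iterator for illustration only; not a qpdf class.
class CountingIterator
{
  public:
    // Declared directly rather than inherited from std::iterator.
    typedef std::input_iterator_tag iterator_category;
    typedef int value_type;
    typedef std::ptrdiff_t difference_type;
    typedef int const* pointer;
    typedef int const& reference;

    explicit CountingIterator(int v) :
        value(v)
    {
    }
    reference operator*() const { return value; }
    CountingIterator& operator++()
    {
        ++value;
        return *this;
    }
    CountingIterator operator++(int)
    {
        CountingIterator t(*this);
        ++value;
        return t;
    }
    bool operator==(CountingIterator const& o) const { return value == o.value; }
    bool operator!=(CountingIterator const& o) const { return value != o.value; }

  private:
    int value;
};
```

Standard algorithms such as std::distance only look at the typedefs, so
they should not care whether the names were inherited or written out.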
2022-02-05 11:29:25 -05:00
Jay Berkenbilt
7fb22740e1
Add operator ""_qpdf for creating QPDFObjectHandle literals
2022-02-05 11:29:25 -05:00
Jay Berkenbilt
b48a0ff0e8
Add qpdf_empty_pdf to C API
2022-02-05 11:29:25 -05:00
Jay Berkenbilt
8cf7f2bfb5
API contract: qpdf_get_qpdf_version() returns a static
2022-02-05 11:24:56 -05:00
Jay Berkenbilt
5f3f78822b
Improve use of std::unique_ptr
...
* Use unique_ptr in place of shared_ptr in some cases
* unique_ptr for arrays does not require a custom deleter
* use std::make_unique (c++14) where possible
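As a rough illustration of the points above (the helper names are
hypothetical and do not come from qpdf), the array form of
std::unique_ptr needs no custom deleter, while a std::shared_ptr<T> that
owns an array has to be given one:

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helpers for illustration only; not qpdf functions.

// unique_ptr<T[]> calls delete[] on its own, so no custom deleter is
// needed. make_unique<T[]>(n) value-initializes, so the buffer starts
// out zeroed. Requires C++14.
std::unique_ptr<char[]>
make_unique_buffer(std::size_t n)
{
    return std::make_unique<char[]>(n);
}

// A shared_ptr<T> that owns an array must be told to use delete[].
std::shared_ptr<char>
make_shared_buffer(std::size_t n)
{
    return std::shared_ptr<char>(new char[n](), std::default_delete<char[]>());
}
```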
2022-02-05 11:24:56 -05:00
m-holger
e58b1174c7
Add new QPDFObjectHandle::getValueAs... accessors
2022-02-05 11:24:35 -05:00
Jay Berkenbilt
cfaa2de804
Update copyright for 2022
2022-02-04 16:36:22 -05:00
Jay Berkenbilt
2229e37e88
Add a blank line after the first header included in each source
2022-02-04 16:31:31 -05:00
Jay Berkenbilt
8eab616d62
Add qpdf version macros to qpdf/DLL.h
2022-02-04 13:41:01 -05:00
Jay Berkenbilt
abc300f05c
Replace containers of PointerHolder with containers of std::shared_ptr
...
None of these are in the public API.
2022-02-04 13:12:37 -05:00
Jay Berkenbilt
f0c2e0ef1e
JSON: use std::shared_ptr internally
2022-02-04 13:12:37 -05:00
Jay Berkenbilt
9044a24097
PointerHolder: deprecate getPointer() and getRefcount()
...
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.
This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
m-holger
95e7d36b7a
C-API add two binary UTF8 functions
...
add qpdf_oh_new_binary_unicode_string and qpdf_oh_get_binary_utf8_value
2022-02-04 13:10:51 -05:00
m-holger
1925ffd467
Fix --check-linearization of non-linearized files ( fixes #615 )
2022-02-04 06:52:38 -05:00
m-holger
4d507251fe
Change QPDFExc type to unsupported for /Standard filter
2022-02-02 14:07:32 -06:00
Jay Berkenbilt
42bff9f458
QPDFJob: let initializeFromArgv just take argv, not argc
...
Let argv be a null-terminated array. There is already code that
assumes this, and it makes it easier to construct the arguments.
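A sketch of what a null-terminated argv convention allows, using a
hypothetical count_args helper (qpdf's own handling may differ): the
argument count can be recovered from the array itself, so callers only
have to build the array.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only; not a qpdf function.
// Walks the array until it reaches the terminating null pointer.
std::size_t
count_args(char const* const argv[])
{
    std::size_t n = 0;
    while (argv[n] != nullptr) {
        ++n;
    }
    return n;
}
```

With this convention, an array such as
{"qpdf", "--empty", "out.pdf", nullptr} can be built from string
literals and passed along without a separate argc.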
2022-02-01 13:50:58 -05:00
Jay Berkenbilt
b02d37bc0a
Make QPDFArgParser accept const argv
...
This makes it much more convenient to use the initializeFromArgv
functions since you can use string literals.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt
bc4e2320e7
Add qpdfjob-c.h -- simple C API around parts of QPDFJob
2022-02-01 09:04:55 -05:00
Jay Berkenbilt
03e67a28fe
Move QTC::TC for qpdf to QPDFJob
...
All the coverage cases that used to be in qpdf.cc are now in
QPDFJob*.cc. It doesn't really matter, but better to follow the
convention of starting with the class that includes the coverage call.
2022-02-01 09:04:55 -05:00
Jay Berkenbilt
b42f3e1d15
Move more code from qpdf.cc into QPDFJob
2022-02-01 09:04:55 -05:00
Jay Berkenbilt
cc5485dac1
QPDFJob: documentation
2022-02-01 09:04:55 -05:00
Jay Berkenbilt
5a7bb3474e
generate_auto_job: generate overloaded config decls for optional
...
For optional parameter/choices, generate an overloaded config method
that takes no arguments. This makes it possible to convert from a bare
argument to one that takes an optional parameter without breaking
binary compatibility.
2022-02-01 09:04:55 -05:00
Jay Berkenbilt
5953116634
Clean up documentation and help around json options
2022-01-31 18:40:11 -05:00
Jay Berkenbilt
606420ab54
Tweak short text for job schema help
2022-01-31 18:26:03 -05:00
Jay Berkenbilt
21b9290785
QPDFJob json: make bare arguments expect the empty string
...
Changing from bool requiring true to string requiring the empty string
is more consistent with the CLI and makes it possible to add an
optional parameter or choices later without breaking compatibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt
ea96330bb6
QPDFJob json: flatten json structure
...
Flatten everything to make it easier to map command-line flags to
json. The old structure was an illusion anyway because there was no
mechanism to enforce that things were in the right place. This also
helps with future flexibility.
2022-01-31 18:16:09 -05:00
Jay Berkenbilt
47f33cec25
QPDFJob: add test cases
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
e3506253f1
Add optional version to --json
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
b4fb9b4ec3
Remove outdated comments
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
caa00556cf
Change filename or path to file in json and QPDFJob
...
Use "file" consistently for specifying a file path. We use "filename"
when adding attachments for a completely different purpose.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
1a3ed1ee85
job json: move deterministic-id into output options
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
81b6314cb5
QPDFJob: fix logic errors in handling arrays
...
The code was assuming everything was happening inside dictionaries.
Instead, make the dictionary key handler creation explicit only when
iterating through dictionary keys.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
f99e0af49c
QPDFJob: rename function that returns job schema
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
1355d95d08
QPDFJob: partial mode for initializeFromJson
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
cd30f626fe
QPDFJob: remove from json a few things that only make sense from CLI
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
eeffc69d87
QPDFJob_json: implement handlers for pages
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
fa9676557e
QPDFJob: incorporate change to JSONHandler for array start function
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
3b60224bae
JSONHandler: pass JSON object to array start function
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
b74e7989c3
QPDFJob_json: implement handlers except pages
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
e01bbccb40
QPDFJob: incorporate change to JSONHandler for dict start function
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
ce3406e93f
JSONHandler: pass JSON object to dict start function
...
If some keys depend on others, we have to check up front since there
is no control of what order key handlers will be called. Anyway, keys
are unordered in json, so we don't want to depend on ordering.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
11a86e444d
QPDFJob: autogenerate json init and declarations
...
We still have to go through and implement the handlers.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
842a9d928e
QPDFJob_json: add code to register handlers
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
967a2b9f28
Fix typo in error message
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
a7b0aec2cf
Fix false compiler warning in debug mode
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
28278e27ea
Keep JSONHandler and QPDFArgParser private
...
Since the functionality of argument parsing has moved into QPDFJob,
these classes no longer need to be public. Their methods still have to
be in the library's binary interface so they can be tested in libtests.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
0f05cae66a
QPDFJob: generate json decl and init file skeletons
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
8a9100f674
QPDFJob: add checkConfiguration to Config
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
0c8e9e5912
QPDFJob: prepare for automatically generated json handlers
2022-01-31 15:57:45 -05:00
Jay Berkenbilt
7eeaf58bb7
More doc tweaks
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
7097f29019
More editorial changes from m-holger + spell check
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
0e909bab8e
Improve top-level help information
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
0364024781
Use QPDFUsage exception for cli, json, and QPDFJob errors
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
f3d68aa5a0
Incorporate editorial changes from m-holger
2022-01-30 13:11:03 -05:00
m-holger
7dd5f31230
Fix typos in manual
...
Fix typos in cli.rst
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
c62ab2ee9f
QPDFJob: use pointers instead of references for Config
...
Why? The main methods that create them return smart pointers so that
users can initialize them when needed, which you can't do with
references. Returning pointers instead of references makes for a more
uniform interface.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
03f3369f35
QPDFJob: use manually named end functions for Config classes
...
Use named functions rather than just end() for clarity.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
9013b7ca91
QPDFJob: move placeholder json to a separate source file
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
edef2cd330
QPDFJob: make remaining members private
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
f2409f4fca
Minor cleanup
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
01969c78a8
QPDFJob: move private members into Members
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
cf6c56a463
QPDFJob: use config API in place-holder json
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
2c7b583b3a
QPDFJob: move input/output handling into config
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1258054543
QPDFJob: eliminate most access to QPDFJob members from ArgParser
...
All that's left now is input and output handling.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
901e3e4fbf
QPDFArgParser: remove unused copyFromOtherTable
...
This was used, but it no longer is, so let's not keep the extra
complexity around.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
700dfa40d3
QPDFJob: convert encryption handlers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
b5d41b16b8
QPDFJob: convert under/overlay and rotate
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1cc532dc91
QPDFJob: move some helpers from ArgParser to QPDFJob
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
95d127641c
QPDFJob: move more top-level trivial handlers into config
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
41c5af8f26
QPDFJob: convert pages
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
9373881cca
Add QPDFJob::ConfigError exception
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
0a354af02c
QPDFJob: convert AddAttachment handlers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
bf255ccc89
QPDFJob: convert password in two tables
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
21c897aad0
QPDFJob: convert a flag in other than the main table
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
f60526aff9
QPDFJob: start changing generation for trivial config handlers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
b4b0df0df9
QPDFJob: convert trivial functions to config API
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
79187e585a
QPDFJob: begin configuration API with verbose
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
160e869d1e
Mark trivial arg functions
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
558f043d91
QPDFJob: TRUE -> true
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
fcdbc8a102
Move doFinalChecks to QPDFJob::checkConfiguration
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
c4e56fa5f4
QPDFJob: make createsOutput callable before run()
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
564dc03607
QPDFJob: start real API
...
Create QPDFJob_options.cc to hold API implementation functions.
Reorganize a little in preparation for moving public member variables
private and creating the real QPDFJob API that will be used by callers
as well as the argv/json initialization methods.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1d099ab743
QPDFJob: placeholder for initializeFromJson
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1c8d53465f
Incorporate job schema generation into generate_auto_job
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
b9cd693a5b
QPDFJob: allocate QPDFArgParser on stack
...
The previous commits have removed all references to memory from
QPDFArgParser from QPDFJob. This commit removes the constraint that
QPDFArgParser remain in scope. This is a prerequisite to allowing JSON
as an alternative way to initialize QPDFJob and to initialize it
directly using a public API.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
d526d4c17f
QPDFJob: convert Under/Overlay to use shared pointers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
88891a75a2
QPDFJob: convert Under/Overlay ranges to strings
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
e48bfce930
QPDFJob: convert PageSpec to use shared pointer
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
e4905983d2
QPDFJob: convert outfilename to shared pointer
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
e5edfc786f
QPDFJob: convert infilename to shared pointer
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
ee7824cf28
QPDFJob: convert encryption_file args to shared pointers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
021db6f226
QPDFJob: convert password to shared pointer
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1a8c2eb93b
QPDFJob: use std::shared_ptr over PointerHolder where possible
...
Also fix QPDFArgParser
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
76c4f78b5c
Add QUtil::make_shared_cstr
...
Replace most of the calls to QUtil::copy_string with this instead.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
67f9d0b7d5
cli.rst: remove () from end of short help
...
This is used to generate a schema for the job json, which can't
contain `)"` because it breaks the R"(...)" syntax in C++. While C++
accepts R"anything(...)anything" to avoid this, as of this writing,
MSVC 2019 doesn't understand that. For now, just avoid it by removing
parentheses from the end of short help.
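For reference, a small sketch of the standard C++ feature mentioned
above. The function name and the JSON text are made up for
illustration; the point is only that a custom delimiter (here "j") lets
the literal contain `)"` without ending early.

```cpp
#include <string>

// Hypothetical example for illustration only; not qpdf's job schema.
std::string
schema_fragment()
{
    // With the default R"(...)" form, the `)"` after "short" would
    // terminate the literal. With the delimiter j, only `)j"` ends it.
    return R"j({"help": "show help (short)"})j";
}
```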
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
8dea480c9f
Allow optional fields in json "schema" checks
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
ec85e56c3f
Add missing help topic for inspection
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1db0a7ffce
JSONHandler: rework dictionary and array handlers
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
acf8d18b6e
Editorial changes to cli.rst
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
cf8405d91e
Fix json schema for objects to include dictionary key
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
2e58541493
Use JSON::parse to initialize schema for json mode
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
37105710ee
Implement JSONHandler for recursively processing JSON
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
a6df6fdaf7
CLI doc: use tables where helpful
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
e8e8f6f43c
Add JSON::parse
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
b9af421ef7
Add missing \f support for JSON string encoder
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
aa0a379b37
Add JSON::isDictionary and JSON::isArray
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
5c5e5ca29b
Document how to add a command-line argument
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
c8729398dd
Generate help content from manual
...
This is a massive rewrite of the help text and cli.rst section of the
manual. All command-line flags now have their own help and are
specifically indexed. qpdf --help is completely redone.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
b4bd124be4
QPDFArgParser: support adding/printing help information
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
5303130cf9
Fix comment on duplicated top-level json keys
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
53ba65eb59
QPDFArgParser: handle optional choices including help
...
Handle optional choices in addition to required choices. Refactor the
way help options are added to completion to make it work with optional
help choices.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
a301cc5373
Minor code cleanup
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
3ab25d595b
Fix doc typos caught by m-holger -- thanks
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
4577df4b5d
QPDFJob increment: generate option table initialization
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
f1d805badc
Add QPDFArgParser::copyFromOtherTable
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
c3e9b64e7f
QPDFJob increment: generate handler declarations
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
6e70d99b58
QPDFJob increment: generate choices variables in init
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
cb684ec4d3
QPDFJob increment: generate table names
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
f8eee83515
Expose QPDFArgParser::usage
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
8dcf6da259
QPDFJob: remove non-check from doFinalChecks
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
c216854607
Add basic framework for QPDFJob code generation
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
bd89aac360
QPDFJob increment: move arg parsing into QPDFJob
...
Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with
millions of public member variables, but now qpdf.cc is minimal and
just calls stable library functions.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
12396702af
QPDFJob: reorder functions, no other changes
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
2394dd8519
QPDFJob increment: static functions to member functions
...
Convert remaining static functions that take QPDFJob& as a parameter
to member functions. Utility functions that don't take QPDFJob& remain
static functions and can probably just stay that way since they keep
extra complexity out of QPDFJob.hh.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
e2975b9ed0
QPDFJob: de-templatize do_process and do_process_once
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
2f631997f2
QPDFJob increment: remove std::cout, std::cerr, whoami
...
Remove remaining temporary duplication of hard-coded values and direct
access to std::cout, std::cerr, and whoami in favor of parameters in
QPDFJob. This moves a few more static methods into QPDFJob member
functions.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
1ddf5b4b4b
QPDFJob increment: get rid of exit, handle verbose
...
Remove all calls to exit() from QPDFJob. Handle code that runs in
verbose mode to enable it to make use of output streams and message
prefix (whoami) from QPDFJob. This removes temporarily duplicated exit
code logic and most access to whoami/std::cout outside of QPDFJob
proper.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
0910e767ad
QPDFJob increment: basic QPDFJob structure
...
Move most of the methods called from qpdf.cc after argument parsing
into QPDFJob. In this increment, enough QPDFJob API has been added to
handle the branch of QPDFJob::run() that creates output with an
appropriate division between qpdf.cc and QPDFJob.
There are temporary bits of code to enable everything to compile and
pass the test suite, including some duplication and hard-coded values.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt
52817f0a45
Implement QPDFArgParser based on ArgParser from qpdf.cc
2022-01-30 13:11:02 -05:00
m-holger
0f9086e509
Fix doc typos
2022-01-30 12:09:54 -06:00
m-holger
8eca9d8fd9
Fix QPDFObjectHandle::isOrHasName
...
Ensure isOrHasName returns true if object is an array and the name is
present anywhere in the array.
2022-01-27 09:35:39 -06:00
m-holger
07db3200cb
Remove some if statements and simplify some boolean expressions
...
Use QPDFObjectHandle::isNameAndEquals, isDictionaryOfType and
isStreamOfType.
2022-01-27 07:31:12 -06:00
m-holger
710d2e54f0
Allow testing for subtype without specifying type in isDictionaryOfType etc
...
Accept empty string as type parameter in
QPDFObjectHandle::isDictionaryOfType and isStreamOfType
to allow for dictionaries with optional type.
2022-01-27 07:31:12 -06:00
m-holger
1b1b471ca9
Make a few whitespace fixes from last commit
...
Commit by ejb@ql.org using m-holger as author so git annotate gives
proper credit for changes.
2022-01-22 09:14:53 -05:00
m-holger
8593b9fdf7
Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc
...
Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType
2022-01-22 08:10:28 -06:00
Jay Berkenbilt
370710657a
Add missing characters from PDF doc encoding ( fixes #606 )
2022-01-11 15:55:19 -05:00
Jay Berkenbilt
77c31305fe
Fix signed/unsigned char warning ( fixes #604 )
2022-01-11 06:51:31 -05:00
Jay Berkenbilt
af91b5b584
Add QUtil::file_can_be_opened
2021-12-29 13:41:02 -05:00
Jay Berkenbilt
04745320d6
Prepare 10.5.0 release
2021-12-20 14:51:46 -05:00
Jay Berkenbilt
d866f48081
Change names of qpdf_object_type_e enumerations
...
They have to be ot_* rather than qpdf_ot_* for compatibility.
* Different enumerated types are not assignment-compatible in C++, at
least with strict compiler settings
* While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to
make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will
only work if the enumerated type names are the same.
2021-12-20 14:51:45 -05:00
Jay Berkenbilt
ea73bf72e0
Further improvements to handling binary strings
2021-12-19 14:30:45 -05:00
Jay Berkenbilt
ddbe59179e
C API: simplify new error handling and improve documentation
2021-12-17 15:59:47 -05:00
m-holger
f6293bd94c
C-API expose QPDFObjectHandle::getTypeCode and getTypeName ( fixes #597 )
2021-12-17 14:24:43 -05:00
Jay Berkenbilt
feafcc4e88
C API: add several stream functions ( fixes #596 )
2021-12-17 13:28:11 -05:00
Jay Berkenbilt
fee7489ee4
Add Pl_Buffer::getMallocBuffer
2021-12-17 12:38:52 -05:00
Jay Berkenbilt
9bb6f570ec
C API: add functions for working with pages ( fixes #594 )
2021-12-16 15:07:48 -05:00
Jay Berkenbilt
245ca28066
Use value rather than reference captures where possible
2021-12-16 11:47:07 -05:00
Jay Berkenbilt
af2a71aa2c
Handle bitstream overflow errors more gracefully ( fixes #581 )
...
* Make it a runtime error, not a logic error
* Include additional information
* Capture it properly in checkLinearization
2021-12-10 15:37:35 -05:00
Jay Berkenbilt
1c62c2a342
C API: expose functions for indirect objects ( fixes #588 )
2021-12-10 14:57:35 -05:00
Jay Berkenbilt
72c10d8617
C API: overhaul error handling
...
* Handle error conditions that occur when using the object handle
interfaces. In the past, some exceptions were not correctly
converted to errors or warnings.
* Add more detailed information to qpdf-c.h
* Make it possible to work more explicitly with uninitialized objects
2021-12-10 12:16:02 -05:00
Jay Berkenbilt
3340dbe976
Use a specific error code for type warnings and clarify docs
2021-12-10 11:15:49 -05:00
Jay Berkenbilt
b2b2a175c4
Add missing unit test for register progress reporter in C API
...
It was exercised in the pdf-linearize example but not in qpdf-ctest.
2021-12-10 09:11:56 -05:00
Jay Berkenbilt
1faa21502f
Refactor trap_errors to use std::function
2021-12-09 10:33:31 -05:00
Jay Berkenbilt
e3cc171d02
C API: qpdf_oh_is_initialized
2021-12-09 10:33:31 -05:00
Jay Berkenbilt
bef2c2222a
C API: qpdf_get_last_string_length
2021-12-09 10:33:31 -05:00
m-holger
b4fc9eb700
C-API expose new_object as qpdf_oh_new_object
2021-12-02 13:59:58 -05:00
Jay Berkenbilt
720ce9e8f3
Improve testing and error handling around operating before processing
2021-11-29 07:42:36 -05:00
Jay Berkenbilt
ac17308cf6
Initialize QPDF::Members::file ( fixes #584 )
2021-11-29 07:16:34 -05:00
m-holger
4630b8567c
Ensure qpdf_oh handles returned by C-API functions are unique.
...
Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array.
Update some doc comments in qpdf-c.h.
2021-11-19 13:31:59 +00:00
Jay Berkenbilt
ce7db05d22
Prepare 10.4.0 release
2021-11-16 15:44:09 -05:00
Jay Berkenbilt
750aca5b94
First increment of improving handling of weak crypto ( fixes #358 )
2021-11-11 12:24:15 -05:00
Jay Berkenbilt
f45dacf4cb
Make recovery logic flexible about where objects end ( fixes #573 )
...
Don't assume endobj is at the beginning of the line. This means we are
looking at tokens for every line, but the odds of n n obj appearing in
the middle of the object are likely much lower than endobj not being
at the beginning of the line or missing entirely. This will probably
have a negative impact on recovery time for very large files.
Hopefully it will be worth it.
2021-11-07 15:27:22 -05:00
Jay Berkenbilt
3794f8e2ad
Support OpenSSL 3 ( fixes #568 )
2021-11-04 18:24:54 -04:00
Jay Berkenbilt
a84a0b2487
Add range check in QPDFNumberTreeObjectHelper (fuzz issue 37740)
2021-11-04 14:03:24 -04:00
Jay Berkenbilt
4a648b9a00
Fix bug in merging resources /DR from foreign AcroForm ( fixes #548 )
...
When making resources indirect in from_dr, the code was using the
wrong owning QPDF, forgetting that from_dr had already been copied
using CopyForeignObject.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt
9b28933647
Check object ownership when adding
...
When adding a QPDFObjectHandle to an array or dictionary, if possible,
check if the new object belongs to the same QPDF. This makes it much
easier to find incorrect code than waiting for the situation to be
detected when the file is written.
2021-11-04 12:29:42 -04:00
Jay Berkenbilt
33a47d5c3c
Make QPDF::findPage public ( fixes #516 )
...
This was originally not public because I wanted to get rid of the
pages cache, but I recently realized there were deep reasons not to do
that, and the author of pikepdf wanted this, so I decided to make it
public.
2021-11-03 09:43:17 -04:00
Jay Berkenbilt
532a4f3d60
Detect recoverable but invalid zlib data streams ( fixes #562 )
2021-11-03 09:43:17 -04:00
Fredrik Fornwall
e0775238b8
Fix QPDFEFStreamObjectHelper::{get,set}Subtype
...
The /Subtype entry that specifies the mime type of an embedded file is
inside the embedded file stream dictionary directly, not in the
parameter dictionary.
See Table 45 and 46 in the PDF 1.7 specification:
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112
2021-09-10 10:02:24 -04:00
Jay Berkenbilt
3cacb27a90
Performance fix on preserveObjectStreams
2021-05-09 07:51:14 -04:00
Jay Berkenbilt
bddebdb0ea
Prepare 10.3.2 release
2021-05-08 10:41:14 -04:00
Jay Berkenbilt
30ac51bc78
Exclude unreferenced objects in object streams ( fixes #520 )
2021-05-08 09:42:09 -04:00
Zdenek Dohnal
16c19e9424
libqpdf/Pl_AES_PDF.cc: remove duplicated if branch
...
Check for this->encrypt seems to be moved to plugged crypto
implementations, so it can be removed from Pl_AES_PDF.cc.
2021-04-29 09:42:38 -04:00
Jay Berkenbilt
36c7c20819
Fix timezone portability issue ( fixes #515 )
2021-04-17 18:12:55 -04:00
Jay Berkenbilt
8971443e46
QPDF::addPage*: handle duplicate pages more robustly
2021-04-05 10:58:10 -04:00
Jay Berkenbilt
ec48820c3c
Fix loop detection in NNTree
2021-04-05 07:59:02 -04:00
Jay Berkenbilt
258675fc99
Move ABI comment to the right place
2021-04-03 11:43:08 -04:00
Jay Berkenbilt
a77f58142d
Remove some assertions that are not necessarily true (fixes #514)
...
Operations that add the same object to multiple places in the pages
tree are throwing exceptions and then later causing assertion
failures. The assert calls shouldn't be there.
2021-03-21 19:35:23 -04:00
Jay Berkenbilt
3f05429cc5
Prepare 10.3.1 release
2021-03-11 12:59:41 -05:00
Jay Berkenbilt
85884c363c
Allow /DR to be direct in /AcroForm
...
Also handle direct annotation, though this is much less likely.
2021-03-11 11:43:38 -05:00
Jay Berkenbilt
dc65b88457
Prepare 10.3.0 release
2021-03-05 06:15:48 -05:00
Jay Berkenbilt
cb6e53136f
QPDFAcroFormDocumentHelper: add missing analyze calls
2021-03-04 18:11:44 -05:00
Jay Berkenbilt
0b77f2cf26
Revert non-binary-compatible handleWarning change -- see TODO (ABI)
2021-03-04 15:59:46 -05:00
Jay Berkenbilt
f68e25c7f2
Don't use handleWarning, which is being reverted
2021-03-04 15:59:45 -05:00
Jay Berkenbilt
9fb174b9e9
Major rework of handling form fields when copying pages (fixes #509)
2021-03-04 15:08:37 -05:00
Jay Berkenbilt
887f35efaa
When resolving font from /DR, copy it into resources
2021-03-04 15:08:36 -05:00
Jay Berkenbilt
a2124f992c
Add QPDFMatrix::operator==
2021-03-04 15:08:36 -05:00
Jay Berkenbilt
552303a94a
Check for reserved after dereference
2021-03-04 15:08:36 -05:00
Jay Berkenbilt
d7ffdfa994
Add optional conflict detection to mergeResources
...
Also improve behavior around direct vs. indirect resources.
2021-03-04 15:08:36 -05:00
Jay Berkenbilt
e17585c2d2
Remove unreferenced: ignore names that are not Fonts or XObjects
...
Converted ResourceFinder to ParserCallbacks so we can better detect
the name that precedes various operators and use the operators to sort
the names into resource types. This enables us to be smarter about
detecting unreferenced resources in pages and also sets the stage for
reconciling differences in /DR across documents.
2021-03-03 17:05:49 -05:00
Jay Berkenbilt
a15ec6967d
Enhancements to ParserCallbacks
2021-03-03 17:05:49 -05:00
Jay Berkenbilt
1bb209a9bf
Add QPDF::numWarnings
2021-03-03 17:05:49 -05:00
Jay Berkenbilt
37fcc5ff71
Create ResourceFinder from NameWatcher in QPDFPageObjectHelper
2021-03-03 17:05:49 -05:00
Jay Berkenbilt
b444ab3352
Fix typos in coverage cases
2021-03-03 17:05:49 -05:00
Jay Berkenbilt
fa2516df71
Fix behavior for finding /Q, /DA, and /DR for form fields
...
If not found in the field hierarchy, /Q and /DA are supposed to be
looked up in the document-level form dictionary. /DR is supposed to
only come from the document dictionary.
2021-03-03 17:05:19 -05:00
Jay Berkenbilt
a4d6589ff2
Have QPDFObjectHandle notice when replaceObject was called
...
This results in a performance penalty of 1% to 2% when replaceObject
and swapObjects are never called and a somewhat larger penalty if they
are called, but it's worth it to avoid very confusing behavior as
discussed in depth in qpdf#507.
2021-02-25 07:32:46 -05:00
Jay Berkenbilt
ec6719fd25
Always call dereference() before querying obj pointer
2021-02-25 07:31:26 -05:00
Jay Berkenbilt
b5e937397c
Prepare 10.2.0 release
2021-02-23 10:41:58 -05:00
Jay Berkenbilt
1886673d7e
Spell check
2021-02-23 10:38:05 -05:00
Jay Berkenbilt
9e00be7ffa
Remove warning that gives false positives in some normal cases
2021-02-23 08:26:21 -05:00
Jay Berkenbilt
be3a8c0e7a
Keep only referenced form fields in --pages
2021-02-23 08:26:21 -05:00
Jay Berkenbilt
83216e640c
Preserve form fields when splitting pages (fixes #340)
2021-02-22 18:42:06 -05:00
Jay Berkenbilt
1f35ec9988
Add methods for copying form fields
2021-02-22 18:42:06 -05:00
Jay Berkenbilt
8e8c0d8290
Add new placeFormXObject that takes a matrix reference
2021-02-22 18:42:06 -05:00
Jay Berkenbilt
61d41e2e88
Add copyAnnotations, use with overlay/underlay (fixes #395)
2021-02-22 18:42:06 -05:00
Jay Berkenbilt
7b3cbacf5d
Change from QPDF{Array,Dict}Items to aitems() and ditems()
2021-02-22 11:05:39 -05:00
Jay Berkenbilt
a9ae8cadc6
Add transformAnnotations and fix flattenRotations to use it
2021-02-21 17:13:09 -05:00
Jay Berkenbilt
a76decd2d5
Add QPDFObjGen::unparse
2021-02-21 16:21:52 -05:00
Jay Berkenbilt
7540d2082a
Explicitly override inherited rotate in flattenRotations
2021-02-21 14:58:45 -05:00
Jay Berkenbilt
e899926e0d
Use QPDFMatrix inside flattenRotations
2021-02-21 14:58:45 -05:00
Jay Berkenbilt
92fbc6fdf5
QPDFObjectHandle::copyStream
2021-02-21 06:36:30 -05:00
Jay Berkenbilt
60afe4142e
Refactor: separate copyStreamData from replaceForeignIndirectObjects
2021-02-21 06:36:30 -05:00
Jay Berkenbilt
15269f36d8
addFormField: update cache rather than invalidating
2021-02-21 06:36:30 -05:00
Jay Berkenbilt
901f1a788c
Enhance QPDFMatrix API
2021-02-21 06:36:30 -05:00
Jay Berkenbilt
05eb5826d8
Fix isPagesObject and isPageObject
...
There are lots of things with /Kids that are not pages. Repair the
pages tree, then do a reliable check.
2021-02-20 19:42:41 -05:00
Jay Berkenbilt
35dd11f356
Allow --rotate=0
2021-02-20 16:29:34 -05:00
Jay Berkenbilt
71e8627285
Add const versions of QPDFMatrix::transform*
2021-02-19 18:35:19 -05:00
Jay Berkenbilt
de8929a41c
Add QPDFAcroFormDocumentHelper::addFormField
2021-02-18 12:25:48 -05:00
Jay Berkenbilt
5cec6b4c3d
Add QPDFPageObjectHelper::getMatrixForFormXObjectPlacement
2021-02-18 12:25:48 -05:00
Jay Berkenbilt
0765872295
Form field for non-widget just returns null
2021-02-18 10:25:07 -05:00
Jay Berkenbilt
0b1623d07d
Add QUtil::path_basename
2021-02-18 09:59:03 -05:00
Jay Berkenbilt
a773f4c71d
Add QPDFObjectHandle::parse for strings with context
2021-02-15 11:33:03 -05:00
Jay Berkenbilt
7eb903d9aa
Use functional replaceStreamData
2021-02-14 14:42:24 -05:00
Jay Berkenbilt
efbb21673c
Add functional versions of QPDFObjectHandle::replaceStreamData
...
Also fix a bug in checking consistency of length for stream data
providers. Length should not be checked or recorded if the provider
says it failed to generate the data.
2021-02-14 14:42:24 -05:00
Jay Berkenbilt
e2593e2efe
Move QPDFMatrix into the public API
2021-02-13 02:30:00 -05:00
Jay Berkenbilt
07f40bd254
QUtil::double_to_string: trim trailing zeroes with option to disable
2021-02-13 02:30:00 -05:00
Jay Berkenbilt
8fbc8579f2
Allow zone information to be omitted from timestamp strings
2021-02-11 14:26:55 -05:00
Jay Berkenbilt
df067c9ab6
Add autoconf test for localtime_r
2021-02-11 14:26:55 -05:00
Jay Berkenbilt
1b3f84f967
Require C++14 instead of C++11
2021-02-10 16:27:58 -05:00
Jay Berkenbilt
9fcf61b2f6
Fix loop in QPDFOutlineDocumentHelper (fuzz issue 30507)
2021-02-10 16:27:44 -05:00
Jay Berkenbilt
4d1f2fdcac
Update to new name/number tree API
2021-02-10 15:46:20 -05:00
Jay Berkenbilt
1f4771cd0d
Minor clean up of Windows headers
2021-02-10 07:36:18 -05:00
Jay Berkenbilt
ad34b9c278
Implement helpers for file attachments
2021-02-10 06:57:37 -05:00
Jay Berkenbilt
bf0e6eb302
Add QUtil methods for dealing with PDF timestamp strings
2021-02-09 17:50:24 -05:00
Jay Berkenbilt
bfbeec5497
Make newly created name/number trees indirect objects
2021-02-08 06:49:56 -05:00
Jay Berkenbilt
553ac7f353
Add QUtil::pipe_file and QUtil::file_provider
2021-02-07 19:41:34 -05:00
Jay Berkenbilt
e076c9bf08
Remove erroneous handling of /EFF for stream decryption
...
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
2021-02-06 17:08:41 -05:00
Jay Berkenbilt
ac2b3b96e1
Make wrong object stream type a warning
2021-02-06 14:29:11 -05:00
Jay Berkenbilt
faa2e3ddfd
Handle older PDFs whose form XObjects inherit resources (fixes #494)
...
When removing unreferenced resources, notice if a page (recursively)
contains a form XObject with unreferenced resources, and count any
such resources as referenced by the page.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt
81025e4998
Refactor removal of unreferenced resources
...
Refactor in preparation for resolving unresolved resources in form
xobjects from page.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt
9c9ce64eec
Handle strings in inline image dictionaries
...
We need to use token.getRawValue, not token.getValue
2021-01-31 07:50:03 -05:00
Jay Berkenbilt
178f995fc2
Recover from exceptions during filtering for inline images
2021-01-31 07:49:08 -05:00
Jay Berkenbilt
4ae93a73c5
Improve memory safety of dict/array iterators
2021-01-31 07:16:03 -05:00
Jay Berkenbilt
de0b11fc47
Add C++ iterator API around array and dictionary objects
2021-01-30 15:15:23 -05:00
Jay Berkenbilt
35e7859bc7
Make QPDFObjectHandle::is* return false for uninitialized objects
2021-01-29 15:46:54 -05:00
Jay Berkenbilt
50decc9bb8
name/number tree: explicitly declare default destructors
2021-01-29 15:46:54 -05:00
Jay Berkenbilt
8ed3e8c79b
NNTree: rework iterators to be more memory efficient
...
Keep a std::pair internal to the iterators so that operator* can
return a reference and operator-> can work, and each can work without
copying pairs of objects around.
2021-01-26 09:12:23 -05:00
Jay Berkenbilt
e7e20772ed
name/number trees: remove
2021-01-26 09:12:23 -05:00
Jay Berkenbilt
5816fb44b8
name/number trees: insertAfter
2021-01-25 15:39:10 -05:00
Jay Berkenbilt
16a9bb3f6f
name/number trees: newEmpty, increment/decrement end()
2021-01-25 15:39:10 -05:00
Jay Berkenbilt
b5614f611d
Implement repair and insert for name/number trees
2021-01-24 19:31:45 -05:00
Jay Berkenbilt
04edfe9fad
QPDFObjectHandle::newUnicodeString: use UTF-16 only when needed
...
Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of
encoding the string.
2021-01-24 03:27:28 -05:00
Jay Berkenbilt
63e5cb533d
Use new QPDF{Name,Number}TreeObjectHelper API
2021-01-24 03:27:28 -05:00
Jay Berkenbilt
d61ffb65d0
Add new constructors for name/number tree helpers
...
Add constructors that take a QPDF object so we can issue warnings and
create new indirect objects.
2021-01-24 03:27:26 -05:00
Jay Berkenbilt
ba814703fb
Use QPDFNameTreeObjectHelper's iterator directly
2021-01-24 03:25:11 -05:00
Jay Berkenbilt
5f0708418a
Add iterators to name/number tree helpers
2021-01-24 03:22:59 -05:00
Jay Berkenbilt
4a1cce0a47
Reimplement name and number tree object helpers
...
Create a computationally and memory efficient implementation of name
and number trees that does binary searches as intended by the data
structure rather than loading into a map, which can use a great deal
of memory and can be very slow.
2021-01-24 03:22:51 -05:00
Jay Berkenbilt
6226b69dba
Add warn() to QPDF's public API
2021-01-16 18:41:53 -05:00
Jay Berkenbilt
fc88837d4b
Treat /EmbeddedFiles as a proper name tree
...
If we ever had an encrypted file with different filters for
attachments and either the /EmbeddedFiles name tree was deep or some
of the file specs didn't have /Type, we would have overlooked those as
attachment streams. The code now properly handles /EmbeddedFiles as a
name tree.
2021-01-11 10:50:44 -05:00
Jay Berkenbilt
6fe7b704c7
Warn rather than segv on access after closing input source (fixes #495)
2021-01-06 10:11:34 -05:00
Jay Berkenbilt
0fed040392
Prepare version 10.1.0
2021-01-04 16:59:55 -05:00
Jay Berkenbilt
18340b8835
Spell check
2021-01-04 16:26:58 -05:00
Jay Berkenbilt
dc92574c10
Fix some pipelines to be safe if downstream write fails (fuzz issue 28262)
2021-01-04 15:17:35 -05:00
Jay Berkenbilt
ba6b6aacf1
Fix outdated comment
2021-01-03 15:59:49 -05:00
Jay Berkenbilt
3be58f49e5
Make more QPDFPageObjectHelper methods work with form XObject
2021-01-02 14:08:53 -05:00
Jay Berkenbilt
98da4fd835
Externalize inline images now includes form XObjects
2021-01-02 14:08:17 -05:00
Jay Berkenbilt
bedf35d6a5
Bug fix: avoid extraneous pipeline finish calls with multiple contents
...
Avoid calling finish() multiple times on the pipeline passed to
pipeContentStreams. This commit also fixes a bug in which qpdf was not
exiting with the proper exit status if warnings were found while splitting
pages; this was exposed by a test case that changed.
2021-01-02 14:08:17 -05:00
Jay Berkenbilt
a139d2b36d
Add several methods for working with form XObjects (fixes #436)
...
Make some more methods in QPDFPageObjectHelper work with form
XObjects, provide forEach methods to walk through nested form
XObjects, possibly recursively. This should make it easier to work
with form XObjects from user code.
2021-01-02 12:29:31 -05:00
Jay Berkenbilt
6154221edb
QPDFPageObjectHelper: filterPageContents -> filterContents + form XObject
2021-01-02 11:33:36 -05:00
Jay Berkenbilt
63ea46193d
QPDFPageObjectHelper: getPageImages -> getImages
2021-01-02 11:33:36 -05:00
Jay Berkenbilt
e7a8554563
QPDFPageObjectHelper::getPageImages: support form XObjects
2021-01-02 11:33:36 -05:00
Jay Berkenbilt
1562d34c09
Add QPDFObjectHandle::isFormXObject
2021-01-01 07:36:10 -05:00
Jay Berkenbilt
c9271335fa
Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation
2020-12-30 13:03:55 -05:00
Jay Berkenbilt
12ecd2019a
Add QPDFObjectHandle::setFilterOnWrite
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
3f9191a344
Add ostream << for QPDFObjGen
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
858c7b89bc
Let optimize filter stream parameters instead of making them direct
...
Also removes preclusion of stream references in stream parameters of
filterable streams and reduces write times by about 8% by eliminating
an extra traversal of the objects.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
1a62cce940
Restructure optimize to allow skipping parameters of filtered streams
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
09027344b9
Refactor: separate code that determines whether to filter a stream
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
39bfa01307
Implement user-provided stream filters
...
Refactor QPDF_Stream to use stream filter classes to handle supported
stream filters as well.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
cc8895078a
Add QPDFObjectHandle::makeDirect(bool allow_streams)
2020-12-26 08:48:18 -05:00
Jay Berkenbilt
573b6eb8b1
Provide qpdf write progress reporting from C API (fixes #487)
2020-12-20 14:43:24 -05:00
Jay Berkenbilt
2050977099
Add QPDFObjectHandle manipulation to C API
2020-11-28 19:48:07 -05:00
Jay Berkenbilt
78b9d6bfd4
Prepare 10.0.4 release
2020-11-21 13:50:02 -05:00
Jay Berkenbilt
bd79138c84
Treat direct page as runtime rather than logic error (fuzz issue 27393)
2020-11-11 09:50:43 -05:00
Jay Berkenbilt
47f4ebcdac
Ignore unused field in xref entry, avoiding range error (fixes #482)
2020-11-04 07:46:46 -05:00
Jay Berkenbilt
fbe40b800d
Prepare 10.0.3 release
2020-10-31 13:47:03 -04:00
Jay Berkenbilt
6971f78ff6
Fix stack overflow on direct root (fuzz issue 26761)
2020-10-31 13:10:39 -04:00
Jay Berkenbilt
ffe6af6f77
Add comments explaining the foreign object copying code
...
These are the comments I would have liked to have been able to read
while fixing #449 and #478.
2020-10-31 12:14:26 -04:00
Jay Berkenbilt
96767fb104
Fix foreign stream copying bug (fixes #478)
...
This reverts an incorrect fix to #449 and codes it properly. The real
problem was that we were looking at the local dictionaries rather than
the foreign dictionaries when saving the foreign stream data. In the
case of direct objects, these happened to be the same, but in the case
of indirect objects, the object references could be pointing anywhere
since object numbers don't match up between the old and new files.
2020-10-31 12:14:26 -04:00
Jay Berkenbilt
da7540794a
Prepare 10.0.2 release
2020-10-27 11:57:48 -04:00
Jay Berkenbilt
09bd1fafb1
Improve efficiency of number to string conversion
2020-10-27 11:57:48 -04:00
Jay Berkenbilt
bcea54fcaa
Revert removal of unreadCh change for performance
...
Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR).
Update comments and code to reflect this.
2020-10-27 11:57:48 -04:00
Jay Berkenbilt
b30deaeeab
Avoid merging adjacent tokens when concatenating contents (fixes #444)
2020-10-23 08:00:04 -04:00
Jay Berkenbilt
8a11feacc3
Avoid leak by resolving object streams more than once (fuzz issue 23642)
2020-10-22 15:39:36 -04:00
Jay Berkenbilt
30bb4c64ee
Minor code cleanup
...
* Return rather than exiting from realmain in qpdf.cc
* Remove extraneous blank line
* Don't assign temporary to const reference
2020-10-22 15:39:36 -04:00