Jay Berkenbilt
da71dc6f37
QTC: cache get_env results for improved performance
...
It turns out that QUtil::get_env is particularly expensive on Windows
if there is a large environment. This may be true on other platforms
as well.
2022-08-07 14:23:05 -04:00
Jay Berkenbilt
32e30a3af2
Resolve QPDF{Name,Number} tree helper linker issues ( fixes #745 )
...
This is a guess...I'm not sure exactly why there are linker issues or
how to reproduce them.
2022-08-07 09:21:01 -04:00
Jay Berkenbilt
b90adb1c6c
Merge pull request #746 from m-holger/smart
...
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 08:41:50 -04:00
m-holger
7c6901bce5
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 10:33:25 +01:00
Jay Berkenbilt
3ec43f055a
Fix parsing comment
2022-08-06 14:24:08 -04:00
Jay Berkenbilt
a3037ca440
Merge pull request #739 from m-holger/getobject
...
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
2022-08-06 14:23:56 -04:00
m-holger
1553868c4a
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
...
For consistency with similar methods, e.g. replaceObject.
2022-08-01 19:22:37 +01:00
m-holger
407b0766b8
Inline QPDFObjectHandle::getObjGen etc
...
Also, make QPDFObjectHandle::isIndirect const.
2022-08-01 15:08:48 +01:00
Jay Berkenbilt
5d63730b93
Clean up documentation
2022-07-31 16:26:02 -04:00
Jay Berkenbilt
12d065c751
Provide a simpler QPDF::writeJSON
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
13cf35ce2f
Use calledgetallpages and pushedinheritedpageresources
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
5f4224f31a
Simplify --json-output
...
Now --json-output just changes defaults. Allow output file with --json.
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
80acfc3826
Fix --json-help to take a version parameter
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
69820847af
Change the output of --json to use "qpdf" instead of "objects"
2022-07-31 15:17:01 -04:00
Jay Berkenbilt
d01c4f8819
Change --json-output format
...
from "qpdf-v2" to "qpdf": [..., ...]
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
bb96499b61
Update docs and prepare QPDF::writeJSON for changes
...
Add additional parameters that will be needed to call QPDF::writeJSON
in partial mode.
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
0e3d4cdc97
Fix/clarify meaning of depth parameter to json write methods
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
4feb10fdaf
Merge pull request #734 from m-holger/nullptr
...
Code tidy : replace 0 with nullptr or true
2022-07-31 08:33:45 -04:00
m-holger
073808aa50
Code tidy : replace 0 with nullptr or true
2022-07-26 13:40:13 +01:00
Jay Berkenbilt
4674c04cb8
JSON schema: support multi-element array validation
2022-07-24 16:44:51 -04:00
Jay Berkenbilt
f8d1ab9462
JSON schema -- accept single item in place of array
...
When the schema wants a variable-length array, allow a single item as
well as allowing an array.
2022-07-24 16:17:03 -04:00
Jay Berkenbilt
b3e6d445cb
Tweak "AndGet" mutator functions again
...
Remove any ambiguity around whether old or new value is being
returned.
2022-07-24 15:42:23 -04:00
m-holger
8b4afa428e
Revert making second parameter of QPDFObjGen::QPDFObjGen optional
...
Also, change test for QPDFObjGen::isIndirect to obj != 0.
Delete comment from commit afd35f9
.
2022-07-24 16:55:10 +01:00
m-holger
afd35f9a30
Overload StreamDataProvider::provideStreamData
...
Use 'QPDFObjGen const&' instead of 'int, int' in signature.
2022-07-24 16:02:35 +01:00
m-holger
5d0469f1bc
QPDFObjGen : tidy QPDFJob
...
Use QPDFObjGen::unparse where appropriate.
2022-07-24 16:02:35 +01:00
m-holger
4b73d057fb
QPDFObjGen : tidy QPDF_Stream
...
Change method signatures to use QPDFObjGen.
Replace QPDF_Stream::objid and generation with QPDF_Stream::og.
2022-07-24 16:02:35 +01:00
m-holger
f7978db1f6
QPDFObjGen : tidy QPDF private methods
...
Change method signatures to use QPDFObjGen.
Use QPDFObjGen methods where possible.
Remove redundant QPDF::objGenToIndirect.
2022-07-24 16:02:35 +01:00
m-holger
3404ca8ac8
QPDFObjGen : tidy QPDFObjectHandle private methods
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
b123f79dfd
Replace QPDFObjectHandle::objid and generation with QPDFObjectHandle::og
2022-07-24 15:59:49 +01:00
m-holger
c0168cf88c
QPPFObjGen : tidy QPDF::readObjectAtOffset
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
eeb6162f76
Add optional parameter separator to QPDFObjGen::unparse
...
Also, revert inlining of unparse and operator << from commit 4c6640c
in
order to avoid exposing QUtil.
2022-07-24 15:41:48 +01:00
Jay Berkenbilt
6f1041afb8
Clarify intent in readObjectAtOffset
...
Rather than using object id -1 to mean "don't care", use object ID 0,
and clarify the difference between that use and indication of a direct
object.
2022-07-24 09:40:11 -04:00
m-holger
4c6640cb45
Inline QPDFObjGen methods
...
ABI breaking change
2022-07-16 14:32:48 -04:00
Jay Berkenbilt
a603c1e395
Run format-code
2022-06-27 12:50:35 -04:00
m-holger
f0a8178091
Refactor QPDFObject creation and cloning
...
Move responsibility for creating shared pointers to objects and cloning from QPDFObjectHandle to QPDFObject.
2022-06-27 12:47:02 -04:00
m-holger
5aa8225f49
Refactor QPDFObjectTypeAccessor and QPDFObjectHandle::dereference
2022-06-27 10:39:04 -04:00
Jay Berkenbilt
0c7c7e4ba4
Track whether certain page modifying methods have been called
...
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
2022-06-25 13:55:45 -04:00
Jay Berkenbilt
25aff0bd52
TODO: abandon (again) and update notes about QPDFPagesTree
2022-06-25 13:26:53 -04:00
Jay Berkenbilt
8a32515a62
Add warnings for some additional page tree repair
2022-06-25 13:25:35 -04:00
Jay Berkenbilt
6c4537885e
Reformat code
2022-06-25 11:11:24 -04:00
m-holger
7836e19747
Code tidy: remove redundant calls to QPDFObjectHandle::isInitialized
2022-06-25 11:10:06 -04:00
m-holger
3b3bcab349
Remove QPDF_Stream::setStreamDescription
2022-06-25 08:26:46 -04:00
m-holger
9eda1fdc41
Remove redundant QPDF_Array::setDescription and QPDF_Dictionary::setDescription
2022-06-25 08:25:58 -04:00
m-holger
e9c1637353
Add private method QPDFObjectHandle::getObjGenAsStr
...
Also, use methods to access objid and generation.
2022-06-25 08:25:32 -04:00
m-holger
97f737a562
Code tidy: QPDFJob::doJSONPageLabels
...
Remove redundant variables pages and next.
2022-06-25 08:24:50 -04:00
Jay Berkenbilt
1eb2f208ec
Use Pl_Function in qpdflogger C API implementation
2022-06-19 09:12:59 -04:00
Jay Berkenbilt
eae75dbe44
Add Pl_Function -- a generic function pipeline
2022-06-19 09:12:29 -04:00
Jay Berkenbilt
bb0ea2f8e7
Add qpdfjob_register_progress_reporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
87412eb05b
Add QPDFJob::registerProgressReporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
3a7ee7e938
Move C-based ProgressReporter helper into QPDFWriter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
8130d50e3b
Add C API to QPDFLogger
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
daef4e8fb8
Add more flexible funtions to qpdfjob C API
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
e0720eaa78
Use the default logger for other writes to stdout/stderr
...
When there is no context for writing output or error messages, use the
default logger.
2022-06-18 10:38:50 -04:00
Jay Berkenbilt
83be2191b4
Use "save" logger when saving data to standard output
...
This includes the output PDF, streams from --show-object and
attachments from --save-attachment. This also enables --verbose and
--progress to work with saving to stdout.
2022-06-18 09:54:40 -04:00
Jay Berkenbilt
641e92c6a7
QPDF, QPDFJob: use QPDFLogger instead of custom output streams
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f1f711963b
Add and test QPDFLogger class
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f588d74140
Add integer types to Pipeline::operator<<
2022-06-18 09:02:55 -04:00
m-holger
057bd659bc
Code tidy: remove redundant variable in QPDF::writeJSON
2022-06-05 18:46:21 -04:00
Jay Berkenbilt
0bd908b550
Update documentation for qpdf JSON v2
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
b7bbf12e85
In json mode, reveal recovered user password when otherwise unavailable
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
f049a77c59
Add additional information when listing attachments
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
04fc7c4bea
Add conversions to ISO-8601 date format
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
27a42c16c7
Change default decode level to "none" with --json-output
2022-05-21 17:51:34 -04:00
Jay Berkenbilt
752f43d4e4
Allow empty b: binary JSON strings
2022-05-21 17:36:32 -04:00
Jay Berkenbilt
05460d405c
Format code
2022-05-21 16:11:42 -04:00
m-holger
6c69a747b9
Code clean up: use range-style for loops wherever possible
...
Remove variables obsoleted by commit 4f24617
.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt
c56a9ca7f6
JSON: Fix large file support
2022-05-21 09:43:45 -04:00
Jay Berkenbilt
47c093c48b
Replace std::regex with validators for better performance
2022-05-21 08:43:21 -04:00
Jay Berkenbilt
9b2eb01e25
Exercise object description in tests
2022-05-20 14:23:32 -04:00
Jay Berkenbilt
6c2fb5b8f0
Add test for bad data and bad datafile
2022-05-20 13:33:30 -04:00
Jay Berkenbilt
d065098089
Test --update-from-json
2022-05-20 11:10:12 -04:00
Jay Berkenbilt
ef955b04b5
Bug fix: don't clobber stream length with replaceDict
2022-05-20 11:09:45 -04:00
Jay Berkenbilt
3eb77a7004
JSON: detect duplicate dictionary keys while parsing
2022-05-20 10:13:15 -04:00
Jay Berkenbilt
6d4e3ba8a4
Test (and fix) handling of dangling references
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
5a2aa59479
Bug fix: isReserved() true for indirect reference to reserved object
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
35b1e1c493
Explicitly test ignoring unknown keys in JSON input
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
dc8df962d8
Make version default to latest for --json-output (like --json)
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
6c7326b290
JSON fix: correctly parse UTF-16 surrogate pairs
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
6f43bf8de3
Major rework -- see long comments
...
* Replace --create-from-json=file with --json-input, which causes the
regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
stream data.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
23fc6756f1
Add QUtil::FileCloser to the public API
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
0fe8d44762
Support stream data -- not tested
...
There are no automated tests yet, but committing work so far in
preparation for some refactoring.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
63c7eefe9d
replaceStreamData: accept uninitialized filter/decode_parms
...
These mean to leave the original values alone. This is needed for
reconstructing streams from JSON given that the stream data and stream
dictionary may appear in any order in the JSON.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
56f1b411fe
Back out fluent QPDFObjectHandle methods. Keep the andGet methods.
...
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
7e7a9c4379
Parse objects; stream data is not yet handled
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
9064542b5f
Add private methods for reserving specific objects
2022-05-20 07:54:09 -04:00
Jay Berkenbilt
7fa5d1773b
Implement top-level qpdf json parsing
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
8d42eb2632
Add scaffolding for QPDF JSON reactor
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
4fe2e06b47
Add --create-from-json and --update-from-json arguments
...
Also add stubs for top-level QPDF methods (createFromJSON,
updateFromJSON)
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
9a0e9a1a9e
Remove offset from missing /Root error
...
The last offset is irrelevant to not being able to find /Root.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
051ae7c282
Improve handling of replacing stream data with empty strings
...
When an empty string was passed to replaceStreamData, the code was
passing a null pointer to memcpy. Since a 0 size was also passed, this
was harmless, but it triggers sanitizer errors. The code properly
handles a null pointer as the buffer in other places.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
60ec94a7c3
Add QUtil::is_long_long
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
4c7cfd5cbc
JSON reactor: improve handling of nested containers
...
Call the parent container's item method before calling the child
item's start method so we can easily know the current nesting level
when nested items are added.
2022-05-14 17:35:06 -04:00
Jay Berkenbilt
2a2f7f1bba
Add maxobjectid to JSON
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
e9390aeaaa
Add --to-json option
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
c76536dd9a
Implement JSON v2 output
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
15272662f6
Fix typo in json output key name
...
moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
1bc8abfdd3
Implement JSON v2 for Stream
...
Not fully exercised in this commit
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
3246923cf2
Implement JSON v2 for String
...
Also refine the herustic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
16f4f94cd9
Prepare code for JSON v2
...
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt
a9fbbd5dca
Objectinfo json: write incrementally and in numeric order
...
This script was used on test data:
----------
#!/usr/bin/env python3
import json
import sys
import re
def json_dumps(data):
return json.dumps(data, ensure_ascii=False,
indent=2, separators=(',', ': '))
for filename in sys.argv[1:]:
with open(filename, 'r') as f:
data = json.loads(f.read())
if 'objectinfo' not in data:
continue
trailer = None
to_sort = []
for k, v in data['objectinfo'].items():
if k == 'trailer':
trailer = v
else:
m = re.match(r'^(\d+) \d+ R', k)
if m:
to_sort.append([int(m.group(1)), k, v])
newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
if trailer is not None:
newobjectinfo['trailer'] = trailer
data['objectinfo'] = newobjectinfo
print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00