m-holger
7248cab71b
Add class QPDF_Unresolved
...
Allow QPDFObjectHandle::obj to be set prior to resolving the object.
ot_unresolved has been appended to the list of object types in order to
preserve the output of existing test cases.
2022-08-31 22:46:09 +01:00
m-holger
bd300be08d
Replace calls to QPDFObjectHandle::Factory::newIndirect where possible
2022-08-31 22:45:45 +01:00
Jay Berkenbilt
a078202c1b
Merge pull request #752 from jberkenbilt/report-mem-usage
...
Report mem usage
2022-08-31 15:50:17 -04:00
Jay Berkenbilt
7b3134ef94
Add ChangeLog for previous contribution
...
Also remove no-longer-needed #include
2022-08-31 15:06:37 -04:00
Jay Berkenbilt
433f1dae19
Add --report-mem-usage option for debugging/testing
2022-08-31 14:47:27 -04:00
Jay Berkenbilt
0a54247652
Add QUtil::get_max_memory_usage for testing
2022-08-31 14:47:27 -04:00
m-holger
9532dca3a5
Inline QPDFObjectHandle::setParsedOffset
...
Part of #729
2022-08-30 14:55:45 +01:00
m-holger
70d985f942
Optimise QPDFParser::parse for #311 problem
...
Avoid creating new null objects that later will be discarded and made
implicit.
Part of #729
2022-08-30 13:32:54 +01:00
m-holger
97a7ad1d80
Avoid setting descriptions / offsets for direct nulls in QPDFParser::parse
...
Part of #729
2022-08-30 13:07:48 +01:00
m-holger
7402c02c80
Combine stacks in QPDFParser::parse
...
Part of #729
2022-08-30 12:53:19 +01:00
m-holger
74162a2d48
Tune QPDFParser::parse
...
Replace SparseOHArray with std::vector<QPDFObjectHandle>.
Part of #729
2022-08-30 11:32:43 +01:00
m-holger
6fc982b71a
Move QPDFObjectHandle::setObjectDescriptionFromInput to QPDFParser
...
Part of #729
2022-08-30 06:42:46 +01:00
m-holger
8ad1ea34fe
Add private methods QPDFParser::warn
...
Part of #729
2022-08-30 06:04:34 +01:00
m-holger
6670c685ab
Move QPDFObjectHandle::parseInternal to new class QPDFParser
...
Part of #729
2022-08-30 05:56:23 +01:00
Jay Berkenbilt
0adfd74f8b
Merge pull request #747 from m-holger/new_stream
...
Add optional parameter allow_nullptr to QPDFObjectHandle::getOwningQPDF
2022-08-29 16:33:19 -04:00
Jay Berkenbilt
2b01a79e87
Fix header ordering in QTC (format code)
2022-08-29 11:55:02 -04:00
m-holger
c53d54b13d
Add optional parameter allow_nullptr to QPDFObjectHandle::getOwningQPDF
...
Also, inline method and add optional parameter error_msg.
2022-08-28 22:15:59 +01:00
m-holger
b0c1ae05a3
Fix commit b45420a
2022-08-27 12:43:49 +01:00
m-holger
fc4feb6f1a
Remove BufferInputSource::Members
2022-08-27 12:19:51 +01:00
m-holger
d6a447b654
Remove ClosedFileInputSource::Members
2022-08-27 12:13:39 +01:00
m-holger
69a5fb7047
Add methods InputSource::fastRead, fastUnRead and fastTell
...
Provide buffered input for QPDFTokenizer.
2022-08-26 23:55:56 +01:00
m-holger
13ef50cd27
Avoid virtual method call in FileInputSource::read
2022-08-25 15:08:03 +01:00
m-holger
a318b203be
Refactor FileInputSource::seek and FileInputSource::unreadCh
...
Avoid building error message each call "just in case".
2022-08-25 15:04:41 +01:00
m-holger
dc5c8b82eb
Remove FileInputSource::Members
2022-08-25 12:42:14 +01:00
m-holger
7108cd7b98
Remove redundant tests in QPDFTokenizer::readToken
2022-08-25 11:32:08 +01:00
m-holger
10fda01b07
In QPDFTokenizer::readToken move call to getToken out of loop
2022-08-25 11:31:45 +01:00
m-holger
e4073ee868
Remove unnecessary string copy in QPDFTokenizer::getToken
2022-08-25 11:31:09 +01:00
m-holger
b45420a980
Remove QPDFTokenizer::unread_char
2022-08-25 11:30:49 +01:00
m-holger
706106dabb
Refactor QPDFTokenizer::betweenTokens()
2022-08-25 11:30:35 +01:00
m-holger
6371b90ae3
Refactor QPDFTokenizer::presentEOF
2022-08-25 11:30:24 +01:00
m-holger
42ed58e446
Integrate booleans and null into state machine in QPDFTokenizer
2022-08-25 11:30:13 +01:00
m-holger
fe33b7ca18
Integrate numbers into state machine in QPDFTokenizer
2022-08-25 11:26:46 +01:00
m-holger
931fbb6156
Integrate names into state machine in QPDFTokenizer
2022-08-25 11:26:38 +01:00
m-holger
a3f3238f37
Split QPDFTokenizer::handleCharacter into individual methods
2022-08-25 11:26:05 +01:00
m-holger
6111a6a424
Refactor QPDFTokenizer::inCharCode
2022-08-25 10:55:45 +01:00
m-holger
e7889ec5dc
Refactor st_top case in QPDFTokenizer::handleCharacter
2022-08-25 10:51:51 +01:00
m-holger
e4fe0d5cf5
Refactor QPDFTokenizer::inHexstring
2022-08-25 10:50:06 +01:00
m-holger
a5d2e88775
Code tidy: replace if with case statement in QPDFTokenizer::inString
2022-08-25 10:43:29 +01:00
m-holger
7c32f6cc2e
Add state st_string_escape in QPDFTokenizer
2022-08-25 10:41:36 +01:00
m-holger
7c5778f999
Add state st_string_after_cr in QPDFTokenizer
2022-08-21 11:13:48 +01:00
m-holger
f29d0a6312
Add state st_char_code in QPDFTokenizer
2022-08-21 11:01:48 +01:00
m-holger
d26b537a7c
Add private method QPDFTokenizer::inString
2022-08-21 02:54:34 +01:00
m-holger
2697ba49bc
Add private method QPDFTokenizer::inHexstring
2022-08-21 02:46:31 +01:00
m-holger
f9530a5815
Code tidy: replace if with case statement in QPDFTokenizer::handleCharacter
2022-08-21 02:38:49 +01:00
m-holger
86ade3f9cd
Add private method QPDFTokenizer::handleCharacter
2022-08-21 02:26:27 +01:00
m-holger
91fb61eda5
Code tidy: replace if with case statement in QPDFTokenizer::presentCharacter
2022-08-21 00:54:41 +01:00
m-holger
cf945eeabf
Avoid shrinking QPDFTokenizer::val and QPDFTokenizer::raw_val
2022-08-20 19:43:00 +01:00
m-holger
45a6100cbb
Inline QUtil functions used by QPDFTokenizer
2022-08-18 15:23:35 +01:00
m-holger
c08bb0ec02
Remove QPDFTokenizer::Members
2022-08-18 13:13:19 +01:00
Jay Berkenbilt
cef6425bca
Disable QTC inside the library by default (fixes #714)
...
This results in measurable performance improvements to packaged binary
libqpdf distributions. QTC remains available for library users and is
still selectively enabled in CI.
2022-08-07 16:20:49 -04:00
Jay Berkenbilt
da71dc6f37
QTC: cache get_env results for improved performance
...
It turns out that QUtil::get_env is particularly expensive on Windows
if there is a large environment. This may be true on other platforms
as well.
2022-08-07 14:23:05 -04:00
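The caching idea behind commit da71dc6f37 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not qpdf's C++ code: `cached_get_env` is a hypothetical helper, and the standard `functools.lru_cache` stands in for whatever cache the library actually uses.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_get_env(name):
    """Look up an environment variable once and remember the answer.

    Hypothetical helper: it only illustrates the caching idea.
    """
    return os.environ.get(name)


os.environ["DEMO_SCOPE"] = "qpdf"
first = cached_get_env("DEMO_SCOPE")   # reads the real environment
os.environ["DEMO_SCOPE"] = "changed"
second = cached_get_env("DEMO_SCOPE")  # served from the cache, not re-read
```

The trade-off is visible in the last two lines: once a value is cached, later changes to the environment are no longer observed, which is acceptable only when the variable is expected to stay fixed for the life of the process.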
Jay Berkenbilt
32e30a3af2
Resolve QPDF{Name,Number} tree helper linker issues (fixes #745)
...
This is a guess...I'm not sure exactly why there are linker issues or
how to reproduce them.
2022-08-07 09:21:01 -04:00
Jay Berkenbilt
b90adb1c6c
Merge pull request #746 from m-holger/smart
...
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 08:41:50 -04:00
m-holger
7c6901bce5
Code tidy: remove redundant calls to smart_ptrs get() method
2022-08-07 10:33:25 +01:00
Jay Berkenbilt
3ec43f055a
Fix parsing comment
2022-08-06 14:24:08 -04:00
Jay Berkenbilt
a3037ca440
Merge pull request #739 from m-holger/getobject
...
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
2022-08-06 14:23:56 -04:00
m-holger
1553868c4a
Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
...
For consistency with similar methods, e.g. replaceObject.
2022-08-01 19:22:37 +01:00
m-holger
407b0766b8
Inline QPDFObjectHandle::getObjGen etc
...
Also, make QPDFObjectHandle::isIndirect const.
2022-08-01 15:08:48 +01:00
m-holger
903a86643a
Fix code formatting of QPDF::pushInheritedAttributesToPageInternal
2022-08-01 13:54:51 +01:00
m-holger
0356bcecc5
Tidy QPDF::pushInheritedAttributesToPageInternal
...
Remove unnecessary parameters.
Remove code that is unnecessary as a result of a prior call to QPDF::getAllPages.
Avoid clearing and rebuilding of m->all_pages.
2022-08-01 13:29:14 +01:00
m-holger
ff69773b35
Fix warnings in QPDF::getAllPagesInternal
2022-08-01 13:29:14 +01:00
m-holger
9dea7d3080
Tune QPDF::getAllPagesInternal
...
Avoid calling getAllPagesInternal for each /Page object.
2022-08-01 13:29:14 +01:00
m-holger
4ccca20db0
Remove redundant parameter from QPDF::getAllPagesInternal
2022-08-01 13:29:14 +01:00
Jay Berkenbilt
5d63730b93
Clean up documentation
2022-07-31 16:26:02 -04:00
Jay Berkenbilt
12d065c751
Provide a simpler QPDF::writeJSON
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
13cf35ce2f
Use calledgetallpages and pushedinheritedpageresources
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
5f4224f31a
Simplify --json-output
...
Now --json-output just changes defaults. Allow output file with --json.
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
80acfc3826
Fix --json-help to take a version parameter
2022-07-31 16:23:17 -04:00
Jay Berkenbilt
69820847af
Change the output of --json to use "qpdf" instead of "objects"
2022-07-31 15:17:01 -04:00
Jay Berkenbilt
d01c4f8819
Change --json-output format
...
from "qpdf-v2" to "qpdf": [..., ...]
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
bb96499b61
Update docs and prepare QPDF::writeJSON for changes
...
Add additional parameters that will be needed to call QPDF::writeJSON
in partial mode.
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
0e3d4cdc97
Fix/clarify meaning of depth parameter to json write methods
2022-07-31 10:32:55 -04:00
Jay Berkenbilt
4feb10fdaf
Merge pull request #734 from m-holger/nullptr
...
Code tidy : replace 0 with nullptr or true
2022-07-31 08:33:45 -04:00
m-holger
073808aa50
Code tidy : replace 0 with nullptr or true
2022-07-26 13:40:13 +01:00
Jay Berkenbilt
4674c04cb8
JSON schema: support multi-element array validation
2022-07-24 16:44:51 -04:00
Jay Berkenbilt
f8d1ab9462
JSON schema -- accept single item in place of array
...
When the schema wants a variable-length array, allow a single item as
well as allowing an array.
2022-07-24 16:17:03 -04:00
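The rule described in commit f8d1ab9462 (a single item is accepted wherever the schema wants a variable-length array) might look like this in outline. The function names and the per-item predicate are assumptions made for illustration; the real validator lives in qpdf's C++ JSON class and is not reproduced here.

```python
def validate_array(value, item_ok):
    """Return True if value satisfies a variable-length-array schema node.

    A bare item is treated like a one-element array. `item_ok` is a
    hypothetical per-item predicate supplied by the caller.
    """
    items = value if isinstance(value, list) else [value]
    return all(item_ok(item) for item in items)


def is_string(item):
    """Example predicate: the schema expects an array of strings."""
    return isinstance(item, str)
```

With this sketch, `validate_array(["a", "b"], is_string)` and `validate_array("a", is_string)` both succeed, while `validate_array(3, is_string)` fails.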
Jay Berkenbilt
b3e6d445cb
Tweak "AndGet" mutator functions again
...
Remove any ambiguity around whether old or new value is being
returned.
2022-07-24 15:42:23 -04:00
m-holger
8b4afa428e
Revert making second parameter of QPDFObjGen::QPDFObjGen optional
...
Also, change test for QPDFObjGen::isIndirect to obj != 0.
Delete comment from commit afd35f9.
2022-07-24 16:55:10 +01:00
m-holger
afd35f9a30
Overload StreamDataProvider::provideStreamData
...
Use 'QPDFObjGen const&' instead of 'int, int' in signature.
2022-07-24 16:02:35 +01:00
m-holger
5d0469f1bc
QPDFObjGen : tidy QPDFJob
...
Use QPDFObjGen::unparse where appropriate.
2022-07-24 16:02:35 +01:00
m-holger
4b73d057fb
QPDFObjGen : tidy QPDF_Stream
...
Change method signatures to use QPDFObjGen.
Replace QPDF_Stream::objid and generation with QPDF_Stream::og.
2022-07-24 16:02:35 +01:00
m-holger
f7978db1f6
QPDFObjGen : tidy QPDF private methods
...
Change method signatures to use QPDFObjGen.
Use QPDFObjGen methods where possible.
Remove redundant QPDF::objGenToIndirect.
2022-07-24 16:02:35 +01:00
m-holger
3404ca8ac8
QPDFObjGen : tidy QPDFObjectHandle private methods
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
b123f79dfd
Replace QPDFObjectHandle::objid and generation with QPDFObjectHandle::og
2022-07-24 15:59:49 +01:00
m-holger
c0168cf88c
QPDFObjGen : tidy QPDF::readObjectAtOffset
...
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger
eeb6162f76
Add optional parameter separator to QPDFObjGen::unparse
...
Also, revert inlining of unparse and operator << from commit 4c6640c in
order to avoid exposing QUtil.
2022-07-24 15:41:48 +01:00
Jay Berkenbilt
6f1041afb8
Clarify intent in readObjectAtOffset
...
Rather than using object id -1 to mean "don't care", use object ID 0,
and clarify the difference between that use and indication of a direct
object.
2022-07-24 09:40:11 -04:00
m-holger
4c6640cb45
Inline QPDFObjGen methods
...
ABI breaking change
2022-07-16 14:32:48 -04:00
Jay Berkenbilt
a603c1e395
Run format-code
2022-06-27 12:50:35 -04:00
m-holger
f0a8178091
Refactor QPDFObject creation and cloning
...
Move responsibility for creating shared pointers to objects and cloning from QPDFObjectHandle to QPDFObject.
2022-06-27 12:47:02 -04:00
m-holger
5aa8225f49
Refactor QPDFObjectTypeAccessor and QPDFObjectHandle::dereference
2022-06-27 10:39:04 -04:00
Jay Berkenbilt
0c7c7e4ba4
Track whether certain page modifying methods have been called
...
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
2022-06-25 13:55:45 -04:00
Jay Berkenbilt
25aff0bd52
TODO: abandon (again) and update notes about QPDFPagesTree
2022-06-25 13:26:53 -04:00
Jay Berkenbilt
8a32515a62
Add warnings for some additional page tree repair
2022-06-25 13:25:35 -04:00
Jay Berkenbilt
6c4537885e
Reformat code
2022-06-25 11:11:24 -04:00
m-holger
7836e19747
Code tidy: remove redundant calls to QPDFObjectHandle::isInitialized
2022-06-25 11:10:06 -04:00
m-holger
3b3bcab349
Remove QPDF_Stream::setStreamDescription
2022-06-25 08:26:46 -04:00
m-holger
9eda1fdc41
Remove redundant QPDF_Array::setDescription and QPDF_Dictionary::setDescription
2022-06-25 08:25:58 -04:00
m-holger
e9c1637353
Add private method QPDFObjectHandle::getObjGenAsStr
...
Also, use methods to access objid and generation.
2022-06-25 08:25:32 -04:00
m-holger
97f737a562
Code tidy: QPDFJob::doJSONPageLabels
...
Remove redundant variables pages and next.
2022-06-25 08:24:50 -04:00
Jay Berkenbilt
1eb2f208ec
Use Pl_Function in qpdflogger C API implementation
2022-06-19 09:12:59 -04:00
Jay Berkenbilt
eae75dbe44
Add Pl_Function -- a generic function pipeline
2022-06-19 09:12:29 -04:00
Jay Berkenbilt
bb0ea2f8e7
Add qpdfjob_register_progress_reporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
87412eb05b
Add QPDFJob::registerProgressReporter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
3a7ee7e938
Move C-based ProgressReporter helper into QPDFWriter
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
8130d50e3b
Add C API to QPDFLogger
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
daef4e8fb8
Add more flexible functions to qpdfjob C API
2022-06-19 08:46:58 -04:00
Jay Berkenbilt
e0720eaa78
Use the default logger for other writes to stdout/stderr
...
When there is no context for writing output or error messages, use the
default logger.
2022-06-18 10:38:50 -04:00
Jay Berkenbilt
83be2191b4
Use "save" logger when saving data to standard output
...
This includes the output PDF, streams from --show-object and
attachments from --save-attachment. This also enables --verbose and
--progress to work with saving to stdout.
2022-06-18 09:54:40 -04:00
Jay Berkenbilt
641e92c6a7
QPDF, QPDFJob: use QPDFLogger instead of custom output streams
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f1f711963b
Add and test QPDFLogger class
2022-06-18 09:02:55 -04:00
Jay Berkenbilt
f588d74140
Add integer types to Pipeline::operator<<
2022-06-18 09:02:55 -04:00
m-holger
057bd659bc
Code tidy: remove redundant variable in QPDF::writeJSON
2022-06-05 18:46:21 -04:00
Jay Berkenbilt
0bd908b550
Update documentation for qpdf JSON v2
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
b7bbf12e85
In json mode, reveal recovered user password when otherwise unavailable
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
f049a77c59
Add additional information when listing attachments
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
04fc7c4bea
Add conversions to ISO-8601 date format
2022-05-30 20:03:08 -04:00
Jay Berkenbilt
27a42c16c7
Change default decode level to "none" with --json-output
2022-05-21 17:51:34 -04:00
Jay Berkenbilt
752f43d4e4
Allow empty b: binary JSON strings
2022-05-21 17:36:32 -04:00
Jay Berkenbilt
05460d405c
Format code
2022-05-21 16:11:42 -04:00
m-holger
6c69a747b9
Code clean up: use range-style for loops wherever possible
...
Remove variables obsoleted by commit 4f24617.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt
c56a9ca7f6
JSON: Fix large file support
2022-05-21 09:43:45 -04:00
Jay Berkenbilt
47c093c48b
Replace std::regex with validators for better performance
2022-05-21 08:43:21 -04:00
Jay Berkenbilt
9b2eb01e25
Exercise object description in tests
2022-05-20 14:23:32 -04:00
Jay Berkenbilt
6c2fb5b8f0
Add test for bad data and bad datafile
2022-05-20 13:33:30 -04:00
Jay Berkenbilt
d065098089
Test --update-from-json
2022-05-20 11:10:12 -04:00
Jay Berkenbilt
ef955b04b5
Bug fix: don't clobber stream length with replaceDict
2022-05-20 11:09:45 -04:00
Jay Berkenbilt
3eb77a7004
JSON: detect duplicate dictionary keys while parsing
2022-05-20 10:13:15 -04:00
Jay Berkenbilt
6d4e3ba8a4
Test (and fix) handling of dangling references
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
5a2aa59479
Bug fix: isReserved() true for indirect reference to reserved object
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
35b1e1c493
Explicitly test ignoring unknown keys in JSON input
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
dc8df962d8
Make version default to latest for --json-output (like --json)
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
6c7326b290
JSON fix: correctly parse UTF-16 surrogate pairs
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
6f43bf8de3
Major rework -- see long comments
...
* Replace --create-from-json=file with --json-input, which causes the
regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
stream data.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
23fc6756f1
Add QUtil::FileCloser to the public API
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
0fe8d44762
Support stream data -- not tested
...
There are no automated tests yet, but committing work so far in
preparation for some refactoring.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
63c7eefe9d
replaceStreamData: accept uninitialized filter/decode_parms
...
These mean to leave the original values alone. This is needed for
reconstructing streams from JSON given that the stream data and stream
dictionary may appear in any order in the JSON.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
56f1b411fe
Back out fluent QPDFObjectHandle methods. Keep the andGet methods.
...
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
7e7a9c4379
Parse objects; stream data is not yet handled
2022-05-20 09:16:25 -04:00
Jay Berkenbilt
9064542b5f
Add private methods for reserving specific objects
2022-05-20 07:54:09 -04:00
Jay Berkenbilt
7fa5d1773b
Implement top-level qpdf json parsing
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
8d42eb2632
Add scaffolding for QPDF JSON reactor
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
4fe2e06b47
Add --create-from-json and --update-from-json arguments
...
Also add stubs for top-level QPDF methods (createFromJSON,
updateFromJSON)
2022-05-16 13:41:40 -04:00
Jay Berkenbilt
9a0e9a1a9e
Remove offset from missing /Root error
...
The last offset is irrelevant to not being able to find /Root.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
051ae7c282
Improve handling of replacing stream data with empty strings
...
When an empty string was passed to replaceStreamData, the code was
passing a null pointer to memcpy. Since a 0 size was also passed, this
was harmless, but it triggers sanitizer errors. The code properly
handles a null pointer as the buffer in other places.
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
60ec94a7c3
Add QUtil::is_long_long
2022-05-16 13:39:26 -04:00
Jay Berkenbilt
4c7cfd5cbc
JSON reactor: improve handling of nested containers
...
Call the parent container's item method before calling the child
item's start method so we can easily know the current nesting level
when nested items are added.
2022-05-14 17:35:06 -04:00
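The callback ordering described in commit 4c7cfd5cbc can be illustrated with a toy traversal. This is a sketch under assumptions: the event names `start`, `item`, and `end` and the `walk` function are hypothetical, and only nested lists are modeled.

```python
def walk(container, events, depth=0):
    """Record reactor-style events for a nested list.

    For each child, the parent's "item" event is recorded first; only
    then does a nested child record its own "start" event.
    """
    events.append(("start", depth))
    for child in container:
        events.append(("item", depth))      # parent is told about the child
        if isinstance(child, list):
            walk(child, events, depth + 1)  # child container starts afterwards
    events.append(("end", depth))
    return events
```

Because the parent's `item` event always precedes the child's `start`, a handler that tracks depth already knows the nesting level at the moment a nested container begins.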
Jay Berkenbilt
2a2f7f1bba
Add maxobjectid to JSON
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
e9390aeaaa
Add --to-json option
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
c76536dd9a
Implement JSON v2 output
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
15272662f6
Fix typo in json output key name
...
moddify -> modify. Also carefully spell checked all remaining keys by
splitting them into words and running a spell checker, not just
relying on visual proofreading. That was the only one.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
1bc8abfdd3
Implement JSON v2 for Stream
...
Not fully exercised in this commit
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
3246923cf2
Implement JSON v2 for String
...
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
16f4f94cd9
Prepare code for JSON v2
...
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt
a9fbbd5dca
Objectinfo json: write incrementally and in numeric order
...
This script was used on test data:
----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
948de60990
Objects json: write incrementally and in numeric order
...
The following script was used to adjust test data:
----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
f50274ef46
Pages json: write each page incrementally
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
dc9b7287cd
Top-level json: write incrementally
...
This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.
----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7f65a5c21f
Test json against schema only on demand
...
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
a3c9980395
Add next to Pl_String and fix comments
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
b361c5ce19
Add --test-json-schema command-line option
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7604ac5cb2
QPDFJob: have doJSON write to a pipeline
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
0500d4347a
JSON: add blob type that generates base64-encoded binary data
2022-05-06 19:14:52 -04:00
Jay Berkenbilt
05fda4afa2
Change JSON parser to parse from an InputSource
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e5f3910c3e
Add new FileInputSource constructors
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e259635986
JSON: add write methods and implement unparse() in terms of those
2022-05-04 12:07:11 -04:00
Jay Berkenbilt
8b25de24c9
Make "objects" and "pages" consistent in JSON output
2022-05-04 08:32:44 -04:00
Jay Berkenbilt
6b576797cd
Don't call pushInheritedAttributesToPage in json mode
...
We used to have to do that, but for quite some time, the code that
gets images has no longer required it.
2022-05-04 07:11:13 -04:00
Jay Berkenbilt
f4206a0938
Add new Pl_String Pipeline
2022-05-03 18:54:51 -04:00
Jay Berkenbilt
16139d97c8
Add new Pl_OStream Pipeline
2022-05-03 18:54:51 -04:00
Jay Berkenbilt
21d6e3231f
Make use of the new Pipeline methods in some places
2022-05-03 18:31:23 -04:00
Jay Berkenbilt
f1c6bb97db
Add new Pipeline convenience methods
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
59f3e09edf
Make Pipeline::write take an unsigned char const* (API change)
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
62bf296a9c
Make assert handling less error-prone
...
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
92b692466f
Remove remaining incorrect assert calls from implementation
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
3d9bac43da
Add internal Pl_Base64
...
Bidirectional base64; will be used by JSON v2.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
6724a362c3
Move generate_auto_job to the top-level CMakeLists.txt
2022-05-03 08:39:50 -04:00
Jay Berkenbilt
8d2a0eda5a
Add reactors to the JSON parser
2022-05-01 19:55:52 -04:00
Jay Berkenbilt
72e5c73419
Limit parser depth for json parser
2022-05-01 12:56:22 -04:00
Jay Berkenbilt
e34dbbfa18
Spell check
2022-05-01 12:56:22 -04:00
Jay Berkenbilt
8ccd3a8a89
Mark weak encryption with API changes (fixes #576)
2022-04-30 17:24:15 -04:00
Jay Berkenbilt
2213ed0c3d
Remove deprecated (pre-8.4.0) encryption APIs
2022-04-30 17:23:58 -04:00
Jay Berkenbilt
cff26040d8
Using insecure crypto from the CLI is now an error by default
2022-04-30 17:23:58 -04:00
Jay Berkenbilt
ce19471f18
Add comments around non-security-related uses of MD5
2022-04-30 14:15:07 -04:00
Jay Berkenbilt
c365a26e9d
Revert "Remove QPDFObjectHandle::replaceOrRemoveKey"
...
This reverts commit dc059560e7.
I changed my mind. There's no harm in leaving it deprecated for a
release cycle.
2022-04-30 14:15:07 -04:00
Jay Berkenbilt
dc059560e7
Remove QPDFObjectHandle::replaceOrRemoveKey
...
See ChangeLog for rationale for not deprecating it as originally
planned.
2022-04-30 13:39:45 -04:00
Jay Berkenbilt
4f24617e1e
Code clean up: use range-style for loops wherever possible
...
Where not possible, use "auto" to get the iterator type.
Editorial note: I have avoided this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more efficient as fewer temporary copies are
being made.
m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt
7f023701dd
Formatting: remove space in range-style for loops
...
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt
2878c186bf
Use fluent appendItem
2022-04-30 10:54:16 -04:00
Jay Berkenbilt
ab9d557cb0
Use fluent replaceKey
2022-04-29 20:39:54 -04:00
Jay Berkenbilt
d8fdf632a9
Use replaceKeyAndGet in a few places in existing code
2022-04-29 20:28:02 -04:00
Jay Berkenbilt
e80fad86e9
Add new QPDFObjectHandle methods for more fluent programming
2022-04-29 20:09:10 -04:00
Jay Berkenbilt
d0b7cc8ac6
QPDFJob json: make removeAttachment take an array (fixes #693)
2022-04-24 13:06:19 -04:00
Jay Berkenbilt
63c5a56f38
Fix build logic around generate_auto_job
...
It was being run at configuration time, not build time.
2022-04-24 13:06:16 -04:00
Jay Berkenbilt
08ba21cf49
Fix some bugs around null values in dictionaries
...
Make it so that a key with a null value is always treated as not being
present. This was inconsistent before.
2022-04-24 10:08:32 -04:00
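The invariant stated in commit 08ba21cf49 (a key with a null value is always treated as not being present) can be sketched with a toy dictionary. This is an illustration only: `PdfDict` and its snake_case methods are hypothetical, `None` stands in for a PDF null, and dropping the key on assignment is one assumed way to keep the invariant, not a statement of how qpdf implements it.

```python
class PdfDict:
    """Toy dictionary in which a key whose value is None counts as absent."""

    def __init__(self, items=None):
        self._items = {}
        for key, value in (items or {}).items():
            self.replace_key(key, value)

    def replace_key(self, key, value):
        # Storing a null is treated the same as removing the key.
        if value is None:
            self._items.pop(key, None)
        else:
            self._items[key] = value

    def has_key(self, key):
        return key in self._items

    def get_keys(self):
        return sorted(self._items)
```

For example, `PdfDict({"/A": 1, "/B": None}).get_keys()` yields only `["/A"]`, so every query sees the same answer about `/B`.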
Jay Berkenbilt
4be2f36049
Deprecate replaceOrRemoveKey -- it's the same as replaceKey
2022-04-24 09:31:32 -04:00
Jay Berkenbilt
4925f0d18c
Have dictionary/streams mutators take const& where possible
2022-04-24 09:05:50 -04:00
Jay Berkenbilt
68e721981a
Add new QPDF::warn that takes most of QPDFExc's arguments
2022-04-23 18:25:43 -04:00
Jay Berkenbilt
22b35c4928
Expose QUtil::get_next_utf8_codepoint
2022-04-23 18:25:43 -04:00
Jay Berkenbilt
5bbb0d4c30
Replace switch statements with static map initializers
...
Character transcoding from Unicode to single-byte characters used
hard-coded switch statements because the code predated our adoption of
C++11. Now we have thread-safe, static initialization of map literals,
so use that instead.
2022-04-23 18:25:43 -04:00