mirror of https://github.com/qpdf/qpdf.git synced 2024-11-16 01:27:07 +00:00
Commit Graph

2572 Commits

Author SHA1 Message Date
Jay Berkenbilt
3246923cf2 Implement JSON v2 for String
Also refine the heuristic for deciding whether to use hexadecimal
notation for a string.
2022-05-08 13:45:20 -04:00
Jay Berkenbilt
16f4f94cd9 Prepare code for JSON v2
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt
a9fbbd5dca Objectinfo json: write incrementally and in numeric order
This script was used on test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
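
To illustrate why the script sorts on the integer object number rather than on the key text, here is a minimal sketch that reuses the same regular expression. The sample dictionary and the helper name `numeric_order` are hypothetical and not taken from the test suite.

```python
import re

# Hypothetical sample; real qpdf JSON output has many more keys.
objectinfo = {
    "10 0 R": {"stream": False},
    "trailer": {"note": "kept last"},
    "2 0 R": {"stream": True},
    "1 0 R": {"stream": False},
}


def numeric_order(mapping):
    """Return a new dict with 'N G R' keys in numeric order, 'trailer' last."""
    numbered = []
    for key, value in mapping.items():
        m = re.match(r'^(\d+) \d+ R', key)
        if m:
            numbered.append((int(m.group(1)), key, value))
    # Sort on the integer object number only.
    numbered.sort(key=lambda item: item[0])
    result = {key: value for _, key, value in numbered}
    if 'trailer' in mapping:
        result['trailer'] = mapping['trailer']
    return result


# A plain string sort places "10 0 R" before "2 0 R".
lexical = sorted(k for k in objectinfo if k != 'trailer')
numeric = list(numeric_order(objectinfo))
print(lexical)
print(numeric)
```

Since Python dictionaries preserve insertion order, rebuilding the dictionary in sorted order is enough to control the order in which `json.dumps` writes the keys.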
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
948de60990 Objects json: write incrementally and in numeric order
The following script was used to adjust test data:

----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    print(json_dumps(data))
----------
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
f50274ef46 Pages json: write each page incrementally 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
1615d7feaf Make JSON::writeNext public 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
dc9b7287cd Top-level json: write incrementally
This commit just changes the order in which fields are written to the
json without changing their content. All the json files in the test
suite were modified with this script to ensure that we didn't get any
changes other than ordering.

----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
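
A quick way to see that such a rewrite changes only ordering: Python compares dictionaries without regard to key order, so the reordered data should still compare equal to the original. This is a sketch under assumptions; the sample input and the helper name `reorder` are hypothetical, and only the key list comes from the script above.

```python
import json

# Key order taken from the script in this commit message.
ORDER = ('version', 'parameters', 'pages', 'pagelabels',
         'acroform', 'attachments', 'encrypt', 'outlines',
         'objects', 'objectinfo')


def reorder(data):
    """Rebuild the top-level dict in the fixed key order."""
    return {key: data[key] for key in ORDER if key in data}


# Hypothetical sample document with keys in a different order.
before = json.loads('{"objects": {"1 0 R": 1}, "pages": [], "version": 1}')
after = reorder(before)

print(list(after))      # keys now follow ORDER
print(after == before)  # dict equality ignores key order
```

Note that any top-level key missing from the list is dropped by this approach, so the equality check holds only when every key in the input appears in `ORDER`.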
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7f65a5c21f Test json against schema only on demand
Testing json against schema requires an in-memory copy, so do it only
when requested by the test suite.
2022-05-07 08:26:31 -04:00
Jay Berkenbilt
a3c9980395 Add next to Pl_String and fix comments 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
b361c5ce19 Add --test-json-schema command-line option 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
7604ac5cb2 QPDFJob: have doJSON write to a pipeline 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
2a92b1b0d6 TODO: solidify remaining json v2 work 2022-05-07 08:26:31 -04:00
Jay Berkenbilt
0500d4347a JSON: add blob type that generates base64-encoded binary data 2022-05-06 19:14:52 -04:00
Jay Berkenbilt
05fda4afa2 Change JSON parser to parse from an InputSource 2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e5f3910c3e Add new FileInputSource constructors 2022-05-04 12:07:11 -04:00
Jay Berkenbilt
e259635986 JSON: add write methods and implement unparse() in terms of those 2022-05-04 12:07:11 -04:00
Jay Berkenbilt
8b25de24c9 Make "objects" and "pages" consistent in JSON output 2022-05-04 08:32:44 -04:00
Jay Berkenbilt
6b576797cd Don't call pushInheritedAttributesToPage in json mode
We used to have to do that, but for quite some time, the code that
gets images has no longer required it.
2022-05-04 07:11:13 -04:00
Jay Berkenbilt
f4206a0938 Add new Pl_String Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt
16139d97c8 Add new Pl_OStream Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt
21d6e3231f Make use of the new Pipeline methods in some places 2022-05-03 18:31:23 -04:00
Jay Berkenbilt
f1c6bb97db Add new Pipeline convenience methods 2022-05-03 18:31:22 -04:00
Jay Berkenbilt
59f3e09edf Make Pipeline::write take an unsigned char const* (API change) 2022-05-03 18:31:22 -04:00
Jay Berkenbilt
d55c7ac570 Spell check with newer cSpell 2022-05-03 18:31:22 -04:00
Jay Berkenbilt
62bf296a9c Make assert handling less error-prone
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
92b692466f Remove remaining incorrect assert calls from implementation 2022-05-03 18:31:22 -04:00
Jay Berkenbilt
b20f051922 TODO note about test suites 2022-05-03 18:31:22 -04:00
Jay Berkenbilt
3d9bac43da Add internal Pl_Base64
Bidirectional base64; will be used by JSON v2.
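
As a rough illustration of why base64 matters for JSON, the sketch below round-trips arbitrary bytes through a JSON string. It uses Python's standard `base64` module, not qpdf's Pl_Base64, and the `"data"` key is a hypothetical name.

```python
import base64
import json

# Arbitrary binary data that is not valid UTF-8 text.
raw = bytes([0, 255, 16, 128])

# Encode direction: bytes to ASCII-safe text that can live in a JSON string.
encoded = base64.b64encode(raw).decode('ascii')
doc = json.dumps({"data": encoded})  # "data" is a hypothetical key

# Decode direction: recover the original bytes from the JSON text.
decoded = base64.b64decode(json.loads(doc)["data"])

print(encoded)
print(decoded == raw)
```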
2022-05-03 18:31:22 -04:00
Jay Berkenbilt
f07284da18 Make sure building docs updates job.sums if needed 2022-05-03 08:39:50 -04:00
Jay Berkenbilt
6724a362c3 Move generate_auto_job to the top-level CMakeLists.txt 2022-05-03 08:39:50 -04:00
Jay Berkenbilt
7882b85b06 TODO: more JSON notes 2022-05-03 08:39:50 -04:00
Jay Berkenbilt
3c4d2bfb21 TODO: JSON notes 2022-05-03 08:39:50 -04:00
Jay Berkenbilt
8d2a0eda5a Add reactors to the JSON parser 2022-05-01 19:55:52 -04:00
Jay Berkenbilt
f5dd63819d Windows perl workaround 2022-05-01 19:55:52 -04:00
Jay Berkenbilt
ab01045bcd qtest: don't run coverage when TESTS is given 2022-05-01 13:25:51 -04:00
Jay Berkenbilt
72e5c73419 Limit parser depth for json parser 2022-05-01 12:56:22 -04:00
Jay Berkenbilt
e34dbbfa18 Spell check 2022-05-01 12:56:22 -04:00
Jay Berkenbilt
04118ca44b TODO item 2022-05-01 12:56:22 -04:00
Jay Berkenbilt
8ccd3a8a89 Mark weak encryption with API changes (fixes #576) 2022-04-30 17:24:15 -04:00
Jay Berkenbilt
2213ed0c3d Remove deprecated (pre-8.4.0) encryption APIs 2022-04-30 17:23:58 -04:00
Jay Berkenbilt
7608ff4e0b TODO: reminder to look for deprecated APIs in ABI section 2022-04-30 17:23:58 -04:00
Jay Berkenbilt
cff26040d8 Using insecure crypto from the CLI is now an error by default 2022-04-30 17:23:58 -04:00
Jay Berkenbilt
ce19471f18 Add comments around non-security-related uses of MD5 2022-04-30 14:15:07 -04:00
Jay Berkenbilt
c365a26e9d Revert "Remove QPDFObjectHandle::replaceOrRemoveKey"
This reverts commit dc059560e7.

I changed my mind. There's no harm in leaving it deprecated for a
release cycle.
2022-04-30 14:15:07 -04:00
Jay Berkenbilt
dc059560e7 Remove QPDFObjectHandle::replaceOrRemoveKey
See ChangeLog for rationale for not deprecating it as originally
planned.
2022-04-30 13:39:45 -04:00
Jay Berkenbilt
0122f44865 TODO: remove a few discarded API change ideas
I had some ideas about some more convenience methods from discussions
with some developers, but I decided that the newly added ones cover
most of the use cases. The other ideas were too hard to explain
clearly and therefore too specialized to put into the public API,
where I would have to support them for a long time.
2022-04-30 13:30:53 -04:00
Jay Berkenbilt
4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoided this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more efficient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt
7f023701dd Formatting: remove space in range-style for loops
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt
2878c186bf Use fluent appendItem 2022-04-30 10:54:16 -04:00
Jay Berkenbilt
ab9d557cb0 Use fluent replaceKey 2022-04-29 20:39:54 -04:00