octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-12-23 03:18:59 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	051ae7c282	Improve handling of replacing stream data with empty strings. When an empty string was passed to replaceStreamData, the code was passing a null pointer to memcpy. Since a 0 size was also passed, this was harmless, but it triggers sanitizer errors. The code properly handles a null pointer as the buffer in other places.	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	60ec94a7c3	Add QUtil::is_long_long	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	4c7cfd5cbc	JSON reactor: improve handling of nested containers. Call the parent container's item method before calling the child item's start method so we can easily know the current nesting level when nested items are added.	2022-05-14 17:35:06 -04:00
Jay Berkenbilt	2a2f7f1bba	Add maxobjectid to JSON	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	e9390aeaaa	Add --to-json option	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	c76536dd9a	Implement JSON v2 output	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	15272662f6	Fix typo in json output key name: moddify -> modify. Also carefully spell checked all remaining keys by splitting them into words and running a spell checker, not just relying on visual proofreading. That was the only one.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	1bc8abfdd3	Implement JSON v2 for Stream. Not fully exercised in this commit.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	3246923cf2	Implement JSON v2 for String. Also refine the heuristic for deciding whether to use hexadecimal notation for a string.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	16f4f94cd9	Prepare code for JSON v2. Update getJSON() methods and calls to them.	2022-05-07 11:12:01 -04:00
Jay Berkenbilt	a9fbbd5dca	Objectinfo json: write incrementally and in numeric order. This script was used on test data (shown below this row):	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
Jay Berkenbilt	948de60990	Objects json: write incrementally and in numeric order. The following script was used to adjust test data (shown below this row):	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys
import re

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    print(json_dumps(data))
----------
Jay Berkenbilt	f50274ef46	Pages json: write each page incrementally	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	dc9b7287cd	Top-level json: write incrementally. This commit just changes the order in which fields are written to the json without changing their content. All the json files in the test suite were modified with this script (shown below this row) to ensure that we didn't get any changes other than ordering.	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys

def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))

for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
Jay Berkenbilt	7f65a5c21f	Test json against schema only on demand. Testing json against schema requires an in-memory copy, so do it only when requested by the test suite.	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	a3c9980395	Add next to Pl_String and fix comments	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	b361c5ce19	Add --test-json-schema command-line option	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	7604ac5cb2	QPDFJob: have doJSON write to a pipeline	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	0500d4347a	JSON: add blob type that generates base64-encoded binary data	2022-05-06 19:14:52 -04:00
Jay Berkenbilt	05fda4afa2	Change JSON parser to parse from an InputSource	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	e5f3910c3e	Add new FileInputSource constructors	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	e259635986	JSON: add write methods and implement unparse() in terms of those	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	8b25de24c9	Make "objects" and "pages" consistent in JSON output	2022-05-04 08:32:44 -04:00
Jay Berkenbilt	6b576797cd	Don't call pushInheritedAttributesToPage in json mode. We used to have to do that, but for quite some time, the code that gets images has no longer required it.	2022-05-04 07:11:13 -04:00
Jay Berkenbilt	f4206a0938	Add new Pl_String Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	16139d97c8	Add new Pl_OStream Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	21d6e3231f	Make use of the new Pipeline methods in some places	2022-05-03 18:31:23 -04:00
Jay Berkenbilt	f1c6bb97db	Add new Pipeline convenience methods	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	59f3e09edf	Make Pipeline::write take an unsigned char const* (API change)	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	62bf296a9c	Make assert handling less error-prone. Prevent my future self or other contributors from using assert in tests and then having that assert not do anything because of the NDEBUG macro.	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	92b692466f	Remove remaining incorrect assert calls from implementation	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	3d9bac43da	Add internal Pl_Base64. Bidirectional base64; will be used by JSON v2.	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	6724a362c3	Move generate_auto_job to the top-level CMakeLists.txt	2022-05-03 08:39:50 -04:00
Jay Berkenbilt	8d2a0eda5a	Add reactors to the JSON parser	2022-05-01 19:55:52 -04:00
Jay Berkenbilt	72e5c73419	Limit parser depth for json parser	2022-05-01 12:56:22 -04:00
Jay Berkenbilt	e34dbbfa18	Spell check	2022-05-01 12:56:22 -04:00
Jay Berkenbilt	8ccd3a8a89	Mark weak encryption with API changes (fixes #576)	2022-04-30 17:24:15 -04:00
Jay Berkenbilt	2213ed0c3d	Remove deprecated (pre-8.4.0) encryption APIs	2022-04-30 17:23:58 -04:00
Jay Berkenbilt	cff26040d8	Using insecure crypto from the CLI is now an error by default	2022-04-30 17:23:58 -04:00
Jay Berkenbilt	ce19471f18	Add comments around non-security-related uses of MD5	2022-04-30 14:15:07 -04:00
Jay Berkenbilt	c365a26e9d	Revert "Remove QPDFObjectHandle::replaceOrRemoveKey". This reverts commit `dc059560e7`. I changed my mind. There's no harm in leaving it deprecated for a release cycle.	2022-04-30 14:15:07 -04:00
Jay Berkenbilt	dc059560e7	Remove QPDFObjectHandle::replaceOrRemoveKey. See ChangeLog for rationale for not deprecating it as originally planned.	2022-04-30 13:39:45 -04:00
Jay Berkenbilt	4f24617e1e	Code clean up: use range-style for loops wherever possible. Where not possible, use "auto" to get the iterator type. Editorial note: I have avoided this change for a long time because of not wanting to make gratuitous changes to version history, which can obscure when certain changes were made, but with having recently touched every single file to apply automatic code formatting and with making several broad changes to the API, I decided it was time to take the plunge and get rid of the older (pre-C++11) verbose iterator syntax. The new code is just easier to read and understand, and in many cases, it will be more efficient as fewer temporary copies are being made. m-holger, if you're reading, you can see that I've finally come around. :-)	2022-04-30 13:27:18 -04:00
Jay Berkenbilt	7f023701dd	Formatting: remove space in range-style for loops. Change .clang-format and commit automated changes from a fresh run of format-code.	2022-04-30 13:26:43 -04:00
Jay Berkenbilt	2878c186bf	Use fluent appendItem	2022-04-30 10:54:16 -04:00
Jay Berkenbilt	ab9d557cb0	Use fluent replaceKey	2022-04-29 20:39:54 -04:00
Jay Berkenbilt	d8fdf632a9	Use replaceKeyAndGet in a few places in existing code	2022-04-29 20:28:02 -04:00
Jay Berkenbilt	e80fad86e9	Add new QPDFObjectHandle methods for more fluent programming	2022-04-29 20:09:10 -04:00
Jay Berkenbilt	d0b7cc8ac6	QPDFJob json: make removeAttachment take an array (fixes #693)	2022-04-24 13:06:19 -04:00
Jay Berkenbilt	63c5a56f38	Fix build logic around generate_auto_job. It was being run at configuration time, not build time.	2022-04-24 13:06:16 -04:00
Jay Berkenbilt	08ba21cf49	Fix some bugs around null values in dictionaries. Make it so that a key with a null value is always treated as not being present. This was inconsistent before.	2022-04-24 10:08:32 -04:00
Jay Berkenbilt	4be2f36049	Deprecate replaceOrRemoveKey -- it's the same as replaceKey	2022-04-24 09:31:32 -04:00
Jay Berkenbilt	4925f0d18c	Have dictionary/streams mutators take const& where possible	2022-04-24 09:05:50 -04:00
Jay Berkenbilt	68e721981a	Add new QPDF::warn that takes most of QPDFExc's arguments	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	22b35c4928	Expose QUtil::get_next_utf8_codepoint	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	5bbb0d4c30	Replace switch statements with static map initializers. Character transcoding from Unicode to single-byte characters used hard-coded switch statements because the code predated our adoption of C++11. Now we have thread-safe, static initialization of map literals, so use that instead.	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	ce5c3bcad8	QPDFJob: pass capture output streams through to underlying QPDF	2022-04-18 11:24:17 -04:00
Jay Berkenbilt	75fe4f60c3	Use anonymous namespaces for file-private classes	2022-04-16 13:35:27 -04:00
Jay Berkenbilt	80ed3076a0	Remove deprecated name/number tree constructors. Remove the name/number tree object helper constructors that don't take a QPDF&.	2022-04-16 13:13:15 -04:00
Jay Berkenbilt	496ca2e4dc	Remove QPDFAcroFormDocumentHelper::copyFieldsFromForeignPage	2022-04-16 13:12:07 -04:00
Jay Berkenbilt	6df6260751	Change default --json from 1 to latest	2022-04-16 12:57:33 -04:00
Jay Berkenbilt	cdd0b4fb7d	Use = default and = delete where possible in classes	2022-04-16 11:39:14 -04:00
Jay Berkenbilt	2a7d2b63c2	Make ABI-breaking changes that don't modify API at all: merge overloaded functions by adding default values; remove non-const methods that are identical to const methods.	2022-04-16 10:41:46 -04:00
Jay Berkenbilt	ce86307a1a	Fix typo in error message	2022-04-10 16:54:23 -04:00
Jay Berkenbilt	90cfe80bac	Clean up/fix DLL.h: change DLL_EXPORT to libqpdf_EXPORTS (internal to the build; the new name is cmake's default, is more conventional, and is less likely to clash with other symbols); add QPDF_DLL_PRIVATE for non-Windows; make logic around when to define QPDF_DLL et al more explicit; add detailed comments.	2022-04-10 16:52:36 -04:00
Jay Berkenbilt	07edf96440	Remove methods of private classes from ABI. Prior to the cmake conversion, several private classes had methods that were exported into the shared library so they could be tested with libtests. With cmake, we build libtests using an object library, so this is no longer necessary. The methods that are disappearing from the ABI were never exposed through public headers, so no code should be using them. Removal had to wait until the window for ABI-breaking changes was open.	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	128e41648f	Remove PointerHolder.hh from other than public header files. Increase to POINTERHOLDER_TRANSITION=4.	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	a68703b07e	Replace PointerHolder with std::shared_ptr in library sources only (patrepl and cleanpatch are my own utilities): patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/.hh; patrepl s/PointerHolder/std::shared_ptr/g libqpdf/.cc; patrepl s/make_pointer_holder/std::make_shared/g libqpdf/.cc; patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/.cc; patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, */.cc */.hh; git restore include/qpdf/PointerHolder.hh; cleanpatch; ./format-code	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	08fb583449	Remove accidentally committed file	2022-04-09 14:37:00 -04:00
Jay Berkenbilt	59834db472	Add documentation for code formatting and contribution guidelines	2022-04-09 12:25:08 -04:00
Jay Berkenbilt	77e889495f	Update some code manually to get better formatting results. Add comments to force line breaks, parenthesize function arguments that are concatenated strings, etc. -- these kinds of changes improve clang-format's results and also cause emacs cc-mode to match clang-format. After this type of change, most of the time, when clang-format and emacs disagree, clang-format is better.	2022-04-05 14:56:19 -04:00
Jay Berkenbilt	12f1eb15ca	Programmatically apply new formatting to code. Run this: for i in */.cc */.c */.h */.hh; do clang-format < $i >| $i.new && mv $i.new $i; done	2022-04-04 08:10:40 -04:00
Jay Berkenbilt	97fc98901c	Protect gnutls headers from clang-format rearranging them	2022-04-04 08:05:39 -04:00
Jay Berkenbilt	33caed4f17	Exclude formatting on embedded native crypto	2022-04-03 17:58:36 -04:00
Jay Berkenbilt	f8e97e0ed5	Put spaces around version constraint in pkg-config (fixes #677). Also add a pkg-config runtime test that would have caught the error.	2022-03-23 10:52:40 -04:00
Jay Berkenbilt	6dcb26d21e	Fix test for whether atomic library is needed. Some platforms need it for atomic<long long> but not for atomic<int>.	2022-03-19 18:19:44 -04:00
Jay Berkenbilt	820a3f04fd	Remove "lt-" workarounds. The executables that libtool built invoked the underlying binary with an "lt-" prefix. The code contained numerous workarounds for testing, which can now be removed.	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	acdf5b2e7a	Update process for ABI testing	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	70d0d0889b	Remove old build files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	b8aff90997	Add cmake configuration files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	3331e8921c	Switch variables to cmake in qpdf-config.h	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	f030789104	Rename bits_include.cc to qpdf/bits_functions.hh. It's better to just make it a .hh file to reduce confusion.	2022-03-07 18:01:27 -05:00
Jay Berkenbilt	6dd8465948	TODO: solidify plans for code formatting	2022-02-26 12:08:58 -05:00
Jay Berkenbilt	6aa58d51be	Rename bits.icc to bits_include.cc	2022-02-26 12:08:58 -05:00
Jay Berkenbilt	99393e6ab7	Shorten coverage case name. This is so it will fit on one line after a qtest upgrade allows us to split lines.	2022-02-26 10:18:23 -05:00
Jay Berkenbilt	03bc6535bd	generate_auto_job: protect generated files from formatting	2022-02-26 09:17:51 -05:00
Jay Berkenbilt	ae17402c52	Move default values to constexpr. This was mainly to get comments about defaults out of constructor initializer lists, where they're fragile when a code formatter is being used.	2022-02-26 08:16:12 -05:00
Jay Berkenbilt	36794a60cf	Allow \/ in a json string	2022-02-25 11:42:50 -05:00
Jay Berkenbilt	56b4d5a610	Use val.at instead of val[]	2022-02-22 08:40:49 -05:00
Jay Berkenbilt	f7ac591590	Recognize explicit UTF-8 strings (fixes #654)	2022-02-22 08:10:05 -05:00
Jay Berkenbilt	3b4b9efd21	Fix autogeneration of job.sums	2022-02-22 08:10:05 -05:00
Jay Berkenbilt	31b45b0fd4	Fix logic error with Tf when generating appearances (fixes #655)	2022-02-18 13:46:35 -05:00
Jay Berkenbilt	3e2109ab37	Remove special case for 0xad for 10.6.2.	2022-02-16 06:52:05 -05:00
Jay Berkenbilt	e810fe678a	Fix asymmetry between newUnicodeString and getUTF8Value	2022-02-15 19:22:35 -05:00
Jay Berkenbilt	a478cbb6dc	Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649). The PDF spec only allows UTF-16BE, but most readers seem to accept UTF-16LE as well, so now qpdf does too.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	fbd3e56da7	Ignore -- at the top level arg parser (fixes #652). This was unintended behavior that was added back for backward compatibility. It is intentionally undocumented.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	1065bbb016	Handle odd PDFDoc codepoints in UTF-8 during transcoding (fixes #650). There are codepoints in PDFDoc that are not valid UTF-8 but map to valid UTF-8. We were handling those correctly with bidirectional mapping. However, if those same code points appeared in UTF-8, where they have no meaning, they were left as fixed points when converting to PDFDoc, where they do have meaning. This change recognizes them as errors.	2022-02-15 08:32:38 -05:00
m-holger	4ff837f099	Fix tests for Form XObjects. Remove test for type == /XObject in QPDFObjectHandle::isFormXObject as type value is optional (as per spec 8.10.2). Replace code to test for /Form in QPDFJob::shouldRemoveUnreferencedResources with a call to isFormXObject.	2022-02-10 19:47:37 -05:00
Jay Berkenbilt	235c89e037	Fix one more PDF doc encoding error for 10.6 release (fixes #637)	2022-02-09 05:47:58 -05:00
Jay Berkenbilt	d501e1c0d4	Only update output version from files used as input. If we're opening a PDF file to copy its encryption information or attachments, its version doesn't need to influence the output version.	2022-02-08 13:49:22 -05:00
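
Commit a478cbb6dc in the log above says that qpdf accepts UTF-16LE text strings in addition to the UTF-16BE form that the PDF spec allows. One plausible way to do that is to look at the byte order mark at the start of the string. The Python sketch below illustrates that idea only; it is not qpdf's C++ code, and the function name `decode_utf16_text_string` is hypothetical.

```python
import codecs


def decode_utf16_text_string(raw):
    """Decode raw bytes as UTF-16 if they start with a byte order mark.

    Returns the decoded text, or None when no UTF-16 BOM is present.
    Illustrative sketch only; not taken from qpdf.
    """
    if raw.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF: the form the PDF spec allows
        return raw[2:].decode("utf-16-be")
    if raw.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE: tolerated for compatibility
        return raw[2:].decode("utf-16-le")
    return None


# The same text encoded with either byte order decodes to the same string.
big_endian = codecs.BOM_UTF16_BE + "\u03c0".encode("utf-16-be")
little_endian = codecs.BOM_UTF16_LE + "\u03c0".encode("utf-16-le")
```

In this sketch a string without a BOM returns None, so the caller can fall back to another encoding such as PDFDocEncoding.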


1186 Commits
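
Commit 08ba21cf49 in the log above makes a dictionary key with a null value count as not present, and commit 4be2f36049 notes that replaceOrRemoveKey is the same as replaceKey. The toy Python class below sketches what that rule could look like. It is an assumption-laden illustration, not qpdf's API: the class `SketchDict` and its methods are hypothetical, and Python's `None` stands in for a PDF null.

```python
class SketchDict:
    """Toy dictionary in which a key mapped to None counts as absent.

    Hypothetical illustration; None stands in for a PDF null object.
    """

    def __init__(self, items=None):
        self._items = dict(items or {})

    def has_key(self, key):
        # A stored None is reported exactly like a missing key.
        return self._items.get(key) is not None

    def keys(self):
        # Keys whose value is None are hidden from the key list.
        return [k for k, v in self._items.items() if v is not None]

    def replace_key(self, key, value):
        # Storing None removes the key, so "replace" and
        # "replace or remove" behave the same way.
        if value is None:
            self._items.pop(key, None)
        else:
            self._items[key] = value
```

With this rule, `has_key`, `keys`, and `replace_key` agree on which keys exist, which is the kind of consistency the commit message describes.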