octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-12-23 19:39:04 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	eae75dbe44	Add Pl_Function -- a generic function pipeline	2022-06-19 09:12:29 -04:00
Jay Berkenbilt	bb0ea2f8e7	Add qpdfjob_register_progress_reporter	2022-06-19 08:46:58 -04:00
Jay Berkenbilt	87412eb05b	Add QPDFJob::registerProgressReporter	2022-06-19 08:46:58 -04:00
Jay Berkenbilt	3a7ee7e938	Move C-based ProgressReporter helper into QPDFWriter	2022-06-19 08:46:58 -04:00
Jay Berkenbilt	8130d50e3b	Add C API to QPDFLogger	2022-06-19 08:46:58 -04:00
Jay Berkenbilt	daef4e8fb8	Add more flexible functions to qpdfjob C API	2022-06-19 08:46:58 -04:00
Jay Berkenbilt	e0720eaa78	Use the default logger for other writes to stdout/stderr When there is no context for writing output or error messages, use the default logger.	2022-06-18 10:38:50 -04:00
Jay Berkenbilt	83be2191b4	Use "save" logger when saving data to standard output This includes the output PDF, streams from --show-object and attachments from --save-attachment. This also enables --verbose and --progress to work with saving to stdout.	2022-06-18 09:54:40 -04:00
Jay Berkenbilt	641e92c6a7	QPDF, QPDFJob: use QPDFLogger instead of custom output streams	2022-06-18 09:02:55 -04:00
Jay Berkenbilt	f1f711963b	Add and test QPDFLogger class	2022-06-18 09:02:55 -04:00
Jay Berkenbilt	f588d74140	Add integer types to Pipeline::operator<<	2022-06-18 09:02:55 -04:00
m-holger	057bd659bc	Code tidy: remove redundant variable in QPDF::writeJSON	2022-06-05 18:46:21 -04:00
Jay Berkenbilt	0bd908b550	Update documentation for qpdf JSON v2	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	b7bbf12e85	In json mode, reveal recovered user password when otherwise unavailable	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	f049a77c59	Add additional information when listing attachments	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	04fc7c4bea	Add conversions to ISO-8601 date format	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	27a42c16c7	Change default decode level to "none" with --json-output	2022-05-21 17:51:34 -04:00
Jay Berkenbilt	752f43d4e4	Allow empty b: binary JSON strings	2022-05-21 17:36:32 -04:00
Jay Berkenbilt	05460d405c	Format code	2022-05-21 16:11:42 -04:00
m-holger	6c69a747b9	Code clean up: use range-style for loops wherever possible Remove variables obsoleted by commit `4f24617`.	2022-05-21 16:06:29 -04:00
Jay Berkenbilt	c56a9ca7f6	JSON: Fix large file support	2022-05-21 09:43:45 -04:00
Jay Berkenbilt	47c093c48b	Replace std::regex with validators for better performance	2022-05-21 08:43:21 -04:00
Jay Berkenbilt	9b2eb01e25	Exercise object description in tests	2022-05-20 14:23:32 -04:00
Jay Berkenbilt	6c2fb5b8f0	Add test for bad data and bad datafile	2022-05-20 13:33:30 -04:00
Jay Berkenbilt	d065098089	Test --update-from-json	2022-05-20 11:10:12 -04:00
Jay Berkenbilt	ef955b04b5	Bug fix: don't clobber stream length with replaceDict	2022-05-20 11:09:45 -04:00
Jay Berkenbilt	3eb77a7004	JSON: detect duplicate dictionary keys while parsing	2022-05-20 10:13:15 -04:00
Jay Berkenbilt	6d4e3ba8a4	Test (and fix) handling of dangling references	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	5a2aa59479	Bug fix: isReserved() true for indirect reference to reserved object	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	35b1e1c493	Explicitly test ignoring unknown keys in JSON input	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	dc8df962d8	Make version default to latest for --json-output (like --json)	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	6c7326b290	JSON fix: correctly parse UTF-16 surrogate pairs	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	6f43bf8de3	Major rework -- see long comments * Replace --create-from-json=file with --json-input, which causes the regular input to be treated as json. * Eliminate --to-json * In --json=2, bring back "objects" and eliminate "objectinfo". Stream data is never present. * In --json-output=2, write "qpdf-v2" with "objects" and include stream data.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	23fc6756f1	Add QUtil::FileCloser to the public API	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	0fe8d44762	Support stream data -- not tested There are no automated tests yet, but committing work so far in preparation for some refactoring.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	63c7eefe9d	replaceStreamData: accept uninitialized filter/decode_parms These mean to leave the original values alone. This is needed for reconstructing streams from JSON given that the stream data and stream dictionary may appear in any order in the JSON.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	56f1b411fe	Back out fluent QPDFObjectHandle methods. Keep the andGet methods. I decided these were confusing and inconsistent with how JSON works. They muddle the API rather than improving it.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	7e7a9c4379	Parse objects; stream data is not yet handled	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	9064542b5f	Add private methods for reserving specific objects	2022-05-20 07:54:09 -04:00
Jay Berkenbilt	7fa5d1773b	Implement top-level qpdf json parsing	2022-05-16 13:41:40 -04:00
Jay Berkenbilt	8d42eb2632	Add scaffolding for QPDF JSON reactor	2022-05-16 13:41:40 -04:00
Jay Berkenbilt	4fe2e06b47	Add --create-from-json and --update-from-json arguments Also add stubs for top-level QPDF methods (createFromJSON, updateFromJSON)	2022-05-16 13:41:40 -04:00
Jay Berkenbilt	9a0e9a1a9e	Remove offset from missing /Root error The last offset is irrelevant to not being able to find /Root.	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	051ae7c282	Improve handling of replacing stream data with empty strings When an empty string was passed to replaceStreamData, the code was passing a null pointer to memcpy. Since a 0 size was also passed, this was harmless, but it triggers sanitizer errors. The code properly handles a null pointer as the buffer in other places.	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	60ec94a7c3	Add QUtil::is_long_long	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	4c7cfd5cbc	JSON reactor: improve handling of nested containers Call the parent container's item method before calling the child item's start method so we can easily know the current nesting level when nested items are added.	2022-05-14 17:35:06 -04:00
Jay Berkenbilt	2a2f7f1bba	Add maxobjectid to JSON	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	e9390aeaaa	Add --to-json option	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	c76536dd9a	Implement JSON v2 output	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	15272662f6	Fix typo in json output key name moddify -> modify. Also carefully spell checked all remaining keys by splitting them into words and running a spell checker, not just relying on visual proofreading. That was the only one.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	1bc8abfdd3	Implement JSON v2 for Stream Not fully exercised in this commit	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	3246923cf2	Implement JSON v2 for String Also refine the heuristic for deciding whether to use hexadecimal notation for a string.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	16f4f94cd9	Prepare code for JSON v2 Update getJSON() methods and calls to them	2022-05-07 11:12:01 -04:00
Jay Berkenbilt	a9fbbd5dca	Objectinfo json: write incrementally and in numeric order This script was used on test data:	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys
import re


def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))


for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objectinfo' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objectinfo'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjectinfo['trailer'] = trailer
    data['objectinfo'] = newobjectinfo
    print(json_dumps(data))
----------
Jay Berkenbilt	948de60990	Objects json: write incrementally and in numeric order The following script was used to adjust test data:	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys
import re


def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))


for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    if 'objects' not in data:
        continue
    trailer = None
    to_sort = []
    for k, v in data['objects'].items():
        if k == 'trailer':
            trailer = v
        else:
            m = re.match(r'^(\d+) \d+ R', k)
            if m:
                to_sort.append([int(m.group(1)), k, v])
    newobjects = {x[1]: x[2] for x in sorted(to_sort)}
    if trailer is not None:
        newobjects['trailer'] = trailer
    data['objects'] = newobjects
    print(json_dumps(data))
----------
Jay Berkenbilt	f50274ef46	Pages json: write each page incrementally	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	dc9b7287cd	Top-level json: write incrementally This commit just changes the order in which fields are written to the json without changing their content. All the json files in the test suite were modified with this script to ensure that we didn't get any changes other than ordering.	2022-05-07 08:26:31 -04:00
----------
#!/usr/bin/env python3
import json
import sys


def json_dumps(data):
    return json.dumps(data, ensure_ascii=False,
                      indent=2, separators=(',', ': '))


for filename in sys.argv[1:]:
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    newdata = {}
    for i in ('version', 'parameters', 'pages', 'pagelabels',
              'acroform', 'attachments', 'encrypt', 'outlines',
              'objects', 'objectinfo'):
        if i in data:
            newdata[i] = data[i]
    print(json_dumps(newdata))
----------
Jay Berkenbilt	7f65a5c21f	Test json against schema only on demand Testing json against schema requires an in-memory copy, so do it only when requested by the test suite.	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	a3c9980395	Add next to Pl_String and fix comments	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	b361c5ce19	Add --test-json-schema command-line option	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	7604ac5cb2	QPDFJob: have doJSON write to a pipeline	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	0500d4347a	JSON: add blob type that generates base64-encoded binary data	2022-05-06 19:14:52 -04:00
Jay Berkenbilt	05fda4afa2	Change JSON parser to parse from an InputSource	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	e5f3910c3e	Add new FileInputSource constructors	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	e259635986	JSON: add write methods and implement unparse() in terms of those	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	8b25de24c9	Make "objects" and "pages" consistent in JSON output	2022-05-04 08:32:44 -04:00
Jay Berkenbilt	6b576797cd	Don't call pushInheritedAttributesToPage in json mode We used to have to do that, but for quite some time, the code that gets images has no longer required it.	2022-05-04 07:11:13 -04:00
Jay Berkenbilt	f4206a0938	Add new Pl_String Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	16139d97c8	Add new Pl_OStream Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	21d6e3231f	Make use of the new Pipeline methods in some places	2022-05-03 18:31:23 -04:00
Jay Berkenbilt	f1c6bb97db	Add new Pipeline convenience methods	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	59f3e09edf	Make Pipeline::write take an unsigned char const* (API change)	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	62bf296a9c	Make assert handling less error-prone Prevent my future self or other contributors from using assert in tests and then having that assert not do anything because of the NDEBUG macro.	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	92b692466f	Remove remaining incorrect assert calls from implementation	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	3d9bac43da	Add internal Pl_Base64 Bidirectional base64; will be used by JSON v2.	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	6724a362c3	Move generate_auto_job to the top-level CMakeLists.txt	2022-05-03 08:39:50 -04:00
Jay Berkenbilt	8d2a0eda5a	Add reactors to the JSON parser	2022-05-01 19:55:52 -04:00
Jay Berkenbilt	72e5c73419	Limit parser depth for json parser	2022-05-01 12:56:22 -04:00
Jay Berkenbilt	e34dbbfa18	Spell check	2022-05-01 12:56:22 -04:00
Jay Berkenbilt	8ccd3a8a89	Mark weak encryption with API changes (fixes #576)	2022-04-30 17:24:15 -04:00
Jay Berkenbilt	2213ed0c3d	Remove deprecated (pre-8.4.0) encryption APIs	2022-04-30 17:23:58 -04:00
Jay Berkenbilt	cff26040d8	Using insecure crypto from the CLI is now an error by default	2022-04-30 17:23:58 -04:00
Jay Berkenbilt	ce19471f18	Add comments around non-security-related uses of MD5	2022-04-30 14:15:07 -04:00
Jay Berkenbilt	c365a26e9d	Revert "Remove QPDFObjectHandle::replaceOrRemoveKey" This reverts commit `dc059560e7`. I changed my mind. There's no harm in leaving it deprecated for a release cycle.	2022-04-30 14:15:07 -04:00
Jay Berkenbilt	dc059560e7	Remove QPDFObjectHandle::replaceOrRemoveKey See ChangeLog for rationale for not deprecating it as originally planned.	2022-04-30 13:39:45 -04:00
Jay Berkenbilt	4f24617e1e	Code clean up: use range-style for loops wherever possible Where not possible, use "auto" to get the iterator type. Editorial note: I have avoided this change for a long time because of not wanting to make gratuitous changes to version history, which can obscure when certain changes were made, but with having recently touched every single file to apply automatic code formatting and with making several broad changes to the API, I decided it was time to take the plunge and get rid of the older (pre-C++11) verbose iterator syntax. The new code is just easier to read and understand, and in many cases, it will be more efficient as fewer temporary copies are being made. m-holger, if you're reading, you can see that I've finally come around. :-)	2022-04-30 13:27:18 -04:00
Jay Berkenbilt	7f023701dd	Formatting: remove space in range-style for loops Change .clang-format and commit automated changes from a fresh run of format-code	2022-04-30 13:26:43 -04:00
Jay Berkenbilt	2878c186bf	Use fluent appendItem	2022-04-30 10:54:16 -04:00
Jay Berkenbilt	ab9d557cb0	Use fluent replaceKey	2022-04-29 20:39:54 -04:00
Jay Berkenbilt	d8fdf632a9	Use replaceKeyAndGet in a few places in existing code	2022-04-29 20:28:02 -04:00
Jay Berkenbilt	e80fad86e9	Add new QPDFObjectHandle methods for more fluent programming	2022-04-29 20:09:10 -04:00
Jay Berkenbilt	d0b7cc8ac6	QPDFJob json: make removeAttachment take an array (fixes #693)	2022-04-24 13:06:19 -04:00
Jay Berkenbilt	63c5a56f38	Fix build logic around generate_auto_job It was being run at configuration time, not build time.	2022-04-24 13:06:16 -04:00
Jay Berkenbilt	08ba21cf49	Fix some bugs around null values in dictionaries Make it so that a key with a null value is always treated as not being present. This was inconsistent before.	2022-04-24 10:08:32 -04:00
Jay Berkenbilt	4be2f36049	Deprecate replaceOrRemoveKey -- it's the same as replaceKey	2022-04-24 09:31:32 -04:00
Jay Berkenbilt	4925f0d18c	Have dictionary/streams mutators take const& where possible	2022-04-24 09:05:50 -04:00
Jay Berkenbilt	68e721981a	Add new QPDF::warn that takes most of QPDFExc's arguments	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	22b35c4928	Expose QUtil::get_next_utf8_codepoint	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	5bbb0d4c30	Replace switch statements with static map initializers Character transcoding from Unicode to single-byte characters used hard-coded switch statements because the code predated our adoption of C++11. Now we have thread-safe, static initialization of map literals, so use that instead.	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	ce5c3bcad8	QPDFJob: pass capture output streams through to underlying QPDF	2022-04-18 11:24:17 -04:00
Jay Berkenbilt	75fe4f60c3	Use anonymous namespaces for file-private classes	2022-04-16 13:35:27 -04:00
Jay Berkenbilt	80ed3076a0	Remove deprecated name/number tree constructors Remove the name/number tree object helper constructors that don't take a QPDF&.	2022-04-16 13:13:15 -04:00
Jay Berkenbilt	496ca2e4dc	Remove QPDFAcroFormDocumentHelper::copyFieldsFromForeignPage	2022-04-16 13:12:07 -04:00
Jay Berkenbilt	6df6260751	Change default --json from 1 to latest	2022-04-16 12:57:33 -04:00
Jay Berkenbilt	cdd0b4fb7d	Use = default and = delete where possible in classes	2022-04-16 11:39:14 -04:00
Jay Berkenbilt	2a7d2b63c2	Make ABI-breaking changes that don't modify API at all * Merge overloaded functions by adding default values * Remove non-const methods that are identical to const methods	2022-04-16 10:41:46 -04:00
Jay Berkenbilt	ce86307a1a	Fix typo in error message	2022-04-10 16:54:23 -04:00
Jay Berkenbilt	90cfe80bac	Clean up/fix DLL.h * Change DLL_EXPORT to libqpdf_EXPORTS (internal to the build). The new name is cmake's default, is more conventional, and is less likely to clash with other symbols. * Add QPDF_DLL_PRIVATE for non-Windows * Make logic around when to define QPDF_DLL et al more explicit * Add detailed comments	2022-04-10 16:52:36 -04:00
Jay Berkenbilt	07edf96440	Remove methods of private classes from ABI Prior to the cmake conversion, several private classes had methods that were exported into the shared library so they could be tested with libtests. With cmake, we build libtests using an object library, so this is no longer necessary. The methods that are disappearing from the ABI were never exposed through public headers, so no code should be using them. Removal had to wait until the window for ABI-breaking changes was open.	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	128e41648f	Remove PointerHolder.hh from other than public header files Increase to POINTERHOLDER_TRANSITION=4	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	a68703b07e	Replace PointerHolder with std::shared_ptr in library sources only (patrepl and cleanpatch are my own utilities) patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/.hh patrepl s/PointerHolder/std::shared_ptr/g libqpdf/.cc patrepl s/make_pointer_holder/std::make_shared/g libqpdf/.cc patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/.cc patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, */.cc */.hh git restore include/qpdf/PointerHolder.hh cleanpatch ./format-code	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	08fb583449	Remove accidentally committed file	2022-04-09 14:37:00 -04:00
Jay Berkenbilt	59834db472	Add documentation for code formatting and contribution guidelines	2022-04-09 12:25:08 -04:00
Jay Berkenbilt	77e889495f	Update some code manually to get better formatting results Add comments to force line breaks, parenthesize function arguments that are concatenated strings, etc. -- these kinds of changes improve clang-format's results and also cause emacs cc-mode to match clang-format. After this type of change, most of the time, when clang-format and emacs disagree, clang-format is better.	2022-04-05 14:56:19 -04:00
Jay Berkenbilt	12f1eb15ca	Programmatically apply new formatting to code Run this: for i in */.cc */.c */.h */.hh; do clang-format < $i >| $i.new && mv $i.new $i; done	2022-04-04 08:10:40 -04:00
Jay Berkenbilt	97fc98901c	Protect gnutls headers from clang-format rearranging them	2022-04-04 08:05:39 -04:00
Jay Berkenbilt	33caed4f17	Exclude formatting on embedded native crypto	2022-04-03 17:58:36 -04:00
Jay Berkenbilt	f8e97e0ed5	Put spaces around version constraint in pkg-config (fixes #677) Also add a pkg-config runtime test that would have caught the error.	2022-03-23 10:52:40 -04:00
Jay Berkenbilt	6dcb26d21e	Fix test for whether atomic library is needed Some platforms need it for atomic<long long> but not for atomic<int>.	2022-03-19 18:19:44 -04:00
Jay Berkenbilt	820a3f04fd	Remove "lt-" workarounds The executables that libtool built invoked the underlying binary with an "lt-" prefix. The code contained numerous workarounds for testing, which can now be removed.	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	acdf5b2e7a	Update process for ABI testing	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	70d0d0889b	Remove old build files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	b8aff90997	Add cmake configuration files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	3331e8921c	Switch variables to cmake in qpdf-config.h	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	f030789104	Rename bits_include.cc to qpdf/bits_functions.hh It's better to just make it a .hh file to reduce confusion.	2022-03-07 18:01:27 -05:00
Jay Berkenbilt	6dd8465948	TODO: solidify plans for code formatting	2022-02-26 12:08:58 -05:00
Jay Berkenbilt	6aa58d51be	Rename bits.icc to bits_include.cc	2022-02-26 12:08:58 -05:00
Jay Berkenbilt	99393e6ab7	Shorten coverage case name This is so it will fit on one line after a qtest upgrade allows us to split lines.	2022-02-26 10:18:23 -05:00
Jay Berkenbilt	03bc6535bd	generate_auto_job: protect generated files from formatting	2022-02-26 09:17:51 -05:00
Jay Berkenbilt	ae17402c52	Move default values to constexpr This was mainly to get comments about defaults out of constructor initializer lists where they're fragile when a code formatter is being used.	2022-02-26 08:16:12 -05:00
Jay Berkenbilt	36794a60cf	Allow \/ in a json string	2022-02-25 11:42:50 -05:00
Jay Berkenbilt	56b4d5a610	Use val.at instead of val[]	2022-02-22 08:40:49 -05:00
Jay Berkenbilt	f7ac591590	Recognize explicit UTF-8 strings (fixes #654)	2022-02-22 08:10:05 -05:00
Jay Berkenbilt	3b4b9efd21	Fix autogeneration of job.sums	2022-02-22 08:10:05 -05:00
Jay Berkenbilt	31b45b0fd4	Fix logic error with Tf when generating appearances (fixes #655)	2022-02-18 13:46:35 -05:00
Jay Berkenbilt	3e2109ab37	Remove special case for 0xad for 10.6.2.	2022-02-16 06:52:05 -05:00
Jay Berkenbilt	e810fe678a	Fix asymmetry between newUnicodeString and getUTF8Value	2022-02-15 19:22:35 -05:00
Jay Berkenbilt	a478cbb6dc	Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649) The PDF spec only allows UTF-16BE, but most readers seem to accept UTF-16LE as well, so now qpdf does too.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	fbd3e56da7	Ignore -- at the top level arg parser (fixes #652) This was unintended behavior that was added back for backward compatibility. It is intentionally undocumented.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	1065bbb016	Handle odd PDFDoc codepoints in UTF-8 during transcoding (fixes #650) There are codepoints in PDFDoc that are not valid UTF-8 but map to valid UTF-8. We were handling those correctly with bidirectional mapping. However, if those same code points appeared in UTF-8, where they have no meaning, they were left as fixed points when converting to PDFDoc, where they do have meaning. This change recognizes them as errors.	2022-02-15 08:32:38 -05:00
m-holger	4ff837f099	Fix tests for Form XObjects Remove test for type == /XObject in QPDFObjectHandle::isFormXObject as type value is optional (as per spec 8.10.2). Replace code to test for /Form in QPDFJob::shouldRemoveUnreferencedResources with a call to isFormXObject.	2022-02-10 19:47:37 -05:00
Jay Berkenbilt	235c89e037	Fix one more PDF doc encoding error for 10.6 release (fixes #637)	2022-02-09 05:47:58 -05:00
Jay Berkenbilt	d501e1c0d4	Only update output version from files used as input If we're opening a PDF file to copy its encryption information or attachments, its version doesn't need to influence the output version.	2022-02-08 13:49:22 -05:00
Jay Berkenbilt	f91b21c7d4	Preserve input PDF version on pages/split-pages (fixes #610)	2022-02-08 12:34:14 -05:00
Jay Berkenbilt	cfd5147d92	Add QPDF::getVersionAsPDFVersion	2022-02-08 12:34:14 -05:00
Jay Berkenbilt	8082af09be	Add PDFVersion class	2022-02-08 12:34:14 -05:00
Jay Berkenbilt	cb769c62e5	WHITESPACE ONLY -- expand tabs in source code This commit expands all tabs using an 8-character tab-width. You should ignore this commit when using git blame or use git blame -w. In the early days, I used to use tabs where possible for indentation, since emacs did this automatically. In recent years, I have switched to only using spaces, which means qpdf source code has been a mixture of spaces and tabs. I have avoided cleaning this up because of not wanting gratuitous whitespace changes to cloud the output of git blame, but I changed my mind after discussing with users who view qpdf source code in editors/IDEs that have other tab widths by default and in light of the fact that I am planning to start applying automatic code formatting soon.	2022-02-08 11:51:15 -05:00
Jay Berkenbilt	c62e8e2b28	Update for clean compile with POINTERHOLDER_TRANSITION=2	2022-02-07 17:38:22 -05:00
Jay Berkenbilt	3f22bea084	Use make_array_pointer_holder This will be able to be replaced with QUtil::make_shared_array	2022-02-07 17:38:22 -05:00
Jay Berkenbilt	40f1946df8	Replace PointerHolder arrays with shared_ptr arrays where possible Replace PointerHolder arrays wherever it can be done without breaking ABI.	2022-02-07 17:38:22 -05:00
Jay Berkenbilt	df2f5c6a36	Add QUtil::make_shared_array to help with PointerHolder transition	2022-02-07 14:08:46 -05:00
Jay Berkenbilt	cfaae47dc6	Add getBufferSharedPointer() to Pl_Buffer and QPDFWriter	2022-02-07 12:53:28 -05:00
m-holger	5901fcad4c	C-API expose QPDFObjectHandle::getKeyIfDict	2022-02-06 11:21:15 -05:00
m-holger	8371060340	Add method QPDFObjectHandle::getKeyIfDict	2022-02-06 11:21:15 -05:00
m-holger	2ed5f49a79	C-API expose QPDFObjectHandle::getValueAs... accessors	2022-02-05 19:40:30 -05:00
Jay Berkenbilt	af3f74de8c	Stop using std::iterator (fixes #618) Create the typedefs directly in iterators rather than deriving from the deprecated std::iterator class.	2022-02-05 11:29:25 -05:00
Jay Berkenbilt	7fb22740e1	Add operator ""_qpdf for creating QPDFObjectHandle literals	2022-02-05 11:29:25 -05:00
Jay Berkenbilt	b48a0ff0e8	Add qpdf_empty_pdf to C API	2022-02-05 11:29:25 -05:00
Jay Berkenbilt	8cf7f2bfb5	API contract: qpdf_get_qpdf_version() returns a static	2022-02-05 11:24:56 -05:00
Jay Berkenbilt	5f3f78822b	Improve use of std::unique_ptr * Use unique_ptr in place of shared_ptr in some cases * unique_ptr for arrays does not require a custom deleter * use std::make_unique (c++14) where possible	2022-02-05 11:24:56 -05:00
m-holger	e58b1174c7	Add new QPDFObjectHandle::getValueAs... accessors	2022-02-05 11:24:35 -05:00
Jay Berkenbilt	cfaa2de804	Update copyright for 2022	2022-02-04 16:36:22 -05:00
Jay Berkenbilt	2229e37e88	Add a blank line after the first header included in each source	2022-02-04 16:31:31 -05:00
Jay Berkenbilt	8eab616d62	Add qpdf version macros to qpdf/DLL.h	2022-02-04 13:41:01 -05:00
Jay Berkenbilt	abc300f05c	Replace containers of PointerHolder with containers of std::shared_ptr None of these are in the public API.	2022-02-04 13:12:37 -05:00
Jay Berkenbilt	f0c2e0ef1e	JSON: use std::shared_ptr internally	2022-02-04 13:12:37 -05:00
Jay Berkenbilt	9044a24097	PointerHolder: deprecate getPointer() and getRefcount() Use get() and use_count() instead. Add #define NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these only. This commit also removes all deprecated PointerHolder API calls from qpdf's code except in PointerHolder's test suite, which must continue to test the deprecated APIs.	2022-02-04 13:12:37 -05:00
m-holger	95e7d36b7a	C-API add two binary UTF8 functions add qpdf_oh_new_binary_unicode_string and qpdf_oh_get_binary_utf8_value	2022-02-04 13:10:51 -05:00
m-holger	1925ffd467	Fix --check-linearization of non-linearized files (fixes #615)	2022-02-04 06:52:38 -05:00
m-holger	4d507251fe	Change QPDFExc type to unsupported for /Standard filter	2022-02-02 14:07:32 -06:00
Jay Berkenbilt	42bff9f458	QPDFJob: let initializeFromArgv just take argv, not argc Let argv be a null-terminated array. There is already code that assumes this, and it makes it easier to construct the arguments.	2022-02-01 13:50:58 -05:00
Jay Berkenbilt	b02d37bc0a	Make QPDFArgParser accept const argv This makes it much more convenient to use the initializeFromArgv functions since you can use string literals.	2022-02-01 13:50:58 -05:00
Jay Berkenbilt	bc4e2320e7	Add qpdfjob-c.h -- simple C API around parts of QPDFJob	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	03e67a28fe	Move QTC::TC for qpdf to QPDFJob All the coverage cases that used to be in qpdf.cc are now in QPDFJob*.cc. It doesn't really matter, but better to follow the convention of starting with the class that includes the coverage call.	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	b42f3e1d15	Move more code from qpdf.cc into QPDFJob	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	cc5485dac1	QPDFJob: documentation	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	5a7bb3474e	generate_auto_job: generate overloaded config decls for optional For optional parameter/choices, generate an overloaded config method that takes no arguments. This makes it possible to convert from a bare argument to one that takes an optional parameter without breaking binary compatibility.	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	5953116634	Clean up documentation and help around json options	2022-01-31 18:40:11 -05:00
Jay Berkenbilt	606420ab54	Tweak short text for job schema help	2022-01-31 18:26:03 -05:00
Jay Berkenbilt	21b9290785	QPDFJob json: make bare arguments expect the empty string Changing from bool requiring true to string requiring the empty string is more consistent with the CLI and makes it possible to add an optional parameter or choices later without breaking compatibility.	2022-01-31 18:16:09 -05:00
Jay Berkenbilt	ea96330bb6	QPDFJob json: flatten json structure Flatten everything to make it easier to map command-line flags to json. The old structure was an illusion anyway because there was no mechanism to enforce that things were in the right place. This also helps with future flexibility.	2022-01-31 18:16:09 -05:00
Jay Berkenbilt	47f33cec25	QPDFJob: add test cases	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	e3506253f1	Add optional version to --json	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	b4fb9b4ec3	Remove outdated comments	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	caa00556cf	Change filename or path to file in json and QPDFJob Use "file" consistently for specifying a file path. We use "filename" when adding attachments for a completely different purpose.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	1a3ed1ee85	job json: move deterministic-id into output options	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	81b6314cb5	QPDFJob: fix logic errors in handling arrays The code was assuming everything was happening inside dictionaries. Instead, make the dictionary key handler creation explicit only when iterating through dictionary keys.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	f99e0af49c	QPDFJob: rename function that returns job schema	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	1355d95d08	QPDFJob: partial mode for initializeFromJson	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	cd30f626fe	QPDFJob: remove from json a few things that only make sense from CLI	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	eeffc69d87	QPDFJob_json: implement handlers for pages	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	fa9676557e	QPDFJob: incorporate change to JSONHandler for array start function	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	3b60224bae	JSONHandler: pass JSON object to array start function	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	b74e7989c3	QPDFJob_json: implement handlers except pages	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	e01bbccb40	QPDFJob: incorporate change to JSONHandler for dict start function	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	ce3406e93f	JSONHandler: pass JSON object to dict start function If some keys depend on others, we have to check up front since there is no control of what order key handlers will be called. Anyway, keys are unordered in json, so we don't want to depend on ordering.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	11a86e444d	QPDFJob: autogenerate json init and declarations Now still have to go through and implement the handlers.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	842a9d928e	QPDFJob_json: add code to register handlers	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	967a2b9f28	Fix typo in error message	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	a7b0aec2cf	Fix false compiler warning in debug mode	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	28278e27ea	Keep JSONHandler and QPDFArgParser private Since the functionality of argument parsing has moved into QPDFJob, these classes no longer need to be public. Their methods still have to be in the library's binary interface so they can be tested in libtests.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	0f05cae66a	QPDFJob: generate json decl and init file skeletons	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	8a9100f674	QPDFJob: add checkConfiguration to Config	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	0c8e9e5912	QPDFJob: prepare for automatically generated json handlers	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	7eeaf58bb7	More doc tweaks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	7097f29019	More editorial changes from m-holger + spell check	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0e909bab8e	Improve top-level help information	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0364024781	Use QPDFUsage exception for cli, json, and QPDFJob errors	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f3d68aa5a0	Incorporate editorial changes from m-holger	2022-01-30 13:11:03 -05:00
m-holger	7dd5f31230	Fix typos in manual Fix typos in cli.rst	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c62ab2ee9f	QPDFJob: use pointers instead of references for Config Why? The main methods that create them return smart pointers so that users can initialize them when needed, which you can't do with references. Returning pointers instead of references makes for a more uniform interface.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	03f3369f35	QPDFJob: use manually named end functions for Config classes Use named functions rather than just end() for clarity.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	9013b7ca91	QPDFJob: move placeholder json to a separate source file	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	edef2cd330	QPDFJob: make remaining members private	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f2409f4fca	Minor cleanup	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	01969c78a8	QPDFJob: move private members into Members	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	cf6c56a463	QPDFJob: use config API in place-holder json	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2c7b583b3a	QPDFJob: move input/output handling into config	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1258054543	QPDFJob: eliminate most access to QPDFJob members from ArgParser All that's left now is input and output handling.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	901e3e4fbf	QPDFArgParser: remove unused copyFromOtherTable This was used, but it no longer is, so let's not keep the extra complexity around.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	700dfa40d3	QPDFJob: convert encryption handlers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b5d41b16b8	QPDFJob: convert under/overlay and rotate	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1cc532dc91	QPDFJob: move some helpers from ArgParser to QPDFJob	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	95d127641c	QPDFJob: move more top-level trivial handlers into config	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	41c5af8f26	QPDFJob: convert pages	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	9373881cca	Add QPDFJob::ConfigError exception	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0a354af02c	QPDFJob: convert AddAttachment handlers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	bf255ccc89	QPDFJob: convert password in two tables	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	21c897aad0	QPDFJob: convert a flag in other than the main table	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f60526aff9	QPDFJob: start changing generation for trivial config handlers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b4b0df0df9	QPDFJob: convert trivial functions to config API	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	79187e585a	QPDFJob: begin configuration API with verbose	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	160e869d1e	Mark trivial arg functions	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	558f043d91	QPDFJob: TRUE -> true	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	fcdbc8a102	Move doFinalChecks to QPDFJob::checkConfiguration	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c4e56fa5f4	QPDFJob: make createsOutput callable before run()	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	564dc03607	QPDFJob: start real API Create QPDFJob_options.cc to hold API implementation functions. Reorganize a little in preparation for moving public member variables private and creating the real QPDFJob API that will be used by callers as well as the argv/json initialization methods.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1d099ab743	QPDFJob: placeholder for initializeFromJson	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1c8d53465f	Incorporate job schema generation into generate_auto_job	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9cd693a5b	QPDFJob: allocate QPDFArgParser on stack The previous commits have removed all references to memory from QPDFArgParser from QPDFJob. This commit removes the constraint that QPDFArgParser remain in scope. This is a prerequisite to allowing JSON as an alternative way to initialize QPDFJob and to initialize it directly using a public API.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	d526d4c17f	QPDFJob: convert Under/Overlay to use shared pointers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	88891a75a2	QPDFJob: convert Under/Overlay ranges to strings	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e48bfce930	QPDFJob: convert PageSpec to use shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e4905983d2	QPDFJob: convert outfilename to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e5edfc786f	QPDFJob: convert infilename to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	ee7824cf28	QPDFJob: convert encryption_file args to shared pointers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	021db6f226	QPDFJob: convert password to shared pointer	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1a8c2eb93b	QPDFJob: use std::shared_ptr over PointerHolder where possible Also fix QPDFArgParser	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	76c4f78b5c	Add QUtil::make_shared_cstr Replace most of the calls to QUtil::copy_string with this instead.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	67f9d0b7d5	cli.rst: remove () from end of short help This is used to generate a schema for the job json, which can't contain `)"` because it breaks the R"(...)" syntax in C++. While C++ accepts R"anything(...)anything" to avoid this, as of this writing, MSVC 2019 doesn't understand that. For now, just avoid it by removing parentheses from the end of short help.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8dea480c9f	Allow optional fields in json "schema" checks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	ec85e56c3f	Add missing help topic for inspection	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1db0a7ffce	JSONHandler: rework dictionary and array handlers	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	acf8d18b6e	Editorial changes to cli.rst	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	cf8405d91e	Fix json schema for objects to include dictionary key	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2e58541493	Use JSON::parse to initialize schema for json mode	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	37105710ee	Implement JSONHandler for recursively processing JSON	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	a6df6fdaf7	CLI doc: use tables where helpful	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e8e8f6f43c	Add JSON::parse	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9af421ef7	Add missing \f support for JSON string encoder	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	aa0a379b37	Add JSON::isDictionary and JSON::isArray	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	5c5e5ca29b	Document how to add a command-line argument	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c8729398dd	Generate help content from manual This is a massive rewrite of the help text and cli.rst section of the manual. All command-line flags now have their own help and are specifically indexed. qpdf --help is completely redone.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b4bd124be4	QPDFArgParser: support adding/printing help information	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	5303130cf9	Fix comment on duplicated top-level json keys	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	53ba65eb59	QPDFArgParser: handle optional choices including help Handle optional choices in addition to required choices. Refactor the way help options are added to completion to make it work with optional help choices.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	a301cc5373	Minor code cleanup	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	3ab25d595b	Fix doc typos caught by m-holger -- thanks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	4577df4b5d	QPDFJob increment: generate option table initialization	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f1d805badc	Add QPDFArgParser::copyFromOtherTable	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c3e9b64e7f	QPDFJob increment: generate handler declarations	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	6e70d99b58	QPDFJob increment: generate choices variables in init	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	cb684ec4d3	QPDFJob increment: generate table names	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	f8eee83515	Expose QPDFArgParser::usage	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8dcf6da259	QPDFJob: remove non-check from doFinalChecks	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	c216854607	Add basic framework for QPDFJob code generation	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	bd89aac360	QPDFJob increment: move arg parsing into QPDFJob Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with millions of public member variables, but now qpdf.cc is minimal and just calls stable library functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	12396702af	QPDFJob: reorder functions, no other changes	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2394dd8519	QPDFJob increment: static functions to member functions Convert remaining static functions that take QPDFJob& as a parameter to member functions. Utility functions that don't take QPDFJob& remain static functions and can probably just stay that way since they keep extra complexity out of QPDFJob.hh.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	e2975b9ed0	QPDFJob: de-templatize do_process and do_process_once	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	2f631997f2	QPDFJob increment: remove std::cout, std::cerr, whoami Remove remaining temporary duplication of hard-coded values and direct access to std::cout, std::cerr, and whoami in favor of parameters in QPDFJob. This moves a few more static methods into QPDFJob member functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1ddf5b4b4b	QPDFJob increment: get rid of exit, handle verbose Remove all calls to exit() from QPDFJob. Handle code that runs in verbose mode to enable it to make use of output streams and message prefix (whoami) from QPDFJob. This removes temporarily duplicated exit code logic and most access to whoami/std::cout outside of QPDFJob proper.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0910e767ad	QPDFJob increment: basic QPDFJob structure Move most of the methods called from qpdf.cc after argument parsing into QPDFJob. In this increment, enough QPDFJob API has been added to handle the branch of QPDFJob::run() that creates output with an appropriate division between qpdf.cc and QPDFJob. There are temporary bits of code to enable everything to compile and pass the test suite, including some duplication and hard-coded values.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	52817f0a45	Implement QPDFArgParser based on ArgParser from qpdf.cc	2022-01-30 13:11:02 -05:00
m-holger	0f9086e509	Fix doc typos	2022-01-30 12:09:54 -06:00
m-holger	8eca9d8fd9	Fix QPDFObjectHandle::isOrHasName Ensure isOrHasName returns true if object is an array and the name is present anywhere in the array.	2022-01-27 09:35:39 -06:00
m-holger	07db3200cb	Remove some if statements and simplify some boolean expressions Use QPDFObjectHandle::isNameAndEquals, isDictionaryOfType and isStreamOfType.	2022-01-27 07:31:12 -06:00
m-holger	710d2e54f0	Allow testing for subtype without specifying type in isDictionaryOfType etc Accept empty string as type parameter in QPDFObjectHandle::isDictionaryOfType and isStreamOfType to allow for dictionaries with optional type.	2022-01-27 07:31:12 -06:00
m-holger	1b1b471ca9	Make a few whitespace fixes from last commit Commit by ejb@ql.org using m-holger as author so git annotate gives proper credit for changes.	2022-01-22 09:14:53 -05:00
m-holger	8593b9fdf7	Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType	2022-01-22 08:10:28 -06:00
Jay Berkenbilt	370710657a	Add missing characters from PDF doc encoding (fixes #606 )	2022-01-11 15:55:19 -05:00
Jay Berkenbilt	77c31305fe	Fix signed/unsigned char warning (fixes #604 )	2022-01-11 06:51:31 -05:00
Jay Berkenbilt	af91b5b584	Add QUtil::file_can_be_opened	2021-12-29 13:41:02 -05:00
Jay Berkenbilt	04745320d6	Prepare 10.5.0 release	2021-12-20 14:51:46 -05:00
Jay Berkenbilt	d866f48081	Change names of qpdf_object_type_e enumerations They have to be ot_* rather than qpdf_ot_* for compatibility. * Different enumerated types are not assignment-compatible in C++, at least with strict compiler settings * While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will only work if the enumerated type names are the same.	2021-12-20 14:51:45 -05:00
Jay Berkenbilt	ea73bf72e0	Further improvements to handling binary strings	2021-12-19 14:30:45 -05:00
Jay Berkenbilt	ddbe59179e	C API: simplify new error handling and improve documentation	2021-12-17 15:59:47 -05:00
m-holger	f6293bd94c	C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597 )	2021-12-17 14:24:43 -05:00
Jay Berkenbilt	feafcc4e88	C API: add several stream functions (fixes #596 )	2021-12-17 13:28:11 -05:00
Jay Berkenbilt	fee7489ee4	Add Pl_Buffer::getMallocBuffer	2021-12-17 12:38:52 -05:00
Jay Berkenbilt	9bb6f570ec	C API: add functions for working with pages (fixes #594 )	2021-12-16 15:07:48 -05:00
Jay Berkenbilt	245ca28066	Use value rather than reference captures where possible	2021-12-16 11:47:07 -05:00
Jay Berkenbilt	af2a71aa2c	Handle bitstream overflow errors more gracefully (fixes #581 ) * Make it a runtime error, not a logic error * Include additional information * Capture it properly in checkLinearization	2021-12-10 15:37:35 -05:00
Jay Berkenbilt	1c62c2a342	C API: expose functions for indirect objects (fixes #588 )	2021-12-10 14:57:35 -05:00
Jay Berkenbilt	72c10d8617	C API: overhaul error handling * Handle error conditions that occur when using the object handle interfaces. In the past, some exceptions were not correctly converted to errors or warnings. * Add more detailed information to qpdf-c.h * Make it possible to work more explicitly with uninitialized objects	2021-12-10 12:16:02 -05:00
Jay Berkenbilt	3340dbe976	Use a specific error code for type warnings and clarify docs	2021-12-10 11:15:49 -05:00
Jay Berkenbilt	b2b2a175c4	Add missing unit test for register progress reporter in C API It was exercised in the pdf-linearize example but not in qpdf-ctest.	2021-12-10 09:11:56 -05:00
Jay Berkenbilt	1faa21502f	Refactor trap_errors to use std::function	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	e3cc171d02	C API: qpdf_oh_is_initialized	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	bef2c2222a	C API: qpdf_get_last_string_length	2021-12-09 10:33:31 -05:00
m-holger	b4fc9eb700	C-API expose new_object as qpdf_oh_new_object	2021-12-02 13:59:58 -05:00
Jay Berkenbilt	720ce9e8f3	Improve testing and error handling around operating before processing	2021-11-29 07:42:36 -05:00
Jay Berkenbilt	ac17308cf6	Initialize QPDF::Members::file (fixes #584 )	2021-11-29 07:16:34 -05:00
m-holger	4630b8567c	Ensure qpdf_oh handles returned by C-API functions are unique. Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array. Update some doc comments in qpdf-c.h.	2021-11-19 13:31:59 +00:00
Jay Berkenbilt	ce7db05d22	Prepare 10.4.0 release	2021-11-16 15:44:09 -05:00
Jay Berkenbilt	750aca5b94	First increment of improving handling of weak crypto (fixes #358 )	2021-11-11 12:24:15 -05:00
Jay Berkenbilt	f45dacf4cb	Make recovery logic flexible about where objects end (fixes #573 ) Don't assume endobj is at the beginning of the line. This means we are looking at tokens for every line, but the odds of n n obj appearing in the middle of the object are likely much lower than endobj not being at the beginning of the line or missing entirely. This will probably have a negative impact on recovery time for very large files. Hopefully it will be worth it.	2021-11-07 15:27:22 -05:00
Jay Berkenbilt	3794f8e2ad	Support OpenSSL 3 (fixes #568 )	2021-11-04 18:24:54 -04:00
Jay Berkenbilt	a84a0b2487	Add range check in QPDFNumberTreeObjectHelper (fuzz issue 37740)	2021-11-04 14:03:24 -04:00
Jay Berkenbilt	4a648b9a00	Fix bug in merging resources /DR from foreign AcroForm (fixes #548 ) When making resources indirect in from_dr, the code was using the wrong owning QPDF, forgetting that from_dr had already been copied using CopyForeignObject.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	9b28933647	Check object ownership when adding When adding a QPDFObjectHandle to an array or dictionary, if possible, check if the new object belongs to the same QPDF. This makes it much easier to find incorrect code than waiting for the situation to be detected when the file is written.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	33a47d5c3c	Make QPDF::findPage public (fixes #516 ) This was originally not public because I wanted to get rid of the pages cache, but I recently realized there were deep reasons not to do that, and the author of pikepdf wanted this, so I decided to make it public.	2021-11-03 09:43:17 -04:00
Jay Berkenbilt	532a4f3d60	Detect recoverable but invalid zlib data streams (fixes #562 )	2021-11-03 09:43:17 -04:00
Fredrik Fornwall	e0775238b8	Fix QPDFEFStreamObjectHelper::{get,set}Subtype The /Subtype entry that specifies the mime type of an embedded file is inside the embedded file stream dictionary directly, not in the parameter dictionary. See Table 45 and 46 in the PDF 1.7 specification: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112	2021-09-10 10:02:24 -04:00
Jay Berkenbilt	3cacb27a90	Performance fix on preserveObjectStreams	2021-05-09 07:51:14 -04:00
Jay Berkenbilt	bddebdb0ea	Prepare 10.3.2 release	2021-05-08 10:41:14 -04:00
Jay Berkenbilt	30ac51bc78	Exclude unreferenced objects in object streams (fixes #520 )	2021-05-08 09:42:09 -04:00
Zdenek Dohnal	16c19e9424	libqpdf/Pl_AES_PDF.cc: remove duplicated if branch Check for this->encrypt seems to be moved to plugged crypto implementations, so it can be removed from Pl_AES_PDF.cc.	2021-04-29 09:42:38 -04:00
Jay Berkenbilt	36c7c20819	Fix timezone portability issue (fixes #515 )	2021-04-17 18:12:55 -04:00
Jay Berkenbilt	8971443e46	QPDF::addPage*: handle duplicate pages more robustly	2021-04-05 10:58:10 -04:00
Jay Berkenbilt	ec48820c3c	Fix loop detection in NNTree	2021-04-05 07:59:02 -04:00
Jay Berkenbilt	258675fc99	Move ABI comment to the right place	2021-04-03 11:43:08 -04:00
Jay Berkenbilt	a77f58142d	Remove some assertions that are not necessarily true (fixes #514 ) Operations that add the same object to multiple places in the pages tree are throwing exceptions and then later causing assertion failures. The assert calls shouldn't be there.	2021-03-21 19:35:23 -04:00
Jay Berkenbilt	3f05429cc5	Prepare 10.3.1 release	2021-03-11 12:59:41 -05:00
Jay Berkenbilt	85884c363c	Allow /DR to be direct in /AcroForm Also handle direct annotation, though this is much less likely.	2021-03-11 11:43:38 -05:00
Jay Berkenbilt	dc65b88457	Prepare 10.3.0 release	2021-03-05 06:15:48 -05:00
Jay Berkenbilt	cb6e53136f	QPDFAcroFormDocumentHelper: add missing analyze calls	2021-03-04 18:11:44 -05:00
Jay Berkenbilt	0b77f2cf26	Revert non-binary-compatible handleWarning change -- see TODO (ABI)	2021-03-04 15:59:46 -05:00
Jay Berkenbilt	f68e25c7f2	Don't use handleWarning, which is being reverted	2021-03-04 15:59:45 -05:00
Jay Berkenbilt	9fb174b9e9	Major rework of handling form fields when copying pages (fixes #509 )	2021-03-04 15:08:37 -05:00
Jay Berkenbilt	887f35efaa	When resolving font from /DR, copy it into resources	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	a2124f992c	Add QPDFMatrix::operator==	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	552303a94a	Check for reserved after dereference	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	d7ffdfa994	Add optional conflict detection to mergeResources Also improve behavior around direct vs. indirect resources.	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	e17585c2d2	Remove unreferenced: ignore names that are not Fonts or XObjects Converted ResourceFinder to ParserCallbacks so we can better detect the name that precedes various operators and use the operators to sort the names into resource types. This enables us to be smarter about detecting unreferenced resources in pages and also sets the stage for reconciling differences in /DR across documents.	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	a15ec6967d	Enhancements to ParserCallbacks	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	1bb209a9bf	Add QPDF::numWarnings	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	37fcc5ff71	Create ResourceFinder from NameWatcher in QPDFPageObjectHelper	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	b444ab3352	Fix typos in coverage cases	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	fa2516df71	Fix behavior for finding /Q, /DA, and /DR for form fields If not found in the field hierarchy, /Q and /DA are supposed to be looked up in the document-level form dictionary. /DR is supposed to only come from the document dictionary.	2021-03-03 17:05:19 -05:00
Jay Berkenbilt	a4d6589ff2	Have QPDFObjectHandle notice when replaceObject was called This results in a performance penalty of 1% to 2% when replaceObject and swapObjects are never called and a somewhat larger penalty if they are called, but it's worth it to avoid very confusing behavior as discussed in depth in qpdf#507.	2021-02-25 07:32:46 -05:00
Jay Berkenbilt	ec6719fd25	Always call dereference() before querying obj pointer	2021-02-25 07:31:26 -05:00
Jay Berkenbilt	b5e937397c	Prepare 10.2.0 release	2021-02-23 10:41:58 -05:00
Jay Berkenbilt	1886673d7e	Spell check	2021-02-23 10:38:05 -05:00
Jay Berkenbilt	9e00be7ffa	Remove warning that gives false positives in some normal cases	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	be3a8c0e7a	Keep only referenced form fields in --pages	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	83216e640c	Preserve form fields when splitting pages (fixes #340 )	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	1f35ec9988	Add methods for copying form fields	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	8e8c0d8290	Add new placeFormXObject that takes a matrix reference	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	61d41e2e88	Add copyAnnotations, use with overlay/underlay (fixes #395 )	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	7b3cbacf5d	Change from QPDF{Array,Dict}Items to aitems() and ditems()	2021-02-22 11:05:39 -05:00
Jay Berkenbilt	a9ae8cadc6	Add transformAnnotations and fix flattenRotations to use it	2021-02-21 17:13:09 -05:00
Jay Berkenbilt	a76decd2d5	Add QPDFObjGen::unparse	2021-02-21 16:21:52 -05:00
Jay Berkenbilt	7540d2082a	Explicitly override inherited rotate in flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	e899926e0d	Use QPDFMatrix inside flattenRotations	2021-02-21 14:58:45 -05:00
Jay Berkenbilt	92fbc6fdf5	QPDFObjectHandle::copyStream	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	60afe4142e	Refactor: separate copyStreamData from replaceForeignIndirectObjects	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	15269f36d8	addFormField: update cache rather than invalidating	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	901f1a788c	Enhance QPDFMatrix API	2021-02-21 06:36:30 -05:00
Jay Berkenbilt	05eb5826d8	Fix isPagesObject and isPageObject There are lots of things with /Kids that are not pages. Repair the pages tree, then do a reliable check.	2021-02-20 19:42:41 -05:00
Jay Berkenbilt	35dd11f356	Allow --rotate=0	2021-02-20 16:29:34 -05:00
Jay Berkenbilt	71e8627285	Add const versions of QPDFMatrix::transform*	2021-02-19 18:35:19 -05:00
Jay Berkenbilt	de8929a41c	Add QPDFAcroFormDocumentHelper::addFormField	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	5cec6b4c3d	Add QPDFPageObjectHelper::getMatrixForFormXObjectPlacement	2021-02-18 12:25:48 -05:00
Jay Berkenbilt	0765872295	Form field for non-widget just returns null	2021-02-18 10:25:07 -05:00
Jay Berkenbilt	0b1623d07d	Add QUtil::path_basename	2021-02-18 09:59:03 -05:00
Jay Berkenbilt	a773f4c71d	Add QPDFObjectHandle::parse for strings with context	2021-02-15 11:33:03 -05:00
Jay Berkenbilt	7eb903d9aa	Use functional replaceStreamData	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	efbb21673c	Add functional versions of QPDFObjectHandle::replaceStreamData Also fix a bug in checking consistency of length for stream data providers. Length should not be checked or recorded if the provider says it failed to generate the data.	2021-02-14 14:42:24 -05:00
Jay Berkenbilt	e2593e2efe	Move QPDFMatrix into the public API	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	07f40bd254	QUtil::double_to_string: trim trailing zeroes with option to disable	2021-02-13 02:30:00 -05:00
Jay Berkenbilt	8fbc8579f2	Allow zone information to be omitted from timestamp strings	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	df067c9ab6	Add autoconf test for localtime_r	2021-02-11 14:26:55 -05:00
Jay Berkenbilt	1b3f84f967	Require C++14 instead of C++11	2021-02-10 16:27:58 -05:00
Jay Berkenbilt	9fcf61b2f6	Fix loop in QPDFOutlineDocumentHelper (fuzz issue 30507)	2021-02-10 16:27:44 -05:00
Jay Berkenbilt	4d1f2fdcac	Update to new name/number tree API	2021-02-10 15:46:20 -05:00
Jay Berkenbilt	1f4771cd0d	Minor clean up of Windows headers	2021-02-10 07:36:18 -05:00
Jay Berkenbilt	ad34b9c278	Implement helpers for file attachments	2021-02-10 06:57:37 -05:00
Jay Berkenbilt	bf0e6eb302	Add QUtil methods for dealing with PDF timestamp strings	2021-02-09 17:50:24 -05:00
Jay Berkenbilt	bfbeec5497	Make newly created name/number trees indirect objects	2021-02-08 06:49:56 -05:00
Jay Berkenbilt	553ac7f353	Add QUtil::pipe_file and QUtil::file_provider	2021-02-07 19:41:34 -05:00
Jay Berkenbilt	e076c9bf08	Remove erroneous handling of /EFF for stream decryption I thought /EFF was supposed to be used as a default for decrypting embedded file streams, but actually it's supposed to be advice to a conforming writer about handling new ones. This makes sense since the findAttachmentStreams code, which is not actually needed, was never right.	2021-02-06 17:08:41 -05:00
Jay Berkenbilt	ac2b3b96e1	Make wrong object stream type a warning	2021-02-06 14:29:11 -05:00
Jay Berkenbilt	faa2e3ddfd	Handle older PDFs whose form XObjects inherit resources (fixes #494 ) When removing unreferenced resources, notice if a page (recursively) contains a form XObject with unreferenced resources, and count any such resources as referenced by the page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	81025e4998	Refactor removal of unreferenced resources Refactor in preparation for resolving unresolved resources in form xobjects from page.	2021-02-02 18:06:05 -05:00
Jay Berkenbilt	9c9ce64eec	Handle strings in inline image dictionaries We need to use token.getRawValue, not token.getValue	2021-01-31 07:50:03 -05:00
Jay Berkenbilt	178f995fc2	Recover from exceptions during filtering for inline images	2021-01-31 07:49:08 -05:00
Jay Berkenbilt	4ae93a73c5	Improve memory safety of dict/array iterators	2021-01-31 07:16:03 -05:00
Jay Berkenbilt	de0b11fc47	Add C++ iterator API around array and dictionary objects	2021-01-30 15:15:23 -05:00
Jay Berkenbilt	35e7859bc7	Make QPDFObjectHandle::is* return false for uninitialized objects	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	50decc9bb8	name/number tree: explicitly declare default destructors	2021-01-29 15:46:54 -05:00
Jay Berkenbilt	8ed3e8c79b	NNTree: rework iterators to be more memory efficient Keep a std::pair internal to the iterators so that operator* can return a reference and operator-> can work, and each can work without copying pairs of objects around.	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	e7e20772ed	name/number trees: remove	2021-01-26 09:12:23 -05:00
Jay Berkenbilt	5816fb44b8	name/number trees: insertAfter	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	16a9bb3f6f	name/number trees: newEmpty, increment/decrement end()	2021-01-25 15:39:10 -05:00
Jay Berkenbilt	b5614f611d	Implement repair and insert for name/number trees	2021-01-24 19:31:45 -05:00
Jay Berkenbilt	04edfe9fad	QPDFObjectHandle::newUnicodeString to use UTF-16 only when needed Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of encoding the string.	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	63e5cb533d	Use new QPDF{Name,Number}TreeObjectHelper API	2021-01-24 03:27:28 -05:00
Jay Berkenbilt	d61ffb65d0	Add new constructors for name/number tree helpers Add constructors that take a QPDF object so we can issue warnings and create new indirect objects.	2021-01-24 03:27:26 -05:00
Jay Berkenbilt	ba814703fb	Use QPDFNameTreeObjectHelper's iterator directly	2021-01-24 03:25:11 -05:00
Jay Berkenbilt	5f0708418a	Add iterators to name/number tree helpers	2021-01-24 03:22:59 -05:00
Jay Berkenbilt	4a1cce0a47	Reimplement name and number tree object helpers Create a computationally and memory efficient implementation of name and number trees that does binary searches as intended by the data structure rather than loading into a map, which can use a great deal of memory and can be very slow.	2021-01-24 03:22:51 -05:00
Jay Berkenbilt	6226b69dba	Add warn() to QPDF's public API	2021-01-16 18:41:53 -05:00
Jay Berkenbilt	fc88837d4b	Treat /EmbeddedFiles as a proper name tree If we ever had an encrypted file with different filters for attachments and either the /EmbeddedFiles name tree was deep or some of the file specs didn't have /Type, we would have overlooked those as attachment streams. The code now properly handles /EmbeddedFiles as a name tree.	2021-01-11 10:50:44 -05:00
Jay Berkenbilt	6fe7b704c7	Warn rather than segv on access after closing input source (fixes #495 )	2021-01-06 10:11:34 -05:00
Jay Berkenbilt	0fed040392	Prepare version 10.1.0	2021-01-04 16:59:55 -05:00
Jay Berkenbilt	18340b8835	Spell check	2021-01-04 16:26:58 -05:00
Jay Berkenbilt	dc92574c10	Fix some pipelines to be safe if downstream write fails (fuzz issue 28262)	2021-01-04 15:17:35 -05:00
Jay Berkenbilt	ba6b6aacf1	Fix outdated comment	2021-01-03 15:59:49 -05:00
Jay Berkenbilt	3be58f49e5	Make more QPDFPageObjectHelper methods work with form XObject	2021-01-02 14:08:53 -05:00
Jay Berkenbilt	98da4fd835	Externalize inline images now includes form XObjects	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	bedf35d6a5	Bug fix: avoid extraneous pipeline finish calls with multiple contents Avoid calling finish() multiple times on the pipeline passed to pipeContentStreams. This commit also fixes a bug in which qpdf was not exiting with the proper exit status if warnings were found while splitting pages; this was exposed by a test case that changed.	2021-01-02 14:08:17 -05:00
Jay Berkenbilt	a139d2b36d	Add several methods for working with form XObjects (fixes #436 ) Make some more methods in QPDFPageObjectHelper work with form XObjects, provide forEach methods to walk through nested form XObjects, possibly recursively. This should make it easier to work with form XObjects from user code.	2021-01-02 12:29:31 -05:00
Jay Berkenbilt	6154221edb	QPDFPageObjectHelper: filterPageContents -> filterContents + form XObject	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	63ea46193d	QPDFPageObjectHelper: getPageImages -> getImages	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	e7a8554563	QPDFPageObjectHelper::getPageImages: support form XObjects	2021-01-02 11:33:36 -05:00
Jay Berkenbilt	1562d34c09	Add QPDFObjectHandle::isFormXObject	2021-01-01 07:36:10 -05:00
Jay Berkenbilt	c9271335fa	Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation	2020-12-30 13:03:55 -05:00
Jay Berkenbilt	12ecd2019a	Add QPDFObjectHandle::setFilterOnWrite	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	3f9191a344	Add ostream << for QPDFObjGen	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	858c7b89bc	Let optimize filter stream parameters instead of making them direct Also removes preclusion of stream references in stream parameters of filterable streams and reduces write times by about 8% by eliminating an extra traversal of the objects.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	1a62cce940	Restructure optimize to allow skipping parameters of filtered streams	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	09027344b9	Refactor: separate code that determines whether to filter a stream	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	39bfa01307	Implement user-provided stream filters Refactor QPDF_Stream to use stream filter classes to handle supported stream filters as well.	2020-12-28 12:58:19 -05:00
Jay Berkenbilt	cc8895078a	Add QPDFObjectHandle::makeDirect(bool allow_streams)	2020-12-26 08:48:18 -05:00
Jay Berkenbilt	573b6eb8b1	Provide qpdf write progress reporting from C API (fixes #487 )	2020-12-20 14:43:24 -05:00
Jay Berkenbilt	2050977099	Add QPDFObjectHandle manipulation to C API	2020-11-28 19:48:07 -05:00
Jay Berkenbilt	78b9d6bfd4	Prepare 10.0.4 release	2020-11-21 13:50:02 -05:00
Jay Berkenbilt	bd79138c84	Treat direct page as runtime rather than logic error (fuzz issue 27393)	2020-11-11 09:50:43 -05:00
Jay Berkenbilt	47f4ebcdac	Ignore unused field in xref entry, avoiding range error (fixes #482 )	2020-11-04 07:46:46 -05:00
Jay Berkenbilt	fbe40b800d	Prepare 10.0.3 release	2020-10-31 13:47:03 -04:00
Jay Berkenbilt	6971f78ff6	Fix stack overflow on direct root (fuzz issue 26761)	2020-10-31 13:10:39 -04:00
Jay Berkenbilt	ffe6af6f77	Add comments explaining the foreign object copying code These are the comments I would have liked to have been able to read while fixing #449 and #478.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	96767fb104	Fix foreign stream copying bug (fixes #478 ) This reverts an incorrect fix to #449 and codes it properly. The real problem was that we were looking at the local dictionaries rather than the foreign dictionaries when saving the foreign stream data. In the case of direct objects, these happened to be the same, but in the case of indirect objects, the object references could be pointing anywhere since object numbers don't match up between the old and new files.	2020-10-31 12:14:26 -04:00
Jay Berkenbilt	da7540794a	Prepare 10.0.2 release	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	09bd1fafb1	Improve efficiency of number to string conversion	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	bcea54fcaa	Revert removal of unreadCh change for performance Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR). Update comments and code to reflect this.	2020-10-27 11:57:48 -04:00
Jay Berkenbilt	b30deaeeab	Avoid merging adjacent tokens when concatenating contents (fixes #444 )	2020-10-23 08:00:04 -04:00
Jay Berkenbilt	8a11feacc3	Avoid leak by resolving object streams more than once (fuzz issue 23642)	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	30bb4c64ee	Minor code cleanup * Return rather than exiting from realmain in qpdf.cc * Remove extraneous blank line * Don't assign temporary to const reference	2020-10-22 15:39:36 -04:00
Jay Berkenbilt	232f5fc9f3	Handle jpeg library fuzz false positives The jpeg library has some assembly code that is missed by the compiler instrumentation used by memory sanitization. There is a runtime environment variable that is used to work around this issue.	2020-10-22 06:31:52 -04:00
Jay Berkenbilt	c1684eae91	Check for overflow in page labels (fuzz issue 23599)	2020-10-22 05:49:24 -04:00
Jay Berkenbilt	7f4a4df919	Add range_check method to QIntC	2020-10-22 05:48:40 -04:00
Jay Berkenbilt	24196c08cb	Fix loop detection error (fuzz issue 23172)	2020-10-22 05:48:35 -04:00
Jay Berkenbilt	956c8f6432	Obscure bug fix copying foreign streams in special cases (fixes #449 ) Specifically, if a stream had its stream data replaced and had indirect /Filter or /DecodeParms, it would result in non-silent loss of data and/or internal error.	2020-10-21 19:23:23 -04:00
Jay Berkenbilt	98f6c00dad	Protect numeric conversion against user's locale (fixes #459 )	2020-10-21 16:42:51 -04:00
Jay Berkenbilt	bed165c9fc	Stop using InputSource::unreadCh	2020-10-18 07:43:05 -04:00
Dean Scarff	153060a0c5	Check integer overflow in resolveObjectsInStream Fixes a crash found by fuzzing.	2020-10-16 20:09:24 -04:00
Dean Scarff	9a3791c53b	Properly detect OPENSSL_IS_BORINGSSL OPENSSL_IS_BORINGSSL is not actually set by configure, so it will be undefined until a BoringSSL header is included. Hence the #ifdef logic in QPDFCrypto_openssl.h would usually never apply. This still worked because evp.h transitively included BoringSSL's cipher.h and digest.h, but the latter are the correct (documented) headers. By re-ordering the includes, we can ensure the macro is defined when we use it. Also: fix case in the header guards.	2020-10-16 20:04:36 -04:00
Dean Scarff	2ff84aa2c9	Include detailed OpenSSL error messages Fixes qpdf/qpdf#450	2020-10-16 19:58:11 -04:00
James R. Barlow	3fc7c99d02	Replace memchr with manual memory search On large files with predominantly \n line endings, memchr(..'\r'..) seems to waste a considerable amount of time searching for a line ending candidate that we don't need. On the Adobe PDF Reference Manual 1.7, this commit is 8x faster at QPDF::processMemoryFile().	2020-10-16 19:57:29 -04:00
oltolm	3221022fc9	fix WindowsCryptProvider fixes #432	2020-10-16 19:56:33 -04:00
Jay Berkenbilt	ff65e272a8	Fix printf formatting for newer msvc Use autoconf rather than ifdefs to determine what format string to use for long long.	2020-10-16 07:02:23 -04:00
Jay Berkenbilt	88b8f8ec86	Remove redundant check found by lgtm.com	2020-10-15 14:47:43 -04:00
Jay Berkenbilt	26514ab731	Write linearization errors to stderr (fixes #438 )	2020-04-29 17:33:34 -04:00
Jay Berkenbilt	92d3cbecd4	Fix warnings reported by -Wshadow=local (fixes #431 )	2020-04-16 12:41:43 -04:00
Jay Berkenbilt	578c5ac66c	Use more references when iterating When possible, use `for (auto&` or `for (auto const&` when iterating using C++-11 style iterators.	2020-04-10 13:30:33 -04:00
Jay Berkenbilt	821a701851	Prepare 10.0.1 release	2020-04-09 11:48:26 -04:00
Jay Berkenbilt	1a7d3700a6	Fix unnecessary copies in auto iter (fixes #426 ) Also switch to colon-style iteration in some cases. Thanks to Dean Scarff for drawing this to my attention after detecting some unnecessary copies with https://clang.llvm.org/extra/clang-tidy/checks/performance-for-range-copy.html	2020-04-08 20:45:26 -04:00
Jay Berkenbilt	4977a7efa5	Bug fix: getStreamData should on unfilterable stream (fixes #425 )	2020-04-08 18:52:04 -04:00
Jay Berkenbilt	1e629c278a	Prepare 10.0.0 release	2020-04-06 11:30:15 -04:00
Jay Berkenbilt	c996f4ac33	Don't include <cwchar> if not building with wchar	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	77198d5310	Delegate random number generation to crypto provider (fixes #418 )	2020-04-06 11:23:02 -04:00
Jay Berkenbilt	52749b85df	Make random data provider code thread-safe This uses C++-11 thread-safe static initializers now.	2020-04-06 10:00:43 -04:00
Jay Berkenbilt	619d294e9d	Remove QUtil::srandom	2020-04-06 09:49:02 -04:00
Dean Scarff	0f2507234f	Add OpenSSL/BoringSSL crypto provider Fixes qpdf/qpdf#417	2020-04-06 09:01:55 -04:00
Jay Berkenbilt	893d38b87e	Allow propagation of errors and retry through StreamDataProvider StreamDataProvider::provideStreamData now has a rich enough API for it to effectively proxy to pipeStreamData.	2020-04-05 20:07:13 -04:00
Jay Berkenbilt	7246404177	JSON: implement pattern keys in schema	2020-04-04 18:06:32 -04:00
Dean Scarff	c5c1a028cd	Use deterministic assignments for unique_id Fixes qpdf/qpdf#419	2020-04-04 08:29:28 -04:00
Jay Berkenbilt	2100b4ce15	Allow qpdf to be built on systems without wchar_t (fixes #406 )	2020-04-03 21:39:44 -04:00
Jay Berkenbilt	6a4117add9	Avoid potential segfault in warning methods	2020-04-03 21:39:20 -04:00
Jay Berkenbilt	4f3b89991b	placeFormXObject: allow control of shrink/expand (fixes #409 )	2020-04-03 21:39:17 -04:00
Jay Berkenbilt	b76b73b229	C API: accept any non-zero value as TRUE	2020-04-03 17:33:44 -04:00
Jay Berkenbilt	54726930df	Remove redundant methods in QUtil This was being saved until we had to break ABI.	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	5806e5c60c	QPDFPageObjectHelper::placeFormXObject: use std::string const& (fixes #374 )	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	97de12343b	Performance: remove Members indirection for Pipeline	2020-04-03 12:17:57 -04:00
Jay Berkenbilt	bfda941519	Use an unordered map for SparseOHArray for efficiency This was added in C++11.	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	ee271fd2f2	Use auto for iterating over sparse array	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	70665cb381	Internally use unsafeShallowCopy where we can	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	38afdcea7b	Add QPDFObjectHandle::unsafeShallowCopy	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	07afb668b1	Performance: remove indirection through Members for QPDFObject	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	89f19b7099	Performance: remove Members indirection for QPDFObjectHandle	2020-04-03 12:16:24 -04:00
Jay Berkenbilt	dac65a21fb	Look in form XObjects when removing unreferenced resources (fixes #373 ) If a page contains a form XObject, also filter the form XObject and remove its unreferenced resources.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	278710fbe8	Refactor QPDFPageObjectHelper::removeUnreferencedResources() Refactor removeUnreferencedResources to prepare for filtering form XObjects.	2020-03-31 17:39:20 -04:00
Jay Berkenbilt	bb6768b8f0	Include header for wcslen (fixes #405 )	2020-02-29 08:43:33 -05:00
Jay Berkenbilt	bb3137296d	Handle root /Pages pointing to other than page tree root (fixes #398 )	2020-02-22 11:10:31 -05:00
Jay Berkenbilt	52a2e95dd5	Prepare 9.1.1 release	2020-01-26 18:49:04 -05:00
Jay Berkenbilt	57c01ef81f	In qdf mode, don't write extra XRef streams (fixes #386 ) fix-qdf assumes there is exactly one XRef stream and that it is at the end of the file.	2020-01-26 16:50:57 -05:00
Jay Berkenbilt	bbc2f8ffae	Bug fix: handle ColorSpace lookup for inline images (fixes #392 ) If the value of /CS in the inline image dictionary was a key in the page's /Resources -> /ColorSpace dictionary, properly resolve it by referencing the proper colorspace, and not just the name, in the external image dictionary.	2020-01-26 15:29:10 -05:00
Cloudmersive	a8b6ff5763	Fix for Windows unable to acquire crypt context with new keyset (fixes #387 ) Fix is based on guidance https://support.microsoft.com/en-us/help/238187/cryptacquirecontext-use-and-troubleshooting and is the proper fix for #285/#286	2020-01-14 18:45:54 -05:00
Jay Berkenbilt	a44b5a34a0	Pull wmain -> main code from qpdf.cc into QUtil.cc	2020-01-14 11:40:51 -05:00
Jay Berkenbilt	ab4061f1ee	Add error detection for read_lines_from_file(FILE*)	2020-01-14 11:07:09 -05:00
Jay Berkenbilt	211a7f57be	QUtil::read_lines_from_file: optional EOL preservation	2020-01-13 11:26:18 -05:00
Jay Berkenbilt	9a398504ca	Refactor QUtil::read_lines_from_file This commit adds the preserve_eol flags but doesn't implement EOL preservation yet.	2020-01-13 09:19:53 -05:00
Jay Berkenbilt	9b0c6022d7	Prepare 9.1.0 release	2019-11-16 22:29:54 -05:00
Jay Berkenbilt	5e6dfc938e	Prepare 9.1.rc1 release	2019-11-09 22:00:53 -05:00
Jay Berkenbilt	c4478e5249	Allow odd/even modifiers in numeric range (fixes #364 )	2019-11-09 13:23:12 -05:00
Jay Berkenbilt	5508f74603	Allow /P in encryption dictionary to be positive (fixes #382 ) Even though this is disallowed by the spec, files like this have been encountered in the wild.	2019-11-09 12:33:15 -05:00
Jay Berkenbilt	127a957aee	Allow runtime inspection/override of crypto provider	2019-11-09 09:53:42 -05:00
Jay Berkenbilt	88bedb41fe	Implement gnutls crypto provider (fixes #218 ) Thanks to Zdenek Dohnal <zdohnal@redhat.com> for contributing the code used for the gnutls crypto provider.	2019-11-09 09:53:38 -05:00
Jay Berkenbilt	cc14523440	Update autoconf to support crypto selection	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d0a53cd3ea	Fix typos in configure.ac	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c03ced09c0	Isolate source files used for native crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	d1ffe46c04	AES_PDF: move CBC logic from pipeline to AES_PDF implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	c8cda4f965	AES_PDF: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	bb427bd117	SHA2: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	eadc222ff9	Rename SHA2 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	4287fcc002	RC4: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	0cdcd10228	Rename RC4 implementation (non-bisectable)	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	ce8f9b6608	MD5: switch to pluggable crypto	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	5c3e856e9f	Rename MD5 implementation (non-bisectable) Just rename MD5 -> MD5_native in place so that git annotate will show the lines as having originated there.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	2de41856a0	QPDFCryptoProvider: initial implementation	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	700f5b961e	Remove int type checks -- subsumed by C++-11	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	653ce3550d	Require C++-11 Includes updates to m4/ax_cxx_compile_stdcxx.m4 to make it work with msvc, which supports C++-11 with no flags but doesn't set __cplusplus to a recent value.	2019-11-09 08:18:02 -05:00
Jay Berkenbilt	9094fb1f8e	Fix two additional fuzz test cases	2019-11-03 18:59:12 -05:00
Masamichi Hosoda	5a842792b6	Parse Contents in signature dictionary without encryption Various PDF digital signing tools do not encrypt /Contents value in signature dictionary. Adobe Acrobat Reader DC can handle a PDF with the /Contents value not encrypted. Write Contents in signature dictionary without encryption Tests ensure that string /Contents are not handled specially when not found in sig dicts.	2019-10-22 16:20:21 -04:00
Masamichi Hosoda	cdc46d78f4	Add QPDFObject::getParsedOffset()	2019-10-22 16:19:06 -04:00
Masamichi Hosoda	50b329ee9f	Add QPDFWriter::getWrittenXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5cf4090aee	Add QPDFWriter::getRenumberedObjGen()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	46ac3e21b3	Add QPDF::getXRefTable()	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	06b818dcd3	Exclude signature dictionary from compressible objects It seems better not to compress signature dictionaries. Various PDF digital signing tools, including Adobe Acrobat Reader DC, do not compress signature dictionaries. Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that /ByteRange in the signature dictionary shall be used to describe a digest that does not include the signature value (/Contents) itself. The byte ranges cannot be determined if the dictionary is compressed.	2019-10-22 16:16:16 -04:00
Masamichi Hosoda	5e0ba12687	Fix /Contents value representation in a signature dictionary Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference describes that the value of Contents entry is a hexadecimal string representation when ByteRange is specified. This commit makes QPDF always use a hexadecimal string representation instead of a literal string for it.	2019-10-22 16:16:16 -04:00
Jay Berkenbilt	3094955dee	Prepare 9.0.2 release	2019-10-12 19:37:40 -04:00
Jay Berkenbilt	4ea940b03c	Prepare 9.0.1 release	2019-09-20 07:38:18 -04:00
Jay Berkenbilt	685250d7d6	Correct reversed Rectangle coordinates (fixes #363 )	2019-09-19 21:25:34 -04:00
Jay Berkenbilt	48b7de2cc3	Fix typo in comment	2019-09-19 21:04:32 -04:00
Jay Berkenbilt	8b1e307741	Warn for duplicated dictionary keys (fixes #345 )	2019-09-19 20:22:34 -04:00
Jay Berkenbilt	bb83e65193	Fix fuzz issue 16953 (overflow checking in xref stream index)	2019-09-17 19:48:47 -04:00
Jay Berkenbilt	17d431dfd5	Fix integer type warnings for big-endian systems	2019-09-17 19:14:27 -04:00
Jay Berkenbilt	5462dfce31	Prepare 9.0.0 release	2019-08-31 20:07:36 -04:00
Jay Berkenbilt	babd12c9b2	Add methods QPDF::anyWarnings and QPDF::closeInputSource	2019-08-31 15:51:20 -04:00
Jay Berkenbilt	4fa7b1eb60	Add remove_file and rename_file to QUtil	2019-08-31 15:51:04 -04:00
Jay Berkenbilt	0e51a9aca6	Don't encrypt trailer, fixes fuzz issue 15983 Ordinarily the trailer doesn't contain any strings, so this is usually a non-issue, but if the trailer contains strings, linearizing and encrypting with object streams would include encrypted strings in the trailer, which would blow out the padding because encrypted strings are longer than their cleartext counterparts.	2019-08-28 23:06:32 -04:00
Jay Berkenbilt	47a38a942d	Detect stream in object stream, fixing fuzz 16214 It's detected in QPDFWriter instead of at parse time because I can't figure out how to construct a test case in a reasonable time. This commit moves the fuzz file into the regular test suite for a QTC coverage case.	2019-08-28 12:49:04 -04:00
Jay Berkenbilt	ba5fb69164	Make popping pipeline stack safer Use destructors to pop the pipeline stack, and ensure that code that pops the stack is actually popping the intended thing.	2019-08-27 22:27:47 -04:00
Jay Berkenbilt	dadf8307c8	Fix fuzz issues 15316 and 15390	2019-08-27 20:39:06 -04:00
Jay Berkenbilt	456c285b02	Fix fuzz issue 16172 (overflow checking in OffsetInputSource)	2019-08-27 13:08:07 -04:00
Jay Berkenbilt	ad8081daf5	Fix fuzz issue 15442 (overflow checking in BufferInputSource)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	9a095c5c76	Seek in two stages to avoid overflow When seeking to a position based on a value read from the input, we are prone to integer overflow (fuzz issue 15442). Seek in two stages to move the overflow check into the input source code.	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	ac5e6de2e8	Fix fuzz issue 15387 (overflow checking xref size)	2019-08-27 11:26:25 -04:00
Jay Berkenbilt	6bc4cc3d48	Fix fuzz issue 15475	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	94e86e2528	Fix fuzz issue 16301	2019-08-25 22:52:25 -04:00
Jay Berkenbilt	5da146c8b5	Track separately whether password was user/owner (fixes #159 )	2019-08-24 11:01:19 -04:00
Jay Berkenbilt	5a0aef55a0	Split long line	2019-08-24 10:58:51 -04:00
Jay Berkenbilt	2794bfb1a6	Add flags to control zlib compression level (fixes #113 )	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	dac0598b94	Add ability to set zlib compression level globally	2019-08-23 20:34:21 -04:00
Jay Berkenbilt	3f1ab64066	Pass offset and length to ParserCallbacks::handleObject	2019-08-22 22:54:29 -04:00
Jay Berkenbilt	4b2e72c4cd	Test for direct, rather than resolved nulls in parser Just because we know an indirect reference is null, doesn't mean we shouldn't keep it indirect.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	3f3dbe22ea	Remove array null flattening For some reason, qpdf from the beginning was replacing indirect references to null with literal null in arrays even after removing the old behavior of flattening scalar references. This seems like a bad idea.	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	225cd9dac2	Protect against coding error of re-entrant parsing	2019-08-22 17:55:16 -04:00
Jay Berkenbilt	ae5bd7102d	Accept extraneous space before xref (fixes #341 )	2019-08-19 22:24:53 -04:00
Jay Berkenbilt	8a9086a689	Accept extraneous space after stream keyword (fixes #329 )	2019-08-19 21:43:44 -04:00
Jay Berkenbilt	43f91f58b8	Improve invalid name token warning message This message used to only appear for PDF >= 1.2. The invalid name is valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer version, it's better to detect and warn in all cases. Therefore make the warning more informative.	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	42d396f1dd	Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332 )	2019-08-19 19:48:27 -04:00
Jay Berkenbilt	d9dd99eca3	Attempt to repair /Type key in pages nodes (fixes #349 )	2019-08-18 18:54:37 -04:00
Jay Berkenbilt	522d2b2227	Improve efficiency of fixDanglingReferences	2019-08-18 09:00:40 -04:00
Jay Berkenbilt	5187a3ec85	Shallow copy arrays without removing sparseness	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	bf7c6a8070	Use SparseOHArray in parsing	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e5f504b6c5	Use SparseOHArray in QPDF_Array	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	a89d8a0677	Refactor QPDF_Array in preparation for using SparseOHArray	2019-08-17 23:02:41 -04:00
Jay Berkenbilt	e83f3308fb	SparseOHArray	2019-08-17 23:02:41 -04:00
Thorsten Schöning	8f06da7534	Change list to vector for outline helpers (fixes #297 ) This change works around STL problems with Embarcadero C++ Builder version 10.2, but std::vector is more common than std::list in qpdf, and this is a relatively new API, so an API change is tolerable. Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com> for the fix.	2019-07-03 20:08:47 -04:00
Jay Berkenbilt	4db1de97ce	Convert some cases of logic_error to runtime_error There were a few cases that could be caused by invalid input rather than bugs in the code which were throwing logic_error instead of runtime_error.	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	201e8798d7	Convert previously overlooked static cast to QIntC	2019-06-25 12:43:06 -04:00
Jay Berkenbilt	04f45cf652	Treat all linearization errors as warnings This also reverts the addition of a new checkLinearization that distinguishes errors from warnings. There's no practical distinction between what was considered an error and what was considered a warning.	2019-06-23 13:45:45 -04:00
Jay Berkenbilt	c5ed1b8075	Handle invalid encryption Length (fixes #333 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	551dfbf697	Allow set*EncryptionParameters before filename is set (fixes #336 )	2019-06-22 20:57:33 -04:00
Jay Berkenbilt	7bd38a3eb3	Provide error message in Windows crypto code (fixes #286 ) Thanks to github user zdenop for supplying some additional error-handling code.	2019-06-22 17:12:01 -04:00
Jay Berkenbilt	6c39aa8763	In shippable code, favor smart pointers (fixes #235 ) Use PointerHolder in several places where manual memory allocation and deallocation were being used. This helps to protect against memory leaks when exceptions are thrown in surprising places.	2019-06-22 16:57:52 -04:00
Jay Berkenbilt	85a3f95a89	qpdf: exit 3 for linearization warnings without errors (fixes #50 )	2019-06-22 16:57:51 -04:00
Jay Berkenbilt	1bde5c68a3	Add QUtil::read_file_into_memory This code was essentially duplicated between test_driver and standalone_fuzz_target_runner.	2019-06-22 10:14:25 -04:00
Jay Berkenbilt	658b5bb3be	QPDFWriter: clean up overloaded functions In a small number of cases, it makes sense to replace an overloaded function with a function that takes a default argument. We can do this now because we've already broken binary compatibility since the last release.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	79f6b4823b	Convert remaining public classes to use Members pattern Have classes contain only a single private member of type PointerHolder<Members>. This makes it safe to change the structure of the Members class without breaking binary compatibility. Many of the classes already follow this pattern quite successfully. This brings in the rest of the classes that are part of the public API.	2019-06-22 10:13:27 -04:00
Jay Berkenbilt	45dac410b5	Remove broken QPDFTokenizer::expectInlineImage	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	25dd3c6750	Remove QPDF::copyForeignObject with unused parameter	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	c6cfd64503	Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242 )	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	848351f1fc	Add missing #include <cstring>	2019-06-21 22:29:31 -04:00
Jay Berkenbilt	b07ad6794e	Fix bugs found by fuzz tests * Several assertions in linearization were not always true; change them to run time errors * Handle a few cases of uninitialized objects * Handle pages with no contents when doing form operations * Handle invalid page tree nodes when traversing pages	2019-06-21 17:56:24 -04:00
Jay Berkenbilt	a35d4ce9cc	Fix bounds error in utf16_to_utf8 conversion	2019-06-21 17:40:24 -04:00
Jay Berkenbilt	63a643a3c7	Remove implicit conversion from int/pointer to bool This fixes cases of warning C4800 from msvc	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	d71f05ca07	Fix sign and conversion warnings (major) This makes all integer type conversions that have potential data loss explicit with calls that do range checks and raise an exception. After this commit, qpdf builds with no warnings when -Wsign-conversion -Wconversion is used with gcc or clang or when -W3 -Wd4800 is used with MSVC. This significantly reduces the likelihood of potential crashes from bogus integer values. There are some parts of the code that take int when they should take size_t or an offset. Such places would make qpdf not support files with more than 2^31 of something that usually wouldn't be so large. In the event that such a file shows up and is valid, at least qpdf would raise an error in the right spot so the issue could be legitimately addressed rather than failing in some weird way because of a silent overflow condition.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	f40ffc9d63	Pl_Flate: constructor's out_bufsize is now unsigned int This is the type we need for the underlying zlib implementation.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	da30764bce	Change QPDFObjectHandle::pipeStreamData's encode_flags type Change from unsigned long to int since we pass enumerated type values to this field.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	3608afd5c5	Add new integer accessors to QPDFObjectHandle	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	42306e2ff8	QUtil: add unsigned int/string functions	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	2155815234	configure: determine wordsize automatically Based on sizeof(size_t). Assumes 64 if not 32.	2019-06-21 13:17:21 -04:00
Jay Berkenbilt	713d961990	Appearance streams: some floating point values were truncated Bounding box X coordinates could be truncated, causing them to be off by a fraction of a point. This was most likely not visible, but it was still wrong.	2019-06-20 21:32:30 -04:00
Jay Berkenbilt	eb7948876b	Fix problems found in fuzz corpus	2019-06-15 17:24:24 -04:00
Jay Berkenbilt	cf469d7890	Give up reading objects with too many consecutive errors	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	cd830968ef	Eliminate one potential integer overflow There are more to handle, but this resolves an issue already caught by oss-fuzz.	2019-06-15 08:52:19 -04:00
Jay Berkenbilt	31bde2f9d7	Handle empty /DecodeParms array (fixes #331 ) On read, ignore /DecodeParms when empty list; on write, delete it. Some files have been found that include an empty list for /DecodeParms, but this is not technically compliant with the spec, and the only sensible interpretation is to treat it as if there are no decode parameters.	2019-06-09 17:19:49 -04:00
Jay Berkenbilt	b1a78be1a8	Prepare 8.4.2 release	2019-05-18 08:56:37 -04:00
Jay Berkenbilt	b3f0dbff62	Fix Windows memory error (fixes #330 )	2019-05-16 14:26:51 -04:00
Jay Berkenbilt	a323f6f49f	Prepare 8.4.1 release	2019-04-27 20:44:20 -04:00
Jay Berkenbilt	81205e007b	Spell check	2019-04-21 13:09:11 -04:00
Jay Berkenbilt	011695dfdf	Support Unicode in filenames (fixes #298 )	2019-04-20 21:00:43 -04:00
Jay Berkenbilt	4ccb29912a	Tighten isPageObject (fixes #310 )	2019-04-20 21:00:43 -04:00
Thorsten Schöning	2c704b99a1	Undefined functions because of missing std:: or header. (#295 ) * [bcc32 Error] QPDF.cc(375): E2268 Call to undefined function 'atof' Full parser context QPDF.cc(358): parsing: void QPDF::parse(const char ) [bcc32 Error] QPDFTokenizer.cc(183): E2268 Call to undefined function 'strtol' Full parser context QPDFTokenizer.cc(163): parsing: void QPDFTokenizer::resolveLiteral() * [bcc32 Error] pdf-split-pages.cc(52): E2268 Call to undefined function 'exit' Full parser context pdf-split-pages.cc(50): parsing: void usage() * PR #295: Including "cstdlib" should be replaced with "stdlib.h" to be more consistent. At the same time I changed the order of the surrounding includes to reflect alphabetical order, because in some files this was already the case.	2019-03-12 10:05:29 -04:00
Thorsten Schöning	71b7ed9f4f	"_setmode" and "_stricmp" are not available on Borland C++Builder, neither the classic one nor newer ones based on CLANG.	2019-03-11 16:58:55 -04:00
Jay Berkenbilt	da7c2c0ee9	Fix json serialization for {x | -1 < x < 1} (fixes #308 ) JSON serialization was preserving the value as presented, but JSON doesn't accept decimal values without a 0 before the decimal point.	2019-03-11 16:22:59 -04:00
Jay Berkenbilt	03074ca5a0	Prepare 8.4.0 release	2019-02-01 22:25:25 -05:00
Jay Berkenbilt	fec5bb124c	Spell check	2019-01-31 21:41:29 -05:00
Jay Berkenbilt	eb49e07c0a	Make inline image token exactly contain the image data Do not include the trailing EI, and handle cases where EI is not preceded by a delimiter. Such cases have been seen in the wild.	2019-01-31 20:28:44 -05:00
Jay Berkenbilt	5211bcb5ea	Externalize inline images (fixes #278 )	2019-01-31 10:38:13 -05:00
Jay Berkenbilt	1eb35a355f	Exclude space after ID in image data	2019-01-31 10:38:10 -05:00
Jay Berkenbilt	2b6c79bcae	Improve locating inline image's EI We've actually seen a PDF file in the wild that contained EI surrounded by delimiters inside the image data, which confused qpdf's naive code. This significantly improves EI detection.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	ec9e310c9e	Refactor QPDFTokenizer's inline image handling Add a version of expectInlineImage that takes an input source and searches for EI. This is in preparation for improving the way EI is found. This commit just refactors the code without changing the functionality and adds tests to make sure the old and new code behave identically.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	31372edce0	Inline image token value ends with EI, not delimiter The inline image token erroneously included the delimiter that followed EI. The ObjectHandle created from it was correct.	2019-01-31 09:26:37 -05:00
Jay Berkenbilt	b776dcd2d3	Clean up some private functions	2019-01-29 22:14:20 -05:00
Jay Berkenbilt	8a9cfd2605	Handle direct page objects (fixes #164 )	2019-01-29 17:01:36 -05:00
Jay Berkenbilt	2d0885bc11	Clarify documentation for copyForeignObject regarding pages Make explicit that copyForeignObject can be used on page objects and will copy them properly but not update the pages tree.	2019-01-28 21:53:55 -05:00
Jay Berkenbilt	2712869cf9	Fix logic for when to compress object and xref streams (fixes #271 )	2019-01-28 21:43:06 -05:00
Jay Berkenbilt	52f9d326a5	Resolve duplicated page objects (fixes #268 ) When linearizing a file or getting the list of all pages in a file, detect if the pages tree contains a duplicated page object and, if so, shallow copy it. This makes it possible to have a one to one mapping of page positions to page objects.	2019-01-28 20:29:58 -05:00
Jay Berkenbilt	623f5b664e	Convert pages to form XObjects Support conversion of pages to form XObjects and placement of form XObjects on pages.	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	68ccd87c9e	Move rectangle transformation into QPDFMatrix	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	8cb245739c	Add QPDFObjectHandle::getUniqueResourceName	2019-01-27 07:50:30 -05:00
Jay Berkenbilt	009767d97a	Handle inheritable page attributes Add getAttribute for handling inheritable page attributes, and fix getPageImages and annotation flattening code to use it.	2019-01-25 22:30:05 -05:00
Jay Berkenbilt	2d32f4db8f	Handle fallback font size in text appearances If we end up using our fallback font size when generating appearances for text fields, reflect that in the Tf operator used in the appearance stream.	2019-01-21 07:38:21 -05:00
Jay Berkenbilt	9cb599875b	Improve text objects used in text appearance streams	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	930eade6d3	Fix omissions in text appearance generation When generating appearance streams for variable text annotations, properly handle the cases of there being no appearance dictionary, no appearance stream, or an appearance stream with no BMC..EMC marker.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	65ef0bf313	When flattening, remove annotations with no appearance stream With the exception of form field annotations when /NeedAppearances is true, remove annotations that don't have appearance streams when flattening. There is no reason to keep these when flattening since they are invisible. This may include unchecked checkboxes, unshown popup windows, etc.	2019-01-20 23:05:58 -05:00
Jay Berkenbilt	c18ee440a3	mingw workaround for QPDFExc destructor mingw doesn't like it when you don't inline empty virtual destructors.	2019-01-19 10:14:07 -05:00
Jay Berkenbilt	e87d149918	Add QUtil::possible_repaired_encodings	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6ec22f117d	Modernize encryption API for more granularity Setting encryption permissions for R >= 3 set permission bits in groups corresponding to menu options in Acrobat 5. The new API allows the bits to be set individually.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4630377731	Add status-reporting transcoders to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	8f389f14c0	QUtil::analyze_encoding	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	6817ca585a	Bidirectional transcoding for win, mac, pdf, utf8, utf16	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	698485468a	Move remaining existing transcoding to QUtil	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	5cfcd4f361	Additional checks for unreferenced resources Explicitly abandon removal of unreferenced resources if there are any lexical errors in the page's contents. This case always generated a warning, but it now also prevents removal of unreferenced resources, thus strongly decreasing the likelihood of data loss.	2019-01-17 11:43:56 -05:00
Jay Berkenbilt	4bc434000c	Copy subdictionaries when removing resources (fixes #276) When removing unreferenced resources, the code was copying the overall resource dictionaries but not the subdictionaries being modified. This was a "typo" in the code -- the comment clearly stated the need to do this, but the code replaced the dictionary with itself rather than with a shallow copy of itself.	2019-01-17 09:40:05 -05:00
Jay Berkenbilt	654c0e8caf	Allow adding the same page more than once in --pages (fixes #272)	2019-01-12 10:01:47 -05:00
Jay Berkenbilt	4ecd1df6f2	Add configure option AVOID_WINDOWS_HANDLE If set, we avoid using Windows I/O HANDLE, which is disallowed in some versions of the Windows SDK, such as for Windows phones. QUtil::same_file will always return false in this case. Only applies to Windows builds.	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	d24a120c7f	Add QPDF::setImmediateCopyFrom	2019-01-10 22:35:08 -05:00
Jay Berkenbilt	b653929c93	Update version to 8.3.0	2019-01-07 11:16:54 -05:00
Jay Berkenbilt	aa602fd107	Fix integer overflow in large file test	2019-01-07 08:49:14 -05:00
Jay Berkenbilt	c3cee5f154	Exercise out of scope original pdf for copyForeignObject	2019-01-07 07:38:03 -05:00
Jay Berkenbilt	fddbcab0e7	Mostly don't require original QPDF for copyForeignObject (fixes #219) The original QPDF is only required now when the source QPDFObjectHandle is a stream that gets its stream data from a QPDFObjectHandle::StreamDataProvider.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	fbbb0ee016	Make a static version of QPDF::pipeStreamData This is in preparation of being able to pipe a stream's data without keeping a copy of its containing qpdf object.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	7588cac295	Create an application-scope unique ID for each QPDF object Use this instead of QPDF* as a map key for object_copiers.	2019-01-07 00:11:15 -05:00
Jay Berkenbilt	e27ac682e0	Move encryption parameters into a class	2019-01-06 09:58:16 -05:00

1779 Commits