octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-12-23 03:18:59 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	b7bbf12e85	In json mode, reveal recovered user password when otherwise unavailable	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	f049a77c59	Add additional information when listing attachments	2022-05-30 20:03:08 -04:00
Jay Berkenbilt	27a42c16c7	Change default decode level to "none" with --json-output	2022-05-21 17:51:34 -04:00
Jay Berkenbilt	b0f1564376	Add another binary utf8 to JSON test	2022-05-21 17:39:35 -04:00
Jay Berkenbilt	752f43d4e4	Allow empty b: binary JSON strings	2022-05-21 17:36:32 -04:00
m-holger	6c69a747b9	Code clean up: use range-style for loops wherever possible Remove variables obsoleted by commit `4f24617`.	2022-05-21 16:06:29 -04:00
Jay Berkenbilt	905f47a55f	Add json to large file test	2022-05-21 09:43:45 -04:00
Jay Berkenbilt	9b2eb01e25	Exercise object description in tests	2022-05-20 14:23:32 -04:00
Jay Berkenbilt	6c2fb5b8f0	Add test for bad data and bad datafile	2022-05-20 13:33:30 -04:00
Jay Berkenbilt	d065098089	Test --update-from-json	2022-05-20 11:10:12 -04:00
Jay Berkenbilt	6d4e3ba8a4	Test (and fix) handling of dangling references	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	35b1e1c493	Explicitly test ignoring unknown keys in JSON input	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	dc8df962d8	Make version default to latest for --json-output (like --json)	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	907df2c823	Round-trip tests with --json-stream-data=file	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	a83b7b0611	Tests with manually constructed qpdf json	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	7f8c4b183d	Add tests for --json-input	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	1ec561daa4	Add more names and strings in good13 * native UTF-8 strings * names whose PDF and canonical syntax differ in both dictionary key positions and other positions For json, names are converted both as names and directly when used as dictionary keys.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	6c5e590673	Rename all test files: _ to -	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	6f43bf8de3	Major rework -- see long comments * Replace --create-from-json=file with --json-input, which causes the regular input to be treated as json. * Eliminate --to-json * In --json=2, bring back "objects" and eliminate "objectinfo". Stream data is never present. * In --json-output=2, write "qpdf-v2" with "objects" and include stream data.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	56f1b411fe	Back out fluent QPDFObjectHandle methods. Keep the andGet methods. I decided these were confusing and inconsistent with how JSON works. They muddle the API rather than improving it.	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	7e7a9c4379	Parse objects; stream data is not yet handled	2022-05-20 09:16:25 -04:00
Jay Berkenbilt	7fa5d1773b	Implement top-level qpdf json parsing	2022-05-16 13:41:40 -04:00
Jay Berkenbilt	9a0e9a1a9e	Remove offset from missing /Root error The last offset is irrelevant to not being able to find /Root.	2022-05-16 13:39:26 -04:00
Jay Berkenbilt	173b944ef8	Split qpdf.test into multiple test suites This makes it a lot easier to run parts of the test suite.	2022-05-14 17:35:06 -04:00
Jay Berkenbilt	2a2f7f1bba	Add maxobjectid to JSON	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	e9390aeaaa	Add --to-json option	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	2e87d593eb	Test inline stream data with different decode levels	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	f08f398920	Test json v2 with invalid stream data	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	c76536dd9a	Implement JSON v2 output	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	bdfc4da510	Apply script across future v2 test files There is one unexpected pass in this commit. This script was applied to the files changed in this commit: ---------- #!/usr/bin/env python3 import json import sys def json_dumps(data): return json.dumps(data, ensure_ascii=False, indent=2, separators=(',', ': ')) for filename in sys.argv[1:]: with open(filename, 'r') as f: data = json.loads(f.read()) data['version'] = 2 objectinfo = {} if 'objectinfo' in data: objectinfo = data['objectinfo'] del data['objectinfo'] if 'objects' not in data: continue qpdf = {'jsonversion': 2, 'pdfversion': '1.3', 'objects': {}} for k, v in data['objects'].items(): is_stream = objectinfo.get(k, {}).get('stream', {}).get('is', False) if k.endswith(' R'): k = 'obj:' + k if is_stream: v = {'stream': {'dict': v}} else: v = {'value': v} qpdf['objects'][k] = v data['qpdf'] = qpdf del data['objects'] print(json_dumps(data)) ----------	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	8d348974aa	Prepare test suite for json v2	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	15272662f6	Fix typo in json output key name moddify -> modify. Also carefully spell checked all remaining keys by splitting them into words and running a spell checker, not just relying on visual proofreading. That was the only one.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	1bc8abfdd3	Implement JSON v2 for Stream Not fully exercised in this commit	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	3246923cf2	Implement JSON v2 for String Also refine the heuristic for deciding whether to use hexadecimal notation for a string.	2022-05-08 13:45:20 -04:00
Jay Berkenbilt	16f4f94cd9	Prepare code for JSON v2 Update getJSON() methods and calls to them	2022-05-07 11:12:01 -04:00
Jay Berkenbilt	a9fbbd5dca	Objectinfo json: write incrementally and in numeric order This script was used on test data: ---------- #!/usr/bin/env python3 import json import sys import re def json_dumps(data): return json.dumps(data, ensure_ascii=False, indent=2, separators=(',', ': ')) for filename in sys.argv[1:]: with open(filename, 'r') as f: data = json.loads(f.read()) if 'objectinfo' not in data: continue trailer = None to_sort = [] for k, v in data['objectinfo'].items(): if k == 'trailer': trailer = v else: m = re.match(r'^(\d+) \d+ R', k) if m: to_sort.append([int(m.group(1)), k, v]) newobjectinfo = {x[1]: x[2] for x in sorted(to_sort)} if trailer is not None: newobjectinfo['trailer'] = trailer data['objectinfo'] = newobjectinfo print(json_dumps(data)) ----------	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	948de60990	Objects json: write incrementally and in numeric order The following script was used to adjust test data: ---------- #!/usr/bin/env python3 import json import sys import re def json_dumps(data): return json.dumps(data, ensure_ascii=False, indent=2, separators=(',', ': ')) for filename in sys.argv[1:]: with open(filename, 'r') as f: data = json.loads(f.read()) if 'objects' not in data: continue trailer = None to_sort = [] for k, v in data['objects'].items(): if k == 'trailer': trailer = v else: m = re.match(r'^(\d+) \d+ R', k) if m: to_sort.append([int(m.group(1)), k, v]) newobjects = {x[1]: x[2] for x in sorted(to_sort)} if trailer is not None: newobjects['trailer'] = trailer data['objects'] = newobjects print(json_dumps(data)) ----------	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	dc9b7287cd	Top-level json: write incrementally This commit just changes the order in which fields are written to the json without changing their content. All the json files in the test suite were modified with this script to ensure that we didn't get any changes other than ordering. ---------- #!/usr/bin/env python3 import json import sys def json_dumps(data): return json.dumps(data, ensure_ascii=False, indent=2, separators=(',', ': ')) for filename in sys.argv[1:]: with open(filename, 'r') as f: data = json.loads(f.read()) newdata = {} for i in ('version', 'parameters', 'pages', 'pagelabels', 'acroform', 'attachments', 'encrypt', 'outlines', 'objects', 'objectinfo'): if i in data: newdata[i] = data[i] print(json_dumps(newdata)) ----------	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	7f65a5c21f	Test json against schema only on demand Testing json against schema requires an in-memory copy, so do it only when requested by the test suite.	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	a3c9980395	Add next to Pl_String and fix comments	2022-05-07 08:26:31 -04:00
Jay Berkenbilt	e5f3910c3e	Add new FileInputSource constructors	2022-05-04 12:07:11 -04:00
Jay Berkenbilt	8b25de24c9	Make "objects" and "pages" consistent in JSON output	2022-05-04 08:32:44 -04:00
Jay Berkenbilt	f4206a0938	Add new Pl_String Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	16139d97c8	Add new Pl_OStream Pipeline	2022-05-03 18:54:51 -04:00
Jay Berkenbilt	21d6e3231f	Make use of the new Pipeline methods in some places	2022-05-03 18:31:23 -04:00
Jay Berkenbilt	59f3e09edf	Make Pipeline::write take an unsigned char const* (API change)	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	62bf296a9c	Make assert handling less error-prone Prevent my future self or other contributors from using assert in tests and then having that assert not do anything because of the NDEBUG macro.	2022-05-03 18:31:22 -04:00
Jay Berkenbilt	8ccd3a8a89	Mark weak encryption with API changes (fixes #576 )	2022-04-30 17:24:15 -04:00
Jay Berkenbilt	cff26040d8	Using insecure crypto from the CLI is now an error by default	2022-04-30 17:23:58 -04:00
Jay Berkenbilt	4f24617e1e	Code clean up: use range-style for loops wherever possible Where not possible, use "auto" to get the iterator type. Editorial note: I have avoided this change for a long time because of not wanting to make gratuitous changes to version history, which can obscure when certain changes were made, but with having recently touched every single file to apply automatic code formatting and with making several broad changes to the API, I decided it was time to take the plunge and get rid of the older (pre-C++11) verbose iterator syntax. The new code is just easier to read and understand, and in many cases, it will be more efficient as fewer temporary copies are being made. m-holger, if you're reading, you can see that I've finally come around. :-)	2022-04-30 13:27:18 -04:00
Jay Berkenbilt	7f023701dd	Formatting: remove space in range-style for loops Change .clang-format and commit automated changes from a fresh run of format-code	2022-04-30 13:26:43 -04:00
Jay Berkenbilt	2878c186bf	Use fluent appendItem	2022-04-30 10:54:16 -04:00
Jay Berkenbilt	ab9d557cb0	Use fluent replaceKey	2022-04-29 20:39:54 -04:00
Jay Berkenbilt	e80fad86e9	Add new QPDFObjectHandle methods for more fluent programming	2022-04-29 20:09:10 -04:00
Jay Berkenbilt	d0b7cc8ac6	QPDFJob json: make removeAttachment take an array (fixes #693 )	2022-04-24 13:06:19 -04:00
Jay Berkenbilt	08ba21cf49	Fix some bugs around null values in dictionaries Make it so that a key with a null value is always treated as not being present. This was inconsistent before.	2022-04-24 10:08:32 -04:00
Jay Berkenbilt	4be2f36049	Deprecate replaceOrRemoveKey -- it's the same as replaceKey	2022-04-24 09:31:32 -04:00
Jay Berkenbilt	b8d0b0b638	Re-add accidentally removed qpdf.testcov	2022-04-24 09:18:04 -04:00
Jay Berkenbilt	68e721981a	Add new QPDF::warn that takes most of QPDFExc's arguments	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	22b35c4928	Expose QUtil::get_next_utf8_codepoint	2022-04-23 18:25:43 -04:00
Jay Berkenbilt	ce5c3bcad8	QPDFJob: pass capture output streams through to underlying QPDF	2022-04-18 11:24:17 -04:00
Jay Berkenbilt	80ed3076a0	Remove deprecated name/number tree constructors Remove the name/number tree object helper constructors that don't take a QPDF&.	2022-04-16 13:13:15 -04:00
Jay Berkenbilt	cdd0b4fb7d	Use = default and = delete where possible in classes	2022-04-16 11:39:14 -04:00
Jay Berkenbilt	ce86307a1a	Fix typo in error message	2022-04-10 16:54:23 -04:00
Jay Berkenbilt	5f4675bb24	Mark non-ABI symbols in exported class with QPDF_DLL_PRIVATE	2022-04-10 16:54:23 -04:00
Jay Berkenbilt	5525c93124	Use QPDF_DLL_CLASS with Pipeline and InputSource subclasses This enables RTTI so we can use dynamic_cast on them across the shared object boundary.	2022-04-10 16:52:57 -04:00
Jay Berkenbilt	07edf96440	Remove methods of private classes from ABI Prior to the cmake conversion, several private classes had methods that were exported into the shared library so they could be tested with libtests. With cmake, we build libtests using an object library, so this is no longer necessary. The methods that are disappearing from the ABI were never exposed through public headers, so no code should be using them. Removal had to wait until the window for ABI-breaking changes was open.	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	128e41648f	Remove PointerHolder.hh from other than public header files Increase to POINTERHOLDER_TRANSITION=4	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	ba0ef7a124	Replace PointerHolder with std::shared_ptr in the rest of the code Increase to POINTERHOLDER_TRANSITION=3 patrepl s/PointerHolder/std::shared_ptr/g */.cc */.hh patrepl s/make_pointer_holder/std::make_shared/g */.cc patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g */.cc patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, */.cc */.hh git restore include/qpdf/PointerHolder.hh git restore libtests/pointer_holder.cc cleanpatch ./format-code	2022-04-09 17:33:29 -04:00
Jay Berkenbilt	12f1eb15ca	Programmatically apply new formatting to code Run this: for i in */.cc */.c */.h */.hh; do clang-format < $i >\| $i.new && mv $i.new $i done	2022-04-04 08:10:40 -04:00
Jay Berkenbilt	820a3f04fd	Remove "lt-" workarounds The executables that libtool built invoked the underlying binary with an "lt-" prefix. The code contained numerous workarounds for testing, which can now be removed.	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	acdf5b2e7a	Update process for ABI testing	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	f58d2a60d5	Update build-related documentation and comments	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	70d0d0889b	Remove old build files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	b8aff90997	Add cmake configuration files	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	6941923ca9	Improve large file test output	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	c0a231bf32	Reverse sense of compare images toggle for qpdf.test Run compare images tests when QPDF_TEST_COMPARE_IMAGES is set rather than when QPDF_SKIP_TEST_COMPARE_IMAGES is not set.	2022-03-18 19:53:18 -04:00
Jay Berkenbilt	17c0e38c8e	Force assert to be defined in test code	2022-03-07 10:07:27 -05:00
Jay Berkenbilt	1d86b70eab	Tweak include of config for ctest	2022-03-01 14:01:34 -05:00
Jay Berkenbilt	99393e6ab7	Shorten coverage case name This is so it will fit on one line after a qtest upgrade allows us to split lines.	2022-02-26 10:18:23 -05:00
Jay Berkenbilt	f7ac591590	Recognize explicit UTF-8 strings (fixes #654 )	2022-02-22 08:10:05 -05:00
Jay Berkenbilt	31b45b0fd4	Fix logic error with Tf when generating appearances (fixes #655 )	2022-02-18 13:46:35 -05:00
Jay Berkenbilt	3e2109ab37	Remove special case for 0xad for 10.6.2.	2022-02-16 06:52:05 -05:00
Jay Berkenbilt	e810fe678a	Fix asymmetry between newUnicodeString and getUTF8Value	2022-02-15 19:22:35 -05:00
Jay Berkenbilt	a478cbb6dc	Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649 ) The PDF spec only allows UTF-16BE, but most readers seem to accept UTF-16LE as well, so now qpdf does too.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	fbd3e56da7	Ignore -- at the top level arg parser (fixes #652 ) This was unintended behavior that was added back for backward compatibility. It is intentionally undocumented.	2022-02-15 16:13:12 -05:00
Jay Berkenbilt	19608ec151	Add missing spaces in usageExit	2022-02-15 16:11:33 -05:00
Jay Berkenbilt	956a272d62	Remove abs calls and pick correct floating point epsilon values (fixes #641 )	2022-02-11 07:18:33 -05:00
Jay Berkenbilt	d501e1c0d4	Only update output version from files used as input If we're opening a PDF file to copy its encryption information or attachments, its version doesn't need to influence the output version.	2022-02-08 13:49:22 -05:00
Jay Berkenbilt	f91b21c7d4	Preserve input PDF version on pages/split-pages (fixes #610 )	2022-02-08 12:34:14 -05:00
Jay Berkenbilt	cfd5147d92	Add QPDF::getVersionAsPDFVersion	2022-02-08 12:34:14 -05:00
Jay Berkenbilt	cb769c62e5	WHITESPACE ONLY -- expand tabs in source code This commit expands all tabs using an 8-character tab-width. You should ignore this commit when using git blame or use git blame -w. In the early days, I used to use tabs where possible for indentation, since emacs did this automatically. In recent years, I have switched to only using spaces, which means qpdf source code has been a mixture of spaces and tabs. I have avoided cleaning this up because of not wanting gratuitous whitespace changes to cloud the output of git blame, but I changed my mind after discussing with users who view qpdf source code in editors/IDEs that have other tab widths by default and in light of the fact that I am planning to start applying automatic code formatting soon.	2022-02-08 11:51:15 -05:00
Jay Berkenbilt	c62e8e2b28	Update for clean compile with POINTERHOLDER_TRANSITION=2	2022-02-07 17:38:22 -05:00
m-holger	5901fcad4c	C-API expose QPDFObjectHandle::getKeyIfDict	2022-02-06 11:21:15 -05:00
m-holger	8371060340	Add method QPDFObjectHandle::getKeyIfDict	2022-02-06 11:21:15 -05:00
m-holger	2ed5f49a79	C-API expose QPDFObjectHandle::getValueAs... accessors	2022-02-05 19:40:30 -05:00
Jay Berkenbilt	7fb22740e1	Add operator ""_qpdf for creating QPDFObjectHandle literals	2022-02-05 11:29:25 -05:00
Jay Berkenbilt	b48a0ff0e8	Add qpdf_empty_pdf to C API	2022-02-05 11:29:25 -05:00
m-holger	e58b1174c7	Add new QPDFObjectHandle::getValueAs... accessors	2022-02-05 11:24:35 -05:00
Jay Berkenbilt	9044a24097	PointerHolder: deprecate getPointer() and getRefcount() Use get() and use_count() instead. Add #define NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these only. This commit also removes all deprecated PointerHolder API calls from qpdf's code except in PointerHolder's test suite, which must continue to test the deprecated APIs.	2022-02-04 13:12:37 -05:00
m-holger	95e7d36b7a	C-API add two binary UTF8 functions add qpdf_oh_new_binary_unicode_string and qpdf_oh_get_binary_utf8_value	2022-02-04 13:10:51 -05:00
m-holger	1925ffd467	Fix --check-linearization of non-linearized files (fixes #615 )	2022-02-04 06:52:38 -05:00
Jay Berkenbilt	42bff9f458	QPDFJob: let initializeFromArgv just take argv, not argc Let argv be a null-terminated array. There is already code that assumes this, and it makes it easier to construct the arguments.	2022-02-01 13:50:58 -05:00
Jay Berkenbilt	b02d37bc0a	Make QPDFArgParser accept const argv This makes it much more convenient to use the initializeFromArgv functions since you can use string literals.	2022-02-01 13:50:58 -05:00
Jay Berkenbilt	bc4e2320e7	Add qpdfjob-c.h -- simple C API around parts of QPDFJob	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	03e67a28fe	Move QTC::TC for qpdf to QPDFJob All the coverage cases that used to be in qpdf.cc are now in QPDFJob*.cc. It doesn't really matter, but better to follow the convention of starting with the class that includes the coverage call.	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	b42f3e1d15	Move more code from qpdf.cc into QPDFJob	2022-02-01 09:04:55 -05:00
Jay Berkenbilt	21b9290785	QPDFJob json: make bare arguments expect the empty string Changing from bool requiring true to string requiring the empty string is more consistent with the CLI and makes it possible to add an optional parameter or choices later without breaking compatibility.	2022-01-31 18:16:09 -05:00
Jay Berkenbilt	ea96330bb6	QPDFJob json: flatten json structure Flatten everything to make it easier to map command-line flags to json. The old structure was an illusion anyway because there was no mechanism to enforce that things were in the right place. This also helps with future flexibility.	2022-01-31 18:16:09 -05:00
Jay Berkenbilt	47f33cec25	QPDFJob: add test cases	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	e3506253f1	Add optional version to --json	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	caa00556cf	Change filename or path to file in json and QPDFJob Use "file" consistently for specifying a file path. We use "filename" when adding attachments for a completely different purpose.	2022-01-31 15:57:45 -05:00
Jay Berkenbilt	7097f29019	More editorial changes from m-holger + spell check	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0e909bab8e	Improve top-level help information	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0364024781	Use QPDFUsage exception for cli, json, and QPDFJob errors	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	95d127641c	QPDFJob: move more top-level trivial handlers into config	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	9373881cca	Add QPDFJob::ConfigError exception	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9cd693a5b	QPDFJob: allocate QPDFArgParser on stack The previous commits have removed all references to memory from QPDFArgParser from QPDFJob. This commit removes the constraint that QPDFArgParser remain in scope. This is a prerequisite to allowing JSON as an alternative way to initialize QPDFJob and to initialize it directly using a public API.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	b9af421ef7	Add missing \f support for JSON string encoder	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	5c5e5ca29b	Document how to add a command-line argument	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	a301cc5373	Minor code cleanup	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	bd89aac360	QPDFJob increment: move arg parsing into QPDFJob Move ArgParser from qpdf.cc into QPDFJob.cc. It still works with millions of public member variables, but now qpdf.cc is minimal and just calls stable library functions.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	23b64f8357	Remove qpdf.cc version check Remove comparison of qpdf CLI version with library. With almost all the functionality moving into the library, this check is no longer meaningful.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	1ddf5b4b4b	QPDFJob increment: get rid of exit, handle verbose Remove all calls to exit() from QPDFJob. Handle code that runs in verbose mode to enable it to make use of output streams and message prefix (whoami) from QPDFJob. This removes temporarily duplicated exit code logic and most access to whoami/std::cout outside of QPDFJob proper.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	0910e767ad	QPDFJob increment: basic QPDFJob structure Move most of the methods called from qpdf.cc after argument parsing into QPDFJob. In this increment, enough QPDFJob API has been added to handle the branch of QPDFJob::run() that creates output with an appropriate division between qpdf.cc and QPDFJob. There are temporary bits of code to enable everything to compile and pass the test suite, including some duplication and hard-coded values.	2022-01-30 13:11:03 -05:00
Jay Berkenbilt	8c718b7e6f	Prefix program name before exception message in qpdf CLI	2022-01-30 13:11:02 -05:00
Jay Berkenbilt	c60b4ea55a	Refactor arg parsing in qpdf.cc to use QPDFArgParser	2022-01-30 13:11:02 -05:00
Jay Berkenbilt	52817f0a45	Implement QPDFArgParser based on ArgParser from qpdf.cc	2022-01-30 13:11:02 -05:00
m-holger	8eca9d8fd9	Fix QPDFObjectHandle::isOrHasName Ensure isOrHasName returns true if object is an array and the name is present anywhere in the array.	2022-01-27 09:35:39 -06:00
m-holger	710d2e54f0	Allow testing for subtype without specifying type in isDictionaryOfType etc Accept empty string as type parameter in QPDFObjectHandle::isDictionaryOfType and isStreamOfType to allow for dictionaries with optional type.	2022-01-27 07:31:12 -06:00
m-holger	1b1b471ca9	Make a few whitespace fixes from last commit Commit by ejb@ql.org using m-holger as author so git annotate gives proper credit for changes.	2022-01-22 09:14:53 -05:00
m-holger	8593b9fdf7	Add new convenience methods QPDFObjectHandle::isNameAndEquals, etc Add methods isNameAndEquals, isDictionaryOfType, isStreamOfType	2022-01-22 08:10:28 -06:00
Jay Berkenbilt	370710657a	Add missing characters from PDF doc encoding (fixes #606 )	2022-01-11 15:55:19 -05:00
Jay Berkenbilt	0f1ffa1215	Move bash/zsh completion helpers to libtests/arg_parser	2022-01-05 18:13:25 -05:00
Jay Berkenbilt	4782b5904f	Move filter-completion.pl to libtests/arg_parser	2022-01-05 18:13:25 -05:00
Jay Berkenbilt	af91b5b584	Add QUtil::file_can_be_opened	2021-12-29 13:41:02 -05:00
Jay Berkenbilt	ac0060ac38	Refactor arg parsing to allow help option with parameter	2021-12-29 13:35:05 -05:00
Jay Berkenbilt	04745320d6	Prepare 10.5.0 release	2021-12-20 14:51:46 -05:00
Jay Berkenbilt	d866f48081	Change names of qpdf_object_type_e enumerations They have to be ot_* rather than qpdf_ot_* for compatibility. * Different enumerated types are not assignment-compatible in C++, at least with strict compiler settings * While you can do `constexpr ot_xyz = ::qpdf_ot_xyz` in QPDFObject.hh to make QPDFObject::ot_xyz work, QPDFObject::object_type_e::ot_xyz will only work if the enumerated type names are the same.	2021-12-20 14:51:45 -05:00
Jay Berkenbilt	cf7b2b5700	test_driver: split runtest into separate functions Too bad about git annotate but it was pretty crazy to have all those test cases together like that.	2021-12-20 12:40:03 -05:00
Jay Berkenbilt	ea73bf72e0	Further improvements to handling binary strings	2021-12-19 14:30:45 -05:00
Jay Berkenbilt	d3501c4f3e	Fix LGTM alerts	2021-12-18 16:25:53 -05:00
Jay Berkenbilt	ddbe59179e	C API: simplify new error handling and improve documentation	2021-12-17 15:59:47 -05:00
m-holger	f6293bd94c	C-API expose QPDFObjectHandle::getTypeCode and getTypeName (fixes #597 )	2021-12-17 14:24:43 -05:00
Jay Berkenbilt	feafcc4e88	C API: add several stream functions (fixes #596 )	2021-12-17 13:28:11 -05:00
Jay Berkenbilt	4024953682	Output C test n done at the end of each qpdf-ctest	2021-12-16 15:40:56 -05:00
Jay Berkenbilt	9bb6f570ec	C API: add functions for working with pages (fixes #594 )	2021-12-16 15:07:48 -05:00
Jay Berkenbilt	f072be032f	qpdf-ctest: outfile2 -> xarg	2021-12-16 11:51:16 -05:00
Jay Berkenbilt	08bcf6449c	Clarify docs around @filename and leading/trailing space	2021-12-10 15:52:28 -05:00
Jay Berkenbilt	af2a71aa2c	Handle bitstream overflow errors more gracefully (fixes #581 ) * Make it a runtime error, not a logic error * Include additional information * Capture it properly in checkLinearization	2021-12-10 15:37:35 -05:00
Jay Berkenbilt	1c62c2a342	C API: expose functions for indirect objects (fixes #588 )	2021-12-10 14:57:35 -05:00
Jay Berkenbilt	8e0b153332	Expose QPDFObjectHandle::addTokenFilter (fixes #580 )	2021-12-10 13:37:07 -05:00
Jay Berkenbilt	72c10d8617	C API: overhaul error handling * Handle error conditions that occur when using the object handle interfaces. In the past, some exceptions were not correctly converted to errors or warnings. * Add more detailed information to qpdf-c.h * Make it possible to work more explicitly with uninitialized objects	2021-12-10 12:16:02 -05:00
Jay Berkenbilt	3340dbe976	Use a specific error code for type warnings and clarify docs	2021-12-10 11:15:49 -05:00
Jay Berkenbilt	b2b2a175c4	Add missing unit test for register progress reporter in C API It was exercised in the pdf-linearize example but not in qpdf-ctest.	2021-12-10 09:11:56 -05:00
Jay Berkenbilt	09f3737202	Split qpdf-ctest test 24 into multiple tests Thanks for the nudge from m-holger!	2021-12-09 15:21:19 -05:00
Jay Berkenbilt	e3cc171d02	C API: qpdf_oh_is_initialized	2021-12-09 10:33:31 -05:00
Jay Berkenbilt	bef2c2222a	C API: qpdf_get_last_string_length	2021-12-09 10:33:31 -05:00
m-holger	0c705a882b	Minor documentation updates	2021-12-09 10:24:14 -05:00
m-holger	b4fc9eb700	C-API expose new_object as qpdf_oh_new_object	2021-12-02 13:59:58 -05:00
Jay Berkenbilt	720ce9e8f3	Improve testing and error handling around operating before processing	2021-11-29 07:42:36 -05:00
Jay Berkenbilt	b97a43e091	Add additional testing around improved array wrapping	2021-11-19 13:33:10 -05:00
m-holger	4630b8567c	Ensure qpdf_oh handles returned by C-API functions are unique. Return new qpdf_oh from qpdf_oh_wrap_in_array when input is already an array. Update some doc comments in qpdf-c.h.	2021-11-19 13:31:59 +00:00
Jay Berkenbilt	ce7db05d22	Prepare 10.4.0 release	2021-11-16 15:44:09 -05:00
Jay Berkenbilt	750aca5b94	First increment of improving handling of weak crypto (fixes #358 )	2021-11-11 12:24:15 -05:00
Jay Berkenbilt	f45dacf4cb	Make recovery logic flexible about where objects end (fixes #573 ) Don't assume endobj is at the beginning of the line. This means we are looking at tokens for every line, but the odds of n n obj appearing in the middle of the object are likely much lower than endobj not being at the beginning of the line or missing entirely. This will probably have a negative impact on recovery time for very large files. Hopefully it will be worth it.	2021-11-07 15:27:22 -05:00
Jay Berkenbilt	4a648b9a00	Fix bug in merging resources /DR from foreign AcroForm (fixes #548 ) When making resources indirect in from_dr, the code was using the wrong owning QPDF, forgetting that from_dr had already been copied using CopyForeignObject.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	9b28933647	Check object ownership when adding When adding a QPDFObjectHandle to an array or dictionary, if possible, check if the new object belongs to the same QPDF. This makes it much easier to find incorrect code than waiting for the situation to be detected when the file is written.	2021-11-04 12:29:42 -04:00
Jay Berkenbilt	73752683c9	Fix overlay/underlay on page with no resources (fixes #527 )	2021-11-03 16:00:05 -04:00
Jay Berkenbilt	33a47d5c3c	Make QPDF::findPage public (fixes #516 ) This was originally not public because I wanted to get rid of the pages cache, but I recently realized there were deep reasons not to do that, and the author of pikepdf wanted this, so I decided to make it public.	2021-11-03 09:43:17 -04:00
Jay Berkenbilt	532a4f3d60	Detect recoverable but invalid zlib data streams (fixes #562 )	2021-11-03 09:43:17 -04:00
Jay Berkenbilt	7ed991343b	Better diagnostics when --pages is not closed (fixes #555 )	2021-11-02 16:22:37 -04:00
Fredrik Fornwall	e0775238b8	Fix QPDFEFStreamObjectHelper::{get,set}Subtype The /Subtype entry that specifies the mime type of an embedded file is inside the embedded file stream dictionary directly, not in the parameter dictionary. See Table 45 and 46 in the PDF 1.7 specification: https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf#page=112	2021-09-10 10:02:24 -04:00
Jay Berkenbilt	df38fe8e48	Fix string bounds checking in completion code (fixes #441 )	2021-05-13 13:06:58 -04:00
Jay Berkenbilt	bddebdb0ea	Prepare 10.3.2 release	2021-05-08 10:41:14 -04:00
Jay Berkenbilt	30ac51bc78	Exclude unreferenced objects in object streams (fixes #520 )	2021-05-08 09:42:09 -04:00
Jay Berkenbilt	8971443e46	QPDF::addPage*: handle duplicate pages more robustly	2021-04-05 10:58:10 -04:00
Jay Berkenbilt	ec48820c3c	Fix loop detection in NNTree	2021-04-05 07:59:02 -04:00
Jay Berkenbilt	3f05429cc5	Prepare 10.3.1 release	2021-03-11 12:59:41 -05:00
Jay Berkenbilt	972e08af58	Protect against future bugs in fixCopiedAnnotations I don't want additional, undiscovered bugs to fully block page splitting/merging operations.	2021-03-11 12:49:27 -05:00
Jay Berkenbilt	85884c363c	Allow /DR to be direct in /AcroForm Also handle direct annotation, though this is much less likely.	2021-03-11 11:43:38 -05:00
Jay Berkenbilt	dc65b88457	Prepare 10.3.0 release	2021-03-05 06:15:48 -05:00
Jay Berkenbilt	addc0672d1	Tweak form copying to avoid gratuitous field renames When copying a page from the original file to the output in --pages, don't alter the fields or annotations for the first copy of each page.	2021-03-05 05:31:15 -05:00
Jay Berkenbilt	cb6e53136f	QPDFAcroFormDocumentHelper: add missing analyze calls	2021-03-04 18:11:44 -05:00
Jay Berkenbilt	f68e25c7f2	Don't use handleWarning, which is being reverted	2021-03-04 15:59:45 -05:00
Jay Berkenbilt	9fb174b9e9	Major rework of handling form fields when copying pages (fixes #509 )	2021-03-04 15:08:37 -05:00
Jay Berkenbilt	887f35efaa	When resolving font from /DR, copy it into resources	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	d7ffdfa994	Add optional conflict detection to mergeResources Also improve behavior around direct vs. indirect resources.	2021-03-04 15:08:36 -05:00
Jay Berkenbilt	e17585c2d2	Remove unreferenced: ignore names that are not Fonts or XObjects Converted ResourceFinder to ParserCallbacks so we can better detect the name that precedes various operators and use the operators to sort the names into resource types. This enables us to be smarter about detecting unreferenced resources in pages and also sets the stage for reconciling differences in /DR across documents.	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	b444ab3352	Fix typos in coverage cases	2021-03-03 17:05:49 -05:00
Jay Berkenbilt	fa2516df71	Fix behavior for finding /Q, /DA, and /DR for form fields If not found in the field hierarchy, /Q and /DA are supposed to be looked up in the document-level form dictionary. /DR is supposed to only come from the document dictionary.	2021-03-03 17:05:19 -05:00
Jay Berkenbilt	a4d6589ff2	Have QPDFObjectHandle notice when replaceObject was called This results in a performance penalty of 1% to 2% when replaceObject and swapObjects are never called and a somewhat larger penalty if they are called, but it's worth it to avoid very confusing behavior as discussed in depth in qpdf#507.	2021-02-25 07:32:46 -05:00
Jay Berkenbilt	b5e937397c	Prepare 10.2.0 release	2021-02-23 10:41:58 -05:00
Jay Berkenbilt	1886673d7e	Spell check	2021-02-23 10:38:05 -05:00
Jay Berkenbilt	9e00be7ffa	Remove warning that gives false positives in some normal cases	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	039eb4a253	Fix input file = output file test for split pages	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	be3a8c0e7a	Keep only referenced form fields in --pages	2021-02-23 08:26:21 -05:00
Jay Berkenbilt	50037fb33d	Fix test case to not leave stray files behind	2021-02-22 19:51:36 -05:00
Jay Berkenbilt	83216e640c	Preserve form fields when splitting pages (fixes #340 )	2021-02-22 18:42:06 -05:00
Jay Berkenbilt	8e8c0d8290	Add new placeFormXObject that takes a matrix reference	2021-02-22 18:42:06 -05:00
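
The test-data scripts quoted in commits 948de60990 and a9fbbd5dca appear flattened onto a single line in this listing, so their indentation is lost. The following is a sketch of what the reordering step appears to do, reconstructed from the flattened text: keys of the form "N G R" are ordered by their numeric object id and "trailer" is placed last. The function name `sort_objects` and its `key` parameter are assumptions added here for illustration; the original scripts loop over `sys.argv[1:]` and print the result instead. The sort is also restricted to the (id, key) pair so that values are never compared, which is a small deviation from the quoted script.

```python
import json
import re


def json_dumps(data):
    # Same serialization settings as the quoted scripts.
    return json.dumps(data, ensure_ascii=False, indent=2, separators=(",", ": "))


def sort_objects(data, key="objects"):
    """Reorder data[key] by numeric object id, keeping 'trailer' last.

    Hypothetical helper: the name and signature are not from qpdf.
    Keys that are neither 'trailer' nor of the form 'N G R' are dropped,
    as in the quoted script.
    """
    if key not in data:
        return data
    trailer = None
    to_sort = []
    for k, v in data[key].items():
        if k == "trailer":
            trailer = v
        else:
            m = re.match(r"^(\d+) \d+ R", k)
            if m:
                to_sort.append((int(m.group(1)), k, v))
    # Sort on (id, key) only; dicts keep insertion order in Python 3.7+.
    ordered = sorted(to_sort, key=lambda item: (item[0], item[1]))
    new_objects = {k: v for _, k, v in ordered}
    if trailer is not None:
        new_objects["trailer"] = trailer
    data[key] = new_objects
    return data
```

Passing `key="objectinfo"` would cover the variant described in commit a9fbbd5dca, assuming the two scripts differ only in the key they rewrite, as the flattened text suggests.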


950 Commits