octoleo/qpdf - qpdf - Vast Development Method

mirror of https://github.com/qpdf/qpdf.git synced 2024-09-27 12:39:06 +00:00

Author	SHA1	Message	Date
Jay Berkenbilt	d97474868d	Lexer enhancements: EOF, comment, space Significant enhancements to the lexer to improve EOF handling and to support comments and spaces as tokens. Various other minor issues were fixed as well.	2018-02-18 20:18:40 -05:00
Jay Berkenbilt	13d9756a45	Minor fixes to tokenizer	2018-01-28 18:34:43 -05:00
Jay Berkenbilt	ec0087e3ce	Support TIFF Predictor (fixes #171)	2018-01-13 19:49:42 -05:00
Jay Berkenbilt	eaacf94005	Update C API with new QPDFWriter methods	2017-09-12 14:30:39 -04:00
Jay Berkenbilt	fabff0f3ec	Limit token length during xref recovery While scanning the file looking for objects, limit the length of tokens we allow. This prevents us from getting caught up in reading a file character by character while digging through large streams.	2017-08-22 14:13:10 -04:00
Jay Berkenbilt	ddc6cf0cf6	Precheck streams by default There is no need for a --precheck-streams option. We can do the precheck without imposing any penalty, only re-encoding the stream if it fails the first time.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	9744414c66	Enable finer grained control of stream decoding This commit adds several API methods that enable control over which types of filters QPDF will attempt to decode. It also adds support for /RunLengthDecode and /DCTDecode filters for both encoding and decoding.	2017-08-21 17:44:22 -04:00
Jay Berkenbilt	cfa2eb97fb	Add page rotation (fixes #132)	2017-08-12 22:57:38 -04:00
Jay Berkenbilt	df33c368b4	Change --single-pages to --split-pages This is in preparation for implementing page groups.	2017-08-12 11:49:04 -04:00
Jay Berkenbilt	8249a26d69	Fix infinite loop in QPDFWriter (fixes #143)	2017-08-12 08:36:36 -04:00
Jay Berkenbilt	8fe0b06cd8	Pad encryption parameters that are too short (fixes #96)	2017-08-11 19:53:56 -04:00
Jay Berkenbilt	30f109e244	Read xref table without PCRE Also accept more errors than before.	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	90840be594	Find lindict without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	03aa9679ac	Find startxref without PCRE	2017-08-10 21:30:32 -04:00
Jay Berkenbilt	49825e5cb6	Add --split-pages option (fixes #30)	2017-08-05 10:22:33 -04:00
Jay Berkenbilt	2d5b854468	Allow reading command-line args from files (fixes #16)	2017-07-29 22:23:21 -04:00
Jay Berkenbilt	5993c3e83c	Detect input file = output file (fixes #29)	2017-07-29 20:58:01 -04:00
Jay Berkenbilt	07d6f770b2	Better recovery of bad stream start (fixes #104)	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	b389268f16	Better handle split content streams (fixes #73) When parsing content streams, allow content to be split arbitrarily across stream boundaries.	2017-07-29 12:19:04 -04:00
Jay Berkenbilt	3a1ff5ded9	Add option to preserve unreferenced objects	2017-07-28 19:19:11 -04:00
Jay Berkenbilt	7f8892525f	Add precheck streams capability When requested, QPDFWriter will do more aggressive prechecking of streams to make sure it can actually succeed in decoding them before attempting to do so. This will allow preservation of raw data even when the raw data is corrupted relative to the specified filters.	2017-07-27 23:42:27 -04:00
Jay Berkenbilt	428d96dfe1	Convert many more errors to warnings	2017-07-27 22:57:55 -04:00
Jay Berkenbilt	a4fd4b91c6	Convert stream filtering errors to warnings	2017-07-27 18:43:07 -04:00
Jay Berkenbilt	40f00122b8	Convert object parsing errors to warnings QPDFObjectHandle::parseInternal now issues warnings instead of throwing exceptions for all error conditions that it finds (except internal logic errors) and has stronger recovery for things like invalid tokens and malformed dictionaries. This should improve qpdf's ability to recover from a wide range of broken files that currently cause it to fail.	2017-07-27 18:20:31 -04:00
Jay Berkenbilt	701b518d5c	Detect recursion loops resolving objects (fixes #51) During parsing of an object, sometimes parts of the object have to be resolved. An example is stream lengths. If such an object directly or indirectly points to the object being parsed, it can cause an infinite loop. Guard against all cases of re-entrant resolution of objects.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	afe0242b26	Handle object ID 0 (fixes #99) This is CVE-2017-9208. The QPDF library uses object ID 0 internally as a sentinel to represent a direct object, but prior to this fix, was not blocking handling of 0 0 obj or 0 0 R as a special case. Creating an object in the file with 0 0 obj could cause various infinite loops. The PDF spec doesn't allow for object 0. Having qpdf handle object 0 might be a better fix, but changing all the places in the code that assumes objid == 0 means direct would be risky.	2017-07-26 06:24:07 -04:00
Jay Berkenbilt	b8bdef0ad1	Implement deterministic ID For non-encrypted files, deterministic ID generation uses file contents instead of timestamp and file name. At a small runtime cost, this enables generation of the same /ID if the same inputs are converted in the same way multiple times.	2015-10-31 18:56:42 -04:00
Jay Berkenbilt	c9a9fe9c2f	Avoid traversing same object twice when copying objects This is a performance fix. The output is unchanged. Fixes #28.	2013-12-26 11:51:50 -05:00
Jay Berkenbilt	91367239fd	Add --show-npages option to qpdf	2013-07-07 19:43:16 -04:00
Jay Berkenbilt	adccedc02f	Allow numeric range to be omitted qpdf --pages Detect a missing page range and assume 1-z.	2013-07-07 19:43:16 -04:00
Jay Berkenbilt	a85007cb0d	Handle more broken files Space rather than newline after xref, missing /ID in trailer for encrypted file. This enables qpdf to handle some files that xpdf can handle. Adobe reader can't necessarily handle them.	2013-06-15 12:40:01 -04:00
Jay Berkenbilt	16051788ed	Handle /Outlines dictionary being a direct object Even though this case is not valid according to the spec, it has been seen, and caused an internal error.	2013-06-14 21:36:04 -04:00
Jay Berkenbilt	a3576a7359	Bug fix: handle generation > 0 when generating object streams Rework QPDFWriter to always track old object IDs and QPDFObjGen instead of int, thus not discarding the generation number. Switch to QPDF::getCompressibleObjGen() to properly handle the case of an old object eligible for compression that has a generation of other than zero.	2013-06-14 14:58:09 -04:00
Jay Berkenbilt	6c7bf114dc	Bug fix: properly handle overridden compressed objects When caching objects in an object stream, only cache objects that still resolve to that stream. See Changelog mod from this commit for details.	2013-02-23 17:51:17 -05:00
Jay Berkenbilt	f81152311e	Add QPDFObjectHandle::parseContentStream method This method allows parsing of the PDF objects in a content stream or array of content streams.	2013-01-20 15:35:39 -05:00
Jay Berkenbilt	f8306913ba	Update "C" API with functions for new features	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	9a23c3dcb6	Remove /Crypt from stream filters unconditionally When writing a new stream, always remove /Crypt even if we are not otherwise able to filter the stream.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	4237a29c94	Refactor Dictionary writing code Original code was written before we could shallow copy objects, so all the filtering was done by suppressing the output of certain keys and replacing them with other keys. Now we can simplify the code greatly by modifying shallow copies of dictionaries in place.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	e57c25814e	Support for encryption with /V=5 and /R=5 and /R=6 Read and write support is implemented for /V=5 with /R=5 as well as /R=6. /R=5 is the deprecated encryption method used by Acrobat IX. /R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	93ac1695a4	Support files with only attachments encrypted Test cases added in a future commit since they depend on /R=6 support.	2012-12-31 10:32:32 -05:00
Jay Berkenbilt	16a23368e7	Fix infinite loop trimming passwords with ( in them	2012-12-31 10:32:31 -05:00
Jay Berkenbilt	774584163f	Add ExtensionLevel support to version handling All version operations are now fully aware of extension levels.	2012-12-31 05:36:50 -05:00
Jay Berkenbilt	04c203ae06	Eliminate flattenScalarReferences	2012-12-31 05:36:48 -05:00
Jay Berkenbilt	7f84239cad	Find PDF header anywhere in the first 1024 bytes	2012-12-25 14:43:37 -05:00
Jay Berkenbilt	f256670eba	Ignore objects with offset 0	2012-11-20 13:57:37 -05:00
Jay Berkenbilt	c1627d0438	Add QPDFWriter::setExtraHeaderText	2012-09-06 15:31:12 -04:00
Jay Berkenbilt	29e9c34fe3	Bug fix: let EOF resolve literal token Previously only whitespace and comments did it. This fix is needed for object streams whose last object is a literal (name, integer, real, string) not terminated by space or newline.	2012-08-11 09:29:04 -04:00
Jay Berkenbilt	bde98044f4	Improve password handling Use --encryption-file-password, if given, in addition to --password as a source for passwords for files specified in --pages.	2012-07-29 13:22:37 -04:00
Jay Berkenbilt	6bbea4baa0	Implement QPDFObjectHandle::parse Move object parsing code from QPDF to QPDFObjectHandle and parameterize the parts of it that are specific to a QPDF object. Provide a version that can't handle indirect objects and that can be called on an arbitrary string. A side effect of this change is that the offset used when reporting invalid stream length has changed, but since the new value seems like a better value than the old one, the test suite has been updated rather than making the code backward compatible. This only affects the offset reported for invalid streams that lack /Length or have an invalid /Length key. Updated some test code and examples to use QPDFObjectHandle::parse. Supporting changes include adding a BufferInputSource constructor that takes a string.	2012-07-21 09:06:10 -04:00
Jay Berkenbilt	db95960ac1	Bug fix: preserve AES when copying encryption parameters	2012-07-15 19:07:59 -04:00
Jay Berkenbilt	1c944e4c89	Have QPDFWriter detect foreign objects while writing Throw an exception that directs the user to QPDF::copyForeignObject.	2012-07-14 08:07:23 -04:00
Jay Berkenbilt	e7b8f297ba	Support copying objects from another QPDF object This includes QPDF::copyForeignObject and supporting foreign objects as arguments to addPage*.	2012-07-11 15:54:33 -04:00
Jay Berkenbilt	8a217eb3a2	Add concept of reserved objects QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved provide a mechanism to add objects to a PDF file when there are circular references. This is a prerequisite to copying objects from one PDF to another.	2012-07-10 23:34:32 -04:00
Jay Berkenbilt	e2dedde4bd	Don't require stream data provider to know length in advance Breaking API change: length parameter has disappeared from the StreamDataProvider version of QPDFObjectHandle::replaceStreamData since it is no longer necessary to compute it in advance. This breaking change is justified by the fact that removing the length parameter provides the caller an opportunity to simplify the calling code.	2012-07-07 17:33:45 -04:00
Tobias Hoffmann	abb53ac369	Limited inheritance to the attributes explicitly listed in the PDF spec Previous versions of qpdf incorrectly passed arbitrary objects from /Pages objects down to individual pages in direct contradiction with the PDF specification. These are now left in /Pages. When intermediate /Pages nodes are being discarded as when the /Pages tree is being flattened, a warning is issued when unknown keys are encountered.	2012-07-04 23:04:55 -04:00
Jay Berkenbilt	5f59c32f87	Add a few minor enhancements to recent work Test coverage case for new newStream method Expose decimal_places argument for double-based newReal All enhancements suggested by Tobias.	2012-06-27 10:43:27 -04:00
Jay Berkenbilt	d1ebe30ff6	Add QPDFObjectHandle::shallowCopy()	2012-06-21 16:15:09 -04:00
Jay Berkenbilt	3844aedd93	Add testing for page APIs	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	eb802cfa8c	Implement page manipulation APIs	2012-06-21 15:01:02 -04:00
Jay Berkenbilt	bc1c4bb578	Add QPDF::processFile that takes an open FILE*	2012-06-21 08:00:35 -04:00
Jay Berkenbilt	76b1659177	enhance PointerHolder so that it can explicitly be told to use delete [] instead of delete, thus making it useful to run valgrind over qpdf during its test suite	2011-08-11 11:57:37 -04:00
Jay Berkenbilt	14fe2e6de3	qpdf_set_info_key, qpdf_get_info_key	2011-08-11 10:48:37 -04:00
Jay Berkenbilt	a42a4068b5	preserve /EncryptMetadata when copying encryption parameters	2011-08-10 19:47:18 -04:00
Jay Berkenbilt	7dc197ef88	implement replace and swap	2011-08-10 12:42:48 -04:00
Jay Berkenbilt	aeb892f99b	accept stream keyword with CR only git-svn-id: svn+q:///qpdf/trunk@1052 71b93d88-0707-0410-a8cf-f5a4172ac649	2011-04-30 21:46:09 +00:00
Jay Berkenbilt	6405d3928f	be less conservative when skipping over inline images in content normalization git-svn-id: svn+q:///qpdf/trunk@1050 71b93d88-0707-0410-a8cf-f5a4172ac649	2011-04-30 18:20:35 +00:00
Jay Berkenbilt	b36f62a326	add qpdf_read_memory to C API git-svn-id: svn+q:///qpdf/trunk@1044 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-10-04 15:24:10 +00:00
Jay Berkenbilt	b1e0dcff16	handle stream filter abbreviations from table H.1 git-svn-id: svn+q:///qpdf/trunk@1025 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-09-05 15:00:44 +00:00
Jay Berkenbilt	bd7261da9b	getRawStreamData() git-svn-id: svn+q:///qpdf/trunk@1010 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-08-09 23:33:40 +00:00
Jay Berkenbilt	2dbc1006fb	addPageContents git-svn-id: svn+q:///qpdf/trunk@995 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-08-05 21:06:49 +00:00
Jay Berkenbilt	6f2bd7eb3a	newStream git-svn-id: svn+q:///qpdf/trunk@991 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-08-05 20:20:52 +00:00
Jay Berkenbilt	11df7809af	add pipeline-based stream data replacement function git-svn-id: svn+q:///qpdf/trunk@990 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-08-05 19:04:22 +00:00
Jay Berkenbilt	998a6cbee9	remove stream_data_handler; it wouldn't work as designed. replacement data implemented but not tested git-svn-id: svn+q:///qpdf/trunk@988 71b93d88-0707-0410-a8cf-f5a4172ac649	2010-08-02 22:40:52 +00:00
Jay Berkenbilt	a80d9d176d	add C interface for getting software version git-svn-id: svn+q:///qpdf/trunk@903 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-24 13:23:20 +00:00
Jay Berkenbilt	7f5d78c2d1	improve C error handling interface git-svn-id: svn+q:///qpdf/trunk@884 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-23 15:27:30 +00:00
Jay Berkenbilt	398354b6f0	update C API for error retrieval git-svn-id: svn+q:///qpdf/trunk@830 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-20 00:24:44 +00:00
Jay Berkenbilt	3f8c4c2736	categorize all error messages and include object information if available git-svn-id: svn+q:///qpdf/trunk@829 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-19 23:09:19 +00:00
Jay Berkenbilt	734ac1e1d2	deal with stream-specific crypt filters git-svn-id: svn+q:///qpdf/trunk@827 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-19 01:58:31 +00:00
Jay Berkenbilt	a8715c495b	add C API for R4 encryption git-svn-id: svn+q:///qpdf/trunk@825 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-19 00:36:51 +00:00
Jay Berkenbilt	09175e4578	more testing, bug fix for linearized aes encrypted files git-svn-id: svn+q:///qpdf/trunk@824 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-19 00:17:11 +00:00
Jay Berkenbilt	94131116a9	more notes, testing of cleartext metadata, some crypt filter fixes git-svn-id: svn+q:///qpdf/trunk@823 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-18 19:54:24 +00:00
Jay Berkenbilt	4ccc9330a8	only seed random number generator once for aes-cbc, try to avoid compressing Metadata streams git-svn-id: svn+q:///qpdf/trunk@818 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-18 14:09:10 +00:00
Jay Berkenbilt	e25910b59a	reading crypt filters is largely implemented but not fully tested git-svn-id: svn+q:///qpdf/trunk@812 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-17 23:37:55 +00:00
Jay Berkenbilt	c2023db265	Implement changes suggested by Zarko and our subsequent conversations: - Add a way to set the minimum PDF version - Add a way to force the PDF version - Have isEncrypted return true if an /Encrypt dictionary exists even when we can't read the file - Allow qpdf_init_write to be called multiple times - Update some comments in headers git-svn-id: svn+q:///qpdf/trunk@748 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-10-05 00:42:48 +00:00
Jay Berkenbilt	8d7bb3ff50	add methods for getting encryption data git-svn-id: svn+q:///qpdf/trunk@733 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-09-27 20:05:38 +00:00
Jay Berkenbilt	fe6771e0e5	add many new tests to exercise C api git-svn-id: svn+q:///qpdf/trunk@727 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-09-27 16:01:45 +00:00
Jay Berkenbilt	84ec83e925	basic implementation of C API git-svn-id: svn+q:///qpdf/trunk@725 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-09-27 14:39:04 +00:00
Jay Berkenbilt	d6f50e98c3	remove extraneous coverage case (another coverage case was in the same block of code) git-svn-id: svn+q:///qpdf/trunk@694 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-09-26 14:39:50 +00:00
Jay Berkenbilt	a1fbb4bd97	update test suite git-svn-id: svn+q:///qpdf/trunk@675 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-05-03 20:03:21 +00:00
Jay Berkenbilt	599daddb47	decode streams on check, always exit abnormally when warnings are detected git-svn-id: svn+q:///qpdf/trunk@660 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-03-08 19:00:19 +00:00
Jay Berkenbilt	35d72c822e	better recovery for appended files with damaged cross-reference tables git-svn-id: svn+q:///qpdf/trunk@648 71b93d88-0707-0410-a8cf-f5a4172ac649	2009-02-21 02:31:08 +00:00
Jay Berkenbilt	337b900708	handle UTF-16BE fully git-svn-id: svn+q:///qpdf/trunk@639 71b93d88-0707-0410-a8cf-f5a4172ac649	2008-11-23 18:49:13 +00:00
Jay Berkenbilt	9a0b88bf77	update release date to actual date git-svn-id: svn+q:///qpdf/trunk@599 71b93d88-0707-0410-a8cf-f5a4172ac649	2008-04-29 12:55:25 +00:00

243 Commits