Commit Graph

92 Commits

Author SHA1 Message Date
Jay Berkenbilt 07c8bb2843 Additionally license under Apache License version 2.0
The Apache License version 2.0 is now the primary license for qpdf.
However, users may, at their option, continue to use Artistic version
2.0.
2017-09-14 12:59:25 -04:00
Jay Berkenbilt d31a7b76e7 Improve message for stream decoding error
Tweak the message so that we inform the user that we are mitigating
data loss.
2017-09-12 16:03:48 -04:00
Jay Berkenbilt e452d9dca6 Spell check 2017-08-22 14:22:20 -04:00
Jay Berkenbilt fabff0f3ec Limit token length during xref recovery
While scanning the file looking for objects, limit the length of
tokens we allow. This prevents us from getting caught up in reading a
file character by character while digging through large streams.
2017-08-22 14:13:10 -04:00
Jay Berkenbilt a8c93bd324 Push QPDF member variables into a nested class
Pushing member variables into a nested class enables addition of new
member variables without breaking binary compatibility.
2017-08-21 21:35:11 -04:00
Jay Berkenbilt 8288a4eb3a Update copyright to 2017 2017-08-21 21:18:47 -04:00
Jay Berkenbilt 30f109e244 Read xref table without PCRE
Also accept more errors than before.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt ca5b1d267a Improve stream length recovery
Eliminate PCRE and find endobj not preceded by endstream. Be more lax
about placement of endstream and endobj.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt 03aa9679ac Find starxref without PCRE 2017-08-10 21:30:32 -04:00
Jay Berkenbilt 1765c6ec20 Find header without PCRE 2017-08-10 21:30:32 -04:00
Jay Berkenbilt 296b679d6e Implement findFirst and findLast in InputSource
Preparing to refactor some pattern searching code to use these instead
of their own memchr loops. This should simplify the code that replaces
PCRE.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt ef8ae5449d Allow QPDFTokenizer::readToken to return bad tokens
Sometimes we want to ignore bad tokens rather than having them throw
an exception. A coverage case is commented out here and added in a
later commit.
2017-08-10 19:01:41 -04:00
Jay Berkenbilt 4647acbe3c Clarify documentation on copyForeignObject (fixes #69)
Be explicit about the need to keep the source QPDF object around.
2017-07-29 12:19:04 -04:00
Jay Berkenbilt 3a1ff5ded9 Add option to preserve unreferenced objects 2017-07-28 19:19:11 -04:00
Jay Berkenbilt 7f8892525f Add precheck streams capability
When requested, QPDFWriter will do more aggress prechecking of streams
to make sure it can actually succeed in decoding them before
attempting to do so. This will allow preservation of raw data even
when the raw data is corrupted relative to the specified filters.
2017-07-27 23:42:27 -04:00
Jay Berkenbilt a4fd4b91c6 Convert stream filtering errors to warnings 2017-07-27 18:43:07 -04:00
Jay Berkenbilt 40f00122b8 Convert object parsing errors to warnings
QPDFObjectHandle::parseInternal now issues warnings instead of
throwing exceptions for all error conditions that it finds (except
internal logic errors) and has stronger recovery for things like
invalid tokens and malformed dictionaries. This should improve qpdf's
ability to recover from a wide range of broken files that currently
cause it to fail.
2017-07-27 18:20:31 -04:00
Jay Berkenbilt 701b518d5c Detect recursion loops resolving objects (fixes #51)
During parsing of an object, sometimes parts of the object have to be
resolved. An example is stream lengths. If such an object directly or
indirectly points to the object being parsed, it can cause an infinite
loop. Guard against all cases of re-entrant resolution of objects.
2017-07-26 06:24:07 -04:00
Jay Berkenbilt 315092dd98 Avoid xref reconstruction infinite loop (fixes #100)
This is CVE-2017-9209.
2017-07-26 06:24:07 -04:00
Thorsten Schöning 7c08aa4280 Include QPDFExc.hh for use in std::list 2016-01-24 12:07:03 -05:00
Jay Berkenbilt f77acbdbba Copyright 2015 2015-05-24 17:26:49 -04:00
Jay Berkenbilt a11549a566 Detect loops in /Pages structure
Pushing inherited objects to pages and getting all pages were both
prone to stack overflow infinite loops if there were loops in the
Pages dictionary. There is a general weakness in the code in that any
part of the code that traverses the Pages structure would be prone to
this and would have to implement its own loop detection. A more robust
fix may provide some general method for handling the Pages structure,
but it's probably not worth doing.

Note: addition of *Internal2 private functions was done rather than
changing signatures of existing methods to avoid breaking
compatibility.
2015-02-21 19:47:11 -05:00
Jay Berkenbilt 225b018290 Update Copyright to 2014 2014-01-14 15:40:02 -05:00
Jay Berkenbilt cee2592ed1 Change API/ABI and withdraw 4.2.0
4.2.0 was binary incompatible in spite of there being no deletions or
changes to any public methods.  As such, we have to bump the ABI and
are fixing some API breakage while we're at it.

Previous 4.3.0 target is now 5.1.0.
2013-07-10 11:30:13 -04:00
Jay Berkenbilt a3576a7359 Bug fix: handle generation > 0 when generating object streams
Rework QPDFWriter to always track old object IDs and QPDFObjGen
instead of int, thus not discarding the generation number.  Switch to
QPDF::getCompressibleObjGen() to properly handle the case of an old
object eligible for compression that has a generation of other than
zero.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt 96eb965115 Use QPDFObjectHandle::getObjGen() where appropriate
In internal code and examples, replace calls to getObjectID() and
getGeneration() with calls to getObjGen() where possible.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt d88231e01e Promote QPDF::ObjGen to top-level object QPDFObjGen 2013-06-14 14:58:08 -04:00
Jay Berkenbilt 30027481f7 Remove all old-style casts from C++ code 2013-03-04 16:45:16 -05:00
Jay Berkenbilt a04a835849 Clarify methods to get user password
With newer encryption formats, it is no longer possible to recover the
user password using the owner password.
2013-01-03 20:45:53 -05:00
Jay Berkenbilt 8843e499b8 Update copyright year to 2013
Also add copyright notice to a few public headers that were missing
one.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt e57c25814e Support for encryption with /V=5 and /R=5 and /R=6
Read and write support is implemented for /V=5 with /R=5 as well as
/R=6.  /R=5 is the deprecated encryption method used by Acrobat IX.
/R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt 93ac1695a4 Support files with only attachments encrypted
Test cases added in a future commit since they depend on /R=6 support.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt 774584163f Add ExtensionLevel support to version handling
All version operations are now fully aware of extension levels.
2012-12-31 05:36:50 -05:00
Jay Berkenbilt 3101955ac0 Add V5 parameters to EncryptionData 2012-12-31 05:36:50 -05:00
Jay Berkenbilt 68447bb556 change EncryptionData 2012-12-31 05:36:50 -05:00
Jay Berkenbilt 04c203ae06 Eliminate flattenScalarReferences 2012-12-31 05:36:48 -05:00
Jay Berkenbilt 041397fdab Allow reading from InputSource and writing to Pipeline
Allowing users to subclass InputSource and Pipeline to read and write
from/to arbitrary sources provides the maximum flexibility for users
who want to read and write from other than files or memory.
2012-09-23 17:42:26 -04:00
Jay Berkenbilt f83bddf882 Update copyright to 2012 2012-07-28 22:03:36 -04:00
Jay Berkenbilt 316328704b Windows compilation fixes 2012-07-21 20:51:56 -04:00
Jay Berkenbilt 6bbea4baa0 Implement QPDFObjectHandle::parse
Move object parsing code from QPDF to QPDFObjectHandle and
parameterize the parts of it that are specific to a QPDF object.
Provide a version that can't handle indirect objects and that can be
called on an arbitrary string.

A side effect of this change is that the offset used when reporting
invalid stream length has changed, but since the new value seems like
a better value than the old one, the test suite has been updated
rather than making the code backward compatible.  This only effects
the offset reported for invalid streams that lack /Length or have an
invalid /Length key.

Updated some test code and exmaples to use QPDFObjectHandle::parse.

Supporting changes include adding a BufferInputSource constructor that
takes a string.
2012-07-21 09:06:10 -04:00
Jay Berkenbilt 15eaed5c52 Refactor: pull *InputSource out of QPDF
InputSource, FileInputSource, and BufferInputSource are now top-level
classes instead of privately nested inside QPDF.
2012-07-21 09:06:06 -04:00
Jay Berkenbilt a101533e0a Add command line option to copy encryption from other file
Add --copy-encryption and --encryption-file-password options to qpdf.
Also strengthen test suite for copying encryption.  The strengthened
test suite would have caught the failure to preserve AES and the
failure to update the file version, which was invalidating the
encrypted data.
2012-07-15 21:15:24 -04:00
Jay Berkenbilt e7b8f297ba Support copying objects from another QPDF object
This includes QPDF::copyForeignObject and supporting foreign objects
as arguments to addPage*.
2012-07-11 15:54:33 -04:00
Jay Berkenbilt 8a217eb3a2 Add concept of reserved objects
QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved
provide a mechanism to add objects to a PDF file when there are
circular references.  This is a prerequisite to copying objects from
one PDF to another.
2012-07-10 23:34:32 -04:00
Tobias Hoffmann 39bbaa86e3 Build this->all_pages while traversing with pushInheritedAttributesToPage 2012-07-07 17:45:10 -04:00
Tobias Hoffmann abb53ac369 Limited inheritance to the attributes explicitly listed in the PDF spec
Previous versions of qpdf incorrectly passed arbitrary objects from
/Pages objects down to individual pages in direct contradition with
the PDF specification.  These are now left in /Pages.  When
intermediate /Pages nodes are being discarded as when the /Pages tree
is being flattened, a warning is issued when unknown keys are
encountered.
2012-07-04 23:04:55 -04:00
Tobias Hoffmann 7770a1b036 Added public method QPDF::pushInheritedAttributesToPage
Refactored optimizePagesTree to pushInheritedAttributesToPage and made
public
2012-07-04 16:24:03 -04:00
Jay Berkenbilt 2266c6232b Rework InputSource::readLine to make it much more efficient
This rework makes xref reconstruction run much faster and use much
less memory.
2012-06-27 06:48:06 -04:00
Jay Berkenbilt 8318d81ada Fix and test support for files >= 4 GB 2012-06-24 15:56:50 -04:00
Jay Berkenbilt 781c313058 Change QPDF_Integer from int to long long
This makes it possible to store offsets that are larger than 2 GB in
the trailer dictionary.
2012-06-24 15:20:01 -04:00