257 Commits

Author SHA1 Message Date
Jay Berkenbilt d24a120c7f Add QPDF::setImmediateCopyFrom 2019-01-10 22:35:08 -05:00
Jay Berkenbilt 3472f6c984 Update copyrights for 2019 2019-01-07 07:54:55 -05:00
Jay Berkenbilt fddbcab0e7 Mostly don't require original QPDF for copyForeignObject (fixes #219)
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt fbbb0ee016 Make a static version of QPDF::pipeStreamData
This is in preparation for being able to pipe a stream's data without
keeping a copy of its containing qpdf object.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt 7588cac295 Create an application-scope unique ID for each QPDF object
Use this instead of QPDF* as a map key for object_copiers.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt e27ac682e0 Move encryption parameters into a class 2019-01-06 09:58:16 -05:00
Jay Berkenbilt 837dcf8fc2 Don't call assert while checking linearization data (fixes #209, #231)
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
2019-01-04 11:55:42 -05:00
Jay Berkenbilt a01359189b Fix dangling references (fixes #240)
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
2019-01-04 10:29:29 -05:00
Jay Berkenbilt 3873f5fd9b Protect headers with compliant identifiers (fixes #233) 2018-08-12 14:10:32 -04:00
Jay Berkenbilt 651b51f056 Add QPDF_DLL to public destructors (fixes #220)
A few public destructors were missing QPDF_DLL, which could cause some
Windows applications to fail to link.
2018-08-04 20:08:06 -04:00
Jay Berkenbilt 2a82f6e1e0 Add method to get count of objects in QPDF 2018-06-22 15:53:40 -04:00
Jay Berkenbilt 2e7ee23bf6 Add QPDFPageDocumentHelper and QPDFPageObjectHelper
This is the beginning of higher-level API support using helper
classes. The goal is to be able to add more helpers without continuing
to pollute QPDF's and QPDFObjectHandle's public interfaces.
2018-06-21 15:57:13 -04:00
Jay Berkenbilt d0e99f195a More robust handling of type errors
Give objects descriptions and context so it is possible to issue
warnings instead of fatal errors for attempts to access objects of the
wrong type.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt 569d74d36b Allow raw encryption key to be specified
Add options to enable the raw encryption key to be directly shown or
specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the
idea and contribution of one implementation of this idea.
2018-01-14 10:21:05 -05:00
Jay Berkenbilt 68572df2bf Update copyright to 2018 2018-01-13 20:25:58 -05:00
Jay Berkenbilt 07c8bb2843 Additionally license under Apache License version 2.0
The Apache License version 2.0 is now the primary license for qpdf.
However, users may, at their option, continue to use Artistic version
2.0.
2017-09-14 12:59:25 -04:00
Jay Berkenbilt d31a7b76e7 Improve message for stream decoding error
Tweak the message so that we inform the user that we are mitigating
data loss.
2017-09-12 16:03:48 -04:00
Jay Berkenbilt e452d9dca6 Spell check 2017-08-22 14:22:20 -04:00
Jay Berkenbilt fabff0f3ec Limit token length during xref recovery
While scanning the file looking for objects, limit the length of
tokens we allow. This prevents us from getting caught up in reading a
file character by character while digging through large streams.
2017-08-22 14:13:10 -04:00
Jay Berkenbilt a8c93bd324 Push QPDF member variables into a nested class
Pushing member variables into a nested class enables addition of new
member variables without breaking binary compatibility.
2017-08-21 21:35:11 -04:00
Jay Berkenbilt 8288a4eb3a Update copyright to 2017 2017-08-21 21:18:47 -04:00
Jay Berkenbilt 30f109e244 Read xref table without PCRE
Also accept more errors than before.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt ca5b1d267a Improve stream length recovery
Eliminate PCRE and find endobj not preceded by endstream. Be more lax
about placement of endstream and endobj.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt 03aa9679ac Find startxref without PCRE 2017-08-10 21:30:32 -04:00
Jay Berkenbilt 1765c6ec20 Find header without PCRE 2017-08-10 21:30:32 -04:00
Jay Berkenbilt 296b679d6e Implement findFirst and findLast in InputSource
Preparing to refactor some pattern searching code to use these instead
of their own memchr loops. This should simplify the code that replaces
PCRE.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt ef8ae5449d Allow QPDFTokenizer::readToken to return bad tokens
Sometimes we want to ignore bad tokens rather than having them throw
an exception. A coverage case is commented out here and added in a
later commit.
2017-08-10 19:01:41 -04:00
Jay Berkenbilt 4647acbe3c Clarify documentation on copyForeignObject (fixes #69)
Be explicit about the need to keep the source QPDF object around.
2017-07-29 12:19:04 -04:00
Jay Berkenbilt 3a1ff5ded9 Add option to preserve unreferenced objects 2017-07-28 19:19:11 -04:00
Jay Berkenbilt 7f8892525f Add precheck streams capability
When requested, QPDFWriter will do more aggressive prechecking of streams
to make sure it can actually succeed in decoding them before
attempting to do so. This will allow preservation of raw data even
when the raw data is corrupted relative to the specified filters.
2017-07-27 23:42:27 -04:00
Jay Berkenbilt a4fd4b91c6 Convert stream filtering errors to warnings 2017-07-27 18:43:07 -04:00
Jay Berkenbilt 40f00122b8 Convert object parsing errors to warnings
QPDFObjectHandle::parseInternal now issues warnings instead of
throwing exceptions for all error conditions that it finds (except
internal logic errors) and has stronger recovery for things like
invalid tokens and malformed dictionaries. This should improve qpdf's
ability to recover from a wide range of broken files that currently
cause it to fail.
2017-07-27 18:20:31 -04:00
Jay Berkenbilt 701b518d5c Detect recursion loops resolving objects (fixes #51)
During parsing of an object, sometimes parts of the object have to be
resolved. An example is stream lengths. If such an object directly or
indirectly points to the object being parsed, it can cause an infinite
loop. Guard against all cases of re-entrant resolution of objects.
2017-07-26 06:24:07 -04:00
Jay Berkenbilt 315092dd98 Avoid xref reconstruction infinite loop (fixes #100)
This is CVE-2017-9209.
2017-07-26 06:24:07 -04:00
Thorsten Schöning 7c08aa4280 Include QPDFExc.hh for use in std::list 2016-01-24 12:07:03 -05:00
Jay Berkenbilt f77acbdbba Copyright 2015 2015-05-24 17:26:49 -04:00
Jay Berkenbilt a11549a566 Detect loops in /Pages structure
Pushing inherited objects to pages and getting all pages were both
prone to stack overflow infinite loops if there were loops in the
Pages dictionary. There is a general weakness in the code in that any
part of the code that traverses the Pages structure would be prone to
this and would have to implement its own loop detection. A more robust
fix may provide some general method for handling the Pages structure,
but it's probably not worth doing.

Note: addition of *Internal2 private functions was done rather than
changing signatures of existing methods to avoid breaking
compatibility.
2015-02-21 19:47:11 -05:00
Jay Berkenbilt 225b018290 Update Copyright to 2014 2014-01-14 15:40:02 -05:00
Jay Berkenbilt cee2592ed1 Change API/ABI and withdraw 4.2.0
4.2.0 was binary incompatible in spite of there being no deletions or
changes to any public methods.  As such, we have to bump the ABI and
are fixing some API breakage while we're at it.

Previous 4.3.0 target is now 5.1.0.
2013-07-10 11:30:13 -04:00
Jay Berkenbilt a3576a7359 Bug fix: handle generation > 0 when generating object streams
Rework QPDFWriter to always track old object IDs and QPDFObjGen
instead of int, thus not discarding the generation number.  Switch to
QPDF::getCompressibleObjGen() to properly handle the case of an old
object eligible for compression that has a generation of other than
zero.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt 96eb965115 Use QPDFObjectHandle::getObjGen() where appropriate
In internal code and examples, replace calls to getObjectID() and
getGeneration() with calls to getObjGen() where possible.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt d88231e01e Promote QPDF::ObjGen to top-level object QPDFObjGen 2013-06-14 14:58:08 -04:00
Jay Berkenbilt 30027481f7 Remove all old-style casts from C++ code 2013-03-04 16:45:16 -05:00
Jay Berkenbilt a04a835849 Clarify methods to get user password
With newer encryption formats, it is no longer possible to recover the
user password using the owner password.
2013-01-03 20:45:53 -05:00
Jay Berkenbilt 8843e499b8 Update copyright year to 2013
Also add copyright notice to a few public headers that were missing
one.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt e57c25814e Support for encryption with /V=5 and /R=5 and /R=6
Read and write support is implemented for /V=5 with /R=5 as well as
/R=6.  /R=5 is the deprecated encryption method used by Acrobat IX.
/R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt 93ac1695a4 Support files with only attachments encrypted
Test cases added in a future commit since they depend on /R=6 support.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt 774584163f Add ExtensionLevel support to version handling
All version operations are now fully aware of extension levels.
2012-12-31 05:36:50 -05:00
Jay Berkenbilt 3101955ac0 Add V5 parameters to EncryptionData 2012-12-31 05:36:50 -05:00
Jay Berkenbilt 68447bb556 change EncryptionData 2012-12-31 05:36:50 -05:00
Jay Berkenbilt 04c203ae06 Eliminate flattenScalarReferences 2012-12-31 05:36:48 -05:00
Jay Berkenbilt 041397fdab Allow reading from InputSource and writing to Pipeline
Allowing users to subclass InputSource and Pipeline to read and write
from/to arbitrary sources provides the maximum flexibility for users
who want to read and write from other than files or memory.
2012-09-23 17:42:26 -04:00
Jay Berkenbilt f83bddf882 Update copyright to 2012 2012-07-28 22:03:36 -04:00
Jay Berkenbilt 316328704b Windows compilation fixes 2012-07-21 20:51:56 -04:00
Jay Berkenbilt 6bbea4baa0 Implement QPDFObjectHandle::parse
Move object parsing code from QPDF to QPDFObjectHandle and
parameterize the parts of it that are specific to a QPDF object.
Provide a version that can't handle indirect objects and that can be
called on an arbitrary string.

A side effect of this change is that the offset used when reporting
invalid stream length has changed, but since the new value seems like
a better value than the old one, the test suite has been updated
rather than making the code backward compatible.  This only affects
the offset reported for invalid streams that lack /Length or have an
invalid /Length key.

Updated some test code and examples to use QPDFObjectHandle::parse.

Supporting changes include adding a BufferInputSource constructor that
takes a string.
2012-07-21 09:06:10 -04:00
Jay Berkenbilt 15eaed5c52 Refactor: pull *InputSource out of QPDF
InputSource, FileInputSource, and BufferInputSource are now top-level
classes instead of privately nested inside QPDF.
2012-07-21 09:06:06 -04:00
Jay Berkenbilt a101533e0a Add command line option to copy encryption from other file
Add --copy-encryption and --encryption-file-password options to qpdf.
Also strengthen test suite for copying encryption.  The strengthened
test suite would have caught the failure to preserve AES and the
failure to update the file version, which was invalidating the
encrypted data.
2012-07-15 21:15:24 -04:00
Jay Berkenbilt e7b8f297ba Support copying objects from another QPDF object
This includes QPDF::copyForeignObject and supporting foreign objects
as arguments to addPage*.
2012-07-11 15:54:33 -04:00
Jay Berkenbilt 8a217eb3a2 Add concept of reserved objects
QPDFObjectHandle::{new,is,assert}Reserved, QPDF::replaceReserved
provide a mechanism to add objects to a PDF file when there are
circular references.  This is a prerequisite to copying objects from
one PDF to another.
2012-07-10 23:34:32 -04:00
Tobias Hoffmann 39bbaa86e3 Build this->all_pages while traversing with pushInheritedAttributesToPage 2012-07-07 17:45:10 -04:00
Tobias Hoffmann abb53ac369 Limited inheritance to the attributes explicitly listed in the PDF spec
Previous versions of qpdf incorrectly passed arbitrary objects from
/Pages objects down to individual pages in direct contradiction with
the PDF specification.  These are now left in /Pages.  When
intermediate /Pages nodes are being discarded as when the /Pages tree
is being flattened, a warning is issued when unknown keys are
encountered.
2012-07-04 23:04:55 -04:00
Tobias Hoffmann 7770a1b036 Added public method QPDF::pushInheritedAttributesToPage
Refactored optimizePagesTree to pushInheritedAttributesToPage and made
public
2012-07-04 16:24:03 -04:00
Jay Berkenbilt 2266c6232b Rework InputSource::readLine to make it much more efficient
This rework makes xref reconstruction run much faster and use much
less memory.
2012-06-27 06:48:06 -04:00
Jay Berkenbilt 8318d81ada Fix and test support for files >= 4 GB 2012-06-24 15:56:50 -04:00
Jay Berkenbilt 781c313058 Change QPDF_Integer from int to long long
This makes it possible to store offsets that are larger than 2 GB in
the trailer dictionary.
2012-06-24 15:20:01 -04:00
Jay Berkenbilt 4f305488d8 Improve the FILE* version of QPDF::processFile 2012-06-23 18:23:06 -04:00
Jay Berkenbilt ffb96ee17e Add pdf-from-scratch example 2012-06-23 09:05:06 -04:00
Jay Berkenbilt a0768e4190 Add QPDF::emptyPDF() and pdf_from_scratch test code 2012-06-21 23:09:05 -04:00
Jay Berkenbilt 81e8752362 Use qpdf_offset_t in place of off_t in public APIs.
off_t is used internally only when needed to talk to standard
libraries.  This requires that the "long long" type be supported by
the compiler.
2012-06-21 21:23:24 -04:00
Jay Berkenbilt eb802cfa8c Implement page manipulation APIs 2012-06-21 15:01:02 -04:00
Jay Berkenbilt df493c352f Refactor optimizePagesTree
Split optimizePagesTree into a simpler top-level routine and a
recursive internal routine.
2012-06-21 15:01:02 -04:00
Tobias Hoffmann 5d3f93be29 Added first version of pages API. 2012-06-21 15:01:02 -04:00
Tobias Hoffmann 47a846a7e0 Added method to clear pages cache. 2012-06-21 15:01:02 -04:00
Jay Berkenbilt f59ff6fcc2 fix include order for off_t 2012-06-21 14:11:22 -04:00
Jay Berkenbilt bc1c4bb578 Add QPDF::processFile that takes an open FILE* 2012-06-21 08:00:35 -04:00
Jay Berkenbilt 5d4cad9c02 ABI change: fix use of off_t, size_t, and integer types
Significantly improve the code's use of off_t for file offsets, size_t
for memory sizes, and integer types in cases where there has to be
compatibility with external interfaces.  Rework sections of the code
that would have prevented qpdf from working on files larger than 2 (or
maybe 4) GB in size.
2012-06-20 15:20:26 -04:00
Jay Berkenbilt b856379370 Portability issues: off_t, unlink
New header qpdf/Types.h attempts to make sure size_t and off_t are
defined on any platform and in a way that would work with large file
support.  Additionally, missing header files are included to get
unlink.
2012-06-20 15:18:14 -04:00
Jay Berkenbilt 7dc197ef88 implement replace and swap 2011-08-10 12:42:48 -04:00
Jay Berkenbilt c551b972f6 update version to 2.2.3, update copyright to 2011
git-svn-id: svn+q:///qpdf/trunk@1051 71b93d88-0707-0410-a8cf-f5a4172ac649
2011-04-30 19:19:30 +00:00
Jay Berkenbilt a72ce95c92 setOutputStreams
git-svn-id: svn+q:///qpdf/trunk@1035 71b93d88-0707-0410-a8cf-f5a4172ac649
2010-10-01 11:02:35 +00:00
Jay Berkenbilt 9f444ffef3 add QPDF::processMemoryFile and API additions to support it
git-svn-id: svn+q:///qpdf/trunk@1034 71b93d88-0707-0410-a8cf-f5a4172ac649
2010-10-01 10:20:38 +00:00
Jay Berkenbilt ce8b1ba6a5 convert file to a PointerHolder<InputSource> so it could be either a file or a buffer; also fix a bug in BufferInputSource::seek
git-svn-id: svn+q:///qpdf/trunk@1030 71b93d88-0707-0410-a8cf-f5a4172ac649
2010-09-24 19:10:08 +00:00
Jay Berkenbilt bcd621e208 update copyrights for 2010
git-svn-id: svn+q:///qpdf/trunk@935 71b93d88-0707-0410-a8cf-f5a4172ac649
2010-01-25 01:23:20 +00:00
Jay Berkenbilt ace2a031b5 prepare 2.1.rc1 for release
git-svn-id: svn+q:///qpdf/trunk@901 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-24 04:47:17 +00:00
Jay Berkenbilt 27ee889c0e tweak dll stuff again
git-svn-id: svn+q:///qpdf/trunk@851 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-21 01:45:13 +00:00
Jay Berkenbilt 748ab301d4 go back to function-based DLL_EXPORT rather than class-based to avoid creation of export files with executables under msvc
git-svn-id: svn+q:///qpdf/trunk@849 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-21 00:27:24 +00:00
Jay Berkenbilt 398354b6f0 update C API for error retrieval
git-svn-id: svn+q:///qpdf/trunk@830 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-20 00:24:44 +00:00
Jay Berkenbilt 3f8c4c2736 categorize all error messages and include object information if available
git-svn-id: svn+q:///qpdf/trunk@829 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-19 23:09:19 +00:00
Jay Berkenbilt b67a3c15e7 DLL.hh -> DLL.h, move public enumerated types into Constants.h and use them both for C and C++ interfaces
git-svn-id: svn+q:///qpdf/trunk@828 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-19 20:17:14 +00:00
Jay Berkenbilt e25910b59a reading crypt filters is largely implemented but not fully tested
git-svn-id: svn+q:///qpdf/trunk@812 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-17 23:37:55 +00:00
Jay Berkenbilt c13bc66de8 checkpoint -- partially implemented /V=4 encryption
git-svn-id: svn+q:///qpdf/trunk@811 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-17 18:54:51 +00:00
Jay Berkenbilt 846c9f6bcc checkpoint -- started doing some R4 encryption support
git-svn-id: svn+q:///qpdf/trunk@807 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-17 03:14:47 +00:00
Jay Berkenbilt f71eb2af91 fix class-level DLL_EXPORT
git-svn-id: svn+q:///qpdf/trunk@797 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-12 01:18:19 +00:00
Jay Berkenbilt 44cbd3d4b4 do DLL_EXPORT only in header files and only at the class or top-level function level
git-svn-id: svn+q:///qpdf/trunk@796 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-12 01:15:55 +00:00
Jay Berkenbilt ad03b51b3d typo in comment
git-svn-id: svn+q:///qpdf/trunk@745 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-10-04 23:55:02 +00:00
Jay Berkenbilt 8d7bb3ff50 add methods for getting encryption data
git-svn-id: svn+q:///qpdf/trunk@733 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-09-27 20:05:38 +00:00
Jay Berkenbilt fe6771e0e5 add many new tests to exercise C api
git-svn-id: svn+q:///qpdf/trunk@727 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-09-27 16:01:45 +00:00
Jay Berkenbilt 84ec83e925 basic implementation of C API
git-svn-id: svn+q:///qpdf/trunk@725 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-09-27 14:39:04 +00:00
Jay Berkenbilt 02333ba1e9 checkpoint -- first crack at C API, minor refactoring of encryption functions
git-svn-id: svn+q:///qpdf/trunk@720 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-09-27 03:11:29 +00:00
Jay Berkenbilt 82ea3dd3a7 don't dll export inline functions
git-svn-id: svn+q:///qpdf/trunk@712 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-09-27 01:56:41 +00:00
Jay Berkenbilt 1e74c03acd stick DLL_EXPORT in front of every public method of every public class
git-svn-id: svn+q:///qpdf/trunk@688 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-08-06 19:00:25 +00:00
Jay Berkenbilt 599daddb47 decode streams on check, always exit abnormally when warnings are detected
git-svn-id: svn+q:///qpdf/trunk@660 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-03-08 19:00:19 +00:00
Jay Berkenbilt 91cb7c0a58 fix many typos in comments and strings
git-svn-id: svn+q:///qpdf/trunk@651 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-02-21 02:54:31 +00:00
Jay Berkenbilt 4499e04b57 better recovery for appended files with damaged cross-reference tables
git-svn-id: svn+q:///qpdf/trunk@649 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-02-21 02:31:32 +00:00
Jay Berkenbilt 9f93c89ee5 update copyright, release 2.0.3
git-svn-id: svn+q:///qpdf/trunk@644 71b93d88-0707-0410-a8cf-f5a4172ac649
2009-02-15 16:31:12 +00:00
Jay Berkenbilt 62bff4861f fix potential 64-bit issues
git-svn-id: svn+q:///qpdf/trunk@613 71b93d88-0707-0410-a8cf-f5a4172ac649
2008-05-05 02:22:40 +00:00
Jay Berkenbilt 9a0b88bf77 update release date to actual date
git-svn-id: svn+q:///qpdf/trunk@599 71b93d88-0707-0410-a8cf-f5a4172ac649
2008-04-29 12:55:25 +00:00