2
1
mirror of https://github.com/qpdf/qpdf.git synced 2025-02-02 03:48:24 +00:00

231 Commits

Author SHA1 Message Date
Jay Berkenbilt
315092dd98 Avoid xref reconstruction infinite loop (fixes #100)
This is CVE-2017-9209.
2017-07-26 06:24:07 -04:00
Jay Berkenbilt
603f222365 Fix infinite loop while reporting an error (fixes #101)
This is CVE-2017-9210.

The description string for an error message included unparsing an
object, which is too complex of a thing to try to do while throwing an
exception. There was only one example of this in the entire codebase,
so it is not a pervasive problem. Fixing this eliminated one class of
infinite loop errors.
2017-07-26 06:24:07 -04:00
Thorsten Schöning
e80b6e3341 Support paths with spaces 2016-01-24 11:52:09 -05:00
Thorsten Schöning
eff935ab60 Use absolute paths for large file tests
Working with absolute paths makes debugging easier, but some called
scripts always need / as dir separator or won't work.
2016-01-24 11:52:09 -05:00
Thorsten Schöning
adbaa54ad4 Fix non-portable use of /dev/null
/dev/null is not portable, so use File::Spec instead, which provides
portable "paths" and especially "nul" on Windows. I changed all places
with hard coded /dev/null to be sure, while I think it only is a
problem in direct system calls, because the other executed commands go
to sh.exe from MSYS which itself should port /dev/null to NUL. The
test still pass, so shouldn't have made any harm...
2016-01-24 11:52:09 -05:00
Thorsten Schöning
951dbc3b7f Fix expr syntax, support spaces in paths
expr needs ARG + ARG
quote paths to support support spaces
2016-01-24 11:52:09 -05:00
Thorsten Schöning
3c1555a622 Explicitly invoke shell scripts with sh
Shebang doesn't work well on Windows.
2016-01-24 11:52:09 -05:00
Jay Berkenbilt
b62cbe2508 Tolerate some mangled xref tables
If xref table entries lack the spec-required trailing whitespace or
contain a small amount of extra space, handle them anyway.
2015-10-31 18:56:43 -04:00
Jay Berkenbilt
b8bdef0ad1 Implement deterministic ID
For non-encrypted files, determinstic ID generation uses file contents
instead of timestamp and file name. At a small runtime cost, this
enables generation of the same /ID if the same inputs are converted in
the same way multiple times.
2015-10-31 18:56:42 -04:00
Jay Berkenbilt
b356b9dfa2 fix-qdf: handle object streams with > 255 objects
fix-qdf was previously hard-coding the number of bytes for the f2
field of the xref stream entry. This addresses issue #37. Thanks
aluebcke for reporting.
2015-05-24 16:52:42 -04:00
Jay Berkenbilt
a11549a566 Detect loops in /Pages structure
Pushing inherited objects to pages and getting all pages were both
prone to stack overflow infinite loops if there were loops in the
Pages dictionary. There is a general weakness in the code in that any
part of the code that traverses the Pages structure would be prone to
this and would have to implement its own loop detection. A more robust
fix may provide some general method for handling the Pages structure,
but it's probably not worth doing.

Note: addition of *Internal2 private functions was done rather than
changing signatures of existing methods to avoid breaking
compatibility.
2015-02-21 19:47:11 -05:00
Jay Berkenbilt
c729e07d55 Avoid resolving arguments to R
When checking two objects preceding R while parsing, ensure that the
objects are direct. This avoids stuff like 1 0 obj containing 1 0 R 0 R
from causing an infinite loop in object resolution.
2015-02-21 17:51:08 -05:00
Jay Berkenbilt
d8900c2255 Handle page tree node with no /Type
Original reported here:
https://bugs.launchpad.net/ubuntu/+source/qpdf/+bug/1397413

The PDF specification says that the /Type key for nodes in the pages
dictionary (both /Page and /Pages) is required, but some PDF files
omit them. Use the presence of other keys to determine the type of
pages tree node this is if the type key is not found.
2014-12-29 10:17:21 -05:00
Jay Berkenbilt
caab1b0e16 Handle pages with no /Contents from getPageContents()
The spec allows /Contents to be omitted for pages that are blank, but
QPDFObjectHandle::getPageContents() was throwing an exception in this
case.
2014-11-14 13:43:34 -05:00
Jay Berkenbilt
9f8aba1db7 Handle indirect stream filter/decode parameters
QPDFWriter was trying to make /Filter and /DecodeParms direct in all
cases, but there are some cases where /DecodeParms may refer to a
stream, which can't be direct. QPDFWriter doesn't actually need
/DecodeParms to be direct in that case because it won't be able to
filter the stream. Until we can handle this type of stream, just don't
make /Filter and /DecodeParms direct if we can't filter the stream
anyway.

Fixes #34
2014-06-07 16:31:03 -04:00
Jay Berkenbilt
e9a319fb95 Allow arbitrary whitespace, not just newline, after xref
Fixes #27.
2013-12-14 15:17:23 -05:00
Jay Berkenbilt
157c936b97 Use 8 bit per sample images in tests
In compare image tests, use the gs device tiff24nc instead of tiff12nc
since the 4 bit per sample images created by tiff12nc could sometimes
trigger a bug in tiffcmp.  Fixes #20.
2013-11-21 13:41:37 -05:00
Jay Berkenbilt
a237e92445 Warn when -accessibility=n will be ignored
Also accept -accessibility=n with 256 bit keys even though it will be
ignored.
2013-10-18 10:45:15 -04:00
Jay Berkenbilt
0bfe902489 Security: avoid pre-allocating vectors based on file data
In places where std::vector<T>(size_t) was used, either validate that
the size parameter is sane or refactor code to avoid the need to
pre-allocate the vector.
2013-10-09 20:57:14 -04:00
Jay Berkenbilt
3eb4b066ab Security: better bounds checks for linearization data
The faulty code was only used during explicit checks of linearization
data.  Those checks are not part of normal reading or writing of PDF
files.
2013-10-09 19:50:09 -04:00
Jay Berkenbilt
b84f57e56d Ignore broken DecodeParms for stream with no filters 2013-07-07 19:43:16 -04:00
Jay Berkenbilt
91367239fd Add --show-npages option to qpdf 2013-07-07 19:43:16 -04:00
Jay Berkenbilt
adccedc02f Allow numeric range to be omitted qpdf --pages
Detect a missing page range and assume 1-z.
2013-07-07 19:43:16 -04:00
Jay Berkenbilt
a85007cb0d Handle more broken files
Space rather than newline after xref, missing /ID in trailer for
encrypted file.  This enables qpdf to handle some files that xpdf can
handle.  Adobe reader can't necessarily handle them.
2013-06-15 12:40:01 -04:00
Jay Berkenbilt
16051788ed Handle /Outlines dictionary being a direct object
Even though this case is not valid according to the spec, it has been
seen, and caused an internal error.
2013-06-14 21:36:04 -04:00
Jay Berkenbilt
eae8370cd9 Add optional /Length key in crypt filter dictionary 2013-06-14 20:42:39 -04:00
Jay Berkenbilt
a3576a7359 Bug fix: handle generation > 0 when generating object streams
Rework QPDFWriter to always track old object IDs and QPDFObjGen
instead of int, thus not discarding the generation number.  Switch to
QPDF::getCompressibleObjGen() to properly handle the case of an old
object eligible for compression that has a generation of other than
zero.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt
29f5830325 Fix getTypeCode and getTypeName work for indirect objects
Remove const qualifier from getTypeCode and get getTypeName methods of
QPDFObjectHandle, make them work properly for indirect objects, and
exercise them much better in the test suite.
2013-03-05 13:35:46 -05:00
Jay Berkenbilt
119f2a4b68 Add method to terminate content stream parsing 2013-03-05 13:35:46 -05:00
Jay Berkenbilt
6c7bf114dc Bug fix: properly handle overridden compressed objects
When caching objects in an object stream, only cache objects that
still resolve to that stream.  See Changelog mod from this commit for
details.
2013-02-23 17:51:17 -05:00
Jay Berkenbilt
a5d8783f67 Improve qpdf --check
Fix exit status for case of errors without warnings, continue after
errors when possible, add test case for parsing a file with content
stream errors on some but not all pages.
2013-01-25 11:08:50 -05:00
Jay Berkenbilt
bfda717749 Cosmetic changes to be closer to Adobe terminology
Change object type Keyword to Operator, and place the order of the
object types in object_type_e in the same order as they are mentioned
in the PDF specification.

Note that this change only breaks backward compatibility with code
that has not yet been released.
2013-01-23 09:38:05 -05:00
Jay Berkenbilt
913eb5ac35 Add getTypeCode() and getTypeName()
Add virtual methods to QPDFObject, wrappers to QPDFObjectHandle, and
implementations to all the QPDF_Object types.
2013-01-22 10:01:45 -05:00
Jay Berkenbilt
f81152311e Add QPDFObjectHandle::parseContentStream method
This method allows parsing of the PDF objects in a content stream or
array of content streams.
2013-01-20 15:35:39 -05:00
Jay Berkenbilt
f8306913ba Update "C" API with functions for new features 2012-12-31 10:32:32 -05:00
Jay Berkenbilt
9a23c3dcb6 Remove /Crypt from stream filters unconditionally
When writing a new stream, always remove /Crypt even if we are not
otherwise able to filter the stream.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt
4237a29c94 Refactor Dictionary writing code
Original code was written before we could shallow copy objects, so all
the filtering was done by suppressing the output of certain keys and
replacing them with other keys.  Now we can simplify the code greatly
by modifying shallow copies of dictionaries in place.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt
e57c25814e Support for encryption with /V=5 and /R=5 and /R=6
Read and write support is implemented for /V=5 with /R=5 as well as
/R=6.  /R=5 is the deprecated encryption method used by Acrobat IX.
/R=6 is the encryption method used by PDF 2.0 from ISO 32000-2.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt
93ac1695a4 Support files with only attachments encrypted
Test cases added in a future commit since they depend on /R=6 support.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt
4fe6f61def Add missing test case from long ago
I noticed a test output file that was not accessed in the test suite
and added a test case for it.
2012-12-31 10:32:32 -05:00
Jay Berkenbilt
774584163f Add ExtensionLevel support to version handling
All version operations are now fully aware of extension levels.
2012-12-31 05:36:50 -05:00
Jay Berkenbilt
04c203ae06 Eliminate flattenScalarReferences 2012-12-31 05:36:48 -05:00
Jay Berkenbilt
b4b8b28ed2 Reference object with zero offset
This file used to exercise a zero offset test case when qpdf would
visit every object in the file.  After the next commit, qpdf no longer
touches unreferenced objects, so a reference had to be added to
continue to have this file exercise the zero offset case.
2012-12-27 11:36:48 -05:00
Jay Berkenbilt
35873031a7 Uncompress stream data for some linearization tests
For linearization tests where we are actually comparing the exact
output of the test with a known file, uncompress stream data so we can
see what's there.  This makes looking at future changes a little
easier.
2012-12-27 11:26:06 -05:00
Jay Berkenbilt
7f84239cad Find PDF header anywhere in the first 1024 bytes 2012-12-25 14:43:37 -05:00
Jay Berkenbilt
f256670eba Ignore objects with offset 0 2012-11-20 13:57:37 -05:00
Jay Berkenbilt
041397fdab Allow reading from InputSource and writing to Pipeline
Allowing users to subclass InputSource and Pipeline to read and write
from/to arbitrary sources provides the maximum flexibility for users
who want to read and write from other than files or memory.
2012-09-23 17:42:26 -04:00
Jay Berkenbilt
c1627d0438 Add QPDFWriter::setExtraHeaderText 2012-09-06 15:31:12 -04:00
Jay Berkenbilt
8d2b29ef98 Fix segmentation fault with use of QPDFWriter::setOutputMemory 2012-09-06 14:39:06 -04:00
Jay Berkenbilt
3c4110184c Add specially crafted test cases for EOF error
This replaces a PDF from the wild that I didn't want to include in the
test suite but used to verify the original fix.
2012-08-11 12:36:57 -04:00