2
1
mirror of https://github.com/qpdf/qpdf.git synced 2024-09-22 10:09:06 +00:00
Commit Graph

1023 Commits

Author SHA1 Message Date
Jay Berkenbilt
83216e640c Preserve form fields when splitting pages (fixes #340) 2021-02-22 18:42:06 -05:00
Jay Berkenbilt
1f35ec9988 Add methods for copying form fields 2021-02-22 18:42:06 -05:00
Jay Berkenbilt
8e8c0d8290 Add new placeFormXObject that takes a matrix reference 2021-02-22 18:42:06 -05:00
Jay Berkenbilt
61d41e2e88 Add copyAnnotations, use with overlay/underlay (fixes #395) 2021-02-22 18:42:06 -05:00
Jay Berkenbilt
7b3cbacf5d Change from QPDF{Array,Dict}Items to aitems() and ditems() 2021-02-22 11:05:39 -05:00
Jay Berkenbilt
a9ae8cadc6 Add transformAnnotations and fix flattenRotations to use it 2021-02-21 17:13:09 -05:00
Jay Berkenbilt
a76decd2d5 Add QPDFObjGen::unparse 2021-02-21 16:21:52 -05:00
Jay Berkenbilt
7540d2082a Explicitly override inherited rotate in flattenRotations 2021-02-21 14:58:45 -05:00
Jay Berkenbilt
e899926e0d Use QPDFMatrix inside flattenRotations 2021-02-21 14:58:45 -05:00
Jay Berkenbilt
92fbc6fdf5 QPDFObjectHandle::copyStream 2021-02-21 06:36:30 -05:00
Jay Berkenbilt
60afe4142e Refactor: separate copyStreamData from replaceForeignIndirectObjects 2021-02-21 06:36:30 -05:00
Jay Berkenbilt
15269f36d8 addFormField: update cache rather than invalidating 2021-02-21 06:36:30 -05:00
Jay Berkenbilt
901f1a788c Enhance QPDFMatrix API 2021-02-21 06:36:30 -05:00
Jay Berkenbilt
05eb5826d8 Fix isPagesObject and isPageObject
There are lots of things with /Kids that are not pages. Repair the
pages tree, then do a reliable check.
2021-02-20 19:42:41 -05:00
Jay Berkenbilt
35dd11f356 Allow --rotate=0 2021-02-20 16:29:34 -05:00
Jay Berkenbilt
71e8627285 Add const versions of QPDFMatrix::transform* 2021-02-19 18:35:19 -05:00
Jay Berkenbilt
de8929a41c Add QPDFAcroFormDocumentHelper::addFormField 2021-02-18 12:25:48 -05:00
Jay Berkenbilt
5cec6b4c3d Add QPDFPageObjectHelper::getMatrixForFormXObjectPlacement 2021-02-18 12:25:48 -05:00
Jay Berkenbilt
0765872295 Form field for non-widget just returns null 2021-02-18 10:25:07 -05:00
Jay Berkenbilt
0b1623d07d Add QUtil::path_basename 2021-02-18 09:59:03 -05:00
Jay Berkenbilt
a773f4c71d Add QPDFObjectHandle::parse for strings with context 2021-02-15 11:33:03 -05:00
Jay Berkenbilt
7eb903d9aa Use functional replaceStreamData 2021-02-14 14:42:24 -05:00
Jay Berkenbilt
efbb21673c Add functional versions of QPDFObjectHandle::replaceStreamData
Also fix a bug in checking consistency of length for stream data
providers. Length should not be checked or recorded if the provider
says it failed to generate the data.
2021-02-14 14:42:24 -05:00
Jay Berkenbilt
e2593e2efe Move QPDFMatrix into the public API 2021-02-13 02:30:00 -05:00
Jay Berkenbilt
07f40bd254 QUtil::double_to_string: trim trailing zeroes with option to disable 2021-02-13 02:30:00 -05:00
Jay Berkenbilt
8fbc8579f2 Allow zone information to be omitted from timestamp strings 2021-02-11 14:26:55 -05:00
Jay Berkenbilt
df067c9ab6 Add autoconf test for localtime_r 2021-02-11 14:26:55 -05:00
Jay Berkenbilt
1b3f84f967 Require C++14 instead of C++11 2021-02-10 16:27:58 -05:00
Jay Berkenbilt
9fcf61b2f6 Fix loop in QPDFOutlineDocumentHelper (fuzz issue 30507) 2021-02-10 16:27:44 -05:00
Jay Berkenbilt
4d1f2fdcac Update to new name/number tree API 2021-02-10 15:46:20 -05:00
Jay Berkenbilt
1f4771cd0d Minor clean up of Windows headers 2021-02-10 07:36:18 -05:00
Jay Berkenbilt
ad34b9c278 Implement helpers for file attachments 2021-02-10 06:57:37 -05:00
Jay Berkenbilt
bf0e6eb302 Add QUtil methods for dealing with PDF timestamp strings 2021-02-09 17:50:24 -05:00
Jay Berkenbilt
bfbeec5497 Make newly created name/number trees indirect objects 2021-02-08 06:49:56 -05:00
Jay Berkenbilt
553ac7f353 Add QUtil::pipe_file and QUtil::file_provider 2021-02-07 19:41:34 -05:00
Jay Berkenbilt
e076c9bf08 Remove erroneous handling of /EFF for stream decryption
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
2021-02-06 17:08:41 -05:00
Jay Berkenbilt
ac2b3b96e1 Make wrong object stream type a warning 2021-02-06 14:29:11 -05:00
Jay Berkenbilt
faa2e3ddfd Handle older PDFs whose form XObjects inherit resources (fixes #494)
When removing unreferenced resources, notice if a page (recursively)
contains a form XObject with unreferenced resources, and count any
such resources as referenced by the page.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt
81025e4998 Refactor removal of unreferenced resources
Refactor in preparation for resolving unresolved resources in form
xobjects from page.
2021-02-02 18:06:05 -05:00
Jay Berkenbilt
9c9ce64eec Handle strings in inline image dictionaries
We need to use token.getRawValue, not token.getValue
2021-01-31 07:50:03 -05:00
Jay Berkenbilt
178f995fc2 Recover from exceptions during filtering for inline images 2021-01-31 07:49:08 -05:00
Jay Berkenbilt
4ae93a73c5 Improve memory safety of dict/array iterators 2021-01-31 07:16:03 -05:00
Jay Berkenbilt
de0b11fc47 Add C++ iterator API around array and dictionary objects 2021-01-30 15:15:23 -05:00
Jay Berkenbilt
35e7859bc7 Make QPDFObjectHandle::is* return false for uninitialized objects 2021-01-29 15:46:54 -05:00
Jay Berkenbilt
50decc9bb8 name/number tree: explicitly declare default destructors 2021-01-29 15:46:54 -05:00
Jay Berkenbilt
8ed3e8c79b NNTree: rework iterators to be more memory efficient
Keep a std::pair internal to the iterators so that operator* can
return a reference and operator-> can work, and each can work without
copying pairs of objects around.
2021-01-26 09:12:23 -05:00
Jay Berkenbilt
e7e20772ed name/number trees: remove 2021-01-26 09:12:23 -05:00
Jay Berkenbilt
5816fb44b8 name/number trees: insertAfter 2021-01-25 15:39:10 -05:00
Jay Berkenbilt
16a9bb3f6f name/number trees: newEmpty, increment/decrement end() 2021-01-25 15:39:10 -05:00
Jay Berkenbilt
b5614f611d Implement repair and insert for name/number trees 2021-01-24 19:31:45 -05:00
Jay Berkenbilt
04edfe9fad QPDFObjectHandle::newUnicodeString to uses UTF-16 only when needed
Use the first of ASCII, PDFDocEncoding, or UTF-16 that is capable of
encoding the string.
2021-01-24 03:27:28 -05:00
Jay Berkenbilt
63e5cb533d Use new QPDF{Name,Number}TreeObjectHelper API 2021-01-24 03:27:28 -05:00
Jay Berkenbilt
d61ffb65d0 Add new constructors for name/number tree helpers
Add constructors that take a QPDF object so we can issue warnings and
create new indirect objects.
2021-01-24 03:27:26 -05:00
Jay Berkenbilt
ba814703fb Use QPDFNameTreeObjectHelper's iterator directly 2021-01-24 03:25:11 -05:00
Jay Berkenbilt
5f0708418a Add iterators to name/number tree helpers 2021-01-24 03:22:59 -05:00
Jay Berkenbilt
4a1cce0a47 Reimplement name and number tree object helpers
Create a computationally and memory efficient implementation of name
and number trees that does binary searches as intended by the data
structure rather than loading into a map, which can use a great deal
of memory and can be very slow.
2021-01-24 03:22:51 -05:00
Jay Berkenbilt
6226b69dba Add warn() to QPDF's public API 2021-01-16 18:41:53 -05:00
Jay Berkenbilt
fc88837d4b Treat /EmbeddedFiles as a proper name tree
If we ever had an encrypted file with different filters for
attachments and either the /EmbeddedFiles name tree was deep or some
of the file specs didn't have /Type, we would have overlooked those as
attachment streams. The code now properly handles /EmbeddedFiles as a
name tree.
2021-01-11 10:50:44 -05:00
Jay Berkenbilt
6fe7b704c7 Warn rather than segv on access after closing input source (fixes #495) 2021-01-06 10:11:34 -05:00
Jay Berkenbilt
0fed040392 Prepare version 10.1.0 2021-01-04 16:59:55 -05:00
Jay Berkenbilt
18340b8835 Spell check 2021-01-04 16:26:58 -05:00
Jay Berkenbilt
dc92574c10 Fix some pipelines to be safe if downstream write fails (fuzz issue 28262) 2021-01-04 15:17:35 -05:00
Jay Berkenbilt
ba6b6aacf1 Fix outdated comment 2021-01-03 15:59:49 -05:00
Jay Berkenbilt
3be58f49e5 Make more QPDFPageObjectHelper methods work with form XObject 2021-01-02 14:08:53 -05:00
Jay Berkenbilt
98da4fd835 Externalize inline images now includes form XObjects 2021-01-02 14:08:17 -05:00
Jay Berkenbilt
bedf35d6a5 Bug fix: avoid extraneous pipeline finish calls with multiple contents
Avoid calling finish() multiple times on the pipeline passed to
pipeContentStreams. This commit also fixes a bug in which qpdf was not
exiting with the proper exit status if warnings found while splitting
pages; this was exposed by a test case that changed.
2021-01-02 14:08:17 -05:00
Jay Berkenbilt
a139d2b36d Add several methods for working with form XObjects (fixes #436)
Make some more methods in QPDFPageObjectHelper work with form
XObjects, provide forEach methods to walk through nested form
XObjects, possibly recursively. This should make it easier to work
with form XObjects from user code.
2021-01-02 12:29:31 -05:00
Jay Berkenbilt
6154221edb QPDFPageObjectHelper: filterPageContents -> filterContents + form XObject 2021-01-02 11:33:36 -05:00
Jay Berkenbilt
63ea46193d QPDFPageObjectHelper: getPageImages -> getImages 2021-01-02 11:33:36 -05:00
Jay Berkenbilt
e7a8554563 QPDFPageObjectHelper::getPageImages: support form XObjects 2021-01-02 11:33:36 -05:00
Jay Berkenbilt
1562d34c09 Add QPDFObjectHandle::isFormXObject 2021-01-01 07:36:10 -05:00
Jay Berkenbilt
c9271335fa Add QPDFPageObjectHelper::flattenRotation and --flatten-rotation 2020-12-30 13:03:55 -05:00
Jay Berkenbilt
12ecd2019a Add QPDFObjectHandle::setFilterOnWrite 2020-12-28 12:58:19 -05:00
Jay Berkenbilt
3f9191a344 Add ostream << for QPDFObjGen 2020-12-28 12:58:19 -05:00
Jay Berkenbilt
858c7b89bc Let optimize filter stream parameters instead of making them direct
Also removes preclusion of stream references in stream parameters of
filterable streams and reduces write times by about 8% by eliminating
an extra traversal of the objects.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
1a62cce940 Restructure optimize to allow skipping parameters of filtered streams 2020-12-28 12:58:19 -05:00
Jay Berkenbilt
09027344b9 Refactor: separate code that determines whether to filter a stream 2020-12-28 12:58:19 -05:00
Jay Berkenbilt
39bfa01307 Implement user-provided stream filters
Refactor QPDF_Stream to use stream filter classes to handle supported
stream filters as well.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt
cc8895078a Add QPDFObjectHandle::makeDirect(bool allow_streams) 2020-12-26 08:48:18 -05:00
Jay Berkenbilt
573b6eb8b1 Provide qpdf write progress reporting from C API (fixes #487) 2020-12-20 14:43:24 -05:00
Jay Berkenbilt
2050977099 Add QPDFObjectHandle manipulation to C API 2020-11-28 19:48:07 -05:00
Jay Berkenbilt
78b9d6bfd4 Prepare 10.0.4 release 2020-11-21 13:50:02 -05:00
Jay Berkenbilt
bd79138c84 Treat direct page as runtime rather than logic error (fuzz issue 27393) 2020-11-11 09:50:43 -05:00
Jay Berkenbilt
47f4ebcdac Ignore unused field in xref entry, avoiding range error (fixes #482) 2020-11-04 07:46:46 -05:00
Jay Berkenbilt
fbe40b800d Prepare 10.0.3 release 2020-10-31 13:47:03 -04:00
Jay Berkenbilt
6971f78ff6 Fix stack overflow on direct root (fuzz issue 26761) 2020-10-31 13:10:39 -04:00
Jay Berkenbilt
ffe6af6f77 Add comments explaining the foreign object copying code
These are the comments I would have liked to have been able to read
while fixing #449 and #478.
2020-10-31 12:14:26 -04:00
Jay Berkenbilt
96767fb104 Fix foreign stream copying bug (fixes #478)
This reverts an incorrect fix to #449 and codes it properly. The real
problem was that we were looking at the local dictionaries rather than
the foreign dictionaries when saving the foreign stream data. In the
case of direct objects, these happened to be the same, but in the case
of indirect objects, the object references could be pointing anywhere
since object numbers don't match up between the old and new files.
2020-10-31 12:14:26 -04:00
Jay Berkenbilt
da7540794a Prepare 10.0.2 release 2020-10-27 11:57:48 -04:00
Jay Berkenbilt
09bd1fafb1 Improve efficiency of number to string conversion 2020-10-27 11:57:48 -04:00
Jay Berkenbilt
bcea54fcaa Revert removal of unreadCh change for performance
Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR).
Update comments and code to reflect this.
2020-10-27 11:57:48 -04:00
Jay Berkenbilt
b30deaeeab Avoid merging adjacent tokens when concatenating contents (fixes #444) 2020-10-23 08:00:04 -04:00
Jay Berkenbilt
8a11feacc3 Avoid leak by resolving object streams more than once (fuzz issue 23642) 2020-10-22 15:39:36 -04:00
Jay Berkenbilt
30bb4c64ee Minor code cleanup
* Return rather than exiting from realmain in qpdf.cc
* Remove extraneous blank line
* Don't assign temporary to const reference
2020-10-22 15:39:36 -04:00
Jay Berkenbilt
232f5fc9f3 Handle jpeg library fuzz false positives
The jpeg library has some assembly code that is missed by the compiler
instrumentation used by memory sanitization. There is a runtime
environment variable that is used to work around this issue.
2020-10-22 06:31:52 -04:00
Jay Berkenbilt
c1684eae91 Check for overflow in page labels (fuzz issue 23599) 2020-10-22 05:49:24 -04:00
Jay Berkenbilt
7f4a4df919 Add range_check method to QIntC 2020-10-22 05:48:40 -04:00
Jay Berkenbilt
24196c08cb Fix loop detection error (fuzz issue 23172) 2020-10-22 05:48:35 -04:00
Jay Berkenbilt
956c8f6432 Obscure bug fix copying foreign streams in special cases (fixes #449)
Specifically, if a stream had its stream data replaced and had
indirect /Filter or /DecodeParms, it would result in non-silent loss
of data and/or internal error.
2020-10-21 19:23:23 -04:00
Jay Berkenbilt
98f6c00dad Protect numeric conversion against user's locale (fixes #459) 2020-10-21 16:42:51 -04:00
Jay Berkenbilt
bed165c9fc Stop using InputSource::unreadCh 2020-10-18 07:43:05 -04:00
Dean Scarff
153060a0c5 Check integer overflow in resolveObjectsInStream
Fixes a crash found by fuzzing.
2020-10-16 20:09:24 -04:00
Dean Scarff
9a3791c53b Properly detect OPENSSL_IS_BORINGSSL
OPENSSL_IS_BORINGSSL is not actually set by configure, so it will be
undefined until a BoringSSL header is included.  Hence the #ifdef logic
in QPDFCrypto_openssl.h would usually never apply.

This still worked because evp.h transitively included BoringSSL's
cipher.h and digest.h, but the latter are the correct (documented)
headers.

By re-ordering the includes, we can ensure the macro is defined when we
use it.

Also: fix case in the header guards.
2020-10-16 20:04:36 -04:00
Dean Scarff
2ff84aa2c9 Include detailed OpenSSL error messages
Fixes qpdf/qpdf#450
2020-10-16 19:58:11 -04:00
James R. Barlow
3fc7c99d02 Replace memchr with manual memory search
On large files with predominantly \n line endings, memchr(..'\r'..)
seems to waste a considerable amount of time searching for a line
ending candidate that we don't need.

On the Adobe PDF Reference Manual 1.7, this commit is 8x faster at
QPDF::processMemoryFile().
2020-10-16 19:57:29 -04:00
oltolm
3221022fc9 fix WindowsCryptProvider fixes #432 2020-10-16 19:56:33 -04:00
Jay Berkenbilt
ff65e272a8 Fix printf formatting for newer msvc
Use autoconf rather than ifdefs to determine what format string to use
for long long.
2020-10-16 07:02:23 -04:00
Jay Berkenbilt
88b8f8ec86 Remove redundant check found by lgtm.com 2020-10-15 14:47:43 -04:00
Jay Berkenbilt
26514ab731 Write linearization errors to stderr (fixes #438) 2020-04-29 17:33:34 -04:00
Jay Berkenbilt
92d3cbecd4 Fix warnings reported by -Wshadow=local (fixes #431) 2020-04-16 12:41:43 -04:00
Jay Berkenbilt
578c5ac66c Use more references when iterating
When possible, use `for (auto&` or `for (auto const&` when iterating
using C++-11 style iterators.
2020-04-10 13:30:33 -04:00
Jay Berkenbilt
821a701851 Prepare 10.0.1 release 2020-04-09 11:48:26 -04:00
Jay Berkenbilt
1a7d3700a6 Fix unnecessary copies in auto iter (fixes #426)
Also switch to colon-style iteration in some cases. Thanks to Dean
Scarff for drawing this to my attention after detecting some
unnecessary copies with
https://clang.llvm.org/extra/clang-tidy/checks/performance-for-range-copy.html
2020-04-08 20:45:26 -04:00
Jay Berkenbilt
4977a7efa5 Bug fix: getStreamData should on unfilterable stream (fixes #425) 2020-04-08 18:52:04 -04:00
Jay Berkenbilt
1e629c278a Prepare 10.0.0 release 2020-04-06 11:30:15 -04:00
Jay Berkenbilt
c996f4ac33 Don't include <cwchar> if not building with wchar 2020-04-06 11:23:02 -04:00
Jay Berkenbilt
77198d5310 Delegate random number generation to crypto provider (fixes #418) 2020-04-06 11:23:02 -04:00
Jay Berkenbilt
52749b85df Make random data provider code thread-safe
This uses C++-11 thread-safe static initializers now.
2020-04-06 10:00:43 -04:00
Jay Berkenbilt
619d294e9d Remove QUtil::srandom 2020-04-06 09:49:02 -04:00
Dean Scarff
0f2507234f Add OpenSSL/BoringSSL crypto provider
Fixes qpdf/qpdf#417
2020-04-06 09:01:55 -04:00
Jay Berkenbilt
893d38b87e Allow propagation of errors and retry through StreamDataProvider
StreamDataProvider::provideStreamData now has a rich enough API for it
to effectively proxy to pipeStreamData.
2020-04-05 20:07:13 -04:00
Jay Berkenbilt
7246404177 JSON: implement pattern keys in schema 2020-04-04 18:06:32 -04:00
Dean Scarff
c5c1a028cd Use deterministic assignments for unique_id
Fixes qpdf/qpdf#419
2020-04-04 08:29:28 -04:00
Jay Berkenbilt
2100b4ce15 Allow qpdf to be built on systems without wchar_t (fixes #406) 2020-04-03 21:39:44 -04:00
Jay Berkenbilt
6a4117add9 Avoid potential segfault in warning methods 2020-04-03 21:39:20 -04:00
Jay Berkenbilt
4f3b89991b placeFormXObject: allow control of shrink/expand (fixes #409) 2020-04-03 21:39:17 -04:00
Jay Berkenbilt
b76b73b229 C API: accept any non-zero value as TRUE 2020-04-03 17:33:44 -04:00
Jay Berkenbilt
54726930df Remove redundant methods in QUtil
This was being saved until we had to break ABI.
2020-04-03 12:17:57 -04:00
Jay Berkenbilt
5806e5c60c QPDFPageObjectHelper::placeFormXObject: use std::string const& (fixes #374) 2020-04-03 12:17:57 -04:00
Jay Berkenbilt
97de12343b Performance: remove Members indirection for Pipeline 2020-04-03 12:17:57 -04:00
Jay Berkenbilt
bfda941519 Use an unordered map for SparseOHArray for efficiency
This was added in C++11.
2020-04-03 12:16:24 -04:00
Jay Berkenbilt
ee271fd2f2 Use auto for iterating over sparse array 2020-04-03 12:16:24 -04:00
Jay Berkenbilt
70665cb381 Internally use unsafeShallowCopy where we can 2020-04-03 12:16:24 -04:00
Jay Berkenbilt
38afdcea7b Add QPDFObjectHandle::unsafeShallowCopy 2020-04-03 12:16:24 -04:00
Jay Berkenbilt
07afb668b1 Performance: remove indirection through Members for QPDFObject 2020-04-03 12:16:24 -04:00
Jay Berkenbilt
89f19b7099 Performance: remove Members indirection for QPDFObjectHandle 2020-04-03 12:16:24 -04:00
Jay Berkenbilt
dac65a21fb Look in form XObjects when removing unreferenced resources (fixes #373)
If a page contains a form XObject, also filter the form XObject and
remove its unreferenced resources.
2020-03-31 17:39:20 -04:00
Jay Berkenbilt
278710fbe8 Refactor QPDFPageObjectHelper::removeUnreferencedResources()
Refactor removeUnreferencedResources to prepare for filtering form
XObjects.
2020-03-31 17:39:20 -04:00
Jay Berkenbilt
bb6768b8f0 Include header for wcslen (fixes #405) 2020-02-29 08:43:33 -05:00
Jay Berkenbilt
bb3137296d Handle root /Pages pointing to other than page tree root (fixes #398) 2020-02-22 11:10:31 -05:00
Jay Berkenbilt
52a2e95dd5 Prepare 9.1.1 release 2020-01-26 18:49:04 -05:00
Jay Berkenbilt
57c01ef81f In qdf mode, don't write extra XRef streams (fixes #386)
fix-qdf assumes there is exactly one XRef stream and that it is at the
end of the file.
2020-01-26 16:50:57 -05:00
Jay Berkenbilt
bbc2f8ffae Bug fix: handle ColorSpace lookup for inline images (fixes #392)
If the value of /CS in the inline image dictionary was is key in the
page's /Resource -> /ColorSpace dictionary, properly resolve it by
referencing the proper colorspace, and not just the name, in the
external image dictionary.
2020-01-26 15:29:10 -05:00
Cloudmersive
a8b6ff5763 Fix for Windows unable to acquire crypt context with new keyset (fixes #387)
Fix is based on guidance
https://support.microsoft.com/en-us/help/238187/cryptacquirecontext-use-and-troubleshooting
and is the proper fix for #285/#286
2020-01-14 18:45:54 -05:00
Jay Berkenbilt
a44b5a34a0 Pull wmain -> main code from qpdf.cc into QUtil.cc 2020-01-14 11:40:51 -05:00
Jay Berkenbilt
ab4061f1ee Add error detection for read_lines_from_file(FILE*) 2020-01-14 11:07:09 -05:00
Jay Berkenbilt
211a7f57be QUtil::read_lines_from_file: optional EOL preservation 2020-01-13 11:26:18 -05:00
Jay Berkenbilt
9a398504ca Refactor QUtil::read_lines_from_file
This commit adds the preserve_eol flags but doesn't implement EOL
preservation yet.
2020-01-13 09:19:53 -05:00
Jay Berkenbilt
9b0c6022d7 Prepare 9.1.0 release 2019-11-16 22:29:54 -05:00
Jay Berkenbilt
5e6dfc938e Prepare 9.1.rc1 release 2019-11-09 22:00:53 -05:00
Jay Berkenbilt
c4478e5249 Allow odd/even modifiers in numeric range (fixes #364) 2019-11-09 13:23:12 -05:00
Jay Berkenbilt
5508f74603 Allow /P in encryption dictionary to be positive (fixes #382)
Even though this is disallowed by the spec, files like this have been
encountered in the wild.
2019-11-09 12:33:15 -05:00
Jay Berkenbilt
127a957aee Allow runtime inspection/override of crypto provider 2019-11-09 09:53:42 -05:00
Jay Berkenbilt
88bedb41fe Implement gnutls crypto provider (fixes #218)
Thanks to Zdenek Dohnal <zdohnal@redhat.com> for contributing the code
used for the gnutls crypto provider.
2019-11-09 09:53:38 -05:00
Jay Berkenbilt
cc14523440 Update autoconf to support crypto selection 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
d0a53cd3ea Fix typos in configure.ac 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
c03ced09c0 Isolate source files used for native crypto 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
d1ffe46c04 AES_PDF: move CBC logic from pipeline to AES_PDF implementation 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
c8cda4f965 AES_PDF: switch to pluggable crypto 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
bb427bd117 SHA2: switch to pluggable crypto 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
eadc222ff9 Rename SHA2 implementation (non-bisectable) 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
4287fcc002 RC4: switch to pluggable crypto 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
0cdcd10228 Rename RC4 implementation (non-bisectable) 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
ce8f9b6608 MD5: switch to pluggable crypto 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
5c3e856e9f Rename MD5 implementation (non-bisectable)
Just rename MD5 -> MD5_native in place so that git annotate will show
the lines as having originated there.
2019-11-09 08:18:02 -05:00
Jay Berkenbilt
2de41856a0 QPDFCryptoProvider: initial implementation 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
700f5b961e Remove int type checks -- subsumed by C++-11 2019-11-09 08:18:02 -05:00
Jay Berkenbilt
653ce3550d Require C++-11
Includes updates to m4/ax_cxx_compile_stdcxx.m4 to make it work with
msvc, which supports C++-11 with no flags but doesn't set __cplusplus
to a recent value.
2019-11-09 08:18:02 -05:00
Jay Berkenbilt
9094fb1f8e Fix two additional fuzz test cases 2019-11-03 18:59:12 -05:00
Masamichi Hosoda
5a842792b6 Parse Contents in signature dictionary without encryption
Various PDF digital signing tools do not encrypt /Contents value in
signature dictionary. Adobe Acrobat Reader DC can handle a PDF with
the /Contents value not encrypted.

Write Contents in signature dictionary without encryption

Tests ensure that string /Contents are not handled specially when not
found in sig dicts.
2019-10-22 16:20:21 -04:00
Masamichi Hosoda
cdc46d78f4 Add QPDFObject::getParsedOffset() 2019-10-22 16:19:06 -04:00
Masamichi Hosoda
50b329ee9f Add QPDFWriter::getWrittenXRefTable() 2019-10-22 16:16:16 -04:00
Masamichi Hosoda
5cf4090aee Add QPDFWriter::getRenumberedObjGen() 2019-10-22 16:16:16 -04:00
Masamichi Hosoda
46ac3e21b3 Add QPDF::getXRefTable() 2019-10-22 16:16:16 -04:00
Masamichi Hosoda
06b818dcd3 Exclude signature dictionary from compressible objects
It seems better not to compress signature dictionaries. Various PDF
digital signing tools, including Adobe Acrobat Reader DC, do not
compress signature dictionaries.

Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference
describes that /ByteRange in the signature dictionary shall be used to
describe a digest that does not include the signature value
(/Contents) itself.

The byte ranges cannot be determined if the dictionary is compressed.
2019-10-22 16:16:16 -04:00
Masamichi Hosoda
5e0ba12687 Fix /Contents value representation in a signature dictionary
Table 8.93 "Entries in a signature dictionary" in PDF 1.5 reference
describes that the value of Contents entry is a hexadecimal string
representation when ByteRange is specified.

This commit makes QPDF always uses hexadecimal strings representation
instead of literal strings for it.
2019-10-22 16:16:16 -04:00
Jay Berkenbilt
3094955dee Prepare 9.0.2 release 2019-10-12 19:37:40 -04:00
Jay Berkenbilt
4ea940b03c Prepare 9.0.1 release 2019-09-20 07:38:18 -04:00
Jay Berkenbilt
685250d7d6 Correct reversed Rectangle coordinates (fixes #363) 2019-09-19 21:25:34 -04:00
Jay Berkenbilt
48b7de2cc3 Fix typo in comment 2019-09-19 21:04:32 -04:00
Jay Berkenbilt
8b1e307741 Warn for duplicated dictionary keys (fixes #345) 2019-09-19 20:22:34 -04:00
Jay Berkenbilt
bb83e65193 Fix fuzz issue 16953 (overflow checking in xref stream index) 2019-09-17 19:48:47 -04:00
Jay Berkenbilt
17d431dfd5 Fix integer type warnings for big-endian systems 2019-09-17 19:14:27 -04:00
Jay Berkenbilt
5462dfce31 Prepare 9.0.0 release 2019-08-31 20:07:36 -04:00
Jay Berkenbilt
babd12c9b2 Add methods QPDF::anyWarnings and QPDF::closeInputSource 2019-08-31 15:51:20 -04:00
Jay Berkenbilt
4fa7b1eb60 Add remove_file and rename_file to QUtil 2019-08-31 15:51:04 -04:00
Jay Berkenbilt
0e51a9aca6 Don't encrypt trailer, fixes fuzz issue 15983
Ordinarily the trailer doesn't contain any strings, so this is usually
a non-issue, but if the trailer contains strings, linearizing and
encrypting with object streams would include encrypted strings in the
trailer, which would blow out the padding because encrypted strings
are longer than their cleartext counterparts.
2019-08-28 23:06:32 -04:00
Jay Berkenbilt
47a38a942d Detect stream in object stream, fixing fuzz 16214
It's detected in QPDFWriter instead of at parse time because I can't
figure out how to construct a test case in a reasonable time. This
commit moves the fuzz file into the regular test suite for a QTC
coverage case.
2019-08-28 12:49:04 -04:00
Jay Berkenbilt
ba5fb69164 Make popping pipeline stack safer
Use destructors to pop the pipeline stack, and ensure that code that
pops the stack is actually popping the intended thing.
2019-08-27 22:27:47 -04:00
Jay Berkenbilt
dadf8307c8 Fix fuzz issues 15316 and 15390 2019-08-27 20:39:06 -04:00
Jay Berkenbilt
456c285b02 Fix fuzz issue 16172 (overflow checking in OffsetInputSource) 2019-08-27 13:08:07 -04:00
Jay Berkenbilt
ad8081daf5 Fix fuzz issue 15442 (overflow checking in BufferInputSource) 2019-08-27 11:26:25 -04:00
Jay Berkenbilt
9a095c5c76 Seek in two stages to avoid overflow
When seeing to a position based on a value read from the input, we are
prone to integer overflow (fuzz issue 15442). Seek in two stages to
move the overflow check into the input source code.
2019-08-27 11:26:25 -04:00
Jay Berkenbilt
ac5e6de2e8 Fix fuzz issue 15387 (overflow checking xref size) 2019-08-27 11:26:25 -04:00
Jay Berkenbilt
6bc4cc3d48 Fix fuzz issue 15475 2019-08-25 22:52:25 -04:00
Jay Berkenbilt
94e86e2528 Fix fuzz issue 16301 2019-08-25 22:52:25 -04:00
Jay Berkenbilt
5da146c8b5 Track separately whether password was user/owner (fixes #159) 2019-08-24 11:01:19 -04:00
Jay Berkenbilt
5a0aef55a0 Split long line 2019-08-24 10:58:51 -04:00
Jay Berkenbilt
2794bfb1a6 Add flags to control zlib compression level (fixes #113) 2019-08-23 20:34:21 -04:00
Jay Berkenbilt
dac0598b94 Add ability to set zlib compression level globally 2019-08-23 20:34:21 -04:00
Jay Berkenbilt
3f1ab64066 Pass offset and length to ParserCallbacks::handleObject 2019-08-22 22:54:29 -04:00
Jay Berkenbilt
4b2e72c4cd Test for direct, rather than resolved nulls in parser
Just because we know an indirect reference is null, doesn't mean we
shouldn't keep it indirect.
2019-08-22 17:55:16 -04:00
Jay Berkenbilt
3f3dbe22ea Remove array null flattening
For some reason, qpdf from the beginning was replacing indirect
references to null with literal null in arrays even after removing the
old behavior of flattening scalar references. This seems like a bad
idea.
2019-08-22 17:55:16 -04:00
Jay Berkenbilt
225cd9dac2 Protect against coding error of re-entrant parsing 2019-08-22 17:55:16 -04:00
Jay Berkenbilt
ae5bd7102d Accept extraneous space before xref (fixes #341) 2019-08-19 22:24:53 -04:00
Jay Berkenbilt
8a9086a689 Accept extraneous space after stream keyword (fixes #329) 2019-08-19 21:43:44 -04:00
Jay Berkenbilt
43f91f58b8 Improve invalid name token warning message
This message used to only appear for PDF >= 1.2. The invalid name is
valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer
version, it's better to detect and warn in all cases. Therefore make
the warning more informative.
2019-08-19 19:48:27 -04:00
Jay Berkenbilt
42d396f1dd Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332) 2019-08-19 19:48:27 -04:00
Jay Berkenbilt
d9dd99eca3 Attempt to repair /Type key in pages nodes (fixes #349) 2019-08-18 18:54:37 -04:00
Jay Berkenbilt
522d2b2227 Improve efficiency of fixDanglingReferences 2019-08-18 09:00:40 -04:00
Jay Berkenbilt
5187a3ec85 Shallow copy arrays without removing sparseness 2019-08-17 23:02:41 -04:00
Jay Berkenbilt
bf7c6a8070 Use SparseOHArray in parsing 2019-08-17 23:02:41 -04:00
Jay Berkenbilt
e5f504b6c5 Use SparseOHArray in QPDF_Array 2019-08-17 23:02:41 -04:00
Jay Berkenbilt
a89d8a0677 Refactor QPDF_Array in preparation for using SparseOHArray 2019-08-17 23:02:41 -04:00
Jay Berkenbilt
e83f3308fb SparseOHArray 2019-08-17 23:02:41 -04:00
Thorsten Schöning
8f06da7534 Change list to vector for outline helpers (fixes #297)
This change works around STL problems with Embarcadero C++ Builder
version 10.2, but std::vector is more common than std::list in qpdf,
and this is a relatively new API, so an API change is tolerable.

Thanks to Thorsten Schöning <6223655+ams-tschoening@users.noreply.github.com>
for the fix.
2019-07-03 20:08:47 -04:00
Jay Berkenbilt
4db1de97ce Convert some cases of logic_error to runtime_error
There were a few cases that could be caused by invalid input rather
than bugs in the code which were throwing logic_error instead of
runtime_error.
2019-06-25 12:43:06 -04:00
Jay Berkenbilt
201e8798d7 Convert previously overlooked static cast to QIntC 2019-06-25 12:43:06 -04:00
Jay Berkenbilt
04f45cf652 Treat all linearization errors as warnings
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
2019-06-23 13:45:45 -04:00
Jay Berkenbilt
c5ed1b8075 Handle invalid encryption Length (fixes #333) 2019-06-22 20:57:33 -04:00
Jay Berkenbilt
551dfbf697 Allow set*EncryptionParameters before filename iset (fixes #336) 2019-06-22 20:57:33 -04:00
Jay Berkenbilt
7bd38a3eb3 Provide error message in Windows crypto code (fixes #286)
Thanks to github user zdenop for supplying some additional
error-handling code.
2019-06-22 17:12:01 -04:00
Jay Berkenbilt
6c39aa8763 In shippable code, favor smart pointers (fixes #235)
Use PointerHolder in several places where manually memory allocation
and deallocation were being used. This helps to protect against memory
leaks when exceptions are thrown in surprising places.
2019-06-22 16:57:52 -04:00
Jay Berkenbilt
85a3f95a89 qpdf: exit 3 for linearization warnings without errors (fixes #50) 2019-06-22 16:57:51 -04:00
Jay Berkenbilt
1bde5c68a3 Add QUtil::read_file_into_memory
This code was essentially duplicated between test_driver and
standalone_fuzz_target_runner.
2019-06-22 10:14:25 -04:00
Jay Berkenbilt
658b5bb3be QPDFWriter: clean up overloaded functions
In a small number of cases, it makes sense to replace an overloaded
function with a function that takes a default argument. We can do this
now because we've already broken binary compatibility since the last
release.
2019-06-22 10:13:27 -04:00
Jay Berkenbilt
79f6b4823b Convert remaining public classes to use Members pattern
Have classes contain only a single private member of type
PointerHolder<Members>. This makes it safe to change the structure of
the Members class without breaking binary compatibility. Many of the
classes already follow this pattern quite successfully. This brings in
the rest of the class that are part of the public API.
2019-06-22 10:13:27 -04:00
Jay Berkenbilt
45dac410b5 Remove broken QPDFTokenizer::expectInlineImage 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
25dd3c6750 Remove QPDF::copyForeignObject with unused parameter 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
c6cfd64503 Rename QUtil::strcasecmp to QUtil::str_compare_nocase (fixes #242) 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
848351f1fc Add missing #include <cstring> 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
b07ad6794e Fix bugs found by fuzz tests
* Several assertions in linearization were not always true; change
  them to run time errors
* Handle a few cases of uninitialized objects
* Handle pages with no contents when doing form operations
* Handle invalid page tree nodes when traversing pages
2019-06-21 17:56:24 -04:00
Jay Berkenbilt
a35d4ce9cc Fix bounds error in utf16_to_utf8 conversion 2019-06-21 17:40:24 -04:00
Jay Berkenbilt
63a643a3c7 Remove implicit conversion from int/pointer to bool
This fixes cases of warning C4800 from msvc
2019-06-21 13:17:21 -04:00
Jay Berkenbilt
d71f05ca07 Fix sign and conversion warnings (major)
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.

There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
2019-06-21 13:17:21 -04:00
Jay Berkenbilt
f40ffc9d63 Pl_Flate: constructor's out_bufsize is now unsigned int
This is the type we need for the underlying zlib implementation.
2019-06-21 13:17:21 -04:00
Jay Berkenbilt
da30764bce Change QPDFObjectHandle::pipeStreamData's encode_flags type
Change from unsigned long to int since we pass enumerated type values
to this field.
2019-06-21 13:17:21 -04:00
Jay Berkenbilt
3608afd5c5 Add new integer accessors to QPDFObjectHandle 2019-06-21 13:17:21 -04:00
Jay Berkenbilt
42306e2ff8 QUtil: add unsigned int/string functions 2019-06-21 13:17:21 -04:00
Jay Berkenbilt
2155815234 configure: determine wordsize automatically
Based on sizeof(size_t). Assumes 64 if not 32.
2019-06-21 13:17:21 -04:00
Jay Berkenbilt
713d961990 Appearance streams: some floating point values were truncated
Bounding box X coordinates could be truncated, causing them to be off
by a fraction of a point. This was most likely not visible, but it was
still wrong.
2019-06-20 21:32:30 -04:00
Jay Berkenbilt
eb7948876b Fix problems found in fuzz corpus 2019-06-15 17:24:24 -04:00
Jay Berkenbilt
cf469d7890 Give up reading objects with too many consecutive errors 2019-06-15 08:52:19 -04:00
Jay Berkenbilt
cd830968ef Eliminate one potential integer overflow
There are more to handle, but this resolves an issue already caught by
oss-fuzz.
2019-06-15 08:52:19 -04:00
Jay Berkenbilt
31bde2f9d7 Handle empty DecodeParams array for (fixes #331)
On read, ignore /DecodeParms when empty list; on write, delete it.
Some files have been found that include an empty list for
/DecodeParms, but this is not technically compliant with the spec, and
the only sensible interpretation is to treat it as if there are no
decode parameters.
2019-06-09 17:19:49 -04:00
Jay Berkenbilt
b1a78be1a8 Prepare 8.4.2 release 2019-05-18 08:56:37 -04:00
Jay Berkenbilt
b3f0dbff62 Fix Windows memory error (fixes #330) 2019-05-16 14:26:51 -04:00
Jay Berkenbilt
a323f6f49f Prepare 8.4.1 release 2019-04-27 20:44:20 -04:00
Jay Berkenbilt
81205e007b Spell check 2019-04-21 13:09:11 -04:00
Jay Berkenbilt
011695dfdf Support Unicode in filenames (fixes #298) 2019-04-20 21:00:43 -04:00