mirror of https://github.com/qpdf/qpdf.git synced 2025-02-01 19:38:25 +00:00

368 Commits

Author SHA1 Message Date
Jay Berkenbilt
8a9086a689 Accept extraneous space after stream keyword (fixes #329) 2019-08-19 21:43:44 -04:00
Jay Berkenbilt
43f91f58b8 Improve invalid name token warning message
This message used to only appear for PDF >= 1.2. The invalid name is
valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer
version, it's better to detect and warn in all cases. Therefore make
the warning more informative.
2019-08-19 19:48:27 -04:00
Jay Berkenbilt
42d396f1dd Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332) 2019-08-19 19:48:27 -04:00
Jay Berkenbilt
d9dd99eca3 Attempt to repair /Type key in pages nodes (fixes #349) 2019-08-18 18:54:37 -04:00
Jay Berkenbilt
04f45cf652 Treat all linearization errors as warnings
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
2019-06-23 13:45:45 -04:00
Jay Berkenbilt
c5ed1b8075 Handle invalid encryption Length (fixes #333) 2019-06-22 20:57:33 -04:00
Jay Berkenbilt
551dfbf697 Allow set*EncryptionParameters before filename is set (fixes #336) 2019-06-22 20:57:33 -04:00
Jay Berkenbilt
85a3f95a89 qpdf: exit 3 for linearization warnings without errors (fixes #50) 2019-06-22 16:57:51 -04:00
Jay Berkenbilt
45dac410b5 Remove broken QPDFTokenizer::expectInlineImage 2019-06-21 22:29:31 -04:00
Jay Berkenbilt
b07ad6794e Fix bugs found by fuzz tests
* Several assertions in linearization were not always true; change
  them to run time errors
* Handle a few cases of uninitialized objects
* Handle pages with no contents when doing form operations
* Handle invalid page tree nodes when traversing pages
2019-06-21 17:56:24 -04:00
Jay Berkenbilt
ed7f2a6c76 Add smaller image streams file for testing 2019-06-21 17:39:53 -04:00
Jay Berkenbilt
3608afd5c5 Add new integer accessors to QPDFObjectHandle 2019-06-21 13:17:21 -04:00
Jay Berkenbilt
713d961990 Appearance streams: some floating point values were truncated
Bounding box X coordinates could be truncated, causing them to be off
by a fraction of a point. This was most likely not visible, but it was
still wrong.
2019-06-20 21:32:30 -04:00
Jay Berkenbilt
bcfa407912 As a test suite, run stand-alone fuzzer on seed corpus
Temporarily skip fuzz tests on Windows. There are Windows-specific
failures to address later.
2019-06-15 17:24:24 -04:00
Jay Berkenbilt
320702c086 Add test files from oss-fuzz bugs (fixes #335) 2019-06-15 17:24:24 -04:00
Jay Berkenbilt
cf469d7890 Give up reading objects with too many consecutive errors 2019-06-15 08:52:19 -04:00
Jay Berkenbilt
3a180a0591 Commit forgotten test files 2019-06-09 18:11:37 -04:00
Jay Berkenbilt
31bde2f9d7 Handle empty /DecodeParms array (fixes #331)
On read, ignore /DecodeParms when empty list; on write, delete it.
Some files have been found that include an empty list for
/DecodeParms, but this is not technically compliant with the spec, and
the only sensible interpretation is to treat it as if there are no
decode parameters.
2019-06-09 17:19:49 -04:00
Jay Berkenbilt
03e27709f3 Improve Unicode filename testing
Remove dependency on the behavior of perl for reliable creation of
Unicode file names on Windows.
2019-04-27 20:37:33 -04:00
Jay Berkenbilt
7ff234a92f Remove stray comment 2019-04-27 20:37:33 -04:00
Jay Berkenbilt
12b159118a Compare versions between CLI and library 2019-04-20 21:00:43 -04:00
Jay Berkenbilt
2b011f9d81 Add --remove-page-labels option (fixes #317) 2019-04-20 21:00:43 -04:00
Jay Berkenbilt
e50d5201df Add --keep-files-open-threshold (fixes #288) 2019-04-20 21:00:43 -04:00
Jay Berkenbilt
011695dfdf Support Unicode in filenames (fixes #298) 2019-04-20 21:00:43 -04:00
Jay Berkenbilt
a5a016cdd2 Revert preservation of outlines with --split-pages
The preservation of outlines didn't provide very useful behavior
anyway as it copied all outlines but most didn't work. This
implementation also caused a very significant performance hit and so
is being reverted until a proper solution can be coded. The eventual
solution will not be compatible with the reverted solution anyway, so
it's best not to leave this in.
2019-04-20 21:00:43 -04:00
Thorsten Schöning
af42fe9daf Don't open more than 50 files.
Embarcadero C++Builder doesn't support more than 50 files open at the same time for legacy 32-bit apps, which makes a test that tries to open more than that many files fail. This change lowers the number of open files for that test far enough for it to succeed. Alternatively, one could reduce the hard-coded limit of 200 in QPDF itself, which I didn't do for now because it requires adapting the manuals etc. and needs to be discussed with the author of QPDF. I guess the chances are better of getting the test changed upstream.

This fixes #288: https://github.com/qpdf/qpdf/issues/288
2019-03-11 17:14:22 -04:00
Thorsten Schöning
27f18e0f67 The kfo-PDF files for testing need to be copied using "binmode" or Windows will introduce \r\n.
qpdf: selecting --keep-open-files=n
qpdf: processing 001-kfo.pdf
WARNING: 001-kfo.pdf: file is damaged
WARNING: 001-kfo.pdf (offset 556): xref not found
WARNING: 001-kfo.pdf: Attempting to reconstruct cross-reference table
2019-02-14 18:54:38 +01:00
Jay Berkenbilt
fc2e491f74 Add test for exception handling
There have been issues reported where exceptions are not thrown
properly across shared library/DLL boundaries, so add a test
specifically to ensure that exceptions are caught as thrown.
2019-02-07 19:21:26 -05:00
Jay Berkenbilt
8acf636b4e Incorporate improved Windows fragility workaround from qtest 2019-02-01 22:25:25 -05:00
Jay Berkenbilt
1fba24aada Add another test case for weird page trees 2019-01-31 21:29:28 -05:00
Jay Berkenbilt
0a470d2daf Don't optimize non-8-bit images
Also add test cases for additional coverage on image optimization.
2019-01-31 21:29:28 -05:00
Jay Berkenbilt
eb49e07c0a Make inline image token exactly contain the image data
Do not include the trailing EI, and handle cases where EI is not
preceded by a delimiter. Such cases have been seen in the wild.
2019-01-31 20:28:44 -05:00
Jay Berkenbilt
5211bcb5ea Externalize inline images (fixes #278) 2019-01-31 10:38:13 -05:00
Jay Berkenbilt
22bcdbe786 Remove acroread from tests
This hasn't worked or been exercised in years since Adobe stopped
releasing a Linux version of reader.
2019-01-31 10:38:13 -05:00
Jay Berkenbilt
1eb35a355f Exclude space after ID in image data 2019-01-31 10:38:10 -05:00
Jay Berkenbilt
2b6c79bcae Improve locating inline image's EI
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
ec9e310c9e Refactor QPDFTokenizer's inline image handling
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
31372edce0 Inline image token value ends with EI, not delimiter
The inline image token erroneously included the delimiter that
followed EI. The ObjectHandle created from it was correct.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt
c136356378 Typo in message 2019-01-31 09:26:37 -05:00
Jay Berkenbilt
8a9cfd2605 Handle direct page objects (fixes #164) 2019-01-29 17:01:36 -05:00
Jay Berkenbilt
2712869cf9 Fix logic for when to compress object and xref streams (fixes #271) 2019-01-28 21:43:06 -05:00
Jay Berkenbilt
52f9d326a5 Resolve duplicated page objects (fixes #268)
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
2019-01-28 20:29:58 -05:00
Jay Berkenbilt
426434c772 Add --overlay and --underlay to qpdf CLI (fixes #207) 2019-01-27 09:30:13 -05:00
Jay Berkenbilt
c2ae35540e Add boundary condition test for getUniqueResourceName 2019-01-27 09:26:33 -05:00
Jay Berkenbilt
623f5b664e Convert pages to form XObjects
Support conversion of pages to form XObjects and placement of form
XObjects on pages.
2019-01-27 07:50:30 -05:00
Jay Berkenbilt
009767d97a Handle inheritable page attributes
Add getAttribute for handling inheritable page attributes, and fix
getPageImages and annotation flattening code to use it.
2019-01-25 22:30:05 -05:00
Jay Berkenbilt
2d32f4db8f Handle fallback font size in text appearances
If we end up using our fallback font size when generating appearances
for text fields, reflect that in the Tf operator used in the
appearance stream.
2019-01-21 07:38:21 -05:00
Jay Berkenbilt
9cb599875b Improve text objects used in text appearance streams 2019-01-20 23:05:58 -05:00
Jay Berkenbilt
930eade6d3 Fix omissions in text appearance generation
When generating appearance streams for variable text annotations,
properly handle the cases of there being no appearance dictionary, no
appearance stream, or an appearance stream with no BMC..EMC marker.
2019-01-20 23:05:58 -05:00
Jay Berkenbilt
65ef0bf313 When flattening, remove annotations with no appearance stream
With the exception of form field annotations when /NeedAppearances is
true, remove annotations that don't have appearance streams when
flattening. There is no reason to keep these when flattening since
they are invisible. This may include unchecked checkboxes, unshown
popup windows, etc.
2019-01-20 23:05:58 -05:00