Jay Berkenbilt
fddbcab0e7
Mostly don't require original QPDF for copyForeignObject ( fixes #219 )
...
The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
fbbb0ee016
Make a static version of QPDF::pipeStreamData
...
This is in preparation for being able to pipe a stream's data without
keeping a copy of its containing qpdf object.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
7588cac295
Create an application-scope unique ID for each QPDF object
...
Use this instead of QPDF* as a map key for object_copiers.
2019-01-07 00:11:15 -05:00
Jay Berkenbilt
e27ac682e0
Move encryption parameters into a class
2019-01-06 09:58:16 -05:00
Jay Berkenbilt
a70fbaaf50
Honor other base encodings when generating appearances
2019-01-05 23:01:59 -05:00
Jay Berkenbilt
b341d742db
Add WinAnsi and MacRoman encoding
2019-01-05 23:01:44 -05:00
Jay Berkenbilt
3ef1b77304
Refactor QUtil::utf8_to_ascii
2019-01-05 22:59:29 -05:00
Jay Berkenbilt
089ce5902e
Move utf8_to_utf16 into QUtil
2019-01-05 22:59:27 -05:00
Jay Berkenbilt
ae18bfd142
Refactor string transcoding in QPDF_String
2019-01-05 22:56:58 -05:00
Jay Berkenbilt
2e342ee5bb
Spell check
2019-01-04 21:33:14 -05:00
Jay Berkenbilt
16fd6e64f9
Add QPDFWriter::getFinalVersion ( fixes #266 )
2019-01-04 12:37:22 -05:00
Jay Berkenbilt
837dcf8fc2
Don't call assert while checking linearization data ( fixes #209 , #231 )
...
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
2019-01-04 11:55:42 -05:00
Jay Berkenbilt
a01359189b
Fix dangling references ( fixes #240 )
...
On certain operations, such as iterating through all objects and
adding new indirect objects, walk through the entire object structure
and explicitly resolve any indirect references to non-existent
objects. That prevents new objects from springing into existence and
causing the previously dangling references to point to them.
2019-01-04 10:29:29 -05:00
Jay Berkenbilt
158156d506
Add basic appearance stream generation
2019-01-04 08:00:19 -05:00
Jay Berkenbilt
02281632cc
Add QUtil::utf8_to_ascii
2019-01-03 23:18:13 -05:00
Jay Berkenbilt
b55567a0fa
Add special case setV code for button fields
2019-01-03 23:18:13 -05:00
Jay Berkenbilt
e3144ac417
Add form fields to json output
...
Also add some additional methods for detecting form field types to
assist in the json creation and for later use.
2019-01-03 23:18:13 -05:00
Jay Berkenbilt
ca94ac68d9
Honor flags when flattening annotations
2019-01-03 11:59:55 -05:00
Jay Berkenbilt
06d6438ddf
Minor fixes
2019-01-03 09:17:43 -05:00
Jay Berkenbilt
3e74916c5a
Fix seg fault on empty xref stream ( fixes #263 )
...
Thanks to @p-cher for supplying a patch.
2019-01-03 09:17:43 -05:00
Jay Berkenbilt
f78ea057ca
Switch annotation flattening to use the form xobjects
...
Instead of directly putting the contents of the annotation appearance
streams into the page's content stream, add commands to render the
form xobjects directly. This is a more robust way to do it than the
original solution as it works properly with patterns and avoids
problems with resource name clashes between the pages and the form
xobjects.
2019-01-02 21:49:47 -05:00
Jay Berkenbilt
3b8ce4f12a
Annotation flattening including form fields
...
Flatten annotations by integrating their appearance streams into the
content stream of the containing page. In the case of form fields,
only flatten if /NeedAppearances is false (or equivalently absent). If
flattening form fields, also remove /AcroForm from the document
catalog.
2019-01-01 08:14:15 -05:00
Jay Berkenbilt
95d6b17a89
Add QPDFObjectHandle::mergeDictionary()
2019-01-01 08:12:56 -05:00
Jay Berkenbilt
104fd6da52
Add matrix and annotation appearance stream handling
...
Generate page content fragment for rendering appearance streams
including all matrix calculation.
2019-01-01 08:07:21 -05:00
Jay Berkenbilt
5059ec0d35
Add Matrix class under QPDFObjectHandle
2018-12-31 23:02:43 -05:00
Jay Berkenbilt
daeb5a85b6
Transformation matrix
2018-12-31 18:23:47 -05:00
Jay Berkenbilt
3440ea7d3c
JSON::serialize -> unparse
...
Unparse is admittedly strange, but I'd rather be strange and
consistent, and everything else in the qpdf library uses unparse to
serialize. (If you're reading this, the convention of using "unparse"
comes from the "CLU" programming language.)
2018-12-25 11:52:21 -05:00
Jay Berkenbilt
fa3664357b
Move numrange code from qpdf.cc to QUtil.cc
...
Also move tests to libtests.
2018-12-21 19:11:57 -05:00
Jay Berkenbilt
d5d179f441
Add document and object helpers for outlines (bookmarks)
2018-12-21 19:11:57 -05:00
Jay Berkenbilt
30a0c070e4
Add QPDFObjectHandle::getJSON()
2018-12-21 18:34:56 -05:00
Jay Berkenbilt
651179b5da
Add simple JSON serializer
2018-12-21 18:34:56 -05:00
Jay Berkenbilt
0776c00129
Add QPDFNameTreeObjectHelper
2018-12-21 18:34:56 -05:00
Jay Berkenbilt
cc500eda91
Minor cleanup
2018-12-21 17:25:31 -05:00
Jay Berkenbilt
6ef9e31233
Add QPDFPageLabelDocumentHelper
2018-12-18 16:59:24 -05:00
Jay Berkenbilt
f38df27aa3
Add QPDFNumberTreeObjectHelper
2018-12-18 16:46:10 -05:00
Jay Berkenbilt
077d3d4512
Add QPDFObjectHandle::wrapInArray()
...
Wrap an object in an array if it is not already an array.
2018-12-18 16:45:48 -05:00
Jay Berkenbilt
d1368a3851
Commit automatically generated files
2018-10-11 17:27:54 -04:00
Jay Berkenbilt
6ee761fc86
Prepare 8.2.1 release
2018-08-18 10:56:19 -04:00
Jay Berkenbilt
5e9e17e62a
Prepare 8.2.0 release
2018-08-16 11:53:10 -04:00
Jay Berkenbilt
693cdaac35
Missing header for std::max
2018-08-16 11:53:10 -04:00
Jay Berkenbilt
b4ce557be5
Fix error in QPDFSystemError.cc
2018-08-14 11:39:07 -04:00
Jay Berkenbilt
b4bdc42b4f
New exception class QPDFSystemError ( fixes #221 )
2018-08-13 20:01:51 -04:00
Jay Berkenbilt
5d9d80beba
Fix fallback logic for encryption ( fixes #229 )
2018-08-12 22:32:40 -04:00
Jay Berkenbilt
60fe8061cb
Fix one more identifier ( fixes #236 )
2018-08-12 22:01:51 -04:00
Jay Berkenbilt
a2f62935b3
Catch exceptions as const references ( fixes #236 )
...
This fix allows qpdf to compile/test cleanly with gcc 8.
2018-08-12 21:57:52 -04:00
Jay Berkenbilt
3d6615b276
Pl_Buffer: reduce memory growth ( fixes #228 )
...
Rather than keeping a list of buffers for every write, accumulate
bytes in a single buffer, doubling the size of the buffer when needed
to accommodate new data.
This is not the best possible implementation, but the change was
implemented in this way to avoid changing the shape of Pl_Buffer and
thus breaking backward compatibility.
2018-08-12 17:45:43 -04:00
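The growth strategy described in the Pl_Buffer commit above can be illustrated with a minimal sketch. The class name `GrowableBuffer`, its members, and the initial capacity of 16 are hypothetical and are not qpdf's actual Pl_Buffer code; the sketch only shows accumulating bytes in a single buffer and doubling its capacity when new data does not fit.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical illustration only; not the real Pl_Buffer implementation.
class GrowableBuffer
{
  public:
    // Append len bytes, doubling the capacity as often as needed.
    void write(unsigned char const* data, size_t len)
    {
        if (len == 0) {
            return;
        }
        if (size_ + len > capacity_) {
            size_t new_capacity = (capacity_ == 0) ? 16 : capacity_;
            while (new_capacity < size_ + len) {
                new_capacity *= 2;
            }
            std::unique_ptr<unsigned char[]> bigger(
                new unsigned char[new_capacity]);
            if (size_ > 0) {
                // Carry over the bytes accumulated so far.
                std::memcpy(bigger.get(), data_.get(), size_);
            }
            data_ = std::move(bigger);
            capacity_ = new_capacity;
        }
        std::memcpy(data_.get() + size_, data, len);
        size_ += len;
    }

    size_t size() const { return size_; }
    size_t capacity() const { return capacity_; }
    unsigned char const* data() const { return data_.get(); }

  private:
    std::unique_ptr<unsigned char[]> data_;
    size_t size_ = 0;
    size_t capacity_ = 0;
};
```

Doubling keeps the number of reallocations logarithmic in the total number of bytes written, at the cost of up to half the buffer being unused.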
Jay Berkenbilt
3873f5fd9b
Protect headers with compliant identifiers ( fixes #233 )
2018-08-12 14:10:32 -04:00
Jay Berkenbilt
932799baab
Fix memory access error
...
A previous fix introduced a potential memory overrun under certain
rare conditions. The test suite now once again passes with address
sanitizer.
2018-08-12 13:16:17 -04:00
Jay Berkenbilt
b6e414b10b
Remove some extraneous null pointer checks ( fixes #234 )
...
There were a few places in the code that were checking that a pointer
wasn't null before deleting it, even though C++ has always allowed
delete 0. Most of the code did not perform these checks.
2018-08-12 12:58:39 -04:00
Jay Berkenbilt
4a4736c695
Fix EOL handling inside strings ( fixes #226 )
...
CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is
to be ignored after backslash.
2018-08-05 20:48:35 -04:00
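The EOL rule stated in the commit above can be sketched as a small function. The names `eol_length` and `normalize_string_eols` are hypothetical, and this is not qpdf's tokenizer code: it handles only EOL normalization and backslash-EOL continuation, and copies every other escape sequence through unchanged.

```cpp
#include <cstddef>
#include <string>

// Length of the EOL sequence starting at s[i]: 2 for CRLF, 1 for a lone
// CR or LF, 0 if there is no EOL at that position.
static size_t eol_length(std::string const& s, size_t i)
{
    if (i >= s.size()) {
        return 0;
    }
    if (s[i] == '\r') {
        return (i + 1 < s.size() && s[i + 1] == '\n') ? 2 : 1;
    }
    if (s[i] == '\n') {
        return 1;
    }
    return 0;
}

// Hypothetical sketch: apply the EOL rule to the body of a literal string.
std::string normalize_string_eols(std::string const& in)
{
    std::string out;
    size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '\\') {
            size_t n = eol_length(in, i + 1);
            if (n > 0) {
                // Backslash + EOL: drop both, consuming exactly one EOL.
                i += 1 + n;
                continue;
            }
            // Any other escape: copy the backslash and the next character.
            out += in[i++];
            if (i < in.size()) {
                out += in[i++];
            }
            continue;
        }
        size_t n = eol_length(in, i);
        if (n > 0) {
            out += '\n'; // CR, CRLF, and LF all become LF
            i += n;
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

For example, a backslash followed by two LF characters yields a single LF, because only the first EOL after the backslash is ignored.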
Jay Berkenbilt
1619cad1e8
Return correct method for string encryption ( fixes #227 )
2018-08-05 16:58:21 -04:00
Jay Berkenbilt
e1cd5891af
Fix infinite loop on small files with progress reporting ( fixes #230 )
...
Turns out you can keep adding zero to a number over and over again and
it just doesn't get any bigger. Who would have known?
2018-08-05 15:43:34 -04:00
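The commit above does not show the affected code, but the class of bug it describes can be sketched as follows. The function name and the "one report per percent" step are assumptions made for illustration, not qpdf's progress-reporting logic.

```cpp
#include <cstddef>

// Hypothetical sketch: count how many progress reports a loop would emit
// for `total` units of work. If the step were taken as total / 100 without
// the guard below, any total under 100 would give a step of zero and the
// loop would never terminate.
size_t count_progress_reports(size_t total)
{
    size_t step = total / 100;
    if (step == 0) {
        step = 1; // guard: adding zero never reaches total
    }
    size_t reports = 0;
    for (size_t next = 0; next < total; next += step) {
        ++reports;
    }
    return reports;
}
```

With the guard in place, a five-unit job produces five reports instead of looping forever.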
Jay Berkenbilt
4f4c627b77
ClosedFileInputSource: add method to keep file open
...
During periods of intensive operation on a specific file, this method
can reduce the overhead of repeated open/close operations.
2018-08-04 19:52:46 -04:00
Jay Berkenbilt
1bd2a2e79b
Prepare 8.1.0 release
2018-06-23 07:50:11 -04:00
Jay Berkenbilt
3aad28aed0
Bug fix: honor encryption key length with R=3 ( fixes #212 )
2018-06-22 19:24:26 -04:00
Jay Berkenbilt
a433ed24f9
Add progress reporting for QPDFWriter ( fixes #200 )
2018-06-22 16:14:54 -04:00
Jay Berkenbilt
2a82f6e1e0
Add method to get count of objects in QPDF
2018-06-22 15:53:40 -04:00
Jay Berkenbilt
c81836076f
Correct incorrect comment
2018-06-22 13:13:09 -04:00
Jay Berkenbilt
4ccc8b1a44
Add ClosedFileInputSource
...
ClosedFileInputSource is an input source that keeps the file closed
when not reading it.
2018-06-22 12:52:45 -04:00
Jay Berkenbilt
c71dc6888c
Don't prune resource dictionaries on errors or by request
...
If we are unable to filter a page's content streams, don't attempt to
remove objects from the page's resource dictionary. Also provide a
command line option to suppress resource removal in case we ever need
this as a workaround for some bug or broken PDF files.
2018-06-22 10:45:31 -04:00
Jay Berkenbilt
38c9ed23c3
Treat content stream parsing errors as an error, not a warning
...
If parsing content streams is treated as a warning, there is no way
for a caller to know if a parsing operation has failed. This is very
dangerous and will likely result in data loss when token filters or
parser callbacks are in use.
2018-06-22 10:44:08 -04:00
Jay Berkenbilt
6c89d4b35b
When splitting files, remove unreferenced objects ( fixes #203 )
2018-06-21 21:03:30 -04:00
Jay Berkenbilt
ddd78c1b7f
Fix QPDFObjectHandle::shallowCopy
...
It's not really a shallow copy. It just doesn't cross indirect object
boundaries. The old implementation had a bug that would cause multiple
shallow copies of the same object to share memory, which was not the
intention.
2018-06-21 20:34:45 -04:00
Jay Berkenbilt
397b097c46
Allow setting a form field's value
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
952a665a4e
Better support for creating Unicode strings
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
e44c395c51
QUtil::toUTF16
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
0b05111db8
Implement helper class for interactive forms
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
2e7ee23bf6
Add QPDFPageDocumentHelper and QPDFPageObjectHelper
...
This is the beginning of higher-level API support using helper
classes. The goal is to be able to add more helpers without continuing
to pollute QPDF's and QPDFObjectHandle's public interfaces.
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
4cded10821
Add QPDFObjectHandle::Rectangle type
...
Provide a convenient way of accessing rectangles.
2018-06-21 15:57:13 -04:00
Jay Berkenbilt
078cf9bf90
newline before endstream fix for object streams ( fixes #205 )
2018-05-12 13:17:43 -04:00
Jay Berkenbilt
15ed9f8565
Fix small logic error in Token constructor ( fixes #206 )
...
The special case around name token was not reachable. This would only
affect constructors of name tokens that were represented in
non-canonical form such as with a hex substitution for a printable
character. The error was harmless but still a bug.
2018-05-05 17:47:56 -04:00
Jay Berkenbilt
b4d6cf6836
Limit depth of nesting in direct objects ( fixes #202 )
...
This fixes CVE-2018-9918.
2018-04-15 16:11:22 -04:00
Jay Berkenbilt
f8c8e4dcc0
Prepare 8.0.2 release
2018-03-06 11:34:07 -05:00
Jay Berkenbilt
e4e2e26d99
Properly handle pages with no contents ( fixes #194 )
...
Remove calls to assertPageObject(). All cases in the library that
called assertPageObject() work fine if you don't call
assertPageObject() because nothing assumes anything that was being
checked by that call. Removing the calls enables more files to be
successfully processed.
2018-03-06 11:34:07 -05:00
Jay Berkenbilt
1a4dcb4aaf
Pl_Buffer starts in a ready state
2018-03-06 11:31:03 -05:00
Jay Berkenbilt
ee44aef8d0
Treat loop in xref tables as damage ( fixes #192 )
...
Prior to this fix, if there was a loop detected in following /Prev
pointers in xref streams/tables, it would cause qpdf to lose data.
Note that this condition causes many PDF readers to hang or fail.
2018-03-05 14:26:58 -05:00
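Detecting a loop while following /Prev pointers, as described in the commit above, can be sketched with a set of visited offsets. The function name and the map-based model of the xref chain are hypothetical; qpdf's real code reads the offsets from the file, and what the caller does on detection (here, simply returning true so the file can be treated as damaged) is an assumption.

```cpp
#include <map>
#include <set>

// Hypothetical model: each xref section's offset maps to the offset named
// by its /Prev entry, with 0 meaning "no /Prev". Returns true if following
// /Prev from `start` ever revisits an offset.
bool prev_chain_has_loop(
    std::map<long long, long long> const& prev_of, long long start)
{
    std::set<long long> visited;
    long long offset = start;
    while (offset != 0) {
        if (!visited.insert(offset).second) {
            return true; // offset seen before: the chain loops
        }
        auto it = prev_of.find(offset);
        if (it == prev_of.end()) {
            break; // unknown section: stop following the chain
        }
        offset = it->second;
    }
    return false;
}
```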
Jay Berkenbilt
6fe1e9de40
Prepare 8.0.1 release
2018-03-04 07:16:20 -05:00
Jay Berkenbilt
7b9f23a99a
Ignore zlib data check errors ( fixes #191 )
2018-03-03 11:35:01 -05:00
Jay Berkenbilt
3e8b643ae3
Release 8.0.0
2018-02-25 16:00:11 -05:00
Jay Berkenbilt
111ec50950
8.0.rc3
2018-02-25 14:17:59 -05:00
Jay Berkenbilt
d3d3970cf6
8.0.rc2
2018-02-25 13:50:22 -05:00
Jay Berkenbilt
a16d703f4d
Update version to 8.0.rc1
...
This is for testing the release process, particularly as it pertains
to AppImage creation.
2018-02-25 09:03:27 -05:00
Jay Berkenbilt
82cae01a76
Bump version number and soname
...
Bump to an alpha release. This version is not being widely released
but is being used to push the new shared library version through the
debian packaging system and to test out github releases.
2018-02-20 21:31:38 -05:00
Jay Berkenbilt
4bb3046f0b
Properly handle strings with PDF Doc Encoding ( fixes #179 )
...
The QPDF_String::getUTF8Val() method was not treating strings that
weren't explicitly Unicode as PDF Doc Encoded. This only affects
characters in the range 0x80 through 0xa0.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
2780a1871d
Add C API for checking PDF files
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
d0e99f195a
More robust handling of type errors
...
Give objects descriptions and context so it is possible to issue
warnings instead of fatal errors for attempts to access objects of the
wrong type.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
c2e16827b6
Replace "file position" with "offset" in error messages
...
Sometimes it's an offset in an object stream or a content stream, so
file position is confusing in some cases.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
52e024f701
Include omitted object description in error message
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
cb3b705cf9
Include filename in object stream parse error
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
21b7481b0e
Push members of QPDFObjectHandle into a Members object
...
As in other cases, this is to enable adding new member variables in
the future without breaking ABI compatibility.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt
e410b0fe0d
Simplify TokenFilter interface
...
Expose Pl_QPDFTokenizer, and have it do more of the work of managing
the token filter's pipeline.
2018-02-18 21:05:47 -05:00
Jay Berkenbilt
1fdd86a049
Move Pl_QPDFTokenizer to public interface
2018-02-18 21:05:47 -05:00
Jay Berkenbilt
5708b5d0aa
Add additional interface for filtering page contents
2018-02-18 21:05:47 -05:00
Jay Berkenbilt
fd02944e19
Clean up comment
2018-02-18 21:05:47 -05:00
Jay Berkenbilt
5136238f2a
Detect and report bad tokens in content normalization
2018-02-18 21:05:47 -05:00
Jay Berkenbilt
9910104442
Implement TokenFilter and refactor Pl_QPDFTokenizer
...
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a
TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a
general filter that passes data through a TokenFilter.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
b8723e97f4
Add coalesce contents capability
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
25988e8d10
Bug fix: content normalizer should not add trailing newline
...
Adding a trailing newline in content normalization damages files whose
contents are split across streams in the middle of tokens. Let
QPDFWriter add the newline with the indicator to ignore the newline,
which it already does. This changes the way some qdf files look.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
fcd611b61e
Refactor parseContentStream
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
05ff619b09
Remove redundant method
...
Remove a redundant method that was equal to another one with
additional arguments. This breaks binary compatibility, but there are
other ABI breaking changes in the upcoming release, so now is the time
to do it.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
55ee55394c
Use inline image token in content parser
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
ba453ba4ff
Use space tokens in tokenizer filter
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
ec538792fa
Use inline image token type in tokenizer filter
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
fefe25030e
Inline image token type
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
2699ecf13e
Push QPDFTokenizer members into a nested structure
...
This is for protection against future ABI breaking changes.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt
d97474868d
Lexer enhancements: EOF, comment, space
...
Significant enhancements to the lexer to improve EOF handling and to
support comments and spaces as tokens. Various other minor issues were
fixed as well.
2018-02-18 20:18:40 -05:00
Jay Berkenbilt
ebd5ed63de
Add option to save pass 1 of linearization
...
This is useful only for debugging the linearization code.
2018-02-18 20:18:40 -05:00
Jay Berkenbilt
2ebdd6929e
Prepare 7.1.1 release
2018-02-04 18:31:42 -05:00
Jay Berkenbilt
e3167c1a60
Fix linearization for files with nonstandard ID length
2018-02-04 18:16:23 -05:00
Jay Berkenbilt
3b2a3cdd77
Fix setLineBuf for bsd ( fixes #177 )
...
Use 0 instead of NULL in a cast.
2018-02-04 14:19:00 -05:00
Jay Berkenbilt
d5bfd49cb2
Remove use of std::abs ( fixes #172 )
...
Different compilers want different choices of headers for std::abs.
It's easier just not to use it.
2018-02-04 14:19:00 -05:00
Jay Berkenbilt
34a9b835b0
Fix indentation
2018-02-04 14:19:00 -05:00
Jay Berkenbilt
7e5e1a7158
Fix offset in error message
2018-02-04 14:19:00 -05:00
Jay Berkenbilt
633fb414af
Pl_QPDFTokenizer: Use unsigned_char_pointer instead of copy
2018-01-28 18:34:43 -05:00
Jay Berkenbilt
13d9756a45
Minor fixes to tokenizer
2018-01-28 18:34:43 -05:00
Jay Berkenbilt
2e4ca7ecf4
Update version numbers for 7.1.0
2018-01-14 20:09:20 -05:00
Jay Berkenbilt
04e47deaf9
Fixes for clang
2018-01-14 19:18:04 -05:00
Jay Berkenbilt
569d74d36b
Allow raw encryption key to be specified
...
Add options to enable the raw encryption key to be directly shown or
specified. Thanks to Didier Stevens <didier.stevens@gmail.com> for the
idea and contribution of one implementation of this idea.
2018-01-14 10:21:05 -05:00
Jay Berkenbilt
3e306ae64c
Add QUtil::hex_decode
2018-01-14 09:04:13 -05:00
Jay Berkenbilt
791e0db762
Allow trailing . in numeric token ( fixes #165 )
2018-01-13 20:05:40 -05:00
Jay Berkenbilt
ec0087e3ce
Support TIFF Predictor ( fixes #171 )
2018-01-13 19:49:42 -05:00
Jay Berkenbilt
53971d50be
Add Pl_TIFFPredictor
2018-01-13 19:49:42 -05:00
Jay Berkenbilt
d9c9049708
Add signed support to BitStream and BitWriter
2018-01-13 19:49:42 -05:00
Jay Berkenbilt
661ed1d28e
Minor fixes to Pl_PNGFilter
...
Fix comment, remove restriction that doesn't actually matter.
2018-01-13 19:49:42 -05:00
Jay Berkenbilt
be27d47bdc
Use better error for getStreamData failure
...
If the stream isn't filterable but we call getStreamData, throw a
regular exception instead of a logic error so that normal error
handling and reporting mechanisms will be used.
2018-01-13 19:49:42 -05:00
Jay Berkenbilt
4edfe1f41d
Add tests for new PNG filters
2017-12-25 18:20:52 -05:00
Jay Berkenbilt
a3a55be9cd
Correct errors in PNG filters and make use from library
2017-12-25 14:24:48 -05:00
Casey Rojas
9a48720246
Initial implementation of other PNG decode filters
...
Initial implementation provided by Casey Rojas <crojas@infotechfl.com>
Some problems are fixed in a subsequent commit.
2017-12-24 22:59:51 -05:00
Jay Berkenbilt
0f1ce8e646
Prepare 7.0.0 release
2017-09-16 13:22:15 -04:00
Jay Berkenbilt
249e95f608
Fix test failure on MSVC
2017-09-15 23:09:04 -04:00
Jay Berkenbilt
6898bc8d98
Spell check
2017-09-15 23:09:04 -04:00
Jay Berkenbilt
f2ffb6968a
Fix Windows compilation errors
2017-09-15 21:44:57 -04:00
Jay Berkenbilt
d31a7b76e7
Improve message for stream decoding error
...
Tweak the message so that we inform the user that we are mitigating
data loss.
2017-09-12 16:03:48 -04:00
Jay Berkenbilt
eaacf94005
Update C API with new QPDFWriter methods
2017-09-12 14:30:39 -04:00
Jay Berkenbilt
40ecba4172
Pl_DCT: Use custom source and destination managers ( fixes #153 )
...
Avoid calling jpeg_mem_src and jpeg_mem_dest. The custom destination
manager writes to the pipeline in smaller chunks to avoid having the
whole image in memory at once. The source manager works directly with
the Buffer object. Using custom managers avoids use of memory source
and destination managers, which are not present in older versions of
libjpeg still in use by some Linux distributions.
2017-09-07 22:59:11 -04:00
Jay Berkenbilt
3ef1be9783
PNGFilter: Better range checking for columns
2017-08-31 07:26:58 -04:00
Jay Berkenbilt
1868a10f8b
Replace all atoi calls with QUtil::string_to_int
...
The latter catches underflow/overflow.
2017-08-29 12:28:32 -04:00
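The motivation for the atoi replacement above is that atoi gives the caller no way to detect an out-of-range value. A minimal sketch of a checked conversion built on `std::strtol` follows. The name `checked_string_to_int` is hypothetical, and this is not the implementation of QUtil::string_to_int; like atoi, the sketch accepts trailing non-digit characters.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical sketch: convert a decimal string to int, throwing instead
// of silently returning a wrong value on overflow or underflow.
int checked_string_to_int(char const* str)
{
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(str, &end, 10);
    if (end == str) {
        // No digits were consumed at all.
        throw std::runtime_error(std::string("not a number: ") + str);
    }
    if (errno == ERANGE || value > INT_MAX || value < INT_MIN) {
        // Out of range for long (ERANGE) or for int.
        throw std::range_error(std::string("integer out of range: ") + str);
    }
    return static_cast<int>(value);
}
```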
Jay Berkenbilt
742190bd98
Pl_PNGFilter: disallow columns = 0
2017-08-29 12:28:32 -04:00
Jay Berkenbilt
6d46346eb9
Detect integer overflow/underflow
2017-08-29 12:28:32 -04:00
Jay Berkenbilt
e999bbae43
Fix memory leak with bad jpeg data
2017-08-28 22:16:45 -04:00
Jay Berkenbilt
c6872d2c70
Clean up circular references in QPDF_Stream
2017-08-28 22:16:31 -04:00
Jay Berkenbilt
728dc9e6d8
Fix error caught by clang
2017-08-26 21:51:17 -04:00
Jay Berkenbilt
dea704f0ab
Pad keys to avoid memory errors ( fixes #147 )
2017-08-26 21:35:59 -04:00
Jay Berkenbilt
021c229331
Fix Pl_Flate memory leak on error ( fixes #148 )
2017-08-25 22:26:53 -04:00
Jay Berkenbilt
ad527a64f9
Parse iteratively to avoid stack overflow ( fixes #146 )
2017-08-25 21:56:45 -04:00
Jay Berkenbilt
85f05cc57f
Detect xref pointer infinite loop ( fixes #149 )
2017-08-25 19:58:31 -04:00
Jay Berkenbilt
1e52d33822
Bump soname to 18 and version to 7.0.b1
2017-08-22 16:50:48 -04:00
Jay Berkenbilt
e452d9dca6
Spell check
2017-08-22 14:22:20 -04:00
Jay Berkenbilt
6219111ed7
Update references to README files
...
Most of the README files have been renamed. Refer to the new names.
2017-08-22 14:13:10 -04:00
Jay Berkenbilt
83ec09f66c
Do memory checks
...
Slightly improve memory cleanup in Pl_DCT
Make it easier to test with valgrind
2017-08-22 14:13:10 -04:00
Jay Berkenbilt
fabff0f3ec
Limit token length during xref recovery
...
While scanning the file looking for objects, limit the length of
tokens we allow. This prevents us from getting caught up in reading a
file character by character while digging through large streams.
2017-08-22 14:13:10 -04:00
Jay Berkenbilt
caf5e39c2e
Fix compiler warnings for clang/mac OS X
2017-08-22 14:13:10 -04:00
Jay Berkenbilt
6884ad2ead
Fix logic error in recovery
...
A stray semicolon caused a condition to be incorrectly applied during
stream length recovery.
2017-08-22 07:19:41 -04:00
Jay Berkenbilt
ce435222b2
Push QPDFWriter member variables into a nested class
2017-08-21 22:04:07 -04:00
Jay Berkenbilt
a8c93bd324
Push QPDF member variables into a nested class
...
Pushing member variables into a nested class enables addition of new
member variables without breaking binary compatibility.
2017-08-21 21:35:11 -04:00
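The pattern described in the commit above can be sketched as follows. The class `Widget` and its methods are hypothetical, and `std::unique_ptr` is used here for brevity; it is an assumption of this sketch, not necessarily the smart pointer qpdf uses. The point is that the public class holds only a pointer, so its size and layout stay the same when fields are added to the nested class.

```cpp
#include <memory>
#include <string>

// Header: the public class exposes only a pointer to its members.
class Widget
{
  public:
    Widget();
    ~Widget();
    void setName(std::string const& name);
    std::string const& getName() const;

  private:
    class Members; // defined in the implementation file
    std::unique_ptr<Members> m;
};

// Implementation file: the nested class holds the actual data.
class Widget::Members
{
  public:
    std::string name;
    // New fields can be added here later without changing sizeof(Widget).
};

Widget::Widget() : m(new Members()) {}

// Defined here, where Members is a complete type.
Widget::~Widget() = default;

void Widget::setName(std::string const& name)
{
    m->name = name;
}

std::string const& Widget::getName() const
{
    return m->name;
}
```

Code compiled against the header keeps working with a newer shared library as long as only `Members` changes.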
Jay Berkenbilt
198856a825
Improve pclm parameter settings
2017-08-21 21:05:48 -04:00
Jay Berkenbilt
8ab52fa558
Combine writePCLm with writeStandard
...
Reduce code duplication
2017-08-21 21:05:48 -04:00
Jay Berkenbilt
9f60a864a0
Combine PCLm header into writeHeader
2017-08-21 21:05:47 -04:00
Jay Berkenbilt
adbcfcff2d
Remove duplicated coverage cases
...
Remove duplicated coverage cases from Sahil's code so existing test
suite passes.
2017-08-21 18:55:02 -04:00
Sahil Arora
b19210fa7d
QPDFWriter: Add setPCLm() and writePCLm() methods
...
* Add support for PCLm using setPCLm() and writePCLm() methods in
QPDFWriter.hh and QPDFWriter.cc
* Add a function writePCLmHeader() for PCLm header in QPDFWriter
2017-08-21 18:55:02 -04:00
Jay Berkenbilt
ddc6cf0cf6
Precheck streams by default
...
There is no need for a --precheck-streams option. We can do the
precheck without imposing any penalty, only re-encoding the stream if
it fails the first time.
2017-08-21 17:44:22 -04:00
Jay Berkenbilt
9744414c66
Enable finer grained control of stream decoding
...
This commit adds several API methods that enable control over which
types of filters QPDF will attempt to decode. It also adds support for
/RunLengthDecode and /DCTDecode filters for both encoding and
decoding.
2017-08-21 17:44:22 -04:00
Jay Berkenbilt
ae90d2c485
Implement Pl_DCT pipeline
...
Additional testing is added in later commits to be supported by
additional changes in the library.
2017-08-21 17:44:02 -04:00
Jay Berkenbilt
2d2f619665
Implement Pl_RunLength pipeline
2017-08-19 14:50:55 -04:00
Jay Berkenbilt
cfa2eb97fb
Add page rotation ( fixes #132 )
2017-08-12 22:57:38 -04:00
Jay Berkenbilt
8249a26d69
Fix infinite loop in QPDFWriter ( fixes #143 )
2017-08-12 08:36:36 -04:00
Jay Berkenbilt
36b3fe5af7
Fix --newline-before-endstream option ( fixes #133 )
...
Add a newline unconditionally before endstream even if a newline was
already written as part of the stream data.
2017-08-11 20:57:05 -04:00
Jay Berkenbilt
46611f0710
Prevent a division by zero error ( fixes #141 )
...
Bad /W in an xref stream could cause a division by zero error. Now
this is handled as a special case.
2017-08-11 20:11:19 -04:00
Jay Berkenbilt
8fe0b06cd8
Pad encryption parameters that are too short ( fixes #96 )
2017-08-11 19:53:56 -04:00
Jay Berkenbilt
e7d0019bf4
Generate libqpdf.map from autoconf
...
Rather than checking consistency of libqpdf.map, generate it.
2017-08-11 04:56:22 -04:00
Jay Berkenbilt
6247aaa57c
Fix libqpdf.map and prevent future breakage
...
The build now checks to make sure libqpdf.map has the right library
version number in it.
2017-08-10 21:53:19 -04:00
Jay Berkenbilt
9a96e233b0
Remove PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
30f109e244
Read xref table without PCRE
...
Also accept more errors than before.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
98a843c2a2
Reconstruct xref without PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
ca5b1d267a
Improve stream length recovery
...
Eliminate PCRE and find endobj not preceded by endstream. Be more lax
about placement of endstream and endobj.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
3082e4e606
Find xref without PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
90840be594
Find lindict without PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
03aa9679ac
Find startxref without PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
1765c6ec20
Find header without PCRE
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
296b679d6e
Implement findFirst and findLast in InputSource
...
Preparing to refactor some pattern searching code to use these instead
of their own memchr loops. This should simplify the code that replaces
PCRE.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt
ef8ae5449d
Allow QPDFTokenizer::readToken to return bad tokens
...
Sometimes we want to ignore bad tokens rather than having them throw
an exception. A coverage case is commented out here and added in a
later commit.
2017-08-10 19:01:41 -04:00
Jay Berkenbilt
8fe261d8b4
QUtil::strcasecmp
2017-08-05 10:22:33 -04:00
Pranjal Bhor
6f88fd36ab
Include missing header in QPDFTokenizer.cc ( fixes #125 )
...
Required for strtol()
2017-07-30 08:47:05 -04:00
Jay Berkenbilt
2d5b854468
Allow reading command-line args from files ( fixes #16 )
2017-07-29 22:23:21 -04:00
Jay Berkenbilt
5993c3e83c
Detect input file = output file ( fixes #29 )
2017-07-29 20:58:01 -04:00
Jay Berkenbilt
570db9b60b
Catch more exceptions while resolving objects
2017-07-29 19:31:12 -04:00
Jay Berkenbilt
b43a0ac237
When recovering stream length, indicate the length ( fixes #44 )
2017-07-29 19:15:06 -04:00
Jay Berkenbilt
f37d399d82
Add newline-before-endstream option ( fixes #103 )
2017-07-29 12:21:38 -04:00
Jay Berkenbilt
6a7d53ad2b
Handle zlib data errors better ( fixes #106 )
2017-07-29 12:19:04 -04:00
Jay Berkenbilt
07d6f770b2
Better recovery of bad stream start ( fixes #104 )
2017-07-29 12:19:04 -04:00
Jay Berkenbilt
b389268f16
Better handle split content streams ( fixes #73 )
...
When parsing content streams, allow content to be split arbitrarily
across stream boundaries.
2017-07-29 12:19:04 -04:00
Jay Berkenbilt
a136824243
Fix exception catch
2017-07-29 12:19:04 -04:00
Jay Berkenbilt
ba2bae4acc
Use 1.2 as the version if we can't read it from the header
...
The code was using 1.0, but we use /FlateDecode, which didn't appear
until 1.2.
2017-07-29 12:19:04 -04:00
Jay Berkenbilt
3a1ff5ded9
Add option to preserve unreferenced objects
2017-07-28 19:19:11 -04:00
Jay Berkenbilt
a94a729fee
Explicitly check root dictionary type
...
Very badly corrupted files may not have a retrievable root dictionary.
Handle that as a special case so that a more helpful error message can
be provided.
2017-07-28 18:03:30 -04:00
Jay Berkenbilt
7f8892525f
Add precheck streams capability
...
When requested, QPDFWriter will do more aggressive prechecking of streams
to make sure it can actually succeed in decoding them before
attempting to do so. This will allow preservation of raw data even
when the raw data is corrupted relative to the specified filters.
2017-07-27 23:42:27 -04:00
Jay Berkenbilt
428d96dfe1
Convert many more errors to warnings
2017-07-27 22:57:55 -04:00
Jay Berkenbilt
a4fd4b91c6
Convert stream filtering errors to warnings
2017-07-27 18:43:07 -04:00
Jay Berkenbilt
40f00122b8
Convert object parsing errors to warnings
...
QPDFObjectHandle::parseInternal now issues warnings instead of
throwing exceptions for all error conditions that it finds (except
internal logic errors) and has stronger recovery for things like
invalid tokens and malformed dictionaries. This should improve qpdf's
ability to recover from a wide range of broken files that currently
cause it to fail.
2017-07-27 18:20:31 -04:00
Jay Berkenbilt
dd8dad74f4
Move lexer helper functions to QUtil
2017-07-27 13:59:56 -04:00