The original QPDF is only required now when the source
QPDFObjectHandle is a stream that gets its stream data from a
QPDFObjectHandle::StreamDataProvider.
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a
TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a
general filter that passes data through a TokenFilter.
If the stream isn't filterable but we call getStreamData, throw a
regular exception instead of a logic error so that normal error
handling and reporting mechanisms will be used.
This commit adds several API methods that enable control over which
types of filters QPDF will attempt to decode. It also adds support for
/RunLengthDecode and /DCTDecode filters for both encoding and
decoding.
When requested, QPDFWriter will do more aggress prechecking of streams
to make sure it can actually succeed in decoding them before
attempting to do so. This will allow preservation of raw data even
when the raw data is corrupted relative to the specified filters.
Move object parsing code from QPDF to QPDFObjectHandle and
parameterize the parts of it that are specific to a QPDF object.
Provide a version that can't handle indirect objects and that can be
called on an arbitrary string.
A side effect of this change is that the offset used when reporting
invalid stream length has changed, but since the new value seems like
a better value than the old one, the test suite has been updated
rather than making the code backward compatible. This only effects
the offset reported for invalid streams that lack /Length or have an
invalid /Length key.
Updated some test code and exmaples to use QPDFObjectHandle::parse.
Supporting changes include adding a BufferInputSource constructor that
takes a string.
Breaking API change: length parameter has disappeared from the
StreamDataProvider version of QPDFObjectHandle::replaceStreamData
since it is no longer necessary to compute it in advance. This
breaking change is justified by the fact that removing the length
parameter provides the caller an opportunity to simplify the calling
code.
Significantly improve the code's use of off_t for file offsets, size_t
for memory sizes, and integer types in cases where there has to be
compatibility with external interfaces. Rework sections of the code
that would have prevented qpdf from working on files larger than 2 (or
maybe 4) GB in size.