Commit Graph

66 Commits

Author SHA1 Message Date
m-holger 34a6f8938f Add methods QPDFTokenizer::Token::isWord 2022-11-20 11:55:02 -05:00
m-holger dca70f13e7 Add method QPDFTokenizer::Token::isInteger 2022-11-20 11:55:02 -05:00
Jay Berkenbilt e9980efec8 Correctly handle reuse of xref stream (fixes #809) 2022-11-19 17:03:17 -05:00
m-holger cb0a6be983 Code tidy: use QPDF::toS and QPDF::toI where possible 2022-10-01 11:17:39 -04:00
m-holger 5ccab4be03 Add private methods QPDF::damagedPDF 2022-10-01 11:17:39 -04:00
m-holger 2e6869483b Replace calls to QUtil::int_to_string with std::to_string 2022-09-21 15:57:14 -04:00
m-holger bd300be08d Replace calls to QPDFObjectHandle::Factory::newIndirect where possible 2022-08-31 22:45:45 +01:00
m-holger 1553868c4a Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
For consistency with similar methods, e.g. replaceObject.
2022-08-01 19:22:37 +01:00
m-holger 073808aa50 Code tidy : replace 0 with nullptr or true 2022-07-26 13:40:13 +01:00
m-holger 8b4afa428e Revert making second parameter of QPDFObjGen::QPDFObjGen optional
Also, change test for QPDFObjGen::isIndirect to obj != 0.
Delete comment from commit afd35f9.
2022-07-24 16:55:10 +01:00
m-holger f7978db1f6 QPDFObjGen : tidy QPDF private methods
Change method signatures to use QPDFObjGen.
Use QPDFObjGen methods where possible.
Remove redundant QPDF::objGenToIndirect.
2022-07-24 16:02:35 +01:00
m-holger 3404ca8ac8 QPDFObjGen : tidy QPDFObjectHandle private methods
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
m-holger c0168cf88c QPPFObjGen : tidy QPDF::readObjectAtOffset
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
Jay Berkenbilt 6f1041afb8 Clarify intent in readObjectAtOffset
Rather than using object id -1 to mean "don't care", use object ID 0,
and clarify the difference between that use and indication of a direct
object.
2022-07-24 09:40:11 -04:00
Jay Berkenbilt 641e92c6a7 QPDF, QPDFJob: use QPDFLogger instead of custom output streams 2022-06-18 09:02:55 -04:00
m-holger 6c69a747b9 Code clean up: use range-style for loops wherever possible
Remove variables obsoleted by commit 4f24617.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt 92b692466f Remove remaining incorrect assert calls from implementation 2022-05-03 18:31:22 -04:00
Jay Berkenbilt 4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoid this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more effecient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt a68703b07e Replace PointerHolder with std::shared_ptr in library sources only
(patrepl and cleanpatch are my own utilities)

patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/*.hh
patrepl s/PointerHolder/std::shared_ptr/g libqpdf/*.cc
patrepl s/make_pointer_holder/std::make_shared/g libqpdf/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt c62e8e2b28 Update for clean compile with POINTERHOLDER_TRANSITION=2 2022-02-07 17:38:22 -05:00
Jay Berkenbilt 40f1946df8 Replace PointerHolder arrays with shared_ptr arrays where possible
Replace PointerHolder arrays wherever it can be done without breaking ABI.
2022-02-07 17:38:22 -05:00
Jay Berkenbilt 9044a24097 PointerHolder: deprecate getPointer() and getRefcount()
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.

This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
Jay Berkenbilt af2a71aa2c Handle bitstream overflow errors more gracefully (fixes #581)
* Make it a runtime error, not a logic error
* Include additional information
* Capture it properly in checkLinearization
2021-12-10 15:37:35 -05:00
Jay Berkenbilt 26514ab731 Write linearization errors to stderr (fixes #438) 2020-04-29 17:33:34 -04:00
Jay Berkenbilt 92d3cbecd4 Fix warnings reported by -Wshadow=local (fixes #431) 2020-04-16 12:41:43 -04:00
Jay Berkenbilt 201e8798d7 Convert previously overlooked static cast to QIntC 2019-06-25 12:43:06 -04:00
Jay Berkenbilt 04f45cf652 Treat all linearization errors as warnings
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
2019-06-23 13:45:45 -04:00
Jay Berkenbilt 85a3f95a89 qpdf: exit 3 for linearization warnings without errors (fixes #50) 2019-06-22 16:57:51 -04:00
Jay Berkenbilt b07ad6794e Fix bugs found by fuzz tests
* Several assertions in linearization were not always true; change
  them to run time errors
* Handle a few cases of uninitialized objects
* Handle pages with no contents when doing form operations
* Handle invalid page tree nodes when traversing pages
2019-06-21 17:56:24 -04:00
Jay Berkenbilt d71f05ca07 Fix sign and conversion warnings (major)
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.

There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
2019-06-21 13:17:21 -04:00
Jay Berkenbilt aa602fd107 Fix integer overflow in large file test 2019-01-07 08:49:14 -05:00
Jay Berkenbilt 837dcf8fc2 Don't call assert while checking linearization data (fixes #209, #231)
Instead of calling assert for problems found during checking
linearization data, throw an exception which is later caught and
issued as an error. Ideally we would handle errors more robustly, but
this is still a significant improvement.
2019-01-04 11:55:42 -05:00
Jay Berkenbilt d0e99f195a More robust handling of type errors
Give objects descriptions and context so it is possible to issue
warnings instead of fatal errors for attempts to access objects of the
wrong type.
2018-02-18 21:06:27 -05:00
Jay Berkenbilt a8c93bd324 Push QPDF member variables into a nested class
Pushing member variables into a nested class enables addition of new
member variables without breaking binary compatibility.
2017-08-21 21:35:11 -04:00
Jay Berkenbilt 9744414c66 Enable finer grained control of stream decoding
This commit adds several API methods that enable control over which
types of filters QPDF will attempt to decode. It also adds support for
/RunLengthDecode and /DCTDecode filters for both encoding and
decoding.
2017-08-21 17:44:22 -04:00
Jay Berkenbilt 90840be594 Find lindict without PCRE 2017-08-10 21:30:32 -04:00
Jay Berkenbilt 296b679d6e Implement findFirst and findLast in InputSource
Preparing to refactor some pattern searching code to use these instead
of their own memchr loops. This should simplify the code that replaces
PCRE.
2017-08-10 21:30:32 -04:00
Jay Berkenbilt ac9c1f0d56 Security: replace operator[] with at
For std::string and std::vector, replace operator[] with at.  This was
done using an automated process.  See README.hardening for details.
2013-10-18 10:45:14 -04:00
Jay Berkenbilt e19eb579b2 Replace some assertions with std::logic_error
Ideally, the library should never call assert outside of test code,
but it does in several places.  For some cases where the assertion
might conceivably fail because of a problem with the input data,
replace assertions with exceptions so that they can be trapped by the
calling application.  This commit surely misses some cases and
replaced some cases unnecessarily, but it should still be an
improvement.
2013-10-09 20:57:14 -04:00
Jay Berkenbilt 0bfe902489 Security: avoid pre-allocating vectors based on file data
In places where std::vector<T>(size_t) was used, either validate that
the size parameter is sane or refactor code to avoid the need to
pre-allocate the vector.
2013-10-09 20:57:14 -04:00
Jay Berkenbilt 3eb4b066ab Security: better bounds checks for linearization data
The faulty code was only used during explicit checks of linearization
data.  Those checks are not part of normal reading or writing of PDF
files.
2013-10-09 19:50:09 -04:00
Jay Berkenbilt 16051788ed Handle /Outlines dictionary being a direct object
Even though this case is not valid according to the spec, it has been
seen, and caused an internal error.
2013-06-14 21:36:04 -04:00
Jay Berkenbilt 96eb965115 Use QPDFObjectHandle::getObjGen() where appropriate
In internal code and examples, replace calls to getObjectID() and
getGeneration() with calls to getObjGen() where possible.
2013-06-14 14:58:09 -04:00
Jay Berkenbilt d88231e01e Promote QPDF::ObjGen to top-level object QPDFObjGen 2013-06-14 14:58:08 -04:00
Jay Berkenbilt 30027481f7 Remove all old-style casts from C++ code 2013-03-04 16:45:16 -05:00
Jay Berkenbilt fc4c82a950 Reset state in QPDF::calculateLinearizationData
This makes it possible to use two different writers to write
linearized files from the same QPDF object.
2012-09-06 15:28:16 -04:00
Jay Berkenbilt 8318d81ada Fix and test support for files >= 4 GB 2012-06-24 15:56:50 -04:00
Jay Berkenbilt 81e8752362 Use qpdf_offset_t in place of off_t in public APIs.
off_t is used internally only when needed to talk to standard
libraries.  This requires that the "long long" type be supported by
the compiler.
2012-06-21 21:23:24 -04:00