Commit Graph

257 Commits

Author SHA1 Message Date
m-holger da14ab4dc7 Move definition of QPDF::JSONReactor into QPDF_json
Allow access to private header files when defining data members.
2023-02-18 08:33:08 +00:00
m-holger dab27c9bb3 Refactor setting of object descriptions in QPDF::JSONReactor 2023-02-18 08:33:08 +00:00
Jay Berkenbilt 1308c45090 Implement --remove-restrictions (fixes #833) 2023-01-28 13:42:19 -05:00
m-holger b0457b37e2 Update doc comment for QPDF::fixDanglingReferences 2022-12-31 09:28:28 -05:00
m-holger cfcb279e49 Alternative fix logic for fixDanglingReferences 2022-12-06 15:21:34 -05:00
m-holger 008364a9a4 Remove redundant friend class statements 2022-12-04 14:05:16 -05:00
Jay Berkenbilt 562ff1b608 Rename function for pikepdf (for 11.2.0)
A new private overload of QPDF::makeIndirectObject breaks pikepdf's
build, so renaming function.
2022-11-20 16:04:58 -05:00
Jay Berkenbilt e9980efec8 Correctly handle reuse of xref stream (fixes #809) 2022-11-19 17:03:17 -05:00
m-holger 9ebabd1953 Add new methods QPDF::newStream 2022-11-19 14:10:42 -05:00
m-holger 0a3c533186 Add private method QPDF::nextObjGen 2022-11-19 14:10:42 -05:00
m-holger b3d71e1f58 Add private overload of QPDF::makeIndirectObject taking a QPDFObject shared_ptr 2022-11-19 14:10:42 -05:00
m-holger 5ccab4be03 Add private methods QPDF::damagedPDF 2022-10-01 11:17:39 -04:00
m-holger b948366280 Add doc comment to QPDF::getFilename 2022-10-01 11:17:39 -04:00
Jay Berkenbilt c5f61fcbd3 Improve efficiency of ResolveRecorder
Removing an element from a set with iterator is constant time, and
std::set specifies that other operations on the set do not invalidate
existing iterators.
2022-09-13 11:19:24 -04:00
Jay Berkenbilt 31b2cfbb79 Fix up a few comments 2022-09-13 11:18:49 -04:00
Jay Berkenbilt 18a583e8d9 Rename QPDFValueProxy back to QPDFObject
QPDFValueProxy wasn't a good name for it. We decided the evil of
having the header file be named QPDFObject_private.hh was less than
the evil of having the class be named something other than what it
should have been named.
2022-09-08 11:29:23 -04:00
Jay Berkenbilt 6c61be00e8 Rename QPDFObject -> QPDFValueProxy
This is in preparation for restoring a QPDFObject.hh to ease the
transition on qpdf_object_type_e.

This commit was created by
* Renaming QPDFObject.cc and QPDFObject.hh
* Replacing QPDFObject\b with QPDFValueProxy (where \b is word
  boundary)
* Running format-code
* Manually resorting files in libqpdf/CMakeLists.txt
* Manually refilling the comment in QPDF.hh near class Resolver
2022-09-05 18:52:59 -04:00
Jay Berkenbilt a59e7ac7ec Disable copying/assigning to QPDF objects, add QPDF::create() 2022-09-02 08:53:27 -04:00
Jay Berkenbilt da0b0e405d Fix outdated comment 2022-09-02 08:30:11 -04:00
Jay Berkenbilt 3d029fb17e
Merge pull request #730 from m-holger/allpages
Tidy QPDF::getAllPagesInternal and QPDF::pushInheritedAttributesToPageInternal
2022-09-01 15:28:32 -04:00
m-holger 4a8515912c Add method QPDFObject::resolve 2022-09-01 17:19:06 +01:00
m-holger ae6e484e23 Change return type of QPDF::resolve to void 2022-09-01 17:08:45 +01:00
m-holger 556c34f0f2 Add private method QPDF::ObjCache::update
Add a new obj_cache entry or update an existing entry in place.
2022-09-01 14:30:26 +01:00
m-holger c0cd72a3ee Add private methods QPDF::isCached and QPDF::isUnresolved 2022-09-01 14:29:53 +01:00
m-holger 27fae2b55e Remove QPDF::ObjectChanged
Also change QPDF::replaceObject and QPDF::swapObjects such that the
QPDFObject assigned to an og in the obj_cache is never replaced; only
QPDFObject::value is updated.
2022-09-01 14:27:46 +01:00
m-holger 2a2eebcaea Modify newIndirect to set QPDFObjectHandle::obj 2022-08-31 22:47:11 +01:00
m-holger 6670c685ab Move QPDFObjectHandle::parseInternal to new class QPDFParser
Part of #729
2022-08-30 05:56:23 +01:00
Jay Berkenbilt 7084c3f715 Add comment clarifying getObject vs others 2022-08-06 14:25:12 -04:00
m-holger 1553868c4a Add QPDF::getObject to replace getObjectByObjGen and getObjectByID
For consistency with similar methods, e.g. replaceObject.
2022-08-01 19:22:37 +01:00
m-holger 0356bcecc5 Tidy QPDF::pushInheritedAttributesToPageInternal
Remove unnecessary parameters.
Remove code that is unnecessary as result of a prior call to QPDF::getAllPages.
Avoid clearing and rebuilding of m->all_pages.
2022-08-01 13:29:14 +01:00
m-holger 4ccca20db0 Remove redundant parameter from QPDF::getAllPagesInternal 2022-08-01 13:29:14 +01:00
Jay Berkenbilt 12d065c751 Provide a simpler QPDF::writeJSON 2022-07-31 16:23:17 -04:00
Jay Berkenbilt 5f4224f31a Simplify --json-output
Now --json-output just changes defaults. Allow output file with --json.
2022-07-31 16:23:17 -04:00
Jay Berkenbilt 69820847af Change the output of --json to use "qpdf" instead of "objects" 2022-07-31 15:17:01 -04:00
Jay Berkenbilt d01c4f8819 Change --json-output format
from "qpdf-v2" to "qpdf": [..., ...]
2022-07-31 10:32:55 -04:00
Jay Berkenbilt bb96499b61 Update docs and prepare QPDF::writeJSON for changes
Add additional parameters that will be needed to call QPDF::writeJSON
in partial mode.
2022-07-31 10:32:55 -04:00
m-holger afd35f9a30 Overload StreamDataProvider::provideStreamData
Use 'QPDFObjGen const&' instead of 'int, int' in signature.
2022-07-24 16:02:35 +01:00
m-holger f7978db1f6 QPDFObjGen : tidy QPDF private methods
Change method signatures to use QPDFObjGen.
Use QPDFObjGen methods where possible.
Remove redundant QPDF::objGenToIndirect.
2022-07-24 16:02:35 +01:00
m-holger c0168cf88c QPPFObjGen : tidy QPDF::readObjectAtOffset
Change method signature to use QPDFObjGen.
2022-07-24 15:59:49 +01:00
Jay Berkenbilt 0c7c7e4ba4 Track whether certain page modifying methods have been called
We need to know whether pushInheritedAttributesToPage or getAllPages
have been called when generating JSON output. When reading the JSON
back in, we have to call the same methods so that object numbers will
line up properly.
2022-06-25 13:55:45 -04:00
Jay Berkenbilt 641e92c6a7 QPDF, QPDFJob: use QPDFLogger instead of custom output streams 2022-06-18 09:02:55 -04:00
Jay Berkenbilt 0bd908b550 Update documentation for qpdf JSON v2 2022-05-30 20:03:08 -04:00
Jay Berkenbilt 05460d405c Format code 2022-05-21 16:11:42 -04:00
Jay Berkenbilt c56a9ca7f6 JSON: Fix large file support 2022-05-21 09:43:45 -04:00
Jay Berkenbilt 47c093c48b Replace std::regex with validators for better performance 2022-05-21 08:43:21 -04:00
Jay Berkenbilt 9b2eb01e25 Exercise object description in tests 2022-05-20 14:23:32 -04:00
Jay Berkenbilt d065098089 Test --update-from-json 2022-05-20 11:10:12 -04:00
Jay Berkenbilt 6f43bf8de3 Major rework -- see long comments
* Replace --create-from-json=file with --json-input, which causes the
  regular input to be treated as json.
* Eliminate --to-json
* In --json=2, bring back "objects" and eliminate "objectinfo". Stream
  data is never present.
* In --json-output=2, write "qpdf-v2" with "objects" and include
  stream data.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 0fe8d44762 Support stream data -- not tested
There are no automated tests yet, but committing work so far in
preparation for some refactoring.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 7e7a9c4379 Parse objects; stream data is not yet handled 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 9064542b5f Add private methods for reserving specific objects 2022-05-20 07:54:09 -04:00
Jay Berkenbilt 7fa5d1773b Implement top-level qpdf json parsing 2022-05-16 13:41:40 -04:00
Jay Berkenbilt 8d42eb2632 Add scaffolding for QPDF JSON reactor 2022-05-16 13:41:40 -04:00
Jay Berkenbilt 4fe2e06b47 Add --create-from-json and --update-from-json arguments
Also add stubs for top-level QPDF methods (createFromJSON,
updateFromJSON)
2022-05-16 13:41:40 -04:00
Jay Berkenbilt 68e721981a Add new QPDF::warn that takes most of QPDFExc's arguments 2022-04-23 18:25:43 -04:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt 2a7d2b63c2 Make ABI-breaking changes that don't modify API at all
* Merge overloaded functions by adding default values
* Remove non-const methods that are identical to const methods
2022-04-16 10:41:46 -04:00
Jay Berkenbilt a68703b07e Replace PointerHolder with std::shared_ptr in library sources only
(patrepl and cleanpatch are my own utilities)

patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/*.hh
patrepl s/PointerHolder/std::shared_ptr/g libqpdf/*.cc
patrepl s/make_pointer_holder/std::make_shared/g libqpdf/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt 5f329e6206 Remove Version.h -- it was never used 2022-02-27 20:01:32 -05:00
Jay Berkenbilt cfd5147d92 Add QPDF::getVersionAsPDFVersion 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt 8cf7f2bfb5 API contract: qpdf_get_qpdf_version() returns a static 2022-02-05 11:24:56 -05:00
Jay Berkenbilt cfaa2de804 Update copyright for 2022 2022-02-04 16:36:22 -05:00
Jay Berkenbilt 8eab616d62 Add qpdf version macros to qpdf/DLL.h 2022-02-04 13:41:01 -05:00
Jay Berkenbilt abc300f05c Replace containers of PointerHolder with containers of std::shared_ptr
None of these are in the public API.
2022-02-04 13:12:37 -05:00
m-holger 0f9086e509 Fix doc typos 2022-01-30 12:09:54 -06:00
m-holger 0c705a882b Minor documentation updates 2021-12-09 10:24:14 -05:00
Jay Berkenbilt 33a47d5c3c Make QPDF::findPage public (fixes #516)
This was originally not public because I wanted to get rid fo the
pages cache, but I recently realized there were deep reasons not to do
that, and the author of pikepdf wanted this, so I decided to make it
public.
2021-11-03 09:43:17 -04:00
Jay Berkenbilt 9fb174b9e9 Major rework of handling form fields when copying pages (fixes #509) 2021-03-04 15:08:37 -05:00
Jay Berkenbilt 1bb209a9bf Add QPDF::numWarnings 2021-03-03 17:05:49 -05:00
Jay Berkenbilt a4d6589ff2 Have QPDFObjectHandle notice when replaceObject was called
This results in a performance penalty of 1% to 2% when replaceObject
and swapObjects are never called and a somewhat larger penalty if they
are called, but it's worth it to avoid very confusing behavior as
discussed in depth in qpdf#507.
2021-02-25 07:32:46 -05:00
Jay Berkenbilt 92fbc6fdf5 QPDFObjectHandle::copyStream 2021-02-21 06:36:30 -05:00
Jay Berkenbilt 60afe4142e Refactor: separate copyStreamData from replaceForeignIndirectObjects 2021-02-21 06:36:30 -05:00
Jay Berkenbilt e076c9bf08 Remove erroneous handling of /EFF for stream decryption
I thought /EFF was supposed to be used as a default for decrypting
embedded file streams, but actually it's supposed to be advice to a
conforming writer about handling new ones. This makes sense since the
findAttachmentStreams code, which is not actually needed, was never
right.
2021-02-06 17:08:41 -05:00
Jay Berkenbilt 6226b69dba Add warn() to QPDF's public API 2021-01-16 18:41:53 -05:00
Jay Berkenbilt bf8fd41fee Update copyright to 2021 2021-01-04 16:26:58 -05:00
Jay Berkenbilt 1a62cce940 Restructure optimize to allow skipping parameters of filtered streams 2020-12-28 12:58:19 -05:00
Jay Berkenbilt 39bfa01307 Implement user-provided stream filters
Refactor QPDF_Stream to use stream filter classes to handle supported
stream filters as well.
2020-12-28 12:58:19 -05:00
Jay Berkenbilt 8a11feacc3 Avoid leak by resolving object streams more than once (fuzz issue 23642) 2020-10-22 15:39:36 -04:00
Jay Berkenbilt 24196c08cb Fix loop detection error (fuzz issue 23172) 2020-10-22 05:48:35 -04:00
Jay Berkenbilt 893d38b87e Allow propagation of errors and retry through StreamDataProvider
StreamDataProvider::provideStreamData now has a rich enough API for it
to effectively proxy to pipeStreamData.
2020-04-05 20:07:13 -04:00
Jay Berkenbilt a6f1f829db Use deleted copy/assignment (C++11) 2020-04-03 12:17:57 -04:00
Jay Berkenbilt e5cc065598 Update copyright to 2020 2020-01-26 16:57:27 -05:00
Masamichi Hosoda 46ac3e21b3 Add QPDF::getXRefTable() 2019-10-22 16:16:16 -04:00
Jay Berkenbilt babd12c9b2 Add methods QPDF::anyWarnings and QPDF::closeInputSource 2019-08-31 15:51:20 -04:00
Jay Berkenbilt 5da146c8b5 Track separately whether password was user/owner (fixes #159) 2019-08-24 11:01:19 -04:00
Jay Berkenbilt 225cd9dac2 Protect against coding error of re-entrant parsing 2019-08-22 17:55:16 -04:00
Jay Berkenbilt 04f45cf652 Treat all linearization errors as warnings
This also reverts the addition of a new checkLinearization that
distinguishes errors from warnings. There's no practical distinction
between what was considered an error and what was considered a
warning.
2019-06-23 13:45:45 -04:00
Jay Berkenbilt 85a3f95a89 qpdf: exit 3 for linearization warnings without errors (fixes #50) 2019-06-22 16:57:51 -04:00
Jay Berkenbilt 25dd3c6750 Remove QPDF::copyForeignObject with unused parameter 2019-06-21 22:29:31 -04:00
Jay Berkenbilt d71f05ca07 Fix sign and conversion warnings (major)
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.

There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
2019-06-21 13:17:21 -04:00
Thorsten Schöning 2a852f08b6 [bcc32 Error] QPDF.hh(803): E2247 'QPDF::Members::resolving' is not accessible
Full parser context
    QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh
    QPDF.hh(48): class QPDF
    QPDF.hh(1380): decision to instantiate:  QPDF::ResolveRecorder::ResolveRecorder(QPDF *,const QPDFObjGen &)
    --- Resetting parser context for instantiation...
    QPDF.hh(799): parsing:  QPDF::ResolveRecorder::ResolveRecorder(QPDF *,const QPDFObjGen &)
2019-03-11 17:07:01 -04:00
Thorsten Schöning 86287acfd9 [bcc32 Error] QPDF.hh(223): E2303 Type name expected
Full parser context
    QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh
    QPDF.hh(47): class QPDF
2019-03-11 16:57:16 -04:00
Thorsten Schöning 9b3314042a [bcc32 Error] QPDF.hh(203): E2316 'vector' is not a member of 'std'
Full parser context
    QPDF.cc(2): #include ..\..\..\..\src\include\qpdf\QPDF.hh
    QPDF.hh(46): class QPDF
2019-03-11 16:57:16 -04:00
Jay Berkenbilt b776dcd2d3 Clean up some private functions 2019-01-29 22:14:20 -05:00
Jay Berkenbilt 2d0885bc11 Clarify documentation for copyForeignObject regarding pages
Make explicit that copyForeignObject can be used on page objects and
will copy them properly but not update the pages tree.
2019-01-28 21:53:55 -05:00
Jay Berkenbilt 52f9d326a5 Resolve duplicated page objects (fixes #268)
When linearizing a file or getting the list of all pages in a file,
detect if the pages tree contains a duplicated page object and, if so,
shallow copy it. This makes it possible to have a one to one mapping
of page positions to page objects.
2019-01-28 20:29:58 -05:00
Jay Berkenbilt 654c0e8caf Allow adding the same page more than once in --pages (fixes #272) 2019-01-12 10:01:47 -05:00
Jay Berkenbilt 5f128b9a27 Fix version number in comment 2019-01-11 07:46:53 -05:00