mirror of https://github.com/qpdf/qpdf.git synced 2025-02-03 12:28:27 +00:00

3711 Commits

Author SHA1 Message Date
m-holger
8753ffe335 Generalise last commit to xref parsing and reconstruction
Tests use a modified dangling-bad-xref.pdf.
2024-10-18 16:48:30 +01:00
m-holger
405f3765c5 Refactor QPDF::Objects / JSONReactor interaction
Allow for the fact that when processing JSON input we cannot determine
whether references are dangling until the whole input has been processed.

We handle this by optimistically storing references in the object table.
When additional gens are encountered for the same object we store them
in the new map unconfirmed_objects. Once processing is complete we clean
up the object table and clear unconfirmed_objects.

New test cases are adapted from manual-qpdf-json.json etc.
2024-10-18 16:23:46 +01:00
m-holger
486e68f2e5 Tidy Objects::read 2024-10-17 14:30:10 +01:00
m-holger
6408c5c364 Refactor Objects::update_table
Decompose into new methods update_entry and Entry::update.
2024-10-17 14:30:10 +01:00
m-holger
55bca1a117 Index QPDF::Objects::table by object id only 2024-10-17 14:30:10 +01:00
m-holger
542794bc28 Replace QPDFObject::setObjGen with new methods make_indirect etc 2024-10-17 14:30:10 +01:00
m-holger
84a5c03bb8 Change signatures of various QPDF::Objects methods 2024-10-17 14:30:10 +01:00
m-holger
6555cdb6de Refactor Objects::next_id
Store last used id as a data member. Update when calling next_id /
inserting a new object (which only occurs in make_indirect and
get_for_json). Add new methods initialize to calculate the initial value
for last_id and last_id to query it.

Use initialize in QPDF::fixDanglingReferences, which is redundant.
Refactor make_indirect to use update_table.
2024-10-17 14:30:10 +01:00
m-holger
7a20ba0c6a Add new method Xref_table::prepare_obj_table
Remove invalid objects introduced into the object table during xref table
parsing. At the moment, such invalid objects could override valid objects
if valid gen < invalid gen. Once the object table is indexed by object id
only, the invalid object will prevent valid objects with the same id
entering the object table.

The additional test case uses good10.pdf, with the dangling reference
in the trailer replaced with 3 1 R. Prior to this commit, this caused
the page object 3 0 to be masked and replaced with a null object.
2024-10-17 14:30:10 +01:00
m-holger
560130c893 Refactor Xref_table::resolve
Make resolve responsible for trying again after xref reconstruction
rather than returning false to indicate failure. Rename to resolve_all.
2024-10-17 14:30:10 +01:00
m-holger
c648b9a018
Merge pull request #1297 from m-holger/qpdf_objects
Add inner class QPDF::Objects to encapsulate reading and managing of objects
2024-10-17 14:03:41 +01:00
m-holger
acc57ca090 Add QPDF::Objects destructor
Also, make obj_cache private and rename to table.
2024-10-09 12:02:34 +01:00
m-holger
336d783325 Move calculations from QPDF::getObjectCount to Objects::next_id 2024-10-09 11:55:29 +01:00
m-holger
113ea4e7ae Add new method Objects::all 2024-10-09 11:39:44 +01:00
m-holger
9e03dc54cc Add new method Objects::swap 2024-10-09 11:39:17 +01:00
m-holger
83fc18af09 Add new method Objects::replace 2024-10-09 11:27:40 +01:00
m-holger
6c9903062f Add new method Objects::get 2024-10-09 11:27:28 +01:00
m-holger
83443c116d Make ObjCache an inner class of QPDF::Objects and rename to Entry 2024-10-09 11:09:18 +01:00
m-holger
b5a5780019 Make Xref_table an inner class of QPDF::Objects 2024-10-09 09:53:57 +01:00
m-holger
a3f693c8f9 Move private methods in QPDF_objects to QPDF::Objects 2024-10-09 08:58:57 +01:00
m-holger
2015f71c7d Add new inner class to QPDF::Objects 2024-10-07 14:18:59 +01:00
m-holger
83897e8789 Split QPDF.cc into QPDF.cc and QPDF_objects.cc
Move methods responsible for loading or keeping track of objects to
QPDF_objects.cc.
2024-10-07 14:10:18 +01:00
m-holger
9f0cc086b1 Copy QPDF.cc to new QPDF_objects 2024-10-06 17:45:29 +01:00
m-holger
12b67a3227
Merge pull request #1282 from m-holger/next
Add new protected inline method Pipeline::next
2024-10-06 15:59:42 +01:00
m-holger
c916dcf973 Add new protected inline method Pipeline::next
Also, tidy pipeline constructors and make subclasses final where possible.
2024-10-06 15:10:13 +01:00
m-holger
2cb2412fbf
Merge pull request #1294 from m-holger/fuzz
Add additional xref and object stream sanity checks
2024-09-28 01:02:32 +01:00
m-holger
c2ff89ae11 Add additional fuzz test cases 2024-09-28 00:36:32 +01:00
m-holger
192525226f Validate that offsets in object streams are strictly increasing 2024-09-28 00:28:17 +01:00
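A strictly-increasing check of this kind can be sketched in a few lines. This is only an illustration, not the code from the commit: the function name `offsets_strictly_increasing` and the use of a plain vector of offsets are assumptions.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper, not part of qpdf: returns true when every offset is
// larger than the one before it. Equal neighbours count as a violation.
bool offsets_strictly_increasing(std::vector<long long> const& offsets)
{
    // adjacent_find returns the first pair (a, b) with a >= b, if any.
    return std::adjacent_find(
               offsets.begin(), offsets.end(), std::greater_equal<long long>()) ==
        offsets.end();
}
```

Rejecting equal offsets as well as decreasing ones means two objects can never claim the same position in the stream.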
m-holger
1b6a504d42 Add sanity check for xref stream /Size entry 2024-09-28 00:25:31 +01:00
m-holger
529501aa41
Merge pull request #1293 from m-holger/pr1287
Tweak #1287 comments
2024-09-27 12:26:30 +01:00
m-holger
43a88e1d28 Tweak #1287 comments 2024-09-27 11:58:46 +01:00
m-holger
638bf5f9ae
Merge pull request #1287 from mslichao/mslichao/capifreebuf
Add C API qpdf_oh_free_buffer to release memory allocated by stream data functions
2024-09-27 11:34:54 +01:00
m-holger
1796365713
Merge branch 'main' into mslichao/capifreebuf 2024-09-27 11:31:55 +01:00
m-holger
50d385c858
Merge pull request #1274 from m-holger/meta
Add new commands --remove-metadata and --remove-info
2024-09-27 11:26:34 +01:00
m-holger
0198ff7e48
Merge pull request #1291 from m-holger/fuzz
In QPDFWordTokenFinder::check limit the token length
2024-09-24 01:55:36 +01:00
m-holger
0aa6b67eea In QPDFWordTokenFinder::check limit the token length
Tokens longer than the target cannot be a match and therefore there is no
need to read to the end of token.
2024-09-24 01:32:32 +01:00
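The idea in the message above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the actual QPDFWordTokenFinder code: `word_matches` is a hypothetical name, the input is an in-memory string, and token delimiters are simplified to whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the check described above. Reports via
// bytes_read how much of the token was actually examined.
bool word_matches(
    std::string const& data, std::size_t pos, std::string const& target, std::size_t& bytes_read)
{
    std::size_t len = 0;
    // Read at most target.size() + 1 bytes: a single byte beyond the
    // target length already proves that the token cannot match.
    while (pos + len < data.size() && len <= target.size() &&
           !std::isspace(static_cast<unsigned char>(data[pos + len]))) {
        ++len;
    }
    bytes_read = len;
    return len == target.size() && data.compare(pos, len, target) == 0;
}
```

With a target of `endobj`, a token thousands of bytes long is abandoned after seven bytes instead of being read to its end.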
m-holger
0e92cf6bf3
Merge pull request #1289 from m-holger/fuzz
Fix bugs found during fuzzing
2024-09-20 15:52:14 +01:00
m-holger
477fbd9839 Add additional fuzz test cases 2024-09-20 15:28:53 +01:00
m-holger
21f176d374 Add sanity check on trailer /Size entry 2024-09-20 15:28:49 +01:00
m-holger
44a1395194 Refactor QPDF::Xref_table::read_entry and read_bad_entry
Return results rather than using reference parameters.

Fixes bug in #1272 where parameters were not reinitialized when calling
read_bad_entry from read_entry.
2024-09-20 15:28:34 +01:00
Chao Li(VISION)
f6ae1ff16a Rename to qpdf_oh_free_buffer 2024-09-20 04:53:32 +00:00
m-holger
7d34b89a69
Merge pull request #1288 from m-holger/fuzz
In QPDFParser add a limit on total number of errors in one object
2024-09-19 23:58:26 +01:00
m-holger
06a2d955fc In QPDFParser add a limit on total number of errors in one object
Currently, QPDFParser gives up attempting to parse an object if 5
near-consecutive bad tokens are encountered. Add a limit of a total of 15
bad tokens in a single object before giving up.
2024-09-19 17:28:26 +01:00
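The two limits can be pictured as two counters kept per object. The sketch below is an assumption-laden illustration rather than QPDFParser's implementation: the class `ErrorBudget` is hypothetical, and so is the reading of "near-consecutive" used here (a streak of bad tokens is forgiven only after five good tokens in a row).

```cpp
// Hypothetical helper, not part of qpdf's API.
class ErrorBudget
{
  public:
    // Record a bad token. Returns false once parsing of the current
    // object should be abandoned.
    bool record_bad()
    {
        ++streak_;
        ++total_;
        good_run_ = 0;
        return streak_ < max_streak && total_ < max_total;
    }

    // Record a good token. Assumption: the streak is reset only after
    // several good tokens in a row, so isolated good tokens do not help.
    void record_good()
    {
        if (streak_ > 0 && ++good_run_ >= forgive_after) {
            streak_ = 0;
            good_run_ = 0;
        }
    }

  private:
    static constexpr int max_streak = 5;    // limit quoted in the message above
    static constexpr int max_total = 15;    // limit added by this commit
    static constexpr int forgive_after = 5; // illustrative value, an assumption
    int streak_ = 0;
    int total_ = 0;
    int good_run_ = 0;
};
```

The total counter is never reset, so an object whose errors are spread out enough to avoid the streak limit is still abandoned at the fifteenth bad token.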
Chao Li(VISION)
8c1cde4ec3 Add C API qpdf_free_buffer to release memory allocated by stream data functions 2024-09-19 12:21:49 +00:00
m-holger
ff2a78f579
Merge pull request #1272 from m-holger/xref_table
Refactor QPDF xref table
2024-09-19 07:58:48 +01:00
m-holger
cb7180b1ba Move QPDF::ObjCache::end_before_space etc to Xref_table
Also, delay adjustments for compressed objects until needed by
linearization checks.
2024-09-18 10:25:38 +01:00
m-holger
28c13f5492 Refactor Xref_table::subsections
Optimistically read subsection headers without reading individual object
entries, assuming that they are 20 bytes long as per the PDF spec. If
problems are encountered, fall back to calling bad_subsections.
2024-09-18 10:25:38 +01:00
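The fixed entry size is what makes it possible to jump from one subsection header to the next without reading the entries in between. A minimal sketch of that layout follows; it is not the Xref_table code. The names `XrefEntry`, `skip_entries` and `parse_fixed_entry` are hypothetical, the input is an in-memory string, and the two end-of-line bytes are not validated.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical types and helpers, not part of qpdf.
struct XrefEntry
{
    long long offset;
    int gen;
    char type; // 'n' (in use) or 'f' (free)
};

// Each entry is 10 digits, space, 5 digits, space, type, 2-byte EOL.
constexpr std::size_t entry_size = 20;

// Position just past `count` entries, i.e. where the next header should start.
constexpr std::size_t skip_entries(std::size_t entries_start, std::size_t count)
{
    return entries_start + count * entry_size;
}

// Parse one fixed-width entry. An empty result signals that the caller
// should fall back to a more tolerant parser.
std::optional<XrefEntry> parse_fixed_entry(std::string const& data, std::size_t pos)
{
    if (pos > data.size() || data.size() - pos < entry_size) {
        return std::nullopt;
    }
    auto digits = [&](std::size_t from, std::size_t n, long long& out) {
        out = 0;
        for (std::size_t i = from; i < from + n; ++i) {
            if (!std::isdigit(static_cast<unsigned char>(data[i]))) {
                return false;
            }
            out = out * 10 + (data[i] - '0');
        }
        return true;
    };
    long long offset = 0;
    long long gen = 0;
    char type = data[pos + 17];
    if (!digits(pos, 10, offset) || data[pos + 10] != ' ' || !digits(pos + 11, 5, gen) ||
        data[pos + 16] != ' ' || (type != 'n' && type != 'f')) {
        return std::nullopt;
    }
    return XrefEntry{offset, static_cast<int>(gen), type};
}
```

Returning an empty optional instead of throwing keeps the optimistic path cheap and leaves the decision to fall back with the caller.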
m-holger
ad10fa3006 Rename Xref_table::subsections to bad_subsections 2024-09-18 10:25:38 +01:00
m-holger
0f0747b3ae Refactor QPDF::getXRefTable 2024-09-18 10:25:38 +01:00
m-holger
965f0fcd63 Refactor QPDF::recoverStreamLength 2024-09-18 10:25:38 +01:00