m-holger
8753ffe335
Generalise last commit to xref parsing and reconstruction
...
Tests use a modified dangling-bad-xref.pdf.
2024-10-18 16:48:30 +01:00
m-holger
405f3765c5
Refactor QPDF::Objects / JSONReactor interaction
...
Allow for the fact that when processing JSON input we cannot determine
whether references are dangling until the whole input has been processed.
We handle this by optimistically storing references in the object table.
When additional gens are encountered for the same object we store them
in the new map unconfirmed_objects. Once processing is complete we clean
up the object table and clear unconfirmed_objects.
New test cases are adapted from manual-qpdf-json.json etc.
2024-10-18 16:23:46 +01:00
m-holger
486e68f2e5
Tidy Objects::read
2024-10-17 14:30:10 +01:00
m-holger
6408c5c364
Refactor Objects::update_table
...
Decompose into new methods update_entry and Entry::update.
2024-10-17 14:30:10 +01:00
m-holger
55bca1a117
Index QPDF::Objects::table by object id only
2024-10-17 14:30:10 +01:00
m-holger
542794bc28
Replace QPDFObject::setObjGen with new methods make_indirect etc
2024-10-17 14:30:10 +01:00
m-holger
84a5c03bb8
Change signatures of various QPDF::Objects methods
2024-10-17 14:30:10 +01:00
m-holger
6555cdb6de
Refactor Objects::next_id
...
Store last used id as a data member. Update when calling next_id /
inserting a new object (which only occurs in make_indirect and
get_for_json). Add new methods initialize to calculate the initial value
for last_id and last_id to query it.
Use initialize in QPDF::fixDanglingReferences, which is redundant.
Refactor make_indirect to use update_table.
2024-10-17 14:30:10 +01:00
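The id bookkeeping described in the Objects::next_id commit above can be sketched roughly as follows. This is an illustration only: ObjectTable, its insert method and the int payload are hypothetical stand-ins, not qpdf's actual classes. It assumes the table is keyed by object id and that last_id is the largest id seen so far.

```cpp
#include <map>

// Hypothetical stand-in for the object table; not qpdf's actual class.
class ObjectTable
{
  public:
    // Recompute last_id_ from the largest id currently stored in the table.
    void initialize()
    {
        last_id_ = table_.empty() ? 0 : table_.rbegin()->first;
    }

    // Reserve and return the next unused id.
    int next_id()
    {
        return ++last_id_;
    }

    // Query the last id handed out or inserted.
    int last_id() const
    {
        return last_id_;
    }

    // Insert an entry under an explicit id, keeping last_id_ current.
    void insert(int id, int value)
    {
        table_[id] = value;
        if (id > last_id_) {
            last_id_ = id;
        }
    }

  private:
    std::map<int, int> table_; // object id -> payload (assumed layout)
    int last_id_{0};
};
```

Storing the value as a data member avoids rescanning the table every time a new id is needed; initialize is only required when the table was filled by some other route.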
m-holger
7a20ba0c6a
Add new method Xref_table::prepare_obj_table
...
Remove invalid objects introduced into the object table during xref table
parsing. At the moment, such invalid objects could override valid objects
if valid gen < invalid gen. Once the object table is indexed by object id
only, the invalid object will prevent valid objects with the same id
entering the object table.
The additional test case uses good10.pdf, with the dangling reference
in the trailer replaced with 3 1 R. Prior to this commit, this caused
the page object 3 0 to be masked and replaced with a null object.
2024-10-17 14:30:10 +01:00
m-holger
560130c893
Refactor Xref_table::resolve
...
Make resolve responsible for trying again after xref reconstruction
rather than returning false to indicate failure. Rename to resolve_all.
2024-10-17 14:30:10 +01:00
m-holger
c648b9a018
Merge pull request #1297 from m-holger/qpdf_objects
...
Add inner class QPDF::Objects to encapsulate reading and managing of objects
2024-10-17 14:03:41 +01:00
m-holger
acc57ca090
Add QPDF::Objects destructor
...
Also, make obj_cache private and rename to table.
2024-10-09 12:02:34 +01:00
m-holger
336d783325
Move calculations from QPDF::getObjectCount to Objects::next_id
2024-10-09 11:55:29 +01:00
m-holger
113ea4e7ae
Add new method Objects::all
2024-10-09 11:39:44 +01:00
m-holger
9e03dc54cc
Add new method Objects::swap
2024-10-09 11:39:17 +01:00
m-holger
83fc18af09
Add new method Objects::replace
2024-10-09 11:27:40 +01:00
m-holger
6c9903062f
Add new method Objects::get
2024-10-09 11:27:28 +01:00
m-holger
83443c116d
Make ObjCache an inner class of QPDF::Objects and rename to Entry
2024-10-09 11:09:18 +01:00
m-holger
b5a5780019
Make Xref_table an inner class of QPDF::Objects
2024-10-09 09:53:57 +01:00
m-holger
a3f693c8f9
Move private methods in QPDF_objects to QPDF::Objects
2024-10-09 08:58:57 +01:00
m-holger
2015f71c7d
Add new inner class QPDF::Objects
2024-10-07 14:18:59 +01:00
m-holger
83897e8789
Split QPDF.cc into QPDF.cc and QPDF_objects.cc
...
Move methods responsible for loading or keeping track of objects to
QPDF_objects.cc.
2024-10-07 14:10:18 +01:00
m-holger
9f0cc086b1
Copy QPDF.cc to new QPDF_objects.cc
2024-10-06 17:45:29 +01:00
m-holger
12b67a3227
Merge pull request #1282 from m-holger/next
...
Add new protected inline method Pipeline::next
2024-10-06 15:59:42 +01:00
m-holger
c916dcf973
Add new protected inline method Pipeline::next
...
Also, tidy pipeline constructors and make subclasses final where possible.
2024-10-06 15:10:13 +01:00
m-holger
2cb2412fbf
Merge pull request #1294 from m-holger/fuzz
...
Add additional xref and object stream sanity checks
2024-09-28 01:02:32 +01:00
m-holger
c2ff89ae11
Add additional fuzz test cases
2024-09-28 00:36:32 +01:00
m-holger
192525226f
Validate that offsets in object streams are strictly increasing
2024-09-28 00:28:17 +01:00
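A check of the kind named in the commit above (offsets in an object stream must be strictly increasing) might look like the following minimal sketch. The function name and the use of long long for offsets are assumptions made for illustration; this is not the code added to qpdf.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: true if every offset is greater than the one before it.
bool offsets_strictly_increasing(std::vector<long long> const& offsets)
{
    // adjacent_find returns the first neighbouring pair (a, b) with a >= b,
    // or end() if no such pair exists.
    return std::adjacent_find(offsets.begin(), offsets.end(), std::greater_equal<>()) ==
        offsets.end();
}
```

Equal neighbouring offsets are rejected as well, since two objects cannot start at the same position in the stream.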
m-holger
1b6a504d42
Add sanity check for xref stream /Size entry
2024-09-28 00:25:31 +01:00
m-holger
529501aa41
Merge pull request #1293 from m-holger/pr1287
...
Tweak #1287 comments
2024-09-27 12:26:30 +01:00
m-holger
43a88e1d28
Tweak #1287 comments
2024-09-27 11:58:46 +01:00
m-holger
638bf5f9ae
Merge pull request #1287 from mslichao/mslichao/capifreebuf
...
Add C API qpdf_oh_free_buffer to release memory allocated by stream data functions
2024-09-27 11:34:54 +01:00
m-holger
1796365713
Merge branch 'main' into mslichao/capifreebuf
2024-09-27 11:31:55 +01:00
m-holger
50d385c858
Merge pull request #1274 from m-holger/meta
...
Add new commands --remove-metadata and --remove-info
2024-09-27 11:26:34 +01:00
m-holger
0198ff7e48
Merge pull request #1291 from m-holger/fuzz
...
In QPDFWordTokenFinder::check limit the token length
2024-09-24 01:55:36 +01:00
m-holger
0aa6b67eea
In QPDFWordTokenFinder::check limit the token length
...
Tokens longer than the target cannot be a match, so there is no need to
read to the end of the token.
2024-09-24 01:32:32 +01:00
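The early exit described in the QPDFWordTokenFinder::check commit above can be illustrated with a small sketch. token_equals is a hypothetical helper, and it simplifies tokenization by treating only whitespace as a token boundary, which is not how the PDF tokenizer actually delimits tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: does the whitespace-delimited token starting at pos
// equal target? Assumes pos <= input.size().
bool token_equals(std::string_view input, std::size_t pos, std::string_view target)
{
    std::size_t len = 0;
    while (pos + len < input.size() &&
           !std::isspace(static_cast<unsigned char>(input[pos + len]))) {
        if (++len > target.size()) {
            // The token is already longer than the target, so it cannot
            // match; stop without reading the rest of it.
            return false;
        }
    }
    return input.substr(pos, len) == target;
}
```

With this bound, the work per candidate is proportional to the length of the target rather than to the length of an arbitrarily long token in a malformed file.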
m-holger
0e92cf6bf3
Merge pull request #1289 from m-holger/fuzz
...
Fix bugs found during fuzzing
2024-09-20 15:52:14 +01:00
m-holger
477fbd9839
Add additional fuzz test cases
2024-09-20 15:28:53 +01:00
m-holger
21f176d374
Add sanity check on trailer /Size entry
2024-09-20 15:28:49 +01:00
m-holger
44a1395194
Refactor QPDF::Xref_table::read_entry and read_bad_entry
...
Return results rather than using reference parameters.
Fixes bug in #1272 where parameters were not reinitialized when calling
read_bad_entry from read_entry.
2024-09-20 15:28:34 +01:00
Chao Li(VISION)
f6ae1ff16a
Rename to qpdf_oh_free_buffer
2024-09-20 04:53:32 +00:00
m-holger
7d34b89a69
Merge pull request #1288 from m-holger/fuzz
...
In QPDFParser add a limit on total number of errors in one object
2024-09-19 23:58:26 +01:00
m-holger
06a2d955fc
In QPDFParser add a limit on total number of errors in one object
...
Currently, QPDFParser gives up attempting to parse an object if 5
near-consecutive bad tokens are encountered. Add a limit of a total of 15
bad tokens in a single object before giving up.
2024-09-19 17:28:26 +01:00
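The two limits from the QPDFParser commit above can be sketched as a small counter class. The thresholds 5 and 15 come from the commit message; everything else is an assumption: ErrorBudget is a hypothetical name, and the rule that a run of bad tokens is forgotten after 5 good tokens is only one possible reading of "near-consecutive".

```cpp
// Hypothetical error budget; not QPDFParser's actual implementation.
class ErrorBudget
{
  public:
    // Record a bad token. Returns false once parsing of the current object
    // should be abandoned.
    bool bad_token()
    {
        good_run_ = 0;
        ++recent_bad_;
        ++total_bad_;
        return recent_bad_ < max_recent && total_bad_ < max_total;
    }

    // Record a good token. Enough good tokens in a row clear the recent
    // count, but never the total.
    void good_token()
    {
        if (recent_bad_ > 0 && ++good_run_ >= good_to_reset) {
            recent_bad_ = 0;
            good_run_ = 0;
        }
    }

  private:
    static constexpr int max_recent = 5;    // from the commit message
    static constexpr int max_total = 15;    // from the commit message
    static constexpr int good_to_reset = 5; // assumption, for illustration
    int recent_bad_{0};
    int total_bad_{0};
    int good_run_{0};
};
```

The total limit matters for input that interleaves bad tokens with just enough good ones to keep resetting the recent count; without it such an object could generate an unbounded number of errors.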
Chao Li(VISION)
8c1cde4ec3
Add C API qpdf_free_buffer to release memory allocated by stream data functions
2024-09-19 12:21:49 +00:00
m-holger
ff2a78f579
Merge pull request #1272 from m-holger/xref_table
...
Refactor QPDF xref table
2024-09-19 07:58:48 +01:00
m-holger
cb7180b1ba
Move QPDF::ObjCache::end_before_space etc to Xref_table
...
Also, delay adjustments for compressed objects until needed by
linearization checks.
2024-09-18 10:25:38 +01:00
m-holger
28c13f5492
Refactor Xref_table::subsections
...
Optimistically read subsection headers without reading individual object
entries, assuming that they are 20 bytes long as per the PDF spec. If
problems are encountered, fall back to calling bad_subsections.
2024-09-18 10:25:38 +01:00
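The optimistic approach from the Xref_table::subsections commit above can be sketched as follows: read each "first count" header, then jump over count entries of 20 bytes each instead of parsing them. Subsection and read_subsections are hypothetical names, the input is assumed to start directly at the first subsection header, and a std::nullopt result stands in for "fall back to the slow path" (bad_subsections in the commit).

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record for one xref subsection.
struct Subsection
{
    int first;                  // first object id in the subsection
    int count;                  // number of entries
    std::size_t entries_offset; // offset of the first 20-byte entry
};

constexpr std::size_t entry_size = 20; // fixed entry length per the PDF spec

// Returns the subsection headers found before "trailer", or std::nullopt if
// anything looks wrong and the caller should fall back to careful parsing.
std::optional<std::vector<Subsection>>
read_subsections(std::string const& data, std::size_t pos)
{
    std::vector<Subsection> result;
    // Read up to 9 decimal digits at pos; false if there are none.
    auto read_int = [&](int& value) {
        std::size_t start = pos;
        value = 0;
        while (pos < data.size() && pos - start < 9 &&
               std::isdigit(static_cast<unsigned char>(data[pos]))) {
            value = value * 10 + (data[pos++] - '0');
        }
        return pos > start;
    };
    while (true) {
        if (data.compare(pos, 7, "trailer") == 0) {
            return result;
        }
        Subsection s{};
        if (!read_int(s.first) || pos >= data.size() || data[pos++] != ' ' ||
            !read_int(s.count)) {
            return std::nullopt;
        }
        if (pos < data.size() && data[pos] == '\r') {
            ++pos; // accept CR LF as well as LF
        }
        if (pos >= data.size() || data[pos++] != '\n') {
            return std::nullopt;
        }
        s.entries_offset = pos;
        std::size_t bytes = static_cast<std::size_t>(s.count) * entry_size;
        if (bytes > data.size() - pos) {
            return std::nullopt; // entries would run past the end of the data
        }
        pos += bytes; // skip the entries without reading them
        result.push_back(s);
    }
}
```

If an entry is not exactly 20 bytes long, the next header will not be where the arithmetic expects it, the header parse fails, and the function reports failure rather than guessing.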
m-holger
ad10fa3006
Rename Xref_table::subsections to bad_subsections
2024-09-18 10:25:38 +01:00
m-holger
0f0747b3ae
Refactor QPDF::getXRefTable
2024-09-18 10:25:38 +01:00
m-holger
965f0fcd63
Refactor QPDF::recoverStreamLength
2024-09-18 10:25:38 +01:00