Commit Graph

260 Commits

Author SHA1 Message Date
James R. Barlow 12967bdf8a Take advantage of unique_ptr and move construction for Buffer
Since Buffer has always implemented its copy constructor with a deep
copy, its Members object will never have multiple owners. Change to unique_ptr.

Also implement move constructors for Buffer, since there may be cases
where a deep copy is not needed.
2022-09-05 13:29:56 -07:00
Jay Berkenbilt 0a54247652 Add QUtil::get_max_memory_usage for testing 2022-08-31 14:47:27 -04:00
Jay Berkenbilt 4674c04cb8 JSON schema: support multi-element array validation 2022-07-24 16:44:51 -04:00
Jay Berkenbilt f8d1ab9462 JSON schema -- accept single item in place of array
When the schema wants a variable-length array, allow a single item as
well as allowing an array.
2022-07-24 16:17:03 -04:00
Jay Berkenbilt 6c4537885e Reformat code 2022-06-25 11:11:24 -04:00
Jay Berkenbilt eae75dbe44 Add Pl_Function -- a generic function pipeline 2022-06-19 09:12:29 -04:00
Jay Berkenbilt 8130d50e3b Add C API to QPDFLogger 2022-06-19 08:46:58 -04:00
Jay Berkenbilt f1f711963b Add and test QPDFLogger class 2022-06-18 09:02:55 -04:00
Jay Berkenbilt 04fc7c4bea Add conversions to ISO-8601 date format 2022-05-30 20:03:08 -04:00
Jay Berkenbilt 05460d405c Format code 2022-05-21 16:11:42 -04:00
Jay Berkenbilt 47c093c48b Replace std::regex with validators for better performance 2022-05-21 08:43:21 -04:00
Jay Berkenbilt 3eb77a7004 JSON: detect duplicate dictionary keys while parsing 2022-05-20 10:13:15 -04:00
Jay Berkenbilt 6c7326b290 JSON fix: correctly parse UTF-16 surrogate pairs 2022-05-20 09:16:25 -04:00
Jay Berkenbilt 56f1b411fe Back out fluent QPDFObjectHandle methods. Keep the andGet methods.
I decided these were confusing and inconsistent with how JSON works.
They muddle the API rather than improving it.
2022-05-20 09:16:25 -04:00
Jay Berkenbilt 60ec94a7c3 Add QUtil::is_long_long 2022-05-16 13:39:26 -04:00
Jay Berkenbilt 4c7cfd5cbc JSON reactor: improve handling of nested containers
Call the parent container's item method before calling the child
item's start method so we can easily know the current nesting level
when nested items are added.
2022-05-14 17:35:06 -04:00
Jay Berkenbilt 16f4f94cd9 Prepare code for JSON v2
Update getJSON() methods and calls to them
2022-05-07 11:12:01 -04:00
Jay Berkenbilt 0500d4347a JSON: add blob type that generates base64-encoded binary data 2022-05-06 19:14:52 -04:00
Jay Berkenbilt 05fda4afa2 Change JSON parser to parse from an InputSource 2022-05-04 12:07:11 -04:00
Jay Berkenbilt e5f3910c3e Add new FileInputSource constructors 2022-05-04 12:07:11 -04:00
Jay Berkenbilt 16139d97c8 Add new Pl_OStream Pipeline 2022-05-03 18:54:51 -04:00
Jay Berkenbilt 59f3e09edf Make Pipeline::write take an unsigned char const* (API change) 2022-05-03 18:31:22 -04:00
Jay Berkenbilt d55c7ac570 Spell check with newer cSpell 2022-05-03 18:31:22 -04:00
Jay Berkenbilt 62bf296a9c Make assert handling less error-prone
Prevent my future self or other contributors from using assert in
tests and then having that assert not do anything because of the
NDEBUG macro.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt 3d9bac43da Add internal Pl_Base64
Bidirectional base64; will be used by JSON v2.
2022-05-03 18:31:22 -04:00
Jay Berkenbilt 8d2a0eda5a Add reactors to the JSON parser 2022-05-01 19:55:52 -04:00
Jay Berkenbilt f5dd63819d Windows perl workaround 2022-05-01 19:55:52 -04:00
Jay Berkenbilt 72e5c73419 Limit parser depth for json parser 2022-05-01 12:56:22 -04:00
Jay Berkenbilt 4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoid this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more effecient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt 7f023701dd Formatting: remove space in range-style for loops
Change .clang-format and commit automated changes from a fresh run of
format-code
2022-04-30 13:26:43 -04:00
Jay Berkenbilt 2878c186bf Use fluent appendItem 2022-04-30 10:54:16 -04:00
Jay Berkenbilt ab9d557cb0 Use fluent replaceKey 2022-04-29 20:39:54 -04:00
Jay Berkenbilt 22b35c4928 Expose QUtil::get_next_utf8_codepoint 2022-04-23 18:25:43 -04:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt 128e41648f Remove PointerHolder.hh from other than public header files
Increase to POINTERHOLDER_TRANSITION=4
2022-04-09 17:33:29 -04:00
Jay Berkenbilt ba0ef7a124 Replace PointerHolder with std::shared_ptr in the rest of the code
Increase to POINTERHOLDER_TRANSITION=3

patrepl s/PointerHolder/std::shared_ptr/g **/*.cc **/*.hh
patrepl s/make_pointer_holder/std::make_shared/g **/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g **/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
git restore libtests/pointer_holder.cc
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt ae819b5318 Rewrite PointerHolder as derived from std::shared_ptr 2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt 70d0d0889b Remove old build files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt b8aff90997 Add cmake configuration files 2022-03-18 19:53:18 -04:00
Jay Berkenbilt f030789104 Rename bits_include.cc to qpdf/bits_functions.hh
It's better to just make it a .hh file to reduce confusion.
2022-03-07 18:01:27 -05:00
Jay Berkenbilt 17c0e38c8e Force assert to be defined in test code 2022-03-07 10:07:27 -05:00
Jay Berkenbilt 6aa58d51be Rename bits.icc to bits_include.cc 2022-02-26 12:08:58 -05:00
Jay Berkenbilt 36794a60cf Allow \/ in a json string 2022-02-25 11:42:50 -05:00
Jay Berkenbilt 3e2109ab37 Remove special case for 0xad for 10.6.2. 2022-02-16 06:52:05 -05:00
Jay Berkenbilt 74d66a9349 Spell check 2022-02-15 19:36:12 -05:00
Jay Berkenbilt a478cbb6dc Silently/transparently recognize UTF-16LE as UTF-16 (fixes #649)
The PDF spec only allows UTF-16BE, but most readers seem to accept
UTF-16LE as well, so now qpdf does too.
2022-02-15 16:13:12 -05:00
Jay Berkenbilt 1065bbb016 Handle odd PDFDoc codepoints in UTF-8 during transcoding (fixes #650)
There are codepoints in PDFDoc that are not valid UTF-8 but map to
valid UTF-8. We were handling those correctly with bidirectional
mapping.

However, if those same code points appeared in UTF-8, where they have
no meaning, they were left as fixed points when converting to PDFDoc,
where they do have meaning. This change recognizes them as errors.
2022-02-15 08:32:38 -05:00
Jay Berkenbilt 235c89e037 Fix one more PDF doc encoding error for 10.6 release (fixes #637) 2022-02-09 05:47:58 -05:00
Jay Berkenbilt cfd5147d92 Add QPDF::getVersionAsPDFVersion 2022-02-08 12:34:14 -05:00
Jay Berkenbilt 8082af09be Add PDFVersion class 2022-02-08 12:34:14 -05:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt c62e8e2b28 Update for clean compile with POINTERHOLDER_TRANSITION=2 2022-02-07 17:38:22 -05:00
Jay Berkenbilt dd4f30226f Rework PointerHolder transition to make it smoother
* Don't surprise people with deprecation warnings
* Provide detailed instructions and support for the transition
2022-02-07 17:38:20 -05:00
Jay Berkenbilt 5f3f78822b Improve use of std::unique_ptr
* Use unique_ptr in place of shared_ptr in some cases
* unique_ptr for arrays does not require a custom deleter
* use std::make_unique (c++14) where possible
2022-02-05 11:24:56 -05:00
Jay Berkenbilt 88c3d556d5 Spell check 2022-02-05 11:24:56 -05:00
Jay Berkenbilt 9044a24097 PointerHolder: deprecate getPointer() and getRefcount()
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.

This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
Jay Berkenbilt f727bc9443 PointerHolder: add get() and use_count() for forward compatibility
PointerHolder will be replaced with shared_ptr, so let people start
moving.
2022-02-04 13:12:37 -05:00
Jay Berkenbilt f76191f0c2 Add array test to PointerHolder 2022-02-04 13:12:37 -05:00
Jay Berkenbilt b02d37bc0a Make QPDFArgParser accept const argv
This makes it much more convention to use the initializeFromArgv
functions since you can use string literals.
2022-02-01 13:50:58 -05:00
Jay Berkenbilt 3b60224bae JSONHandler: pass JSON object to array start function 2022-01-31 15:57:45 -05:00
Jay Berkenbilt ce3406e93f JSONHandler: pass JSON object to dict start function
If some keys depend on others, we have to check up front since there
is no control of what order key handlers will be called. Anyway, keys
are unordered in json, so we don't want to depend on ordering.
2022-01-31 15:57:45 -05:00
Jay Berkenbilt 7097f29019 More editorial changes from m-holger + spell check 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0e909bab8e Improve top-level help information 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 0364024781 Use QPDFUsage exception for cli, json, and QPDFJob errors 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 901e3e4fbf QPDFArgParser: remove unused copyFromOtherTable
This was used, but it no longer is, so let's not keep the extra
complexity around.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 76c4f78b5c Add QUtil::make_shared_cstr
Replace most of the calls to QUtil::copy_string with this instead.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt 8dea480c9f Allow optional fields in json "schema" checks 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 1db0a7ffce JSONHandler: rework dictionary and array handlers 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 37105710ee Implement JSONHandler for recursively processing JSON 2022-01-30 13:11:03 -05:00
Jay Berkenbilt e8e8f6f43c Add JSON::parse 2022-01-30 13:11:03 -05:00
Jay Berkenbilt c8729398dd Generate help content from manual
This is a massive rewrite of the help text and cli.rst section of the
manual. All command-line flags now have their own help and are
specifically index. qpdf --help is completely redone.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt b4bd124be4 QPDFArgParser: support adding/printing help information 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 53ba65eb59 QPDFArgParser: handle optional choices including help
Handle optional choices in addition to required choices. Refactor the
way help options are added to completion to make it work with optional
help choices.
2022-01-30 13:11:03 -05:00
Jay Berkenbilt f1d805badc Add QPDFArgParser::copyFromOtherTable 2022-01-30 13:11:03 -05:00
Jay Berkenbilt 7d48b446b3 Add raw string and user defined literals to c++11 tests 2022-01-30 13:11:02 -05:00
Jay Berkenbilt 52817f0a45 Implement QPDFArgParser based on ArgParser from qpdf.cc 2022-01-30 13:11:02 -05:00
Jay Berkenbilt 370710657a Add missing characters from PDF doc encoding (fixes #606) 2022-01-11 15:55:19 -05:00
Jay Berkenbilt 0f1ffa1215 Move bash/zsh completion helpers to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt 4782b5904f Move filter-completion.pl to libtests/arg_parser 2022-01-05 18:13:25 -05:00
Jay Berkenbilt af91b5b584 Add QUtil::file_can_be_opened 2021-12-29 13:41:02 -05:00
Jay Berkenbilt fee7489ee4 Add Pl_Buffer::getMallocBuffer 2021-12-17 12:38:52 -05:00
Jay Berkenbilt af2a71aa2c Handle bitstream overflow errors more gracefully (fixes #581)
* Make it a runtime error, not a logic error
* Include additional information
* Capture it properly in checkLinearization
2021-12-10 15:37:35 -05:00
Jay Berkenbilt ec09b91443 Add QIntC::range_check_subtract 2021-11-04 13:53:46 -04:00
Jay Berkenbilt 1886673d7e Spell check 2021-02-23 10:38:05 -05:00
Jay Berkenbilt 0b1623d07d Add QUtil::path_basename 2021-02-18 09:59:03 -05:00
Jay Berkenbilt 07f40bd254 QUtil::double_to_string: trim trailing zeroes with option to disable 2021-02-13 02:30:00 -05:00
Jay Berkenbilt 8fbc8579f2 Allow zone information to be omitted from timestamp strings 2021-02-11 14:26:55 -05:00
Jay Berkenbilt bf0e6eb302 Add QUtil methods for dealing with PDF timestamp strings 2021-02-09 17:50:24 -05:00
Jay Berkenbilt bfbeec5497 Make newly created name/number trees indirect objects 2021-02-08 06:49:56 -05:00
Jay Berkenbilt 553ac7f353 Add QUtil::pipe_file and QUtil::file_provider 2021-02-07 19:41:34 -05:00
Jay Berkenbilt 63e5cb533d Use new QPDF{Name,Number}TreeObjectHelper API 2021-01-24 03:27:28 -05:00
Jay Berkenbilt 5f0708418a Add iterators to name/number tree helpers 2021-01-24 03:22:59 -05:00
Jay Berkenbilt 4a1cce0a47 Reimplement name and number tree object helpers
Create a computationally and memory efficient implementation of name
and number trees that does binary searches as intended by the data
structure rather than loading into a map, which can use a great deal
of memory and can be very slow.
2021-01-24 03:22:51 -05:00
Jay Berkenbilt 4cbe2abcc0 Test empty function detection 2020-12-28 12:58:19 -05:00
Jay Berkenbilt 9d64481571 Handle negative numbers in QIntC::range_check (fuzz issue 26994) 2020-11-21 13:43:04 -05:00
Jay Berkenbilt bcea54fcaa Revert removal of unreadCh change for performance
Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR).
Update comments and code to reflect this.
2020-10-27 11:57:48 -04:00
Jay Berkenbilt 026330ebcd Make libtests depend on qpdf
We need to run qpdf --show-crypto.
2020-10-24 19:16:46 -04:00
Jay Berkenbilt 98f6c00dad Protect numeric conversion against user's locale (fixes #459) 2020-10-21 16:42:51 -04:00
Jay Berkenbilt bed165c9fc Stop using InputSource::unreadCh 2020-10-18 07:43:05 -04:00