Commit Graph

77 Commits

Author SHA1 Message Date
m-holger 34a6f8938f Add methods QPDFTokenizer::Token::isWord 2022-11-20 11:55:02 -05:00
m-holger b0c1ae05a3 Fix commit b45420a 2022-08-27 12:43:49 +01:00
m-holger 69a5fb7047 Add methods InputSource::fastRead, fastUnRead and fastTell
Provide buffered input for QPDFTokenizer.
2022-08-26 23:55:56 +01:00
m-holger 7108cd7b98 Remove redundant tests in QPDFTokenizer::readToken 2022-08-25 11:32:08 +01:00
m-holger 10fda01b07 In QPDFTokenizer::readToken move call to getToken out of loop 2022-08-25 11:31:45 +01:00
m-holger e4073ee868 Remove unnecessary string copy in QPDFTokenizer::getToken 2022-08-25 11:31:09 +01:00
m-holger b45420a980 Remove QPDFTokenizer::unread_char 2022-08-25 11:30:49 +01:00
m-holger 706106dabb Refactor QPDFTokenizer::betweenTokens() 2022-08-25 11:30:35 +01:00
m-holger 6371b90ae3 Refactor QPDFTokenizer::presentEOF 2022-08-25 11:30:24 +01:00
m-holger 42ed58e446 Integrate booleans and null into state machine in QPDFTokenizer 2022-08-25 11:30:13 +01:00
m-holger fe33b7ca18 Integrate numbers into state machine in QPDFTokenizer 2022-08-25 11:26:46 +01:00
m-holger 931fbb6156 Integrate names into state machine in QPDFTokenizer 2022-08-25 11:26:38 +01:00
m-holger a3f3238f37 Split QPDFTokenizer::handleCharacter into individual methods 2022-08-25 11:26:05 +01:00
m-holger 6111a6a424 Refactor QPDFTokenizer::inCharCode 2022-08-25 10:55:45 +01:00
m-holger e7889ec5dc Refactor st_top case in QPDFTokenizer::handleCharacter 2022-08-25 10:51:51 +01:00
m-holger e4fe0d5cf5 Refactor QPDFTokenizer::inHexstring 2022-08-25 10:50:06 +01:00
m-holger a5d2e88775 Code tidy: replace if with case statement in QPDFTokenizer::inString 2022-08-25 10:43:29 +01:00
m-holger 7c32f6cc2e Add state st_string_escape in QPDFTokenizer 2022-08-25 10:41:36 +01:00
m-holger 7c5778f999 Add state st_string_after_cr in QPDFTokenizer 2022-08-21 11:13:48 +01:00
m-holger f29d0a6312 Add state st_char_code in QPDFTokenizer 2022-08-21 11:01:48 +01:00
m-holger d26b537a7c Add private method QPDFTokenizer::inString 2022-08-21 02:54:34 +01:00
m-holger 2697ba49bc Add private method QPDFTokenizer::inHexstring 2022-08-21 02:46:31 +01:00
m-holger f9530a5815 Code tidy: replace if with case statement in QPDFTokenizer::handleCharacter 2022-08-21 02:38:49 +01:00
m-holger 86ade3f9cd Add private method QPDFTokenizer::handleCharacter 2022-08-21 02:26:27 +01:00
m-holger 91fb61eda5 Code tidy: replace if with case statement in QPDFTokenizer::presentCharacter 2022-08-21 00:54:41 +01:00
m-holger cf945eeabf Avoid shrinking QPDFTokenizer::val and QPDFTokenizer::raw_val 2022-08-20 19:43:00 +01:00
m-holger c08bb0ec02 Remove QPDFTokenizer::Members 2022-08-18 13:13:19 +01:00
m-holger 073808aa50 Code tidy : replace 0 with nullptr or true 2022-07-26 13:40:13 +01:00
m-holger 6c69a747b9 Code clean up: use range-style for loops wherever possible
Remove variables obsoleted by commit 4f24617.
2022-05-21 16:06:29 -04:00
Jay Berkenbilt 4f24617e1e Code clean up: use range-style for loops wherever possible
Where not possible, use "auto" to get the iterator type.

Editorial note: I have avoid this change for a long time because of
not wanting to make gratuitous changes to version history, which can
obscure when certain changes were made, but with having recently
touched every single file to apply automatic code formatting and with
making several broad changes to the API, I decided it was time to take
the plunge and get rid of the older (pre-C++11) verbose iterator
syntax. The new code is just easier to read and understand, and in
many cases, it will be more effecient as fewer temporary copies are
being made.

m-holger, if you're reading, you can see that I've finally come
around. :-)
2022-04-30 13:27:18 -04:00
Jay Berkenbilt 75fe4f60c3 Use anonymous namespaces for file-private classes 2022-04-16 13:35:27 -04:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt a68703b07e Replace PointerHolder with std::shared_ptr in library sources only
(patrepl and cleanpatch are my own utilities)

patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/*.hh
patrepl s/PointerHolder/std::shared_ptr/g libqpdf/*.cc
patrepl s/make_pointer_holder/std::make_shared/g libqpdf/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in  **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < $i >| $i.new && mv $i.new $i
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This comment expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because of not
wanting gratuitous whitespaces change to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt 9044a24097 PointerHolder: deprecate getPointer() and getRefcount()
Use get() and use_count() instead. Add #define
NO_POINTERHOLDER_DEPRECATION to remove deprecation markers for these
only.

This commit also removes all deprecated PointerHolder API calls from
qpdf's code except in PointerHolder's test suite, which must continue
to test the deprecated APIs.
2022-02-04 13:12:37 -05:00
Jay Berkenbilt 77c31305fe Fix signed/unsigned char warning (fixes #604) 2022-01-11 06:51:31 -05:00
Jay Berkenbilt bcea54fcaa Revert removal of unreadCh change for performance
Turns out unreadCh is much more efficient than seek(-1, SEEK_CUR).
Update comments and code to reflect this.
2020-10-27 11:57:48 -04:00
Jay Berkenbilt bed165c9fc Stop using InputSource::unreadCh 2020-10-18 07:43:05 -04:00
Jay Berkenbilt 92d3cbecd4 Fix warnings reported by -Wshadow=local (fixes #431) 2020-04-16 12:41:43 -04:00
Jay Berkenbilt 43f91f58b8 Improve invalid name token warning message
This message used to only appear for PDF >= 1.2. The invalid name is
valid for PDF 1.0 and 1.1. However, since QPDFWriter may write a newer
version, it's better to detect and warn in all cases. Therefore make
the warning more informative.
2019-08-19 19:48:27 -04:00
Jay Berkenbilt 42d396f1dd Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332) 2019-08-19 19:48:27 -04:00
Jay Berkenbilt 45dac410b5 Remove broken QPDFTokenizer::expectInlineImage 2019-06-21 22:29:31 -04:00
Jay Berkenbilt d71f05ca07 Fix sign and conversion warnings (major)
This makes all integer type conversions that have potential data loss
explicit with calls that do range checks and raise an exception. After
this commit, qpdf builds with no warnings when -Wsign-conversion
-Wconversion is used with gcc or clang or when -W3 -Wd4800 is used
with MSVC. This significantly reduces the likelihood of potential
crashes from bogus integer values.

There are some parts of the code that take int when they should take
size_t or an offset. Such places would make qpdf not support files
with more than 2^31 of something that usually wouldn't be so large. In
the event that such a file shows up and is valid, at least qpdf would
raise an error in the right spot so the issue could be legitimately
addressed rather than failing in some weird way because of a silent
overflow condition.
2019-06-21 13:17:21 -04:00
Thorsten Schöning 2c704b99a1 Undefined functions because of missing std:: or header. (#295)
* [bcc32 Error] QPDF.cc(375): E2268 Call to undefined function 'atof'
  Full parser context
    QPDF.cc(358): parsing: void QPDF::parse(const char *)

* [bcc32 Error] QPDFTokenizer.cc(183): E2268 Call to undefined function 'strtol'
  Full parser context
    QPDFTokenizer.cc(163): parsing: void QPDFTokenizer::resolveLiteral()

* [bcc32 Error] pdf-split-pages.cc(52): E2268 Call to undefined function 'exit'
  Full parser context
    pdf-split-pages.cc(50): parsing: void usage()

* PR #295: Including "cstdlib" should be replaced with "stdlib.h" to be more consistent. At the same time I changed the order of the surrounding includes to reflect alphabetical order, because at some files this already have been the case.
2019-03-12 10:05:29 -04:00
Jay Berkenbilt eb49e07c0a Make inline image token exactly contain the image data
Do not include the trailing EI, and handle cases where EI is not
preceded by a delimiter. Such cases have been seen in the wild.
2019-01-31 20:28:44 -05:00
Jay Berkenbilt 2b6c79bcae Improve locating inline image's EI
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt ec9e310c9e Refactor QPDFTokenizer's inline image handling
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt 31372edce0 Inline image token value ends with EI, not delimiter
The inline image token erroneously included the delimiter that
followed EI. The ObjectHandle created from it was correct.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt 4a4736c695 Fix EOL handling inside strings (fixes #226)
CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is
to be ignored after backslash.
2018-08-05 20:48:35 -04:00