64 Commits

Author SHA1 Message Date
m-holger 34a6f8938f Add methods QPDFTokenizer::Token::isWord 2022-11-20 11:55:02 -05:00
m-holger dca70f13e7 Add method QPDFTokenizer::Token::isInteger 2022-11-20 11:55:02 -05:00
Jay Berkenbilt 9a9a7ab097 Comment about qpdf/PointerHolder.hh in public headers 2022-09-23 15:15:39 -04:00
m-holger b45420a980 Remove QPDFTokenizer::unread_char 2022-08-25 11:30:49 +01:00
m-holger 706106dabb Refactor QPDFTokenizer::betweenTokens() 2022-08-25 11:30:35 +01:00
m-holger 42ed58e446 Integrate booleans and null into state machine in QPDFTokenizer 2022-08-25 11:30:13 +01:00
m-holger fe33b7ca18 Integrate numbers into state machine in QPDFTokenizer 2022-08-25 11:26:46 +01:00
m-holger 931fbb6156 Integrate names into state machine in QPDFTokenizer 2022-08-25 11:26:38 +01:00
m-holger a3f3238f37 Split QPDFTokenizer::handleCharacter into individual methods 2022-08-25 11:26:05 +01:00
m-holger 6111a6a424 Refactor QPDFTokenizer::inCharCode 2022-08-25 10:55:45 +01:00
m-holger e4fe0d5cf5 Refactor QPDFTokenizer::inHexstring 2022-08-25 10:50:06 +01:00
m-holger 7c32f6cc2e Add state st_string_escape in QPDFTokenizer 2022-08-25 10:41:36 +01:00
m-holger 7c5778f999 Add state st_string_after_cr in QPDFTokenizer 2022-08-21 11:13:48 +01:00
m-holger f29d0a6312 Add state st_char_code in QPDFTokenizer 2022-08-21 11:01:48 +01:00
m-holger d26b537a7c Add private method QPDFTokenizer::inString 2022-08-21 02:54:34 +01:00
m-holger 2697ba49bc Add private method QPDFTokenizer::inHexstring 2022-08-21 02:46:31 +01:00
m-holger 86ade3f9cd Add private method QPDFTokenizer::handleCharacter 2022-08-21 02:26:27 +01:00
m-holger c08bb0ec02 Remove QPDFTokenizer::Members 2022-08-18 13:13:19 +01:00
Jay Berkenbilt cdd0b4fb7d Use = default and = delete where possible in classes 2022-04-16 11:39:14 -04:00
Jay Berkenbilt a68703b07e Replace PointerHolder with std::shared_ptr in library sources only
(patrepl and cleanpatch are my own utilities)

patrepl s/PointerHolder/std::shared_ptr/g {include,libqpdf}/qpdf/*.hh
patrepl s/PointerHolder/std::shared_ptr/g libqpdf/*.cc
patrepl s/make_pointer_holder/std::make_shared/g libqpdf/*.cc
patrepl s/make_array_pointer_holder/QUtil::make_shared_array/g libqpdf/*.cc
patrepl s,qpdf/std::shared_ptr,qpdf/PointerHolder, **/*.cc **/*.hh
git restore include/qpdf/PointerHolder.hh
cleanpatch
./format-code
2022-04-09 17:33:29 -04:00
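As a rough illustration of what the replacement above amounts to at a call site, here is a minimal sketch in plain standard C++. It does not use qpdf itself: `Buffer`, `make_buffer`, and `make_char_array` are hypothetical names invented for this example. The array case is shown separately because a `std::shared_ptr` that owns a `new[]` allocation needs an array deleter, which is presumably why array allocations go through a dedicated helper in the commands above.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical payload type, used only for this sketch.
struct Buffer
{
    explicit Buffer(std::string d) :
        data(std::move(d))
    {
    }
    std::string data;
};

// Single-object case: std::make_shared allocates the object and its
// reference count together and returns a std::shared_ptr.
std::shared_ptr<Buffer>
make_buffer(std::string const& s)
{
    return std::make_shared<Buffer>(s);
}

// Array case: the shared_ptr must release the memory with delete[],
// so an array deleter is passed explicitly. The trailing () in
// new char[n]() zero-initializes the array.
std::shared_ptr<char>
make_char_array(std::size_t n)
{
    return std::shared_ptr<char>(new char[n](), std::default_delete<char[]>());
}
```

Copying the pointer returned by `make_buffer` shares ownership: both copies refer to the same `Buffer`, which is destroyed when the last copy goes away.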
Jay Berkenbilt 12f1eb15ca Programmatically apply new formatting to code
Run this:

for i in **/*.cc **/*.c **/*.h **/*.hh; do
  clang-format < "$i" >| "$i.new" && mv "$i.new" "$i"
done
2022-04-04 08:10:40 -04:00
Jay Berkenbilt cb769c62e5 WHITESPACE ONLY -- expand tabs in source code
This commit expands all tabs using an 8-character tab-width. You
should ignore this commit when using git blame or use git blame -w.

In the early days, I used to use tabs where possible for indentation,
since emacs did this automatically. In recent years, I have switched
to only using spaces, which means qpdf source code has been a mixture
of spaces and tabs. I have avoided cleaning this up because I did not
want gratuitous whitespace changes to cloud the output of git
blame, but I changed my mind after discussing with users who view qpdf
source code in editors/IDEs that have other tab widths by default and
in light of the fact that I am planning to start applying automatic
code formatting soon.
2022-02-08 11:51:15 -05:00
Jay Berkenbilt c62e8e2b28 Update for clean compile with POINTERHOLDER_TRANSITION=2 2022-02-07 17:38:22 -05:00
Jay Berkenbilt cfaa2de804 Update copyright for 2022 2022-02-04 16:36:22 -05:00
Jay Berkenbilt bf8fd41fee Update copyright to 2021 2021-01-04 16:26:58 -05:00
Jay Berkenbilt 802de87c30 Fix outdated comment in QPDFTokenizer.hh 2020-10-23 06:39:42 -04:00
Jay Berkenbilt a6f1f829db Use deleted copy/assignment (C++11) 2020-04-03 12:17:57 -04:00
Jay Berkenbilt e5cc065598 Update copyright to 2020 2020-01-26 16:57:27 -05:00
Jay Berkenbilt 42d396f1dd Handle invalid name tokens symmetrically for PDF < 1.2 (fixes #332) 2019-08-19 19:48:27 -04:00
Jay Berkenbilt 45dac410b5 Remove broken QPDFTokenizer::expectInlineImage 2019-06-21 22:29:31 -04:00
Jay Berkenbilt eb49e07c0a Make inline image token exactly contain the image data
Do not include the trailing EI, and handle cases where EI is not
preceded by a delimiter. Such cases have been seen in the wild.
2019-01-31 20:28:44 -05:00
Jay Berkenbilt 1eb35a355f Exclude space after ID in image data 2019-01-31 10:38:10 -05:00
Jay Berkenbilt 2b6c79bcae Improve locating inline image's EI
We've actually seen a PDF file in the wild that contained EI
surrounded by delimiters inside the image data, which confused qpdf's
naive code. This significantly improves EI detection.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt ec9e310c9e Refactor QPDFTokenizer's inline image handling
Add a version of expectInlineImage that takes an input source and
searches for EI. This is in preparation for improving the way EI is
found. This commit just refactors the code without changing the
functionality and adds tests to make sure the old and new code behave
identically.
2019-01-31 09:26:37 -05:00
Jay Berkenbilt 3472f6c984 Update copyrights for 2019 2019-01-07 07:54:55 -05:00
Jay Berkenbilt 3873f5fd9b Protect headers with compliant identifiers (fixes #233) 2018-08-12 14:10:32 -04:00
Jay Berkenbilt 4a4736c695 Fix EOL handling inside strings (fixes #226)
CR, CRLF, and LF are all supposed to be treated as LF; only one EOL is
to be ignored after backslash.
2018-08-05 20:48:35 -04:00
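The EOL rule described in the commit above can be sketched roughly as follows. This is not qpdf's implementation: `normalize_string_eol` is a hypothetical helper written for illustration, and it assumes its input is the body of a PDF literal string without the surrounding parentheses. It handles only end-of-line processing and leaves every other escape sequence undecoded.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of qpdf's API.
// - CR, CRLF, and LF inside the string all become a single LF.
// - A backslash followed by an EOL is a line continuation: the
//   backslash and exactly one EOL (CR, LF, or CRLF) are dropped.
// - Any other backslash escape is copied through unchanged.
std::string
normalize_string_eol(std::string const& body)
{
    std::string out;
    std::size_t i = 0;
    std::size_t const n = body.size();
    while (i < n) {
        char c = body[i];
        bool has_next = (i + 1 < n);
        if (c == '\\' && has_next && (body[i + 1] == '\r' || body[i + 1] == '\n')) {
            // Skip the backslash and one EOL; CRLF counts as one EOL.
            bool crlf = (body[i + 1] == '\r' && i + 2 < n && body[i + 2] == '\n');
            i += crlf ? 3 : 2;
        } else if (c == '\\' && has_next) {
            // Keep other escapes intact so that an escaped backslash
            // is not mistaken for the start of a line continuation.
            out += c;
            out += body[i + 1];
            i += 2;
        } else if (c == '\r') {
            // A bare CR or a CRLF pair is treated as a single LF.
            out += '\n';
            i += (has_next && body[i + 1] == '\n') ? 2 : 1;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}
```

With this sketch, a backslash followed by two LF characters loses only the first LF, which matches the "only one EOL is to be ignored" wording in the commit message.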
Jay Berkenbilt 9910104442 Implement TokenFilter and refactor Pl_QPDFTokenizer
Implement a TokenFilter class and refactor Pl_QPDFTokenizer to use a
TokenFilter class called ContentNormalizer. Pl_QPDFTokenizer is now a
general filter that passes data through a TokenFilter.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt fefe25030e Inline image token type 2018-02-18 21:05:46 -05:00
Jay Berkenbilt 2699ecf13e Push QPDFTokenizer members into a nested structure
This is for protection against future ABI breaking changes.
2018-02-18 21:05:46 -05:00
Jay Berkenbilt d97474868d Lexer enhancements: EOF, comment, space
Significant enhancements to the lexer to improve EOF handling and to
support comments and spaces as tokens. Various other minor issues were
fixed as well.
2018-02-18 20:18:40 -05:00
Jay Berkenbilt 68572df2bf Update copyright to 2018 2018-01-13 20:25:58 -05:00
Jay Berkenbilt 07c8bb2843 Additionally license under Apache License version 2.0
The Apache License version 2.0 is now the primary license for qpdf.
However, users may, at their option, continue to use Artistic version
2.0.
2017-09-14 12:59:25 -04:00
Jay Berkenbilt fabff0f3ec Limit token length during xref recovery
While scanning the file looking for objects, limit the length of
tokens we allow. This prevents us from getting caught up in reading a
file character by character while digging through large streams.
2017-08-22 14:13:10 -04:00
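The idea of capping token length while scanning can be sketched as below. This is an illustration only, not the code from the commit: `read_limited_token` is a hypothetical function, tokens are simplified to whitespace-delimited runs, and treating a `max_len` of 0 as "no limit" is an assumption made for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of qpdf's API.
// Reads one whitespace-delimited token from data starting at pos.
// Returns false as soon as the token would exceed max_len characters,
// so the caller can abandon it instead of consuming a large stream
// one character at a time. A max_len of 0 means "no limit" here.
bool
read_limited_token(
    std::string const& data, std::size_t& pos, std::size_t max_len, std::string& tok)
{
    auto is_space = [](char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\f' || c == '\0';
    };
    tok.clear();
    // Skip leading whitespace.
    while (pos < data.size() && is_space(data[pos])) {
        ++pos;
    }
    // Accumulate characters until whitespace, end of data, or the limit.
    while (pos < data.size() && !is_space(data[pos])) {
        if (max_len != 0 && tok.size() == max_len) {
            return false; // too long: give up; pos is left inside the token
        }
        tok += data[pos++];
    }
    return true;
}
```

A caller scanning for object headers would pass a small limit and, on a `false` return, skip ahead and resume the search rather than treating the oversized run as a meaningful token.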
Jay Berkenbilt 8288a4eb3a Update copyright to 2017 2017-08-21 21:18:47 -04:00
Jay Berkenbilt ef8ae5449d Allow QPDFTokenizer::readToken to return bad tokens
Sometimes we want to ignore bad tokens rather than having them throw
an exception. A coverage case is commented out here and added in a
later commit.
2017-08-10 19:01:41 -04:00
Jay Berkenbilt f77acbdbba Copyright 2015 2015-05-24 17:26:49 -04:00
Jay Berkenbilt 225b018290 Update Copyright to 2014 2014-01-14 15:40:02 -05:00
Jay Berkenbilt f81152311e Add QPDFObjectHandle::parseContentStream method
This method allows parsing of the PDF objects in a content stream or
array of content streams.
2013-01-20 15:35:39 -05:00
Jay Berkenbilt 8843e499b8 Update copyright year to 2013
Also add copyright notice to a few public headers that were missing
one.
2012-12-31 10:32:32 -05:00