Ignore objects with offset 0

Jay Berkenbilt 2012-11-20 13:15:14 -05:00
parent 041397fdab
commit f256670eba
7 changed files with 57 additions and 7 deletions

@@ -1,3 +1,9 @@
2012-11-20 Jay Berkenbilt <ejb@ql.org>
* Ignore (with warning) non-freed objects in the xref table whose
offset is 0. Some PDF producers (incorrectly) do this. See
https://bugs.linuxfoundation.org/show_bug.cgi?id=1081.
2012-09-23 Jay Berkenbilt <ejb@ql.org>
* Add public methods QPDF::processInputSource and
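To make the ChangeLog entry above concrete, here is a minimal sketch of how a classic xref table entry such as "0000000000 00000 n" could be parsed and flagged. The names `XrefEntry`, `parseXrefEntry`, and `isBogusZeroOffset` are hypothetical and are not taken from qpdf; the sketch assumes the fixed-width layout of 10 offset digits, a space, 5 generation digits, a space, and a type character, and ignores the line terminator.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical representation of one classic xref table entry.
struct XrefEntry
{
    long long offset; // byte offset ('n' entries) or next free object ('f')
    int generation;
    bool in_use;      // true for 'n', false for 'f'
};

// Parse "oooooooooo ggggg t"; returns std::nullopt if the text is malformed.
std::optional<XrefEntry> parseXrefEntry(std::string const& line)
{
    if (line.size() < 18 || line[10] != ' ' || line[16] != ' ')
    {
        return std::nullopt;
    }
    for (std::size_t i = 0; i < 16; ++i)
    {
        if (i == 10)
        {
            continue; // separator already checked
        }
        if (!std::isdigit(static_cast<unsigned char>(line[i])))
        {
            return std::nullopt;
        }
    }
    char type = line[17];
    if (type != 'n' && type != 'f')
    {
        return std::nullopt;
    }
    XrefEntry e;
    e.offset = std::stoll(line.substr(0, 10));
    e.generation = std::stoi(line.substr(11, 5));
    e.in_use = (type == 'n');
    return e;
}

// An entry marked in use can never legitimately start at offset 0,
// because the file header occupies that position.
bool isBogusZeroOffset(XrefEntry const& e)
{
    return e.in_use && e.offset == 0;
}

// Usage check: the entry quoted in the commit is flagged, a free entry is not.
inline void demoXref()
{
    assert(isBogusZeroOffset(*parseXrefEntry("0000000000 00000 n")));
    assert(!isBogusZeroOffset(*parseXrefEntry("0000000000 65535 f")));
}
```

A free entry with offset 0 is normal (it is the head of the free list), which is why only entries of type `n` are treated as suspicious in this sketch.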

TODO

@@ -1,12 +1,31 @@
General
=======
* See if I can support the encryption format used with /R 5 /V 5,
even though a qpdf-announce subscriber with an adobe.com email
address mentioned that this is deprecated. There is also a new
encryption format coming in a future release, which may be better
to support. As of the qpdf 3.0 release, the specification was not
publicly available yet.
* See if I can support the encryption format used with /R 5 /V 5
(AESV3), even though a qpdf-announce subscriber with an adobe.com
email address mentioned that this is deprecated. There is also a
new encryption format coming in a future release (PDF 2.0), which
may be better to support. As of the qpdf 3.0 release, the
specification was not publicly available yet.
AESV3 encryption is supported with PDF 1.7 extension level 3 and is
being deprecated, but there are plenty of files out there. The
encryption format is described in adobe_supplement_iso32000.pdf.
Such a file must specify that it uses these extensions in its
document catalog:
<<
/Type /Catalog
/Extensions <<
/ADBE <<
/BaseVersion /1.7
/ExtensionLevel 3
>>
>>
>>
Possible sha256 implementations: http://sol-biotech.com/code/sha2/,
http://hashlib2plus.sourceforge.net/
* Consider the possibility of doing something locale-aware to support
non-ASCII passwords. Update documentation if this is done.

@@ -1253,6 +1253,21 @@ QPDF::readObjectAtOffset(bool try_recovery,
int& objid, int& generation)
{
setLastObjectDescription(description, exp_objid, exp_generation);
// Special case: if offset is 0, just return null. Some PDF
// writers, in particular "Mac OS X 10.7.5 Quartz PDFContext", may
// store deleted objects in the xref table as "0000000000 00000
// n", which is not correct, but it won't hurt anything for us to
// ignore these.
if (offset == 0)
{
QTC::TC("qpdf", "QPDF bogus 0 offset", 0);
warn(QPDFExc(qpdf_e_damaged_pdf, this->file->getName(),
this->last_object_description, 0,
"object has offset 0"));
return QPDFObjectHandle::newNull();
}
this->file->seek(offset, SEEK_SET);
QPDFTokenizer::Token tobjid = readToken(this->file);
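The guard added in this hunk follows a common pattern: detect the impossible offset before seeking, record a warning, and hand back a null value so that callers can continue. The sketch below illustrates that pattern in isolation. `ToyReader` and its members are hypothetical stand-ins, not qpdf's classes; `std::nullopt` plays the role of `QPDFObjectHandle::newNull()`, and the warning text imitates the format seen in the expected test output.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a PDF reader; not part of qpdf's API.
class ToyReader
{
  public:
    ToyReader(std::string name, std::string contents) :
        filename(std::move(name)),
        data(std::move(contents))
    {
    }

    // Returns the bytes starting at `offset`, or std::nullopt (standing in
    // for a null object) when the offset is the bogus value 0.
    std::optional<std::string>
    readObjectAt(std::size_t offset, std::string const& description)
    {
        if (offset == 0)
        {
            // Warn and keep going rather than failing the whole file.
            warnings.push_back(
                "WARNING: " + filename + " (" + description +
                "): object has offset 0");
            return std::nullopt;
        }
        if (offset > data.size())
        {
            // Out of range: this toy also treats the object as missing,
            // but without a warning.
            return std::nullopt;
        }
        return data.substr(offset);
    }

    std::vector<std::string> const& getWarnings() const
    {
        return warnings;
    }

  private:
    std::string filename;
    std::string data;
    std::vector<std::string> warnings;
};

// Usage check: a zero offset yields a null result plus one warning.
inline void demoReader()
{
    ToyReader r("zero-offset.pdf", "%PDF-1.3\n27 0 obj\n");
    assert(!r.readObjectAt(0, "object 27 0").has_value());
    assert(r.getWarnings().size() == 1);
}
```

Returning a null value instead of throwing means one bad xref entry does not stop the rest of the file from being processed, while the recorded warning can still be reported to the user.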

@@ -242,3 +242,4 @@ QPDF_Tokenizer EOF reading token 0
QPDF_Tokenizer EOF reading appendable token 0
QPDFWriter extra header text no newline 0
QPDFWriter extra header text add newline 0
QPDF bogus 0 offset 0

@@ -149,7 +149,7 @@ $td->runtest("remove page we don't have",
$td->NORMALIZE_NEWLINES);
# ----------
$td->notify("--- Miscellaneous Tests ---");
$n_tests += 55;
$n_tests += 56;
$td->runtest("qpdf version",
{$td->COMMAND => "qpdf --version"},
@@ -410,6 +410,10 @@ $td->runtest("output to custom pipeline",
$td->runtest("check output",
{$td->FILE => "a.pdf"},
{$td->FILE => "custom-pipeline.pdf"});
$td->runtest("object with zero offset",
{$td->COMMAND => "qpdf --check zero-offset.pdf"},
{$td->FILE => "zero-offset.out", $td->EXIT_STATUS => 3},
$td->NORMALIZE_NEWLINES);
show_ntests();
# ----------

@@ -0,0 +1,5 @@
checking zero-offset.pdf
PDF Version: 1.3
File is not encrypted
File is not linearized
WARNING: zero-offset.pdf (object 27 0): object has offset 0

Binary file not shown.