mirror of https://github.com/qpdf/qpdf.git synced 2024-12-22 10:58:58 +00:00

Remove PCRE

Jay Berkenbilt 2017-08-10 20:24:16 -04:00
parent 30f109e244
commit 9a96e233b0
18 changed files with 53 additions and 685 deletions

@@ -1,5 +1,7 @@
2017-08-10 Jay Berkenbilt <ejb@ql.org>
* Remove dependency on libpcre.
* Be more forgiving of certain types of errors in the xref table
that don't interfere with interpreting the table.

README (25 changed lines)

@@ -13,11 +13,11 @@ warranty.
Prerequisites
=============
QPDF depends on external libraries "zlib" and "pcre". These are part
of virtually all Linux distributions and are readily available;
download information appears in the documentation. For Windows, you
can download pre-built binary versions of those libraries for some
compilers; see README-windows.txt for additional details.
QPDF depends on the external library "zlib". This is part of every
Linux distribution and is readily available. Download information
appears in the documentation. For Windows, you can download pre-built
binary versions of this library for some compilers; see
README-windows.txt for additional details.
QPDF requires a C++ compiler that works with STL. Your compiler must
also support "long long". Almost all modern compilers do. If you are
@@ -34,10 +34,9 @@ you had an otherwise working qpdf.
Licensing terms of embedded software
====================================
QPDF makes use of zlib and pcre for its functionality. These packages
can be downloaded separately from their own download locations, or
they can be downloaded in the external-libs area of the qpdf download
site.
QPDF makes use of zlib for its functionality. This package can be
downloaded separately from its own download location, or it can be
downloaded in the external-libs area of the qpdf download site.
The Rijndael encryption implementation used as the basis for AES
encryption and decryption support comes from Philip J. Erdelsky's
@@ -148,10 +147,10 @@ sources to the user's manual can be found in the "manual" directory.
The software library is just libqpdf, and all the header files are in
the qpdf subdirectory. If you link statically with -lqpdf, then you
will also need to link with -lpcre and -lz. The shared qpdf library
is linked with -lpcre and -lz, and none of qpdf's public header files
directly include files from pcre or libz, so in many cases, qpdf's
development files are self contained.
will also need to link with -lz. The shared qpdf library is linked
with -lz, and none of qpdf's public header files directly include
files from libz, so in many cases, qpdf's development files are self
contained.
To learn about using the library, please read comments in the header
files in include/qpdf, especially QPDF.hh, QPDFObjectHandle.hh, and

@@ -47,8 +47,7 @@ download.
If you want to build the external libraries on your own (for
Windows or anything else), you can download this archive. In
addition to including unmodified distributions of pcre and zlib, it
includes a README file and some scripts to help you build them for
Windows.
addition to including an unmodified distribution of zlib, it includes
a README file and some scripts to help you build it for Windows.
If you want to build on Windows, please see also README-windows.txt.

@@ -84,29 +84,32 @@ installers are provided, they might do that already by default.
External Libraries
==================
In order to build qpdf, you must have copies of zlib and pcre. The
easy way to get them is to download them from the qpdf download area.
There are packages called external-libs-bin.zip and
external-libs-src.zip. If you are building with MSVC 2010 or MINGW,
you can just extract the qpdf-external-libs-bin.zip zip file into the
top-level qpdf source tree. Note that you need the 2012-06-20 version
(at least) to build qpdf 3.0 or greater since this includes 64-bit
libraries. It will create a directory called external-libs which
contains header files and precompiled libraries. Passing
--enable-external-libs to ./configure (which is done automatically if
you follow the instructions below) is sufficient to find them.
In order to build qpdf, you must have a copy of zlib. The easy way to
get it is to download it from the qpdf download area. There are
packages called external-libs-bin.zip and external-libs-src.zip. If
you are building with MSVC 2010 or MINGW, you can just extract the
qpdf-external-libs-bin.zip zip file into the top-level qpdf source
tree. Note that you need the 2012-06-20 version (at least) to build
qpdf 3.0 or greater since this includes 64-bit libraries. The
2017-08-10 version includes libraries built with MSVC 2015 and
contains only zlib. Older versions also contain pcre, which is no
longer required as of qpdf 7.0.0. Extracting the zip will create a
directory called external-libs which contains header files and
precompiled libraries. Passing --enable-external-libs to ./configure
(which is done automatically if you follow the instructions below) is
sufficient to find them.
You can also obtain pcre and zlib directly on your own and install
them. If you are using mingw, you can just set CPPFLAGS, LDFLAGS, and
LIBS when you run ./configure so that it can find the header files and
libraries. If you are building with msvc and you want to do this, it
probably won't work because ./configure doesn't know how to interpret
LDFLAGS and LIBS properly for MSVC (though qpdf's own build system
does). In this case, you can probably get away with cheating by
passing --enable-external-libs to ./configure and then just editing
CPPFLAGS, LDFLAGS, LIBS in the generated autoconf.mk file. Note that
you should use UNIX-like syntax (-I, -L, -l) even though this is not
what cl takes on the command line. qpdf's build rules will fix it.
You can also obtain zlib directly on your own and install it. If you
are using mingw, you can just set CPPFLAGS, LDFLAGS, and LIBS when you
run ./configure so that it can find the header files and libraries. If
you are building with msvc and you want to do this, it probably won't
work because ./configure doesn't know how to interpret LDFLAGS and
LIBS properly for MSVC (though qpdf's own build system does). In this
case, you can probably get away with cheating by passing
--enable-external-libs to ./configure and then just editing CPPFLAGS,
LDFLAGS, LIBS in the generated autoconf.mk file. Note that you should
use UNIX-like syntax (-I, -L, -l) even though this is not what cl
takes on the command line. qpdf's build rules will fix it.
You can also download qpdf-external-libs-src.zip and follow the
instructions in the README.txt there for how to build external libs.

@@ -113,7 +113,12 @@ Release Reminders
version control system into a directory called qpdf-external-libs
and just make a zip file of the result called
qpdf-external-libs-src.zip. See the README.txt file there for
information on creating binary external libs releases.
information on creating binary external libs releases. Run this
from the external-libs repository:
git archive --prefix=external-libs/ HEAD . | (cd /tmp; tar xf -)
cd /tmp
zip -r qpdf-external-libs-src.zip external-libs
* To create Windows binary releases, extract the qpdf source
distribution in Windows (MSYS + MINGW, MSVC). From the extracted

TODO (3 changed lines)

@@ -7,9 +7,6 @@ version if needed.
Soon
====
* Eliminate dependency on PCRE. There aren't that many regular
expressions, and they are used only for internal purposes.
* Consider whether there should be a mode in which QPDFObjectHandle
returns nulls for operations on the wrong type instead of asserting
the type. The way things are wired up now, this would have to be a

@@ -82,8 +82,6 @@ fi
if test "$BUILD_INTERNAL_LIBS" = "0"; then
AC_CHECK_HEADER(zlib.h,,[MISSING_ZLIB_H=1; MISSING_ANY=1])
AC_SEARCH_LIBS(deflate,z zlib,,[MISSING_ZLIB=1; MISSING_ANY=1])
AC_CHECK_HEADER(pcre.h,,[MISSING_PCRE_H=1; MISSING_ANY=1])
AC_SEARCH_LIBS(pcre_compile,pcre,,[MISSING_PCRE=1; MISSING_ANY=1])
fi
if test "x$qpdf_OS_SECURE_RANDOM" = "x1"; then
@@ -453,14 +451,6 @@ if test "$MISSING_ZLIB" = "1"; then
AC_MSG_WARN(unable to find required library z (or zlib))
fi
if test "$MISSING_PCRE_H" = "1"; then
AC_MSG_WARN(unable to find required header pcre.h)
fi
if test "$MISSING_PCRE" = "1"; then
AC_MSG_WARN(unable to find required library pcre)
fi
if test "$MISSING_DOCBOOK_FO" = "1"; then
AC_MSG_WARN(docbook fo stylesheets are required to build PDF documentation)
fi
@@ -497,7 +487,7 @@ if test "$USE_EXTERNAL_LIBS" = "1"; then
# much trouble getting it to work with a different compiler.
CPPFLAGS="$CPPFLAGS -Iexternal-libs/include"
LDFLAGS="$LDFLAGS -Lexternal-libs/lib-$BUILDRULES$WINDOWS_WORDSIZE"
LIBS="$LIBS -lz -lpcre"
LIBS="$LIBS -lz"
fi
AC_OUTPUT

@@ -727,7 +727,6 @@ lld
lookup
lossy
LowPart
lpcre
lqpdf
lsb
lt
@@ -914,7 +913,6 @@ pb
pbytes
pc
pcre
pcreapi
pdf
PDFâ
PDFContext

@@ -6,6 +6,6 @@ includedir=@includedir@
Name: libqpdf
Description: PDF transformation library
Version: @PACKAGE_VERSION@
Requires.private: zlib, libpcre
Requires.private: zlib
Libs: -L${libdir} -lqpdf
Cflags: -I${includedir}

@@ -1,354 +0,0 @@
#include <qpdf/PCRE.hh>
#include <qpdf/QUtil.hh>
#include <stdexcept>
#include <iostream>
#include <string.h>
PCRE::NoBackref::NoBackref() :
std::logic_error("PCRE error: no match")
{
}
PCRE::Match::Match(int nbackrefs, char const* subject)
{
this->init(-1, nbackrefs, subject);
}
PCRE::Match::~Match()
{
this->destroy();
}
PCRE::Match::Match(Match const& rhs)
{
this->copy(rhs);
}
PCRE::Match&
PCRE::Match::operator=(Match const& rhs)
{
if (this != &rhs)
{
this->destroy();
this->copy(rhs);
}
return *this;
}
void
PCRE::Match::init(int nmatches, int nbackrefs, char const* subject)
{
this->nmatches = nmatches;
this->nbackrefs = nbackrefs;
this->subject = subject;
this->ovecsize = 3 * (1 + nbackrefs);
this->ovector = 0;
if (this->ovecsize)
{
this->ovector = new int[this->ovecsize];
}
}
void
PCRE::Match::copy(Match const& rhs)
{
this->init(rhs.nmatches, rhs.nbackrefs, rhs.subject);
int i;
for (i = 0; i < this->ovecsize; ++i)
{
this->ovector[i] = rhs.ovector[i];
}
}
void
PCRE::Match::destroy()
{
delete [] this->ovector;
}
PCRE::Match::operator bool()
{
return (this->nmatches >= 0);
}
std::string
PCRE::Match::getMatch(int n, int flags)
{
// This method used to be implemented in terms of
// pcre_get_substring, but that function gives you an empty string
// for an unmatched backreference that is in range.
int offset;
int length;
try
{
getOffsetLength(n, offset, length);
}
catch (NoBackref&)
{
if (flags & gm_no_substring_returns_empty)
{
return "";
}
else
{
throw;
}
}
return std::string(this->subject).substr(offset, length);
}
void
PCRE::Match::getOffsetLength(int n, int& offset, int& length)
{
if ((this->nmatches < 0) ||
(n > this->nmatches - 1) ||
(this->ovector[n * 2] == -1))
{
throw NoBackref();
}
offset = this->ovector[n * 2];
length = this->ovector[n * 2 + 1] - offset;
}
int
PCRE::Match::getOffset(int n)
{
int offset;
int length;
this->getOffsetLength(n, offset, length);
return offset;
}
int
PCRE::Match::getLength(int n)
{
int offset;
int length;
this->getOffsetLength(n, offset, length);
return length;
}
int
PCRE::Match::nMatches() const
{
return this->nmatches;
}
PCRE::PCRE(char const* pattern, int options)
{
char const *errptr;
int erroffset;
this->code = pcre_compile(pattern, options, &errptr, &erroffset, 0);
if (this->code)
{
pcre_fullinfo(this->code, 0, PCRE_INFO_CAPTURECOUNT, &(this->nbackrefs));
}
else
{
std::string message = (std::string("compilation of ") + pattern +
" failed at offset " +
QUtil::int_to_string(erroffset) + ": " +
errptr);
throw std::runtime_error("PCRE error: " + message);
}
}
PCRE::~PCRE()
{
pcre_free(this->code);
}
PCRE::Match
PCRE::match(char const* subject, int options, int startoffset, int size)
{
if (size == -1)
{
size = strlen(subject);
}
Match result(this->nbackrefs, subject);
int status = pcre_exec(this->code, 0, subject, size,
startoffset, options,
result.ovector, result.ovecsize);
if (status >= 0)
{
result.nmatches = status;
}
else
{
std::string message;
switch (status)
{
case PCRE_ERROR_NOMATCH:
break;
case PCRE_ERROR_BADOPTION:
message = "bad option passed to PCRE::match()";
throw std::logic_error(message);
break;
case PCRE_ERROR_NOMEMORY:
message = "insufficient memory";
throw std::runtime_error(message);
break;
case PCRE_ERROR_NULL:
case PCRE_ERROR_BADMAGIC:
case PCRE_ERROR_UNKNOWN_NODE:
default:
message = "pcre_exec returned " + QUtil::int_to_string(status);
throw std::logic_error(message);
}
}
return result;
}
void
PCRE::test(int n)
{
try
{
if (n == 1)
{
static char const* utf8 = "abπdefq";
PCRE u1("^([[:alpha:]]+)");
PCRE u2("^([\\p{L}]+)", PCRE_UTF8);
PCRE::Match m1 = u1.match(utf8);
if (m1)
{
std::cout << "no utf8: " << m1.getMatch(1) << std::endl;
}
PCRE::Match m2 = u2.match(utf8);
if (m2)
{
std::cout << "utf8: " << m2.getMatch(1) << std::endl;
}
return;
}
try
{
PCRE pcre1("a**");
}
catch (std::exception& e)
{
std::cout << e.what() << std::endl;
}
PCRE pcre2("^([^\\s:]*)\\s*:\\s*(.*?)\\s*$");
PCRE::Match m2 = pcre2.match("key: value one two three ");
if (m2)
{
std::cout << m2.nMatches() << std::endl;
std::cout << m2.getMatch(0) << std::endl;
std::cout << m2.getOffset(0) << std::endl;
std::cout << m2.getLength(0) << std::endl;
std::cout << m2.getMatch(1) << std::endl;
std::cout << m2.getOffset(1) << std::endl;
std::cout << m2.getLength(1) << std::endl;
std::cout << m2.getMatch(2) << std::endl;
std::cout << m2.getOffset(2) << std::endl;
std::cout << m2.getLength(2) << std::endl;
try
{
std::cout << m2.getMatch(3) << std::endl;
}
catch (std::exception& e)
{
std::cout << e.what() << std::endl;
}
try
{
std::cout << m2.getOffset(3) << std::endl;
}
catch (std::exception& e)
{
std::cout << e.what() << std::endl;
}
}
PCRE pcre3("^(a+)(b+)?$");
PCRE::Match m3 = pcre3.match("aaa");
try
{
if (m3)
{
std::cout << m3.nMatches() << std::endl;
std::cout << m3.getMatch(0) << std::endl;
std::cout << m3.getMatch(1) << std::endl;
std::cout << "-"
<< m3.getMatch(
2, Match::gm_no_substring_returns_empty)
<< "-" << std::endl;
std::cout << "hello" << std::endl;
std::cout << m3.getMatch(2) << std::endl;
std::cout << "can't see this" << std::endl;
}
}
catch (std::exception& e)
{
std::cout << e.what() << std::endl;
}
// backref: 1 2 3 4 5
PCRE pcre4("^((?:(a(b)?)(?:,(c))?)|(c))?$");
static char const* candidates[] = {
"qqqcqqq", // no match
"ab,c", // backrefs: 0, 1, 2, 3, 4
"ab", // backrefs: 0, 1, 2, 3
"a", // backrefs: 0, 1, 2
"a,c", // backrefs: 0, 1, 2, 4
"c", // backrefs: 0, 1, 5
"", // backrefs: 0
0
};
for (char const** p = candidates; *p; ++p)
{
PCRE::Match m(pcre4.match(*p));
if (m)
{
int nmatches = m.nMatches();
for (int i = 0; i < nmatches; ++i)
{
std::cout << *p << ": " << i << ": ";
try
{
std::string match = m.getMatch(i);
std::cout << match;
}
catch (NoBackref&)
{
std::cout << "no backref (getMatch)";
}
std::cout << std::endl;
std::cout << *p << ": " << i << ": ";
try
{
int offset;
int length;
m.getOffsetLength(i, offset, length);
std::cout << offset << ", " << length;
}
catch (NoBackref&)
{
std::cout << "no backref (getOffsetLength)";
}
std:: cout << std::endl;
}
}
else
{
std::cout << *p << ": no match" << std::endl;
}
}
}
catch (std::exception& e)
{
std::cout << "unexpected exception: " << e.what() << std::endl;
}
}
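
The offset arithmetic in getOffsetLength above relies on the layout of the PCRE output vector: pair n holds the start and end offsets of group n, and -1 marks a group that did not take part in the match. The following is a minimal standalone sketch of that lookup over a plain vector, not the removed wrapper itself. The names offset_length and sample_ovector, the bool return in place of throwing NoBackref, and the extra check for a negative n are all assumptions made for illustration. The sample vector corresponds to the "a,c" case in the expected test output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the lookup in PCRE::Match::getOffsetLength.
// ovector holds (start, end) pairs; nmatches is the value pcre_exec returned.
// Returns false where the wrapper would have thrown NoBackref.
bool offset_length(std::vector<int> const& ovector, int nmatches,
                   int n, int& offset, int& length)
{
    if ((n < 0) || (nmatches < 0) || (n > nmatches - 1))
    {
        return false;
    }
    // The caller must supply a pair for every reported group.
    assert(static_cast<std::size_t>(n) * 2 + 1 < ovector.size());
    if (ovector[n * 2] == -1)
    {
        // Group n is in range but did not participate in the match.
        return false;
    }
    offset = ovector[n * 2];
    length = ovector[n * 2 + 1] - offset;
    return true;
}

// Offsets for the subject "a,c" matched against the fourth test pattern:
// groups 0, 1, 2 and 4 are set; group 3 is unset.
std::vector<int> sample_ovector()
{
    int const raw[] = {0, 3, 0, 3, 0, 1, -1, -1, 2, 3};
    return std::vector<int>(raw, raw + 10);
}
```

With nmatches set to 5, looking up group 4 in the sample vector yields offset 2 and length 1, and looking up group 3 reports failure, which agrees with the "a,c" lines of the expected output.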

@@ -14,7 +14,6 @@ SRCS_libqpdf = \
libqpdf/InsecureRandomDataProvider.cc \
libqpdf/MD5.cc \
libqpdf/OffsetInputSource.cc \
libqpdf/PCRE.cc \
libqpdf/Pipeline.cc \
libqpdf/Pl_AES_PDF.cc \
libqpdf/Pl_ASCII85Decoder.cc \

@@ -1,117 +0,0 @@
// This is a C++ wrapper class around Philip Hazel's perl-compatible
// regular expressions library.
//
#ifndef __PCRE_HH__
#define __PCRE_HH__
#include <qpdf/DLL.h>
#ifdef _WIN32
# define PCRE_STATIC
#endif
#include <pcre.h>
#include <string>
#include <stdexcept>
// Note: this class does not encapsulate all features of the PCRE
// package -- only those that I actually need right now are here.
class PCRE
{
public:
// This is thrown when an attempt is made to access a non-existent
// back reference.
class NoBackref: public std::logic_error
{
public:
QPDF_DLL
NoBackref();
virtual ~NoBackref() throw() {}
};
class Match
{
friend class PCRE;
public:
QPDF_DLL
Match(int nbackrefs, char const* subject);
QPDF_DLL
Match(Match const&);
QPDF_DLL
Match& operator=(Match const&);
QPDF_DLL
~Match();
QPDF_DLL
operator bool();
// All the back reference accessing routines may throw the
// special exception NoBackref (derived from Exception) if the
// back reference does not exist. Exception will be thrown
// for other error conditions. This allows callers to trap
// this condition explicitly when they care about the
// difference between a backreference matching an empty string
// and not matching at all.
// see getMatch flags below
QPDF_DLL
std::string getMatch(int n, int flags = 0);
QPDF_DLL
void getOffsetLength(int n, int& offset, int& length);
QPDF_DLL
int getOffset(int n);
QPDF_DLL
int getLength(int n);
// nMatches returns the number of available matches including
// match 0 which is the whole string. In other words, if you
// have one backreference in your expression and the
// expression matches, nMatches() will return 2, getMatch(0)
// will return the whole string, getMatch(1) will return the
// text that matched the backreference, and getMatch(2) will
// throw an exception because it is out of range.
QPDF_DLL
int nMatches() const;
// Flags for getMatch
// getMatch on a substring that didn't match should return
// empty string instead of throwing an exception
static int const gm_no_substring_returns_empty = (1 << 0);
private:
void init(int nmatches, int nbackrefs, char const* subject);
void copy(Match const&);
void destroy();
int nbackrefs;
char const* subject;
int* ovector;
int ovecsize;
int nmatches;
};
// The value passed in as options is passed to pcre_exec. See man
// pcreapi for details.
QPDF_DLL
PCRE(char const* pattern, int options = 0);
QPDF_DLL
~PCRE();
QPDF_DLL
Match match(char const* subject, int options = 0, int startoffset = 0,
int size = -1);
QPDF_DLL
static void test(int n = 0);
private:
// prohibit copying and assignment
PCRE(PCRE const&);
PCRE& operator=(PCRE const&);
pcre* code;
int nbackrefs;
};
#endif // __PCRE_HH__
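
The comments in this header distinguish a group that matched the empty string from one that did not take part in the match at all. As a point of comparison only, the sketch below shows how the same distinction looks with the C++11 standard library's std::regex. This is not what the commit replaced PCRE with, and the helper names are hypothetical. One difference worth noting: std::smatch::size() always counts every group in the pattern, whereas nMatches() reported the value returned by pcre_exec, which is one more than the highest-numbered group that was set.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: one more than the highest-numbered group that
// participated in the match, or 0 if the match attempt failed.
std::size_t highest_set_group_plus_one(std::smatch const& m)
{
    // The match object must have been passed to a matching function.
    assert(m.ready());
    std::size_t count = 0;
    for (std::size_t i = 0; i < m.size(); ++i)
    {
        if (m[i].matched)
        {
            count = i + 1;
        }
    }
    return count;
}

// Hypothetical helper: copies group n into out and returns true, or
// returns false if group n is out of range or did not participate.
bool group_text(std::smatch const& m, std::size_t n, std::string& out)
{
    if ((n >= m.size()) || (! m[n].matched))
    {
        return false;
    }
    out = m[n].str();
    return true;
}
```

For the pattern ^(a+)(b+)?$ and the subject "aaa" used by the test program, size() is 3 while the first helper reports 2, the value recorded in the expected test output. The second helper returns false for group 2 in that case, which plays the role of NoBackref.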

@@ -9,7 +9,6 @@ BINS_libtests = \
input_source \
lzw \
md5 \
pcre \
png_filter \
pointer_holder \
qutil \

@@ -1,30 +0,0 @@
#include <qpdf/PCRE.hh>
#include <iostream>
#include <string.h>
int main(int argc, char* argv[])
{
if ((argc == 2) && (strcmp(argv[1], "--unicode-classes-supported") == 0))
{
try
{
PCRE("^([\\p{L}]+)", PCRE_UTF8);
std::cout << "1" << std::endl;
}
catch (std::exception&)
{
std::cout << "0" << std::endl;
}
return 0;
}
if ((argc == 2) && (strcmp(argv[1], "--unicode-classes") == 0))
{
PCRE::test(1);
}
else
{
PCRE::test();
}
return 0;
}

@@ -1,47 +0,0 @@
#!/usr/bin/env perl
require 5.008;
BEGIN { $^W = 1; }
use strict;
chdir("pcre") or die "chdir testdir failed: $!\n";
require TestDriver;
my $td = new TestDriver('pcre');
$td->runtest("PCRE",
{$td->COMMAND => "pcre"},
{$td->FILE => "pcre.out",
$td->EXIT_STATUS => 0},
$td->NORMALIZE_NEWLINES);
chop(my $supported = `pcre --unicode-classes-supported`);
if ($supported =~ m/^1/)
{
my $xflags = 0;
if (`pcre --unicode-classes | wc -l` == 1)
{
# On Red Hat Enterprise Linux 5, the version of pcre provided
# by default claims to support unicode character classes, but
# they don't actually work. Since qpdf doesn't use this
# functionality, we won't care if this particular test case
# fails. If someone were to make general use of this wrapper,
# this test should be re-enabled, but on the other hand, they
# could just use the C++ interface that's been added to pcre
# since this code was written.
$xflags |= $td->EXPECT_FAILURE;
}
$td->runtest("unicode character classes",
{$td->COMMAND => "pcre --unicode-classes"},
{$td->FILE => "pcre-unicode-classes.out",
$td->EXIT_STATUS => 0},
$td->NORMALIZE_NEWLINES | $xflags);
}
else
{
$td->runtest("unicode classes are not supported",
{$td->STRING => "1"},
{$td->STRING => "1"});
}
$td->report(2);

@@ -1,2 +0,0 @@
no utf8: ab
utf8: abπdefq

@@ -1,68 +0,0 @@
PCRE error: compilation of a** failed at offset 2: nothing to repeat
3
key: value one two three
0
25
key
0
3
value one two three
5
19
PCRE error: no match
PCRE error: no match
2
aaa
aaa
--
hello
PCRE error: no match
qqqcqqq: no match
ab,c: 0: ab,c
ab,c: 0: 0, 4
ab,c: 1: ab,c
ab,c: 1: 0, 4
ab,c: 2: ab
ab,c: 2: 0, 2
ab,c: 3: b
ab,c: 3: 1, 1
ab,c: 4: c
ab,c: 4: 3, 1
ab: 0: ab
ab: 0: 0, 2
ab: 1: ab
ab: 1: 0, 2
ab: 2: ab
ab: 2: 0, 2
ab: 3: b
ab: 3: 1, 1
a: 0: a
a: 0: 0, 1
a: 1: a
a: 1: 0, 1
a: 2: a
a: 2: 0, 1
a,c: 0: a,c
a,c: 0: 0, 3
a,c: 1: a,c
a,c: 1: 0, 3
a,c: 2: a
a,c: 2: 0, 1
a,c: 3: no backref (getMatch)
a,c: 3: no backref (getOffsetLength)
a,c: 4: c
a,c: 4: 2, 1
c: 0: c
c: 0: 0, 1
c: 1: c
c: 1: 0, 1
c: 2: no backref (getMatch)
c: 2: no backref (getOffsetLength)
c: 3: no backref (getMatch)
c: 3: no backref (getOffsetLength)
c: 4: no backref (getMatch)
c: 4: no backref (getOffsetLength)
c: 5: c
c: 5: 0, 1
: 0:
: 0: 0, 0
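
The removed TODO entry observes that qpdf's regular expressions are few and internal. To illustrate why such expressions can be replaced by direct string scanning, the sketch below hand-codes the second test pattern, ^([^\s:]*)\s*:\s*(.*?)\s*$, whose offsets appear in the expected output above. It is an illustration only and not the code qpdf adopted. The names KeyValue, parse_key_value, and value_of are hypothetical, the input is assumed to be a single line with no embedded newlines, and std::isspace is used as an approximation of \s.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type; offsets and lengths index into the input line.
struct KeyValue
{
    bool matched;
    std::size_t key_offset;
    std::size_t key_length;
    std::size_t value_offset;
    std::size_t value_length;
};

static bool is_space(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hand-written equivalent of ^([^\s:]*)\s*:\s*(.*?)\s*$ for one line.
KeyValue parse_key_value(std::string const& line)
{
    KeyValue r = {false, 0, 0, 0, 0};
    std::size_t const n = line.size();
    std::size_t i = 0;
    // ([^\s:]*): the key is everything up to whitespace or a colon.
    while ((i < n) && (! is_space(line[i])) && (line[i] != ':'))
    {
        ++i;
    }
    std::size_t const key_length = i;
    // \s*: followed by a mandatory colon.
    while ((i < n) && is_space(line[i]))
    {
        ++i;
    }
    if ((i == n) || (line[i] != ':'))
    {
        return r;
    }
    ++i;
    // \s* before the value.
    while ((i < n) && is_space(line[i]))
    {
        ++i;
    }
    // (.*?)\s*$: the value is the rest minus trailing whitespace.
    std::size_t end = n;
    while ((end > i) && is_space(line[end - 1]))
    {
        --end;
    }
    r.matched = true;
    r.key_length = key_length;
    r.value_offset = i;
    r.value_length = end - i;
    return r;
}

// Convenience accessor; only valid for a successful parse.
std::string value_of(std::string const& line, KeyValue const& kv)
{
    assert(kv.matched);
    return line.substr(kv.value_offset, kv.value_length);
}
```

For the test subject "key: value one two three " this yields key offset 0 and length 3, and value offset 5 and length 19, the same numbers the PCRE-based test recorded.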

@@ -93,7 +93,7 @@
<sect1 id="ref.prerequisites">
<title>System Requirements</title>
<para>
The qpdf package has relatively few external dependencies. In
The qpdf package has only one external dependency. In
order to build qpdf, the following packages are required:
<itemizedlist>
<listitem>
@@ -101,11 +101,6 @@
zlib: <ulink url="http://www.zlib.net/">http://www.zlib.net/</ulink>
</para>
</listitem>
<listitem>
<para>
pcre: <ulink url="http://www.pcre.org/">http://www.pcre.org/</ulink>
</para>
</listitem>
<listitem>
<para>
gnu make 3.81 or newer: <ulink url="http://www.gnu.org/software/make">http://www.gnu.org/software/make</ulink>
@@ -1466,7 +1461,7 @@ outfile.pdf</option>
</para>
<para>
When linking against the qpdf static library, you may also need to
specify <literal>-lpcre -lz</literal> on your link command. If
specify <literal>-lz</literal> on your link command. If
your system understands how to read libtool
<filename>.la</filename> files, this may not be necessary.
</para>