C API: simplify new error handling and improve documentation

This commit is contained in:
Jay Berkenbilt 2021-12-17 15:30:47 -05:00
parent 59a64115a6
commit ddbe59179e
10 changed files with 161 additions and 131 deletions

View File

@ -1,5 +1,10 @@
2021-12-17 Jay Berkenbilt <ejb@ql.org>
* C API: simplify error handling for uncaught errors (never in a
released version) and clarify documentation in qpdf-c.h around
error handling. See qpdf-c.h for details, including how to check
for errors and the new function qpdf_silence_errors.
* C API: expose getTypeCode and getTypeName from QPDFObjectHandle.
Fixes #597.
@ -40,10 +45,7 @@
interfaces. Clarify documentation regarding object accessors and
how type errors and warnings are handled. Many cases that used to
crash code that used the C API can now be trapped and will be
written stderr if not trapped. The new method
qpdf_register_oh_error_handler can be used to specifically handle
errors that occur when accessing object handles. See qpdf-c.h for
details.
written stderr if not trapped. See qpdf-c.h for details.
* C API: Add qpdf_oh_new_uninitialized to explicitly create
uninitialized object handles.

View File

@ -33,6 +33,10 @@
*
* There are several things to keep in mind when using the C API.
*
* Error handling is tricky because the underlying C++ API uses
* exception handling. See "ERROR HANDLING" below for a detailed
* explanation.
*
* The C API is not as rich as the C++ API. For any operations
* that involve actually manipulating PDF objects, you must use
* the C++ API. The C API is primarily useful for doing basic
@ -46,8 +50,9 @@
* multiple threads.
*
* All dynamic memory, except for that of the qpdf_data object
* itself, is managed by the library. You must create a qpdf_data
* object using qpdf_init and free it using qpdf_cleanup.
* itself, is managed by the library unless otherwise noted. You
* must create a qpdf_data object using qpdf_init and free it
* using qpdf_cleanup.
*
* Many functions return char*. In all cases, the char* values
* returned are pointers to data inside the qpdf_data object. As
@ -61,29 +66,6 @@
* qpdf_get_last_string_length() to get the length of whatever
* string was just returned.
*
* Many functions defined here merely set parameters and therefore
* never return error conditions. Functions that access or return
* qpdf_oh object handles may generate warnings but have no way to
* return errors, but the errors may be checked afterwards or
* handled using a registered handler. This is discussed in more
* detail in the section on object handling. Functions that may
* cause PDF files to be read or written may return error
* conditions. Such functions return an error code. If there were
* no errors or warnings, they return QPDF_SUCCESS. If there were
* warnings, the return value has the QPDF_WARNINGS bit set. If
* there were errors, the QPDF_ERRORS bit is set. In other words,
* if there are both warnings and errors, then the return status
* will be QPDF_WARNINGS | QPDF_ERRORS. You may also call the
* qpdf_more_warnings and qpdf_more_errors functions to test
* whether there are unseen warning or error conditions. By
* default, warnings are written to stderr when detected, but this
* behavior can be suppressed. In all cases, errors and warnings
* may be retrieved by calling qpdf_next_warning and
* qpdf_get_error. All exceptions thrown by the C++ interface are
* caught and converted into error messages by the C interface.
* Any exceptions to this are qpdf bugs and should be reported at
* https://github.com/qpdf/qpdf/issues/new.
*
* Most functions defined here have obvious counterparts that are
* methods to either QPDF or QPDFWriter. Please see comments in
* QPDF.hh and QPDFWriter.hh for details on their use. In order
@ -91,6 +73,72 @@
* primarily on differences between the C and C++ API.
*/
/* ERROR HANDLING -- changed in qpdf 10.5 */
/* SUMMARY: The only way to know whether a function that does not
* return an error code has encountered an error is to call
* qpdf_has_error after each function. You can do this even for
* functions that do return error codes. You can also call
* qpdf_silence_errors to prevent qpdf from writing these errors to
* stderr.
*
* DETAILS:
*
* There is some complexity in this API's error handling as it tries
* to adopt C++-style exception handling to work with languages that
* don't support exceptions, such as C.
*
* The data type underlying qpdf_data maintains a list of warnings and
* a single error. To retrieve warnings, call qpdf_next_warning while
* qpdf_more_warnings is true. To retrieve the error, call
* qpdf_get_error when qpdf_has_error is true.
*
* There are several things that are important to understand.
*
* Some functions return an error code. The value of the error code is
* made up of a bitwise-OR of QPDF_WARNINGS and QPDF_ERRORS. The
* QPDF_ERRORS bit is set if there was an error during the *most
* recent call* to the API. The QPDF_WARNINGS bit is set if there are
* any warnings that have not yet been retrieved by calling
* qpdf_more_warnings. It is possible for both its or neither bit to
* be set.
*
* The expected mode of operation is to go through a series of
* operations, checking for errors after each call, but only checking
* for warnings at the end. This is similar to how it works in the C++
* API where warnings are handled in exactly this way but errors
* result in exceptions being thrown. However, in both the C and C++
* API, it is possible to check for and handle warnings as they arise.
*
* Some functions return values (or void) rather than an error code.
* This is especially true with the object handling functions. Those
* functions can still generate errors. To handle errors in those
* cases, you should explicitly call qpdf_has_error(). Note that, if
* you want to avoid the inconsistencies in the interface, you can
* always check for error conditions in this way rather than looking
* at status return codes.
*
* Prior to qpdf 10.5, if one of the functions that does not return an
* error code encountered an exception, it would cause the entire
* program to crash. Starting in qpdf 10.5, the default response to an
* error condition in these situations is to print the error to
* standard error, issue exactly one warning indicating that such an
* error occurred, and return a sensible fallback value (0 for
* numbers, QPDF_FALSE for booleans, "" for strings, or a null or
* uninitialized object handle). This is better than the old behavior
* but still undesirable as the best option is to explicitly check for
* error conditions.
*
* To prevent qpdf from writing error messages to stderr in this way,
* you can call qpdf_silence_errors(). This signals to the qpdf
* library that you intend to check the error codes yourself.
*
* If you encounter a situation where an exception from the C++ code
* is not properly converted to an error as described above, it is a
* bug in qpdf, which should be reported at
* https://github.com/qpdf/qpdf/issues/new.
*/
#include <qpdf/DLL.h>
#include <qpdf/Types.h>
#include <qpdf/Constants.h>
@ -116,6 +164,15 @@ extern "C" {
# define QPDF_TRUE 1
# define QPDF_FALSE 0
/* From qpdf 10.5: call this method to signal to the library that
* you are explicitly handling errors from functions that don't
* return error codes. Otherwise, the library will print these
* error conditions to stderr and issue a warning. Prior to 10.5,
* the program would have crashed from an unhandled exception.
*/
QPDF_DLL
void qpdf_silence_errors(qpdf_data qpdf);
/* Returns the version of the qpdf software */
QPDF_DLL
char const* qpdf_get_qpdf_version();
@ -560,48 +617,14 @@ extern "C" {
* same as letting a QPDFObjectHandle go out of scope in the C++
* API.
*
* Important note about error handling:
*
* While many of the functions that operate on the QPDF object
* return error codes, the qpdf_oh functions return values such as
* object handles or data. They have no way to return error codes.
* If they generate warnings, the warnings are handled using the
* error/warning handling functions described above. If the
* underlying C++ call throws an exception, the error handler
* registered with qpdf_register_oh_error_handler() will be
* called. If no handler is registered, the exception is written
* to STDERR. In either case, a sensible fallback value is
* returned (0 for numbers, QPDF_FALSE for booleans, "" for
* strings, or a null object). It is sensible for a C program to
* use setjmp and longjmp with this error handler since the C++
* code has raised an exception, but you can also just set a flag
* and check it after each call.
*
* All conditions under which exceptions would be thrown by object
* accessors are caused by programmer error or major problems such
* as running out of memory or not being able to read the input
* file. If they are ever caused by invalid data in the PDF file,
* it is a bug in qpdf, which should be reported at
* https://github.com/qpdf/qpdf/issues/new.
* Please see "ERROR HANDLING" above for details on how error
* conditions are handled.
*/
/* For examples of using this API, see examples/pdf-c-objects.c */
typedef unsigned int qpdf_oh;
/* If an exception is thrown by the C++ code when any of the
* qpdf_oh functions are called, the registered handle_error
* function will be called. The value passed to data will be
* passed along to the error handler function. If any errors occur
* and no error handler is accessed, a single warning will be
* issued, and the error will be written to stderr.
*/
QPDF_DLL
void qpdf_register_oh_error_handler(
qpdf_data qpdf,
void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data),
void* data);
/* Releasing objects -- see comments above. These functions have no
* equivalent in the C++ API.
*/

View File

@ -42,9 +42,8 @@ struct _qpdf_data
PointerHolder<Buffer> output_buffer;
// QPDFObjectHandle support
void (*oh_error_handler)(qpdf_data, qpdf_error, void*);
void* oh_error_handler_data;
bool default_oh_error_handler_called;
bool silence_errors;
bool oh_error_occurred;
std::map<qpdf_oh, PointerHolder<QPDFObjectHandle>> oh_cache;
qpdf_oh next_oh;
std::set<std::string> cur_iter_dict_keys;
@ -52,32 +51,10 @@ struct _qpdf_data
std::string cur_dict_key;
};
static void default_oh_error_handler(qpdf_data qpdf, qpdf_error e, void* data)
{
bool* called = reinterpret_cast<bool*>(data);
if (called != nullptr)
{
QTC::TC("qpdf", "qpdf-c warn about oh error", *called ? 0 : 1);
if (! *called)
{
qpdf->warnings.push_back(
QPDFExc(
qpdf_e_internal,
qpdf->qpdf->getFilename(),
"", 0,
"C API object handle accessor errors occurred,"
" and the application did not define an error handler"));
*called = true;
}
}
std::cerr << e->exc->what() << std::endl;
}
_qpdf_data::_qpdf_data() :
write_memory(false),
oh_error_handler(default_oh_error_handler),
oh_error_handler_data(&this->default_oh_error_handler_called),
default_oh_error_handler_called(false),
silence_errors(false),
oh_error_occurred(false),
next_oh(0)
{
}
@ -876,14 +853,10 @@ QPDF_ERROR_CODE qpdf_write(qpdf_data qpdf)
return status;
}
void qpdf_register_oh_error_handler(
qpdf_data qpdf,
void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data),
void* data)
void qpdf_silence_errors(qpdf_data qpdf)
{
QTC::TC("qpdf", "qpdf-c registered oh error handler");
qpdf->oh_error_handler = handle_error;
qpdf->oh_error_handler_data = data;
QTC::TC("qpdf", "qpdf-c silence oh errors");
qpdf->silence_errors = true;
}
template<class RET>
@ -901,8 +874,24 @@ static RET trap_oh_errors(
});
if (status & QPDF_ERRORS)
{
(*qpdf->oh_error_handler)(
qpdf, qpdf_get_error(qpdf), qpdf->oh_error_handler_data);
if (! qpdf->silence_errors)
{
QTC::TC("qpdf", "qpdf-c warn about oh error",
qpdf->oh_error_occurred ? 0 : 1);
if (! qpdf->oh_error_occurred)
{
qpdf->warnings.push_back(
QPDFExc(
qpdf_e_internal,
qpdf->qpdf->getFilename(),
"", 0,
"C API function caught an exception that it isn't"
" returning; please point the application developer"
" to ERROR HANDLING in qpdf-c.h"));
qpdf->oh_error_occurred = true;
}
std::cerr << qpdf->error->what() << std::endl;
}
return fallback();
}
return ret;

View File

@ -3632,14 +3632,16 @@ For a detailed list of changes, please see the file
- C API Enhancements
- Overhaul error handling for the object handle functions
C API. See comments in the "Object handling" section of
:file:`include/qpdf/qpdf-c.h` for details.
In particular, exceptions thrown by the underlying C++ code
when calling object accessors are caught and converted into
errors. The errors can be trapped by registering an error
handler with ``qpdf_register_oh_error_handler`` or will be
written to stderr if no handler is registered.
- Overhaul error handling for the object handle functions C API.
Some rare error conditions that would previously have caused a
crash are now trapped and reported, and the functions that
generate them return fallback values. See comments in the
``ERROR HANDLING`` section of :file:`include/qpdf/qpdf-c.h` for
details. In particular, exceptions thrown by the underlying C++
code when calling object accessors are caught and converted into
errors. The errors can be checked by call ``qpdf_has_error``.
Use ``qpdf_silence_errors`` to prevent the error from being
written to stderr.
- Add ``qpdf_get_last_string_length`` to the C API to get the
length of the last string that was returned. This is needed to

View File

@ -73,14 +73,12 @@ static void report_errors()
}
}
static void handle_oh_error(qpdf_data qpdf, qpdf_error error, void* data)
static void handle_oh_error(qpdf_data qpdf, char const* label)
{
char const* label = "oh error";
if (data)
if (qpdf_has_error(qpdf))
{
label = *((char const**)data);
print_error(label, qpdf, qpdf_get_error(qpdf));
}
print_error(label, qpdf, error);
}
static void read_file_into_memory(char const* filename,
@ -837,42 +835,44 @@ static void test29(char const* infile,
* errors rather than warnings when they don't have an owning QPDF
* object.
*/
char const* label = "oh error";
qpdf_register_oh_error_handler(qpdf, handle_oh_error, (void*)&label);
qpdf_silence_errors(qpdf);
/* get_root fails when we have no trailer */
label = "get root";
qpdf_oh root = qpdf_get_root(qpdf);
handle_oh_error(qpdf, "get root");
assert(root != 0);
assert(! qpdf_oh_is_initialized(qpdf, root));
label = "bad parse";
assert(! qpdf_oh_is_initialized(qpdf, qpdf_oh_parse(qpdf, "[oops")));
handle_oh_error(qpdf, "bad parse");
report_errors();
label = "type mismatch";
assert(qpdf_oh_get_int_value_as_int(
qpdf, qpdf_oh_new_string(qpdf, "x")) == 0);
handle_oh_error(qpdf, "type mismatch (int operation on string)");
qpdf_oh int_oh = qpdf_oh_new_integer(qpdf, 12);
assert(strlen(qpdf_oh_get_string_value(qpdf, int_oh)) == 0);
handle_oh_error(qpdf, "type mismatch (string operation on int)");
// This doesn't test every possible error flow, but it tests each
// way of handling errors in the library code.
label = "array type mismatch";
assert(qpdf_oh_get_array_n_items(qpdf, int_oh) == 0);
handle_oh_error(qpdf, "array type mismatch - n_items");
assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, int_oh, 3)));
label = "append to non-array";
handle_oh_error(qpdf, "array type mismatch - item");
qpdf_oh_append_item(qpdf, int_oh, qpdf_oh_new_null(qpdf));
handle_oh_error(qpdf, "append to non-array");
qpdf_oh array = qpdf_oh_new_array(qpdf);
label = "array bounds";
assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, array, 3)));
handle_oh_error(qpdf, "array bounds");
label = "dictionary iter type mismatch";
qpdf_oh_begin_dict_key_iter(qpdf, int_oh);
assert(qpdf_oh_dict_more_keys(qpdf) == QPDF_FALSE);
label = "dictionary type mismatch";
handle_oh_error(qpdf, "dictionary iter type mismatch");
assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_key(qpdf, int_oh, "potato")));
handle_oh_error(qpdf, "dictionary type mismatch");
assert(qpdf_oh_has_key(qpdf, int_oh, "potato") == QPDF_FALSE);
handle_oh_error(qpdf, "dictionary type mismatch");
report_errors();
}
@ -1030,6 +1030,9 @@ static void test35(char const* infile,
assert(qpdf_more_warnings(qpdf));
e = qpdf_next_warning(qpdf);
assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS);
assert(qpdf_has_error(qpdf));
e = qpdf_get_error(qpdf);
assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS);
assert(! qpdf_has_error(qpdf));
assert(qpdf_find_page_by_id(qpdf, 100, 0) == -1);

View File

@ -606,7 +606,6 @@ qpdf-c called qpdf_oh_is_initialized 0
qpdf-c registered progress reporter 0
qpdf-c called qpdf_oh_new_uninitialized 0
qpdf-c warn about oh error 1
qpdf-c registered oh error handler 0
qpdf-c cleanup warned about unhandled error 0
qpdf-c called qpdf_get_object_by_id 0
qpdf-c called qpdf_replace_object 0
@ -625,3 +624,4 @@ qpdf-c stream data filtered set 1
qpdf-c stream data buf set 1
qpdf-c called qpdf_oh_get_page_content_data 0
qpdf-c called qpdf_oh_replace_stream_data 0
qpdf-c silence oh errors 0

View File

@ -10,9 +10,14 @@ item 3: 792 792.00
minimal.pdf (C API object handle 6): attempted access to unknown object handle
minimal.pdf (C API object handle 9): attempted access to unknown object handle
minimal.pdf (C API object handle 9): attempted access to unknown object handle
warning: minimal.pdf: C API object handle accessor errors occurred, and the application did not define an error handler
warning: minimal.pdf: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h
code: 1
file: minimal.pdf
pos : 0
text: C API object handle accessor errors occurred, and the application did not define an error handler
text: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h
error: minimal.pdf (C API object handle 9): attempted access to unknown object handle
code: 1
file: minimal.pdf
pos : 0
text: attempted access to unknown object handle
C test 24 done

View File

@ -8,22 +8,22 @@ bad parse: parsed object (offset 1): unknown token while reading object; treatin
file: parsed object
pos : 1
text: unknown token while reading object; treating as string
type mismatch: operation for integer attempted on object of type string: returning 0
type mismatch (int operation on string): operation for integer attempted on object of type string: returning 0
code: 7
file:
pos : 0
text: operation for integer attempted on object of type string: returning 0
type mismatch: operation for string attempted on object of type integer: returning empty string
type mismatch (string operation on int): operation for string attempted on object of type integer: returning empty string
code: 7
file:
pos : 0
text: operation for string attempted on object of type integer: returning empty string
array type mismatch: operation for array attempted on object of type integer: treating as empty
array type mismatch - n_items: operation for array attempted on object of type integer: treating as empty
code: 7
file:
pos : 0
text: operation for array attempted on object of type integer: treating as empty
array type mismatch: operation for array attempted on object of type integer: returning null
array type mismatch - item: operation for array attempted on object of type integer: returning null
code: 7
file:
pos : 0

View File

@ -1,2 +1,3 @@
closed input source (C API object handle 1): attempted access to unknown object handle
WARNING: application did not handle error: closed input source (C API object handle 1): attempted access to unknown object handle
C test 26 done

View File

@ -6,4 +6,9 @@ warning: object 27 0: operation for dictionary attempted on object of type null:
file:
pos : 0
text: operation for dictionary attempted on object of type null: ignoring key replacement request
error: 11-pages.pdf (C API object handle 1000): attempted access to unknown object handle
code: 1
file: 11-pages.pdf
pos : 0
text: attempted access to unknown object handle
C test 35 done