From ddbe59179e64c45a375d6886f892059b49fd81b2 Mon Sep 17 00:00:00 2001 From: Jay Berkenbilt Date: Fri, 17 Dec 2021 15:30:47 -0500 Subject: [PATCH] C API: simplify new error handling and improve documentation --- ChangeLog | 10 +- include/qpdf/qpdf-c.h | 145 ++++++++++-------- libqpdf/qpdf-c.cc | 61 +++----- manual/index.rst | 18 ++- qpdf/qpdf-ctest.c | 33 ++-- qpdf/qpdf.testcov | 2 +- qpdf/qtest/qpdf/c-object-handles.out | 9 +- qpdf/qtest/qpdf/c-oh-errors.out | 8 +- .../qtest/qpdf/c-oh-uninitialized-objects.out | 1 + qpdf/qtest/qpdf/c-page-errors.out | 5 + 10 files changed, 161 insertions(+), 131 deletions(-) diff --git a/ChangeLog b/ChangeLog index 2dc49e65..5f02ba43 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,10 @@ 2021-12-17 Jay Berkenbilt + * C API: simplify error handling for uncaught errors (never in a + released version) and clarify documentation in qpdf-c.h around + error handling. See qpdf-c.h for details, including how to check + for errors and the new function qpdf_silence_errors. + * C API: expose getTypeCode and getTypeName from QPDFObjectHandle. Fixes #597. @@ -40,10 +45,7 @@ interfaces. Clarify documentation regarding object accessors and how type errors and warnings are handled. Many cases that used to crash code that used the C API can now be trapped and will be - written stderr if not trapped. The new method - qpdf_register_oh_error_handler can be used to specifically handle - errors that occur when accessing object handles. See qpdf-c.h for - details. + written stderr if not trapped. See qpdf-c.h for details. * C API: Add qpdf_oh_new_uninitialized to explicitly create uninitialized object handles. diff --git a/include/qpdf/qpdf-c.h b/include/qpdf/qpdf-c.h index fa511337..3045dec6 100644 --- a/include/qpdf/qpdf-c.h +++ b/include/qpdf/qpdf-c.h @@ -33,6 +33,10 @@ * * There are several things to keep in mind when using the C API. * + * Error handling is tricky because the underlying C++ API uses + * exception handling. See "ERROR HANDLING" below for a detailed + * explanation. + * * The C API is not as rich as the C++ API. For any operations * that involve actually manipulating PDF objects, you must use * the C++ API. The C API is primarily useful for doing basic @@ -46,8 +50,9 @@ * multiple threads. * * All dynamic memory, except for that of the qpdf_data object - * itself, is managed by the library. You must create a qpdf_data - * object using qpdf_init and free it using qpdf_cleanup. + * itself, is managed by the library unless otherwise noted. You + * must create a qpdf_data object using qpdf_init and free it + * using qpdf_cleanup. * * Many functions return char*. In all cases, the char* values * returned are pointers to data inside the qpdf_data object. As @@ -61,29 +66,6 @@ * qpdf_get_last_string_length() to get the length of whatever * string was just returned. * - * Many functions defined here merely set parameters and therefore - * never return error conditions. Functions that access or return - * qpdf_oh object handles may generate warnings but have no way to - * return errors, but the errors may be checked afterwards or - * handled using a registered handler. This is discussed in more - * detail in the section on object handling. Functions that may - * cause PDF files to be read or written may return error - * conditions. Such functions return an error code. If there were - * no errors or warnings, they return QPDF_SUCCESS. If there were - * warnings, the return value has the QPDF_WARNINGS bit set. If - * there were errors, the QPDF_ERRORS bit is set. In other words, - * if there are both warnings and errors, then the return status - * will be QPDF_WARNINGS | QPDF_ERRORS. You may also call the - * qpdf_more_warnings and qpdf_more_errors functions to test - * whether there are unseen warning or error conditions. By - * default, warnings are written to stderr when detected, but this - * behavior can be suppressed. In all cases, errors and warnings - * may be retrieved by calling qpdf_next_warning and - * qpdf_get_error. All exceptions thrown by the C++ interface are - * caught and converted into error messages by the C interface. - * Any exceptions to this are qpdf bugs and should be reported at - * https://github.com/qpdf/qpdf/issues/new. - * * Most functions defined here have obvious counterparts that are * methods to either QPDF or QPDFWriter. Please see comments in * QPDF.hh and QPDFWriter.hh for details on their use. In order @@ -91,6 +73,72 @@ * primarily on differences between the C and C++ API. */ +/* ERROR HANDLING -- changed in qpdf 10.5 */ + +/* SUMMARY: The only way to know whether a function that does not + * return an error code has encountered an error is to call + * qpdf_has_error after each function. You can do this even for + * functions that do return error codes. You can also call + * qpdf_silence_errors to prevent qpdf from writing these errors to + * stderr. + * + * DETAILS: + * + * There is some complexity in this API's error handling as it tries + * to adopt C++-style exception handling to work with languages that + * don't support exceptions, such as C. + * + * The data type underlying qpdf_data maintains a list of warnings and + * a single error. To retrieve warnings, call qpdf_next_warning while + * qpdf_more_warnings is true. To retrieve the error, call + * qpdf_get_error when qpdf_has_error is true. + * + * There are several things that are important to understand. + * + * Some functions return an error code. The value of the error code is + * made up of a bitwise-OR of QPDF_WARNINGS and QPDF_ERRORS. The + * QPDF_ERRORS bit is set if there was an error during the *most + * recent call* to the API. The QPDF_WARNINGS bit is set if there are + * any warnings that have not yet been retrieved by calling + * qpdf_more_warnings. It is possible for both its or neither bit to + * be set. + * + * The expected mode of operation is to go through a series of + * operations, checking for errors after each call, but only checking + * for warnings at the end. This is similar to how it works in the C++ + * API where warnings are handled in exactly this way but errors + * result in exceptions being thrown. However, in both the C and C++ + * API, it is possible to check for and handle warnings as they arise. + * + * Some functions return values (or void) rather than an error code. + * This is especially true with the object handling functions. Those + * functions can still generate errors. To handle errors in those + * cases, you should explicitly call qpdf_has_error(). Note that, if + * you want to avoid the inconsistencies in the interface, you can + * always check for error conditions in this way rather than looking + * at status return codes. + * + * Prior to qpdf 10.5, if one of the functions that does not return an + * error code encountered an exception, it would cause the entire + * program to crash. Starting in qpdf 10.5, the default response to an + * error condition in these situations is to print the error to + * standard error, issue exactly one warning indicating that such an + * error occurred, and return a sensible fallback value (0 for + * numbers, QPDF_FALSE for booleans, "" for strings, or a null or + * uninitialized object handle). This is better than the old behavior + * but still undesirable as the best option is to explicitly check for + * error conditions. + * + * To prevent qpdf from writing error messages to stderr in this way, + * you can call qpdf_silence_errors(). This signals to the qpdf + * library that you intend to check the error codes yourself. + * + * If you encounter a situation where an exception from the C++ code + * is not properly converted to an error as described above, it is a + * bug in qpdf, which should be reported at + * https://github.com/qpdf/qpdf/issues/new. + */ + #include #include #include @@ -116,6 +164,15 @@ extern "C" { # define QPDF_TRUE 1 # define QPDF_FALSE 0 + /* From qpdf 10.5: call this method to signal to the library that + * you are explicitly handling errors from functions that don't + * return error codes. Otherwise, the library will print these + * error conditions to stderr and issue a warning. Prior to 10.5, + * the program would have crashed from an unhandled exception. + */ + QPDF_DLL + void qpdf_silence_errors(qpdf_data qpdf); + /* Returns the version of the qpdf software */ QPDF_DLL char const* qpdf_get_qpdf_version(); @@ -560,48 +617,14 @@ extern "C" { * same as letting a QPDFObjectHandle go out of scope in the C++ * API. * - * Important note about error handling: - * - * While many of the functions that operate on the QPDF object - * return error codes, the qpdf_oh functions return values such as - * object handles or data. They have no way to return error codes. - * If they generate warnings, the warnings are handled using the - * error/warning handling functions described above. If the - * underlying C++ call throws an exception, the error handler - * registered with qpdf_register_oh_error_handler() will be - * called. If no handler is registered, the exception is written - * to STDERR. In either case, a sensible fallback value is - * returned (0 for numbers, QPDF_FALSE for booleans, "" for - * strings, or a null object). It is sensible for a C program to - * use setjmp and longjmp with this error handler since the C++ - * code has raised an exception, but you can also just set a flag - * and check it after each call. - * - * All conditions under which exceptions would be thrown by object - * accessors are caused by programmer error or major problems such - * as running out of memory or not being able to read the input - * file. If they are ever caused by invalid data in the PDF file, - * it is a bug in qpdf, which should be reported at - * https://github.com/qpdf/qpdf/issues/new. + * Please see "ERROR HANDLING" above for details on how error + * conditions are handled. */ /* For examples of using this API, see examples/pdf-c-objects.c */ typedef unsigned int qpdf_oh; - /* If an exception is thrown by the C++ code when any of the - * qpdf_oh functions are called, the registered handle_error - * function will be called. The value passed to data will be - * passed along to the error handler function. If any errors occur - * and no error handler is accessed, a single warning will be - * issued, and the error will be written to stderr. - */ - QPDF_DLL - void qpdf_register_oh_error_handler( - qpdf_data qpdf, - void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data), - void* data); - /* Releasing objects -- see comments above. These functions have no * equivalent in the C++ API. */ diff --git a/libqpdf/qpdf-c.cc b/libqpdf/qpdf-c.cc index 3f11d89c..9593a011 100644 --- a/libqpdf/qpdf-c.cc +++ b/libqpdf/qpdf-c.cc @@ -42,9 +42,8 @@ struct _qpdf_data PointerHolder output_buffer; // QPDFObjectHandle support - void (*oh_error_handler)(qpdf_data, qpdf_error, void*); - void* oh_error_handler_data; - bool default_oh_error_handler_called; + bool silence_errors; + bool oh_error_occurred; std::map> oh_cache; qpdf_oh next_oh; std::set cur_iter_dict_keys; @@ -52,32 +51,10 @@ struct _qpdf_data std::string cur_dict_key; }; -static void default_oh_error_handler(qpdf_data qpdf, qpdf_error e, void* data) -{ - bool* called = reinterpret_cast(data); - if (called != nullptr) - { - QTC::TC("qpdf", "qpdf-c warn about oh error", *called ? 0 : 1); - if (! *called) - { - qpdf->warnings.push_back( - QPDFExc( - qpdf_e_internal, - qpdf->qpdf->getFilename(), - "", 0, - "C API object handle accessor errors occurred," - " and the application did not define an error handler")); - *called = true; - } - } - std::cerr << e->exc->what() << std::endl; -} - _qpdf_data::_qpdf_data() : write_memory(false), - oh_error_handler(default_oh_error_handler), - oh_error_handler_data(&this->default_oh_error_handler_called), - default_oh_error_handler_called(false), + silence_errors(false), + oh_error_occurred(false), next_oh(0) { } @@ -876,14 +853,10 @@ QPDF_ERROR_CODE qpdf_write(qpdf_data qpdf) return status; } -void qpdf_register_oh_error_handler( - qpdf_data qpdf, - void (*handle_error)(qpdf_data qpdf, qpdf_error error, void* data), - void* data) +void qpdf_silence_errors(qpdf_data qpdf) { - QTC::TC("qpdf", "qpdf-c registered oh error handler"); - qpdf->oh_error_handler = handle_error; - qpdf->oh_error_handler_data = data; + QTC::TC("qpdf", "qpdf-c silence oh errors"); + qpdf->silence_errors = true; } template @@ -901,8 +874,24 @@ static RET trap_oh_errors( }); if (status & QPDF_ERRORS) { - (*qpdf->oh_error_handler)( - qpdf, qpdf_get_error(qpdf), qpdf->oh_error_handler_data); + if (! qpdf->silence_errors) + { + QTC::TC("qpdf", "qpdf-c warn about oh error", + qpdf->oh_error_occurred ? 0 : 1); + if (! qpdf->oh_error_occurred) + { + qpdf->warnings.push_back( + QPDFExc( + qpdf_e_internal, + qpdf->qpdf->getFilename(), + "", 0, + "C API function caught an exception that it isn't" + " returning; please point the application developer" + " to ERROR HANDLING in qpdf-c.h")); + qpdf->oh_error_occurred = true; + } + std::cerr << qpdf->error->what() << std::endl; + } return fallback(); } return ret; diff --git a/manual/index.rst b/manual/index.rst index ba2fe1dc..fcbdea4c 100644 --- a/manual/index.rst +++ b/manual/index.rst @@ -3632,14 +3632,16 @@ For a detailed list of changes, please see the file - C API Enhancements - - Overhaul error handling for the object handle functions - C API. See comments in the "Object handling" section of - :file:`include/qpdf/qpdf-c.h` for details. - In particular, exceptions thrown by the underlying C++ code - when calling object accessors are caught and converted into - errors. The errors can be trapped by registering an error - handler with ``qpdf_register_oh_error_handler`` or will be - written to stderr if no handler is registered. + - Overhaul error handling for the object handle functions C API. + Some rare error conditions that would previously have caused a + crash are now trapped and reported, and the functions that + generate them return fallback values. See comments in the + ``ERROR HANDLING`` section of :file:`include/qpdf/qpdf-c.h` for + details. In particular, exceptions thrown by the underlying C++ + code when calling object accessors are caught and converted into + errors. The errors can be checked by call ``qpdf_has_error``. + Use ``qpdf_silence_errors`` to prevent the error from being + written to stderr. - Add ``qpdf_get_last_string_length`` to the C API to get the length of the last string that was returned. This is needed to diff --git a/qpdf/qpdf-ctest.c b/qpdf/qpdf-ctest.c index f4142127..9d2ac6fd 100644 --- a/qpdf/qpdf-ctest.c +++ b/qpdf/qpdf-ctest.c @@ -73,14 +73,12 @@ static void report_errors() } } -static void handle_oh_error(qpdf_data qpdf, qpdf_error error, void* data) +static void handle_oh_error(qpdf_data qpdf, char const* label) { - char const* label = "oh error"; - if (data) + if (qpdf_has_error(qpdf)) { - label = *((char const**)data); + print_error(label, qpdf, qpdf_get_error(qpdf)); } - print_error(label, qpdf, error); } static void read_file_into_memory(char const* filename, @@ -837,42 +835,44 @@ static void test29(char const* infile, * errors rather than warnings when they don't have an owning QPDF * object. */ - char const* label = "oh error"; - qpdf_register_oh_error_handler(qpdf, handle_oh_error, (void*)&label); + qpdf_silence_errors(qpdf); /* get_root fails when we have no trailer */ - label = "get root"; qpdf_oh root = qpdf_get_root(qpdf); + handle_oh_error(qpdf, "get root"); assert(root != 0); assert(! qpdf_oh_is_initialized(qpdf, root)); - label = "bad parse"; assert(! qpdf_oh_is_initialized(qpdf, qpdf_oh_parse(qpdf, "[oops"))); + handle_oh_error(qpdf, "bad parse"); report_errors(); - label = "type mismatch"; assert(qpdf_oh_get_int_value_as_int( qpdf, qpdf_oh_new_string(qpdf, "x")) == 0); + handle_oh_error(qpdf, "type mismatch (int operation on string)"); qpdf_oh int_oh = qpdf_oh_new_integer(qpdf, 12); assert(strlen(qpdf_oh_get_string_value(qpdf, int_oh)) == 0); + handle_oh_error(qpdf, "type mismatch (string operation on int)"); // This doesn't test every possible error flow, but it tests each // way of handling errors in the library code. - label = "array type mismatch"; assert(qpdf_oh_get_array_n_items(qpdf, int_oh) == 0); + handle_oh_error(qpdf, "array type mismatch - n_items"); assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, int_oh, 3))); - label = "append to non-array"; + handle_oh_error(qpdf, "array type mismatch - item"); qpdf_oh_append_item(qpdf, int_oh, qpdf_oh_new_null(qpdf)); + handle_oh_error(qpdf, "append to non-array"); qpdf_oh array = qpdf_oh_new_array(qpdf); - label = "array bounds"; assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_array_item(qpdf, array, 3))); + handle_oh_error(qpdf, "array bounds"); - label = "dictionary iter type mismatch"; qpdf_oh_begin_dict_key_iter(qpdf, int_oh); assert(qpdf_oh_dict_more_keys(qpdf) == QPDF_FALSE); - label = "dictionary type mismatch"; + handle_oh_error(qpdf, "dictionary iter type mismatch"); assert(qpdf_oh_is_null(qpdf, qpdf_oh_get_key(qpdf, int_oh, "potato"))); + handle_oh_error(qpdf, "dictionary type mismatch"); assert(qpdf_oh_has_key(qpdf, int_oh, "potato") == QPDF_FALSE); + handle_oh_error(qpdf, "dictionary type mismatch"); report_errors(); } @@ -1030,6 +1030,9 @@ static void test35(char const* infile, assert(qpdf_more_warnings(qpdf)); e = qpdf_next_warning(qpdf); assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS); + assert(qpdf_has_error(qpdf)); + e = qpdf_get_error(qpdf); + assert(qpdf_get_error_code(qpdf, e) != QPDF_SUCCESS); assert(! qpdf_has_error(qpdf)); assert(qpdf_find_page_by_id(qpdf, 100, 0) == -1); diff --git a/qpdf/qpdf.testcov b/qpdf/qpdf.testcov index 6422393d..034f8d8e 100644 --- a/qpdf/qpdf.testcov +++ b/qpdf/qpdf.testcov @@ -606,7 +606,6 @@ qpdf-c called qpdf_oh_is_initialized 0 qpdf-c registered progress reporter 0 qpdf-c called qpdf_oh_new_uninitialized 0 qpdf-c warn about oh error 1 -qpdf-c registered oh error handler 0 qpdf-c cleanup warned about unhandled error 0 qpdf-c called qpdf_get_object_by_id 0 qpdf-c called qpdf_replace_object 0 @@ -625,3 +624,4 @@ qpdf-c stream data filtered set 1 qpdf-c stream data buf set 1 qpdf-c called qpdf_oh_get_page_content_data 0 qpdf-c called qpdf_oh_replace_stream_data 0 +qpdf-c silence oh errors 0 diff --git a/qpdf/qtest/qpdf/c-object-handles.out b/qpdf/qtest/qpdf/c-object-handles.out index 1a7809ce..57dc2e98 100644 --- a/qpdf/qtest/qpdf/c-object-handles.out +++ b/qpdf/qtest/qpdf/c-object-handles.out @@ -10,9 +10,14 @@ item 3: 792 792.00 minimal.pdf (C API object handle 6): attempted access to unknown object handle minimal.pdf (C API object handle 9): attempted access to unknown object handle minimal.pdf (C API object handle 9): attempted access to unknown object handle -warning: minimal.pdf: C API object handle accessor errors occurred, and the application did not define an error handler +warning: minimal.pdf: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h code: 1 file: minimal.pdf pos : 0 - text: C API object handle accessor errors occurred, and the application did not define an error handler + text: C API function caught an exception that it isn't returning; please point the application developer to ERROR HANDLING in qpdf-c.h +error: minimal.pdf (C API object handle 9): attempted access to unknown object handle + code: 1 + file: minimal.pdf + pos : 0 + text: attempted access to unknown object handle C test 24 done diff --git a/qpdf/qtest/qpdf/c-oh-errors.out b/qpdf/qtest/qpdf/c-oh-errors.out index 164defd2..992b0328 100644 --- a/qpdf/qtest/qpdf/c-oh-errors.out +++ b/qpdf/qtest/qpdf/c-oh-errors.out @@ -8,22 +8,22 @@ bad parse: parsed object (offset 1): unknown token while reading object; treatin file: parsed object pos : 1 text: unknown token while reading object; treating as string -type mismatch: operation for integer attempted on object of type string: returning 0 +type mismatch (int operation on string): operation for integer attempted on object of type string: returning 0 code: 7 file: pos : 0 text: operation for integer attempted on object of type string: returning 0 -type mismatch: operation for string attempted on object of type integer: returning empty string +type mismatch (string operation on int): operation for string attempted on object of type integer: returning empty string code: 7 file: pos : 0 text: operation for string attempted on object of type integer: returning empty string -array type mismatch: operation for array attempted on object of type integer: treating as empty +array type mismatch - n_items: operation for array attempted on object of type integer: treating as empty code: 7 file: pos : 0 text: operation for array attempted on object of type integer: treating as empty -array type mismatch: operation for array attempted on object of type integer: returning null +array type mismatch - item: operation for array attempted on object of type integer: returning null code: 7 file: pos : 0 diff --git a/qpdf/qtest/qpdf/c-oh-uninitialized-objects.out b/qpdf/qtest/qpdf/c-oh-uninitialized-objects.out index 0fe5d307..200795af 100644 --- a/qpdf/qtest/qpdf/c-oh-uninitialized-objects.out +++ b/qpdf/qtest/qpdf/c-oh-uninitialized-objects.out @@ -1,2 +1,3 @@ closed input source (C API object handle 1): attempted access to unknown object handle +WARNING: application did not handle error: closed input source (C API object handle 1): attempted access to unknown object handle C test 26 done diff --git a/qpdf/qtest/qpdf/c-page-errors.out b/qpdf/qtest/qpdf/c-page-errors.out index e4c0e79d..9c55bd48 100644 --- a/qpdf/qtest/qpdf/c-page-errors.out +++ b/qpdf/qtest/qpdf/c-page-errors.out @@ -6,4 +6,9 @@ warning: object 27 0: operation for dictionary attempted on object of type null: file: pos : 0 text: operation for dictionary attempted on object of type null: ignoring key replacement request +error: 11-pages.pdf (C API object handle 1000): attempted access to unknown object handle + code: 1 + file: 11-pages.pdf + pos : 0 + text: attempted access to unknown object handle C test 35 done