mirror of
https://github.com/qpdf/qpdf.git
synced 2024-06-04 03:10:52 +00:00
JSON schema: support multi-element array validation
This commit is contained in:
parent
f8d1ab9462
commit
4674c04cb8
|
@ -8,7 +8,11 @@
|
||||||
allow QPDFJob JSON to start accepting multiple items where a
|
allow QPDFJob JSON to start accepting multiple items where a
|
||||||
single item used to be expected without breaking backward
|
single item used to be expected without breaking backward
|
||||||
compatibility. Without this change, the earlier fix to
|
compatibility. Without this change, the earlier fix to
|
||||||
removeAttachment would be a breaking change.
|
removeAttachment would be a breaking change. Also allow the schema
|
||||||
|
to contain a multi-element array, which means that the output has
|
||||||
|
to have an array of the same length in the corresponding location,
|
||||||
|
and each element is validated against the corresponding schema
|
||||||
|
element.
|
||||||
|
|
||||||
* QPDFObjectHandle: for the methods insertItem, appendItem,
|
* QPDFObjectHandle: for the methods insertItem, appendItem,
|
||||||
eraseItem, replaceKey, and removeKey, add a corresponding
|
eraseItem, replaceKey, and removeKey, add a corresponding
|
||||||
|
|
5
TODO
5
TODO
|
@ -124,11 +124,6 @@ JSON v2 fixes
|
||||||
compatibility, but at least this gives people a namespace they can
|
compatibility, but at least this gives people a namespace they can
|
||||||
know will never conflict with future keys.
|
know will never conflict with future keys.
|
||||||
|
|
||||||
* Change schema validation so that if the schema contains an array
|
|
||||||
with more than one element, the output has to have an array with
|
|
||||||
the same number of elements whose individual elements are
|
|
||||||
validated according to the regular rules.
|
|
||||||
|
|
||||||
* When reading back in, we'll have to call
|
* When reading back in, we'll have to call
|
||||||
pushInheritedAttributesToPage or getAllPages based on the values
|
pushInheritedAttributesToPage or getAllPages based on the values
|
||||||
of the metadata.
|
of the metadata.
|
||||||
|
|
|
@ -183,25 +183,30 @@ class JSON
|
||||||
//
|
//
|
||||||
// * The schema is a nested structure containing dictionaries,
|
// * The schema is a nested structure containing dictionaries,
|
||||||
// single-element arrays, and strings only.
|
// single-element arrays, and strings only.
|
||||||
// * Recursively walk the schema.
|
// * Recursively walk the schema. In the items below, "schema
|
||||||
// * Whenever the schema has an array of length 1 and the object
|
// object" refers to an object in the schema, and "checked
|
||||||
// does not have an array in the corresponding location,
|
// object" refers to the correspondingi part of the object
|
||||||
// validate the object against the array's single element.
|
// being checked.
|
||||||
// This effectively enables a single element to appear in
|
// * If the schema object is a dictionary, the checked object
|
||||||
// place of an array and be treated as if it were an array of
|
// must have a dictionary in the same place with the same
|
||||||
// one element. This makes it possible to decide later that
|
// keys. If flags contains f_optional, a key in the schema
|
||||||
// something that used to contain a single element now allows
|
// does not have to be present in the object. Otherwise, all
|
||||||
// an array without invalidating any old data.
|
// keys have to be present. Any key in the object must be
|
||||||
// * If the current value is a dictionary, this object must have
|
// present in the schema.
|
||||||
// a dictionary in the same place with the same keys. If flags
|
// * If the schema object is an array of length 1, the checked
|
||||||
// contains f_optional, a key in the schema does not have to
|
// object may either be a single item or an array of items.
|
||||||
// be present in the object. Otherwise, all keys have to be
|
// The single item or each element of the checked object's
|
||||||
// present. Any key in the object must be present in the
|
// array is validated against the single element of the
|
||||||
// schema.
|
// schema's array. The rationale behind this logic is that a
|
||||||
// * If the current value is an array, this object must have an
|
// single element may appear wherever the schema allows a
|
||||||
// array in the same place. The schema's array must contain a
|
// variable-length array. This makes it possible to start
|
||||||
// single element, which is used as a schema to validate each
|
// allowing an array in the future where a single element was
|
||||||
// element of this object's corresponding array.
|
// previously required without breaking backward
|
||||||
|
// compatibility.
|
||||||
|
// * If the schema object is an array of length > 1, the checked
|
||||||
|
// object must be an array of the same length. In this case,
|
||||||
|
// each element of the checked object array is validated
|
||||||
|
// against the corresponding element of the schema array.
|
||||||
// * Otherwise, the value must be a string whose value is a
|
// * Otherwise, the value must be a string whose value is a
|
||||||
// description of the object's corresponding value, which may
|
// description of the object's corresponding value, which may
|
||||||
// have any type.
|
// have any type.
|
||||||
|
|
|
@ -538,27 +538,55 @@ JSON::checkSchemaInternal(
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
} else if (sch_arr) {
|
} else if (sch_arr) {
|
||||||
if (sch_arr->elements.size() != 1) {
|
auto n_elements = sch_arr->elements.size();
|
||||||
QTC::TC("libtests", "JSON schema array error");
|
if (n_elements == 1) {
|
||||||
errors.push_back(
|
// A single-element array in the schema allows a single
|
||||||
err_prefix + " schema array contains other than one item");
|
// element in the object or a variable-length array, each
|
||||||
return false;
|
// of whose items must conform to the single element of
|
||||||
}
|
// the schema array. This doesn't apply to arrays of
|
||||||
if (this_arr) {
|
// arrays -- we fall back to the behavior of allowing a
|
||||||
int i = 0;
|
// single item only when the object is not an array.
|
||||||
for (auto const& element: this_arr->elements) {
|
if (this_arr) {
|
||||||
|
int i = 0;
|
||||||
|
for (auto const& element: this_arr->elements) {
|
||||||
|
checkSchemaInternal(
|
||||||
|
element.get(),
|
||||||
|
sch_arr->elements.at(0).get(),
|
||||||
|
flags,
|
||||||
|
errors,
|
||||||
|
prefix + "." + QUtil::int_to_string(i));
|
||||||
|
++i;
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
QTC::TC("libtests", "JSON schema array for single item");
|
||||||
checkSchemaInternal(
|
checkSchemaInternal(
|
||||||
element.get(),
|
this_v,
|
||||||
sch_arr->elements.at(0).get(),
|
sch_arr->elements.at(0).get(),
|
||||||
flags,
|
flags,
|
||||||
errors,
|
errors,
|
||||||
prefix + "." + QUtil::int_to_string(i));
|
prefix);
|
||||||
|
}
|
||||||
|
} else if (!this_arr || (this_arr->elements.size() != n_elements)) {
|
||||||
|
QTC::TC("libtests", "JSON schema array length mismatch");
|
||||||
|
errors.push_back(
|
||||||
|
err_prefix + " is supposed to be an array of length " +
|
||||||
|
QUtil::uint_to_string(n_elements));
|
||||||
|
return false;
|
||||||
|
} else {
|
||||||
|
// A multi-element array in the schema must correspond to
|
||||||
|
// an element of the same length in the object. Each
|
||||||
|
// element in the object is validated against the
|
||||||
|
// corresponding element in the schema.
|
||||||
|
size_t i = 0;
|
||||||
|
for (auto const& element: this_arr->elements) {
|
||||||
|
checkSchemaInternal(
|
||||||
|
element.get(),
|
||||||
|
sch_arr->elements.at(i).get(),
|
||||||
|
flags,
|
||||||
|
errors,
|
||||||
|
prefix + "." + QUtil::uint_to_string(i));
|
||||||
++i;
|
++i;
|
||||||
}
|
}
|
||||||
} else {
|
|
||||||
QTC::TC("libtests", "JSON schema array for single item");
|
|
||||||
checkSchemaInternal(
|
|
||||||
this_v, sch_arr->elements.at(0).get(), flags, errors, prefix);
|
|
||||||
}
|
}
|
||||||
} else if (!sch_str) {
|
} else if (!sch_str) {
|
||||||
QTC::TC("libtests", "JSON schema other type");
|
QTC::TC("libtests", "JSON schema other type");
|
||||||
|
|
|
@ -179,7 +179,11 @@ test_schema()
|
||||||
"z": "ebra",
|
"z": "ebra",
|
||||||
"o": "ptional"
|
"o": "ptional"
|
||||||
}
|
}
|
||||||
}
|
},
|
||||||
|
"four": [
|
||||||
|
{ "first": "first element" },
|
||||||
|
{ "second": "second element" }
|
||||||
|
]
|
||||||
}
|
}
|
||||||
)");
|
)");
|
||||||
|
|
||||||
|
@ -227,17 +231,28 @@ test_schema()
|
||||||
"else": {
|
"else": {
|
||||||
"z": "okay"
|
"z": "okay"
|
||||||
}
|
}
|
||||||
}
|
},
|
||||||
|
"four": [
|
||||||
|
{"first": "missing second"}
|
||||||
|
]
|
||||||
}
|
}
|
||||||
)");
|
)");
|
||||||
|
|
||||||
check_schema(b, schema, 0, false, "missing items");
|
check_schema(b, schema, 0, false, "missing items");
|
||||||
check_schema(a, a, 0, false, "top-level schema array error");
|
|
||||||
check_schema(b, b, 0, false, "lower-level schema array error");
|
|
||||||
|
|
||||||
JSON bad_schema = JSON::parse(R"({"a": true, "b": "potato?"})");
|
JSON bad_schema = JSON::parse(R"({"a": true, "b": "potato?"})");
|
||||||
check_schema(bad_schema, bad_schema, 0, false, "bad schema field type");
|
check_schema(bad_schema, bad_schema, 0, false, "bad schema field type");
|
||||||
|
|
||||||
|
JSON c = JSON::parse(R"(
|
||||||
|
{
|
||||||
|
"four": [
|
||||||
|
{ "first": 1 },
|
||||||
|
{ "oops": [2] }
|
||||||
|
]
|
||||||
|
}
|
||||||
|
)");
|
||||||
|
check_schema(c, schema, JSON::f_optional, false, "array element mismatch");
|
||||||
|
|
||||||
// "two" exercises the case of the JSON containing a single
|
// "two" exercises the case of the JSON containing a single
|
||||||
// element where the schema has an array.
|
// element where the schema has an array.
|
||||||
JSON good = JSON::parse(R"(
|
JSON good = JSON::parse(R"(
|
||||||
|
@ -262,7 +277,11 @@ test_schema()
|
||||||
"<objid>": {
|
"<objid>": {
|
||||||
"z": "ebra"
|
"z": "ebra"
|
||||||
}
|
}
|
||||||
}
|
},
|
||||||
|
"four": [
|
||||||
|
{ "first": 1 },
|
||||||
|
{ "second": [2] }
|
||||||
|
]
|
||||||
}
|
}
|
||||||
)");
|
)");
|
||||||
check_schema(good, schema, 0, false, "not optional");
|
check_schema(good, schema, 0, false, "not optional");
|
||||||
|
|
|
@ -36,7 +36,6 @@ Pl_PNGFilter decodePaeth 0
|
||||||
Pl_TIFFPredictor processRow 1
|
Pl_TIFFPredictor processRow 1
|
||||||
JSON wanted dictionary 0
|
JSON wanted dictionary 0
|
||||||
JSON key missing in object 0
|
JSON key missing in object 0
|
||||||
JSON schema array error 0
|
|
||||||
JSON key extra in object 0
|
JSON key extra in object 0
|
||||||
QPDFArgParser read args from stdin 0
|
QPDFArgParser read args from stdin 0
|
||||||
QPDFArgParser read args from file 0
|
QPDFArgParser read args from file 0
|
||||||
|
@ -93,3 +92,4 @@ JSON 16 low not after high 0
|
||||||
JSON 16 dangling high 0
|
JSON 16 dangling high 0
|
||||||
JSON parse duplicate key 0
|
JSON parse duplicate key 0
|
||||||
JSON schema array for single item 0
|
JSON schema array for single item 0
|
||||||
|
JSON schema array length mismatch 0
|
||||||
|
|
|
@ -2,6 +2,7 @@
|
||||||
top-level object is supposed to be a dictionary
|
top-level object is supposed to be a dictionary
|
||||||
---
|
---
|
||||||
--- missing items
|
--- missing items
|
||||||
|
json key ".four" is supposed to be an array of length 2
|
||||||
json key ".one.a": key "q" is present in schema but missing in object
|
json key ".one.a": key "q" is present in schema but missing in object
|
||||||
json key ".one.a.r" is supposed to be a dictionary
|
json key ".one.a.r" is supposed to be a dictionary
|
||||||
json key ".one.a.s": key "ss" is present in schema but missing in object
|
json key ".one.a.s": key "ss" is present in schema but missing in object
|
||||||
|
@ -15,16 +16,12 @@ json key ".two.1": key "flarp" is not present in schema but appears in object
|
||||||
json key ".two.2" is supposed to be a dictionary
|
json key ".two.2" is supposed to be a dictionary
|
||||||
json key ".two.3" is supposed to be a dictionary
|
json key ".two.3" is supposed to be a dictionary
|
||||||
---
|
---
|
||||||
--- top-level schema array error
|
|
||||||
top-level object schema array contains other than one item
|
|
||||||
---
|
|
||||||
--- lower-level schema array error
|
|
||||||
json key ".one.a.r" schema array contains other than one item
|
|
||||||
json key ".two" schema array contains other than one item
|
|
||||||
---
|
|
||||||
--- bad schema field type
|
--- bad schema field type
|
||||||
json key ".a" schema value is not dictionary, array, or string
|
json key ".a" schema value is not dictionary, array, or string
|
||||||
---
|
---
|
||||||
|
--- array element mismatch
|
||||||
|
json key ".four.1": key "oops" is not present in schema but appears in object
|
||||||
|
---
|
||||||
--- not optional
|
--- not optional
|
||||||
json key ".three.<objid>": key "o" is present in schema but missing in object
|
json key ".three.<objid>": key "o" is present in schema but missing in object
|
||||||
---
|
---
|
||||||
|
|
|
@ -595,11 +595,17 @@ Documentation
|
||||||
appears in the corresponding location of the actual output. The
|
appears in the corresponding location of the actual output. The
|
||||||
corresponding output can have any value including ``null``.
|
corresponding output can have any value including ``null``.
|
||||||
|
|
||||||
- An array in the help output always contains a single element. It
|
- A single-element array in the help output indicates that the
|
||||||
indicates that the corresponding location in the actual output is
|
corresponding location in the actual output is either a single
|
||||||
an array of any length, and that each element of the array has
|
item or is an array of any length. The single item or each
|
||||||
whatever format is implied by the single element of the help
|
element of the array has whatever format is implied by the single
|
||||||
output's array.
|
element of the help output's array.
|
||||||
|
|
||||||
|
- A multi-element array in the help output indicates that the
|
||||||
|
corresponding location in the actual output is an array of the
|
||||||
|
same length. Each element of the output array has whatever format
|
||||||
|
is implied by the corresponding element of the help output's
|
||||||
|
array.
|
||||||
|
|
||||||
For example, the help output indicates includes a ``"pagelabels"``
|
For example, the help output indicates includes a ``"pagelabels"``
|
||||||
key whose value is an array of one element. That element is a
|
key whose value is an array of one element. That element is a
|
||||||
|
|
Loading…
Reference in New Issue
Block a user