mirror of
https://github.com/qpdf/qpdf.git
synced 2024-12-22 02:49:00 +00:00
QPDFJob: documentation
This commit is contained in:
parent
5a7bb3474e
commit
cc5485dac1
@ -124,14 +124,32 @@ CODING RULES
 
 HOW TO ADD A COMMAND-LINE ARGUMENT
 
+QPDFJob is documented in three places:
+
+* This section provides a quick reminder for how to add a command-line
+  argument
+
+* generate_auto_job has a detailed explanation about how QPDFJob and
+  generate_auto_job work together
+
+* The manual ("QPDFJob Design" in qpdf-job.rst) discusses the design
+  approach, rationale, and evolution of QPDFJob.
+
 Command-line arguments are closely coupled with QPDFJob. To add a new
 command-line argument, add the option to the appropriate table in
 job.yml. This will automatically declare a method in the private
 ArgParser class in QPDFJob_argv.cc which you have to implement. The
-implementation should make calls to methods in QPDFJob. Then, add the
-same option to either the no-json section of job.yml if it is to be
-excluded from the job json structure, or add it under the json
-structure to the place where it should appear in the json structure.
+implementation should make calls to methods in QPDFJob via its Config
+classes. Then, add the same option to either the no-json section of
+job.yml if it is to be excluded from the job json structure, or add it
+under the json structure to the place where it should appear in the
+json structure.
+
+In most cases, adding a new option will automatically declare and call
+the appropriate Config method, which you then have to implement. If
+you need a manual handler, you have to declare the option as manual in
+job.yml and implement the handler yourself, though the automatically
+generated code will declare it for you.
 
 The build will fail until the new option is documented in
 manual/cli.rst. To do that, create documentation for the option by
@ -148,6 +166,10 @@ When done, the following should happen:
 * qpdf --help=topic should list --new-option for the correct topic
 * --new-option should appear in the manual
 * --new-option should be in the command-line option index in the manual
+* A Config method (in Config or one of the other Config classes in
+  QPDFJob) should exist that corresponds to the command-line flag
+* The job JSON file should have a new key in the schema corresponding
+  to the new option
 
 
 RELEASE PREPARATION
@ -100,6 +100,7 @@
|
||||
"encodable",
|
||||
"encp",
|
||||
"endianness",
|
||||
"endl",
|
||||
"endobj",
|
||||
"endstream",
|
||||
"enspliel",
|
||||
@ -128,6 +129,7 @@
|
||||
"fuzzer",
|
||||
"fuzzers",
|
||||
"fvisibility",
|
||||
"iostream",
|
||||
"gajic",
|
||||
"gajić",
|
||||
"gcurl",
|
||||
|
@ -8,13 +8,13 @@ BINS_examples = \
|
||||
pdf-filter-tokens \
|
||||
pdf-invert-images \
|
||||
pdf-mod-info \
|
||||
pdf-job \
|
||||
pdf-name-number-tree \
|
||||
pdf-npages \
|
||||
pdf-overlay-page \
|
||||
pdf-parse-content \
|
||||
pdf-set-form-values \
|
||||
pdf-split-pages
|
||||
pdf-split-pages \
|
||||
qpdf-job
|
||||
CBINS_examples = \
|
||||
pdf-c-objects \
|
||||
pdf-linearize
|
||||
|
@ -9,6 +9,121 @@ import json
|
||||
import filecmp
|
||||
from contextlib import contextmanager
|
||||
|
||||
# The purpose of this code is to automatically generate various parts
|
||||
# of the QPDFJob class. It is fairly complicated and extremely
|
||||
# bespoke, so understanding it is important if modifications are to be
|
||||
# made.
|
||||
|
||||
# Documentation of QPDFJob is divided among three places:
|
||||
#
|
||||
# * "HOW TO ADD A COMMAND-LINE ARGUMENT" in README-maintainer provides
|
||||
# a quick reminder for how to add a command-line argument
|
||||
#
|
||||
# * This file has a detailed explanation about how QPDFJob and
|
||||
# generate_auto_job work together
|
||||
#
|
||||
# * The manual ("QPDFJob Design" in qpdf-job.rst) discusses the design
|
||||
# approach, rationale, and evolution of QPDFJob.
|
||||
#
|
||||
# QPDFJob solved the problem of moving extensive functionality that
|
||||
# lived in qpdf.cc into the library. The QPDFJob class consists of
|
||||
# four major sections:
|
||||
#
|
||||
# * The run() method and its subsidiaries are responsible for
|
||||
# performing the actual operations on PDF files. This is implemented
|
||||
# in QPDFJob.cc
|
||||
#
|
||||
# * The nested Config class and the other classes it creates provide
|
||||
# an API for setting up a QPDFJob instance and correspond to the
|
||||
# command-line arguments of the qpdf executable. This is implemented
|
||||
# in QPDFJob_config.cc
|
||||
#
|
||||
# * The argument parsing code reads an argv array and calls
|
||||
# configuration methods. This is implemented in QPDFJob_argv.cc. The
|
||||
# argument parsing logic itself is implemented in the QPDFArgParser
|
||||
# class.
|
||||
#
|
||||
# * The job JSON handling code, which reads a QPDFJob JSON file and
|
||||
# calls configuration methods. This is implemented in
|
||||
# QPDFJob_json.cc. The JSON parsing code is in the JSON class. A
|
||||
# sax-like JSON handler class that calls callbacks in response to
|
||||
# items in the JSON is implemented in the JSONHandler class.
|
||||
#
|
||||
# This code has the job of ensuring that configuration, command-line
|
||||
# arguments, and JSON are all consistent and complete so that a
|
||||
# developer or user can freely move among those different ways of
|
||||
# interacting with QPDFJob in a predictable fashion. In addition, help
|
||||
# information for each option appears in manual/cli.rst, and that
|
||||
# information is used in creation of the job JSON schema and to supply
|
||||
# help text to QPDFArgParser. This code also ensures that there is an
|
||||
# exact match between options in job.yml and options in cli.rst.
|
||||
#
|
||||
# The job.yml file contains the data that drives this code. To
|
||||
# understand job.yml, here are some important concepts.
|
||||
#
|
||||
# QPDFArgParser option table. There is support for positional
|
||||
# arguments, options consisting of flags and optional parameters, and
|
||||
# subparsers that start with a regular parameterless flag, have their
|
||||
# own positional and option sections, and are terminated with -- by
|
||||
# itself. Examples of this include --encrypt and --pages. An "option
|
||||
# table" contains an optional positional argument handler and a list
|
||||
# of valid options with specifications about their parameters. There
|
||||
# are three kinds of option tables:
|
||||
#
|
||||
# * The built-in "help" option table contains help commands, like
|
||||
# --help and --version, that are only valid when they appear as the
|
||||
# single command-line argument.
|
||||
#
|
||||
# * The "main" option table contains the options that are valid
|
||||
# starting at the beginning of argument parsing.
|
||||
#
|
||||
# * A named option table can be started manually by the argument
|
||||
# parsing code to switch the argument parser's context. Switching
|
||||
# the parser to a new option table is manual (via a call to
|
||||
# selectOptionTable). Context reverts to the main option table
|
||||
# automatically when -- is encountered.
|
||||
#
|
||||
# In QPDFJob.hh, there is a Config class for each option table except
|
||||
# help.
|
||||
#
|
||||
# Option type: bare, required/optional parameter, required/optional
|
||||
# choices. A bare argument is just a flag, like --qdf. A parameter
|
||||
# option takes an arbitrary parameter, like --password. A choices
|
||||
# option takes one of a fixed list of choices, like --object-streams.
|
||||
# If a parameter or choices option's parameter is optional, the empty
|
||||
# string may be specified as an option, such as --collate (or
|
||||
# --collate=). For a bare option, --option= is always the same as just
|
||||
# --option. This makes it possible to switch an option from bare to
|
||||
# optional choice to optional parameter all without breaking
|
||||
# compatibility.
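The option types described in the comment above can be illustrated with a small sketch. This is not qpdf's parser (that logic lives in the C++ QPDFArgParser class). The function name `parse_flag` is hypothetical, the kind name `required_choices` is assumed by analogy with the kinds that appear in this diff, and accepting an empty value after `=` for a required parameter is a guess rather than documented behavior.

```python
def parse_flag(arg, kind, choices=()):
    """Split '--name[=value]' and apply the rules for the given kind.

    Returns (name, value). value is None for bare options and '' when an
    optional parameter is omitted.
    """
    if not arg.startswith('--'):
        raise ValueError(f'not an option: {arg}')
    name, sep, value = arg[2:].partition('=')
    if kind == 'bare':
        # For a bare option, --option= is the same as --option.
        if value:
            raise ValueError(f'--{name} does not take a parameter')
        return name, None
    if kind.startswith('optional_'):
        # --collate and --collate= both yield the empty string.
        if value == '':
            return name, ''
    elif not sep:
        # Assumption: a required parameter must be given as --name=value.
        raise ValueError(f'--{name} must be given as --{name}=parameter')
    if kind.endswith('_choices') and value not in choices:
        raise ValueError(f'--{name} must be one of {", ".join(choices)}')
    return name, value
```

Because `--option` and `--option=` are equivalent for every kind that allows an omitted value, a flag can move from bare to optional choices to optional parameter without invalidating existing command lines.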
|
||||
#
|
||||
# JSON "schema". This is a qpdf-specific "schema" for JSON. It is not
|
||||
# related to any kind of standard JSON schema. It is described in
|
||||
# JSON.hh and in the manual. QPDFJob uses the JSON "schema" in a mode
|
||||
# in which keys in the schema are all optional in the JSON object.
|
||||
#
|
||||
# Here is the mapping between configuration, argv, and JSON.
|
||||
#
|
||||
# The help options table is implemented solely for argv processing and
|
||||
# has no counterpart in configuration or JSON.
|
||||
#
|
||||
# The config() method returns a shared pointer to a Config object.
|
||||
# Every command-line option in the main option table has a
|
||||
# corresponding method in Config whose name is the option converted to
|
||||
# camel case. For bare options and options with optional parameters, a
|
||||
# version exists that takes no arguments. For others, a version exists
|
||||
# that takes a char const*. For example, the --qdf flag implies a
|
||||
# qdf() method in Config, and the --object-streams flag implies an
|
||||
# objectStreams(char const*) method in Config. For flags in option
|
||||
# tables, the method is declared inside a config class specific to the
|
||||
# option table. The mapping between option tables and config classes
|
||||
# is explicit in job.yml. Positional arguments are handled
|
||||
# individually and manually -- see QPDFJob.hh in the CONFIGURATION
|
||||
# section for details. See examples/qpdf-job.cc for an example.
|
||||
#
|
||||
# To understand the rest, start at main and follow comments in the
|
||||
# code.
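As a rough illustration of the naming rule described above (the option name converted to camel case), the following sketch shows the conversion for the two examples the comment gives. The helper name `option_to_method` is hypothetical; the script itself uses its own `to_identifier` method, whose exact behavior is not shown in this diff.

```python
def option_to_method(option):
    """Convert a flag such as '--object-streams' to 'objectStreams'."""
    words = option.lstrip('-').split('-')
    # The first word stays lowercase; later words are capitalized.
    return words[0] + ''.join(w.capitalize() for w in words[1:])
```

Under this rule, `--qdf` maps to a `qdf()` method and `--object-streams` maps to `objectStreams(char const*)`, matching the examples in the comment.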
|
||||
|
||||
whoami = os.path.basename(sys.argv[0])
|
||||
BANNER = f'''//
|
||||
// This file is automatically generated by {whoami}.
|
||||
@ -33,12 +148,18 @@ def write_file(filename):
|
||||
|
||||
|
||||
class Main:
|
||||
# SOURCES is a list of source files whose contents are used by
|
||||
# this program. If they change, we are out of date.
|
||||
SOURCES = [
|
||||
whoami,
|
||||
'manual/_ext/qpdf.py',
|
||||
'job.yml',
|
||||
'manual/cli.rst',
|
||||
]
|
||||
# DESTS is a map to the output files this code generates. These
|
||||
# generated files, as well as those added to DESTS later in the
|
||||
# code, are included in various places by QPDFJob.hh or any of the
|
||||
# implementing QPDFJob*.cc files.
|
||||
DESTS = {
|
||||
'decl': 'libqpdf/qpdf/auto_job_decl.hh',
|
||||
'init': 'libqpdf/qpdf/auto_job_init.hh',
|
||||
@ -48,6 +169,11 @@ class Main:
|
||||
'json_init': 'libqpdf/qpdf/auto_job_json_init.hh',
|
||||
# Others are added in top
|
||||
}
|
||||
# SUMS contains a checksum for each source and destination and is
|
||||
# used to detect whether we're up to date without having to force
|
||||
# recompilation all the time. This way the build can invoke this
|
||||
# script unconditionally without causing stuff to rebuild every
|
||||
# time.
|
||||
SUMS = 'job.sums'
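A minimal sketch of the checksum idea, assuming SHA-256 digests (consistent with the 64-hex-digit values in job.sums) and a "name digest" line format with `#` comment lines. The function names are hypothetical and this is not the script's actual implementation.

```python
import hashlib
import os
import tempfile


def file_checksum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()


def read_sums(sums_path):
    """Parse 'name digest' lines, skipping blanks and '#' comments."""
    sums = {}
    if not os.path.exists(sums_path):
        return sums
    with open(sums_path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            name, digest = line.rsplit(None, 1)
            sums[name] = digest
    return sums


def write_sums(paths, sums_path):
    """Record the current checksum of every path."""
    with open(sums_path, 'w') as f:
        f.write('# Generated by a sketch, not by generate_auto_job\n')
        for p in sorted(paths):
            f.write(f'{p} {file_checksum(p)}\n')


def is_up_to_date(paths, sums_path):
    """True only if every path exists and matches its recorded digest."""
    recorded = read_sums(sums_path)
    for p in paths:
        if not os.path.exists(p) or recorded.get(p) != file_checksum(p):
            return False
    return True
```

Comparing content digests rather than timestamps is what lets a build run the generator unconditionally: unchanged inputs and outputs produce the same sums, so nothing needs to be rewritten or rebuilt.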
|
||||
|
||||
def main(self, args=sys.argv[1:], prog=whoami):
|
||||
@ -71,8 +197,17 @@ class Main:
|
||||
def top(self, options):
|
||||
with open('job.yml', 'r') as f:
|
||||
data = yaml.safe_load(f.read())
|
||||
# config_decls maps a config key from an option in "options"
|
||||
# (from job.yml) to a list of declarations. A declaration is
|
||||
# generated for each config method for that option table.
|
||||
self.config_decls = {}
|
||||
# Keep track of which configs we've declared since we can have
|
||||
# option tables share a config class, as with the encryption
|
||||
# tables.
|
||||
self.declared_configs = set()
|
||||
|
||||
# Update DESTS -- see above. This ensures that each config
|
||||
# class's contents are included in job.sums.
|
||||
for o in data['options']:
|
||||
config = o.get('config', None)
|
||||
if config is not None:
|
||||
@ -257,12 +392,21 @@ class Main:
|
||||
def generate(self, data):
|
||||
warn(f'{whoami}: regenerating auto job files')
|
||||
self.validate(data)
|
||||
# Add the built-in help options to tables that we populate as
|
||||
# we read job.yml since we won't encounter these in job.yml
|
||||
|
||||
# Keep track of which options are help options since they are
|
||||
# handled specially. Add the built-in help options to tables
|
||||
# that we populate as we read job.yml since we won't encounter
|
||||
# these in job.yml
|
||||
self.help_options = set(
|
||||
['--completion-bash', '--completion-zsh', '--help']
|
||||
)
|
||||
# Keep track of which options we have encountered but haven't
|
||||
# seen help text for. This enables us to report if any option
|
||||
# is missing help.
|
||||
self.options_without_help = set(self.help_options)
|
||||
|
||||
# Compute the information needed for generated files and write
|
||||
# the files.
|
||||
self.prepare(data)
|
||||
with write_file(self.DESTS['decl']) as f:
|
||||
print(BANNER, file=f)
|
||||
@ -276,6 +420,11 @@ class Main:
|
||||
with open('manual/cli.rst', 'r') as df:
|
||||
print(BANNER, file=f)
|
||||
self.generate_doc(df, f)
|
||||
|
||||
# Compute the json files after the config and arg parsing
|
||||
# files. We need to have full information about all the
|
||||
# options before we can generate the schema. Generating the
|
||||
# schema also generates the json header files.
|
||||
self.generate_schema(data)
|
||||
with write_file(self.DESTS['schema']) as f:
|
||||
print('static constexpr char const* JOB_SCHEMA_DATA = R"(' +
|
||||
@ -301,6 +450,9 @@ class Main:
|
||||
# DON'T ADD CODE TO generate AFTER update_hashes
|
||||
|
||||
def handle_trivial(self, i, identifier, cfg, prefix, kind, v):
|
||||
# A "trivial" option is one whose handler does nothing other
|
||||
# than to call the config method with the same name (switched
|
||||
# to camelCase).
|
||||
decl_arg = 1
|
||||
decl_arg_optional = False
|
||||
if kind == 'bare':
|
||||
@ -341,11 +493,18 @@ class Main:
|
||||
# strategy enables us to change an option from bare to
|
||||
# optional_parameter or optional_choices without
|
||||
# breaking binary compatibility. The overloaded
|
||||
# methods both have to be implemented manually.
|
||||
# methods both have to be implemented manually. They
|
||||
# are not automatically called, so if you forget,
|
||||
# someone will get a link error if they try to call
|
||||
# one.
|
||||
self.config_decls[cfg].append(
|
||||
f'QPDF_DLL {config_prefix}* {identifier}();')
|
||||
|
||||
def handle_flag(self, i, identifier, kind, v):
|
||||
# For flags that require manual handlers, declare the handler
|
||||
# and register it. They have to be implemented manually in
|
||||
# QPDFJob_argv.cc. You get compiler/linker errors for any
|
||||
# missing methods.
|
||||
if kind == 'bare':
|
||||
self.decls.append(f'void {identifier}();')
|
||||
self.init.append(f'this->ap.addBare("{i}", '
|
||||
@ -371,14 +530,17 @@ class Main:
|
||||
f', false, {v}_choices);')
|
||||
|
||||
def prepare(self, data):
|
||||
self.decls = []
|
||||
self.init = []
|
||||
self.json_decls = []
|
||||
self.json_init = []
|
||||
self.jdata = {}
|
||||
self.by_table = {}
|
||||
self.decls = [] # argv handler declarations
|
||||
self.init = [] # initialize arg parsing code
|
||||
self.json_decls = [] # json handler declarations
|
||||
self.json_init = [] # initialize json handlers
|
||||
self.jdata = {} # running data used for json generate
|
||||
self.by_table = {} # table information by name for easy lookup
|
||||
|
||||
def add_jdata(flag, table, details):
|
||||
# Keep track of each flag and where it appears so we can
|
||||
# check consistency between the json information and the
|
||||
# options section.
|
||||
nonlocal self
|
||||
if table == 'help':
|
||||
self.help_options.add(f'--{flag}')
|
||||
@ -389,6 +551,7 @@ class Main:
|
||||
'tables': {table: details},
|
||||
}
|
||||
|
||||
# helper functions
|
||||
self.init.append('auto b = [this](void (ArgParser::*f)()) {')
|
||||
self.init.append(' return QPDFArgParser::bindBare(f, this);')
|
||||
self.init.append('};')
|
||||
@ -396,6 +559,8 @@ class Main:
|
||||
self.init.append(' return QPDFArgParser::bindParam(f, this);')
|
||||
self.init.append('};')
|
||||
self.init.append('')
|
||||
|
||||
# static variables for each set of choices for choices options
|
||||
for k, v in data['choices'].items():
|
||||
s = f'static char const* {k}_choices[] = {{'
|
||||
for i in v:
|
||||
@ -406,6 +571,8 @@ class Main:
|
||||
self.init.append('')
|
||||
self.json_init.append('')
|
||||
|
||||
# constants for the table names to reduce hard-coding strings
|
||||
# in the handlers
|
||||
for o in data['options']:
|
||||
table = o['table']
|
||||
if table in ('main', 'help'):
|
||||
@ -413,6 +580,20 @@ class Main:
|
||||
i = self.to_identifier(table, 'O', True)
|
||||
self.decls.append(f'static constexpr char const* {i} = "{table}";')
|
||||
self.decls.append('')
|
||||
|
||||
# Walk through all the options adding declarations for the
|
||||
# option handlers and initialization code to register the
|
||||
# handlers in QPDFArgParser. For "trivial" cases,
|
||||
# QPDFArgParser will call the corresponding config method
|
||||
# automatically. Otherwise, it will declare a handler that you
|
||||
# have to explicitly implement.
|
||||
|
||||
# If you add a new option table, you have to set config to the
|
||||
# name of a member variable that you declare in the ArgParser
|
||||
# class in QPDFJob_argv.cc. Then there should be an option in
|
||||
# the main table, also listed as manual in job.yml, that
|
||||
# switches to it. See implementations of any of the existing
|
||||
# options that do this for examples.
|
||||
for o in data['options']:
|
||||
table = o['table']
|
||||
config = o.get('config', None)
|
||||
@ -437,8 +618,8 @@ class Main:
|
||||
self.decls.append(f'void {arg_prefix}Positional(char*);')
|
||||
self.init.append('this->ap.addPositional('
|
||||
f'p(&ArgParser::{arg_prefix}Positional));')
|
||||
flags = {}
|
||||
|
||||
flags = {}
|
||||
for i in o.get('bare', []):
|
||||
flags[i] = ['bare', None]
|
||||
for i, v in o.get('required_parameter', {}).items():
|
||||
@ -462,6 +643,11 @@ class Main:
|
||||
self.handle_trivial(
|
||||
i, identifier, config, config_prefix, kind, v)
|
||||
|
||||
# Subsidiary options tables need end methods to do any
|
||||
# final checking within the option table. Final checking
|
||||
# for the main option table is handled by
|
||||
# checkConfiguration, which is called explicitly in the
|
||||
# QPDFJob code.
|
||||
if table not in ('main', 'help'):
|
||||
identifier = self.to_identifier(table, 'argEnd', False)
|
||||
self.decls.append(f'void {identifier}();')
|
||||
@ -510,6 +696,19 @@ class Main:
|
||||
return self.option_to_json_key(schema_key)
|
||||
|
||||
def build_schema(self, j, path, flag, expected, options_seen):
|
||||
# j: the part of data from "json" in job.yml as we traverse it
|
||||
# path: a string representation of the path in the json
|
||||
# flag: the command-line flag
|
||||
# expected: a map of command-line options we expect to eventually see
|
||||
# options_seen: which options we have seen so far
|
||||
|
||||
# As described in job.yml, the json can have keys that don't
|
||||
# map to options. This includes keys whose values are
|
||||
# dictionaries as well as keys that correspond to positional
|
||||
# arguments. These start with _ and get their help from
|
||||
# job.yml. Things that correspond to options get their help
|
||||
# from the help text we gathered from cli.rst.
|
||||
|
||||
if flag in expected:
|
||||
options_seen.add(flag)
|
||||
elif isinstance(j, str):
|
||||
@ -519,6 +718,19 @@ class Main:
|
||||
elif not (flag == '' or flag.startswith('_')):
|
||||
raise Exception(f'json: unknown key {flag}')
|
||||
|
||||
# The logic here is subtle and makes sense if you understand
|
||||
# how our JSON schemas work. They are described in JSON.hh,
|
||||
# but basically, if you see a dictionary, the schema should
|
||||
# have a dictionary with the same keys whose values are
|
||||
# descriptive. If you see an array, the array should have
|
||||
# a single member that describes each element of the array. See
|
||||
# JSON.hh for details.
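The shape rule in the comment above (a dictionary in the schema describes the keys an object may have, and a one-element array describes every element of an array) can be modeled in a few lines. This is a simplified sketch, not the checker in the C++ JSON class. The function name `schema_errors` is hypothetical, and it treats every key as optional, which is the mode the earlier comment says QPDFJob uses.

```python
def schema_errors(value, schema, path='json'):
    """Return a list of mismatches between value and a qpdf-style schema."""
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f'{path}: expected an object']
        errors = []
        for key, item in value.items():
            if key not in schema:
                errors.append(f'{path}: unexpected key "{key}"')
            else:
                errors.extend(
                    schema_errors(item, schema[key], f'{path}.{key}'))
        return errors
    if isinstance(schema, list):
        if not isinstance(value, list):
            return [f'{path}: expected an array']
        errors = []
        for i, item in enumerate(value):
            # The schema array's single member describes each element.
            errors.extend(schema_errors(item, schema[0], f'{path}[{i}]'))
        return errors
    # A string in the schema is descriptive text; any value is accepted.
    return []
```

For example, with an illustrative schema `{'pages': [{'file': 'source file'}]}`, the value `{'pages': [{'bogus': 1}]}` yields one error at `json.pages[0]`, while an empty object yields none because keys are optional.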
|
||||
|
||||
# See comments in QPDFJob_json.cc in the Handlers class
|
||||
# declaration to understand how and why the methods called
|
||||
# here work. The idea is that Handlers keeps a stack of
|
||||
# JSONHandler shared pointers so that we can register our
|
||||
# handlers in the right place as we go.
|
||||
if isinstance(j, dict):
|
||||
schema_value = {}
|
||||
if flag:
|
||||
@ -579,14 +791,20 @@ class Main:
|
||||
|
||||
def generate_schema(self, data):
|
||||
# Check to make sure that every command-line option is
|
||||
# represented in data['json'].
|
||||
|
||||
# Build a list of options that we expect. If an option appears
|
||||
# once, we just expect to see it once. If it appears in more
|
||||
# than one options table, we need to see a separate version of
|
||||
# it for each option table. It is represented in job.yml
|
||||
# prepended with the table prefix. The table prefix is removed
|
||||
# in the schema.
|
||||
# represented in data['json']. Build a list of options that we
|
||||
# expect. If an option appears once, we just expect to see it
|
||||
# once. If it appears in more than one options table, we need
|
||||
# to see a separate version of it for each option table. It is
|
||||
# represented in job.yml prepended with the table prefix. The
|
||||
# table prefix is removed in the schema. Example: "password"
|
||||
# appears multiple times, so the json section of job.yml has
|
||||
# main.password, uo.password, etc. But most options appear
|
||||
# only once, so we can just list them as they are. There is a
|
||||
# nearly exact match between option tables and dictionaries in
|
||||
# the job json schema, but it's not perfect because of how
|
||||
# positional arguments are handled, so we have to do this
|
||||
# extra work. Information about which tables a particular
|
||||
# option appeared in is gathered up in prepare().
|
||||
expected = {}
|
||||
for k, v in self.jdata.items():
|
||||
tables = v['tables']
|
||||
@ -600,7 +818,11 @@ class Main:
|
||||
# Walk through the json information building the schema as we
|
||||
# go. This verifies consistency between command-line options
|
||||
# and the json section of the data and builds up a schema by
|
||||
# populating with help information as available.
|
||||
# populating with help information as available. In addition
|
||||
# to generating the schema, we declare and register json
|
||||
# handlers that correspond with it. That way, we can first
|
||||
# check a job JSON file against the schema, and if it matches,
|
||||
# we have fewer error opportunities while calling handlers.
|
||||
self.schema = self.build_schema(
|
||||
data['json'], '', '', expected, options_seen)
|
||||
if options_seen != set(expected.keys()):
|
||||
|
@ -62,10 +62,10 @@ class QPDFJob
|
||||
// the regular API. This is exposed in the C API, which makes it
|
||||
// easier to get certain high-level qpdf functionality from other
|
||||
// languages. If there are any command-line errors, this method
|
||||
// will throw QPDFArgParser::Usage which is derived from
|
||||
// std::runtime_error. Other exceptions may be thrown in some
|
||||
// cases. Note that argc, and argv should be UTF-8 encoded. If you
|
||||
// are calling this from a Windows Unicode-aware main (wmain), see
|
||||
// will throw QPDFUsage which is derived from std::runtime_error.
|
||||
// Other exceptions may be thrown in some cases. Note that argc,
|
||||
// and argv should be UTF-8 encoded. If you are calling this from
|
||||
// a Windows Unicode-aware main (wmain), see
|
||||
// QUtil::call_main_from_wmain for information about converting
|
||||
// arguments to UTF-8. This method will mutate arguments that are
|
||||
// passed to it.
|
||||
@ -76,7 +76,7 @@ class QPDFJob
|
||||
// Initialize a QPDFJob from json. Passing partial = true prevents
|
||||
// this method from doing the final checks (calling
|
||||
// checkConfiguration) after processing the json file. This makes
|
||||
// it possible to initialze QPDFJob in stages using multiple json
|
||||
// it possible to initialize QPDFJob in stages using multiple json
|
||||
// files or to have a json file that can be processed from the CLI
|
||||
// with --job-json-file and be combined with other arguments. For
|
||||
// example, you might include only encryption parameters, leaving
|
||||
@ -84,7 +84,11 @@ class QPDFJob
|
||||
// input and output files. initializeFromJson is called with
|
||||
// partial = true when invoked from the command line. To make sure
|
||||
// that the json file is fully valid on its own, just don't
|
||||
// specify any other command-line flags.
|
||||
// specify any other command-line flags. If there are any
|
||||
// configuration errors, QPDFUsage is thrown. Some error messages
|
||||
// may be CLI-centric. If an exception tells you to use the
|
||||
// "--some-option" option, set the "someOption" key in the JSON
|
||||
// object instead.
|
||||
QPDF_DLL
|
||||
void initializeFromJson(std::string const& json, bool partial = false);
|
||||
|
||||
@ -160,7 +164,7 @@ class QPDFJob
|
||||
// object. The Config object contains methods that correspond with
|
||||
// qpdf command-line arguments. You can use a fluent interface to
|
||||
// configure a QPDFJob object that would do exactly the same thing
|
||||
// as a specific qpdf command. The example pdf-job.cc contains an
|
||||
// as a specific qpdf command. The example qpdf-job.cc contains an
|
||||
// example of this usage. You can also use initializeFromJson or
|
||||
// initializeFromArgv to initialize a QPDFJob object.
|
||||
|
||||
@ -180,6 +184,10 @@ class QPDFJob
|
||||
// with references. Returning pointers instead of references
|
||||
// makes for a more uniform interface.
|
||||
|
||||
// Maintainer documentation: see the section in README-maintainer
|
||||
// called "HOW TO ADD A COMMAND-LINE ARGUMENT", which contains
|
||||
// references to additional places in the documentation.
|
||||
|
||||
class Config;
|
||||
|
||||
class AttConfig
|
||||
@ -330,7 +338,10 @@ class QPDFJob
|
||||
// Return a top-level configuration item. See CONFIGURATION above
|
||||
// for details. If an invalid configuration is created (such as
|
||||
// supplying contradictory options, omitting an input file, etc.),
|
||||
// QPDFUsage is thrown.
|
||||
// QPDFUsage is thrown. Note that error messages are CLI-centric,
|
||||
// but you can map them into config calls. For example, if an
|
||||
// exception tells you to use the --some-option flag, you should
|
||||
// call config()->someOption() instead.
|
||||
QPDF_DLL
|
||||
std::shared_ptr<Config> config();
|
||||
|
||||
|
8
job.sums
@ -1,17 +1,17 @@
|
||||
# Generated by generate_auto_job
|
||||
generate_auto_job 1fdb113412a444aad67b0232f3f6c4f50d9e2a5701691e5146fd1b559039ef2e
|
||||
generate_auto_job 5d6ec1e4f0b94d8f73df665061d8a2188cbbe8f25ea42be78ec576547261d5ac
|
||||
include/qpdf/auto_job_c_att.hh 7ad43bb374c1370ef32ebdcdcb7b73a61d281f7f4e3f12755585872ab30fb60e
|
||||
include/qpdf/auto_job_c_copy_att.hh 32275d03cdc69b703dd7e02ba0bbe15756e714e9ad185484773a6178dc09e1ee
|
||||
include/qpdf/auto_job_c_enc.hh 72e138c7b96ed5aacdce78c1dec04b1c20d361faec4f8faf52f64c1d6be99265
|
||||
include/qpdf/auto_job_c_main.hh 69d5ea26098bcb6ec5b5e37ba0bca9e7d16a784d2618e0c05d635046848d5123
|
||||
include/qpdf/auto_job_c_pages.hh 931840b329a36ca0e41401190e04537b47f2867671a6643bfd8da74014202671
|
||||
include/qpdf/auto_job_c_uo.hh 0585b7de459fa479d9e51a45fa92de0ff6dee748efc9ec1cedd0dde6cee1ad50
|
||||
job.yml effc93a805fb74503be2213ad885238db21991ba3d084fbfeff01183c66cb002
|
||||
job.yml 9544c6e046b25d3274731fbcd07ba25b300fd67055021ac4364ad8a91f77c6b6
|
||||
libqpdf/qpdf/auto_job_decl.hh 9f79396ec459f191be4c5fe34cf88c265cf47355a1a945fa39169d1c94cf04f6
|
||||
libqpdf/qpdf/auto_job_help.hh 6002f503368f319a3d717484ac39d1558f34e67989d442f394791f6f6f5f0500
|
||||
libqpdf/qpdf/auto_job_help.hh 43184f01816b5210bbc981de8de48446546fb94f4fd6e63cfc7f2fbac3578e6b
|
||||
libqpdf/qpdf/auto_job_init.hh fd13b9f730e6275a39a15d193bd9af19cf37f4495699ec1886c2b208d7811ab1
|
||||
libqpdf/qpdf/auto_job_json_decl.hh c5e3fd38a3b0c569eb0c6b4c60953a09cd6bc7d3361a357a81f64fe36af2b0cf
|
||||
libqpdf/qpdf/auto_job_json_init.hh 3f86ce40931ca8f417d050fcd49104d73c1fa4e977ad19d54b372831a8ea17ed
|
||||
libqpdf/qpdf/auto_job_schema.hh 18a3780671d95224cb9a27dcac627c421cae509d59f33a63e6bda0ab53cce923
|
||||
manual/_ext/qpdf.py e9ac9d6c70642a3d29281ee5ad92ae2422dee8be9306fb8a0bc9dba0ed5e28f3
|
||||
manual/cli.rst 35289dbf593085016a62249f760cdcad50d5cce76d799ea4acf5dff58b78679a
|
||||
manual/cli.rst 3746df6c4f115387cca0d921f25619a6b8407fc10b0e4c9dcf40b0b1656c6f8a
|
||||
|
7
job.yml
@ -1,4 +1,11 @@
|
||||
# See "HOW TO ADD A COMMAND-LINE ARGUMENT" in README-maintainer.
|
||||
|
||||
# REMEMBER: if you add an optional_choices or optional_parameter, you
|
||||
# have to explicitly remember to implement the overloaded config
|
||||
# method that takes no arguments. Since no generated code will call it
|
||||
# automatically, there is no automated reminder to do this. If you
|
||||
# forget, it will be a link error if someone tries to call it.
|
||||
|
||||
choices:
|
||||
yn:
|
||||
- "y"
|
||||
|
@ -646,7 +646,6 @@ QPDFJob::createsOutput() const
|
||||
void
|
||||
QPDFJob::checkConfiguration()
|
||||
{
|
||||
// QXXXQ messages are CLI-centric
|
||||
if (m->replace_input)
|
||||
{
|
||||
if (m->outfilename)
|
||||
@ -722,7 +721,8 @@ QPDFJob::checkConfiguration()
|
||||
{
|
||||
QTC::TC("qpdf", "qpdf same file error");
|
||||
usage("input file and output file are the same;"
|
||||
" use --replace-input to intentionally overwrite the input file");
|
||||
" use --replace-input to intentionally"
|
||||
" overwrite the input file");
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -28,7 +28,6 @@ QPDFJob::Config::emptyInput()
|
||||
{
|
||||
if (o.m->infilename == 0)
|
||||
{
|
||||
// QXXXQ decide whether to fix this or just leave the comment:
|
||||
// Various places in QPDFJob.cc know that the empty string for
|
||||
// infile means empty. This means that passing "" as the
|
||||
// argument to inputFile, or equivalently using "" as a
|
||||
|
@ -29,6 +29,28 @@ namespace
         typedef std::function<void(char const*)> param_handler_t;
         typedef std::function<void(JSON)> json_handler_t;
 
+        // The code that calls these methods is automatically
+        // generated by generate_auto_job. This describes how we
+        // implement what it does. We keep a stack of handlers in
+        // json_handlers. The top of the stack is the "current" json
+        // handler, initially for the top-level object. Whenever we
+        // encounter a scalar, we add a handler using addBare,
+        // addParameter, or addChoices. Whenever we encounter a
+        // dictionary, we first add the dictionary handlers. Then we
+        // walk into the dictionary and, for each key, we register a
+        // dict key handler and push it to the stack, then do the same
+        // process for the key's value. Then we pop the key handler
+        // off the stack. When we encounter an array, we add the array
+        // handlers, push an item handler to the stack, call
+        // recursively for the array's single item (as this is what is
+        // expected in a schema), and pop the item handler. Note that
+        // we don't pop dictionary start/end handlers. The dictionary
+        // handlers and the key handlers are at the same level in
+        // JSONHandler. This logic is subtle and took several tries to
+        // get right. It's best understood by carefully understanding
+        // the behavior of JSONHandler, the JSON schema, and the code
+        // in generate_auto_job.
+
         void addBare(bare_handler_t);
         void addParameter(param_handler_t);
         void addChoices(char const** choices, bool required, param_handler_t);
@ -812,7 +812,8 @@ This option is repeatable. If given, only specified objects will
|
||||
be shown in the "objects" key of the JSON output. Otherwise, all
|
||||
objects will be shown.
|
||||
)");
|
||||
ap.addOptionHelp("--job-json-help", "json", "show format of job JSON", R"(Describe the format of the QPDFJob JSON input.
|
||||
ap.addOptionHelp("--job-json-help", "json", "show format of job JSON", R"(Describe the format of the QPDFJob JSON input used by
|
||||
--job-json-file.
|
||||
)");
|
||||
ap.addHelpTopic("testing", "options for testing or debugging", R"(The options below are useful when writing automated test code that
|
||||
includes files created by qpdf or when testing qpdf itself.
|
||||
|
@ -167,9 +167,11 @@ Related Options
|
||||
description of the JSON input file format.
|
||||
|
||||
Specify the name of a file whose contents are expected to contain a
|
||||
QPDFJob JSON file. QXXXQ ref. This file is read and treated as if
|
||||
the equivalent command-line arguments were supplied. It can be
|
||||
mixed freely with other options.
|
||||
QPDFJob JSON file. This file is read and treated as if the
|
||||
equivalent command-line arguments were supplied. It can be repeated
|
||||
and mixed freely with other options. Run ``qpdf`` with
|
||||
:qpdf:ref:`--job-json-help` for a description of the job JSON input
|
||||
file format. For more information, see :ref:`qpdf-job`.
|
||||
|
||||
.. _exit-status:
|
||||
|
||||
@ -3200,9 +3202,12 @@ Related Options
|
||||
|
||||
.. help: show format of job JSON
|
||||
|
||||
Describe the format of the QPDFJob JSON input.
|
||||
Describe the format of the QPDFJob JSON input used by
|
||||
--job-json-file.
|
||||
|
||||
Describe the format of the QPDFJob JSON input. QXXXQ doc ref.
|
||||
Describe the format of the QPDFJob JSON input used by
|
||||
:qpdf:ref:`--job-json-file`. For more information about QPDFJob,
|
||||
see :ref:`qpdf-job`.
|
||||
|
||||
.. _test-options:
|
||||
|
||||
|
@ -28,6 +28,7 @@ documentation, please visit `https://qpdf.readthedocs.io
|
||||
weak-crypto
|
||||
json
|
||||
design
|
||||
qpdf-job
|
||||
linearization
|
||||
object-streams
|
||||
encryption
|
||||
|
248
manual/qpdf-job.rst
Normal file
@ -0,0 +1,248 @@
|
||||
|
||||
.. _qpdf-job:
|
||||
|
||||
QPDFJob: a Job-Based Interface
|
||||
==============================
|
||||
|
||||
All of the functionality from the :command:`qpdf` command-line
|
||||
executable is available from inside the C++ library using the
|
||||
``QPDFJob`` class. There are several ways to access this functionality:
|
||||
|
||||
- Command-line options
|
||||
|
||||
- Run the :command:`qpdf` command line
|
||||
|
||||
- Use from the C++ API with ``QPDFJob::initializeFromArgv``
|
||||
|
||||
- Use from the C API with QXXXQ
|
||||
|
||||
- The job JSON file format
|
||||
|
||||
- Use from the CLI with the :qpdf:ref:`--job-json-file` parameter
|
||||
|
||||
- Use from the C++ API with ``QPDFJob::initializeFromJson``
|
||||
|
||||
- Use from the C API with QXXXQ
|
||||
|
||||
- The ``QPDFJob`` C++ API
|
||||
|
||||
If you can understand how to use the :command:`qpdf` CLI, you can
|
||||
understand the ``QPDFJob`` class and the json file. qpdf guarantees
|
||||
that all of the above methods are in sync. Here's how it works:
|
||||
|
||||
.. list-table:: QPDFJob Interfaces
|
||||
:widths: 30 30 30
|
||||
:header-rows: 1
|
||||
|
||||
- - CLI
|
||||
- JSON
|
||||
- C++
|
||||
|
||||
- - ``--some-option``
|
||||
- ``"someOption": ""``
|
||||
- ``config()->someOption()``
|
||||
|
||||
- - ``--some-option=value``
|
||||
- ``"someOption": "value"``
|
||||
- ``config()->someOption("value")``
|
||||
|
||||
- - positional argument
|
||||
- ``"otherOption": "value"``
|
||||
- ``config()->otherOption("value")``
|
||||
|
||||
In the JSON file, the JSON structure is an object (dictionary) whose
|
||||
keys are command-line flags converted to camelCase. Positional
|
||||
arguments have some corresponding key, which you can find by running
|
||||
``qpdf`` with the :qpdf:ref:`--job-json-help` flag. For example, input
|
||||
and output files are named by positional arguments on the CLI. In the
|
||||
JSON, they are ``"inputFile"`` and ``"outputFile"``. The following are
|
||||
equivalent:
|
||||
|
||||
.. It would be nice to have an automated test that these are all the
|
||||
same, but we have so few live examples that it's not worth it for
|
||||
now.
|
||||
|
||||
CLI:
|
||||
::
|
||||
|
||||
qpdf infile.pdf outfile.pdf \
|
||||
--pages . other.pdf --password=x 1-5 -- \
|
||||
--encrypt user owner 256 --print=low -- \
|
||||
--object-streams=generate
|
||||
|
||||
Job JSON:
|
||||
.. code-block:: json
|
||||
|
||||
{
|
||||
"inputFile": "infile.pdf",
|
||||
"outputFile": "outfile.pdf",
|
||||
"pages": [
|
||||
{
|
||||
"file": "."
|
||||
},
|
||||
{
|
||||
"file": "other.pdf",
|
||||
"password": "x",
|
||||
"range": "1-5"
|
||||
}
|
||||
],
|
||||
"encrypt": {
|
||||
"userPassword": "user",
|
||||
"ownerPassword": "owner",
|
||||
"256bit": {
|
||||
"print": "low"
|
||||
}
|
||||
},
|
||||
"objectStreams": "generate"
|
||||
}
|
||||
|
||||
C++ code:
|
||||
.. code-block:: c++
|
||||
|
||||
#include <qpdf/QPDFJob.hh>
|
||||
#include <qpdf/QPDFUsage.hh>
|
||||
#include <iostream>
|
||||
|
||||
int main(int argc, char* argv[])
|
||||
{
|
||||
try
|
||||
{
|
||||
QPDFJob j;
|
||||
j.config()
|
||||
->inputFile("infile.pdf")
|
||||
->outputFile("outfile.pdf")
|
||||
->pages()
|
||||
->pageSpec(".", "1-z")
|
||||
->pageSpec("other.pdf", "1-5", "x")
|
||||
->endPages()
|
||||
->encrypt(256, "user", "owner")
|
||||
->print("low")
|
||||
->endEncrypt()
|
||||
->objectStreams("generate")
|
||||
->checkConfiguration();
|
||||
j.run();
|
||||
}
|
||||
catch (QPDFUsage& e)
|
||||
{
|
||||
std::cerr << "configuration error: " << e.what() << std::endl;
|
||||
return 2;
|
||||
}
|
||||
catch (std::exception& e)
|
||||
{
|
||||
std::cerr << "other error: " << e.what() << std::endl;
|
||||
return 2;
|
||||
}
|
||||
return 0;
|
||||
}
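The naming rule used in the three equivalent forms above (command-line flags converted to camelCase for JSON keys and Config methods) can be sketched as follows. This is an illustration only: ``flag_to_camel_case`` is a hypothetical helper, not part of qpdf, and it covers only the simple flag-to-key case. Positional arguments and nested keys such as ``"256bit"`` do not follow this rule; the authoritative key names come from :file:`job.yml` and the output of ``--job-json-help``.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of qpdf): convert a long command-line
// flag such as "--object-streams" to the camelCase spelling used for
// the matching job JSON key and Config method ("objectStreams").
std::string flag_to_camel_case(std::string const& flag)
{
    std::string result;
    std::string::size_type i = 0;
    // Skip the leading dashes of the flag.
    while (i < flag.size() && flag[i] == '-')
    {
        ++i;
    }
    bool upper_next = false;
    for (; i < flag.size(); ++i)
    {
        char c = flag[i];
        if (c == '-')
        {
            // A dash is dropped and capitalizes the next character.
            upper_next = true;
        }
        else if (upper_next)
        {
            result += static_cast<char>(
                std::toupper(static_cast<unsigned char>(c)));
            upper_next = false;
        }
        else
        {
            result += c;
        }
    }
    return result;
}
```

Under these assumptions, ``flag_to_camel_case("--object-streams")`` yields the ``objectStreams`` spelling that appears in both the JSON and the C++ examples.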
|
||||
|
||||
It is also possible to mix and match command-line options and json
|
||||
from the CLI. For example, you could create a file called
|
||||
:file:`my-options.json` containing the following:
|
||||
|
||||
.. code-block:: json
|
||||
|
||||
{
|
||||
"encrypt": {
|
||||
"userPassword": "",
|
||||
"ownerPassword": "owner",
|
||||
"256bit": {
|
||||
}
|
||||
},
|
||||
"objectStreams": "generate"
|
||||
}
|
||||
|
||||
and use it with other options to create 256-bit encrypted (but
|
||||
unrestricted) files with object streams while specifying other
|
||||
parameters on the command line, such as
|
||||
|
||||
::
|
||||
|
||||
qpdf infile.pdf outfile.pdf --job-json-file=my-options.json
|
||||
|
||||
.. _qpdfjob-design:
|
||||
|
||||
See also :file:`examples/qpdf-job.cc` in the source distribution as
|
||||
well as comments in ``QPDFJob.hh``.
|
||||
|
||||
|
||||
QPDFJob Design
|
||||
--------------
|
||||
|
||||
This section describes some of the design rationale and history behind
|
||||
``QPDFJob``.
|
||||
|
||||
Documentation of ``QPDFJob`` is divided among three places:
|
||||
|
||||
- "HOW TO ADD A COMMAND-LINE ARGUMENT" in :file:`README-maintainer`
|
||||
provides a quick reminder for how to add a command-line argument
|
||||
|
||||
- The source file :file:`generate_auto_job` has a detailed explanation
|
||||
about how ``QPDFJob`` and ``generate_auto_job`` work together
|
||||
|
||||
- This chapter of the manual has other details.
|
||||
|
||||
Prior to qpdf version 10.6.0, the qpdf CLI executable had a lot of
|
||||
functionality built into the executable that was not callable from the
|
||||
library as such. This created a number of problems:
|
||||
|
||||
- Some of the logic in :file:`qpdf.cc` was pretty complex, such as
|
||||
image optimization, generating json output, and many of the page
|
||||
manipulations. While those things could all be coded using the C++
|
||||
API, there would be a lot of duplicated code.
|
||||
|
||||
- Page splitting and merging will get more complicated over time as
|
||||
qpdf supports a wider range of document-level options. It would be
|
||||
nice to be able to expose this to library users instead of baking it
|
||||
all into the CLI.
|
||||
|
||||
- Users of other languages who just wanted an interface to do things
|
||||
that the CLI could do didn't have a good way to do it, such as just
|
||||
handing a library call a set of command-line options or an
|
||||
equivalent JSON object that could be passed in as a string.
|
||||
|
||||
- The qpdf CLI itself was almost 8,000 lines of code. It needed to be
|
||||
refactored, cleaned up, and split.
|
||||
|
||||
- Exposing a new feature via the command-line required making lots of
|
||||
small edits to lots of small bits of code, and it was easy to forget
|
||||
something. Adding a code generator, while complex in some ways,
|
||||
greatly reduces the chances of error when extending qpdf.
|
||||
|
||||
Here are a few notes on some design decisions about QPDFJob and its
|
||||
various interfaces.
|
||||
|
||||
- Bare command-line options (flags with no parameter) map to config
|
||||
functions that take no options and to json keys whose values are
|
||||
required to be the empty string. The rationale is that we can later
|
||||
change these bare options to options that take an optional parameter
|
||||
without breaking backward compatibility in the CLI or the JSON.
|
||||
Options that take optional parameters generate two config functions:
|
||||
one that has no arguments, and one that has a ``char const*`` argument.
|
||||
This means that adding an optional parameter to a previously bare
|
||||
option also doesn't break binary compatibility.
|
||||
|
||||
- Adding a new argument to :file:`job.yml` automatically triggers
|
||||
almost everything by declaring and referencing things that you have
|
||||
to implement. This way, once you get the code to compile and link,
|
||||
you know you haven't forgotten anything. There are two tricky cases:
|
||||
|
||||
- If an argument handler has to do something special, like call a
|
||||
nested config method or select an option table, you have to
|
||||
implement it manually. This is discussed in
|
||||
:file:`generate_auto_job`.
|
||||
|
||||
- When you add an option that has optional parameters or choices,
|
||||
both of the handlers described above are declared, but only the
|
||||
one that takes an argument is referenced. You have to remember to
|
||||
implement the one that doesn't take an argument or else people
|
||||
will get a linker error if they try to call it. The assumption is
|
||||
that things with optional parameters started out as bare, so the
|
||||
argument-less version is already there.
|
||||
|
||||
- If you have to add a new option that requires its own option table,
|
||||
you will have to do some extra work including adding a new nested
|
||||
Config class, adding a config member variable to ``ArgParser`` in
|
||||
:file:`QPDFJob_argv.cc` and ``Handlers`` in :file:`QPDFJob_json.cc`,
|
||||
and making sure that manually implemented handlers are consistent with
|
||||
each other. It is best in these cases to add explicit test cases for
|
||||
all the various ways to get to the option.
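The overload convention for bare options and options with optional parameters, described in the first design note above, can be sketched as follows. This is a minimal illustration under stated assumptions: ``ExampleConfig`` and ``someOption`` are hypothetical names, not qpdf classes or methods, and the real Config methods are declared by the generated code and implemented by hand.

```cpp
#include <string>

// Hypothetical stand-in for a QPDFJob Config class. The names
// ExampleConfig and someOption are illustrative, not qpdf API.
class ExampleConfig
{
  public:
    // Bare form: corresponds to --some-option on the CLI and to
    // "someOption": "" in the job JSON.
    ExampleConfig* someOption()
    {
        // Treat the bare form as the parameterized form with an
        // empty value.
        return someOption("");
    }

    // Parameterized form: corresponds to --some-option=value. Adding
    // this overload later leaves the bare overload's signature
    // untouched, so existing callers are unaffected.
    ExampleConfig* someOption(char const* parameter)
    {
        value = parameter;
        seen = true;
        return this;
    }

    std::string value;
    bool seen = false;
};
```

Because each method returns a pointer to the config object, calls can be chained in the same fluent style as the ``j.config()->...`` example earlier in this chapter.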
|
@ -2303,9 +2303,9 @@ For a detailed list of changes, please see the file
|
||||
been added to the :command:`qpdf` command-line
|
||||
tool. See :ref:`page-selection`.
|
||||
|
||||
- Options have been added to the :command:`qpdf`
|
||||
command-line tool for copying encryption parameters from another
|
||||
file. (QXXXQ Link)
|
||||
- The :qpdf:ref:`--copy-encryption` option has been added to the
|
||||
:command:`qpdf` command-line tool for copying encryption
|
||||
parameters from another file.
|
||||
|
||||
- New methods have been added to the ``QPDF`` object for adding and
|
||||
removing pages. See :ref:`adding-and-remove-pages`.
|
||||
|