The PIRL Conductor package provides an application to manage procedure pipelines.

A procedure pipeline is defined by a Database table of Procedures definition records paired with a table of file Sources records. Each file specified in a Sources record is processed in sequence by each procedure defined in a Procedures record. A procedure definition includes the relative order of the procedure in the pipeline, the command line to be executed, the exit status or message that indicates success, the maximum amount of time the procedure will be allowed to run, and a command line to be executed if the procedure does not complete successfully. In addition to the pathname of the file to be processed, each Sources record includes a record of the completion status of each pipeline procedure applied to it.
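The relationship between the two tables can be sketched in memory as follows. This is only an illustration under stated assumptions: the class and field names are hypothetical and do not reflect the actual table schema, success is modeled as an exit status only (not a message), the time limit is stored but not enforced, and stopping the pipeline after a failed procedure is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical stand-in for one Procedures record. */
final class ProcedureDefinition {
    final int sequence;            // relative order of the procedure in the pipeline
    final String commandLine;      // command line to be executed
    final int successStatus;       // exit status that indicates success
    final long timeLimitSeconds;   // maximum run time (not enforced in this sketch)
    final String onFailureCommand; // run if the procedure fails; may be null

    ProcedureDefinition(int sequence, String commandLine, int successStatus,
                        long timeLimitSeconds, String onFailureCommand) {
        this.sequence = sequence;
        this.commandLine = commandLine;
        this.successStatus = successStatus;
        this.timeLimitSeconds = timeLimitSeconds;
        this.onFailureCommand = onFailureCommand;
    }
}

/** Hypothetical stand-in for one Sources record. */
final class SourceRecord {
    final String pathname;                                    // file to be processed
    final List<Integer> completionStatus = new ArrayList<>(); // one entry per procedure run

    SourceRecord(String pathname) {
        this.pathname = pathname;
    }
}

final class PipelineSketch {
    /**
     * Applies each procedure to the source in sequence order.
     * The executor is a placeholder that runs a command line and returns its exit status.
     */
    static void process(SourceRecord source, List<ProcedureDefinition> procedures,
                        ToIntFunction<String> executor) {
        List<ProcedureDefinition> ordered = new ArrayList<>(procedures);
        ordered.sort(Comparator.comparingInt((ProcedureDefinition p) -> p.sequence));
        for (ProcedureDefinition p : ordered) {
            int status = executor.applyAsInt(p.commandLine);
            source.completionStatus.add(status);
            if (status != p.successStatus) {
                if (p.onFailureCommand != null) {
                    executor.applyAsInt(p.onFailureCommand);
                }
                return; // assumption: a failed procedure ends processing of this source
            }
        }
    }
}
```

Passing the executor as a function keeps the sketch independent of how commands are actually launched; a real runner would also have to apply the time limit.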

Each source file processed is provided with a log file that is tied by a unique name to its Sources record. The log includes a detailed description of all procedures executed, their stdout and stderr output, and a record of how each procedure completed.

All procedure definitions may include embedded references to Database fields or Conductor configuration parameters. These are effectively variable names that are replaced with the values to which they resolve. A reference to a Database field resolves to the value of that field, in the catalog and table named in the reference, for the record selected by a conditional expression. A reference to a configuration parameter resolves to the value of a user-specified parameter in the current configuration file, or of one of the implicit parameters that describe the current state of source processing. References may contain nested references and may resolve to values that are themselves references. This allows procedure definitions to be described in terms of variables that are dynamically resolved to specific values at the time of procedure execution.
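The way nested references, and values that are themselves references, can be resolved is sketched below for configuration parameters only. The `${name}` syntax, the class name, and the pass limit are assumptions made for this illustration; they are not the reference syntax that Conductor itself defines, and Database field references are left out entirely.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReferenceResolverSketch {
    // Matches an innermost reference: "${name}" whose name contains no further reference.
    private static final Pattern INNERMOST =
        Pattern.compile("\\$\\{([^\\$\\{\\}]+)\\}");

    // Guard against circular definitions such as a = "${a}".
    private static final int MAX_PASSES = 100;

    /**
     * Replaces one innermost reference per pass until none remain.
     * Resolving innermost references first handles nesting; re-scanning the
     * result handles values that are themselves references.
     */
    static String resolve(String text, Map<String, String> parameters) {
        String current = text;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher m = INNERMOST.matcher(current);
            if (!m.find()) {
                return current; // fully resolved
            }
            String value = parameters.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unresolved reference: " + m.group(0));
            }
            current = current.substring(0, m.start()) + value + current.substring(m.end());
        }
        throw new IllegalStateException("Too many substitutions; possible circular reference");
    }
}
```

With hypothetical parameters `mode=test`, `dir_test=${base}/out`, `base=/data` and `source=/in/a.img`, the text `cp ${source} ${dir_${mode}}` is rewritten step by step: the nested `${mode}` yields `${dir_test}`, which yields `${base}/out`, which finally yields `/data/out`.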

Conductor manages the acquisition of Sources records for processing to ensure that only one Conductor will process each record, and that the source will be processed only once unless it is specifically reset for reprocessing by some other agent. It is possible to set Sources to be (re)processed starting in the middle of the Procedures sequence. The exclusive control of a Sources record by one Conductor does not preclude multiple entries of the same file in a Sources table, if appropriate. It does enable more than one Conductor, on the same or separate host systems, to process the same pipeline at the same time; for example, to distribute and parallelize processing safely.
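The exclusive-acquisition guarantee amounts to an atomic "claim if unclaimed" operation. The sketch below imitates it with an in-memory map; it is not how Conductor is implemented, and the class, method, and identifier names are hypothetical. Against a real database the same effect is typically obtained with a conditional update whose affected-row count tells each caller whether it won the record.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SourceClaimsSketch {
    // Maps a Sources record number to the identifier of the Conductor that owns it.
    private final ConcurrentMap<Integer, String> owners = new ConcurrentHashMap<>();

    /**
     * Atomically claims the record for the given Conductor.
     * Returns true only for the first caller; later callers see the existing owner.
     */
    boolean tryAcquire(int sourceNumber, String conductorId) {
        return owners.putIfAbsent(sourceNumber, conductorId) == null;
    }

    /** Models an outside agent resetting the record so that it can be processed again. */
    void resetForReprocessing(int sourceNumber) {
        owners.remove(sourceNumber);
    }
}
```

Because the check and the claim happen in one atomic step, two Conductors polling the same table cannot both decide that a record is free.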

Conductor offers a monitor mode that includes user-controlled starting and stopping of processing and a terminal-like scrolling text display of all log output. Conductor can also be run silently for batch processing by manual initiation, as a cron job, or as a daemon that will constantly poll its pipeline for new Sources entries to be processed.

@see PIRL.Database