Postgres Pro Enterprise : Documentation: 16: F.53. pgpro_scheduler — scheduling, monitoring and managing job execution : Postgres Professional

F.53. pgpro_scheduler — scheduling, monitoring and managing job execution
Prev	Up	Appendix F. Additional Supplied Modules and Extensions Shipped in `postgrespro-ent-16-contrib`	Home	Next

F.53. pgpro_scheduler — scheduling, monitoring and managing job execution #

F.53.1. Installation and Setup
F.53.2. Configuration
F.53.3. Calculating Required Background Workers
F.53.4. Usage
F.53.5. Reference
F.53.6. Authors

pgpro_scheduler is a built-in Postgres Pro Enterprise extension for scheduling, monitoring, and managing job execution within the Postgres Pro Enterprise database. With pgpro_scheduler, you can:

Set advanced schedules using jsonb objects or crontab strings.
Dynamically calculate the next execution time for repeated jobs.
Execute SQL commands of the job in a single transaction or in sequential transactions, if required.
Submit jobs for immediate or delayed one-time execution in parallel with the scheduled jobs.

Unlike external scheduling daemons, pgpro_scheduler offers the following benefits:

Any user can schedule jobs independently.
Job scheduling can be managed on the fly without restarting the database.
Scheduling is very lightweight since pgpro_scheduler uses background workers to schedule, monitor, and manage job execution. At the same time, pgpro_scheduler does not require any client connections for scheduling.
For enhanced stability, each database has its own supervisor scheduler, with each scheduled job executed by a separate background worker.

Note

pgpro_scheduler waits in the suspended state on a standby server to be started when the standby is promoted to a primary server.

Note

Note that for all the executed jobs, the pg_stat_activity view will still show the name of the database superuser, which is used by the background worker.

F.53.1. Installation and Setup #

The pgpro_scheduler extension is included into Postgres Pro Enterprise. Once you have Postgres Pro Enterprise installed, complete the following steps to enable pgpro_scheduler:

Add pgpro_scheduler to the shared_preload_libraries parameter in the postgresql.conf file:
```
shared_preload_libraries = 'pgpro_scheduler'
```
Create the pgpro_scheduler extension using the following query:
```
CREATE EXTENSION pgpro_scheduler;
```
Make sure to create the pgpro_scheduler extension for each database you are planning to use.

Once you complete the installation and setup, configure pgpro_scheduler for your database.

F.53.2. Configuration #

You must have superuser rights to configure pgpro_scheduler.

To configure pgpro_scheduler, modify the following settings in the postgresql.conf file:

Specify the names of the databases for which you need to schedule jobs, in the comma-separated format:
```
schedule.database = 'database1,database2'
```
To control the workload in your system, set the maximum number of background workers that can run simultaneously on each database:
```
schedule.max_workers = 5
```
Optionally, set the number of background workers available for one-time job execution:
```
schedule.max_parallel_workers = 3
```
By default, two background workers for one-time jobs are available. These workers are not included into the schedule.max_workers number. Thus, one-time jobs can run in parallel with the scheduled jobs even if all the schedule.max_workers workers are busy.
Run pg_reload_conf() for the changes to take effect:
```
SELECT pg_reload_conf();
```

Important

When setting the schedule.max_workers, schedule.max_parallel_workers, and schedule.database variables, make sure that enough background workers remain available from the total pool of workers established by max_worker_processes. Other Postgres Pro subsystems may also use background workers.

For detailed instructions on calculating the required number of background workers and for configuration examples, see Section F.53.3.

You can also dynamically configure pgpro_scheduler from the command line. In this case, you can set different number of workers for different databases:

ALTER SYSTEM SET schedule.database = 'database1,database2';
ALTER DATABASE database1 SET schedule.max_workers = 5;
ALTER DATABASE database2 SET schedule.max_workers = 3;
ALTER SYSTEM SET schedule.max_parallel_workers = 3;
SELECT pg_reload_conf();

Once pgpro_scheduler is configured, enable it on your system, as follows:

SELECT schedule.enable();

If this function returns true, pgpro_scheduler is ready to use, and you can start scheduling jobs as explained in Section F.53.4.1 and Section F.53.4.2.

Note

If you restart the server, pgpro_scheduler is not automatically restarted by default. To change this behavior, you can set the schedule.auto_enabled parameter to on.

See Also

Section F.53.5.1

F.53.3. Calculating Required Background Workers #

To calculate the maximum number of background workers used by the pgpro_scheduler extension, use the following formula:

1 + N * (1 + schedule.max_workers + schedule.max_parallel_workers)

1: The global supervisor worker for the pgpro_scheduler extension.
N: The number of databases listed in the schedule.database parameter. Each database runs its own database manager and other background workers.
schedule.max_workers: The maximum number of background workers for scheduled jobs.
schedule.max_parallel_workers: The maximum number of background workers for one-time jobs.

Note that this formula calculates the total number of background workers required for the general configuration based on the global default values set using ALTER SYSTEM. To optimize performance and to meet task-specific requirements, tune the schedule.max_workers and schedule.max_parallel_workers parameters for each database listed in schedule.database and include these values in the total worker count.

The max_worker_processes parameter must provide worker processes for both pgpro_scheduler and other Postgres Pro subsystems. Before using pgpro_scheduler for the first time, increase max_worker_processes by the value obtained from the formula above. Adjust this parameter whenever you modify the pgpro_scheduler configuration.

For example, if you work with two databases and set schedule.max_workers to 5 and schedule.max_parallel_workers to 3, pgpro_scheduler may use up to 1 + 2 * (1 + 5 + 3) = 19 background workers. Consequently, you should increase the max_worker_processes value by 19.

Consider a more complex example: you decide to enable the extension for a third database. You configure the parameters as follows:

For the third database, set schedule.max_parallel_workers to 2 and keep its schedule.max_workers set to 5 (default).
For the second database, set schedule.max_workers to 2.

In this case, the total number of workers required for pgpro_scheduler would be:

1 + (1 + 5 + 3) + (1 + 2 + 3) + (1 + 5 + 2) = 24

You would also need to increase the max_worker_processes value by additional 5 workers (24 - 19 = 5).

If all background workers within this pool are busy, jobs will wait for the next available worker. This can lead to delays in the execution of scheduler jobs. Scheduled and one-time jobs are placed into separate queues to manage their execution.

If required, you can later change the number of workers. To check the extension status, use the schedule.status() function. If you see jobs in the submitted state, verify that enough background workers are allocated.

Changes to the schedule.max_workers and schedule.max_parallel_workers parameters do not affect the running jobs.

F.53.4. Usage #

F.53.4.1. Creating Scheduled Jobs #

To create and schedule a job, run the create_job() function that takes scheduling options as a jsonb object:

schedule.create_job(options jsonb)

In the jsonb object, you must specify one or more SQL commands in the commands key, and set the job schedule with at least one of the following keys:

dates — a single date or an array of dates, in the timestamp with time zone format
cron — a string, in the crontab format. A traditional five-field crontab format is used. The first field stands for minute, the second — for hour, the third — for day of the month, the fourth — for month, and the fifth — for day of the week.
```
┌── minute (0 - 59)
│ ┌─── hour (0 - 23)
│ │ ┌─── day of the month (1 - 31)
│ │ │ ┌──── month (1 - 12)
│ │ │ │ ┌──── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
* * * * *
```
A six-field crontab format can be used alongside the traditional five-field format. In this case the first field stands for second. When you use the six-field format and do not want to specify a second you have to put 0 in the first field.
Alternatively, the following keywords can be used instead of a crontab string to specify when the job will be started:
- @every_second — each second
- @hourly — at the beginning of each hour
- @daily — at the beginning of each day
- @midnight — at the beginning of each day
- @weekly — at the beginning of each week
- @monthly — at the beginning of each month
- @yearly — at the beginning of each year
- @annually — at the beginning of each year
rule — a jsonb object that includes one or more of the following keys:
- seconds — seconds; an array of integers in range [0, 59]
- minutes — minutes; an array of integers in range [0, 59]
- hours — hours; an array of integers in range [0, 23]
- days — days of the month; an array of integers in range [1, 31]
- months — months; an array of integers in range [1, 12]
- wdays — days of the week; an array of integers in range [0, 6], where 0 is Sunday.
- onstart — integer value 0 or 1. If onstart is set to 1, the job is executed only once when pgpro_scheduler is started.

You can combine dates, cron, and rule scheduling keys for advanced use cases.

As a result, pgpro_scheduler creates an active job with the specified schedule and returns the job ID.

Tip

For simple job schedules, you can use the following shortcut syntax:

schedule.create_job(cron, commands)
schedule.create_job(dates, commands)

For details, see schedule.create_job() function description.

If required, you can later modify one or more scheduling options with the set_job_attribute() or set_job_attributes() functions, respectively.

If all background workers are busy at the specified time, the job waits for the next available worker. By default, the job can wait forever. You can limit the maximum wait time by setting the last_start_available key, in the time interval format. If the timeout is reached, pgpro_scheduler cancels the job execution.

Examples:

To run the job every day at 3pm, and, additionally, on December 31, 2017 at 7pm , and on April 4, 2020 at 1pm:

SELECT schedule.create_job('{"commands": "SELECT 15", "cron": "0 15 * * *", "dates": [ "2017-12-31 19:00", "2020-04-04 13:00" ]}');

To limit the wait time for job execution to 30 seconds after the scheduled time:

SELECT schedule.create_job('{"commands": "SELECT pg_sleep(100)", "cron": "15 */2 * * *", "last_start_available": "30 seconds" }');

Note

Both for scheduled and one-time jobs, you cannot manage their main transactions, namely using COMMIT and ROLLBACK. However, you can create and manage autonomous transactions.

F.53.4.1.1. Specifying the Time Window for Job Execution #

In addition to the general schedule, you can specify the timeframe during which the scheduled job can be executed. To ensure that pgpro_scheduler only executes the job within the specified time window, define the start_date and end_date keys, in the timestamp with time zone format. You can set one of these keys only to limit the start or the end time, respectively. If you define a time window for the job, pgpro_scheduler will only schedule this job within this time window. If the started job is incomplete when the specified time window ends, pgpro_scheduler completes the job and then excludes the job from further scheduling.

Examples:

To start scheduling the job only after 11am on May 1, 2017:

SELECT schedule.create_job('{"commands": "SELECT now()", "cron": "2 17 * * *", "start_date": "2017-05-01 11:00" }');

To schedule the job in the timeframe from 11am on May 1 to 3pm on June 4, 2017:

SELECT schedule.create_job('{"commands": "SELECT now()", "cron": "2 17 * * *", "start_date": "2017-05-01 11:00", "end_date": "2017-06-04 15:00" }');

F.53.4.1.2. Running SQL Commands in Separate Transactions #

The commands key can have values of text and array types. If you specify several SQL commands as text separated by semicolons, the whole job is executed in a single transaction. If it is critical to perform each SQL command in a separate transaction, pass the SQL commands as an array. You can modify this behavior by setting the use_same_transaction key to true. In this case, SQL commands in the array are executed in a single transaction.

Examples:

To run the whole job in a single transaction:

SELECT schedule.create_job('{"commands": "SELECT 1; SELECT 2; SELECT 3;", "cron": "23 23 */2 * *" }');

To run commands in separate transactions:

SELECT schedule.create_job('{"commands": [ "SELECT 1", "SELECT 2", "SELECT 3" ], "cron": "23 23 */2 * *" }');

To run the whole job in a single transaction when passing the commands as an array:

SELECT schedule.create_job('{"commands": [ "SELECT 1", "SELECT 2", "SELECT 3" ], "cron": "23 23 */2 * *", "use_same_transaction": true }');

F.53.4.1.3. Calculating the Next Start Time of the Scheduled Job #

For repeated jobs, the next start time can be computed by an SQL statement specified in the next_time_statement key. In this case, the first job starts on schedule, while all the successive job runs occur at the computed times.

After the job run completes, pgpro_scheduler executes the SQL statement in the next_time_statement key to calculate the next start time and returns the result, in the timestamp with time zone type. If the return value is of a different type or an error occurs, pgpro_scheduler marks the job as broken and cancels any further execution. This process is repeated for each successive job run.

Tip

When the job run completes, pgpro_scheduler sets the transaction state in the schedule.transaction_state variable, in the text format. You can use this variable in your next_time_statement to dynamically calculate the next start time depending on the transaction state. At the time of the next_time_statement execution, the schedule.transaction_state variable must contain either success or failure state values for the main transaction. Other values may indicate an internal pgpro_scheduler error.

Examples:

To run the job first at 10:45, and then in a day after the job completes:

SELECT schedule.create_job('{"commands": "SELECT random()", "cron": "45 10 * * *", "next_time_statement": "SELECT now() + ''1 day''::interval" }');

F.53.4.1.4. Setting Additional Conditions for Job Execution #

The pgpro_scheduler extension enables you to define additional conditions for task execution:

Set time limits for job execution with the max_run_time key. If the execution time is exceeded, pgpro_scheduler cancels the job.
Define the maximum time a scheduled job can wait for execution using the last_start_available key. If the timeout is reached, pgpro_scheduler cancels the job.
Schedule a job to be executed with the rights of another user by specifying the run_as key. You must have superuser rights to use this key.
Define an SQL command to execute if the main command fails using the onrollback key.

Examples:

To limit job execution to 5 seconds:

SELECT schedule.create_job('{"commands": "SELECT pg_sleep(10)", "cron": "15 */10 * * *", "max_run_time": "5 seconds" }');

To limit the wait time for job execution to 30 seconds after the scheduled time:

SELECT schedule.create_job('{"commands": "SELECT pg_sleep(100)", "cron": "15 */2 * * *", "last_start_available": "30 seconds" }');

To start the job with the rights of the robot user:

SELECT schedule.create_job('{"commands": "SELECT session_user", "cron": "5 */5 * * *", "run_as": "robot" }');

To define a fallback SQL command in case the main command fails:

SELECT schedule.create_job('{"commands": "SELECT ''zzz''", "cron": "55 */12 * * *", "onrollback": "SELECT ''Cannot select zzz''" }');

F.53.4.2. Submitting One-Time Jobs #

You can submit jobs for one-time execution using the schedule.submit_job() function. Such jobs use a separate pool of background workers defined by the schedule.max_parallel_workers variable, and can run in parallel with the scheduled jobs. By default, two one-time jobs can run concurrently. If you submit more jobs, they will wait in the queue for the next available background worker.

To execute a one-time job immediately, pass SQL commands in the query argument. For example:

schedule.submit_job(query := 'select 1');

Instead of passing SQL query parameters directly, you can define numbered placeholders in the query argument, such as $1 and $2, and pass an array of parameters in the params argument, with each array element corresponding to a placeholder. For brevity, you can omit the query and params names:

schedule.submit_job(query := 'select $1, $2', params := '{"text 1", "text 2"}')

To start a one-time job at the specified time, use the run_after argument:

schedule.submit_job('select ''flowers''', run_after := '2017-03-08 08:00:01');

Alternatively, you can delay the job start until the specified jobs are complete using the depends_on argument. For example, to run a job after completing the jobs with 23, 15, and 334 IDs, run:

schedule.submit_job('select ''well done''', depends_on := '{23, 15, 334}')

If required, you can repeat the job execution by passing the schedule.resubmit() function as part of the query argument. For example:

schedule.submit_job('select 1, schedule.resubmit(run_after := ''5'')');

The run_after argument specifies the time interval before the job is restarted, in seconds. By default, the interval is 1 second.

The resubmitted job cannot be executed more than the number of times set in the resubmit_limit argument. If this limit is reached, the job receives the done status, with the corresponding error message.

If you want to cancel a resubmitted job, run:

schedule.cancel_job(job_id bigint);

To monitor one-time jobs, use the job_status and all_job_status pgpro_scheduler views.

For details on all the functions available for managing one-time jobs, see Section F.53.5.6.3.

F.53.4.3. Changing and Removing Scheduled Jobs #

When you create a new job with the create_job() function, the job becomes active and waits for execution based on the specified schedule. Using the job ID returned by the create_job() function, you can change the scheduling settings or remove the job from the schedule. To change the specified schedule for the jobs, use set_job_attribute() or set_job_attributes() functions:

To modify a single property of the job, run the set_job_attribute() function with the job ID, the property name to change, and the new value for this property.
To modify more than one property of the job, run the set_job_attributes() function instead. In this case, you can specify all the job properties at once in a jsonb object. For details on all the keys available for job scheduling, see the create_job() function description.

To temporarily exclude the job from scheduling, run the deactivate_job() function:

schedule.deactivate_job(job_id integer)

You can re-activate the job later by running the activate_job() function:

schedule.activate_job(job_id integer)

To permanently remove the job from the schedule, run the drop_job() function:

schedule.drop_job(job_id integer)

F.53.4.4. Monitoring Scheduled Jobs #

You must have superuser rights to monitor job execution for the whole system. Otherwise, you can only monitor the jobs that you own. To monitor scheduled jobs, pgpro_scheduler provides multiple functions that return cron_rec or cron_job records:

get_job() — retrieves information about the job.
get_owned_cron() — retrieves the list of jobs owned by user.
get_cron() — retrieves the list of jobs executed by user.
get_active_jobs() — returns the list of jobs executed at the moment of the function call.
get_log() — returns the list of all completed jobs.
get_user_log() — returns list of the completed jobs executed by the specified user.
clean_log() — deletes all records with information about completed jobs.

To learn more about each function, see Section F.53.5.6.

F.53.4.5. Auditing Job Scheduling #

pgpro_scheduler enables you to audit job scheduling to rule out human error if you observe unexpected changes in scheduled job execution.

By default, pgpro_scheduler does not store information on schedule changes. To enable this feature, set the schedule.enable_history parameter to true. Once this parameter is enabled, pgpro_scheduler stores schedule modifications in the schedule.cron__history table, and logs all deleted jobs in the schedule.cron__deleted table. Logged history is never deleted from these tables, so a superuser can review schedule changes introduced by all users at any time.

For details on logged information, see Section F.53.5.5.

F.53.4.6. Scheduling Jobs on a Multi-Master Cluster #

Using pgpro_scheduler, you can manage scheduled and one-time jobs on a cluster configured with multimaster. pgpro_scheduler can only manage jobs on the node on which it is installed. Thus, you must install and enable pgpro_scheduler on all nodes on which you would like to schedule jobs. pgpro_scheduler instances will manage jobs on different nodes independently, but the executed jobs will be replicated to other nodes.

Even if you are planning to schedule jobs on a single node only, it is recommended to enable pgpro_scheduler on several nodes. In this case, if a node with scheduled jobs fails, another pgpro_scheduler instance picks up these jobs. If pgpro_scheduler is running on more than one node, the node with the smallest node ID is selected. The naming pattern of node IDs is defined by the schedule.nodename GUC variable.

F.53.5. Reference #

F.53.5.1. GUC Variables #

schedule.enabled (boolean) #

Deprecated. Specifies whether pgpro_scheduler is enabled on your system.

Default: false.

For pgpro_scheduler 2.5 or higher, you can set the schedule.auto_enabled parameter to control whether pgpro_scheduler is enabled at the server start, or use schedule.enable()/schedule.disable() functions to enable/disable pgpro_scheduler on demand. To check if pgpro_scheduler is currently running, use the schedule.is_enabled() function.

schedule.auto_enabled (boolean) #

Specifies whether to enable pgpro_scheduler at the server start.

Default: false.

schedule.database (text) #

Specifies the databases for which pgpro_scheduler is enabled. Database names must be separated by commas.

Default: empty string.

schedule.database_to_connect (text) #

The database to which pgpro_scheduler gets connected to receive Postgres Pro Enterprise cluster metadata. The specified database cannot be dropped while pgpro_scheduler is running. You can change this parameter only when restarting the server.

Default: postgres.

schedule.schema (text) #

Deprecated. Specifies the name of a schema where the scheduler stores its tables and functions. If you need to change the default schema, use ALTER EXTENSION.

Default: schedule.

schedule.nodename (text) #

Specifies the name of the cluster node on which pgpro_scheduler is running. Do not change or use this variable if you run a single-server cluster configuration.

On a cluster configured with multimaster, the node name is derived from the node ID provided by multimaster. For example, if the node ID is 3, the schedule.nodename variable is set to mtm-node-3. However, if you explicitly set the schedule.nodename variable by editing the postgresql.conf file or running the ALTER command, pgpro_scheduler will ignore the node ID and use the provided value instead.

Default: primary.

schedule.max_workers (integer) #

Specifies the maximum number of simultaneously running scheduled jobs in one database.

Default: 2.

schedule.max_parallel_workers (integer) #

Specifies the maximum number of parallel threads that can be used for executing one-time jobs.

Default: 2.

schedule.transaction_state (text) #

An internal variable containing the state of the executed job. pgpro_scheduler uses this variable when calculating the next job start time. Possible values are:

success — transaction has finished successfully.
failure — transaction has failed to finish.
running — transaction is in progress.
undefined — transaction has not started yet.

At the time of the next_time_statement execution, the schedule.transaction_state variable must contain either success or failure state values. Other values may indicate an internal pgpro_scheduler error.

schedule.enable_history (boolean) #

Log all schedule changes, including the name of the user who initiated the change and the time when this change occurred. If a new job is added, or the schedule of an existing job is modified, this information is stored in the schedule.cron__history table. If a job is deleted, this information is stored in the schedule.cron__deleted table. If you later disable the schedule.enable_history parameter, the history of the already recorded changes is preserved.

Default: false

F.53.5.2. SQL Schema #

To store its internal tables and functions, pgpro_scheduler uses the schedule SQL schema. Direct access to tables is not recommended and should not be attempted. To manage job scheduling, use the functions defined by the pgpro_scheduler extension.

F.53.5.3. SQL Types #

pgpro_scheduler defines the following types that are used by some of the pgpro_scheduler functions.

F.53.5.3.1. cron_rec #

This type contains information about the scheduled job.

CREATE TYPE schedule.cron_rec AS(
    id integer,             -- job ID
    node text,              -- name of the node
                            -- on which to execute the job
    name text,              -- job name
    comments text,          -- comments about the job
    rule jsonb,             -- scheduling rules
    commands text[],        -- SQL commands to be executed
    run_as text,            -- username of the job executor
    owner text,             -- username of the job owner
    start_date timestamptz,   -- lower bound of the execution window;
                            -- NULL if unbound
    end_date timestamptz,     -- upper bound of the execution window;
                            -- NULL if unbound
    use_same_transaction boolean,   -- true if an array of SQL 
                                    -- commands will be executed 
                                    -- in a single transaction
    last_start_available interval,  -- maximum wait time for 
                                    -- the scheduled job if all 
                                    -- allowed workers are busy
    max_run_time interval,  -- maximum execution time
    onrollback text,        -- SQL statement to execute 
                            -- if the main transaction fails
    max_instances int,      -- maximum number of simultaneously 
                            -- running job instances
    next_time_statement text,   -- SQL statement to calculate 
                                -- the next start time
    active boolean,         -- true if job is scheduled 
                            -- successfully
    broken boolean          -- true if job has errors in 
                            -- configuration that prevented 
                            -- its further execution
);

F.53.5.3.2. cron_job #

This type contains information about a particular job execution.

CREATE TYPE schedule.cron_job AS(
    cron integer,           -- job id
    node text,              -- name of the node
                            -- on which to execute the job
    scheduled_at timestamptz, -- scheduled execution time
    name text,              -- job name
    comments text,          -- comments about the job
    commands text[],        -- SQL statement to be executed
    run_as text,            -- username of the job executor
    owner text,             -- username of the job owner
    use_same_transaction boolean,   -- true if an array of SQL 
                                    -- commands will be executed
                                    -- in a single transaction
    started timestamptz,      -- timestamp of the job execution start
    last_start_available timestamp, -- maximum wait time for 
                                    -- the scheduled job if all 
                                    -- allowed workers are busy
    finished timestamptz,     -- timestamp of the job
                            -- execution finish
    max_run_time interval,  -- maximum execution time
    onrollback text,        -- SQL statement to execute if the main
                            -- transaction fails
    next_time_statement text,   -- SQL statement to calculate 
                                -- the next start time
    max_instances int,      -- the number of simultaneously 
                            -- running job instances 
    status job_status_t,    -- status of the task:
                            -- working, done, or error
    message text            -- error message
);

F.53.5.3.3. job_status_t #

Enumerated type. Can take the following values:

working — the job is being executed.
done — job execution is complete.
error — job execution has failed.

F.53.5.3.4. job_at_status_t #

Enumerated type. Can take the following values:

submitted — the job is submitted into the queue, but the execution has not started yet.
processing — the job is being executed.
done — job execution is complete.

F.53.5.3.5. timetable_job_type_t #

Enumerated type. Can take the following values:

periodical — a scheduled job.
onetime — a one-time job.

F.53.5.3.6. timetable_job_status_t #

Enumerated type. Can take the following values:

inprogress — the job is being executed.
done — job execution is complete.
error — job execution has failed.
submitted — the job is submitted into the queue, but the execution has not started yet.

F.53.5.4. Views #

pgpro_scheduler provides several views for monitoring execution status of one-time jobs.

F.53.5.4.1. job_status View #

Shows the status of one-time jobs belonging to the current user.

Table F.40. job_status View

Column Name	Column Type	Description
`id`	`bigint`	Job ID.
`node`	`text`	Name of the node on which the job is being executed.
`name`	`text`	Name of the job.
`comments`	`text`	Comments about the job.
`run_after`	`timestamp with time zone`	Timestamp after which the job execution must start.
`query`	`text`	SQL commands executed by the job.
`params`	`text[]`	An array of parameters for the SQL query.
`depends_on`	`bigint[]`	An array of job IDs on which the job execution depends.
`run_as`	`text`	User or role whose rights are used to execute the job.
`attempt`	`bigint`	The number of execution attempts.
`resubmit_limit`	`bigint`	The maximum number of allowed job resubmissions.
`max_wait_interval`	`interval`	The maximum time interval to postpone the job execution if all background workers are busy at the scheduled moment.
`max_duration`	`interval`	Time interval during which the job can be executed.
`submit_time`	`timestamp with time zone`	Time when the job was submitted to the execution queue.
`canceled`	`boolean`	Specifies whether the job was canceled by user.
`start_time`	`timestamp with time zone`	Job execution start time.
`is_success`	`boolean`	`true` — job execution completed successfully. `false` — job execution completed with errors.
`error`	`text`	Error message.
`done_time`	`timestamp with time zone`	Time when the job execution completed.
`status`	`job_at_status_t`	Job status. See the Section F.53.5.3.4 for details.

F.53.5.4.2. all_job_status View #

Shows the status of all one-time jobs. You must have superuser rights to access this view.

Table F.41. all_job_status View

Column Name	Column Type	Description
`id`	`bigint`	Job ID.
`node`	`text`	Name of the node on which the job is being executed.
`name`	`text`	Name of the job.
`comments`	`text`	Comments about the job.
`run_after`	`timestamp with time zone`	Timestamp after which the job execution must start.
`query`	`text`	SQL commands executed by the job.
`params`	`text[]`	An array of parameters for the SQL query.
`depends_on`	`bigint[]`	An array of job IDs on which the job execution depends.
`run_as`	`text`	User or role whose rights are used to execute the job.
`owner`	`text`	The user who created the job.
`attempt`	`bigint`	The number of execution attempts.
`resubmit_limit`	`bigint`	The maximum number of allowed job resubmissions.
`max_wait_interval`	`interval`	The maximum time interval to postpone the job execution for if all background workers are busy at the scheduled moment.
`max_duration`	`interval`	Time interval during which the job can be executed.
`submit_time`	`timestamp with time zone`	Time when the job was submitted to the execution queue.
`canceled`	`boolean`	Specifies whether the job was canceled by user.
`start_time`	`timestamp with time zone`	Job execution start time.
`is_success`	`boolean`	`true` — job execution completed successfully. `false` — job execution completed with errors.
`error`	`text`	Error message.
`done_time`	`timestamp with time zone`	Time when the job execution completed.
`status`	`job_at_status_t`	Job status. See the Section F.53.5.3.4 for details.

F.53.5.5. Audit Tables #

The following tables store all schedule changes if the schedule.enable_history parameter is set to true. If you later disable the schedule.enable_history parameter, the history of the already recorded changes is preserved.

F.53.5.5.1. schedule.cron__history Table #

Registers job scheduling changes. Whenever a new job is scheduled, or the schedule for an existing job is changed, a new row is inserted into this table to record the following information:

All details about the scheduled job, as defined by the cron_rec data type. For details on the cron_rec type, see Section F.53.5.3.
submitter — name of the user who updated the schedule.
version_id — a unique ID for each registered change in the schedule.
submit_time — time when the schedule was updated.

F.53.5.5.2. schedule.cron__deleted Table #

Registers all jobs that were removed from the schedule:

cron — ID of the deleted job.
submitter — name of the user who deleted the job.
submit_time — time when the job was deleted.

F.53.5.6. Functions #

pgpro_scheduler provides two separate sets of functions for managing scheduled and one-time jobs, as well as several common functions that can toggle pgpro_scheduler on and off for your database and show the current status of the extension:

Important

With each job, you can only use the function specifically tailored for this job type.

F.53.5.6.1. Common Functions #

These functions facilitate pgpro_scheduler management.

schedule.enable() #

Enables pgpro_scheduler for the current Postgres Pro Enterprise instance.

Return values:

true if pgpro_scheduler is enabled and ready to use.
false if the command has failed.

schedule.is_enabled() #

Checks whether pgpro_scheduler is enabled.

Return values:

true if pgpro_scheduler is enabled and ready to use.
false if pgpro_scheduler is not currently running.

schedule.disable() #

Disables pgpro_scheduler for the current Postgres Pro Enterprise instance.

Return values:

true if pgpro_scheduler is disabled.
false if the command has failed.

schedule.start() #

Launches pgpro_scheduler for the currently connected database.

Return values:

true — pgpro_scheduler started successfully.
false — if the command has failed, or pgpro_scheduler is already started.

schedule.stop() #

Stops pgpro_scheduler for the currently connected database.

Return values:

true — pgpro_scheduler is stopped.
false — if the command has failed, or pgpro_scheduler is not running.

schedule.status() #

Returns the status of pgpro_scheduler background workers:

pid — process ID of the background worker. If the process ID is NULL, the background worker is not running.
database — name of the database to which the background worker is connected.
type — type of the background worker:
- supervisor — distributes the scheduled jobs between the databases.
- database manager distributes the scheduled jobs within the database.
- cron job executor executes a scheduled job.
- at job executor executes a one-time job.

schedule.version() #

Returns pgpro_scheduler version.

F.53.5.6.2. Functions for Managing Scheduled Jobs #

schedule.create_job(options jsonb) #

Creates an active job and returns the job ID.

Alternative Syntax:

schedule.create_job(cron text, commands text [, node text])
schedule.create_job(cron text, commands text[] [, node text])
schedule.create_job(dates timestamp with time zone, commands text [, node text])
schedule.create_job(dates timestamp with time zone, commands text[] [, node text])
schedule.create_job(dates timestamp with time zone[], commands text [, node text])
schedule.create_job(dates timestamp with time zone[], commands text[] [, node text])

Arguments:

options — a jsonb object defining all the job properties. You do not need to define other parameters if the data is set. All the available jsonb keys are listed in Table F.42.
Type: jsonb
cron — a crontab-like string defining the job schedule.
Type: text
dates — the exact date or an array of dates for job execution.
Type: timestamp with time zone, timestamp with time zone[]
commands — SQL statement to execute. You can pass one or more SQL commands separated by semicolons, or an array of SQL commands. When passed as an array, SQL commands are executed in separate transactions.
Type: text, text[]
node — the name of the node on which the scheduled jobs run. Optional. You may need to specify this argument if you are scheduling jobs on a multi-master cluster.
Type: text

Return values:

ID of the created job.

Table F.42. jsonb Keys for Job Scheduling

Key	Type	Description
`cron`	`text`	A crontab-like string defining the job schedule. A traditional five-field or nonstandard six-field (seconds in the first field) `crontab` format may be used. You can combine `cron` with `rule` and `dates` keys, but at least one of them is mandatory. Alternatively, the following keywords can be used instead of a `crontab` string to specify when the job will be started: `@every_second` — each second `@hourly` — at the beginning of each hour `@daily` — at the beginning of each day `@midnight` — at the beginning of each day `@weekly` — at the beginning of each week `@monthly` — at the beginning of each month `@yearly` — at the beginning of each year `@annually` — at the beginning of each year
`dates`	`timestamp with time zone`, `timestamp with time zone[]`	The exact date or an array of dates when the scheduled job will be executed. You can combine `dates` with `rule` and `cron` keys, but at least one of them is mandatory.
`rule`	`jsonb`	A `jsonb` object defining the job schedule. Mandatory, if both `cron` and `dates` keys are undefined. The `rule` object includes one or more of the following keys: `seconds` — seconds; an array of integers in range [0, 59] `minutes` — minutes; an array of integers in range [0, 59] `hours` — hours; an array of integers in range [0, 23] `days` — days of the month; an array of integers in range [1, 31] `months` — months; an array of integers in range [1, 12] `wdays` — days of the week; an array of integers in range [0, 6], where 0 is Sunday. `onstart` — integer value 0 or 1. If `onstart` is set to 1, the job is executed only once when `pgpro_scheduler` is started.
`commands`	`text`, `text[]`	SQL statements to execute. You can pass one or more SQL statements separated by semicolons, or an array of SQL statements. When passed as an array, SQL statements are executed in separate transactions by default. You can change this behavior by setting the `use_same_transaction` key.
`name`	`text`	Optional. Job name.
`node`	`text`	Optional. The name of the node on which the scheduled jobs run. You may need to specify this argument if you are scheduling jobs on a multi-master cluster.
`comments`	`text`	Optional. Comments about the scheduled job.
`run_as`	`text`	Optional. The user whose rights are used to execute the job.
`start_date`	`timestamp with time zone`	Optional. The start of the timeframe when the scheduled job can be executed. This key can be `NULL`.
`end_date`	`timestamp with time zone`	Optional. The end of the timeframe when the scheduled job can be executed. This key can be `NULL`.
`use_same_transaction`	`boolean`	Optional. If set to `true`, forces an array of SQL statements to be executed in a single transaction. Default: `false`
`last_start_available`	`interval`	Optional. The maximum time interval to postpone the job execution for if all background workers are busy at the scheduled moment. For example, if this key is set to '00:02:34', the job will wait for 2 minutes 34 seconds. If this key is `NULL` or not set, the job can wait forever. Default: `NULL`.
`max_instances`	`integer`	Optional. The maximum number of job instances that can be executed simultaneously. Default: 1. For cron jobs where `next_time_statement` is set, the `max_instances` value must always be `1`.
`max_run_time`	`interval`	Optional. The maximum time interval during which the scheduled job can be executed. If this key is `NULL` or not set, there are no time limits. Default: `NULL`.
`onrollback`	`text`	Optional. SQL statement to be executed if the main transaction fails.
`next_time_statement`	`text`	Optional. SQL statement to calculate the start time for the next job execution. For details, see Section F.53.4.1.3.

schedule.set_job_attributes(job_id integer, data jsonb) #

Updates properties of the existing job.

Arguments:

job_id — identifier of the existing job.
data — a jsonb object with properties to be edited. For the list of keys and their structure, see Table F.42.

Return values:

true — job properties were updated successfully.
false — job properties were not updated.

To update the job properties, you must be the owner of the job or have superuser rights.

schedule.set_job_attribute(job_id integer, name text, value text || anyarray) #

Updates a property of the existing job.

Arguments:

job_id — identifier of the existing job.
name — property name.
value — property value.

See Table F.42 for the list of job properties you can update. Some values are of array types. They should be passed as an array. If a value of a wrong type is passed, an exception is raised.

Return values:

true — job property was updated successfully.
false — job property was not updated.

To update the job properties, you must be the owner of the job or have superuser rights.

schedule.deactivate_job(job_id integer) #

Deactivates the job and suspends its further scheduling and execution.

Arguments:

job_id — identifier of the existing job.

Return values:

true — the job is deactivated successfully.
false — job deactivation failed.

schedule.activate_job(job_id integer) #

Activates a job and starts its scheduling and execution.

Arguments:

job_id — identifier of the existing job.

Return values:

true — the job was activated successfully.
false — job activation failed.

schedule.drop_job(job_id integer) #

Deletes a job.

Arguments:

job_id — identifier of the existing job.

Return values:

true — the job was deleted successfully.
false — job was not deleted.

schedule.get_job(job_id integer) #

Returns information about the specified job.

Arguments:

job_id — identifier of the existing job.

Return values:

An object of type cron_rec.

For details on the cron_rec type, see Section F.53.5.3.

schedule.get_owned_cron(username text) #

Retrieves the list of jobs owned by the specified user.

Arguments:

username — username, optional.

Return values:

A set of records of type cron_rec. These records contain information about all jobs owned by the specified user. If the username is omitted, the session username is used. You must have superuser rights to retrieve jobs owned by another user.

For details on the cron_rec type, see Section F.53.5.3.

schedule.get_cron() #

Retrieves the list of jobs executed by the session user.

Return values:

A set of records of type cron_rec. These records contain information about all jobs executed by the session user. You must have superuser rights to retrieve the jobs.

For details on the cron_rec type, see Section F.53.5.3.

schedule.get_active_jobs(username text) #

Returns the list of jobs currently being executed by the specified user.

Arguments:

username — username, optional.

If username is omitted, the session username is used. You must have superuser rights to retrieve jobs executed by another user.

Return values:

A set of records of type cron_job.

For details on the cron_job type, see Section F.53.5.3.

schedule.get_active_jobs() #

Returns the list of jobs being currently executed. You must have superuser rights to call this function.

Return values:

A set of records of type cron_job.

For details on the cron_job type, see Section F.53.5.3.

schedule.get_log() #

Returns the list of all completed jobs. You must have superuser rights to call this function.

Return values:

A set of records of type cron_job.

For details on the cron_job type, see Section F.53.5.3.

schedule.get_user_log(username text) #

Returns the list of completed jobs executed by the specified user.

Arguments:

username — username, optional.

If username is omitted, the session username is used. You must have superuser rights to retrieve the list of jobs executed by another user.

Return values:

A set of records of type cron_job.

For details on the cron_job type, see Section F.53.5.3.

schedule.clean_log() #

Deletes all records with information about the completed jobs. You must have superuser rights to call this function.

Return values:

The number of records deleted.

schedule.nodename() #

Returns the current node name.

F.53.5.6.3. Functions for Managing One-Time Jobs #

schedule.submit_job(query text [options...]) #

Submits a job for immediate or delayed one-time execution. By default, the job is scheduled for immediate execution and can run in parallel with other scheduled jobs. To submit a job with a delayed start, you can set the execution start time using the run_after argument, or pass an array of job IDs in the depends_on argument to schedule job execution right after these jobs are complete.

Arguments:

query — SQL commands to execute.
Type: text
params — an array of parameters for the SQL query that can substitute numbered placeholders in the query argument, such as $1, $2, etc. Default: NULL
Type: text[]
run_after — a timestamp after which the job execution starts. If this argument is set to NULL, the job is scheduled for immediate execution. You can also use the depends_on argument to delay the job start. Default: NULL
Type: timestamp with time zone
node — the name of the node on which to execute the job. Default: NULL
Type: text
max_duration — the maximum time interval during which the job can be executed. If this time is exceeded, the job is forced to stop. If this argument is NULL or not set, there are no time limits. Default: NULL
Type: interval
max_wait_interval — the maximum time interval to postpone the job execution for if all background workers are busy at the scheduled moment. For example, if this key is set to '00:02:34', the job will wait for 2 minutes 34 seconds. If this key is NULL or not set, the job can wait forever. Default: NULL
Type: interval
run_as — user or role whose rights are used to execute the job. If run_as is set to NULL, the job is executed with the rights of the current user. You must have superuser rights to set this argument. Default: NULL
Type: text
depends_on — an array of job IDs. The created job starts immediately after the specified jobs complete the execution. This argument is an alternative to run_after. Default: NULL
Type: bigint[]
name — name of the job. Default: NULL
Type: text
comments — comments about the job.
Type: text
resubmit_limit — maximum number of times the job can be resubmitted for execution. See the schedule.resubmit() function for details. Default: 100
Type: bigint

Return values:

ID of the created job.
Type: bigint

schedule.get_self_id() #

Returns the ID of the job, in the context of which it was called. The returned ID is of the bigint type. This function must be called inside the query of the schedule.submit_job() function. Otherwise, an exception is raised.

Return values:

Job ID.

schedule.cancel_job(job_id bigint) #

Cancels all subsequent runs of the specified job. If the job is currently being executed, it will not be interrupted, but cannot be resubmitted. You must have superuser rights or be the owner of the job to call this function.

Arguments:

job_id — identifier of the job to cancel.

Return values:

true if the operation completed successfully.
false if the operation failed.

schedule.resubmit(run_after interval default NULL) #

Sets the start time for the next execution of the job, without interrupting the current job run. This function must be called inside the query argument of the schedule.submit_job() function. Otherwise, an exception is raised. If this function is called several times within a single job execution, only the last function call is taken into account.

Arguments:

run_after — time interval after which the job will be resubmitted for execution. If the time interval is less than a second but greater than zero, it is rounded to 1 second. Intervals longer than 1 second are rounded to integral values. If 0 is passed, the job is resubmitted immediately after execution. Default: 1 second
Type: interval

Return values:

The number of seconds after which the job will be resubmitted for execution.

F.53.5.6.4. Cleanup Functions #

When many one-time jobs are submitted, the table that stores completed one-time jobs may grow rapidly. To clean up this table, pgpro_scheduler provides the following functions with the same name but different argument types:

schedule.clean_at_jobs_done(older_than interval, delete_failed_tasks boolean, default NULL) schedule.clean_at_jobs_done(older_than timestamp with time zone, delete_failed_tasks boolean, default NULL)

Delete all records from the table with completed one-time jobs that are older than either a specified timestamp or a specified interval relative to the current time.

Arguments:

older_than — an interval or a timestamp for cleanup. Records older than this value will be deleted.
Type: interval, timestamp with time zone
delete_failed_tasks — specifies whether to delete failed jobs. Default: false.
Type: boolean

Return values:

The number of deleted records.

These functions can be called manually or scheduled using cron jobs in the pgpro_scheduler interface.

F.53.5.6.5. Functions for Managing Scheduled and One-Time Jobs #

schedule.timetable(start_time timestamp with time zone, end_timetimestamp with time zone) #: Returns a table, which describes all the jobs, both repeated and one-time, that are scheduled to execute within the specified time interval.

Table F.43. schedule.timetable Columns

Column Name	Column Type	Description
`id`	`bigint`	Job ID, which is unique for the jobs of this type.
`type`	`timetable_job_type_t`	Job type. See Section F.53.5.3.5 for details.
`node`	`text`	Name of the node on which the job is being executed.
`name`	`text`	Name of the job.
`comments`	`text`	Comments about the job.
`commands`	`text[]`	An array of SQL commands executed by the job.
`scheduled_at`	`timestamp with time zone`	Scheduled job execution time.
`start_time`	`timestamp with time zone`	Job execution start time.
`done_time`	`timestamp with time zone`	Time when the job execution completed.
`status`	`timetable_job_status_t`	Job status. See Section F.53.5.3.6 for details.
`error`	`text`	Error message.

F.53.6. Authors #

Postgres Professional, Moscow, Russia

Prev	Up	Next
F.52. pgpro_rp — resource prioritization	Home	F.54. pgpro_sfile — storage for large objects

epub pdf