F.47. pgpro_ilm — enables moving rarely used relations to a lower-cost storage #

The pgpro_ilm extension solves the problem similar to Oracle's information lifecycle management (ILM), which analyzes usage frequency of data in a database and performs certain actions with rarely used tables, for example, moves tables to a lower-cost and slower storage.

pgpro_ilm allows you to define rules for moving tables. A rule can have the NO_MODIFICATION or NO_ACCESS type. A NO_MODIFICATION rule defines an action to perform on the table if it was not modified (by UPDATE, INSERT, DELETE, or TRUNCATE commands), but could be read (by SELECT) for a certain period of time. A NO_ACCESS rule defines an action to perform on the table if none of the above commands was executed on this table for a certain period of time. pgpro_ilm provides functions to manage the rules and apply them.

pgpro_ilm gets per-user statistics on when tables were last accessed from pgpro_usage functions and views. For the ILM purposes, maintenance operations with tables, such as taking regular backups, archiving, etc., should not be taken into account. These operations are performed by particular users, e.g., postgres, and such users must be excluded when checking whether pgpro_ilm rules apply. pgpro_ilm provides functions to manage the user exclusion list.

F.47.1.  Limitations #

The current version of pgpro_ilm has the following limitations:

  • The only action supported is moving the relation to another tablespace.

  • Regular tables and partitions of partitioned tables are only supported.

  • Rules for indexes are not supported, and indexes are moved together with the respective tables.

F.47.2. Installation #

pgpro_ilm is a built-in extension included in Postgres Pro Enterprise. To enable pgpro_ilm, create the extension using the following query:

CREATE EXTENSION pgpro_ilm;

For pgpro_ilm to work, make sure that the pgpro_usage extension is installed, as explained in Section F.52.1.

F.47.3. Functions #

To configure moving tables to a lower-cost storage, pgpro_ilm provides an SQL interface that consists of several functions.

F.47.3.1. Rule Management Functions #

The following functions manage pgpro_ilm rules:

add_rule( iv_object_name text, iv_rule_type text, iv_period interval, iv_action text, iv_parameter text ) returns void

Adds a new rule.

Arguments:

  • iv_object_name — name of the relation. Can be qualified or not. If the name is unqualified, the schema will be determined automatically by following a search path (as explained in Section 5.10.4) when saving the rule to an internal table.

  • iv_rule_type — type of the rule: NO_ACCESS or NO_MODIFICATION.

  • iv_period — period after which the rule gets activated.

  • iv_action — action to perform. Possible value: ALTER_TS — move the relation to another tablespace.

  • iv_parameter — action parameter. Possible value is the tablespace where to move the table.

remove_rule( iv_object_name text, iv_rule_type text, iv_period interval, iv_action text ) returns void

Removes a rule or rules.

Arguments:

  • iv_object_name — name of the relation. Can be qualified or not. If the name is unqualified, the schema will be determined automatically by following a search path (as explained in Section 5.10.4) when searching the rule to remove.

  • iv_rule_type — type of the rule: NO_ACCESS or NO_MODIFICATION. NULL value removes rules with any rule type.

  • iv_period — period after which the rule gets activated. NULL value removes rules with any value of the period.

  • iv_action — action to perform. Possible value: ALTER_TS — move the relation to another tablespace. NULL value removes rules with any action.

get_rules( [iv_object_name text] ) returns table

Returns the list of all rules if called without the parameter or the list of rules that apply to the specified relation. For tables that are partitions of partitioned tables, this list also includes rules that apply to them indirectly, through the rules that apply to different-level parent tables. The list is returned in a table of the following format:

Column Type

Description

nspname name

Name of the schema that a relation is in

relname name

Name of the relation

relkind char

The value defined in the relkind column of the pg_class catalog corresponding to this relation.

rule_type char

Type of a rule: NO_ACCESS or NO_MODIFICATION

period interval

Period after which the rule gets activated

action text

Action to perform

parameter text

Action parameter

Arguments:

  • iv_object_name — name of the relation. Can be qualified or not. If the name is unqualified, the schema will be determined automatically by following a search path (as explained in Section 5.10.4).

F.47.3.2. Exclusion-List Management Functions #

The following functions manage the list of users to be excluded when checking pgpro_ilm rules:

set_exclude_users( iv_exclude_access name[], iv_exclude_modification name[] ) returns void

Resets the user exclusion list and creates a new one.

Arguments:

  • iv_exclude_access — array of names of users to exclude when checking NO_ACCESS rules.

  • iv_exclude_modification — array of names of users to exclude when checking NO_MODIFICATION rules.

get_exclude_users() returns table

Returns the user exclusion list in a table of the following format:

Column Type

Description

user_name name

Name of a user.

exclude_access bool

Whether to exclude this user from checking NO_ACCESS rules.

exclude_modification bool

Whether to exclude this user from checking NO_MODIFICATION rules.

F.47.3.3. Rule Processing Functions #

The following functions check whether pgpro_ilm rules apply to relations and perform the needed action with the relations to which the rules apply. For regular tables, the functions check the rules in the descending order of the period after which the rule gets activated. For partitions of partitioned tables, rules are checked for parent partitions from the lowest level of the partitioned table hierarchy to the highest level, and on the same level, rules are checked in the descending order of the period. Once a rule that applies to a table is found, the functions perform the appropriate action, and finish processing of this table.

process_rules( iv_object_name text ) returns void

Checks whether any rule applies to a table, performs the needed action with the table if the rule applies to it, and exits. When this function is called for a table more than once, pgpro_ilm determines whether a certain rule has already been applied and does not apply the rule again.

Arguments:

  • iv_object_name — name of the table. Can be qualified or not. If the name is unqualified, the schema will be determined automatically by following a search path (as explained in Section 5.10.4) when iterating through the rules.

process_all_rules() returns void

For each table in the database, checks whether any rule applies to this table, performs the needed action with the table if the rule applies to it, and proceeds to the next table. The function actually calls process_rules for all the tables in the database for which any rule exists.

F.47.4. Examples #

The following example illustrates usage of pgpro_ilm:

Let's create and view the user exception list:

test=# SELECT pgpro_ilm.set_exclude_users(array['exclude_access_user'], array['Exclude Write User']);
 set_exclude_users
-------------------

(1 row)

test=# SELECT user_name, exclude_access, exclude_modification FROM pgpro_ilm.get_exclude_users();
      user_name      | exclude_access | exclude_modification
---------------------+----------------+----------------------
 exclude_access_user | t              | f
 Exclude Write User  | f              | t
(2 rows)

Consider a partitioned table with one partition:

test=# \d+ measurement
                                  Partitioned table "public.measurement"
  Column   |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |             |              |
 logdate   | date    |           | not null |         | plain   |             |              |
 peaktemp  | integer |           | not null |         | plain   |             |              |
 unitsales | integer |           |          |         | plain   |             |              |
Partition key: RANGE (logdate)
Indexes:
    "measurement_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp)
    "measurement_city_id_idx" btree (city_id)
    "measurement_logdate_idx" btree (logdate)
Partitions: measurement_y2006m03 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')

test=# \d+ measurement_y2006m03
                                    Table "public.measurement_y2006m03"
  Column   |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |             |              |
 logdate   | date    |           | not null |         | plain   |             |              |
 peaktemp  | integer |           | not null |         | plain   |             |              |
 unitsales | integer |           |          |         | plain   |             |              |
Partition of: measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-03-01'::date) AND (logdate < '2006-04-01'::date))
Indexes:
    "measurement_y2006m03_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp)
    "measurement_y2006m03_city_id_idx" btree (city_id)
    "measurement_y2006m03_logdate_idx" btree (logdate)
Access method: heap

Let's add a few rules:

test=# SELECT * from pgpro_ilm.add_rule('measurement', 'NO_ACCESS', interval '16 mons', 'ALTER_TS', 'slow_space');
 add_rule
----------

(1 row)

test=# SELECT * from pgpro_ilm.add_rule('measurement', 'NO_MODIFICATION', interval '16 mons', 'ALTER_TS', 'slow_space');
 add_rule
----------

(1 row)

test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_ACCESS', interval '14 mons', 'ALTER_TS', 'slow_space');
 add_rule
----------

(1 row)

test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_ACCESS', interval '12 mons', 'ALTER_TS', 'slow_space');
 add_rule
----------

(1 row)

Rules for the partitioned table look as follows:

test=# select * from pgpro_ilm.get_rules('measurement');
 nspname |   relname   |    rule_type    |    period     |  action  | parameter
---------+-------------+-----------------+---------------+----------+------------
 public  | measurement | NO_ACCESS       | 1 year 4 mons | ALTER_TS | slow_space
 public  | measurement | NO_MODIFICATION | 1 year 4 mons | ALTER_TS | slow_space
(2 rows)

Two explicitly specified NO_ACCESS rules and a NO_MODIFICATION rule inherited from the parent table apply to the partition:

test=# select * from pgpro_ilm.get_rules('measurement_y2006m03');
 nspname |       relname        |    rule_type    |    period     |  action  | parameter
---------+----------------------+-----------------+---------------+----------+------------
 public  | measurement          | NO_MODIFICATION | 1 year 4 mons | ALTER_TS | slow_space
 public  | measurement_y2006m03 | NO_ACCESS       | 1 year 2 mons | ALTER_TS | slow_space
 public  | measurement_y2006m03 | NO_ACCESS       | 1 year        | ALTER_TS | slow_space
(3 rows)

Let's add one more explicit rule for the partition:

test=# SELECT * from pgpro_ilm.add_rule('measurement_y2006m03', 'NO_MODIFICATION', interval '12 mons', 'ALTER_TS', 'slow_space');
 add_rule
----------

(1 row)

Now only explicitly specified rules apply to the partition:

test=# select * from pgpro_ilm.get_rules('measurement_y2006m03');
 nspname |       relname        |    rule_type    |    period     |  action  | parameter
---------+----------------------+-----------------+---------------+----------+------------
 public  | measurement_y2006m03 | NO_ACCESS       | 1 year 2 mons | ALTER_TS | slow_space
 public  | measurement_y2006m03 | NO_ACCESS       | 1 year        | ALTER_TS | slow_space
 public  | measurement_y2006m03 | NO_MODIFICATION | 1 year        | ALTER_TS | slow_space
(3 rows)

Note that rules are not processed for partitioned tables.

test=# SELECT pgpro_ilm.process_rules('measurement');
NOTICE:  ILM: no rules matched for 'public.measurement'
 process_rules
---------------

(1 row)

When rules are processed, indexes are moved together with the table:

test=# SELECT pgpro_ilm.process_rules('measurement_y2006m03');
NOTICE:  ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space'
NOTICE:  ILM: executing 'ALTER TABLE public.measurement_y2006m03 SET TABLESPACE slow_space;'
NOTICE:  ILM: executing 'ALTER INDEX public.measurement_y2006m03_pkey SET TABLESPACE slow_space;'
NOTICE:  ILM: executing 'ALTER INDEX public.measurement_y2006m03_city_id_idx SET TABLESPACE slow_space;'
NOTICE:  ILM: executing 'ALTER INDEX public.measurement_y2006m03_logdate_idx SET TABLESPACE slow_space;'
 process_rules
---------------

(1 row)

Result:

test=# \d+ measurement_y2006m03
                                    Table "public.measurement_y2006m03"
  Column   |  Type   | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |             |              |
 logdate   | date    |           | not null |         | plain   |             |              |
 peaktemp  | integer |           | not null |         | plain   |             |              |
 unitsales | integer |           |          |         | plain   |             |              |
Partition of: measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
Partition constraint: ((logdate IS NOT NULL) AND (logdate >= '2006-03-01'::date) AND (logdate < '2006-04-01'::date))
Indexes:
    "measurement_y2006m03_pkey" PRIMARY KEY, btree (city_id, logdate, peaktemp), tablespace "slow_space"
    "measurement_y2006m03_city_id_idx" btree (city_id), tablespace "slow_space"
    "measurement_y2006m03_logdate_idx" btree (logdate), tablespace "slow_space"
Tablespace: "slow_space"
Access method: heap

When the function is called again, pgpro_ilm determines whether the table has already been moved and does not move it again.

test=# SELECT pgpro_ilm.process_rules('measurement_y2006m03');
NOTICE:  ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space'
NOTICE:  ILM: 'public.measurement_y2006m03' is already in the 'slow_space' tablespace
 process_rules
---------------

(1 row)

Removing all types of rules for the measurement table:

test=# select pgpro_ilm.remove_rule('measurement', NULL, '16 month'::interval, 'ALTER_TS');
 remove_rule
-------------

(1 row)

test=# select * from pgpro_ilm.get_rules('measurement');
 nspname | relname | rule_type | period | action | parameter
---------+---------+-----------+--------+--------+-----------
(0 rows)

To process all tables, pgpro_ilm.process_all_rules() function can be called. In this example, one already moved table is found:

test=# call pgpro_ilm.process_all_rules();
NOTICE:  ILM: checking 'public.measurement_y2006m03'
NOTICE:  ILM: found rule 'public.measurement_y2006m03' with type 'NO_ACCESS' and period '1 year 2 mons'. Action 'ALTER_TS', parameter 'slow_space'
NOTICE:  ILM: 'public.measurement_y2006m03' is already in the 'slow_space' tablespace
CALL