PostgreSQL 17: part 3 or CommitFest 2023-11
The November commitfest is ripe with new interesting features! Without further ado, let’s proceed with the review.
If you missed our July and September commitfest reviews, you can check them out here: 2023-07, 2023-09.
- ON LOGIN trigger
- Event triggers for REINDEX
- ALTER OPERATOR: commutator, negator, hashes, merges
- pg_dump --filter=dump.txt
- psql: displaying default privileges
- pg_stat_statements: track statement entry timestamps and reset min/max statistics
- pg_stat_checkpointer: checkpointer process statistics
- pg_stats: statistics for range type columns
- Planner: exclusion of unnecessary table self-joins
- Planner: materialized CTE statistics
- Planner: accessing a table with multiple clauses
- Index range scan optimization
- dblink, postgres_fdw: detailed wait events
- Logical replication: migration of replication slots during publisher upgrade
- Replication slot use log
- Unicode: new information functions
- New function: xmltext
- AT LOCAL support
- Infinite intervals
- ALTER SYSTEM with unrecognized custom parameters
- Building the server from source
ON LOGIN trigger
commit: e83d1b0c
The ability to put event an trigger on a database login event is coming to PostgreSQL.
As usual, the trigger is created in two steps. First, the trigger function:
CREATE FUNCTION check_login() RETURNS event_trigger
AS $$
BEGIN
IF session_user = 'postgres' THEN RETURN; END IF;
IF to_char(current_date, 'DY') IN ('SAT','SUN')
THEN
RAISE 'Have a nice weekend, see you on Monday!';
END IF;
END;
$$ LANGUAGE plpgsql;
Then the trigger itself:
CREATE EVENT TRIGGER check_login
ON LOGIN
EXECUTE FUNCTION check_login();
Now, to everyone’s delight, users will not bother the administrator on weekends. :)
$ psql -U alice -d postgres
psql: error: connection to server on socket "/tmp/.s.PGSQL.5401" failed: FATAL: Have a nice weekend, see you on Monday!
CONTEXT: PL/pgSQL function check_login() line 6 at RAISE
See also:
Event triggers for REINDEX
commit: f21848de
Event triggers now work with the REINDEX
command, allowing you to create triggers for ddl_command_start
and ddl_command_end
events.
ALTER OPERATOR: commutator, negator, hashes, merges
commit: 2b5154be
In the ALTER OPERATOR command, you can now specify a commutator and a negator for the operator being defined, if they were not specified in the CREATE OPERATOR
already. You can also add support for hash join and merge join operations (HASHES
and MERGES
)
pg_dump --filter=dump.txt
commit: a5cf808b
The new parameter --filter
allows the user to specify a file name with a list of objects to include or exclude from the dump:
$ cat dump.txt
include table bookings
exclude table_data bookings
$ pg_dump -d demo --filter=dump.txt |grep -v -E '^SET|^SELECT|^--|^$'
CREATE TABLE bookings.bookings (
book_ref character(6) NOT NULL,
book_date timestamp with time zone NOT NULL,
total_amount numeric(10,2) NOT NULL
);
ALTER TABLE bookings.bookings OWNER TO postgres;
COMMENT ON TABLE bookings.bookings IS 'Bookings';
COMMENT ON COLUMN bookings.bookings.book_ref IS 'Booking number';
COMMENT ON COLUMN bookings.bookings.book_date IS 'Booking date';
COMMENT ON COLUMN bookings.bookings.total_amount IS 'Total booking cost';
ALTER TABLE ONLY bookings.bookings
ADD CONSTRAINT bookings_pkey PRIMARY KEY (book_ref);
The exact file syntax is described in the documentation.
This parameter may come in handy when the list of database objects is so long that it does not fit into the pg_dump
command’s character limit.
--filter
has also been added to pg_dumpall
and pg_restore
.
psql: displaying default privileges
commit: d1379ebf
psql
commands were never a good way to view default privileges. We will use schemas in this example, although the same logic applies to any type of database object.
CREATE SCHEMA s;
When a schema is created, its owner gets both its privileges, USAGE
and CREATE
, by default. However, if you try to check the privileges using the \dn+
command, it will return NULL
, since the privileges aren’t recorded into pg_namespace.nspacl
.
16=# SELECT nspacl IS NULL
FROM pg_namespace
WHERE oid = 's'::regnamespace;
?column?
----------
t
16=# \pset null '(null)'
16=# \dn+ s
List of schemas
Name | Owner | Access privileges | Description
------+----------+-------------------+-------------
s | postgres | |
Note how even if you set \pset null
, it does not affect the NULL values being displayed. When viewing system catalog objects, psql
commands used to ignore this setting.
Running any GRANT
or REVOKE
command will have the privileges recorded into the system catalog explicitly. The following commands grant and revoke privileges for the schema. Nothing changes as a result, except that the default privileges now appear in the table:
16=# GRANT USAGE ON SCHEMA s TO public;
16=# REVOKE USAGE ON SCHEMA s FROM public;
16=# \dn+ s
List of schemas
Name | Owner | Access privileges | Description
------+----------+----------------------+-------------
s | postgres | postgres=UC/postgres |
The owner may revoke their own privileges (and grant them back again later):
16=# REVOKE ALL ON SCHEMA s FROM postgres;
At the moment, pg_namespace.nspacl
is an empty aclitem[]
array. An empty array does not equal NULL
, but how to distinguish between the two in the command output?
16=# \dn+ s
List of schemas
Name | Owner | Access privileges | Description
------+----------+-------------------+-------------
s | postgres | |
The only solution used to be to query pg_namespace
.
In PostgreSQL 17 this is no longer the case. The \pset null
setting now properly affects NULL
values:
17=# CREATE SCHEMA s;
17=# \pset null '(null)'
17=# \dn+ s
List of schemas
Name | Owner | Access privileges | Description
------+----------+-------------------+-------------
s | postgres | (null) | (null)
And the absence of privileges (an empty array) is displayed with a special value (none)
:
17=# REVOKE ALL ON SCHEMA s FROM postgres;
17=# \dn+ s
List of schemas
Name | Owner | Access privileges | Description
------+----------+-------------------+-------------
s | postgres | (none) | (null)
pg_stat_statements: track statement entry timestamps and reset min/max statistics
commit: dc9f8a79
pg_stat_statements
now has a new column stats_since
, which records the time when collecting statistics for each operator was initiated. In addition to that, you can now reset the statistics in the min/max*
columns for specific operators by calling pg_stat_statements with the parameter minmax_only
. The reset time for the statistics is recorded in the minmax_stats_since
column.
These new features will come in handy for monitoring systems that rely on sampling information from pg_stat_statements
. In particular, they may skip collecting all the statistics before starting a snapshot.
pg_stat_checkpointer: checkpointer process statistics
Buffer cache changes can be written to disk by three types of processes: background writer, checkpointer, and user backend processes. For the longest time, all the relevant statistics have been tracked in a single pg_stat_bgwriter
view.
The new patch trims the number of columns in pg_stat_bgwriter
down considerably. All checkpointer statistics have been moved into a new view pg_stat_checkpointer
(first commit).
At the same time, the buffers_backend
and buffers_backend_fsync
columns were removed from pg_stat_bgwriter
, since more accurate and detailed information about the backend processes can now be found in the pg_stat_io
view that was introduced in PostgreSQL 16 (second commit).
pg_stats: statistics for range type columns
commit: bc3c8db8
Statistics for range type columns have traditionally been collected and stored in the pg_statistic
table. However, this information was not displayed in the pg_stats
view.
This patch fixes that by adding three new columns to pg_stats
: range_length_histogram
, range_empty_frac
, and range_bounds_histogram
.
Planner: exclusion of unnecessary table self-joins
commit: d3d55ce5
In a poorly composed query, a table may be joined with itself for any reason, as the SQL syntax allows it. Such queries often come from all sorts of ORMs, and the burden of optimisation often falls onto those who are powerless to change the way the queries are structured in the first place.
In PostgreSQL 17, the planner now can catch these unnecessary joins and exclude them from the query plan. The following example demonstrates an unnecessary semijoin of the table bookings with itself.
17=# EXPLAIN (costs off)
WITH b AS (
SELECT book_ref FROM bookings
)
SELECT *
FROM bookings
WHERE book_ref IN (SELECT book_ref FROM b);
QUERY PLAN
----------------------------------
Seq Scan on bookings
Filter: (book_ref IS NOT NULL)
The planner is justified in deciding that the table should only be scanned once.
This optimization is toggled by the parameter enable_self_join_removal. Switching it off leads to the behavior observed in the older versions:
17=# SET enable_self_join_removal = off;
17=# EXPLAIN (costs off)
WITH b AS (
SELECT book_ref FROM bookings
)
SELECT *
FROM bookings
WHERE book_ref IN (SELECT book_ref FROM b);
QUERY PLAN
--------------------------------------------------------
Hash Join
Hash Cond: (bookings.book_ref = bookings_1.book_ref)
-> Seq Scan on bookings
-> Hash
-> Seq Scan on bookings bookings_1
Planner: materialized CTE statistics
commit: f7816aec
Let’s try the same example, but with CTE materialization. This is how it plays out in PostgreSQL 16 (jit off):
16=# EXPLAIN (analyze,timing off)
WITH b AS MATERIALIZED (
SELECT book_ref FROM bookings
)
SELECT *
FROM bookings
WHERE book_ref IN (SELECT book_ref FROM b);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=82058.51..83469.66 rows=1055555 width=21) (actual rows=2111110 loops=1)
CTE b
-> Seq Scan on bookings bookings_1 (cost=0.00..34558.10 rows=2111110 width=7) (actual rows=2111110 loops=1)
-> HashAggregate (cost=47499.98..47501.98 rows=200 width=28) (actual rows=2111110 loops=1)
Group Key: b.book_ref
Batches: 141 Memory Usage: 11113kB Disk Usage: 57936kB
-> CTE Scan on b (cost=0.00..42222.20 rows=2111110 width=28) (actual rows=2111110 loops=1)
-> Index Scan using bookings_pkey on bookings (cost=0.43..8.37 rows=1 width=21) (actual rows=1 loops=2111110)
Index Cond: (book_ref = b.book_ref)
Planning Time: 0.237 ms
Execution Time: 10847.689 ms
The planner has decided to group the CTE by book_ref
before joining with the outer query (the HashAggregate node). But book_ref
is the primary key of the bookings table, so the number of rows after grouping will remain at ~2 million, not at the estimated 200 rows. As a result, the planner selects a wrong join type (nested loop) for the CTE and the outer query.
In PostgreSQL 17, the planner has access to some statistics about columns from materialized CTE and can use those in the outer parts of the plan. This leads to more precise cardinality estimates. This is how the same query runs now:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=117601.18..222997.93 rows=2111110 width=21) (actual rows=2111110 loops=1)
Hash Cond: (bookings.book_ref = b.book_ref)
CTE b
-> Seq Scan on bookings bookings_1 (cost=0.00..34558.10 rows=2111110 width=7) (actual rows=2111110 loops=1)
-> Seq Scan on bookings (cost=0.00..34558.10 rows=2111110 width=21) (actual rows=2111110 loops=1)
-> Hash (cost=42222.20..42222.20 rows=2111110 width=28) (actual rows=2111110 loops=1)
Buckets: 131072 Batches: 32 Memory Usage: 3529kB
-> CTE Scan on b (cost=0.00..42222.20 rows=2111110 width=28) (actual rows=2111110 loops=1)
Planning Time: 0.127 ms
Execution Time: 1556.576 ms
With the unnecessary grouping gone and using hash join for the CTE and the outer query, the execution is about 7 times faster.
Planner: accessing a table with multiple clauses
commit: 5d8aa8bc
When a table is queried with multiple restriction clauses, should it always be queried in the first place? This patch introduces a small optimisation along this line of thinking.
Consider this query in PostgreSQL 16.
16=# EXPLAIN (costs off, analyze, timing off, summary off)
SELECT *
FROM tickets
WHERE ticket_no = '0005432' AND ticket_no = '0005000';
QUERY PLAN
-----------------------------------------------------------------
Result (actual rows=0 loops=1)
One-Time Filter: false
-> Index Scan using tickets_pkey on tickets (never executed)
Index Cond: (ticket_no = '0005432'::bpchar)
Two clauses for the tickets table contradict each other, but the planner does not immediately figure this out and proceeds to select a table access method. It concludes at a later stage that there is no need to access the table after all, but the resources spent on planning end up wasted.
Now, here is the plan for the same query in PostgreSQL 17:
QUERY PLAN
--------------------------------
Result (actual rows=0 loops=1)
One-Time Filter: false
Both the clauses are analyzed in advance. The plan not only becomes cheaper to build, but looks much neater as well.
Index range scan optimization
commit: e0b1ee17
When a range scan is performed using a B-tree index, each scanned index page is searched for values within the given range.
Previously, PostgreSQL used to check if every value in an index page is within the range. With the new patch, it first checks the last value in a page: if the last value is within the range, so must be all other values. The more index pages are there to scan and the more expensive the comparison operator, the greater the optimization effect.
In the following example, more than 40 thousand index pages are scanned. This is the query in PostgreSQL 16:
16=# EXPLAIN (analyze, buffers, costs off, timing off)
SELECT *
FROM tickets
WHERE ticket_no > '0005432' AND ticket_no < '0005434';
QUERY PLAN
-------------------------------------------------------------------------------------
Index Scan using tickets_pkey on tickets (actual rows=1469571 loops=1)
Index Cond: ((ticket_no > '0005432'::bpchar) AND (ticket_no < '0005434'::bpchar))
Buffers: shared hit=10675 read=30185
Planning:
Buffers: shared read=4
Planning Time: 0.214 ms
Execution Time: 683.801 ms
The same query in PostgreSQL 17 executes much faster:
QUERY PLAN
-------------------------------------------------------------------------------------
Index Scan using tickets_pkey on tickets (actual rows=1469571 loops=1)
Index Cond: ((ticket_no > '0005432'::bpchar) AND (ticket_no < '0005434'::bpchar))
Buffers: shared hit=10690 read=30170
Planning:
Buffers: shared read=4
Planning Time: 0.268 ms
Execution Time: 237.177 ms
dblink, postgres_fdw: detailed wait events
Extensions dblink, postgres_fdw first started using the new interface for creating wait events. The new wait events are explained in each extension’s documentation .
Logical replication: migration of replication slots during publisher upgrade
commit: 29d0a77f
Upgrading the publisher with pg_upgrade creates an issue with logical replication. Publications migrate to the new version, but replication slots do not. This forces subscribers to synchronize data again when the publisher is upgraded.
This patch makes pg_upgrade migrate the replication slots to the new server, so the subscribers only have to adjust the connection string and then carry on collecting updates.
Replication slot use log
commit: 7c3fb505
In addition to logging replication commands, the parameter log_replication_commands
now also enables logging of wal sender processes acquiring and releasing replication slots.
Analyzing these log messages may show how long replication slots remain idle, which in turn may help hunt down issues with consumers (such as replicas or logical replication subscribers).
Let’s enable log_replication_commands and make a backup using pg_basebackup
. There are new messages in the log:
LOG: acquired physical replication slot "pg_basebackup_339820"
…
LOG: released physical replication slot "pg_basebackup_339820"
Unicode: new information functions
commit: a02b37fc
The function unicode_assigned
returns True if all every character in a string has a Unicode code point assigned:
SELECT unicode_assigned('Hello World!');
unicode_assigned
------------------
t
The other two functions return the Unicode version for PostgreSQL and ICU:
SELECT unicode_version(), icu_unicode_version();
unicode_version | icu_unicode_version
-----------------+---------------------
15.1 | 14.0
New function: xmltext
commit: 526fe0d7
xmltext
is a standard SQL function. It converts an input string into an XML value, properly escaping any special characters:
SELECT xmltext('<Hello & World>');
xmltext
--------------------------
<Привет & Мир>
AT LOCAL support
commit: 97957fdb
The construction AT TIME ZONE
is used to explicitly specify a time zone when converting values. The SQL standard has an abbreviation AT LOCAL
to indicate the current time zone (defined by the timezone parameter value). Now AT LOCAL
can be used in PostgreSQL. The following two expressions for the Europe/Moscow
timezone are now equivalent:
SELECT now() AT TIME ZONE 'Europe/Moscow',
now() AT LOCAL
\gx
-[ RECORD 1 ]------------------------
timezone | 2023-12-18 12:57:29.612578
timezone | 2023-12-18 12:57:29.612578
Infinite intervals
commit: 519fc1bd
The type interval
now works with infinite values:
SELECT 'infinity'::interval, '-infinity'::interval;
interval | interval
----------+-----------
infinity | -infinity
This allows for some new arithmetic operations:
SELECT now() + 'infinity'::interval,
now() - 'infinity'::interval;
?column? | ?column?
----------+-----------
infinity | -infinity
ALTER SYSTEM with unrecognized custom parameters
commit: 2d870b4a
The ALTER SYSTEM
command now can write custom parameters to postgresql.auto.conf
:
=# ALTER SYSTEM SET myapp.today = '2023-12-06';
ALTER SYSTEM
By default, only the superuser can do this, but the privileges for specific parameters can be granted with the GRANT ... ON PARAMETER
command.
In fact, ALTER SYSTEM
has been capable of working with custom parameters even in previous versions, but they had to be declared in-session with the SET
command and stored in the hash table in memory. This strange behavior was declared a mistake and is now fixed.
Again, you can still set custom parameters in the main configuration file postgresql.conf
.
Building the server from source
commit: 721856ff
Tar archives with source code will no longer contain files pre-generated with Flex
, Bison
, and perl
utilities. There will also be no generated documentation files and man pages. Therefore, the build from the source code obtained from the Downloads section will be identical to the build from the git repository source code. This means that in any case, flex
, bison
and perl
will be required.
This was done primarily for the sake of the assembly system meson, because it can’t compile a server from tar archives as they exist now.
That’s all I’ve got in 2023. Looking forward to the next PostgreSQL 17 commitfest in January!