PostgreSQL 19: part 1 or CommitFest 2025-07

We’re launching a new series of articles covering the changes coming in PostgreSQL 19. This first article focuses on patches committed during the July 2025 CommitFest.

Connection service file: libpq parameter and psql variable

commit: 092f3c63efc6, 6b1c4d326b06

When connecting to a server using a connection service file, you can now specify its name in the connection string via the new servicefile parameter.

For example, there’s a demo.conf file in the current directory:

$ cat ./demo.conf
[demo]
host=localhost
port=5401
user=postgres
dbname=demo
options=-c search_path=bookings

You can use it for the connection:

$ psql 'servicefile=./demo.conf service=demo'
psql (19devel)
Type "help" for help.

postgres@demo(19.0)=#

In psql, there is a new SERVICEFILE variable for the service file name:

postgres@demo(19.0)=# \echo :SERVICEFILE
./demo.conf
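
The servicefile parameter can be used anywhere regular connection parameters are accepted. As a sketch (assuming the same demo.conf), it should also work in a connection URI, since libpq accepts arbitrary keyword=value pairs in the query part:

$ psql 'postgresql:///?servicefile=./demo.conf&service=demo'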

regdatabase: a type for database identifiers

commit: bd09f024a

A new addition to the family of object identifier alias types. The regdatabase type enables conversion between database names and their identifiers in both directions.

SELECT current_database()::regdatabase,
       current_database()::regdatabase::oid
\gx
-[ RECORD 1 ]----+------
current_database | demo
current_database | 16561
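
Like the other reg* types, regdatabase makes it easier to query catalogs that store database OIDs. A minimal sketch (assuming a database named demo) against pg_shdepend, whose dbid column holds database OIDs:

SELECT count(*) FROM pg_shdepend WHERE dbid = 'demo'::regdatabase;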

pg_stat_statements: counters for generic and custom plans

commit: 3357471cf

The pg_stat_statements view now includes two new columns that track how many times generic and custom plans were selected for query execution:

SELECT pg_stat_statements_reset();

SELECT * FROM bookings WHERE book_ref = $1 \bind 'NWQI2S' \g /dev/null
SELECT * FROM bookings WHERE book_ref = $1 \bind 'WF2DGZ' \g /dev/null

SELECT query, calls, generic_plan_calls, custom_plan_calls
FROM pg_stat_statements
WHERE query ~ 'bookings'
\gx
-[ RECORD 1 ]------+-------------------------------------------
query              | SELECT * FROM bookings WHERE book_ref = $1
calls              | 2
generic_plan_calls | 0
custom_plan_calls  | 2
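
As a quick sketch of how the other counter moves, you can force generic plans with plan_cache_mode and repeat a bound query; generic_plan_calls should then start incrementing instead of custom_plan_calls:

SET plan_cache_mode = force_generic_plan;
SELECT * FROM bookings WHERE book_ref = $1 \bind 'NWQI2S' \g /dev/null
RESET plan_cache_mode;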

pg_stat_statements: FETCH command normalization

commit: bee23ea4d

When normalizing the FETCH command, the row count is replaced with a parameter placeholder. This means that FETCH commands retrieving different numbers of rows will share the same query identifier and appear as a single entry in pg_stat_statements:

SELECT pg_stat_statements_reset();

BEGIN;
DECLARE cur CURSOR FOR SELECT * FROM flights;
FETCH 1 cur\g /dev/null
FETCH 2 cur\g /dev/null
FETCH -1 cur\g /dev/null
COMMIT;

SELECT queryid, query, calls
FROM pg_stat_statements
WHERE query ~ '^FETCH'
\gx
-[ RECORD 1 ]----------------
queryid | 4164749676997500190
query   | FETCH $1 cur
calls   | 3

pg_stat_statements: normalizing commands with parameter lists in IN clauses

commit: c2da1a5d6

Commands with lists of constants in IN clauses were normalized in version 18. In version 19, this same normalization applies not only to constants but also to parameter lists:

SELECT pg_stat_statements_reset();

SELECT * FROM flights WHERE flight_id IN (1,2) \g /dev/null
SELECT * FROM flights WHERE flight_id IN (1,2,3) \g /dev/null

SELECT * FROM flights WHERE flight_id IN ($1,$2) \bind 11 12 \g /dev/null
SELECT * FROM flights WHERE flight_id IN ($1,$2,$3) \bind 21 22 23 \g /dev/null
SELECT * FROM flights WHERE flight_id IN ($1,$2,$3,$4) \bind 31 32 33 34 \g /dev/null

SELECT queryid, query, calls
FROM pg_stat_statements
WHERE query ~ 'flights'
\gx
-[ RECORD 1 ]-----------------------------------------------------
queryid | -5928905334469394952
query   | SELECT * FROM flights WHERE flight_id IN ($1 /*, ... */)
calls   | 5

This is particularly useful for applications that use the extended query protocol.

EXPLAIN: Memoize node estimates

commit: 4bc62b86849

To help understand why the planner chose to use a Memoize node, the EXPLAIN output now includes the planner’s estimates for this node (shown in the Estimates line):

EXPLAIN            
SELECT * FROM routes r
  JOIN airports_data a ON r.departure_airport = a.airport_code
WHERE r.days_of_week = '{1,2,3,4,5,6,7}';
                                               QUERY PLAN                                               
--------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.29..212.71 rows=898 width=293)
   ->  Seq Scan on routes r  (cost=0.00..111.12 rows=898 width=103)
         Filter: (days_of_week = '{1,2,3,4,5,6,7}'::integer[])
   ->  Memoize  (cost=0.29..1.08 rows=1 width=190)
         Cache Key: r.departure_airport
         Cache Mode: logical
         Estimates: capacity=73 distinct keys=73 lookups=898 hit percent=91.87%
         ->  Index Scan using airports_data_pkey on airports_data a  (cost=0.28..1.07 rows=1 width=190)
               Index Cond: (airport_code = r.departure_airport)
(9 rows)

This line shows the planner’s estimates for the following metrics:

  • capacity: the planned size of the hash table
  • distinct keys: the expected number of unique keys in the hash table
  • lookups: how many times the hash table will be probed
  • hit percent: the expected cache hit rate

Running EXPLAIN with the ANALYZE option shows the difference between the estimates and the actual results. Here are the two relevant lines from the execution plan for this same query:

…
         Estimates: capacity=73 distinct keys=73 lookups=898 hit percent=91.87%
         Hits: 830  Misses: 68  Evictions: 0  Overflows: 0  Memory Usage: 19kB
…

btree_gin: comparison operators for integer types

commit: e2b64fcef35, fc896821c44

A small convenience for users of the btree_gin extension.

CREATE EXTENSION btree_gin;
CREATE EXTENSION pg_trgm;    -- provides the gin_trgm_ops operator class used below

CREATE TABLE t (a bigint, b text);
INSERT INTO t SELECT random(1,100), random()::text FROM generate_series(1, 100000);

CREATE INDEX idx ON t USING GIN (a, b gin_trgm_ops);
VACUUM ANALYZE t;

SELECT * FROM t WHERE a = 42 AND b LIKE '*42*';

Previously, the planner would only use the index for the condition on column b in this type of query, but wouldn’t include a = 42 in the index search condition because column a is of type bigint while 42 defaults to int. This required explicitly specifying a = 42::bigint in the query.

Now btree_gin supports cross-type comparisons for the integer types, so conditions mixing int2, int4, and int8 can use the index without explicit casts.

EXPLAIN SELECT * FROM t WHERE a = 42 AND b LIKE '*42*';
                             QUERY PLAN                             
--------------------------------------------------------------------
 Bitmap Heap Scan on t  (cost=229.49..233.51 rows=1 width=27)
   Recheck Cond: ((a = 42) AND (b ~~ '*42*'::text))
   ->  Bitmap Index Scan on idx  (cost=0.00..229.49 rows=1 width=0)
         Index Cond: ((a = 42) AND (b ~~ '*42*'::text))
(4 rows)

pg_upgrade: optimized migration of large objects

commit: 161a3e8b682, 3bcfcd815e1

When upgrading from version 12 onwards, pg_dump with the --binary-upgrade parameter generates COPY commands for large objects that insert data directly into the pg_largeobject_metadata and pg_shdepend tables.

Restoring from such a dump is much faster than executing ALTER LARGE OBJECT commands to assign ownership and GRANT commands to set privileges. As a result, pg_upgrade will spend considerably less time upgrading servers that contain large numbers of large objects.

Upgrades from version 16 and later (the second commit) will be even faster, as pg_largeobject_metadata table files will be migrated just like regular user table files.

pg_dump without the --binary-upgrade parameter continues to export large object metadata using SQL commands, as it always has.

Optimized temporary table truncation

commit: 78ebda66bf2

Truncating a temporary table previously required scanning the local buffer cache (temp_buffers) three times for each fork: the main relation, the visibility map, and the free space map. Now a single scan is sufficient. This is a highly valuable optimization for applications that make heavy use of temporary tables.

A similar optimization for regular tables was implemented back in PostgreSQL 13.
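
A rough way to observe the effect (a sketch, not a benchmark; timings depend on temp_buffers and the data set):

SET temp_buffers = '256MB';
CREATE TEMP TABLE tt AS SELECT id FROM generate_series(1, 1000000) AS id;
\timing on
TRUNCATE tt;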

Planner: incremental sort in Append and MergeAppend nodes

commit: 55a780e9476

The planner can now leverage incremental sort for nodes nested within Append or MergeAppend. When nested nodes can produce partially sorted data, the planner will favor completing the sort incrementally rather than performing a full sort on the entire result set.
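
A minimal sketch of a setup where this can kick in (the names are illustrative; whether Incremental Sort actually appears depends on costs and statistics):

CREATE TABLE events (id int, ts timestamptz) PARTITION BY RANGE (id);
CREATE TABLE events_1 PARTITION OF events FOR VALUES FROM (1) TO (1000000);
CREATE TABLE events_2 PARTITION OF events FOR VALUES FROM (1000000) TO (2000000);
CREATE INDEX ON events (id);

-- each partition can return rows already sorted by id, so an ORDER BY (id, ts)
-- may now be finished with an Incremental Sort below the Append/MergeAppend
EXPLAIN SELECT * FROM events ORDER BY id, ts;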

Domain constraint validation no longer blocks DML operations

commit: 16a0039dc0d

The ALTER DOMAIN … VALIDATE CONSTRAINT … command now uses a lighter SHARE UPDATE EXCLUSIVE lock instead of the previous SHARE lock. This allows concurrent modifications to tables using the domain to proceed without blocking.
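
A minimal sketch (the domain and table names are illustrative):

CREATE DOMAIN positive_int AS integer;
CREATE TABLE measurements (v positive_int);

ALTER DOMAIN positive_int ADD CONSTRAINT v_positive CHECK (VALUE > 0) NOT VALID;

-- validation now takes only a SHARE UPDATE EXCLUSIVE lock on measurements,
-- so concurrent INSERT/UPDATE/DELETE are no longer blocked
ALTER DOMAIN positive_int VALIDATE CONSTRAINT v_positive;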

CHECKPOINT command parameters

commit: cd8324cc89a, bb938e2c3c7, a4f126516e6, 2f698d7f4b7, 8d33fbacbac

The CHECKPOINT command now supports parameters:

=# \h checkpoint 
Command:     CHECKPOINT
Description: force a write-ahead log checkpoint
Syntax:
CHECKPOINT [ ( option [, ...] ) ]

where option can be one of:

    FLUSH_UNLOGGED [ boolean ]
    MODE { FAST | SPREAD }

URL: https://www.postgresql.org/docs/devel/sql-checkpoint.html

When FLUSH_UNLOGGED is enabled, modified buffers of unlogged objects (tables, indexes, etc.) will be flushed to disk. This parameter is disabled by default.

This parameter helps reduce server restart time when unlogged tables are heavily used. Previously, dirty buffers of unlogged objects were not flushed to disk by a regular checkpoint; this happened only during the shutdown checkpoint. Now you can manually run CHECKPOINT with FLUSH_UNLOGGED before shutting down the server, offloading that work in advance so that the shutdown checkpoint completes faster.

To avoid putting too much write load on the system, you can run the preliminary checkpoint with the other new parameter, MODE SPREAD. The checkpoint will then be spread out over time, taking the checkpoint_completion_target value into account. This works similarly to the --checkpoint={fast|spread} option of the pg_basebackup utility.
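
For example, a pre-shutdown checkpoint combining both options could look like this:

CHECKPOINT (FLUSH_UNLOGGED, MODE SPREAD);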

COPY FROM: skipping initial rows

commit: bc2f348e87c

The COPY command’s header parameter now accepts a positive numeric value in addition to true/false and match. This number specifies how many initial rows the command should skip before it starts loading data into the table.

CREATE TABLE test(id int);

=# COPY test FROM stdin (header 3);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> first header row
>> second header row
>> third, then data
>> 1
>> 2
>> \.
COPY 2
SELECT * FROM test;
 id 
----
  1
  2
(2 rows)

Note that in this case, you won’t be able to use the header match option to map columns between the file and the table.

pg_dsm_registry_allocations: dynamic shared memory (DSM) usage

commit: 167ed8082f4

The new system view pg_dsm_registry_allocations provides information about dynamic shared memory allocation.

=# \d pg_dsm_registry_allocations
  View "pg_catalog.pg_dsm_registry_allocations"
 Column |  Type  | Collation | Nullable | Default 
--------+--------+-----------+----------+---------
 name   | text   |           |          | 
 type   | text   |           |          | 
 size   | bigint |           |          | 
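
To look at the registry, simply query the view; which rows you see depends on the extensions and modules that have registered dynamic shared memory (a sketch):

SELECT name, type, pg_size_pretty(size) AS size
FROM pg_dsm_registry_allocations
ORDER BY size DESC;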

That’s all for now. Coming up next: news from the September CommitFest.

Pavel Luzanov