Home > Blog

September 12, 2023 • PostgreSQL • By Pavel Luzanov

PostgreSQL 17: part 1 or CommitFest 2023-07

Source:
habr.com

We continue to follow the news in the world of PostgreSQL. The PostgreSQL 16 Release Candidate 1 was rolled out on August 31. If all is well, PostgreSQL 16 will officially release on September 14.

What has changed in the upcoming release after the April code freeze? What’s getting into PostgreSQL 17 after the first commitfest? Read our latest review to find out!

PostgreSQL 16

For reference, here are our previous reviews of PostgreSQL 16 commitfests: 2022-07, 2022-09, 2022-11, 2023-01, 2023-03.

Since April, there have been some notable changes.

Let’s start with the losses. The following updates have not made it into the release:

MAINTAIN — a new privilege for table maintenance (commit: 151c22de)
Setting parameter values at the database and user level (commit: b9a7a822)

Some patches have been updated:

psql: \drg — a new command for viewing role membership information

commit: d65ddaca

The new command was added to PostgreSQL 16 after the code freeze, which is not done very often. However, this patch is just a psql interface update for the server functionality that has been introduced in PostgreSQL 16.

Create a role alice with role administration privileges.


CREATE ROLE alice LOGIN CREATEROLE;
\c - alice

Make alice automatically inherit the privileges of any created rolesand be able to switch to them using SET ROLE:


SET createrole_self_grant = 'INHERIT, SET';

CREATE ROLE bob LOGIN;

Use the new command \drg to verify that alice is a member of bob with the following parameters:


\drg alice


               List of role grants
 Role name | Member of |   Options    | Grantor  
-----------+-----------+--------------+----------
 alice     | bob       | INHERIT, SET | alice
 alice     | bob       | ADMIN        | postgres
(2 rows)

Note that alice is a member of bob twice. The ADMIN OPTION grant was made on behalf of the bootstrap superuser. This is what lets alice manage bob. It may be important to know who granted what membership to whom, since only the grantor can REVOKE the membership.

Create a role charlie and grant it membership in bob.


CREATE ROLE charlie LOGIN;
GRANT bob TO charlie WITH ADMIN FALSE, INHERIT FALSE, SET TRUE;

\drg


               List of role grants
 Role name | Member of |   Options    | Grantor  
-----------+-----------+--------------+----------
 alice     | bob       | INHERIT, SET | alice
 alice     | bob       | ADMIN        | postgres
 alice     | charlie   | INHERIT, SET | alice
 alice     | charlie   | ADMIN        | postgres
 charlie   | bob       | SET          | alice
(5 rows)

In previous versions, membership was displayed in the “Member of” column in the \du (or \dg) output. Turns out, adding the information about the grantor into that column wasn’t that easy, so the new command \drg was created instead, and \du and \dg lost the “Member of” column altogether.

\du


                             List of roles
 Role name |                         Attributes                         
-----------+------------------------------------------------------------
 alice     | Create role
 bob       |
 charlie   |
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS

More about role membership management from our previous commitfest reviews:

The parameter io_direct is renamed to debug_io_direct

commit: 319bae9a

PostgreSQL 16 introduced a new debugging parameter io_direct. To avoid people accidentally switching it on without understanding what it does, the parameter was renamed to debug_io_direct.

Parameters lc_collate and lc_ctype removed

commit: b0f6c437

Starting with PostgreSQL 15, the ICU library can be used as the default localization provider for the database. If it is, information about the selected provider and locale is stored in the columns pg_database.datlocprovider and pg_database.daticulocale.

However, for a number of reasons, libc settings are still needed in every database. Therefore, the pg_database.datcollate and pg_database.datctype columns still have information about the libc locale, even if the ICU provider is selected for the database, and the lc_collate and lc_ctype parameters specifically display the libc settings. Therefore, just checking the lc_ctype or lc_collate value is not enough to correctly determine the database locale. You have to check the values of the datlocprovider and daticulocale columns as well.

To avoid errors in determining the database locale, the lc_collate and lc_ctype parameters were removed. To view the locale information, you should use pg_database or the \list command in psql from now on.

Documentation: links to previously hidden HTML elements visible in the web interface

commit: e2922702

Some HTML documentation pages contain anchors for sections or terms. For example, each configuration parameter in the “Server Configuration” chapter has its own anchor and can be referenced directly. But how do you find out what the anchor is without looking into the page source code?

In PostgreSQL16’s web interface, when you hover the mouse pointer over the name of a section or term that has an anchor, a # sign now appears on the right with the corresponding hyperlink.

For example, if you hover the mouse over the “search_path” section name on the Client Connection Defaults page, a # sign will appear on the right with a direct link to the parameter: https://www.postgresql.org/docs/16/runtime-config-client.html#GUC-SEARCH-PATH

This is very convenient when you need to share an exact link.

PostgreSQL 17

Now, let’s move to version 17. After the first commitfest, the following patches were added:

Index vacuuming progress in pg_stat_progress_vacuum
Incremental sorting for GiST and SP-GiST indexes
Exclusion constraints for partitioned tables
Logical replication: REPLICA IDENTITY FULL and hash indexes
pg_archivecleanup: --clean-backup-history
New parameter huge_pages_status
Removed the parameter db_user_namespace
Wait events for extensions
psql: the ECHO_HIDDEN output is now framed with comment lines

Index vacuuming progress in pg_stat_progress_vacuum

commit: 46ebdfe1, f1889729

Two new columns were added to the pg_stat_progress_vacuum view: indexes_total and indexes_processed. The first one shows the total number of indexes to be vacuumed, and the second one shows how many indexes have already been processed. The information is updated during the vacuuming phases associated with the indexes: “vacuuming indexes” and “cleaning up indexes”.

This is an important improvement in monitoring the vacuuming of large tables with multiple indexes, since using pg_stat_progress_vacuum in such cases was leaving a lot to be desired. After all, thanks to the visibility map, vacuuming of the table itself affects only the pages that were changed since the previous vacuuming, so it is performed quickly enough, provided that those changes are small in number. On the other hand, each index is scanned in full, so the index vacuuming phases take up most of the table vacuuming time, and there was no way to track their progress. Now the progress can be evaluated by the number of processed indexes.

There is still no way to monitor the progress on the single index level, however. The process is vastly different for different kinds of indexes, and the same index page may be processed multiple times. Therefore, it is impossible to reliably implement monitoring similar to the one used with tables, i.e. number_of_vacuumed_pages / total_number_of_pages.

Incremental sorting for GiST and SP-GiST indexes

commit: 625d5b3c

Incremental sorting was first introduced back in PostgreSQL 13, but so far it has only been used for B-tree indexes. Now it works with GiST and SP-GiST indexes too.

In the demodatabase, let’s create a GiST index by airport coordinates:


CREATE INDEX ON airports_data USING gist (coordinates);

Find 10 flights from airports closest to a given point (for example, with the coordinates of 0 and 0), and sorted by departure date:


EXPLAIN (costs off)
SELECT f.*
FROM   flights f
         JOIN airports a ON (f.departure_airport=a.airport_code)
ORDER BY point(0,0) <-> a.coordinates, f.scheduled_departure
LIMIT 10;


                                      QUERY PLAN                                      
--------------------------------------------------------------------------------------
 Limit
   ->  Incremental Sort
         Sort Key: (('(0,0)'::point <-> ml.coordinates)), f.scheduled_departure
         Presorted Key: (('(0,0)'::point <-> ml.coordinates))
         ->  Nested Loop
               Join Filter: (ml.airport_code = f.departure_airport)
               ->  Index Scan using airports_data_coordinates_idx on airports_data ml
                     Order By: (coordinates <-> '(0,0)'::point)
               ->  Materialize
                     ->  Seq Scan on flights f
(10 rows)

Ignore the output here, the interesting thing is that we now get the Incremental Sort node with the GiST index in the query plan.

Exclusion constraints for partitioned tables

commit: 8c852ba9

It is now possible to create exclusion constraints for partitioned tables.

Create a table for booking meeting rooms, partitioned by room number:


CREATE TABLE booking(
    room integer PRIMARY KEY,
    during tstzrange NOT NULL
) PARTITION BY RANGE(room);

CREATE TABLE booking_1_10
    PARTITION OF booking FOR VALUES FROM (1) TO (10);
CREATE TABLE booking_11_20
    PARTITION OF booking FOR VALUES FROM (11) TO (20);

To make it impossible to book the same room twice, add an exclusion constraint. Like the primary key, the constraint must include partitioning key columns, and only with the equality operator:


CREATE EXTENSION btree_gist;

ALTER TABLE booking ADD CONSTRAINT no_intersect
    EXCLUDE USING gist(room WITH =, during WITH &&);

Different rooms from different sections can be booked for the same periods of time:


INSERT INTO booking(room, during) VALUES
    (1, '[today 13:00,today 16:00)'::tstzrange),
    (11, '[today 13:00,today 16:00)'::tstzrange);


INSERT 0 2

But you can’t book the same room for overlapping time ranges:


INSERT INTO booking(room, during)
    VALUES (1, '[today 14:00,today 18:00)'::tstzrange);


ERROR:  duplicate key value violates unique constraint "booking_1_10_pkey"
DETAIL:  Key (room)=(1) already exists.

Logical replication: REPLICA IDENTITY FULL and hash indexes

commit: edca3424

The ability to use a non-unique index to identify modified rows on the subscriber first appeared in PostgreSQL 16, but it only worked on B-tree indexes. In PostgreSQL 17, it also works with hash indexes.

pg_archivecleanup: --clean-backup-history

commit: dd7c60f1, 4a7556f7, 3f8c98d0

The first commit adds the ability to specify long options, the second commit is a refactoring of the existing code, and only the third one implements the original idea by adding a new option --clean-backup-history. This option is used for deleting old backup history files. Previously, these small files were always retained for debugging purposes.

New parameter huge_pages_status

commit: a14354ca

The new parameter huge_pages_status shows the usage status of huge pages:


\dconfig huge*


List of configuration parameters
     Parameter     | Value
-------------------+-------
 huge_pages        | try
 huge_page_size    | 2MB
 huge_pages_status | off
(3 rows)

Here, the parameter value suggests that the server could not allocate memory for huge pages at startup.

Removed the parameter db_user_namespace

commit :884eee5b

The parameter was used to relate user names to databases. 21 years ago, this functionality was announced as a temporary measure, but it seems to have never garnered any popularity.

Wait events for extensions

commit: c9af0546

Extension developers get the ability to define their own wait events. Currently, all extensions use the same wait event type: Extension. But if several extensions are installed, pg_stat_activity does not give a clear indication as to which extension has caused a delay.

The patch introduces an interface for extensions to create wait events. In the future, the developers plan to update the existing contrib extensions to clarify the names of the wait events.

psql: the ECHO_HIDDEN output is now framed with comment lines

commit: 19c590f6

Enabling the ECHO_HIDDEN parameter in the command line or as a psql variable makes the SQL queries used in psql commands appear in the output:


$ psql --echo-hidden -c '\db';


/******** QUERY *********/
SELECT spcname AS "Name",
  pg_catalog.pg_get_userbyid(spcowner) AS "Owner",
  pg_catalog.pg_tablespace_location(oid) AS "Location"
FROM pg_catalog.pg_tablespace
ORDER BY 1;
/************************/

       List of tablespaces
    Name    |  Owner   | Location
------------+----------+----------
 pg_default | postgres |
 pg_global  | postgres |
(2 rows)

The lines surrounding the query are now output as comment lines, which is convenient for later use in scripts or for copying and pasting. In previous versions, the lines were framed with * signs instead.

The same changes were made to --log-file and --single-step.

That’s all for now. Looking forward to the release of PostgreSQL 16 and the September commitfest for PostgreSQL 17!

← Back to all articles