Thread: PostgreSQL 16 release announcement draft
Hi, Attached is the first draft of the PostgreSQL 16 release announcement, authored by Chelsea Dole & myself. To frame this up, the goal of the GA release announcement is to help folks discover the awesome new features of PostgreSQL. It's impossible to list out every single feature in the release and still have a coherent announcement, so we try to target features that have the broadest range of impact. It's possible we missed or incorrectly stated something, so please provide feedback if we did so. (Note I have not added in all of the links etc. to the Markdown yet, as I want to wait for the first pass of feedback to come through). **Please provide feedback by August 26, 12:00 UTC**. After that point, we need to freeze all changes so we can begin the release announcement translation effort. Thanks, Jonathan
Attachment
>>>>> Additionally, this release adds a new field to the pg_stat_all_tables view, capturing a timestamp representing when a table or index was last scanned. PostgreSQL also makes auto_explain more readable by logging values passed into parameterized statements, and improves accuracy of pg_stat_activity’s normalization algorithm. >>>>> I am not sure if it's "capturing a timestamp representing" or "capturing the timestamp representing". "pg_stat_activity’s normalization algorithm", I think you are referring to "pg_stat_statements"?
>>> PostgreSQL 16 improves the performance of existing PostgreSQL functionality through new query planner optimizations. In this latest release, the query planner can parallelize `FULL` and `RIGHT` joins, utilize incremental sorts for `SELECT DISTINCT` queries, and execute window functions more efficiently. It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify rows not present in a joined table. >>> I think "utilize incremental sorts is for" something like select my_avg(distinct one),my_sum(one) from (values(1),(3)) t(one); so it's not the same as `SELECT DISTINCT` queries? ref: https://git.postgresql.org/cgit/postgresql.git/commit/?id=1349d2790bf48a4de072931c722f39337e72055e also <<<< "the query planner ....., and execute window functions more efficiently." since the query planner doesn't execute anything. probably "and optimize window functions execution"?
On Wed, 23 Aug 2023 at 22:21, jian he <jian.universality@gmail.com> wrote: > > >>> > PostgreSQL 16 improves the performance of existing PostgreSQL functionality > through new query planner optimizations. In this latest release, the query > planner can parallelize `FULL` and `RIGHT` joins, utilize incremental sorts for > `SELECT DISTINCT` queries, and execute window functions more efficiently. It > also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify > rows not present in a joined table. > >>> > > I think "utilize incremental sorts is for" something like select > my_avg(distinct one),my_sum(one) from (values(1),(3)) t(one); > so it's not the same as `SELECT DISTINCT` queries? > ref: https://git.postgresql.org/cgit/postgresql.git/commit/?id=1349d2790bf48a4de072931c722f39337e72055e The incremental sorts for DISTINCT will likely be a reference to 3c6fc5820, so, not the same thing as 1349d2790. I don't see anything there relating to 1349d2790. > also > <<<< "the query planner ....., and execute window functions more efficiently." > since the query planner doesn't execute anything. probably "and > optimize window functions execution"? Yeah, that or "and optimize window functions so they execute more efficiently" is likely an improvement there. David
On 8/23/23 8:02 AM, David Rowley wrote: > On Wed, 23 Aug 2023 at 22:21, jian he <jian.universality@gmail.com> wrote: >> >>>>> >> PostgreSQL 16 improves the performance of existing PostgreSQL functionality >> through new query planner optimizations. In this latest release, the query >> planner can parallelize `FULL` and `RIGHT` joins, utilize incremental sorts for >> `SELECT DISTINCT` queries, and execute window functions more efficiently. It >> also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify >> rows not present in a joined table. >>>>> >> >> I think "utilize incremental sorts is for" something like select >> my_avg(distinct one),my_sum(one) from (values(1),(3)) t(one); >> so it's not the same as `SELECT DISTINCT` queries? >> ref: https://git.postgresql.org/cgit/postgresql.git/commit/?id=1349d2790bf48a4de072931c722f39337e72055e > > The incremental sorts for DISTINCT will likely be a reference to > 3c6fc5820, so, not the same thing as 1349d2790. I don't see anything > there relating to 1349d2790. We could add something about 1349d2790 -- do you have suggested wording? >> also >> <<<< "the query planner ....., and execute window functions more efficiently." >> since the query planner doesn't execute anything. probably "and >> optimize window functions execution"? > > Yeah, that or "and optimize window functions so they execute more > efficiently" is likely an improvement there. Modified. See updated announcement, with other incorporated changes. Reminder that the window to submit changes closes at **August 26, 12:00 UTC**. Thanks, Jonathan
Attachment
On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz <jkatz@postgresql.org> wrote: > We could add something about 1349d2790 -- do you have suggested wording? I think it's worth a mention. See the text added in square brackets below: PostgreSQL 16 improves the performance of existing PostgreSQL functionality through new query planner optimizations. In this latest release, the query planner can parallelize `FULL` and `RIGHT` joins, [generate more optimal plans for queries containing aggregate functions with a `DISTINCT` or `ORDER BY` clause,] utilize incremental sorts for `SELECT DISTINCT` queries, and optimize window function executions so they execute more efficiently. It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to identify rows not present in a joined table. Thanks David
On 8/23/23 5:07 PM, David Rowley wrote: > On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz <jkatz@postgresql.org> wrote: >> We could add something about 1349d2790 -- do you have suggested wording? > > I think it's worth a mention. See the text added in square brackets below: > > PostgreSQL 16 improves the performance of existing PostgreSQL functionality > through new query planner optimizations. In this latest release, the query > planner can parallelize `FULL` and `RIGHT` joins, [generate more > optimal plans for > queries containing aggregate functions with a `DISTINCT` or `ORDER BY` clause,] > utilize incremental sorts for `SELECT DISTINCT` queries, and optimize > window function > executions so they execute more efficiently. It also introduces > `RIGHT` and `OUTER` > "anti-joins", which enable users to identify rows not present in a joined table. I added this in mostly verbatim. I'm concerned the sentence is a bit long, but we could break it up into two: (1) with the new JOIN capabilities and (2) with the optimizations. Jonathan
Attachment
hi. Can you check my first email about "a" versus "the" and "pg_stat_activity". also: "including the `\bind` command, which allows users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind` to substitute the variables." The example actually is very hard to reproduce. (it's not that super intuitive). fail case: test16-# SELECT $1 + $2 \bind 1 2 test16-# ; a better example would be (e.g `SELECT $1 , $2`). The semicolon still needed to be in the next line.
Op 8/24/23 om 16:32 schreef Jonathan S. Katz: > On 8/23/23 5:07 PM, David Rowley wrote: >> On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz <jkatz@postgresql.org> >> wrote: Hi, When v15 docs have: "27.2.7. Cascading Replication The cascading replication feature allows a standby server to accept replication connections and stream WAL records to other standbys, acting as a relay. This can be used to reduce the number of direct connections to the primary and also to minimize inter-site bandwidth overheads." why then, in the release draft, is that capability mentioned as something that is new for v16? " In PostgreSQL 16, users can perform logical decoding from a standby instance, meaning a standby can publish logical changes to other servers. " Is there a difference between the two? Thanks, Erik
On 2023-Aug-24, Jonathan S. Katz wrote: > ### Performance Improvements > > PostgreSQL 16 improves the performance of existing PostgreSQL functionality > through new query planner optimizations. In this latest release, the query > planner can parallelize `FULL` and `RIGHT` joins, generate better optimized > plans for queries that use aggregate functions (e.g. `count`) with a `DISTINCT` > or `ORDER BY` clause, utilize incremental sorts for `SELECT DISTINCT` queries, > and optimize window function executions so they execute more efficiently. "optimize window function executions so that they execute blah" sounds redundant and strange. Maybe just "optimize execution of window functions" is sufficient? Also, using "e.g." there looks somewhat out of place; maybe "(such as `count`)" is a good replacement? > It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to > identify rows not present in a joined table. Wait. Are you saying we didn't have those already? Looking at release-16.sgml I think this refers to commit 16dc2703c541, which means this made them more efficient rather than invented them. > This release includes improvements for bulk loading using `COPY` in both single > and concurrent operations, with tests showing up to a 300% performance > improvement in some cases. PostgreSQL adds support for load balancing in clients PostgreSQL 16 > that use `libpq`, and improvements to vacuum strategy that reduce the necessity > of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration > using `SIMD` in both x86 and ARM architectures, resulting in performance gains > when processing ASCII and JSON strings, and performing array and subtransaction > searches. > > ### Logical replication > > Logical replication lets PostgreSQL users stream data to other PostgreSQL "L.R. in PostgreSQL lets users"? > instances or subscribers that can interpret the PostgreSQL logical replication > protocol. In PostgreSQL 16, users can perform logical decoding from a standby s/decoding/replication/ ? (It seems odd to use "decoding" when the previous sentence used "replication") > instance, meaning a standby can publish logical changes to other servers. This > provides developers with new workload distribution options – for example, using > a standby rather than the busier primary to logically replicate changes to > downstream systems. > > Additionally, there are several performance improvements in PostgreSQL 16 to > logical replication. Subscribers can now apply large transactions using parallel > workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree "a primary key", no caps. > indexes instead of sequential scans to find rows. Under certain conditions, > users can also speed up initial table synchronization using the binary format. > > There are several access control improvements to logical replication in > PostgreSQL 16, including the new predefined role pg_create_subscription, which > grants users the ability to create a new logical subscription. Finally, this > release begins adding support for bidirectional logical replication, introducing > functionality to replicate data between two tables from different publishers. "to create a new logical subscription" -> "to create new logical subscriptions" > ### Developer Experience > > PostgreSQL 16 adds more syntax from the SQL/JSON standard, including > constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and > `IS JSON`. This release also introduces the ability to use underscores for > thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such > as `0x1538`, `0o12470`, and `0b1010100111000`. > > Developers using PostgreSQL 16 will also benefit from the addition of multiple > commands to `psql` client protocol, including the `\bind` command, which allows > users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind` > to substitute the variables. This paragraph sounds a bit suspicious. What do you mean with "multiple commands to psql client protocol"? Also, I think "to execute parameterized queries" should be "to prepare parameterized queries", and later "then use \bind to execute the query substituting the variables". > ### Monitoring > > A key aspect of tuning the performance of database workloads is understanding > the impact of your I/O operations on your system. PostgreSQL 16 helps simplify > how you can analyze this data with the new pg_stat_io view, which tracks key I/O > statistics such as shared_buffer hits and I/O latency. Hmm, I think what pg_stat_io gives you is data which wasn't available previously at all. Maybe do something like "Pg 16 introduces pg_stat_io, a new source of key I/O metrics that can be used for more fine grained something something". > Additionally, this release adds a new field to the `pg_stat_all_tables` view > that records a timestamp representing when a table or index was last scanned. > PostgreSQL also makes auto_explain more readable by logging values passed into PostgreSQL 16 > parameterized statements, and improves accuracy of pg_stat_activity's > normalization algorithm. I think jian already mentioned that this refers to pg_stat_statement query fingerprinting. I know that the query_id also appears in pg_stat_activity, but that is much newer, and it's not permanent there like in pss. Maybe it should be "of the query fingerprinting algorithm used by pg_stat_statement and pg_stat_activity". > ## Images and Logos > > Postgres, PostgreSQL, and the Elephant Logo (Slonik) are all registered > trademarks of the [PostgreSQL Community Association of Canada](https://www.postgres.ca). Isn't this just the "PostgreSQL Community Association", no Canada? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Ellos andaban todos desnudos como su madre los parió, y también las mujeres, aunque no vi más que una, harto moza, y todos los que yo vi eran todos mancebos, que ninguno vi de edad de más de XXX años" (Cristóbal Colón)
On 8/24/23 11:16 AM, jian he wrote: > hi. Can you check my first email about "a" versus "the" and "pg_stat_activity". I did when you first sent it, and did not make any changes. > also: > "including the `\bind` command, which allows > users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind` > to substitute the variables." > > The example actually is very hard to reproduce. (it's not that super intuitive). > fail case: > test16-# SELECT $1 + $2 \bind 1 2 > test16-# ; > > a better example would be (e.g `SELECT $1 , $2`). > The semicolon still needed to be in the next line. I agree with updating the example, I'd propose: SELECT $1::int + $2::int \bind 1 2 \g which mirrors what's in the docs[1] Thanks, Jonathan [1] https://www.postgresql.org/docs/16/app-psql.html#APP-PSQL-META-COMMAND-BIND
Attachment
On 2023-08-24 11:23, Jonathan S. Katz wrote: > > SELECT $1::int + $2::int \bind 1 2 \g One cast also works, letting type inference figure out the other. So if I say SELECT $1::int + $2 \gdesc it tells me the result will be int. That made me wonder if there is a \gdesc variant to issue the "statement variant" Describe message and show what the parameter types have been inferred to be. If there's not, obviously it won't be in 16, but it might be useful. Regards, -Chap
> Postgres, PostgreSQL, and the Elephant Logo (Slonik) are all registered
> trademarks of the [PostgreSQL Community Association of Canada](https://www.postgres.ca).
Isn't this just the "PostgreSQL Community Association", no Canada?
Certainly confusing from the website, but in the about section is this
"PostgreSQL Community Association is a trade or business name of the PostgreSQL Community Association of Canada."
Dave
On 8/24/23 12:54 PM, Dave Cramer wrote: > > > > > Postgres, PostgreSQL, and the Elephant Logo (Slonik) are all > registered > > trademarks of the [PostgreSQL Community Association of > Canada](https://www.postgres.ca <https://www.postgres.ca>). > > Isn't this just the "PostgreSQL Community Association", no Canada? > > > Certainly confusing from the website, but in the about section is this > "PostgreSQL Community Association is a trade or business name of the > PostgreSQL Community Association of Canada." This was something I missed when reviewing the fulltext, and went ahead and fixed it. Thanks, Jonathan
Attachment
On 8/24/23 11:17 AM, Erik Rijkers wrote: > Op 8/24/23 om 16:32 schreef Jonathan S. Katz: >> On 8/23/23 5:07 PM, David Rowley wrote: >>> On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz <jkatz@postgresql.org> >>> wrote: > > Hi, > > When v15 docs have: > > "27.2.7. Cascading Replication > The cascading replication feature allows a standby server to accept > replication connections and stream WAL records to other standbys, acting > as a relay. This can be used to reduce the number of direct connections > to the primary and also to minimize inter-site bandwidth overheads." > > why then, in the release draft, is that capability mentioned as > something that is new for v16? > " > In PostgreSQL 16, users can perform logical decoding from a standby > instance, meaning a standby can publish logical changes to other servers. > " > > Is there a difference between the two? Yes. Those docs refer to **physical** replication, where a standby can continue to replicate WAL records to other standbys. In v16, standbys can now publish changes over **logical** replication. Thanks, Jonathan
Attachment
On 8/24/23 11:19 AM, Alvaro Herrera wrote: > On 2023-Aug-24, Jonathan S. Katz wrote: > >> ### Performance Improvements >> >> PostgreSQL 16 improves the performance of existing PostgreSQL functionality >> through new query planner optimizations. In this latest release, the query >> planner can parallelize `FULL` and `RIGHT` joins, generate better optimized >> plans for queries that use aggregate functions (e.g. `count`) with a `DISTINCT` >> or `ORDER BY` clause, utilize incremental sorts for `SELECT DISTINCT` queries, >> and optimize window function executions so they execute more efficiently. > > "optimize window function executions so that they execute blah" sounds > redundant and strange. Maybe just "optimize execution of window > functions" is sufficient? Also, using "e.g." there looks somewhat out > of place; maybe "(such as `count`)" is a good replacement? > >> It also introduces `RIGHT` and `OUTER` "anti-joins", which enable users to >> identify rows not present in a joined table. > > Wait. Are you saying we didn't have those already? Looking at > release-16.sgml I think this refers to commit 16dc2703c541, which means > this made them more efficient rather than invented them. > > >> This release includes improvements for bulk loading using `COPY` in both single >> and concurrent operations, with tests showing up to a 300% performance >> improvement in some cases. PostgreSQL adds support for load balancing in clients > > PostgreSQL 16 > >> that use `libpq`, and improvements to vacuum strategy that reduce the necessity >> of full-table freezes. Additionally, PostgreSQL 16 introduces CPU acceleration >> using `SIMD` in both x86 and ARM architectures, resulting in performance gains >> when processing ASCII and JSON strings, and performing array and subtransaction >> searches. >> >> ### Logical replication >> >> Logical replication lets PostgreSQL users stream data to other PostgreSQL > > "L.R. in PostgreSQL lets users"? > >> instances or subscribers that can interpret the PostgreSQL logical replication >> protocol. In PostgreSQL 16, users can perform logical decoding from a standby > > s/decoding/replication/ ? (It seems odd to use "decoding" when the > previous sentence used "replication") > >> instance, meaning a standby can publish logical changes to other servers. This >> provides developers with new workload distribution options – for example, using >> a standby rather than the busier primary to logically replicate changes to >> downstream systems. >> >> Additionally, there are several performance improvements in PostgreSQL 16 to >> logical replication. Subscribers can now apply large transactions using parallel >> workers. For tables that do not have a `PRIMARY KEY`, subscribers can use B-tree > > "a primary key", no caps. > >> indexes instead of sequential scans to find rows. Under certain conditions, >> users can also speed up initial table synchronization using the binary format. >> >> There are several access control improvements to logical replication in >> PostgreSQL 16, including the new predefined role pg_create_subscription, which >> grants users the ability to create a new logical subscription. Finally, this >> release begins adding support for bidirectional logical replication, introducing >> functionality to replicate data between two tables from different publishers. > > "to create a new logical subscription" -> "to create new logical subscriptions" > >> ### Developer Experience >> >> PostgreSQL 16 adds more syntax from the SQL/JSON standard, including >> constructors and predicates such as `JSON_ARRAY()`, `JSON_ARRAYAGG()`, and >> `IS JSON`. This release also introduces the ability to use underscores for >> thousands separators (e.g. `5_432_000`) and non-decimal integer literals, such >> as `0x1538`, `0o12470`, and `0b1010100111000`. >> >> Developers using PostgreSQL 16 will also benefit from the addition of multiple >> commands to `psql` client protocol, including the `\bind` command, which allows >> users to execute parameterized queries (e.g `SELECT $1 + $2`) then use `\bind` >> to substitute the variables. > > This paragraph sounds a bit suspicious. What do you mean with "multiple > commands to psql client protocol"? Also, I think "to execute parameterized > queries" should be "to prepare parameterized queries", and later "then > use \bind to execute the query substituting the variables". > > > >> ### Monitoring >> >> A key aspect of tuning the performance of database workloads is understanding >> the impact of your I/O operations on your system. PostgreSQL 16 helps simplify >> how you can analyze this data with the new pg_stat_io view, which tracks key I/O >> statistics such as shared_buffer hits and I/O latency. > > Hmm, I think what pg_stat_io gives you is data which wasn't available > previously at all. Maybe do something like "Pg 16 introduces > pg_stat_io, a new source of key I/O metrics that can be used for more > fine grained something something". > >> Additionally, this release adds a new field to the `pg_stat_all_tables` view >> that records a timestamp representing when a table or index was last scanned. >> PostgreSQL also makes auto_explain more readable by logging values passed into > > PostgreSQL 16 > >> parameterized statements, and improves accuracy of pg_stat_activity's >> normalization algorithm. > > I think jian already mentioned that this refers to pg_stat_statement > query fingerprinting. I know that the query_id also appears in > pg_stat_activity, but that is much newer, and it's not permanent there > like in pss. Maybe it should be "of the query fingerprinting algorithm > used by pg_stat_statement and pg_stat_activity". > >> ## Images and Logos >> >> Postgres, PostgreSQL, and the Elephant Logo (Slonik) are all registered >> trademarks of the [PostgreSQL Community Association of Canada](https://www.postgres.ca). > > Isn't this just the "PostgreSQL Community Association", no Canada? Thanks for the feedback. I accepted most of the changes. Please see revised text here, which also includes the URL substitutions. Jonathan
Attachment
Op 8/26/23 om 04:51 schreef Jonathan S. Katz: > On 8/24/23 11:17 AM, Erik Rijkers wrote: >> Op 8/24/23 om 16:32 schreef Jonathan S. Katz: >>> On 8/23/23 5:07 PM, David Rowley wrote: >>>> On Thu, 24 Aug 2023 at 05:55, Jonathan S. Katz >>>> <jkatz@postgresql.org> wrote: >> >> Hi, >> >> When v15 docs have: >> >> "27.2.7. Cascading Replication >> The cascading replication feature allows a standby server to accept >> replication connections and stream WAL records to other standbys, >> acting as a relay. This can be used to reduce the number of direct >> connections to the primary and also to minimize inter-site bandwidth >> overheads." >> >> why then, in the release draft, is that capability mentioned as >> something that is new for v16? >> " >> In PostgreSQL 16, users can perform logical decoding from a standby >> instance, meaning a standby can publish logical changes to other servers. >> " >> >> Is there a difference between the two? > > Yes. Those docs refer to **physical** replication, where a standby can > continue to replicate WAL records to other standbys. In v16, standbys > can now publish changes over **logical** replication. Well, I must assume you are right. But why is the attached program, running 3 cascading v15 servers, showing 'logical' in the middle server's (port 6526) pg_replication_slots.slot_type ? Surely that is not physical but logical replication? port | svn | slot_name | slot_type ------+--------+--------------------+----------- 6526 | 150003 | pub_6527_from_6526 | logical <-- (1 row) I must be confused -- I will be thankful for enlightenment. Erik > Thanks, > > Jonathan >