Re: Pg 16: will pg_dump & pg_restore be faster? - Mailing list pgsql-general

From Jonathan S. Katz
Subject Re: Pg 16: will pg_dump & pg_restore be faster?
Msg-id d966a173-d66c-c873-a154-50ab822ea933@postgresql.org
In response to Re: Pg 16: will pg_dump & pg_restore be faster?  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Pg 16: will pg_dump & pg_restore be faster?
List pgsql-general
On 5/30/23 10:05 PM, David Rowley wrote:

> My understanding had been that concurrency was required, but I see the
> commit message for 00d1e02be mentions:
> 
>> Even single threaded
>> COPY is measurably faster, primarily due to not dirtying pages while
>> extending, if supported by the operating system (see commit 4d330a61bb1).
> 
> If that's the case then maybe the beta release notes could be edited
> slightly to reflect this. Maybe something like:
> 
> "Relation extensions have been improved allowing faster bulk loading
> of data using COPY. These improvements are more significant when
> multiple processes are concurrently loading data into the same table."
> 
> The current text of "PostgreSQL 16 can also improve the performance of
> concurrent bulk loading of data using COPY up to 300%." does lead me
> to believe that nothing has been done to improve things when only a
> single backend is involved.

Typically once a release announcement is out, we'll only edit it if it's 
inaccurate. I don't think the statement in the release announcement is 
inaccurate, as it specifies that concurrent bulk loading is faster.

I had based the description on what Andres described in the original
discussion and on reading [1], which showed a "measurable"
improvement, as the commit message said, but not to the same
degree as with concurrent loading. It does still seem impactful -- the
results show up to 20% improvement on a single backend -- but the bigger
story was around the concurrency.

I'm -0.5 for revising the announcement, but I also don't want people to 
miss out on testing this. I'd be OK with this:

"PostgreSQL 16 can also improve the performance of bulk loading of data,
with some tests showing up to a 300% improvement when concurrently
executing `COPY` commands."

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de

