Re: Pg 16: will pg_dump & pg_restore be faster? - Mailing list pgsql-general

From Bruce Momjian
Subject Re: Pg 16: will pg_dump & pg_restore be faster?
Msg-id ZHafJF3FadgFr/5A@momjian.us
In response to Re: Pg 16: will pg_dump & pg_restore be faster?  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-general
On Wed, May 31, 2023 at 09:14:20AM +1200, David Rowley wrote:
> On Wed, 31 May 2023 at 08:54, Ron <ronljohnsonjr@gmail.com> wrote:
> > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/
> > says "PostgreSQL 16 can also improve the performance of concurrent bulk
> > loading of data using COPY up to 300%."
> >
> > Since pg_dump & pg_restore use COPY (or something very similar), will the
> > speed increase translate to higher speeds for those utilities?
> 
> I think the improvements to relation extension only help when multiple
> backends need to extend the relation at the same time.  pg_restore can
> have multiple workers, but the tasks that each worker performs are
> only divided as far as an entire table, i.e. 2 workers will never be
> working on the same table at the same time. So there is no concurrency
> in terms of 2 or more workers working on loading data into the same
> table at the same time.
> 
> It might be an interesting project now that we have TidRange scans, to
> have pg_dump split larger tables into chunks so that they can be
> restored in parallel.

Uh, the release notes say:

    <!--
    Author: Andres Freund <andres@anarazel.de>
    2023-04-06 [00d1e02be] hio: Use ExtendBufferedRelBy() to extend tables more eff
    Author: Andres Freund <andres@anarazel.de>
    2023-04-06 [26158b852] Use ExtendBufferedRelTo() in XLogReadBufferExtended()
    -->
    
    <listitem>
    <para>
    Allow more efficient addition of heap and index pages (Andres Freund)
    </para>
    </listitem>

There is no mention of concurrency being a requirement.  Is it wrong?  I
think there was a question of whether you had to add _multiple_ blocks
to get a benefit, not whether concurrency was needed.  This email about the
release notes didn't mention a concurrency requirement:

    https://www.postgresql.org/message-id/20230521171341.jjxykfsefsek4kzj%40awork3.anarazel.de


    While the case of extending by multiple pages improved the most, even
    extending by a single page at a time got a good bit more scalable. Maybe
    just "Improve efficiency of extending relations"?
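
For reference, the chunking idea David mentions above (splitting a large
table by TID range so the pieces could be loaded in parallel) might be
sketched roughly as follows.  This is only an illustration, not something
pg_dump does today: the function name and its parameters are hypothetical,
the block count is assumed to come from something like pg_class.relpages
(which is only an estimate), and the table name is assumed to be a trusted,
already-quoted identifier.

```python
def ctid_chunk_queries(table, total_blocks, chunks):
    """Build COPY statements that each read one contiguous block range.

    Hypothetical helper: `table` must already be a safely quoted
    identifier, and `total_blocks` is an assumed block count for it.
    Returns at most `chunks` statements.
    """
    if total_blocks <= 0 or chunks <= 0:
        return []

    chunks = min(chunks, total_blocks)
    step = -(-total_blocks // chunks)  # ceiling division: blocks per chunk

    queries = []
    for start in range(0, total_blocks, step):
        end = start + step
        if end >= total_blocks:
            # Leave the last range open-ended so rows in blocks beyond the
            # (possibly stale) block count are still included.
            where = f"ctid >= '({start},0)'::tid"
        else:
            where = f"ctid >= '({start},0)'::tid AND ctid < '({end},0)'::tid"
        queries.append(f"COPY (SELECT * FROM {table} WHERE {where}) TO STDOUT")
    return queries


if __name__ == "__main__":
    for q in ctid_chunk_queries("public.big_table", 10, 3):
        print(q)
```

Each statement could then be run in its own session.  The sessions would
presumably need to share one snapshot (for example via pg_export_snapshot())
for the chunks to be mutually consistent; that part is not shown here.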

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.