Re: Parallel Inserts in CREATE TABLE AS - Mailing list pgsql-hackers

From: Bharath Rupireddy
Subject: Re: Parallel Inserts in CREATE TABLE AS
Msg-id: CALj2ACXSQJXDjyPui0nmOH+fHNKzEVi3WJ+H34k_rjyQMVgovQ@mail.gmail.com
In response to: Re: Parallel Inserts in CREATE TABLE AS (vignesh C <vignesh21@gmail.com>)
List: pgsql-hackers

On Wed, Dec 30, 2020 at 5:26 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Dec 30, 2020 at 10:47 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Wed, Dec 30, 2020 at 10:32 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > I have completed reviewing 0001, I don't have more comments, just one
> > > question.  Soon I will review the remaining patches.
> >
> > Thanks.
> >
> > > +    /* If parallel inserts are to be allowed, set a few extra information. */
> > > +    if (myState->is_parallel)
> > > +    {
> > > +        myState->object_id = intoRelationAddr.objectId;
> > > +
> > > +        /*
> > > +         * We don't need to skip contacting FSM while inserting tuples for
> > > +         * parallel mode, while extending the relations, workers instead of
> > > +         * blocking on a page while another worker is inserting, can check the
> > > +         * FSM for another page that can accommodate the tuples. This results
> > > +         * in major benefit for parallel inserts.
> > > +         */
> > > +        myState->ti_options = 0;
> > >
> > > Is there any performance data for this or just theoretical analysis?
> >
> > I have seen that we don't get much performance with the skip fsm
> > option, though I don't have the data to back it up. I'm planning to
> > run performance tests after the patches 0001, 0002 and 0003 get
> > reviewed. I will capture the data at that time. Hope that's fine.
> >
>
> When you run the performance tests, you can try to capture and publish
> relation size & the number of pages that are getting created for base
> table and the CTAS table, you can use something like SELECT relpages
> FROM pg_class WHERE relname = 'tablename' and SELECT
> pg_total_relation_size('tablename'). Just to make sure that there is
> no significant difference between the base table and CTAS table.

I can do that. I'm sure the number of pages will be equal or slightly
higher, since I observed this for parallel copy.
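
For reference, the check suggested above could look roughly like the
sketch below. The table names base_tbl and ctas_tbl are placeholders I
made up for illustration, not names from the patch or the tests. Since
relpages is only an estimate that VACUUM/ANALYZE maintain, the sketch
runs ANALYZE first and also derives the page count from the main fork
size, assuming both tables are plain heap tables with names unique
across schemas.

```sql
-- Hypothetical names: base_tbl is the source table, ctas_tbl is the
-- table created by CREATE TABLE AS from it.

-- relpages is an estimate updated by VACUUM/ANALYZE, so refresh it first.
ANALYZE base_tbl;
ANALYZE ctas_tbl;

SELECT c.relname,
       c.relpages,                                   -- planner's page estimate
       pg_relation_size(c.oid::regclass)
           / current_setting('block_size')::int      AS actual_pages,
       pg_total_relation_size(c.oid::regclass)       AS total_bytes
FROM pg_class c
WHERE c.relname IN ('base_tbl', 'ctas_tbl')
  AND c.relkind = 'r'                                -- ordinary tables only
ORDER BY c.relname;
```

Comparing the two rows side by side should show whether the CTAS table
ends up with noticeably more pages than the base table.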

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


