Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch) - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)
Date
Msg-id CAFiTN-tFpy9y5Sv0uLpoT8FH7wgwBe56Tsi7_YON4MBsc1UGzQ@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Apr 2, 2020 at 9:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Apr 2, 2020 at 8:34 AM Peter Geoghegan <pg@bowt.ie> wrote:
> >
> > On Wed, Apr 1, 2020 at 7:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Peter, Is this behavior expected?
> > >
> > > Let me summarize the situation so that it would be easier for Peter to
> > > comment.  Julien has noticed that parallel vacuum and parallel create
> > > index doesn't seem to report correct values for buffer usage stats.
> > > Sawada-San wrote a patch to fix the problem for both the cases.  We
> > > expect that 'total_read_blks' as reported in pg_stat_statements should
> > > give the same value for parallel and non-parallel operations.  We see
> > > that is true for parallel vacuum and previously we have the same
> > > observation for the parallel query. Now, for parallel create index
> > > this doesn't seem to be true as test results by Dilip show that.  We
> > > have two possibilities here (a) there is some bug in Sawada-San's
> > > patch or (b) this is expected behavior for parallel create index.
> > > What do you think?
> >
> > nbtree CREATE INDEX doesn't even go through the buffer manager.
>
> Thanks for clarifying.  So IIUC, it will not go through the buffer
> manager for the index pages,  but for the heap pages, it will still go
> through the buffer manager.
>
> > The
> > difference that Dilip showed is probably due to extra catalog accesses
> > in the two parallel workers -- pg_amproc lookups, and the like. Those
> > are rather small differences, overall.
>
> > Can Dilip demonstrate the the "extra" buffer accesses are
> > proportionate to the number of workers launched in some constant,
> > predictable way?
>
> Okay, I will test this.

0-worker
           query             | total_time  | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written

------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
 CREATE INDEX idx1 on test(a) | 1228.895057 |            8947 |
       11 |            8971 |                   5 |
0

1-worker
            query             | total_time  | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written

------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
 CREATE INDEX idx1 on test(a) | 1006.157231 |            8962 |
       12 |            8974 |                   5 |
0

2-workers
            query             | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written

------------------------------+------------+-----------------+------------------+-----------------+---------------------+---------------------
 CREATE INDEX idx1 on test(a) |  949.44663 |            8965 |
      12 |            8977 |                   5 |                   0

3-workers
            query             | total_time  | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written

------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
 CREATE INDEX idx1 on test(a) | 1037.297196 |            8968 |
       12 |            8980 |                   5 |
0

4-workers
            query             | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written

------------------------------+------------+-----------------+------------------+-----------------+---------------------+---------------------
 CREATE INDEX idx1 on test(a) | 889.332782 |            8971 |
      12 |            8983 |                   6 |                   0

You are right, it is increasing with some constant factor.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: User Interface for WAL usage data
Next
From: Andrey Lepikhov
Date:
Subject: Re: NOT IN subquery optimization