Thread: WAL usage calculation patch
Hello pgsql-hackers,

Submitting a patch that would enable gathering of per-statement WAL generation statistics, similar to how it is done for buffer usage. It collects the number of records added to WAL and the number of WAL bytes written.

The data collected was found valuable for analyzing update-heavy load, with WAL generation being the bottleneck.

The usage data is collected at a low level, after compression is done on the WAL record. The data is then exposed via pg_stat_statements, and could also be used in EXPLAIN ANALYZE if needed. Instrumentation is similar to the one used for buffer stats. I didn't dare to unify both usage metric sets into a single struct, nor rework the way both are passed to parallel workers.

Performance impact is (supposed to be) very low, essentially adding two int operations and a memory access on WAL record insert. Additional effort is needed to allocate a shmem chunk for parallel workers; parallel worker shmem usage is increased to fit in a struct of two longs.

The patch is separated in two parts: core changes and pg_stat_statements additions. Essentially, the extension has its schema updated to allow two more fields, and the docs are updated to reflect the change. The patch is prepared against the master branch.

Please provide your comments and/or code findings.
Attachment
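For reference, a minimal sketch of the kind of per-backend counter described above, mirroring how BufferUsage works; the struct and field names here are assumptions based on the cover letter ("a struct of two longs"), not necessarily the names used in the attached patch.

typedef struct WalUsage
{
    long    wal_records;    /* number of WAL records inserted */
    long    wal_bytes;      /* total size of inserted WAL records, in bytes */
} WalUsage;

extern WalUsage pgWalUsage;     /* accumulated by the current backend */

/*
 * Bumped once per record on WAL insert, i.e. roughly two integer operations
 * and a memory access near the end of XLogInsertRecord(), after the record
 * has been assembled (and compressed, if applicable):
 *
 *     pgWalUsage.wal_records++;
 *     pgWalUsage.wal_bytes += rechdr->xl_tot_len;
 */

Parallel workers would accumulate into their own pgWalUsage and hand the totals back to the leader through a small shared-memory chunk, the same way buffer usage is handled today.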
On Wed, 5 Feb 2020 at 21:36, Kirill Bychik <kirill.bychik@gmail.com> wrote: > > Hello pgsql-hackers, > > Submitting a patch that would enable gathering of per-statement WAL > generation statistics, similar to how it is done for buffer usage. > Collected is the number of records added to WAL and number of WAL > bytes written. > > The data collected was found valuable to analyze update-heavy load, > with WAL generation being the bottleneck. > > The usage data is collected at low level, after compression is done on > WAL record. Data is then exposed via pg_stat_statements, could also be > used in EXPLAIN ANALYZE if needed. Instrumentation is alike to the one > used for buffer stats. I didn't dare to unify both usage metric sets > into single struct, nor rework the way both are passed to parallel > workers. > > Performance impact is (supposed to be) very low, essentially adding > two int operations and memory access on WAL record insert. Additional > efforts to allocate shmem chunk for parallel workers. Parallel workers > shmem usage is increased to fir in a struct of two longs. > > Patch is separated in two parts: core changes and pg_stat_statements > additions. Essentially the extension has its schema updated to allow > two more fields, docs updated to reflect the change. Patch is prepared > against master branch. > > Please provide your comments and/or code findings. I like the concept, I'm a big fan of anything that affordably improves visibility into Pg's I/O and activity. To date I've been relying on tools like systemtap to do this sort of thing. But that's a bit specialised, and Pg currently lacks useful instrumentation for it so it can be a pain to match up activity by parallel workers and that sort of thing. (I aim to find time to submit a patch for that.) I haven't yet reviewed the patch. -- Craig Ringer http://www.2ndQuadrant.com/ 2ndQuadrant - PostgreSQL Solutions for the Enterprise
On Mon, Feb 10, 2020 at 8:20 PM Craig Ringer <craig@2ndquadrant.com> wrote: > On Wed, 5 Feb 2020 at 21:36, Kirill Bychik <kirill.bychik@gmail.com> wrote: > > Patch is separated in two parts: core changes and pg_stat_statements > > additions. Essentially the extension has its schema updated to allow > > two more fields, docs updated to reflect the change. Patch is prepared > > against master branch. > > > > Please provide your comments and/or code findings. > > I like the concept, I'm a big fan of anything that affordably improves > visibility into Pg's I/O and activity. +1 > To date I've been relying on tools like systemtap to do this sort of > thing. But that's a bit specialised, and Pg currently lacks useful > instrumentation for it so it can be a pain to match up activity by > parallel workers and that sort of thing. (I aim to find time to submit > a patch for that.) (I'm interested in seeing your conference talk about that! I did a bunch of stuff with static probes to measure PHJ behaviour around barrier waits and so on but it was hard to figure out what stuff like that to put in the actual tree, it was all a bit use-once-to-test-a-theory-and-then-throw-away.) Kirill, I noticed that you included a regression test that is failing. Can this possibly be stable across machines or even on the same machine? Does it still pass for you or did something change on the master branch to add a new WAL record since you posted the patch? query | calls | rows | wal_write_bytes | wal_write_records -------------------------------------------+-------+------+-----------------+------------------- - CREATE INDEX test_b ON test(b) | 1 | 0 | 1673 | 16 - DROP FUNCTION IF EXISTS PLUS_ONE(INTEGER) | 1 | 0 | 56 | 1 + CREATE INDEX test_b ON test(b) | 1 | 0 | 1755 | 17 + DROP FUNCTION IF EXISTS PLUS_ONE(INTEGER) | 1 | 0 | 0 | 0
On Tue, 18 Feb 2020 at 06:23, Thomas Munro <thomas.munro@gmail.com> wrote: > On Mon, Feb 10, 2020 at 8:20 PM Craig Ringer <craig@2ndquadrant.com> wrote: > > On Wed, 5 Feb 2020 at 21:36, Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > Patch is separated in two parts: core changes and pg_stat_statements > > > additions. Essentially the extension has its schema updated to allow > > > two more fields, docs updated to reflect the change. Patch is prepared > > > against master branch. > > > > > > Please provide your comments and/or code findings. > > > > I like the concept, I'm a big fan of anything that affordably improves > > visibility into Pg's I/O and activity. > > +1 > > > To date I've been relying on tools like systemtap to do this sort of > > thing. But that's a bit specialised, and Pg currently lacks useful > > instrumentation for it so it can be a pain to match up activity by > > parallel workers and that sort of thing. (I aim to find time to submit > > a patch for that.) > > (I'm interested in seeing your conference talk about that! I did a > bunch of stuff with static probes to measure PHJ behaviour around > barrier waits and so on but it was hard to figure out what stuff like > that to put in the actual tree, it was all a bit > use-once-to-test-a-theory-and-then-throw-away.) > > Kirill, I noticed that you included a regression test that is failing. Can > this possibly be stable across machines or even on the same machine? > Does it still pass for you or did something change on the master > branch to add a new WAL record since you posted the patch? Thank you for testing the patch and running extension checks. I assume the patch applies without problems. As for the regression test, it apparently requires some rework. I didn't pay enough attention to make sure the data I check is actually meaningful and isolated enough to be repeatable. Please consider the extension part of the patch as WIP; I'll resubmit the patch once I get a stable and meaningful test up. Thanks for finding it! > query | calls | rows | wal_write_bytes | wal_write_records > -------------------------------------------+-------+------+-----------------+------------------- > - CREATE INDEX test_b ON test(b) | 1 | 0 | 1673 | > 16 > - DROP FUNCTION IF EXISTS PLUS_ONE(INTEGER) | 1 | 0 | 56 | > 1 > + CREATE INDEX test_b ON test(b) | 1 | 0 | 1755 | > 17 > + DROP FUNCTION IF EXISTS PLUS_ONE(INTEGER) | 1 | 0 | 0 | > 0
> On Tue, 18 Feb 2020 at 06:23, Thomas Munro <thomas.munro@gmail.com> wrote: > > On Mon, Feb 10, 2020 at 8:20 PM Craig Ringer <craig@2ndquadrant.com> wrote: > > > On Wed, 5 Feb 2020 at 21:36, Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > Patch is separated in two parts: core changes and pg_stat_statements > > > > additions. Essentially the extension has its schema updated to allow > > > > two more fields, docs updated to reflect the change. Patch is prepared > > > > against master branch. > > > > > > > > Please provide your comments and/or code findings. > > > > > > I like the concept, I'm a big fan of anything that affordably improves > > > visibility into Pg's I/O and activity. > > > > +1 > > > > > To date I've been relying on tools like systemtap to do this sort of > > > thing. But that's a bit specialised, and Pg currently lacks useful > > > instrumentation for it so it can be a pain to match up activity by > > > parallel workers and that sort of thing. (I aim to find time to submit > > > a patch for that.) > > > > (I'm interested in seeing your conference talk about that! I did a > > bunch of stuff with static probes to measure PHJ behaviour around > > barrier waits and so on but it was hard to figure out what stuff like > > that to put in the actual tree, it was all a bit > > use-once-to-test-a-theory-and-then-throw-away.) > > > > Kirill, I noticed that you included a regression test that is failing. Can > > this possibly be stable across machines or even on the same machine? > > Does it still pass for you or did something change on the master > > branch to add a new WAL record since you posted the patch? > > Thank you for testing the patch and running extension checks. I assume > the patch applies without problems. > > As for the regr test, it apparently requires some rework. I didn't pay > attention enough to make sure the data I check is actually meaningful > and isolated enough to be repeatable. > > Please consider the extension part of the patch as WIP, I'll resubmit > the patch once I get a stable and meanngful test up. Thanks for > finding it! > I have reworked the extension regression test to be more isolated. Apparently, something merged into the master branch shifted my numbers. PFA the new patch. The core part didn't change a bit; the extension part has its regression test SQL and expected log changed. Looking forward to new comments.
Attachment
On Thu, Feb 20, 2020 at 06:56:27PM +0300, Kirill Bychik wrote: > > On Tue, 18 Feb 2020 at 06:23, Thomas Munro <thomas.munro@gmail.com> wrote: > > > On Mon, Feb 10, 2020 at 8:20 PM Craig Ringer <craig@2ndquadrant.com> wrote: > > > > On Wed, 5 Feb 2020 at 21:36, Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > > Patch is separated in two parts: core changes and pg_stat_statements > > > > > additions. Essentially the extension has its schema updated to allow > > > > > two more fields, docs updated to reflect the change. Patch is prepared > > > > > against master branch. > > > > > > > > > > Please provide your comments and/or code findings. > > > > > > > > I like the concept, I'm a big fan of anything that affordably improves > > > > visibility into Pg's I/O and activity. > > > > > > +1 Huge +1 too. > > Thank you for testing the patch and running extension checks. I assume > > the patch applies without problems. > > > > As for the regr test, it apparently requires some rework. I didn't pay > > attention enough to make sure the data I check is actually meaningful > > and isolated enough to be repeatable. > > > > Please consider the extension part of the patch as WIP, I'll resubmit > > the patch once I get a stable and meanngful test up. Thanks for > > finding it! > > > > I have reworked the extension regression test to be more isolated. > Apparently, something merged into master branch shifted my numbers. > > PFA the new patch. Core part didn't change a bit, the extension part > has regression test SQL and expected log changed. I'm quite worried about the stability of those counters for regression tests. Wouldn't a checkpoint happening during the test change them? While at it, did you consider adding a full-page image counter in the WalUsage? That's something I'd really like to have and it doesn't seem hard to integrate. Another point is that this patch won't help to see autovacuum activity. As an example, I did a quick test to store the information in pgstat, sending the data in the PG_FINALLY part of vacuum(): rjuju=# create table t1(id integer, val text); CREATE TABLE rjuju=# insert into t1 select i, 'val ' || i from generate_series(1, 100000) i; INSERT 0 100000 rjuju=# vacuum t1; VACUUM rjuju=# select datname, vac_wal_records, vac_wal_bytes, autovac_wal_records, autovac_wal_bytes from pg_stat_database where datname = 'rjuju'; datname | vac_wal_records | vac_wal_bytes | autovac_wal_records | autovac_wal_bytes ---------+-----------------+---------------+---------------------+------------------- rjuju | 547 | 65201 | 0 | 0 (1 row) rjuju=# delete from t1 where id % 2 = 0; DELETE 50000 rjuju=# select pg_sleep(60); pg_sleep ---------- (1 row) rjuju=# select datname, vac_wal_records, vac_wal_bytes, autovac_wal_records, autovac_wal_bytes from pg_stat_database where datname = 'rjuju'; datname | vac_wal_records | vac_wal_bytes | autovac_wal_records | autovac_wal_bytes ---------+-----------------+---------------+---------------------+------------------- rjuju | 547 | 65201 | 1631 | 323193 (1 row) That seems like useful data (especially since I recently had to dig into a problematic WAL consumption issue that was due to some autovacuum activity), but that may seem strange to only account for (auto)vacuum activity, rather than globally, grouping per RmgrId or CommandTag for instance. We could then see the complete WAL usage per-database. What do you think?
Some minor points I noticed: - the extension patch doesn't apply anymore, I guess since 70a7732007bc4689 #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009) +#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE000000000000010) Shouldn't it be 0xA rather than 0x10? - it would be better to add a version number to the patches, so we're sure which one we're talking about.
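For illustration, here is roughly how the example above could be wired up: snapshot the backend-local counters at the start of vacuum() and report the difference per database in the PG_FINALLY block. This is only a sketch; pgstat_report_vacuum_wal() is a hypothetical helper standing in for whatever pgstat message would actually carry the values, and does not exist in the tree.

/* Hypothetical sketch of the approach described above; the reporting
 * function is made up for illustration. */
WalUsage    walusage_start = pgWalUsage;

PG_TRY();
{
    /* ... existing vacuum work ... */
}
PG_FINALLY();
{
    pgstat_report_vacuum_wal(MyDatabaseId,
                             IsAutoVacuumWorkerProcess(),
                             pgWalUsage.wal_records - walusage_start.wal_records,
                             pgWalUsage.wal_bytes - walusage_start.wal_bytes);
}
PG_END_TRY();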
On Wed, Mar 04, 2020 at 05:02:25PM +0100, Julien Rouhaud wrote: > I'm quite worried about the stability of those counters for regression tests. > Wouldn't a checkpoint happening during the test change them? Yep. One way to go through that would be to test if this output is non-zero, though I suspect at a quick glance that this won't be entirely reliable either. > While at it, did you consider adding a full-page image counter in the WalUsage? > That's something I'd really like to have and it doesn't seem hard to integrate. FWIW, one reason here is that we recently had some benchmark work done internally where this would have been helpful in studying some spiky WAL load patterns. -- Michael
Attachment
> I'm quite worried about the stability of those counters for regression tests. > Wouldn't a checkpoint happening during the test change them? Agree, stability of the test could be an issue; even a shift in the write format or compression method, or adding compatible changes, could break such a test. Frankly speaking, the expected numbers are not actually calculated; my logic was rather well described by "these numbers should be non-zero for real tables". I believe the test can be modified to check that the numbers are above zero, both for bytes written and for records stored. Having a checkpoint in the middle of the test can be almost 100% countered by triggering one before the test. I'll add a checkpoint call to the test scenario, if no objections here. > While at it, did you consider adding a full-page image counter in the WalUsage? > That's something I'd really like to have and it doesn't seem hard to integrate. Well, not sure I understand you 100%, being new to Postgres dev. Do you want a separate counter for pages written whenever doPageWrites is true? I can do that, if needed. Please confirm. > Another point is that this patch won't help to see autovacuum activity. > As an example, I did a quick te..... > ...LONG QUOTE... > but that may seem strange to only account for (auto)vacuum activity, rather > than globally, grouping per RmgrId or CommandTag for instance. We could then > see the complete WAL usage per-database. What do you think? I wanted to keep the patch small and simple, and fit to practical needs. This patch is supposed to provide tuning assistance, catching an IO-heavy query in a commit-bound situation. Total WAL usage per DB can be assessed rather easily using other means. Let's get this change into the codebase and then work on connecting WAL usage to (auto)vacuum stats. > > Some minor points I noticed: > > - the extension patch doesn't apply anymore, I guess since 70a7732007bc4689 Will fix, thank you. > > #define PARALLEL_KEY_JIT_INSTRUMENTATION UINT64CONST(0xE000000000000009) > +#define PARALLEL_KEY_WAL_USAGE UINT64CONST(0xE000000000000010) > > Shouldn't it be 0xA rather than 0x10? Oww, my bad, this is embarrassing! Will fix, thank you. > - it would be better to add a version number to the patches, so we're sure > which one we're talking about. Noted, thank you. Please comment on the proposed changes, I will cook up a new version once all are agreed upon.
On Thu, Mar 5, 2020 at 8:55 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > While at it, did you consider adding a full-page image counter in the WalUsage? > > That's something I'd really like to have and it doesn't seem hard to integrate. > > Well, not sure I understand you 100%, being new to Postgres dev. Do > you want a separate counter for pages written whenever doPageWrites is > true? I can do that, if needed. Please confirm. Yes, I meant a separate 3rd counter for the number of full page images written. However after a quick look I think that a FPI should be detected with (doPageWrites && fpw_lsn != InvalidXLogRecPtr && fpw_lsn <= RedoRecPtr). > > Another point is that this patch won't help to see autovacuum activity. > > As an example, I did a quick te..... > > ...LONG QUOTE... > > but that may seem strange to only account for (auto)vacuum activity, rather > > than globally, grouping per RmgrId or CommandTag for instance. We could then > > see the complete WAL usage per-database. What do you think? > > I wanted to keep the patch small and simple, and fit to practical > needs. This patch is supposed to provide tuning assistance, catching > an io heavy query in commit-bound situation. > Total WAL usage per DB can be assessed rather easily using other means. > Let's get this change into the codebase and then work on connecting > WAL usage to (auto)vacuum stats. I agree that having a view of the full activity is a way bigger scope, so it could be done later (and at this point in pg14), but I'm still hoping that we can get insight of other backend WAL activity, such as autovacuum, in pg13.
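To make that concrete, here is a rough sketch of where such a third counter could be bumped; this is not the patch as posted, and the exact placement is an assumption (the condition mentioned above is evaluated around XLogInsertRecord(), while the image itself is included during record assembly).

/* Sketch only: in XLogRecordAssemble(), a backup block image is included
 * when include_image is true, which is one natural place to count it.
 * The counter name is an assumption. */
if (include_image)
{
    /* ... existing code that adds the block image to the record ... */
    pgWalUsage.wal_fp_records++;
}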
On Fri, 6 Mar 2020 at 20:14, Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Thu, Mar 5, 2020 at 8:55 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > > While at it, did you consider adding a full-page image counter in the WalUsage? > > > That's something I'd really like to have and it doesn't seem hard to integrate. > > > > Well, not sure I understand you 100%, being new to Postgres dev. Do > > you want a separate counter for pages written whenever doPageWrites is > > true? I can do that, if needed. Please confirm. > > Yes, I meant a separate 3rd counter for the number of full page images > written. However after a quick look I think that a FPI should be > detected with (doPageWrites && fpw_lsn != InvalidXLogRecPtr && fpw_lsn > <= RedoRecPtr). This seems easy, will implement once I get some spare time. > > > Another point is that this patch won't help to see autovacuum activity. > > > As an example, I did a quick te..... > > > ...LONG QUOTE... > > > but that may seem strange to only account for (auto)vacuum activity, rather > > > than globally, grouping per RmgrId or CommandTag for instance. We could then > > > see the complete WAL usage per-database. What do you think? > > > > I wanted to keep the patch small and simple, and fit to practical > > needs. This patch is supposed to provide tuning assistance, catching > > an io heavy query in commit-bound situation. > > Total WAL usage per DB can be assessed rather easily using other means. > > Let's get this change into the codebase and then work on connecting > > WAL usage to (auto)vacuum stats. > > I agree that having a view of the full activity is a way bigger scope, > so it could be done later (and at this point in pg14), but I'm still > hoping that we can get insight of other backend WAL activity, such as > autovacuum, in pg13. How do you think this information should be exposed? Via pg_stat_statements? Anyways, I believe this change could be bigger than FPI. I propose to plan a separate patch for it, or even add it to the TODO after the core patch of WAL usage is merged. Please expect a new patch version next week, with FPI counters added.
On Fri, Mar 6, 2020 at 6:59 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > On Fri, 6 Mar 2020 at 20:14, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Thu, Mar 5, 2020 at 8:55 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > I wanted to keep the patch small and simple, and fit to practical > > > needs. This patch is supposed to provide tuning assistance, catching > > > an io heavy query in commit-bound situation. > > > Total WAL usage per DB can be assessed rather easily using other means. > > > Let's get this change into the codebase and then work on connecting > > > WAL usage to (auto)vacuum stats. > > > > I agree that having a view of the full activity is a way bigger scope, > > so it could be done later (and at this point in pg14), but I'm still > > hoping that we can get insight of other backend WAL activity, such as > > autovacuum, in pg13. > > How do you think this information should be exposed? Via the pg_stat_statement? That's unlikely, since autovacuum won't trigger any hook. I was thinking of some new view for pgstats, similarly to the example I showed previously. The implementation is straightforward, although pg_stat_database is maybe not the best choice here. > Anyways, I believe this change could be bigger than FPI. I propose to > plan a separate patch for it, or even add it to the TODO after the > core patch of wal usage is merged. Just in case, if the problem is a lack of time, I'd be happy to help on that if needed. Otherwise, I'll definitely not try to block any progress for the feature as proposed. > Please expect a new patch version next week, with FPI counters added. Thanks!
> > > On Thu, Mar 5, 2020 at 8:55 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > I wanted to keep the patch small and simple, and fit to practical > > > > needs. This patch is supposed to provide tuning assistance, catching > > > > an io heavy query in commit-bound situation. > > > > Total WAL usage per DB can be assessed rather easily using other means. > > > > Let's get this change into the codebase and then work on connecting > > > > WAL usage to (auto)vacuum stats. > > > > > > I agree that having a view of the full activity is a way bigger scope, > > > so it could be done later (and at this point in pg14), but I'm still > > > hoping that we can get insight of other backend WAL activity, such as > > > autovacuum, in pg13. > > > > How do you think this information should be exposed? Via the pg_stat_statement? > > > > That's unlikely, since autovacuum won't trigger any hook. I was > thinking on some new view for pgstats, similarly to the example I > showed previously. The implementation is straightforward, although > pg_stat_database is maybe not the best choice here. After extensive thinking and some code diving, I did not manage to come up with a sane idea on how to expose data about autovacuum WAL usage. Must be the flu. > > Anyways, I believe this change could be bigger than FPI. I propose to > > plan a separate patch for it, or even add it to the TODO after the > > core patch of wal usage is merged. > > Just in case, if the problem is a lack of time, I'd be happy to help > on that if needed. Otherwise, I'll definitely not try to block any > progress for the feature as proposed. Please feel free to work on any extension of this patch idea. I lack both time and knowledge to do it all by myself. > > Please expect a new patch version next week, with FPI counters added. Please find attached patch version 003, with FP writes and minor corrections. Hope I use attachment versioning as expected in this group :) The test has been reworked, and I believe the part which checks that WAL is written and that there is a correlation between affected rows and WAL records should be stable now. I still have no idea how to test full-page writes against regular updates; it seems very unstable. Please share ideas if any. Thanks!
Attachment
On Sun, Mar 15, 2020 at 09:52:18PM +0300, Kirill Bychik wrote: > > > > On Thu, Mar 5, 2020 at 8:55 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > After extensive thinking and some code diving, I did not manage to > come up with a sane idea on how to expose data about autovacuum WAL > usage. Must be the flu. > > > > Anyways, I believe this change could be bigger than FPI. I propose to > > > plan a separate patch for it, or even add it to the TODO after the > > > core patch of wal usage is merged. > > > > Just in case, if the problem is a lack of time, I'd be happy to help > > on that if needed. Otherwise, I'll definitely not try to block any > > progress for the feature as proposed. > > Please feel free to work on any extension of this patch idea. I lack > both time and knowledge to do it all by myself. I'm adding a 3rd patch on top of yours to expose the new WAL counters in pg_stat_database, for vacuum and autovacuum. I'm not really enthusiastic about this approach, but I didn't find a better one, and maybe this will raise some better ideas. The only sure thing is that we're not going to add a bunch of new fields in pg_stat_all_tables anyway. We can also drop this 3rd patch entirely if no one's happy about it, without impacting the first two. > > > Please expect a new patch version next week, with FPI counters added. > > Please find attached patch version 003, with FP writes and minor > corrections. Hope i use attachment versioning as expected in this > group :) Thanks! > Test had been reworked, and I believe it should be stable now, the > part which checks WAL is written and there is a correlation between > affected rows and WAL records. I still have no idea how to test > full-page writes against regular updates, it seems very unstable. > Please share ideas if any. I just reviewed the patches, and globally they look good to me. The way to detect full page images looks sensible, but I'm really not familiar with that code so additional review would be useful. I noticed that the new wal_write_fp_records field in pg_stat_statements wasn't used in the test. Since I have to add all the patches to make the cfbot happy, I slightly adapted the tests to reference the fp column too. There was also a minor issue in the documentation, as wal_records and wal_bytes were copy/pasted twice while wal_write_fp_records wasn't documented, so I also changed it. Let me know if you're ok with those changes.
Attachment
> > Please feel free to work on any extension of this patch idea. I lack > > both time and knowledge to do it all by myself. > > > I'm adding a 3rd patch on top of yours to expose the new WAL counters in > pg_stat_database, for vacuum and autovacuum. I'm not really enthiusiastic with > this approach but I didn't find better, and maybe this will raise some better > ideas. The only sure thing is that we're not going to add a bunch of new > fields in pg_stat_all_tables anyway. > > We can also drop this 3rd patch entirely if no one's happy about it without > impacting the first two. No objections about 3rd on my side, unless we miss the CF completely. As for the code, I believe: + walusage.wal_records = pgWalUsage.wal_records - + walusage_start.wal_records; + walusage.wal_fp_records = pgWalUsage.wal_fp_records - + walusage_start.wal_fp_records; + walusage.wal_bytes = pgWalUsage.wal_bytes - walusage_start.wal_bytes; Could be done much simpler via the utility: WalUsageAccumDiff(walusage, pgWalUsage, walusage_start); On a side note, I agree API to the buf/wal usage is far from perfect. > > Test had been reworked, and I believe it should be stable now, the > > part which checks WAL is written and there is a correlation between > > affected rows and WAL records. I still have no idea how to test > > full-page writes against regular updates, it seems very unstable. > > Please share ideas if any. > > > I just reviewed the patches, and it globally looks good to me. The way to > detect full page images looks sensible, but I'm really not familiar with that > code so additional review would be useful. > > I noticed that the new wal_write_fp_records field in pg_stat_statements wasn't > used in the test. Since I have to add all the patches to make the cfbot happy, > I slightly adapted the tests to reference the fp column too. There was also a > minor issue in the documentation, as wal_records and wal_bytes were copy/pasted > twice while wal_write_fp_records wasn't documented, so I also changed it. > > Let me know if you're ok with those changes. Sorry for not getting wal_fp_usage into the docs, my fault. As for the tests, please get somebody else to review this. I strongly believe checking full page writes here could be a source of instability.
On Tue, Mar 17, 2020 at 10:27:05PM +0300, Kirill Bychik wrote: > > > Please feel free to work on any extension of this patch idea. I lack > > > both time and knowledge to do it all by myself. > > > > I'm adding a 3rd patch on top of yours to expose the new WAL counters in > > pg_stat_database, for vacuum and autovacuum. I'm not really enthiusiastic with > > this approach but I didn't find better, and maybe this will raise some better > > ideas. The only sure thing is that we're not going to add a bunch of new > > fields in pg_stat_all_tables anyway. > > > > We can also drop this 3rd patch entirely if no one's happy about it without > > impacting the first two. > > No objections about 3rd on my side, unless we miss the CF completely. > > As for the code, I believe: > + walusage.wal_records = pgWalUsage.wal_records - > + walusage_start.wal_records; > + walusage.wal_fp_records = pgWalUsage.wal_fp_records - > + walusage_start.wal_fp_records; > + walusage.wal_bytes = pgWalUsage.wal_bytes - walusage_start.wal_bytes; > > Could be done much simpler via the utility: > WalUsageAccumDiff(walusage, pgWalUsage, walusage_start); Indeed, but this function is private to instrument.c. AFAICT pg_stat_statements is already duplicating similar code for buffers rather than having BufferUsageAccumDiff being exported, so I chose the same approach. I'd be in favor of exporting both functions though. > On a side note, I agree API to the buf/wal usage is far from perfect. Yes clearly. > > > Test had been reworked, and I believe it should be stable now, the > > > part which checks WAL is written and there is a correlation between > > > affected rows and WAL records. I still have no idea how to test > > > full-page writes against regular updates, it seems very unstable. > > > Please share ideas if any. > > > > > > I just reviewed the patches, and it globally looks good to me. The way to > > detect full page images looks sensible, but I'm really not familiar with that > > code so additional review would be useful. > > > > I noticed that the new wal_write_fp_records field in pg_stat_statements wasn't > > used in the test. Since I have to add all the patches to make the cfbot happy, > > I slightly adapted the tests to reference the fp column too. There was also a > > minor issue in the documentation, as wal_records and wal_bytes were copy/pasted > > twice while wal_write_fp_records wasn't documented, so I also changed it. > > > > Let me know if you're ok with those changes. > > Sorry for not getting wal_fp_usage into the docs, my fault. > > As for the tests, please get somebody else to review this. I strongly > believe checking full page writes here could be a source of > instability. I'm also a little bit dubious about it. The initial checkpoint should make things stable (of course unless full_page_writes is disabled), and Cfbot also seems happy about it. At least keeping it for the temporary tables test shouldn't be a problem.
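For reference, exporting the helper would boil down to something like the following sketch (mirroring BufferUsageAccumDiff(); the exact signature is an assumption since the function is currently private to instrument.c).

void
WalUsageAccumDiff(WalUsage *dst, const WalUsage *add, const WalUsage *sub)
{
    /* Accumulate into *dst the delta between two snapshots of the counters. */
    dst->wal_records += add->wal_records - sub->wal_records;
    dst->wal_fp_records += add->wal_fp_records - sub->wal_fp_records;
    dst->wal_bytes += add->wal_bytes - sub->wal_bytes;
}

With that exported, pg_stat_statements could simply call WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start) instead of open-coding the three subtractions.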
> > > > Please feel free to work on any extension of this patch idea. I lack > > > > both time and knowledge to do it all by myself. > > > > > > I'm adding a 3rd patch on top of yours to expose the new WAL counters in > > > pg_stat_database, for vacuum and autovacuum. I'm not really enthiusiastic with > > > this approach but I didn't find better, and maybe this will raise some better > > > ideas. The only sure thing is that we're not going to add a bunch of new > > > fields in pg_stat_all_tables anyway. > > > > > > We can also drop this 3rd patch entirely if no one's happy about it without > > > impacting the first two. > > > > No objections about 3rd on my side, unless we miss the CF completely. > > > > As for the code, I believe: > > + walusage.wal_records = pgWalUsage.wal_records - > > + walusage_start.wal_records; > > + walusage.wal_fp_records = pgWalUsage.wal_fp_records - > > + walusage_start.wal_fp_records; > > + walusage.wal_bytes = pgWalUsage.wal_bytes - walusage_start.wal_bytes; > > > > Could be done much simpler via the utility: > > WalUsageAccumDiff(walusage, pgWalUsage, walusage_start); > > > Indeed, but this function is private to instrument.c. AFAICT > pg_stat_statements is already duplicating similar code for buffers rather than > having BufferUsageAccumDiff being exported, so I chose the same approach. > > I'd be in favor of exporting both functions though. > > On a side note, I agree API to the buf/wal usage is far from perfect. > > > Yes clearly. There is a higher-level Instrumentation API that can be used with the INSTRUMENT_WAL flag to collect the WAL usage information. I believe the instrumentation is widely used in the executor code, so it should not be a problem to collect instrumentation information at the autovacuum worker level. Just a recommendation/chat, though. I am happy with the way the data is collected now. If you commit this variant, please add a TODO to rework WAL usage to the common instr API. > > > > Test had been reworked, and I believe it should be stable now, the > > > > part which checks WAL is written and there is a correlation between > > > > affected rows and WAL records. I still have no idea how to test > > > > full-page writes against regular updates, it seems very unstable. > > > > Please share ideas if any. > > > > > > > > > I just reviewed the patches, and it globally looks good to me. The way to > > > detect full page images looks sensible, but I'm really not familiar with that > > > code so additional review would be useful. > > > > > > I noticed that the new wal_write_fp_records field in pg_stat_statements wasn't > > > used in the test. Since I have to add all the patches to make the cfbot happy, > > > I slightly adapted the tests to reference the fp column too. There was also a > > > minor issue in the documentation, as wal_records and wal_bytes were copy/pasted > > > twice while wal_write_fp_records wasn't documented, so I also changed it. > > > > > > Let me know if you're ok with those changes. > > > > Sorry for not getting wal_fp_usage into the docs, my fault. > > > > As for the tests, please get somebody else to review this. I strongly > > believe checking full page writes here could be a source of > > instability. > > > I'm also a little bit dubious about it. The initial checkpoint should make > things stable (of course unless full_page_writes is disabled), and Cfbot also > seems happy about it. At least keeping it for the temporary tables test > shouldn't be a problem.
Temp tables should show zero FPI WAL records, true :) I have no objections to the patch.
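As an illustration of the instrumentation route mentioned above, the collection could look roughly like this; a sketch assuming the patch adds an INSTRUMENT_WAL option and a walusage field next to the existing bufusage in Instrumentation.

/* Sketch: measure WAL usage of an arbitrary chunk of work through the
 * executor instrumentation API. INSTRUMENT_WAL and instr->walusage are
 * assumed additions mirroring INSTRUMENT_BUFFERS / instr->bufusage. */
Instrumentation *instr = InstrAlloc(1, INSTRUMENT_WAL);

InstrStartNode(instr);
/* ... the work to measure, e.g. one (auto)vacuum run ... */
InstrStopNode(instr, 0);

elog(LOG, "WAL usage: %ld records, %ld bytes",
     instr->walusage.wal_records,
     (long) instr->walusage.wal_bytes);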
On Wed, Mar 18, 2020 at 09:02:58AM +0300, Kirill Bychik wrote: > > There is a higher-level Instrumentation API that can be used with > INSTRUMENT_WAL flag to collect the wal usage information. I believe > the instrumentation is widely used in the executor code, so it should > not be a problem to colelct instrumentation information on autovacuum > worker level. > > Just a recommendation/chat, though. I am happy with the way the data > is collected now. If you commit this variant, please add a TODO to > rework wal usage to common instr API. The instrumentation is somewhat intended to be used with executor nodes, not backend commands. I don't see a real technical reason that would prevent that, but I prefer to keep things as-is for now, as it sounds less controversial. This is for the 3rd patch, which may not even be considered for this CF anyway. > > > As for the tests, please get somebody else to review this. I strongly > > > believe checking full page writes here could be a source of > > > instability. > > > > > > I'm also a little bit dubious about it. The initial checkpoint should make > > things stable (of course unless full_page_writes is disabled), and Cfbot also > > seems happy about it. At least keeping it for the temporary tables test > > shouldn't be a problem. > > Temp tables should show zero FPI WAL records, true :) > > I have no objections to the patch. I'm attaching a v5 with fp records only for temp tables, so there's no risk of instability. As I previously said I'm fine with your two patches, so unless you have objections on the fpi test for temp tables or the documentation changes, I believe those should be ready for committer.
Attachment
> > There is a higher-level Instrumentation API that can be used with > > INSTRUMENT_WAL flag to collect the wal usage information. I believe > > the instrumentation is widely used in the executor code, so it should > > not be a problem to colelct instrumentation information on autovacuum > > worker level. > > > > Just a recommendation/chat, though. I am happy with the way the data > > is collected now. If you commit this variant, please add a TODO to > > rework wal usage to common instr API. > > > The instrumentation is somewhat intended to be used with executor nodes, not > backend commands. I don't see real technical reason that would prevent that, > but I prefer to keep things as-is for now, as it sound less controversial. > This is for the 3rd patch, which may not even be considered for this CF anyway. > > > > > > As for the tests, please get somebody else to review this. I strongly > > > > believe checking full page writes here could be a source of > > > > instability. > > > > > > > > > I'm also a little bit dubious about it. The initial checkpoint should make > > > things stable (of course unless full_page_writes is disabled), and Cfbot also > > > seems happy about it. At least keeping it for the temporary tables test > > > shouldn't be a problem. > > > > Temp tables should show zero FPI WAL records, true :) > > > > I have no objections to the patch. > > > I'm attaching a v5 with fp records only for temp tables, so there's no risk of > instability. As I previously said I'm fine with your two patches, so unless > you have objections on the fpi test for temp tables or the documentation > changes, I believe those should be ready for committer. No objections on my side either. Thank you for your review, time and efforts!
On Wed, Mar 18, 2020 at 08:48:17PM +0300, Kirill Bychik wrote: > > I'm attaching a v5 with fp records only for temp tables, so there's no risk of > > instability. As I previously said I'm fine with your two patches, so unless > > you have objections on the fpi test for temp tables or the documentation > > changes, I believe those should be ready for committer. > > No objections on my side either. Thank you for your review, time and efforts! Great, thanks also for the patches and efforts! I'll mark the entry as RFC.
On 2020/03/19 2:19, Julien Rouhaud wrote: > On Wed, Mar 18, 2020 at 09:02:58AM +0300, Kirill Bychik wrote: >> >> There is a higher-level Instrumentation API that can be used with >> INSTRUMENT_WAL flag to collect the wal usage information. I believe >> the instrumentation is widely used in the executor code, so it should >> not be a problem to colelct instrumentation information on autovacuum >> worker level. >> >> Just a recommendation/chat, though. I am happy with the way the data >> is collected now. If you commit this variant, please add a TODO to >> rework wal usage to common instr API. > > > The instrumentation is somewhat intended to be used with executor nodes, not > backend commands. I don't see real technical reason that would prevent that, > but I prefer to keep things as-is for now, as it sound less controversial. > This is for the 3rd patch, which may not even be considered for this CF anyway. > > >>>> As for the tests, please get somebody else to review this. I strongly >>>> believe checking full page writes here could be a source of >>>> instability. >>> >>> >>> I'm also a little bit dubious about it. The initial checkpoint should make >>> things stable (of course unless full_page_writes is disabled), and Cfbot also >>> seems happy about it. At least keeping it for the temporary tables test >>> shouldn't be a problem. >> >> Temp tables should show zero FPI WAL records, true :) >> >> I have no objections to the patch. > > > I'm attaching a v5 with fp records only for temp tables, so there's no risk of > instability. As I previously said I'm fine with your two patches, so unless > you have objections on the fpi test for temp tables or the documentation > changes, I believe those should be ready for committer. You added the columns into pg_stat_database, but seem to have forgotten to update the documentation for pg_stat_database. Is it really reasonable to add the columns for vacuum's WAL usage into pg_stat_database? I'm not sure how useful the information about the amount of WAL generated by vacuum per database is. Isn't it better to make VACUUM VERBOSE and the autovacuum log include that information instead, to see how much WAL each vacuum activity generates? Sorry if this has already been discussed upthread. Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
On Thu, Mar 19, 2020 at 09:03:02PM +0900, Fujii Masao wrote: > > On 2020/03/19 2:19, Julien Rouhaud wrote: > > > > I'm attaching a v5 with fp records only for temp tables, so there's no risk of > > instability. As I previously said I'm fine with your two patches, so unless > > you have objections on the fpi test for temp tables or the documentation > > changes, I believe those should be ready for committer. > > You added the columns into pg_stat_database, but seem to forget to > update the document for pg_stat_database. Ah right, I totally missed that when I tried to clean up the original POC. > Is it really reasonable to add the columns for vacuum's WAL usage into > pg_stat_database? I'm not sure how much the information about > the amount of WAL generated by vacuum per database is useful. The amount per database isn't really useful, but I didn't have a better idea on how to expose (auto)vacuum WAL usage until this: > Isn't it better to make VACUUM VERBOSE and autovacuum log include > that information, instead, to see how much each vacuum activity > generates the WAL? Sorry if this discussion has already been done > upthread. That's a way better idea! I'm attaching the full patchset with the 3rd patch to use this approach instead. There's a bit of duplicate code for computing the WalUsage, as I didn't find a better way to avoid that without exposing WalUsageAccumDiff(). Autovacuum log sample: 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 buffer usage: 4448 hits, 4 misses, 4 dirtied avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s WAL usage: 6643 records, 4 full page records, 1402679 bytes VACUUM log sample: # vacuum VERBOSE t1; INFO: vacuuming "public.t1" INFO: "t1": removed 50000 row versions in 443 pages INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 There were 50000 unused item identifiers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. 1332 WAL records, 4 WAL full page records, 306901 WAL bytes CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. INFO: "t1": truncated 443 to 0 pages DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s INFO: vacuuming "pg_toast.pg_toast_16385" INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 There were 0 unused item identifiers. Skipped 0 pages due to buffer pins, 0 frozen pages. 0 pages are entirely empty. 0 WAL records, 0 WAL full page records, 0 WAL bytes CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. VACUUM Note that the 3rd patch is an addition on top of Kirill's original patch, as this is information that would have been greatly helpful for some performance issues I had to investigate recently. I'd be happy to have it land into v13, but if that's controversial or too late I'm happy to postpone it to v14 if the infrastructure added in Kirill's patches can make it to v13.
Attachment
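For reference, the new line in the autovacuum log output above could be produced along these lines; a sketch only, where walusage is assumed to hold the difference between the counters after and before the vacuum work, and the variable names are assumptions.

/* Autovacuum case: append WAL usage to the log message assembled in
 * heap_vacuum_rel(). */
appendStringInfo(&buf,
                 _("WAL usage: %ld records, %ld full page records, %llu bytes\n"),
                 walusage.wal_records,
                 walusage.wal_fp_records,
                 (unsigned long long) walusage.wal_bytes);

/* The VACUUM VERBOSE lines above would be appended to the corresponding
 * errdetail message in the heap vacuum code in the same way. */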
> > > I'm attaching a v5 with fp records only for temp tables, so there's no risk of > > > instability. As I previously said I'm fine with your two patches, so unless > > > you have objections on the fpi test for temp tables or the documentation > > > changes, I believe those should be ready for committer. > > > > You added the columns into pg_stat_database, but seem to forget to > > update the document for pg_stat_database. > > Ah right, I totally missed that when I tried to clean up the original POC. > > > Is it really reasonable to add the columns for vacuum's WAL usage into > > pg_stat_database? I'm not sure how much the information about > > the amount of WAL generated by vacuum per database is useful. > > The amount per database isn't really useful, but I didn't had a better idea on > how to expose (auto)vacuum WAL usage until this: > > > Isn't it better to make VACUUM VERBOSE and autovacuum log include > > that information, instead, to see how much each vacuum activity > > generates the WAL? Sorry if this discussion has already been done > > upthread. > > That's a way better idea! I'm attaching the full patchset with the 3rd patch > to use this approach instead. There's a bit a duplicate code for computing the > WalUsage, as I didn't find a better way to avoid that without exposing > WalUsageAccumDiff(). > > Autovacuum log sample: > > 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 > pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen > tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 > buffer usage: 4448 hits, 4 misses, 4 dirtied > avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s > system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s > WAL usage: 6643 records, 4 full page records, 1402679 bytes > > VACUUM log sample: > > # vacuum VERBOSE t1; > INFO: vacuuming "public.t1" > INFO: "t1": removed 50000 row versions in 443 pages > INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 > There were 50000 unused item identifiers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 1332 WAL records, 4 WAL full page records, 306901 WAL bytes > CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. > INFO: "t1": truncated 443 to 0 pages > DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s > INFO: vacuuming "pg_toast.pg_toast_16385" > INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages > DETAIL: 0 index row versions were removed. > 0 index pages have been deleted, 0 are currently reusable. > CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages > DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 > There were 0 unused item identifiers. > Skipped 0 pages due to buffer pins, 0 frozen pages. > 0 pages are entirely empty. > 0 WAL records, 0 WAL full page records, 0 WAL bytes > CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > VACUUM > > Note that the 3rd patch is an addition on top of Kirill's original patch, as > this is information that would have been greatly helpful to investigate in some > performance issues I had to investigate recently. 
> I'd be happy to have it land > into v13, but if that's controversial or too late I'm happy to postpone it to > v14 if the infrastructure added in Kirill's patches can make it to v13. Dear all, can we please focus on getting the core patch committed? Given the uncertainty regarding autovacuum stats, can we please get parts 1 and 2 into the codebase, and think about exposing autovacuum stats later?
On 2020/03/23 7:32, Kirill Bychik wrote: >>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of >>>> instability. As I previously said I'm fine with your two patches, so unless >>>> you have objections on the fpi test for temp tables or the documentation >>>> changes, I believe those should be ready for committer. >>> >>> You added the columns into pg_stat_database, but seem to forget to >>> update the document for pg_stat_database. >> >> Ah right, I totally missed that when I tried to clean up the original POC. >> >>> Is it really reasonable to add the columns for vacuum's WAL usage into >>> pg_stat_database? I'm not sure how much the information about >>> the amount of WAL generated by vacuum per database is useful. >> >> The amount per database isn't really useful, but I didn't had a better idea on >> how to expose (auto)vacuum WAL usage until this: >> >>> Isn't it better to make VACUUM VERBOSE and autovacuum log include >>> that information, instead, to see how much each vacuum activity >>> generates the WAL? Sorry if this discussion has already been done >>> upthread. >> >> That's a way better idea! I'm attaching the full patchset with the 3rd patch >> to use this approach instead. There's a bit a duplicate code for computing the >> WalUsage, as I didn't find a better way to avoid that without exposing >> WalUsageAccumDiff(). >> >> Autovacuum log sample: >> >> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 >> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen >> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 >> buffer usage: 4448 hits, 4 misses, 4 dirtied >> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s >> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s >> WAL usage: 6643 records, 4 full page records, 1402679 bytes >> >> VACUUM log sample: >> >> # vacuum VERBOSE t1; >> INFO: vacuuming "public.t1" >> INFO: "t1": removed 50000 row versions in 443 pages >> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages >> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 >> There were 50000 unused item identifiers. >> Skipped 0 pages due to buffer pins, 0 frozen pages. >> 0 pages are entirely empty. >> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes >> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. >> INFO: "t1": truncated 443 to 0 pages >> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s >> INFO: vacuuming "pg_toast.pg_toast_16385" >> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages >> DETAIL: 0 index row versions were removed. >> 0 index pages have been deleted, 0 are currently reusable. >> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. >> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages >> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 >> There were 0 unused item identifiers. >> Skipped 0 pages due to buffer pins, 0 frozen pages. >> 0 pages are entirely empty. >> 0 WAL records, 0 WAL full page records, 0 WAL bytes >> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. >> VACUUM >> >> Note that the 3rd patch is an addition on top of Kirill's original patch, as >> this is information that would have been greatly helpful to investigate in some >> performance issues I had to investigate recently. 
>> I'd be happy to have it land >> into v13, but if that's controversial or too late I'm happy to postpone it to >> v14 if the infrastructure added in Kirill's patches can make it to v13. > > Dear all, can we please focus on getting the core patch committed? > Given the uncertainity regarding autovacuum stats, can we please get > parts 1 and 2 into the codebase, and think about exposing autovacuum > stats later? Here are the comments for the 0001 patch. + /* + * Report a full page image constructed for the WAL record + */ + pgWalUsage.wal_fp_records++; Isn't it better to use "fpw" or "fpi" for the variable name rather than "fp" here? In other places, "fpw" and "fpi" are used for full page writes/image. ISTM that this counter could be incorrect if XLogInsertRecord() determines that it needs to recalculate whether an FPI is necessary or not. No? IOW, this issue could happen if XLogInsert() calls XLogRecordAssemble() multiple times in its do-while loop. Isn't this problematic? + long wal_bytes; /* size of wal records produced */ Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable rather than long? + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); bufusage_space should be walusage_space here? /* * Finish parallel execution. We wait for parallel workers to finish, and * accumulate their buffer usage. */ There are some comments mentioning buffer usage, in execParallel.c. For example, the top comment for ExecParallelFinish(), as the above. These should be updated. Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
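To make the review points concrete, the counter struct could end up looking like this; a sketch with the suggestions above applied, and the final names are of course up to the patch author.

typedef struct WalUsage
{
    long    wal_records;    /* # of WAL records produced */
    long    wal_fpi;        /* # of WAL full page images produced */
    uint64  wal_bytes;      /* size of WAL records produced, in bytes */
} WalUsage;

/*
 * To address the re-assembly concern: count images into a local variable
 * inside XLogRecordAssemble() (e.g. num_fpi++ whenever include_image is
 * true) and fold it into pgWalUsage only after XLogInsertRecord() has
 * actually inserted the assembled record, so that restarting the assembly
 * loop does not count the same image twice.
 */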
On 2020/03/23 21:01, Fujii Masao wrote: > > > On 2020/03/23 7:32, Kirill Bychik wrote: >>>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of >>>>> instability. As I previously said I'm fine with your two patches, so unless >>>>> you have objections on the fpi test for temp tables or the documentation >>>>> changes, I believe those should be ready for committer. >>>> >>>> You added the columns into pg_stat_database, but seem to forget to >>>> update the document for pg_stat_database. >>> >>> Ah right, I totally missed that when I tried to clean up the original POC. >>> >>>> Is it really reasonable to add the columns for vacuum's WAL usage into >>>> pg_stat_database? I'm not sure how much the information about >>>> the amount of WAL generated by vacuum per database is useful. >>> >>> The amount per database isn't really useful, but I didn't had a better idea on >>> how to expose (auto)vacuum WAL usage until this: >>> >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include >>>> that information, instead, to see how much each vacuum activity >>>> generates the WAL? Sorry if this discussion has already been done >>>> upthread. >>> >>> That's a way better idea! I'm attaching the full patchset with the 3rd patch >>> to use this approach instead. There's a bit a duplicate code for computing the >>> WalUsage, as I didn't find a better way to avoid that without exposing >>> WalUsageAccumDiff(). >>> >>> Autovacuum log sample: >>> >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen >>> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 >>> buffer usage: 4448 hits, 4 misses, 4 dirtied >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes >>> >>> VACUUM log sample: >>> >>> # vacuum VERBOSE t1; >>> INFO: vacuuming "public.t1" >>> INFO: "t1": removed 50000 row versions in 443 pages >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 >>> There were 50000 unused item identifiers. >>> Skipped 0 pages due to buffer pins, 0 frozen pages. >>> 0 pages are entirely empty. >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. >>> INFO: "t1": truncated 443 to 0 pages >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s >>> INFO: vacuuming "pg_toast.pg_toast_16385" >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages >>> DETAIL: 0 index row versions were removed. >>> 0 index pages have been deleted, 0 are currently reusable. >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 >>> There were 0 unused item identifiers. >>> Skipped 0 pages due to buffer pins, 0 frozen pages. >>> 0 pages are entirely empty. >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. 
>>> VACUUM >>> >>> Note that the 3rd patch is an addition on top of Kirill's original patch, as >>> this is information that would have been greatly helpful to investigate in some >>> performance issues I had to investigate recently. I'd be happy to have it land >>> into v13, but if that's controversial or too late I'm happy to postpone it to >>> v14 if the infrastructure added in Kirill's patches can make it to v13. >> >> Dear all, can we please focus on getting the core patch committed? >> Given the uncertainity regarding autovacuum stats, can we please get >> parts 1 and 2 into the codebase, and think about exposing autovacuum >> stats later? > > Here are the comments for 0001 patch. > > + /* > + * Report a full page image constructed for the WAL record > + */ > + pgWalUsage.wal_fp_records++; > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > "fp" here? In other places, "fpw" and "fpi" are used for full page > writes/image. > > ISTM that this counter could be incorrect if XLogInsertRecord() determines to > calculate again whether FPI is necessary or not. No? IOW, this issue could > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > its do-while loop. Isn't this problematic? > > + long wal_bytes; /* size of wal records produced */ > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable > rather than long? > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > bufusage_space should be walusage_space here? > > /* > * Finish parallel execution. We wait for parallel workers to finish, and > * accumulate their buffer usage. > */ > > There are some comments mentioning buffer usage, in execParallel.c. > For example, the top comment for ExecParallelFinish(), as the above. > These should be updated. Here are the comments for 0002 patch. + OUT wal_write_bytes int8, + OUT wal_write_records int8, + OUT wal_write_fp_records int8 Isn't "write" part in the column names confusing because it's WAL *generated* (not written) by the statement? +RETURNS SETOF record +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' +LANGUAGE C STRICT VOLATILE; PARALLEL SAFE should be specified? +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ ISTM it's good timing to have also pg_stat_statements--1.8.sql since the definition of pg_stat_statements() is changed. Thought? +-- CHECKPOINT before WAL tests to ensure test stability +CHECKPOINT; Is this true? I thought you added this because the number of FPI should be larger than zero in the subsequent test. No? But there seems no such test. I'm not excited about adding the test checking the number of FPI because it looks fragile, though... +UPDATE pgss_test SET b = '333' WHERE a = 3 \; +UPDATE pgss_test SET b = '444' WHERE a = 4 ; Could you tell me why several queries need to be run to test the WAL usage? Isn't running a few query enough for the test purpase? Regards, -- Fujii Masao NTT DATA CORPORATION Advanced Platform Technology Group Research and Development Headquarters
On Mon, Mar 23, 2020 at 3:24 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > On 2020/03/23 21:01, Fujii Masao wrote: > > > > > > On 2020/03/23 7:32, Kirill Bychik wrote: > >>>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of > >>>>> instability. As I previously said I'm fine with your two patches, so unless > >>>>> you have objections on the fpi test for temp tables or the documentation > >>>>> changes, I believe those should be ready for committer. > >>>> > >>>> You added the columns into pg_stat_database, but seem to forget to > >>>> update the document for pg_stat_database. > >>> > >>> Ah right, I totally missed that when I tried to clean up the original POC. > >>> > >>>> Is it really reasonable to add the columns for vacuum's WAL usage into > >>>> pg_stat_database? I'm not sure how much the information about > >>>> the amount of WAL generated by vacuum per database is useful. > >>> > >>> The amount per database isn't really useful, but I didn't had a better idea on > >>> how to expose (auto)vacuum WAL usage until this: > >>> > >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include > >>>> that information, instead, to see how much each vacuum activity > >>>> generates the WAL? Sorry if this discussion has already been done > >>>> upthread. > >>> > >>> That's a way better idea! I'm attaching the full patchset with the 3rd patch > >>> to use this approach instead. There's a bit a duplicate code for computing the > >>> WalUsage, as I didn't find a better way to avoid that without exposing > >>> WalUsageAccumDiff(). > >>> > >>> Autovacuum log sample: > >>> > >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 > >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen > >>> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 > >>> buffer usage: 4448 hits, 4 misses, 4 dirtied > >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s > >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s > >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes > >>> > >>> VACUUM log sample: > >>> > >>> # vacuum VERBOSE t1; > >>> INFO: vacuuming "public.t1" > >>> INFO: "t1": removed 50000 row versions in 443 pages > >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 > >>> There were 50000 unused item identifiers. > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > >>> 0 pages are entirely empty. > >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes > >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. > >>> INFO: "t1": truncated 443 to 0 pages > >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s > >>> INFO: vacuuming "pg_toast.pg_toast_16385" > >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages > >>> DETAIL: 0 index row versions were removed. > >>> 0 index pages have been deleted, 0 are currently reusable. > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 > >>> There were 0 unused item identifiers. > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > >>> 0 pages are entirely empty. 
> >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > >>> VACUUM > >>> > >>> Note that the 3rd patch is an addition on top of Kirill's original patch, as > >>> this is information that would have been greatly helpful to investigate in some > >>> performance issues I had to investigate recently. I'd be happy to have it land > >>> into v13, but if that's controversial or too late I'm happy to postpone it to > >>> v14 if the infrastructure added in Kirill's patches can make it to v13. > >> > >> Dear all, can we please focus on getting the core patch committed? > >> Given the uncertainity regarding autovacuum stats, can we please get > >> parts 1 and 2 into the codebase, and think about exposing autovacuum > >> stats later? > > > > Here are the comments for 0001 patch. > > > > + /* > > + * Report a full page image constructed for the WAL record > > + */ > > + pgWalUsage.wal_fp_records++; > > > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > > "fp" here? In other places, "fpw" and "fpi" are used for full page > > writes/image. > > > > ISTM that this counter could be incorrect if XLogInsertRecord() determines to > > calculate again whether FPI is necessary or not. No? IOW, this issue could > > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > > its do-while loop. Isn't this problematic? > > > > + long wal_bytes; /* size of wal records produced */ > > > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable > > rather than long? > > > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > > > bufusage_space should be walusage_space here? > > > > /* > > * Finish parallel execution. We wait for parallel workers to finish, and > > * accumulate their buffer usage. > > */ > > > > There are some comments mentioning buffer usage, in execParallel.c. > > For example, the top comment for ExecParallelFinish(), as the above. > > These should be updated. > > Here are the comments for 0002 patch. > > + OUT wal_write_bytes int8, > + OUT wal_write_records int8, > + OUT wal_write_fp_records int8 > > Isn't "write" part in the column names confusing because it's WAL > *generated* (not written) by the statement? > > +RETURNS SETOF record > +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' > +LANGUAGE C STRICT VOLATILE; > > PARALLEL SAFE should be specified? > > +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ > > ISTM it's good timing to have also pg_stat_statements--1.8.sql since > the definition of pg_stat_statements() is changed. Thought? > > +-- CHECKPOINT before WAL tests to ensure test stability > +CHECKPOINT; > > Is this true? I thought you added this because the number of FPI > should be larger than zero in the subsequent test. No? But there > seems no such test. I'm not excited about adding the test checking > the number of FPI because it looks fragile, though... > > +UPDATE pgss_test SET b = '333' WHERE a = 3 \; > +UPDATE pgss_test SET b = '444' WHERE a = 4 ; > > Could you tell me why several queries need to be run to test > the WAL usage? Isn't running a few query enough for the test purpase? FTR I marked the commitfest entry as waiting on author. Kirill do you think you'll have time to address Fuji-san's review shortly? The end of the commitfest is approaching quite fast :(
> > >>>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of > > >>>>> instability. As I previously said I'm fine with your two patches, so unless > > >>>>> you have objections on the fpi test for temp tables or the documentation > > >>>>> changes, I believe those should be ready for committer. > > >>>> > > >>>> You added the columns into pg_stat_database, but seem to forget to > > >>>> update the document for pg_stat_database. > > >>> > > >>> Ah right, I totally missed that when I tried to clean up the original POC. > > >>> > > >>>> Is it really reasonable to add the columns for vacuum's WAL usage into > > >>>> pg_stat_database? I'm not sure how much the information about > > >>>> the amount of WAL generated by vacuum per database is useful. > > >>> > > >>> The amount per database isn't really useful, but I didn't had a better idea on > > >>> how to expose (auto)vacuum WAL usage until this: > > >>> > > >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include > > >>>> that information, instead, to see how much each vacuum activity > > >>>> generates the WAL? Sorry if this discussion has already been done > > >>>> upthread. > > >>> > > >>> That's a way better idea! I'm attaching the full patchset with the 3rd patch > > >>> to use this approach instead. There's a bit a duplicate code for computing the > > >>> WalUsage, as I didn't find a better way to avoid that without exposing > > >>> WalUsageAccumDiff(). > > >>> > > >>> Autovacuum log sample: > > >>> > > >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 > > >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen > > >>> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 > > >>> buffer usage: 4448 hits, 4 misses, 4 dirtied > > >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s > > >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s > > >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes > > >>> > > >>> VACUUM log sample: > > >>> > > >>> # vacuum VERBOSE t1; > > >>> INFO: vacuuming "public.t1" > > >>> INFO: "t1": removed 50000 row versions in 443 pages > > >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 > > >>> There were 50000 unused item identifiers. > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > >>> 0 pages are entirely empty. > > >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes > > >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. > > >>> INFO: "t1": truncated 443 to 0 pages > > >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s > > >>> INFO: vacuuming "pg_toast.pg_toast_16385" > > >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages > > >>> DETAIL: 0 index row versions were removed. > > >>> 0 index pages have been deleted, 0 are currently reusable. > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 > > >>> There were 0 unused item identifiers. > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > >>> 0 pages are entirely empty. > > >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. 
> > >>> VACUUM > > >>> > > >>> Note that the 3rd patch is an addition on top of Kirill's original patch, as > > >>> this is information that would have been greatly helpful to investigate in some > > >>> performance issues I had to investigate recently. I'd be happy to have it land > > >>> into v13, but if that's controversial or too late I'm happy to postpone it to > > >>> v14 if the infrastructure added in Kirill's patches can make it to v13. > > >> > > >> Dear all, can we please focus on getting the core patch committed? > > >> Given the uncertainity regarding autovacuum stats, can we please get > > >> parts 1 and 2 into the codebase, and think about exposing autovacuum > > >> stats later? > > > > > > Here are the comments for 0001 patch. > > > > > > + /* > > > + * Report a full page image constructed for the WAL record > > > + */ > > > + pgWalUsage.wal_fp_records++; > > > > > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > > > "fp" here? In other places, "fpw" and "fpi" are used for full page > > > writes/image. > > > > > > ISTM that this counter could be incorrect if XLogInsertRecord() determines to > > > calculate again whether FPI is necessary or not. No? IOW, this issue could > > > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > > > its do-while loop. Isn't this problematic? > > > > > > + long wal_bytes; /* size of wal records produced */ > > > > > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable > > > rather than long? > > > > > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > > > > > bufusage_space should be walusage_space here? > > > > > > /* > > > * Finish parallel execution. We wait for parallel workers to finish, and > > > * accumulate their buffer usage. > > > */ > > > > > > There are some comments mentioning buffer usage, in execParallel.c. > > > For example, the top comment for ExecParallelFinish(), as the above. > > > These should be updated. > > > > Here are the comments for 0002 patch. > > > > + OUT wal_write_bytes int8, > > + OUT wal_write_records int8, > > + OUT wal_write_fp_records int8 > > > > Isn't "write" part in the column names confusing because it's WAL > > *generated* (not written) by the statement? > > > > +RETURNS SETOF record > > +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' > > +LANGUAGE C STRICT VOLATILE; > > > > PARALLEL SAFE should be specified? > > > > +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ > > > > ISTM it's good timing to have also pg_stat_statements--1.8.sql since > > the definition of pg_stat_statements() is changed. Thought? > > > > +-- CHECKPOINT before WAL tests to ensure test stability > > +CHECKPOINT; > > > > Is this true? I thought you added this because the number of FPI > > should be larger than zero in the subsequent test. No? But there > > seems no such test. I'm not excited about adding the test checking > > the number of FPI because it looks fragile, though... > > > > +UPDATE pgss_test SET b = '333' WHERE a = 3 \; > > +UPDATE pgss_test SET b = '444' WHERE a = 4 ; > > > > Could you tell me why several queries need to be run to test > > the WAL usage? Isn't running a few query enough for the test purpase? > > FTR I marked the commitfest entry as waiting on author. > > Kirill do you think you'll have time to address Fuji-san's review > shortly? The end of the commitfest is approaching quite fast :( All these are really valuable objections. 
Unfortunately, I won't be able to get all of this sorted out soon, due to a total lack of time. I would be very glad if somebody could step in for this patch.
On Fri, Mar 27, 2020 at 8:21 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > >>>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of > > > >>>>> instability. As I previously said I'm fine with your two patches, so unless > > > >>>>> you have objections on the fpi test for temp tables or the documentation > > > >>>>> changes, I believe those should be ready for committer. > > > >>>> > > > >>>> You added the columns into pg_stat_database, but seem to forget to > > > >>>> update the document for pg_stat_database. > > > >>> > > > >>> Ah right, I totally missed that when I tried to clean up the original POC. > > > >>> > > > >>>> Is it really reasonable to add the columns for vacuum's WAL usage into > > > >>>> pg_stat_database? I'm not sure how much the information about > > > >>>> the amount of WAL generated by vacuum per database is useful. > > > >>> > > > >>> The amount per database isn't really useful, but I didn't had a better idea on > > > >>> how to expose (auto)vacuum WAL usage until this: > > > >>> > > > >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include > > > >>>> that information, instead, to see how much each vacuum activity > > > >>>> generates the WAL? Sorry if this discussion has already been done > > > >>>> upthread. > > > >>> > > > >>> That's a way better idea! I'm attaching the full patchset with the 3rd patch > > > >>> to use this approach instead. There's a bit a duplicate code for computing the > > > >>> WalUsage, as I didn't find a better way to avoid that without exposing > > > >>> WalUsageAccumDiff(). > > > >>> > > > >>> Autovacuum log sample: > > > >>> > > > >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0 > > > >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen > > > >>> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502 > > > >>> buffer usage: 4448 hits, 4 misses, 4 dirtied > > > >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s > > > >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s > > > >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes > > > >>> > > > >>> VACUUM log sample: > > > >>> > > > >>> # vacuum VERBOSE t1; > > > >>> INFO: vacuuming "public.t1" > > > >>> INFO: "t1": removed 50000 row versions in 443 pages > > > >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages > > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512 > > > >>> There were 50000 unused item identifiers. > > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > > >>> 0 pages are entirely empty. > > > >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes > > > >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s. > > > >>> INFO: "t1": truncated 443 to 0 pages > > > >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s > > > >>> INFO: vacuuming "pg_toast.pg_toast_16385" > > > >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages > > > >>> DETAIL: 0 index row versions were removed. > > > >>> 0 index pages have been deleted, 0 are currently reusable. > > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > > >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages > > > >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513 > > > >>> There were 0 unused item identifiers. 
> > > >>> Skipped 0 pages due to buffer pins, 0 frozen pages. > > > >>> 0 pages are entirely empty. > > > >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes > > > >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s. > > > >>> VACUUM > > > >>> > > > >>> Note that the 3rd patch is an addition on top of Kirill's original patch, as > > > >>> this is information that would have been greatly helpful to investigate in some > > > >>> performance issues I had to investigate recently. I'd be happy to have it land > > > >>> into v13, but if that's controversial or too late I'm happy to postpone it to > > > >>> v14 if the infrastructure added in Kirill's patches can make it to v13. > > > >> > > > >> Dear all, can we please focus on getting the core patch committed? > > > >> Given the uncertainity regarding autovacuum stats, can we please get > > > >> parts 1 and 2 into the codebase, and think about exposing autovacuum > > > >> stats later? > > > > > > > > Here are the comments for 0001 patch. > > > > > > > > + /* > > > > + * Report a full page image constructed for the WAL record > > > > + */ > > > > + pgWalUsage.wal_fp_records++; > > > > > > > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > > > > "fp" here? In other places, "fpw" and "fpi" are used for full page > > > > writes/image. > > > > > > > > ISTM that this counter could be incorrect if XLogInsertRecord() determines to > > > > calculate again whether FPI is necessary or not. No? IOW, this issue could > > > > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > > > > its do-while loop. Isn't this problematic? > > > > > > > > + long wal_bytes; /* size of wal records produced */ > > > > > > > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable > > > > rather than long? > > > > > > > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > > > > > > > bufusage_space should be walusage_space here? > > > > > > > > /* > > > > * Finish parallel execution. We wait for parallel workers to finish, and > > > > * accumulate their buffer usage. > > > > */ > > > > > > > > There are some comments mentioning buffer usage, in execParallel.c. > > > > For example, the top comment for ExecParallelFinish(), as the above. > > > > These should be updated. > > > > > > Here are the comments for 0002 patch. > > > > > > + OUT wal_write_bytes int8, > > > + OUT wal_write_records int8, > > > + OUT wal_write_fp_records int8 > > > > > > Isn't "write" part in the column names confusing because it's WAL > > > *generated* (not written) by the statement? > > > > > > +RETURNS SETOF record > > > +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' > > > +LANGUAGE C STRICT VOLATILE; > > > > > > PARALLEL SAFE should be specified? > > > > > > +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ > > > > > > ISTM it's good timing to have also pg_stat_statements--1.8.sql since > > > the definition of pg_stat_statements() is changed. Thought? > > > > > > +-- CHECKPOINT before WAL tests to ensure test stability > > > +CHECKPOINT; > > > > > > Is this true? I thought you added this because the number of FPI > > > should be larger than zero in the subsequent test. No? But there > > > seems no such test. I'm not excited about adding the test checking > > > the number of FPI because it looks fragile, though... 
> > > > > > +UPDATE pgss_test SET b = '333' WHERE a = 3 \; > > > +UPDATE pgss_test SET b = '444' WHERE a = 4 ; > > > > > > Could you tell me why several queries need to be run to test > > > the WAL usage? Isn't running a few query enough for the test purpase? > > > > FTR I marked the commitfest entry as waiting on author. > > > > Kirill do you think you'll have time to address Fuji-san's review > > shortly? The end of the commitfest is approaching quite fast :( > > All these are really valuable objections. Unfortunately, I won't be > able to get all sorted out soon, due to total lack of time. I would be > very glad if somebody could step in for this patch. I'll try to do that tomorrow!
On Sat, Mar 28, 2020 at 12:54 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Fri, Mar 27, 2020 at 8:21 PM Kirill Bychik <kirill.bychik@gmail.com> wrote: > > > > > > All these are really valuable objections. Unfortunately, I won't be > > able to get all sorted out soon, due to total lack of time. I would be > > very glad if somebody could step in for this patch. > > I'll try to do that tomorrow! > I see some basic problems with the patch. The way it tries to compute WAL usage for parallel stuff doesn't seem right to me. Can you share or point me to any test done where we have computed WAL for parallel operations like Parallel Vacuum or Parallel Create Index? Basically, I don't know whether the changes done in ExecInitParallelPlan and friends allow us to compute WAL for parallel operations. Those will primarily cover parallel queries that won't write WAL. How have you tested those changes? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > I see some basic problems with the patch. The way it tries to compute > WAL usage for parallel stuff doesn't seem right to me. Can you share > or point me to any test done where we have computed WAL for parallel > operations like Parallel Vacuum or Parallel Create Index? Ah, that's indeed a good point and AFAICT WAL records from parallel utility workers won't be accounted for. That being said, I think that an argument could be made that proper infrastructure should have been added in the original parallel utility patches, as pg_stat_statements is already broken wrt. buffer usage in parallel utility commands, unless I'm missing something. > Basically, > I don't know changes done in ExecInitParallelPlan and friends allow us > to compute WAL for parallel operations. Those will primarily cover > parallel queries that won't write WAL. How you have tested those > changes? I didn't test those, and I'm not even sure how to properly and reliably test that. Do you have any advice on how to achieve that? However the patch is mimicking the buffer instrumentation that already exists, and the approach also looks correct to me. Do you have a reason to believe that the approach that works for buffer usage wouldn't work for WAL records? (I of course agree that this should be tested anyway)
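To make "mimicking the buffer instrumentation" concrete, here is a rough sketch of the pattern under discussion, reusing the PARALLEL_KEY_WAL_USAGE key from the posted patch; the pei->wal_usage field and the WalUsageAdd() helper are assumed names for illustration, not something taken from the patch:

	/* Leader, while setting up the parallel DSM in ExecInitParallelPlan(): */
	walusage_space = shm_toc_allocate(pcxt->toc,
									  mul_size(sizeof(WalUsage), pcxt->nworkers));
	shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, walusage_space);
	pei->wal_usage = walusage_space;	/* assumed field name */

	/*
	 * Worker, at the end of ParallelQueryMain(): publish what it accumulated
	 * (the patch would do this as a start/end diff, the same way
	 * InstrEndParallelQuery() reports BufferUsage).
	 */
	walusage = shm_toc_lookup(toc, PARALLEL_KEY_WAL_USAGE, false);
	walusage[ParallelWorkerNumber] = pgWalUsage;

	/* Leader, in ExecParallelFinish(), once the workers have exited: */
	for (i = 0; i < pei->pcxt->nworkers_launched; i++)
		WalUsageAdd(&pgWalUsage, &pei->wal_usage[i]);	/* hypothetical helper */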
pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Julien Rouhaud
On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > I see some basic problems with the patch. The way it tries to compute > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > or point me to any test done where we have computed WAL for parallel > > operations like Parallel Vacuum or Parallel Create Index? > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > workers won't be accounted for. That being said, I think that an argument > could be made that proper infrastructure should have been added in the original > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > usage in parallel utility, unless I'm missing something. Just to be sure, I did a quick test of pg_stat_statements behavior using parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage doesn't reflect parallel workers' activity. I added an open item for that, and I'm adding Robert in Cc as 9da0cc352 is the first commit adding parallel maintenance.
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Sat, Mar 28, 2020 at 8:47 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > I see some basic problems with the patch. The way it tries to compute > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > or point me to any test done where we have computed WAL for parallel > > > operations like Parallel Vacuum or Parallel Create Index? > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > workers won't be accounted for. That being said, I think that an argument > > could be made that proper infrastructure should have been added in the original > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > usage in parallel utility, unless I'm missing something. > > Just to be sure I did a quick test with pg_stat_statements behavior using > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > doesn't reflect parallel workers' activity. > Sawada-San would like to investigate this? If not, I will look into this next week. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Mar 28, 2020 at 7:08 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > Basically, > > I don't know changes done in ExecInitParallelPlan and friends allow us > > to compute WAL for parallel operations. Those will primarily cover > > parallel queries that won't write WAL. How you have tested those > > changes? > > I didn't tested those, and I'm not even sure how to properly and reliably test > that. Do you have any advice on how to achieve that? > > However the patch is mimicking the buffer instrumentation that already exists, > and the approach also looks correct to me. Do you have a reason to believe > that the approach that works for buffer usage wouldn't work for WAL records? (I > of course agree that this should be tested anyway) > The buffer usage infrastructure is for read-only queries (for ex. for stats like blks_hit, blks_read). As far as I can think, there is no easy way to test the WAL usage via that API. It might or might not be required in the future depending on whether we decide to use the same infrastructure for parallel writes. I think for now we should remove that part of changes and rather think how to get that for parallel operations that can write WAL. For ex. we might need to do something similar to what this patch has done in begin_parallel_vacuum and end_parallel_vacuum. Would you like to attempt that? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Sun, 29 Mar 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Mar 28, 2020 at 8:47 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > > > I see some basic problems with the patch. The way it tries to compute > > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > > or point me to any test done where we have computed WAL for parallel > > > > operations like Parallel Vacuum or Parallel Create Index? > > > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > > workers won't be accounted for. That being said, I think that an argument > > > could be made that proper infrastructure should have been added in the original > > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > > usage in parallel utility, unless I'm missing something. > > > > Just to be sure I did a quick test with pg_stat_statements behavior using > > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > > doesn't reflect parallel workers' activity. > > > > Sawada-San would like to investigate this? If not, I will look into > this next week. Sure, I'll investigate this issue today. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Sun, 29 Mar 2020 at 15:19, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Sun, 29 Mar 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sat, Mar 28, 2020 at 8:47 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > > > > > I see some basic problems with the patch. The way it tries to compute > > > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > > > or point me to any test done where we have computed WAL for parallel > > > > > operations like Parallel Vacuum or Parallel Create Index? > > > > > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > > > workers won't be accounted for. That being said, I think that an argument > > > > could be made that proper infrastructure should have been added in the original > > > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > > > usage in parallel utility, unless I'm missing something. > > > > > > Just to be sure I did a quick test with pg_stat_statements behavior using > > > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > > > doesn't reflect parallel workers' activity. > > > > > > > Sawada-San would like to investigate this? If not, I will look into > > this next week. > > Sure, I'll investigate this issue today. > I've run vacuum with/without parallel workers on the table having 5 indexes. The vacuum reads all blocks of table and indexes. * VACUUM command with no parallel workers =# select total_time, shared_blks_hit, shared_blks_read, shared_blks_hit + shared_blks_read as total_read_blks, shared_blks_dirtied, shared_blks_written from pg_stat_statements where query ~ 'vacuum'; total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written --------------+-----------------+------------------+-----------------+---------------------+--------------------- 19857.217207 | 45238 | 226944 | 272182 | 225943 | 225894 (1 row) * VACUUM command with 4 parallel workers =# select total_time, shared_blks_hit, shared_blks_read, shared_blks_hit + shared_blks_read as total_read_blks, shared_blks_dirtied, shared_blks_written from pg_stat_statements where query ~ 'vacuum'; total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written -------------+-----------------+------------------+-----------------+---------------------+--------------------- 6932.117365 | 45205 | 73079 | 118284 | 72403 | 72365 (1 row) The total number of blocks of table and indexes are about 182243 blocks. As Julien reported, obviously the total number of read blocks during parallel vacuum is much less than single process vacuum's result. Parallel create index has the same issue but it doesn't exist in parallel queries for SELECTs. I think we need to change parallel maintenance commands so that they report buffer usage like what ParallelQueryMain() does; prepare to track buffer usage during query execution by InstrStartParallelQuery(), and report it by InstrEndParallelQuery() after parallel maintenance command. 
To report the buffer usage of a parallel maintenance command correctly, I'm thinking that we can (1) change parallel create index and parallel vacuum so that they prepare to gather buffer usage, or (2) have a common entry point for parallel maintenance commands that is responsible for gathering buffer usage and calling the entry functions for each individual maintenance command. I'll investigate it more in depth. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
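For what it's worth, a rough sketch of what approach (1) above could look like for parallel VACUUM; the key name and the worker entry point name are assumptions, and parallel CREATE INDEX would need the same treatment in its own worker function:

void
parallel_vacuum_worker_main(dsm_segment *seg, shm_toc *toc)		/* assumed name */
{
	BufferUsage *bufferusage;

	/* ... existing setup: look up shared state, open the relation and indexes ... */

	bufferusage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);

	InstrStartParallelQuery();			/* start tracking in this worker */

	/* ... perform the index vacuuming / cleanup work ... */

	/* write the usage accumulated above into this worker's slot */
	InstrEndParallelQuery(&bufferusage[ParallelWorkerNumber]);
}

The leader would then fold each slot back into its own counters with InstrAccumParallelQuery() after WaitForParallelWorkersToFinish(), so pg_stat_statements would see the workers' hits and reads as well.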
On Sun, Mar 29, 2020 at 11:03:50AM +0530, Amit Kapila wrote: > On Sat, Mar 28, 2020 at 7:08 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > Basically, > > > I don't know changes done in ExecInitParallelPlan and friends allow us > > > to compute WAL for parallel operations. Those will primarily cover > > > parallel queries that won't write WAL. How you have tested those > > > changes? > > > > I didn't tested those, and I'm not even sure how to properly and reliably test > > that. Do you have any advice on how to achieve that? > > > > However the patch is mimicking the buffer instrumentation that already exists, > > and the approach also looks correct to me. Do you have a reason to believe > > that the approach that works for buffer usage wouldn't work for WAL records? (I > > of course agree that this should be tested anyway) > > > > The buffer usage infrastructure is for read-only queries (for ex. for > stats like blks_hit, blks_read). As far as I can think, there is no > easy way to test the WAL usage via that API. It might or might not be > required in the future depending on whether we decide to use the same > infrastructure for parallel writes. I'm not sure that I get your point. I'm assuming that you meant parallel read-only queries, but surely the buffer usage infrastructure for parallel query relies on the same approach as the non-parallel one (each node computes the process-local pgBufferUsage diff) and sums all of that at the end of the parallel query execution. I also don't see how whether the query is read-only or not is relevant here as far as instrumentation is concerned, especially since a read-only query can definitely do writes and increase the count of dirtied buffers, like a write query would. For instance a hint bit change can be done in a parallel query AFAIK, and this can generate WAL records if wal_log_hints is enabled, so that's probably one way to test it. I now think that not adding support for WAL usage in the EXPLAIN output in the initial patch scope was a mistake, as this is probably the best way to test the WAL counters for parallel queries. This shouldn't be hard to add though, and I can work on it quickly if there's still a chance to get this feature included in pg13. > I think for now we should remove > that part of changes and rather think how to get that for parallel > operations that can write WAL. For ex. we might need to do something > similar to what this patch has done in begin_parallel_vacuum and > end_parallel_vacuum. Would you like to attempt that? Do you mean removing the WAL usage instrumentation from the parallel query infrastructure? For parallel utilities that can do writes it's probably better to keep the discussion in the other part of the thread. I tried to think a little bit about that, but for now I don't have a better idea than adding something similar to instrumentation for utility commands to have a general infrastructure, as building a workaround for a specific utility looks like the wrong approach. But this would require quite important changes in utility handling, which is maybe not a good idea a couple of weeks before the feature freeze, and that is definitely not backpatchable so it won't fix the issue for parallel index build that has existed since pg11.
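As a sketch of what that EXPLAIN support could look like in text mode, patterned on show_buffer_usage() in explain.c (the function name and the exact output format are assumptions):

static void
show_wal_usage(ExplainState *es, const WalUsage *usage)
{
	/* nothing to print if the node generated no WAL at all */
	if (usage->wal_records <= 0 && usage->wal_bytes == 0)
		return;

	appendStringInfoSpaces(es->str, es->indent * 2);
	appendStringInfo(es->str, "WAL: records=%ld bytes=" UINT64_FORMAT "\n",
					 usage->wal_records, usage->wal_bytes);
}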
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Julien Rouhaud
On Sun, Mar 29, 2020 at 9:52 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Sun, 29 Mar 2020 at 15:19, Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Sun, 29 Mar 2020 at 14:23, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sat, Mar 28, 2020 at 8:47 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > > > > > > > I see some basic problems with the patch. The way it tries to compute > > > > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > > > > or point me to any test done where we have computed WAL for parallel > > > > > > operations like Parallel Vacuum or Parallel Create Index? > > > > > > > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > > > > workers won't be accounted for. That being said, I think that an argument > > > > > could be made that proper infrastructure should have been added in the original > > > > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > > > > usage in parallel utility, unless I'm missing something. > > > > > > > > Just to be sure I did a quick test with pg_stat_statements behavior using > > > > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > > > > doesn't reflect parallel workers' activity. > > > > > > > > > > Sawada-San would like to investigate this? If not, I will look into > > > this next week. > > > > Sure, I'll investigate this issue today. Thanks for looking at it! > I've run vacuum with/without parallel workers on the table having 5 > indexes. The vacuum reads all blocks of table and indexes. > > * VACUUM command with no parallel workers > =# select total_time, shared_blks_hit, shared_blks_read, > shared_blks_hit + shared_blks_read as total_read_blks, > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > query ~ 'vacuum'; > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > shared_blks_dirtied | shared_blks_written > --------------+-----------------+------------------+-----------------+---------------------+--------------------- > 19857.217207 | 45238 | 226944 | 272182 | > 225943 | 225894 > (1 row) > > * VACUUM command with 4 parallel workers > =# select total_time, shared_blks_hit, shared_blks_read, > shared_blks_hit + shared_blks_read as total_read_blks, > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > query ~ 'vacuum'; > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > shared_blks_dirtied | shared_blks_written > -------------+-----------------+------------------+-----------------+---------------------+--------------------- > 6932.117365 | 45205 | 73079 | 118284 | > 72403 | 72365 > (1 row) > > The total number of blocks of table and indexes are about 182243 > blocks. As Julien reported, obviously the total number of read blocks > during parallel vacuum is much less than single process vacuum's > result. > > Parallel create index has the same issue but it doesn't exist in > parallel queries for SELECTs. > > I think we need to change parallel maintenance commands so that they > report buffer usage like what ParallelQueryMain() does; prepare to > track buffer usage during query execution by > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > after parallel maintenance command. 
To report buffer usage of parallel > maintenance command correctly, I'm thinking that we can (1) change > parallel create index and parallel vacuum so that they prepare > gathering buffer usage, or (2) have a common entry point for parallel > maintenance commands that is responsible for gathering buffer usage > and calling the entry functions for individual maintenance command. > I'll investigate it more in depth. As I just mentioned, (2) seems like a better design as it's quite likely that the number of parallel-aware utilities will continue to increase. One problem also is that parallel CREATE INDEX was introduced in pg11, so (2) probably won't be backpatchable (and (1) seems problematic too).
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Sun, Mar 29, 2020 at 1:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Sun, Mar 29, 2020 at 9:52 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > I've run vacuum with/without parallel workers on the table having 5 > > indexes. The vacuum reads all blocks of table and indexes. > > > > * VACUUM command with no parallel workers > > =# select total_time, shared_blks_hit, shared_blks_read, > > shared_blks_hit + shared_blks_read as total_read_blks, > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > query ~ 'vacuum'; > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > shared_blks_dirtied | shared_blks_written > > --------------+-----------------+------------------+-----------------+---------------------+--------------------- > > 19857.217207 | 45238 | 226944 | 272182 | > > 225943 | 225894 > > (1 row) > > > > * VACUUM command with 4 parallel workers > > =# select total_time, shared_blks_hit, shared_blks_read, > > shared_blks_hit + shared_blks_read as total_read_blks, > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > query ~ 'vacuum'; > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > shared_blks_dirtied | shared_blks_written > > -------------+-----------------+------------------+-----------------+---------------------+--------------------- > > 6932.117365 | 45205 | 73079 | 118284 | > > 72403 | 72365 > > (1 row) > > > > The total number of blocks of table and indexes are about 182243 > > blocks. As Julien reported, obviously the total number of read blocks > > during parallel vacuum is much less than single process vacuum's > > result. > > > > Parallel create index has the same issue but it doesn't exist in > > parallel queries for SELECTs. > > > > I think we need to change parallel maintenance commands so that they > > report buffer usage like what ParallelQueryMain() does; prepare to > > track buffer usage during query execution by > > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > > after parallel maintenance command. To report buffer usage of parallel > > maintenance command correctly, I'm thinking that we can (1) change > > parallel create index and parallel vacuum so that they prepare > > gathering buffer usage, or (2) have a common entry point for parallel > > maintenance commands that is responsible for gathering buffer usage > > and calling the entry functions for individual maintenance command. > > I'll investigate it more in depth. > > As I just mentioned, (2) seems like a better design as it's quite > likely that the number of parallel-aware utilities will probably > continue to increase. One problem also is that parallel CREATE INDEX > has been introduced in pg11, so (2) probably won't be packpatchable > (and (1) seems problematic too). > I am not sure if we can decide at this stage whether it is back-patchable or not. Let's first see the patch and if it turns out to be complex, then we can try to do some straight-forward fix for back-branches. In general, I don't see why the fix here should be complex? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sun, Mar 29, 2020 at 1:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > I'm not sure that I get your point. I'm assuming that you meant > parallel-read-only queries, but surely buffer usage infrastructure for > parallel query relies on the same approach as non-parallel one (each node > computes the process-local pgBufferUsage diff) and sums all of that at the end > of the parallel query execution. I also don't see how whether the query is > read-only or not is relevant here as far as instrumentation is concerned, > especially since read-only query can definitely do writes and increase the > count of dirtied buffers, like a write query would. For instance a hint > bit change can be done in a parallel query AFAIK, and this can generate WAL > records in wal_log_hints is enabled, so that's probably one way to test it. > Yeah, that way we can test it. Can you try that? > I now think that not adding support for WAL buffers in EXPLAIN output in the > initial patch scope was a mistake, as this is probably the best way to test the > WAL counters for parallel queries. This shouldn't be hard to add though, and I > can work on it quickly if there's still a chance to get this feature included > in pg13. > I am not sure whether we will add it in Explain or not (maybe we need inputs from others in this regard), but if it helps in testing this part of the patch, then it is a good idea to write a patch for it. You might want to keep it separate from the main patch as we might not commit it. > > I think for now we should remove > > that part of changes and rather think how to get that for parallel > > operations that can write WAL. For ex. we might need to do something > > similar to what this patch has done in begin_parallel_vacuum and > > end_parallel_vacuum. Would you like to attempt that? > > Do you mean removing WAL buffers instrumentation from parallel query > infrastructure? > Yes, I meant that but now I realize we need those and your proposed way of testing it can help us in validating those changes. > For parallel utility that can do writes it's probably better to keep the > discussion in the other part of the thread. > Sure, I am fine with that but I am not sure if it is a good idea to commit this patch without having a way to compute WAL utilization for those commands. > I tried to think a little bit > about that, but for now I don't have a better idea than adding something > similar to intrumentation for utility command to have a general infrastructure, > as building a workaround for specific utility looks like the wrong approach. > I don't know what exactly you have in mind as I don't see why it should be too complex. Let's wait for a patch from Sawada-San on buffer usage stuff and in the meantime, we can work on other parts of this patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Sun, 29 Mar 2020 at 20:15, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sun, Mar 29, 2020 at 1:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Sun, Mar 29, 2020 at 9:52 AM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > I've run vacuum with/without parallel workers on the table having 5 > > > indexes. The vacuum reads all blocks of table and indexes. > > > > > > * VACUUM command with no parallel workers > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > query ~ 'vacuum'; > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > shared_blks_dirtied | shared_blks_written > > > --------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > 19857.217207 | 45238 | 226944 | 272182 | > > > 225943 | 225894 > > > (1 row) > > > > > > * VACUUM command with 4 parallel workers > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > query ~ 'vacuum'; > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > shared_blks_dirtied | shared_blks_written > > > -------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > 6932.117365 | 45205 | 73079 | 118284 | > > > 72403 | 72365 > > > (1 row) > > > > > > The total number of blocks of table and indexes are about 182243 > > > blocks. As Julien reported, obviously the total number of read blocks > > > during parallel vacuum is much less than single process vacuum's > > > result. > > > > > > Parallel create index has the same issue but it doesn't exist in > > > parallel queries for SELECTs. > > > > > > I think we need to change parallel maintenance commands so that they > > > report buffer usage like what ParallelQueryMain() does; prepare to > > > track buffer usage during query execution by > > > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > > > after parallel maintenance command. To report buffer usage of parallel > > > maintenance command correctly, I'm thinking that we can (1) change > > > parallel create index and parallel vacuum so that they prepare > > > gathering buffer usage, or (2) have a common entry point for parallel > > > maintenance commands that is responsible for gathering buffer usage > > > and calling the entry functions for individual maintenance command. > > > I'll investigate it more in depth. > > > > As I just mentioned, (2) seems like a better design as it's quite > > likely that the number of parallel-aware utilities will probably > > continue to increase. One problem also is that parallel CREATE INDEX > > has been introduced in pg11, so (2) probably won't be packpatchable > > (and (1) seems problematic too). > > > > I am not sure if we can decide at this stage whether it is > back-patchable or not. Let's first see the patch and if it turns out > to be complex, then we can try to do some straight-forward fix for > back-branches. Agreed. > In general, I don't see why the fix here should be > complex? Yeah, particularly the approach (1) will not be complex. I'll write a patch tomorrow. 
Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 23, 2020 at 11:24:50PM +0900, Fujii Masao wrote: > > > Here are the comments for 0001 patch. > > > > + /* > > + * Report a full page image constructed for the WAL record > > + */ > > + pgWalUsage.wal_fp_records++; > > > > Isn't it better to use "fpw" or "fpi" for the variable name rather than > > "fp" here? In other places, "fpw" and "fpi" are used for full page > > writes/image. Agreed, I went with fpw. > > ISTM that this counter could be incorrect if XLogInsertRecord() determines to > > calculate again whether FPI is necessary or not. No? IOW, this issue could > > happen if XLogInsert() calls XLogRecordAssemble() multiple times in > > its do-while loop. Isn't this problematic? Yes, probably. I also saw, while adding support for EXPLAIN/auto_explain, that the previous approach was incrementing both records and fpw_records, while it should be only one of those for each record. I fixed this using the approach I previously mentioned in [1], which seems to work just fine. > > + long wal_bytes; /* size of wal records produced */ > > > > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable > > rather than long? Yes indeed. I switched to uint64, and modified everything accordingly (and changed pgss to output numeric as there's no other way to handle unsigned int8). > > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space); > > > > bufusage_space should be walusage_space here? Good catch, fixed. > > /* > > * Finish parallel execution. We wait for parallel workers to finish, and > > * accumulate their buffer usage. > > */ > > > > There are some comments mentioning buffer usage, in execParallel.c. > > For example, the top comment for ExecParallelFinish(), as the above. > > These should be updated. I went through the whole file and quickly checked the other places, and I think I fixed all the required comments. > Here are the comments for 0002 patch. > > + OUT wal_write_bytes int8, > + OUT wal_write_records int8, > + OUT wal_write_fp_records int8 > > Isn't "write" part in the column names confusing because it's WAL > *generated* (not written) by the statement? Agreed, I simply dropped the "_write" part everywhere. > +RETURNS SETOF record > +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4' > +LANGUAGE C STRICT VOLATILE; > > PARALLEL SAFE should be specified? Indeed, fixed. > +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */ > > ISTM it's good timing to have also pg_stat_statements--1.8.sql since > the definition of pg_stat_statements() is changed. Thought? As mentioned in the other pgss thread, I think the general agreement is to never provide full scripts anymore, so I didn't change that. > +-- CHECKPOINT before WAL tests to ensure test stability > +CHECKPOINT; > > Is this true? I thought you added this because the number of FPI > should be larger than zero in the subsequent test. No? But there > seems no such test. I'm not excited about adding the test checking > the number of FPI because it looks fragile, though... It should ensure an FPW for each new block touched, but yes, that's quite fragile. Since I fixed the record / FPW record counters, I saw that this was actually already broken as there was a mix of FPW and non-FPW, so I dropped the checkpoint and just tested (wal_record + wal_fpw_record) instead. > +UPDATE pgss_test SET b = '333' WHERE a = 3 \; > +UPDATE pgss_test SET b = '444' WHERE a = 4 ; > > Could you tell me why several queries need to be run to test > the WAL usage? Isn't running a few query enough for the test purpase?
As far as I can see it's used to test multiple scenarios (single command / multiple commands in or outside explicit transaction). It shouldn't add a lot of overhead and since some commands are issued with "\;" it's also testing proper query string isolation when a multi-command query string is provided, which doesn't seem like a bad idea. I didn't change that but I'm not opposed to removing some of the updates if needed. Also, to answer Amit Kapila's comments about WAL records and parallel query, I added support for both EXPLAIN and auto_explain (tab completion and documentation are also updated), and using a simple table with an index, with forced parallelism and no leader participation and concurrent update on the same table, I could test that WAL usage is working as expected: rjuju=# explain (analyze, wal, verbose) select * from t1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=8.695..47.592 rows=100010 loops=1) Output: id, val Workers Planned: 2 Workers Launched: 2 WAL: records=204 bytes=86198 -> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.056..29.112 rows=50005 loops Output: id, val WAL: records=204 bytes=86198 Worker 0: actual time=0.060..28.995 rows=49593 loops=1 WAL: records=105 bytes=44222 Worker 1: actual time=0.052..29.230 rows=50417 loops=1 WAL: records=99 bytes=41976 Planning Time: 0.038 ms Execution Time: 53.957 ms (14 rows) and the same query when nothing ends up being modified: rjuju=# explain (analyze, wal, verbose) select * from t1; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=9.413..48.187 rows=100010 loops=1) Output: id, val Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.033..24.697 rows=50005 loops Output: id, val Worker 0: actual time=0.028..24.786 rows=50447 loops=1 Worker 1: actual time=0.038..24.609 rows=49563 loops=1 Planning Time: 0.282 ms Execution Time: 55.643 ms (10 rows) So it seems to me that WAL usage infrastructure for parallel query is working just fine. I added the EXPLAIN/auto_explain in a separate commit just in case. [1] https://www.postgresql.org/message-id/CAOBaU_aECK1Z7Nn+x=MhvEwrJzK8wyPsPtWAafjqtZN1fYjEmg@mail.gmail.com
Attachment
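(For reference: the exact settings used to force the parallel plans above are not shown in the message. A minimal sketch of a comparable setup, assuming the WAL option added by the patch and a t1 table like the one above, could be:

    SET max_parallel_workers_per_gather = 2;
    SET parallel_leader_participation = off;
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET min_parallel_table_scan_size = 0;
    EXPLAIN (ANALYZE, WAL, VERBOSE) SELECT * FROM t1;

with a concurrent session updating t1 so that the scan itself has something to WAL-log, e.g. via page pruning or hint-bit setting.)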
Hi Amit, Sorry I just noticed your mail. On Sun, Mar 29, 2020 at 05:12:16PM +0530, Amit Kapila wrote: > On Sun, Mar 29, 2020 at 1:26 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > I'm not sure that I get your point. I'm assuming that you meant > > parallel-read-only queries, but surely buffer usage infrastructure for > > parallel query relies on the same approach as non-parallel one (each node > > computes the process-local pgBufferUsage diff) and sums all of that at the end > > of the parallel query execution. I also don't see how whether the query is > > read-only or not is relevant here as far as instrumentation is concerned, > > especially since read-only query can definitely do writes and increase the > > count of dirtied buffers, like a write query would. For instance a hint > > bit change can be done in a parallel query AFAIK, and this can generate WAL > > records in wal_log_hints is enabled, so that's probably one way to test it. > > > > Yeah, that way we can test it. Can you try that? > > > I now think that not adding support for WAL buffers in EXPLAIN output in the > > initial patch scope was a mistake, as this is probably the best way to test the > > WAL counters for parallel queries. This shouldn't be hard to add though, and I > > can work on it quickly if there's still a chance to get this feature included > > in pg13. > > > > I am not sure we will add it in Explain or not (maybe we need inputs > from others in this regard), but if it helps in testing this part of > the patch, then it is a good idea to write a patch for it. You might > want to keep it separate from the main patch as we might not commit > it. As I just wrote in [1] that's exactly what I did. Using parallel query and concurrent update on a table I could see that WAL usage for parallel query seems to be working as one could expect. > Sure, I am fine with that but I am not sure if it is a good idea to > commit this patch without having a way to compute WAL utilization for > those commands. I'm generally fine with waiting for a fix for the existing issue to be committed. But as the feature freeze is approaching, I hope that it won't mean postponing this feature to v14 because a related 2yo bug has just been discovered, as it would seem a bit unfair.
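(One way to exercise the hint-bit case mentioned above, sketched here under the assumption that wal_log_hints = on is set in postgresql.conf, which requires a server restart, and that the patched EXPLAIN WAL option is available; the table and settings are purely illustrative:

    CREATE TABLE hint_test AS SELECT i AS id FROM generate_series(1, 1000000) i;
    CHECKPOINT;   -- so that the next hint-bit change on each page forces an FPI
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET max_parallel_workers_per_gather = 2;
    EXPLAIN (ANALYZE, WAL) SELECT count(*) FROM hint_test;

The first scan after the inserting transaction commits sets hint bits and dirties the pages, so with wal_log_hints enabled the parallel workers should generate WAL that the new counters can pick up.)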
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Sun, 29 Mar 2020 at 20:44, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Sun, 29 Mar 2020 at 20:15, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sun, Mar 29, 2020 at 1:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > On Sun, Mar 29, 2020 at 9:52 AM Masahiko Sawada > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > I've run vacuum with/without parallel workers on the table having 5 > > > > indexes. The vacuum reads all blocks of table and indexes. > > > > > > > > * VACUUM command with no parallel workers > > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > > query ~ 'vacuum'; > > > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > > shared_blks_dirtied | shared_blks_written > > > > --------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > > 19857.217207 | 45238 | 226944 | 272182 | > > > > 225943 | 225894 > > > > (1 row) > > > > > > > > * VACUUM command with 4 parallel workers > > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > > query ~ 'vacuum'; > > > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > > shared_blks_dirtied | shared_blks_written > > > > -------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > > 6932.117365 | 45205 | 73079 | 118284 | > > > > 72403 | 72365 > > > > (1 row) > > > > > > > > The total number of blocks of table and indexes are about 182243 > > > > blocks. As Julien reported, obviously the total number of read blocks > > > > during parallel vacuum is much less than single process vacuum's > > > > result. > > > > > > > > Parallel create index has the same issue but it doesn't exist in > > > > parallel queries for SELECTs. > > > > > > > > I think we need to change parallel maintenance commands so that they > > > > report buffer usage like what ParallelQueryMain() does; prepare to > > > > track buffer usage during query execution by > > > > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > > > > after parallel maintenance command. To report buffer usage of parallel > > > > maintenance command correctly, I'm thinking that we can (1) change > > > > parallel create index and parallel vacuum so that they prepare > > > > gathering buffer usage, or (2) have a common entry point for parallel > > > > maintenance commands that is responsible for gathering buffer usage > > > > and calling the entry functions for individual maintenance command. > > > > I'll investigate it more in depth. > > > > > > As I just mentioned, (2) seems like a better design as it's quite > > > likely that the number of parallel-aware utilities will probably > > > continue to increase. One problem also is that parallel CREATE INDEX > > > has been introduced in pg11, so (2) probably won't be packpatchable > > > (and (1) seems problematic too). > > > > > > > I am not sure if we can decide at this stage whether it is > > back-patchable or not. Let's first see the patch and if it turns out > > to be complex, then we can try to do some straight-forward fix for > > back-branches. > > Agreed. 
> > > In general, I don't see why the fix here should be > > complex? > > Yeah, particularly the approach (1) will not be complex. I'll write a > patch tomorrow. > I've attached two patches fixing this issue for parallel index creation and parallel vacuum. Both patches take the same approach; we allocate DSM to share buffer usage and the leader gathers the workers' counters, as described in approach (1) above. I think this is a straightforward approach for this issue. We can create a common entry point for parallel maintenance commands that is responsible for gathering buffer usage as well as sharing query text etc. But that would involve a relatively big change and it might be overkill at this stage. We can discuss that and it will become an item for PG14. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
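(A rough sketch of the approach described above, not the actual patch: the toc key and variable names are placeholders, and the shm_toc_estimate_chunk()/shm_toc_estimate_keys() calls are omitted.

    /* leader, while setting up the parallel context */
    BufferUsage *buffer_usage;

    buffer_usage = shm_toc_allocate(pcxt->toc,
                                    mul_size(sizeof(BufferUsage), nworkers));
    shm_toc_insert(pcxt->toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, buffer_usage);

    /* each worker, around its index vacuuming / index building work */
    buffer_usage = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_BUFFER_USAGE, false);
    InstrStartParallelQuery();
    /* ... perform the parallel work ... */
    InstrEndParallelQuery(&buffer_usage[ParallelWorkerNumber]);

    /* leader, after WaitForParallelWorkersToFinish() */
    for (int i = 0; i < pcxt->nworkers_launched; i++)
        InstrAccumParallelQuery(&buffer_usage[i]);

The InstrStartParallelQuery()/InstrEndParallelQuery()/InstrAccumParallelQuery() functions are the same ones that ParallelQueryMain() and ExecParallelFinish() already use for regular parallel queries.)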
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Mon, 30 Mar 2020 at 15:46, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Sun, 29 Mar 2020 at 20:44, Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Sun, 29 Mar 2020 at 20:15, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Sun, Mar 29, 2020 at 1:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > On Sun, Mar 29, 2020 at 9:52 AM Masahiko Sawada > > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > > > I've run vacuum with/without parallel workers on the table having 5 > > > > > indexes. The vacuum reads all blocks of table and indexes. > > > > > > > > > > * VACUUM command with no parallel workers > > > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > > > query ~ 'vacuum'; > > > > > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > > > shared_blks_dirtied | shared_blks_written > > > > > --------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > > > 19857.217207 | 45238 | 226944 | 272182 | > > > > > 225943 | 225894 > > > > > (1 row) > > > > > > > > > > * VACUUM command with 4 parallel workers > > > > > =# select total_time, shared_blks_hit, shared_blks_read, > > > > > shared_blks_hit + shared_blks_read as total_read_blks, > > > > > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > > > > > query ~ 'vacuum'; > > > > > > > > > > total_time | shared_blks_hit | shared_blks_read | total_read_blks | > > > > > shared_blks_dirtied | shared_blks_written > > > > > -------------+-----------------+------------------+-----------------+---------------------+--------------------- > > > > > 6932.117365 | 45205 | 73079 | 118284 | > > > > > 72403 | 72365 > > > > > (1 row) > > > > > > > > > > The total number of blocks of table and indexes are about 182243 > > > > > blocks. As Julien reported, obviously the total number of read blocks > > > > > during parallel vacuum is much less than single process vacuum's > > > > > result. > > > > > > > > > > Parallel create index has the same issue but it doesn't exist in > > > > > parallel queries for SELECTs. > > > > > > > > > > I think we need to change parallel maintenance commands so that they > > > > > report buffer usage like what ParallelQueryMain() does; prepare to > > > > > track buffer usage during query execution by > > > > > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > > > > > after parallel maintenance command. To report buffer usage of parallel > > > > > maintenance command correctly, I'm thinking that we can (1) change > > > > > parallel create index and parallel vacuum so that they prepare > > > > > gathering buffer usage, or (2) have a common entry point for parallel > > > > > maintenance commands that is responsible for gathering buffer usage > > > > > and calling the entry functions for individual maintenance command. > > > > > I'll investigate it more in depth. > > > > > > > > As I just mentioned, (2) seems like a better design as it's quite > > > > likely that the number of parallel-aware utilities will probably > > > > continue to increase. One problem also is that parallel CREATE INDEX > > > > has been introduced in pg11, so (2) probably won't be packpatchable > > > > (and (1) seems problematic too). 
> > > > > > > > > > I am not sure if we can decide at this stage whether it is > > > back-patchable or not. Let's first see the patch and if it turns out > > > to be complex, then we can try to do some straight-forward fix for > > > back-branches. > > > > Agreed. > > > > > In general, I don't see why the fix here should be > > > complex? > > > > Yeah, particularly the approach (1) will not be complex. I'll write a > > patch tomorrow. > > > > I've attached two patches fixing this issue for parallel index > creation and parallel vacuum. These approaches take the same approach; > we allocate DSM to share buffer usage and the leader gathers them, > described as approach (1) above. I think this is a straightforward > approach for this issue. We can create a common entry point for > parallel maintenance command that is responsible for gathering buffer > usage as well as sharing query text etc. But it will accompany > relatively big change and it might be overkill at this stage. We can > discuss that and it will become an item for PG14. > The patch for vacuum conflicts with recent changes in vacuum. So I've attached rebased one. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Julien Rouhaud
On Mon, Mar 30, 2020 at 04:01:18PM +0900, Masahiko Sawada wrote: > On Mon, 30 Mar 2020 at 15:46, Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Sun, 29 Mar 2020 at 20:44, Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > > I think we need to change parallel maintenance commands so that they > > > > > > report buffer usage like what ParallelQueryMain() does; prepare to > > > > > > track buffer usage during query execution by > > > > > > InstrStartParallelQuery(), and report it by InstrEndParallelQuery() > > > > > > after parallel maintenance command. To report buffer usage of parallel > > > > > > maintenance command correctly, I'm thinking that we can (1) change > > > > > > parallel create index and parallel vacuum so that they prepare > > > > > > gathering buffer usage, or (2) have a common entry point for parallel > > > > > > maintenance commands that is responsible for gathering buffer usage > > > > > > and calling the entry functions for individual maintenance command. > > > > > > I'll investigate it more in depth. > > > > [...] > > > > I've attached two patches fixing this issue for parallel index > > creation and parallel vacuum. These approaches take the same approach; > > we allocate DSM to share buffer usage and the leader gathers them, > > described as approach (1) above. I think this is a straightforward > > approach for this issue. We can create a common entry point for > > parallel maintenance command that is responsible for gathering buffer > > usage as well as sharing query text etc. But it will accompany > > relatively big change and it might be overkill at this stage. We can > > discuss that and it will become an item for PG14. > > > > The patch for vacuum conflicts with recent changes in vacuum. So I've > attached rebased one. Thanks Sawada-san! Just minor nitpicking: + int i; Assert(!IsParallelWorker()); Assert(ParallelVacuumIsActive(lps)); @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, /* Wait for all vacuum workers to finish */ WaitForParallelWorkersToFinish(lps->pcxt); + /* + * Next, accumulate buffer usage. (This must wait for the workers to + * finish, or we might get incomplete data.) + */ + for (i = 0; i < nworkers; i++) + InstrAccumParallelQuery(&lps->buffer_usage[i]); We now allow declaring a variable in those loops, so it may be better to avoid declaring i outside the for scope? Other than that both patches look good to me and a good fit for backpatching. I also did some testing on VACUUM and CREATE INDEX and it works as expected.
On Sun, Mar 29, 2020 at 5:49 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > @@ -1249,6 +1250,16 @@ XLogInsertRecord(XLogRecData *rdata, ProcLastRecPtr = StartPos; XactLastRecEnd = EndPos; + /* Provide WAL update data to the instrumentation */ + if (inserted) + { + pgWalUsage.wal_bytes += rechdr->xl_tot_len; + if (doPageWrites && fpw_lsn <= RedoRecPtr) + pgWalUsage.wal_fpw_records++; + else + pgWalUsage.wal_records++; + } + I think the above code has multiple problems. (a) fpw_lsn can be InvalidXLogRecPtr and still there could be full-page image (for ex. when REGBUF_FORCE_IMAGE flag for buffer is set). (b) There could be multiple FPW records while inserting a record; consider when there are multiple registered buffers. I think the right place to figure this out is XLogRecordAssemble. (c) There are cases when we also attach the record data even when we decide to write FPW (cf. REGBUF_KEEP_DATA), so we might want to increment wal_fpw_records and wal_records for such cases. I think the right place to compute this information is XLogRecordAssemble even though we update it at the place where you have it in the patch. You can probably compute that in local variables and then transfer to pgWalUsage in XLogInsertRecord. I am fine if you can think of some other way but the current patch doesn't seem correct to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Mar 30, 2020 at 03:52:38PM +0530, Amit Kapila wrote: > On Sun, Mar 29, 2020 at 5:49 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > @@ -1249,6 +1250,16 @@ XLogInsertRecord(XLogRecData *rdata, > ProcLastRecPtr = StartPos; > XactLastRecEnd = EndPos; > > + /* Provide WAL update data to the instrumentation */ > + if (inserted) > + { > + pgWalUsage.wal_bytes += rechdr->xl_tot_len; > + if (doPageWrites && fpw_lsn <= RedoRecPtr) > + pgWalUsage.wal_fpw_records++; > + else > + pgWalUsage.wal_records++; > + } > + > > I think the above code has multiple problems. (a) fpw_lsn can be > InvalidXLogRecPtr and still there could be full-page image (for ex. > when REGBUF_FORCE_IMAGE flag for buffer is set). (b) There could be > multiple FPW records while inserting a record; consider when there are > multiple registered buffers. I think the right place to figure this > out is XLogRecordAssemble. (c) There are cases when we also attach the > record data even when we decide to write FPW (cf. REGBUF_KEEP_DATA), > so we might want to increment wal_fpw_records and wal_records for such > cases. > > I think the right place to compute this information is > XLogRecordAssemble even though we update it at the place where you > have it in the patch. You can probably compute that in local > variables and then transfer to pgWalUsage in XLogInsertRecord. I am > fine if you can think of some other way but the current patch doesn't > seem correct to me. My previous approach was indeed totally broken. v8 attached which hopefully will be ok.
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > The patch for vacuum conflicts with recent changes in vacuum. So I've > attached rebased one. > + /* + * Next, accumulate buffer usage. (This must wait for the workers to + * finish, or we might get incomplete data.) + */ + for (i = 0; i < nworkers; i++) + InstrAccumParallelQuery(&lps->buffer_usage[i]); + This should be done for launched workers aka lps->pcxt->nworkers_launched. I think a similar problem exists in create index related patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > The patch for vacuum conflicts with recent changes in vacuum. So I've > > attached rebased one. > > > > + /* > + * Next, accumulate buffer usage. (This must wait for the workers to > + * finish, or we might get incomplete data.) > + */ > + for (i = 0; i < nworkers; i++) > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > + > > This should be done for launched workers aka > lps->pcxt->nworkers_launched. I think a similar problem exists in > create index related patch. You're right. Fixed in the new patches. On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123@gmail.com> wrote: > > Just minor nitpicking: > > + int i; > > Assert(!IsParallelWorker()); > Assert(ParallelVacuumIsActive(lps)); > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, > /* Wait for all vacuum workers to finish */ > WaitForParallelWorkersToFinish(lps->pcxt); > > + /* > + * Next, accumulate buffer usage. (This must wait for the workers to > + * finish, or we might get incomplete data.) > + */ > + for (i = 0; i < nworkers; i++) > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > We now allow declaring a variable in those loops, so it may be better to avoid > declaring i outside the for scope? We can do that but I was not sure if it's good since other codes around there don't use that. So I'd like to leave it for committers. It's a trivial change. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Tue, Mar 31, 2020 at 10:44 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > The patch for vacuum conflicts with recent changes in vacuum. So I've > > > attached rebased one. > > > > > > > + /* > > + * Next, accumulate buffer usage. (This must wait for the workers to > > + * finish, or we might get incomplete data.) > > + */ > > + for (i = 0; i < nworkers; i++) > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > + > > > > This should be done for launched workers aka > > lps->pcxt->nworkers_launched. I think a similar problem exists in > > create index related patch. > > You're right. Fixed in the new patches. > > On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > Just minor nitpicking: > > > > + int i; > > > > Assert(!IsParallelWorker()); > > Assert(ParallelVacuumIsActive(lps)); > > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, > > /* Wait for all vacuum workers to finish */ > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > + /* > > + * Next, accumulate buffer usage. (This must wait for the workers to > > + * finish, or we might get incomplete data.) > > + */ > > + for (i = 0; i < nworkers; i++) > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > We now allow declaring a variable in those loops, so it may be better to avoid > > declaring i outside the for scope? > > We can do that but I was not sure if it's good since other codes > around there don't use that. So I'd like to leave it for committers. > It's a trivial change. I have reviewed the patch and the patch looks fine to me. One minor comment /+ /* Points to buffer usage are in DSM */ + BufferUsage *buffer_usage; + /buffer usage are in DSM / buffer usage area in DSM -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 03:52:38PM +0530, Amit Kapila wrote: > > > > I think the right place to compute this information is > > XLogRecordAssemble even though we update it at the place where you > > have it in the patch. You can probably compute that in local > > variables and then transfer to pgWalUsage in XLogInsertRecord. I am > > fine if you can think of some other way but the current patch doesn't > > seem correct to me. > > My previous approach was indeed totally broken. v8 attached which hopefully > will be ok. > This is better. Few more comments: 1. The point (c) from my previous email doesn't seem to be fixed properly. Basically, the record data is only attached with FPW in some particular cases like where REGBUF_KEEP_DATA is set, but the patch assumes it is always set. 2. + /* Report a full page imsage constructed for the WAL record */ + *num_fpw += 1; Typo. /imsage/image 3. We need to enhance the patch to cover WAL usage for parallel vacuum and parallel create index based on Sawada-San's latest patch[1] which fixed the case for buffer usage. [1] - https://www.postgresql.org/message-id/CA%2Bfd4k5L4yVoWz0smymmqB4_SMHd2tyJExUgA_ACsL7k00B5XQ%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 31, 2020 at 8:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 03:52:38PM +0530, Amit Kapila wrote: > > > > > > I think the right place to compute this information is > > > XLogRecordAssemble even though we update it at the place where you > > > have it in the patch. You can probably compute that in local > > > variables and then transfer to pgWalUsage in XLogInsertRecord. I am > > > fine if you can think of some other way but the current patch doesn't > > > seem correct to me. > > > > My previous approach was indeed totally broken. v8 attached which hopefully > > will be ok. > > > > This is better. Few more comments: > 1. The point (c) from my previous email doesn't seem to be fixed > properly. Basically, the record data is only attached with FPW in > some particular cases like where REGBUF_KEEP_DATA is set, but the > patch assumes it is always set. As I mentioned multiple times already, I'm really not familiar with the WAL code, so I'll be happy to be proven wrong but my reading is that in XLogRecordAssemble(), there are 2 different things being done: - an FPW is optionally added, iff include_image is true, which doesn't take into account REGBUF_KEEP_DATA. Looking at that part of the code I don't see any sign of the recorded FPW being skipped or discarded if REGBUF_KEEP_DATA is not set, and useful variables such as total_len are modified - then data is also optionally added, iff needs_data is set. IIUC an FPW can be added even if the WAL record doesn't contain data. So the behavior looks ok to me, as what seems useful is to distinguish 9KB of WAL for a single 9KB record from 9KB of WAL for a 1KB record plus one FPW. What am I missing here? > 2. > + /* Report a full page imsage constructed for the WAL record */ > + *num_fpw += 1; > > Typo. /imsage/image Oops yes, will fix. > 3. We need to enhance the patch to cover WAL usage for parallel > vacuum and parallel create index based on Sawada-San's latest patch[1] > which fixed the case for buffer usage. I'm sorry but I'm not following. Do you mean adding regression tests for that case?
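(Under that reading, the counting inside XLogRecordAssemble()'s per-registered-buffer loop would be roughly the following sketch, not the exact patch:

    if (include_image)
    {
        /* a full page image is assembled for this buffer */
        *num_fpw += 1;      /* counted whether or not REGBUF_KEEP_DATA is set */
    }

    if (needs_data)
    {
        /* the registered buffer data is attached separately */
    }

i.e. full page images and record data are accounted for independently.)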
On Tue, Mar 31, 2020 at 12:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 03:52:38PM +0530, Amit Kapila wrote: > > > > > > I think the right place to compute this information is > > > XLogRecordAssemble even though we update it at the place where you > > > have it in the patch. You can probably compute that in local > > > variables and then transfer to pgWalUsage in XLogInsertRecord. I am > > > fine if you can think of some other way but the current patch doesn't > > > seem correct to me. > > > > My previous approach was indeed totally broken. v8 attached which hopefully > > will be ok. > > > > This is better. Few more comments: > 1. The point (c) from my previous email doesn't seem to be fixed > properly. Basically, the record data is only attached with FPW in > some particular cases like where REGBUF_KEEP_DATA is set, but the > patch assumes it is always set. > > 2. > + /* Report a full page imsage constructed for the WAL record */ > + *num_fpw += 1; > > Typo. /imsage/image > > 3. We need to enhance the patch to cover WAL usage for parallel > vacuum and parallel create index based on Sawada-San's latest patch[1] > which fixed the case for buffer usage. I have started reviewing this patch and I have some comments/questions. 1. @@ -22,6 +22,10 @@ static BufferUsage save_pgBufferUsage; static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); +WalUsage pgWalUsage; +static WalUsage save_pgWalUsage; + +static void WalUsageAdd(WalUsage *dst, WalUsage *add); Better we move all variable declaration first along with other variables and then function declaration along with other function declaration. That is the convention we follow. 2. { bool need_buffers = (instrument_options & INSTRUMENT_BUFFERS) != 0; + bool need_wal = (instrument_options & INSTRUMENT_WAL) != 0; I think you need to run pgindent, we should give only one space between the variable name and '='. so we need to change like below bool need_wal = (instrument_options & INSTRUMENT_WAL) != 0; 3. +typedef struct WalUsage +{ + long wal_records; /* # of WAL records produced */ + long wal_fpw_records; /* # of full page write WAL records + * produced */ IMHO, the name wal_fpw_records is bit confusing, First I thought it is counting the number of wal records which actually has FPW, then after seeing code, I realized that it is actually counting total FPW. Shouldn't we rename it to just wal_fpw? or wal_num_fpw or wal_fpw_count? 4. Currently, we are combining all full-page write force/normal/consistency checks in one category. I am not sure whether it will be good information to know how many are force_fpw and how many are normal_fpw? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 31, 2020 at 2:51 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > 4. Currently, we are combining all full-page write > force/normal/consistency checks in one category. I am not sure > whether it will be good information to know how many are force_fpw and > how many are normal_fpw? > We can do it if we want but I am not sure how useful it will be. I think we can always enhance this information if people really need this and have a clear use-case in mind. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 31, 2020 at 2:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Tue, Mar 31, 2020 at 8:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > On Mon, Mar 30, 2020 at 03:52:38PM +0530, Amit Kapila wrote: > > > > > > > > I think the right place to compute this information is > > > > XLogRecordAssemble even though we update it at the place where you > > > > have it in the patch. You can probably compute that in local > > > > variables and then transfer to pgWalUsage in XLogInsertRecord. I am > > > > fine if you can think of some other way but the current patch doesn't > > > > seem correct to me. > > > > > > My previous approach was indeed totally broken. v8 attached which hopefully > > > will be ok. > > > > > > > This is better. Few more comments: > > 1. The point (c) from my previous email doesn't seem to be fixed > > properly. Basically, the record data is only attached with FPW in > > some particular cases like where REGBUF_KEEP_DATA is set, but the > > patch assumes it is always set. > > As I mentioned multiple times already, I'm really not familiar with > the WAL code, so I'll be happy to be proven wrong but my reading is > that in XLogRecordAssemble(), there are 2 different things being done: > > - a FPW is optionally added, iif include_image is true, which doesn't > take into account REGBUF_KEEP_DATA. Looking at that part of the code > I don't see any sign of the recorded FPW being skipped or discarded if > REGBUF_KEEP_DATA is not set, and useful variables such as total_len > are modified > - then data is also optionally added, iif needs_data is set. > > IIUC a FPW can be added even if the WAL record doesn't contain data. > So the behavior look ok to me, as what seems to be useful it to > distinguish 9KB WAL for 1 record of 9KB from 9KB or WAL for 1KB record > and 1 FPW. > It is possible that both of us are having different meanings for below two variables: +typedef struct WalUsage +{ + long wal_records; /* # of WAL records produced */ + long wal_fpw_records; /* # of full page write WAL records + * produced */ Let me clarify my understanding. Say if the record is just an FPI (ex. XLOG_FPI) and doesn't contain any data then do we want to add one to each of wal_fpw_records and wal_records? My understanding was in such a case we will just increment wal_fpw_records. > > > 3. We need to enhance the patch to cover WAL usage for parallel > > vacuum and parallel create index based on Sawada-San's latest patch[1] > > which fixed the case for buffer usage. > > I'm sorry but I'm not following. Do you mean adding regression tests > for that case? > No. I mean to say we should implement WAL usage calculation for those two parallel commands. AFAICS, your patch doesn't cover those two commands. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > @@ -448,6 +449,7 @@ XLogInsert(RmgrId rmid, uint8 info) bool doPageWrites; XLogRecPtr fpw_lsn; XLogRecData *rdt; + int num_fpw = 0; /* * Get values needed to decide whether to do full-page writes. Since @@ -457,9 +459,9 @@ XLogInsert(RmgrId rmid, uint8 info) GetFullPageWriteInfo(&RedoRecPtr, &doPageWrites); rdt = XLogRecordAssemble(rmid, info, RedoRecPtr, doPageWrites, - &fpw_lsn); + &fpw_lsn, &num_fpw); - EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags); + EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags, num_fpw); } while (EndPos == InvalidXLogRecPtr); I think there are some issues in the num_fpw calculation. For some cases, we have to return from XLogInsert without inserting a record. Basically, we've to recompute/reassemble the same record. In those cases, num_fpw should be reset. Thoughts? -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
On Tue, Mar 31, 2020 at 11:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have started reviewing this patch and I have some comments/questions. Thanks a lot! > > 1. > @@ -22,6 +22,10 @@ static BufferUsage save_pgBufferUsage; > > static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); > > +WalUsage pgWalUsage; > +static WalUsage save_pgWalUsage; > + > +static void WalUsageAdd(WalUsage *dst, WalUsage *add); > > Better we move all variable declaration first along with other > variables and then function declaration along with other function > declaration. That is the convention we follow. Agreed, fixed. > 2. > { > bool need_buffers = (instrument_options & INSTRUMENT_BUFFERS) != 0; > + bool need_wal = (instrument_options & INSTRUMENT_WAL) != 0; > > I think you need to run pgindent, we should give only one space > between the variable name and '='. > so we need to change like below > > bool need_wal = (instrument_options & INSTRUMENT_WAL) != 0; Done. > 3. > +typedef struct WalUsage > +{ > + long wal_records; /* # of WAL records produced */ > + long wal_fpw_records; /* # of full page write WAL records > + * produced */ > > IMHO, the name wal_fpw_records is bit confusing, First I thought it > is counting the number of wal records which actually has FPW, then > after seeing code, I realized that it is actually counting total FPW. > Shouldn't we rename it to just wal_fpw? or wal_num_fpw or > wal_fpw_count? Yes I agree, the name was too confusing. I went with wal_num_fpw. I also used the same for pg_stat_statements. Other fields are usually named with a trailing "s" but wal_fpws just seems too weird. I can change it if consistency is preferred here. > 4. Currently, we are combining all full-page write > force/normal/consistency checks in one category. I am not sure > whether it will be good information to know how many are force_fpw and > how many are normal_fpw? I agree with Amit's POV. For now a single counter seems like enough to diagnose many behaviors. I'll keep answering following mails before sending an updated patchset.
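(With the renames above, and the earlier switch of the byte counter to uint64, the struct would presumably end up looking something like:

    typedef struct WalUsage
    {
        long        wal_records;    /* # of WAL records produced */
        long        wal_num_fpw;    /* # of WAL full page writes produced */
        uint64      wal_bytes;      /* size of WAL records produced */
    } WalUsage;

with the exact comments and field order left to the final patch.)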
On Tue, Mar 31, 2020 at 12:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > It is possible that both of us are having different meanings for below > two variables: > +typedef struct WalUsage > +{ > + long wal_records; /* # of WAL records produced */ > + long wal_fpw_records; /* # of full page write WAL records > + * produced */ > > > Let me clarify my understanding. Say if the record is just an FPI > (ex. XLOG_FPI) and doesn't contain any data then do we want to add one > to each of wal_fpw_records and wal_records? My understanding was in > such a case we will just increment wal_fpw_records. Yes, as Dilip just pointed out the misunderstanding is due to this poor name. Indeed, in such case what I want is both counters to be incremented. What I want is wal_records to reflect the total number of records generated regardless of any content, and wal_num_fpw the number of full page images, as it seems to make the most sense, and the easiest way to estimate the ratio of data due to FPW. > > > 3. We need to enhance the patch to cover WAL usage for parallel > > > vacuum and parallel create index based on Sawada-San's latest patch[1] > > > which fixed the case for buffer usage. > > > > I'm sorry but I'm not following. Do you mean adding regression tests > > for that case? > > > > No. I mean to say we should implement WAL usage calculation for those > two parallel commands. AFAICS, your patch doesn't cover those two > commands. Oh I see. I just assumed that Sawada-san's patch would be committed first and I'd then rebase the patchset on top of the newly added infrastructure to also handle WAL counters, to avoid any conflict on that bugfix while this new feature is being discussed. I'll rebase the patchset against those patches then.
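(In other words, the accounting at the end of XLogInsertRecord() presumably reduces to something like the following sketch:

    /* once the record has actually been inserted */
    if (inserted)
    {
        pgWalUsage.wal_bytes += rechdr->xl_tot_len;
        pgWalUsage.wal_records++;
        pgWalUsage.wal_num_fpw += num_fpw;
    }

every inserted record bumps wal_records, and wal_num_fpw advances by however many full page images XLogRecordAssemble() attached to that record.)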
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Tue, Mar 31, 2020 at 12:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Mar 31, 2020 at 10:44 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > The patch for vacuum conflicts with recent changes in vacuum. So I've > > > > attached rebased one. > > > > > > > > > > + /* > > > + * Next, accumulate buffer usage. (This must wait for the workers to > > > + * finish, or we might get incomplete data.) > > > + */ > > > + for (i = 0; i < nworkers; i++) > > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > + > > > > > > This should be done for launched workers aka > > > lps->pcxt->nworkers_launched. I think a similar problem exists in > > > create index related patch. > > > > You're right. Fixed in the new patches. > > > > On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > Just minor nitpicking: > > > > > > + int i; > > > > > > Assert(!IsParallelWorker()); > > > Assert(ParallelVacuumIsActive(lps)); > > > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, > > > /* Wait for all vacuum workers to finish */ > > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > > > + /* > > > + * Next, accumulate buffer usage. (This must wait for the workers to > > > + * finish, or we might get incomplete data.) > > > + */ > > > + for (i = 0; i < nworkers; i++) > > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > > > We now allow declaring a variable in those loops, so it may be better to avoid > > > declaring i outside the for scope? > > > > We can do that but I was not sure if it's good since other codes > > around there don't use that. So I'd like to leave it for committers. > > It's a trivial change. > > I have reviewed the patch and the patch looks fine to me. > > One minor comment > /+ /* Points to buffer usage are in DSM */ > + BufferUsage *buffer_usage; > + > /buffer usage are in DSM / buffer usage area in DSM > While testing I have found one issue. Basically, during a parallel vacuum, it was showing more number of shared_blk_hits+shared_blks_read. After, some investigation, I found that during the cleanup phase nworkers are -1, and because of this we didn't try to launch worker but "lps->pcxt->nworkers_launched" had the old launched worker count and shared memory also had old buffer read data which was never updated as we did not try to launch the worker. diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index b97b678..5dfaf4d 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, * Next, accumulate buffer usage. (This must wait for the workers to * finish, or we might get incomplete data.) */ - for (i = 0; i < lps->pcxt->nworkers_launched; i++) + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); + for (i = 0; i < nworkers; i++) InstrAccumParallelQuery(&lps->buffer_usage[i]); It worked after the above fix.
On Tue, Mar 31, 2020 at 12:21 PM Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote: > > On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > @@ -448,6 +449,7 @@ XLogInsert(RmgrId rmid, uint8 info) > bool doPageWrites; > XLogRecPtr fpw_lsn; > XLogRecData *rdt; > + int num_fpw = 0; > > /* > * Get values needed to decide whether to do full-page writes. Since > @@ -457,9 +459,9 @@ XLogInsert(RmgrId rmid, uint8 info) > GetFullPageWriteInfo(&RedoRecPtr, &doPageWrites); > > rdt = XLogRecordAssemble(rmid, info, RedoRecPtr, doPageWrites, > - &fpw_lsn); > + &fpw_lsn, &num_fpw); > > - EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags); > + EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags, num_fpw); > } while (EndPos == InvalidXLogRecPtr); > > I think there are some issues in the num_fpw calculation. For some > cases, we have to return from XLogInsert without inserting a record. > Basically, we've to recompute/reassemble the same record. In those > cases, num_fpw should be reset. Thoughts? Mmm, yes, but since the same record is being recomputed from the same RedoRecPtr, doesn't that mean that we need to reset the counter? Otherwise we would count the same FPW multiple times.
On Tue, Mar 31, 2020 at 7:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Tue, Mar 31, 2020 at 12:21 PM Kuntal Ghosh > <kuntalghosh.2007@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 6:14 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > @@ -448,6 +449,7 @@ XLogInsert(RmgrId rmid, uint8 info) > > bool doPageWrites; > > XLogRecPtr fpw_lsn; > > XLogRecData *rdt; > > + int num_fpw = 0; > > > > /* > > * Get values needed to decide whether to do full-page writes. Since > > @@ -457,9 +459,9 @@ XLogInsert(RmgrId rmid, uint8 info) > > GetFullPageWriteInfo(&RedoRecPtr, &doPageWrites); > > > > rdt = XLogRecordAssemble(rmid, info, RedoRecPtr, doPageWrites, > > - &fpw_lsn); > > + &fpw_lsn, &num_fpw); > > > > - EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags); > > + EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags, num_fpw); > > } while (EndPos == InvalidXLogRecPtr); > > > > I think there are some issues in the num_fpw calculation. For some > > cases, we have to return from XLogInsert without inserting a record. > > Basically, we've to recompute/reassemble the same record. In those > > cases, num_fpw should be reset. Thoughts? > > Mmm, yes but since that's the same record is being recomputed from the > same RedoRecPtr, doesn't it mean that we need to reset the counter? > Otherwise we would count the same FPW multiple times. Yes. That was my point as well. I missed the part that you're already resetting the same inside the do-while loop before calling XLogRecordAssemble. Sorry for the noise.
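(For reference, the hunk quoted above declares num_fpw inside the do-while block, so it is implicitly reset on every attempt; the retry loop roughly looks like this sketch of the patched XLogInsert():

    do
    {
        XLogRecPtr  RedoRecPtr;
        bool        doPageWrites;
        XLogRecPtr  fpw_lsn;
        XLogRecData *rdt;
        int         num_fpw = 0;    /* (re)initialized on every attempt */

        GetFullPageWriteInfo(&RedoRecPtr, &doPageWrites);

        rdt = XLogRecordAssemble(rmid, info, RedoRecPtr, doPageWrites,
                                 &fpw_lsn, &num_fpw);

        EndPos = XLogInsertRecord(rdt, fpw_lsn, curinsert_flags, num_fpw);
    } while (EndPos == InvalidXLogRecPtr);

so a record that has to be reassembled does not get its full page images counted twice.)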
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Tue, Mar 31, 2020 at 7:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > While testing I have found one issue. Basically, during a parallel > vacuum, it was showing more number of > shared_blk_hits+shared_blks_read. After, some investigation, I found > that during the cleanup phase nworkers are -1, and because of this we > didn't try to launch worker but "lps->pcxt->nworkers_launched" had the > old launched worker count and shared memory also had old buffer read > data which was never updated as we did not try to launch the worker. > > diff --git a/src/backend/access/heap/vacuumlazy.c > b/src/backend/access/heap/vacuumlazy.c > index b97b678..5dfaf4d 100644 > --- a/src/backend/access/heap/vacuumlazy.c > +++ b/src/backend/access/heap/vacuumlazy.c > @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > IndexBulkDeleteResult **stats, > * Next, accumulate buffer usage. (This must wait for the workers to > * finish, or we might get incomplete data.) > */ > - for (i = 0; i < lps->pcxt->nworkers_launched; i++) > + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); > + for (i = 0; i < nworkers; i++) > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > It worked after the above fix. > Good catch. I think we should not even call WaitForParallelWorkersToFinish for such a case. So, I guess the fix could be, if (workers > 0) { WaitForParallelWorkersToFinish(); for (i = 0; i < lps->pcxt->nworkers_launched; i++) InstrAccumParallelQuery(&lps->buffer_usage[i]); } or something along those lines. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Wed, Apr 1, 2020 at 8:16 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Mar 31, 2020 at 7:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > While testing I have found one issue. Basically, during a parallel > > vacuum, it was showing more number of > > shared_blk_hits+shared_blks_read. After, some investigation, I found > > that during the cleanup phase nworkers are -1, and because of this we > > didn't try to launch worker but "lps->pcxt->nworkers_launched" had the > > old launched worker count and shared memory also had old buffer read > > data which was never updated as we did not try to launch the worker. > > > > diff --git a/src/backend/access/heap/vacuumlazy.c > > b/src/backend/access/heap/vacuumlazy.c > > index b97b678..5dfaf4d 100644 > > --- a/src/backend/access/heap/vacuumlazy.c > > +++ b/src/backend/access/heap/vacuumlazy.c > > @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > IndexBulkDeleteResult **stats, > > * Next, accumulate buffer usage. (This must wait for the workers to > > * finish, or we might get incomplete data.) > > */ > > - for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); > > + for (i = 0; i < nworkers; i++) > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > It worked after the above fix. > > > > Good catch. I think we should not even call > WaitForParallelWorkersToFinish for such a case. So, I guess the fix > could be, > > if (workers > 0) > { > WaitForParallelWorkersToFinish(); > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > InstrAccumParallelQuery(&lps->buffer_usage[i]); > } > > or something along those lines. Hmm, Right! -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Wed, 1 Apr 2020 at 11:46, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Mar 31, 2020 at 7:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > While testing I have found one issue. Basically, during a parallel > > vacuum, it was showing more number of > > shared_blk_hits+shared_blks_read. After, some investigation, I found > > that during the cleanup phase nworkers are -1, and because of this we > > didn't try to launch worker but "lps->pcxt->nworkers_launched" had the > > old launched worker count and shared memory also had old buffer read > > data which was never updated as we did not try to launch the worker. > > > > diff --git a/src/backend/access/heap/vacuumlazy.c > > b/src/backend/access/heap/vacuumlazy.c > > index b97b678..5dfaf4d 100644 > > --- a/src/backend/access/heap/vacuumlazy.c > > +++ b/src/backend/access/heap/vacuumlazy.c > > @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > IndexBulkDeleteResult **stats, > > * Next, accumulate buffer usage. (This must wait for the workers to > > * finish, or we might get incomplete data.) > > */ > > - for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); > > + for (i = 0; i < nworkers; i++) > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > It worked after the above fix. > > > > Good catch. I think we should not even call > WaitForParallelWorkersToFinish for such a case. So, I guess the fix > could be, > > if (workers > 0) > { > WaitForParallelWorkersToFinish(); > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > InstrAccumParallelQuery(&lps->buffer_usage[i]); > } > Agreed. I've attached the updated patch. Thank you for testing, Dilip! Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Wed, Apr 1, 2020 at 8:26 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Wed, 1 Apr 2020 at 11:46, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Mar 31, 2020 at 7:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > While testing I have found one issue. Basically, during a parallel > > > vacuum, it was showing more number of > > > shared_blk_hits+shared_blks_read. After, some investigation, I found > > > that during the cleanup phase nworkers are -1, and because of this we > > > didn't try to launch worker but "lps->pcxt->nworkers_launched" had the > > > old launched worker count and shared memory also had old buffer read > > > data which was never updated as we did not try to launch the worker. > > > > > > diff --git a/src/backend/access/heap/vacuumlazy.c > > > b/src/backend/access/heap/vacuumlazy.c > > > index b97b678..5dfaf4d 100644 > > > --- a/src/backend/access/heap/vacuumlazy.c > > > +++ b/src/backend/access/heap/vacuumlazy.c > > > @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > > IndexBulkDeleteResult **stats, > > > * Next, accumulate buffer usage. (This must wait for the workers to > > > * finish, or we might get incomplete data.) > > > */ > > > - for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > > + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); > > > + for (i = 0; i < nworkers; i++) > > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > > > It worked after the above fix. > > > > > > > Good catch. I think we should not even call > > WaitForParallelWorkersToFinish for such a case. So, I guess the fix > > could be, > > > > if (workers > 0) > > { > > WaitForParallelWorkersToFinish(); > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > } > > > > Agreed. I've attached the updated patch. > > Thank you for testing, Dilip! Thanks! One hunk is failing on the latest head. And, I have rebased the patch for my testing so posting the same. I have done some more testing to test multi-pass vacuum. postgres[114321]=# show maintenance_work_mem ; maintenance_work_mem ---------------------- 1MB (1 row) --Test case select pg_stat_statements_reset(); drop table test; CREATE TABLE test (a int, b int); CREATE INDEX idx1 on test(a); CREATE INDEX idx2 on test(b); INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; DELETE FROM test where a%2=0; VACUUM (PARALLEL n) test; select query, total_time, shared_blks_hit, shared_blks_read, shared_blks_hit + shared_blks_read as total_read_blks, shared_blks_dirtied, shared_blks_written from pg_stat_statements where query like 'VACUUM%'; query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written --------------------------+-------------+-----------------+------------------+-----------------+---------------------+--------------------- VACUUM (PARALLEL 0) test | 5964.282408 | 92447 | 6 | 92453 | 19789 | 0 query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written --------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- VACUUM (PARALLEL 1) test | 3957.7658810000003 | 92447 | 6 | 92453 | 19789 | 0 (1 row) So I am getting correct results with the multi-pass vacuum. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Wed, Apr 1, 2020 at 8:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > Agreed. I've attached the updated patch. > > > > Thank you for testing, Dilip! > > Thanks! One hunk is failing on the latest head. And, I have rebased > the patch for my testing so posting the same. I have done some more > testing to test multi-pass vacuum. > The patch looks good to me. I have done a few minor modifications (a) moved the declaration of variable closer to where it is used, (b) changed a comment, (c) ran pgindent. I have also done some additional testing with more number of indexes and found that vacuum and parallel vacuum used the same number of total_read_blks and that is what is expected here. Let me know what you think of the attached? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Wed, Apr 1, 2020 at 8:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Apr 1, 2020 at 8:26 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Wed, 1 Apr 2020 at 11:46, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Mar 31, 2020 at 7:32 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > While testing I have found one issue. Basically, during a parallel > > > > vacuum, it was showing more number of > > > > shared_blk_hits+shared_blks_read. After, some investigation, I found > > > > that during the cleanup phase nworkers are -1, and because of this we > > > > didn't try to launch worker but "lps->pcxt->nworkers_launched" had the > > > > old launched worker count and shared memory also had old buffer read > > > > data which was never updated as we did not try to launch the worker. > > > > > > > > diff --git a/src/backend/access/heap/vacuumlazy.c > > > > b/src/backend/access/heap/vacuumlazy.c > > > > index b97b678..5dfaf4d 100644 > > > > --- a/src/backend/access/heap/vacuumlazy.c > > > > +++ b/src/backend/access/heap/vacuumlazy.c > > > > @@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > > > IndexBulkDeleteResult **stats, > > > > * Next, accumulate buffer usage. (This must wait for the workers to > > > > * finish, or we might get incomplete data.) > > > > */ > > > > - for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > > > + nworkers = Min(nworkers, lps->pcxt->nworkers_launched); > > > > + for (i = 0; i < nworkers; i++) > > > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > > > > > It worked after the above fix. > > > > > > > > > > Good catch. I think we should not even call > > > WaitForParallelWorkersToFinish for such a case. So, I guess the fix > > > could be, > > > > > > if (workers > 0) > > > { > > > WaitForParallelWorkersToFinish(); > > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > > InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > } > > > > > > > Agreed. I've attached the updated patch. > > > > Thank you for testing, Dilip! > > Thanks! One hunk is failing on the latest head. And, I have rebased > the patch for my testing so posting the same. I have done some more > testing to test multi-pass vacuum. 
> > postgres[114321]=# show maintenance_work_mem ; > maintenance_work_mem > ---------------------- > 1MB > (1 row) > > --Test case > select pg_stat_statements_reset(); > drop table test; > CREATE TABLE test (a int, b int); > CREATE INDEX idx1 on test(a); > CREATE INDEX idx2 on test(b); > INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; > DELETE FROM test where a%2=0; > VACUUM (PARALLEL n) test; > select query, total_time, shared_blks_hit, shared_blks_read, > shared_blks_hit + shared_blks_read as total_read_blks, > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > query like 'VACUUM%'; > > query | total_time | shared_blks_hit | > shared_blks_read | total_read_blks | shared_blks_dirtied | > shared_blks_written > --------------------------+-------------+-----------------+------------------+-----------------+---------------------+--------------------- > VACUUM (PARALLEL 0) test | 5964.282408 | 92447 | > 6 | 92453 | 19789 | 0 > > > query | total_time | shared_blks_hit | > shared_blks_read | total_read_blks | shared_blks_dirtied | > shared_blks_written > --------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- > VACUUM (PARALLEL 1) test | 3957.7658810000003 | 92447 | > 6 | 92453 | 19789 | > 0 > (1 row) > > So I am getting correct results with the multi-pass vacuum. I have done some testing for the parallel "create index". postgres[99536]=# show maintenance_work_mem ; maintenance_work_mem ---------------------- 1MB (1 row) CREATE TABLE test (a int, b int); INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; CREATE INDEX idx1 on test(a); select query, total_time, shared_blks_hit, shared_blks_read, shared_blks_hit + shared_blks_read as total_read_blks, shared_blks_dirtied, shared_blks_written from pg_stat_statements where query like 'CREATE INDEX%'; SET max_parallel_maintenance_workers TO 0; query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 1947.4959979999999 | 8947 | 11 | 8958 | 5 | 0 SET max_parallel_maintenance_workers TO 2; query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 1942.1426040000001 | 8960 | 14 | 8974 | 5 | 0 (1 row) I have noticed that the total_read_blks, with the parallel, create index is more compared to non-parallel one. I have created a fresh database before each run. I am not much aware of the internal code of parallel create an index so I am not sure whether it is expected to read extra blocks with the parallel create an index. I guess maybe because multiple workers are inserting int the btree they might need to visit some btree nodes multiple times while traversing the tree down. But, it's better if someone who have more idea with this code can confirm this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
So here's a v9, rebased on top of the latest versions of Sawada-san's bug fixes (Amit's v6 for vacuum and Sawada-san's v2 for create index), with all previously mentioned changes. Note that I'm only attaching those patches for convenience and to make sure that cfbot is happy.
Attachment
- v9-0001-Allow-parallel-vacuum-to-accumulate-buffer-usage.patch
- v9-0002-Allow-parallel-index-creation-to-accumulate-buffe.patch
- v9-0003-Add-infrastructure-to-track-WAL-usage.patch
- v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut.patch
- v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements.patch
- v9-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum-.patch
On Wed, Apr 1, 2020 at 1:32 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > So here's a v9, rebased on top of the latest versions of Sawada-san's bug fixes > (Amit's v6 for vacuum and Sawada-san's v2 for create index), with all > previously mentionned changes. > Few other comments: v9-0003-Add-infrastructure-to-track-WAL-usage 1. static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); - +static void WalUsageAdd(WalUsage *dst, WalUsage *add); Looks like a spurious line removal 2. + /* Report a full page imsage constructed for the WAL record */ + *num_fpw += 1; Typo. /imsage/image 3. Doing some testing with and without parallelism to ensure WAL usage data is correct would be great and if possible, share the results? v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements 4. +-- SELECT usage data, check WAL usage is reported, wal_records equal rows count for INSERT/UPDATE/DELETE +SELECT query, calls, rows, +wal_bytes > 0 as wal_bytes_generated, +wal_records > 0 as wal_records_generated, +wal_records = rows as wal_records_as_rows +FROM pg_stat_statements ORDER BY query COLLATE "C"; + query | calls | rows | wal_bytes_generated | wal_records_generated | wal_records_as_rows +------------------------------------------------------------------+-------+------+---------------------+-----------------------+--------------------- + DELETE FROM pgss_test WHERE a > $1 | 1 | 1 | t | t | t + DROP TABLE pgss_test | 1 | 0 | t | t | f + INSERT INTO pgss_test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | 1 | 3 | t | t | t + INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | 1 | 10 | t | t | t + SELECT * FROM pgss_test ORDER BY a | 1 | 12 | f | f | f + SELECT * FROM pgss_test WHERE a > $1 ORDER BY a | 2 | 4 | f | f | f + SELECT * FROM pgss_test WHERE a IN ($1, $2, $3, $4, $5) | 1 | 8 | f | f | f + SELECT pg_stat_statements_reset() | 1 | 1 | f | f | f + SET pg_stat_statements.track_utility = FALSE | 1 | 0 | f | f | t + UPDATE pgss_test SET b = $1 WHERE a = $2 | 6 | 6 | t | t | t + UPDATE pgss_test SET b = $1 WHERE a > $2 | 1 | 3 | t | t | t +(11 rows) + I am not sure if the above tests make much sense as they are just testing that if WAL is generated for these commands. I understand it is not easy to make these tests reliable but in that case, we can think of some simple tests. It seems to me that the difficulty is due to full_page_writes as that depends on the checkpoint. Can we make full_page_writes = off for these tests and check some simple Insert/Update/Delete cases? Alternatively, if you can present the reason why that is unstable or are tricky to write, then we can simply get rid of these tests because I don't see tests for BufferUsage. Let not write tests for the sake of writing it unless they can detect bugs in the future or are meaningfully covering the new code added. 5. 
-SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; - query | calls | rows ------------------------------------+-------+------ - SELECT $1::TEXT | 1 | 1 - SELECT PLUS_ONE($1) | 2 | 2 - SELECT PLUS_TWO($1) | 2 | 2 - SELECT pg_stat_statements_reset() | 1 | 1 +SELECT query, calls, rows, wal_bytes, wal_records FROM pg_stat_statements ORDER BY query COLLATE "C"; + query | calls | rows | wal_bytes | wal_records +-----------------------------------+-------+------+-----------+------------- + SELECT $1::TEXT | 1 | 1 | 0 | 0 + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 (4 rows) Again, I am not sure if these modifications make much sense? 6. static void pgss_shmem_startup(void); @@ -313,6 +318,7 @@ static void pgss_store(const char *query, uint64 queryId, int query_location, int query_len, double total_time, uint64 rows, const BufferUsage *bufusage, + const WalUsage* walusage, pgssJumbleState *jstate); The alignment for walusage doesn't seem to be correct. Running pgindent will fix this. 7. + values[i++] = Int64GetDatumFast(tmp.wal_records); + values[i++] = UInt64GetDatum(tmp.wal_num_fpw); Why are they different? I think we should use the same *GetDatum API (probably Int64GetDatumFast) for these. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
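For reference while reading the review items above (the WalUsageAdd() prototype in item 1 and the counters in items 4-7), a rough sketch of the 0003 infrastructure being discussed. It assumes PostgreSQL's int64/uint64 typedefs from c.h; the field layout is taken from the columns shown in this thread and may not match the final patch exactly.

typedef struct WalUsage
{
	int64		wal_records;	/* # of WAL records produced */
	int64		wal_num_fpw;	/* # of WAL full page images produced */
	uint64		wal_bytes;		/* size of WAL records produced */
} WalUsage;

static void
WalUsageAdd(WalUsage *dst, WalUsage *add)
{
	dst->wal_records += add->wal_records;
	dst->wal_num_fpw += add->wal_num_fpw;
	dst->wal_bytes += add->wal_bytes;
}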
On Wed, Apr 1, 2020 at 4:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > One more comment related to this patch. + + snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes); + + /* Convert to numeric. */ + wal_bytes = DirectFunctionCall3(numeric_in, + CStringGetDatum(buf), + ObjectIdGetDatum(0), + Int32GetDatum(-1)); + + values[i++] = wal_bytes; I see that other places that display uint64 values use BIGINT datatype in SQL, so why can't we do the same here? See the usage of queryid in pg_stat_statements or internal_pages, *_pages exposed via pgstatindex.c. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
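For comparison, the bigint route being asked about would look roughly like the sketch below (wal_bytes_signed is a hypothetical local, not code from the patch). It works, but once the counter exceeds PG_INT64_MAX the cast wraps around and the column would show a negative byte count, which is the downside Julien points out later in the thread.

	int64		wal_bytes_signed = (int64) tmp.wal_bytes;

	/* exposed as bigint; wraps to a negative value past PG_INT64_MAX */
	values[i++] = Int64GetDatumFast(wal_bytes_signed);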
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Wed, Apr 1, 2020 at 12:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 1, 2020 at 8:51 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > Agreed. I've attached the updated patch. > > > > > > Thank you for testing, Dilip! > > > > Thanks! One hunk is failing on the latest head. And, I have rebased > > the patch for my testing so posting the same. I have done some more > > testing to test multi-pass vacuum. > > > > The patch looks good to me. I have done a few minor modifications (a) > moved the declaration of variable closer to where it is used, (b) > changed a comment, (c) ran pgindent. I have also done some additional > testing with more number of indexes and found that vacuum and parallel > vacuum used the same number of total_read_blks and that is what is > expected here. > > Let me know what you think of the attached? The patch looks fine to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 1, 2020 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 1, 2020 at 4:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > > > One more comment related to this patch. > + > + snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes); > + > + /* Convert to numeric. */ > + wal_bytes = DirectFunctionCall3(numeric_in, > + CStringGetDatum(buf), > + ObjectIdGetDatum(0), > + Int32GetDatum(-1)); > + > + values[i++] = wal_bytes; > > I see that other places that display uint64 values use BIGINT datatype > in SQL, so why can't we do the same here? See the usage of queryid in > pg_stat_statements or internal_pages, *_pages exposed via > pgstatindex.c. I have reviewed 0003 and 0004, I have a few comments. v9-0003-Add-infrastructure-to-track-WAL-usage 1. /* Points to buffer usage area in DSM */ BufferUsage *buffer_usage; + /* Points to WAL usage area in DSM */ + WalUsage *wal_usage; Better to give one blank line between the previous statement/variable declaration and the next comment line. /* Points to buffer usage area in DSM */ BufferUsage *buffer_usage; ---------Empty line here-------------------- + /* Points to WAL usage area in DSM */ + WalUsage *wal_usage; 2. @@ -2154,7 +2157,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, WaitForParallelWorkersToFinish(lps->pcxt); for (i = 0; i < lps->pcxt->nworkers_launched; i++) - InstrAccumParallelQuery(&lps->buffer_usage[i]); + InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); } The existing comment above this loop, which just mentions the buffer usage, not the wal usage so I guess we need to change that. " /* * Next, accumulate buffer usage. (This must wait for the workers to * finish, or we might get incomplete data.) */" v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut 3. + if (usage->wal_num_fpw > 0) + appendStringInfo(es->str, " full page records=%ld", + usage->wal_num_fpw); + if (usage->wal_bytes > 0) + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, + usage->wal_bytes); Shall we change to 'full page writes' or 'full page image' instead of full page records? Apart from this, I have some testing to see the wal_usage with the parallel vacuum and the results look fine. postgres[104248]=# CREATE TABLE test (a int, b int); CREATE TABLE postgres[104248]=# INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; INSERT 0 2000000 postgres[104248]=# CREATE INDEX idx1 on test(a); CREATE INDEX postgres[104248]=# VACUUM (PARALLEL 1) test; VACUUM postgres[104248]=# select query , wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query like 'VACUUM%'; query | wal_bytes | wal_records | wal_num_fpw --------------------------+-----------+-------------+------------- VACUUM (PARALLEL 1) test | 72814331 | 8857 | 8855 postgres[106479]=# CREATE TABLE test (a int, b int); CREATE TABLE postgres[106479]=# INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; INSERT 0 2000000 postgres[106479]=# CREATE INDEX idx1 on test(a); CREATE INDEX postgres[106479]=# VACUUM (PARALLEL 0) test; VACUUM postgres[106479]=# select query , wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query like 'VACUUM%'; query | wal_bytes | wal_records | wal_num_fpw --------------------------+-----------+-------------+------------- VACUUM (PARALLEL 0) test | 72814331 | 8857 | 8855 By tomorrow, I will try to finish reviewing 0005 and 0006. 
-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
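To make the hunk in comment 2 concrete, the widened accumulator could look roughly like this; pgBufferUsage is the existing backend-global counter in instrument.c, while pgWalUsage is assumed here to be its WAL counterpart added by 0003 (sketch only, not verified against the patch):

void
InstrAccumParallelQuery(BufferUsage *bufusage, WalUsage *walusage)
{
	/* fold one worker's counters, read back from DSM, into the leader's */
	BufferUsageAdd(&pgBufferUsage, bufusage);
	WalUsageAdd(&pgWalUsage, walusage);
}

The leader would call this once per launched worker, exactly as the quoted lazy_parallel_vacuum_indexes() loop already does for buffer usage.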
Hi, I'm replying here to all reviews that have been sent, thanks a lot! On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > On Wed, Apr 1, 2020 at 1:32 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > So here's a v9, rebased on top of the latest versions of Sawada-san's bug fixes > > (Amit's v6 for vacuum and Sawada-san's v2 for create index), with all > > previously mentionned changes. > > > > Few other comments: > v9-0003-Add-infrastructure-to-track-WAL-usage > 1. > static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); > - > +static void WalUsageAdd(WalUsage *dst, WalUsage *add); > > Looks like a spurious line removal Fixed. > 2. > + /* Report a full page imsage constructed for the WAL record */ > + *num_fpw += 1; > > Typo. /imsage/image Ah sorry I though I fixed it previously, fixed. > 3. Doing some testing with and without parallelism to ensure WAL usage > data is correct would be great and if possible, share the results? I just saw that Dilip did some testing, but just in case here is some additional one - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; query | calls | wal_bytes | wal_records | wal_num_fpw ------------------------+-------+-----------+-------------+------------- vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 (2 rows) - create index, overload t1's parallel_workers, using the 1M line just vacuumed: =# alter table t1 set (parallel_workers = 2); ALTER TABLE =# create index t1_parallel_2 on t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 0); ALTER TABLE =# create index t1_parallel_0 on t1(id); CREATE INDEX =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; query | calls | wal_bytes | wal_records | wal_num_fpw --------------------------------------+-------+-----------+-------------+------------- create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 (2 rows) It all looks good to me. > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > 4. 
> +-- SELECT usage data, check WAL usage is reported, wal_records equal > rows count for INSERT/UPDATE/DELETE > +SELECT query, calls, rows, > +wal_bytes > 0 as wal_bytes_generated, > +wal_records > 0 as wal_records_generated, > +wal_records = rows as wal_records_as_rows > +FROM pg_stat_statements ORDER BY query COLLATE "C"; > + query | > calls | rows | wal_bytes_generated | wal_records_generated | > wal_records_as_rows > +------------------------------------------------------------------+-------+------+---------------------+-----------------------+--------------------- > + DELETE FROM pgss_test WHERE a > $1 | > 1 | 1 | t | t | t > + DROP TABLE pgss_test | > 1 | 0 | t | t | f > + INSERT INTO pgss_test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | > 1 | 3 | t | t | t > + INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | > 1 | 10 | t | t | t > + SELECT * FROM pgss_test ORDER BY a | > 1 | 12 | f | f | f > + SELECT * FROM pgss_test WHERE a > $1 ORDER BY a | > 2 | 4 | f | f | f > + SELECT * FROM pgss_test WHERE a IN ($1, $2, $3, $4, $5) | > 1 | 8 | f | f | f > + SELECT pg_stat_statements_reset() | > 1 | 1 | f | f | f > + SET pg_stat_statements.track_utility = FALSE | > 1 | 0 | f | f | t > + UPDATE pgss_test SET b = $1 WHERE a = $2 | > 6 | 6 | t | t | t > + UPDATE pgss_test SET b = $1 WHERE a > $2 | > 1 | 3 | t | t | t > +(11 rows) > + > > I am not sure if the above tests make much sense as they are just > testing that if WAL is generated for these commands. I understand it > is not easy to make these tests reliable but in that case, we can > think of some simple tests. It seems to me that the difficulty is due > to full_page_writes as that depends on the checkpoint. Can we make > full_page_writes = off for these tests and check some simple > Insert/Update/Delete cases? Alternatively, if you can present the > reason why that is unstable or are tricky to write, then we can simply > get rid of these tests because I don't see tests for BufferUsage. Let > not write tests for the sake of writing it unless they can detect bugs > in the future or are meaningfully covering the new code added. I don't think that we can have any hope in a stable amount of WAL bytes generated, so testing a positive number looks sensible to me. Then testing that each 1-line-write query generates a WAL record also looks sensible, so I kept this. I realized that Kirill used an existing set of queries that were previously added to validate the multi queries commands behavior, so there's no need to have all of them again. I just kept one of each (insert, update, delete, select) to make sure that we do record WAL activity there, but I don't think that more can really be done. I still think that this is better than nothing, but if you disagree feel free to drop those tests. > 5. 
> -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; > - query | calls | rows > ------------------------------------+-------+------ > - SELECT $1::TEXT | 1 | 1 > - SELECT PLUS_ONE($1) | 2 | 2 > - SELECT PLUS_TWO($1) | 2 | 2 > - SELECT pg_stat_statements_reset() | 1 | 1 > +SELECT query, calls, rows, wal_bytes, wal_records FROM > pg_stat_statements ORDER BY query COLLATE "C"; > + query | calls | rows | wal_bytes | wal_records > +-----------------------------------+-------+------+-----------+------------- > + SELECT $1::TEXT | 1 | 1 | 0 | 0 > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 > (4 rows) > > Again, I am not sure if these modifications make much sense? Those are queries that were previously executed. As those are read-only query, that are pretty much guaranteed to not cause any WAL activity, I don't see how it hurts to test at the same time that that's we indeed record with pg_stat_statements, just to be safe. Once again, feel free to drop the extra wal_* columns from the output if you disagree. > 6. > static void pgss_shmem_startup(void); > @@ -313,6 +318,7 @@ static void pgss_store(const char *query, uint64 queryId, > int query_location, int query_len, > double total_time, uint64 rows, > const BufferUsage *bufusage, > + const WalUsage* walusage, > pgssJumbleState *jstate); > > The alignment for walusage doesn't seem to be correct. Running > pgindent will fix this. Indeed, fixed. > 7. > + values[i++] = Int64GetDatumFast(tmp.wal_records); > + values[i++] = UInt64GetDatum(tmp.wal_num_fpw); > > Why are they different? I think we should use the same *GetDatum API > (probably Int64GetDatumFast) for these. Oops, that's a mistake from when I was working on the wal_bytes output, fixed. > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > > > One more comment related to this patch. > + > + snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes); > + > + /* Convert to numeric. */ > + wal_bytes = DirectFunctionCall3(numeric_in, > + CStringGetDatum(buf), > + ObjectIdGetDatum(0), > + Int32GetDatum(-1)); > + > + values[i++] = wal_bytes; > > I see that other places that display uint64 values use BIGINT datatype > in SQL, so why can't we do the same here? See the usage of queryid in > pg_stat_statements or internal_pages, *_pages exposed via > pgstatindex.c. That's because it's harmless to report a signed number for a hash (at least comapred to the overhead of having it unsigned), while that's certainly not wanted to report a negative amount of WAL bytes generated if it goes beyond bigint limit. See the usage of pg_lsn_mi in pg_lsn.c for instance. On Wed, Apr 01, 2020 at 07:20:31PM +0530, Dilip Kumar wrote: > > I have reviewed 0003 and 0004, I have a few comments. > v9-0003-Add-infrastructure-to-track-WAL-usage > > 1. > /* Points to buffer usage area in DSM */ > BufferUsage *buffer_usage; > + /* Points to WAL usage area in DSM */ > + WalUsage *wal_usage; > > Better to give one blank line between the previous statement/variable > declaration and the next comment line. Fixed. > 2. 
> @@ -2154,7 +2157,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > IndexBulkDeleteResult **stats, > WaitForParallelWorkersToFinish(lps->pcxt); > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > - InstrAccumParallelQuery(&lps->buffer_usage[i]); > + InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); > } > > The existing comment above this loop, which just mentions the buffer > usage, not the wal usage so I guess we need to change that. Ah indeed, I thought I caught all the comments but missed this one. Fixed. > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut > > 3. > + if (usage->wal_num_fpw > 0) > + appendStringInfo(es->str, " full page records=%ld", > + usage->wal_num_fpw); > + if (usage->wal_bytes > 0) > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, > + usage->wal_bytes); > > Shall we change to 'full page writes' or 'full page image' instead of > full page records? Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed. > Apart from this, I have some testing to see the wal_usage with the > parallel vacuum and the results look fine. > > postgres[104248]=# CREATE TABLE test (a int, b int); > CREATE TABLE > postgres[104248]=# INSERT INTO test SELECT i, i FROM > GENERATE_SERIES(1,2000000) as i; > INSERT 0 2000000 > postgres[104248]=# CREATE INDEX idx1 on test(a); > CREATE INDEX > postgres[104248]=# VACUUM (PARALLEL 1) test; > VACUUM > postgres[104248]=# select query , wal_bytes, wal_records, wal_num_fpw > from pg_stat_statements where query like 'VACUUM%'; > query | wal_bytes | wal_records | wal_num_fpw > --------------------------+-----------+-------------+------------- > VACUUM (PARALLEL 1) test | 72814331 | 8857 | 8855 > > > > postgres[106479]=# CREATE TABLE test (a int, b int); > CREATE TABLE > postgres[106479]=# INSERT INTO test SELECT i, i FROM > GENERATE_SERIES(1,2000000) as i; > INSERT 0 2000000 > postgres[106479]=# CREATE INDEX idx1 on test(a); > CREATE INDEX > postgres[106479]=# VACUUM (PARALLEL 0) test; > VACUUM > postgres[106479]=# select query , wal_bytes, wal_records, wal_num_fpw > from pg_stat_statements where query like 'VACUUM%'; > query | wal_bytes | wal_records | wal_num_fpw > --------------------------+-----------+-------------+------------- > VACUUM (PARALLEL 0) test | 72814331 | 8857 | 8855 Thanks! I did some similar testing, with also seq/parallel index creation and got similar results. > By tomorrow, I will try to finish reviewing 0005 and 0006. Thanks!
Attachment
- v10-0001-Allow-parallel-vacuum-to-accumulate-buffer-usage.patch
- v10-0002-Allow-parallel-index-creation-to-accumulate-buff.patch
- v10-0003-Add-infrastructure-to-track-WAL-usage.patch
- v10-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-au.patch
- v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements.patch
- v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum.patch
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
Adding Peter G. On Wed, Apr 1, 2020 at 12:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have done some testing for the parallel "create index". > > postgres[99536]=# show maintenance_work_mem ; > maintenance_work_mem > ---------------------- > 1MB > (1 row) > > CREATE TABLE test (a int, b int); > INSERT INTO test SELECT i, i FROM GENERATE_SERIES(1,2000000) as i; > CREATE INDEX idx1 on test(a); > select query, total_time, shared_blks_hit, shared_blks_read, > shared_blks_hit + shared_blks_read as total_read_blks, > shared_blks_dirtied, shared_blks_written from pg_stat_statements where > query like 'CREATE INDEX%'; > > > SET max_parallel_maintenance_workers TO 0; > query | total_time | shared_blks_hit | > shared_blks_read | total_read_blks | shared_blks_dirtied | > shared_blks_written > ------------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- > CREATE INDEX idx1 on test(a) | 1947.4959979999999 | 8947 | > 11 | 8958 | 5 | > 0 > > SET max_parallel_maintenance_workers TO 2; > > query | total_time | shared_blks_hit | > shared_blks_read | total_read_blks | shared_blks_dirtied | > shared_blks_written > ------------------------------+--------------------+-----------------+------------------+-----------------+---------------------+--------------------- > CREATE INDEX idx1 on test(a) | 1942.1426040000001 | 8960 | > 14 | 8974 | 5 | > 0 > (1 row) > > I have noticed that the total_read_blks, with the parallel, create > index is more compared to non-parallel one. I have created a fresh > database before each run. I am not much aware of the internal code of > parallel create an index so I am not sure whether it is expected to > read extra blocks with the parallel create an index. I guess maybe > because multiple workers are inserting int the btree they might need > to visit some btree nodes multiple times while traversing the tree > down. But, it's better if someone who have more idea with this code > can confirm this. > Peter, Is this behavior expected? Let me summarize the situation so that it would be easier for Peter to comment. Julien has noticed that parallel vacuum and parallel create index doesn't seem to report correct values for buffer usage stats. Sawada-San wrote a patch to fix the problem for both the cases. We expect that 'total_read_blks' as reported in pg_stat_statements should give the same value for parallel and non-parallel operations. We see that is true for parallel vacuum and previously we have the same observation for the parallel query. Now, for parallel create index this doesn't seem to be true as test results by Dilip show that. We have two possibilities here (a) there is some bug in Sawada-San's patch or (b) this is expected behavior for parallel create index. What do you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Peter Geoghegan
On Wed, Apr 1, 2020 at 7:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Peter, Is this behavior expected? > > Let me summarize the situation so that it would be easier for Peter to > comment. Julien has noticed that parallel vacuum and parallel create > index doesn't seem to report correct values for buffer usage stats. > Sawada-San wrote a patch to fix the problem for both the cases. We > expect that 'total_read_blks' as reported in pg_stat_statements should > give the same value for parallel and non-parallel operations. We see > that is true for parallel vacuum and previously we have the same > observation for the parallel query. Now, for parallel create index > this doesn't seem to be true as test results by Dilip show that. We > have two possibilities here (a) there is some bug in Sawada-San's > patch or (b) this is expected behavior for parallel create index. > What do you think? nbtree CREATE INDEX doesn't even go through the buffer manager. The difference that Dilip showed is probably due to extra catalog accesses in the two parallel workers -- pg_amproc lookups, and the like. Those are rather small differences, overall. Can Dilip demonstrate that the "extra" buffer accesses are proportionate to the number of workers launched in some constant, predictable way? -- Peter Geoghegan
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Thu, Apr 2, 2020 at 8:34 AM Peter Geoghegan <pg@bowt.ie> wrote: > > On Wed, Apr 1, 2020 at 7:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > Peter, Is this behavior expected? > > > > Let me summarize the situation so that it would be easier for Peter to > > comment. Julien has noticed that parallel vacuum and parallel create > > index doesn't seem to report correct values for buffer usage stats. > > Sawada-San wrote a patch to fix the problem for both the cases. We > > expect that 'total_read_blks' as reported in pg_stat_statements should > > give the same value for parallel and non-parallel operations. We see > > that is true for parallel vacuum and previously we have the same > > observation for the parallel query. Now, for parallel create index > > this doesn't seem to be true as test results by Dilip show that. We > > have two possibilities here (a) there is some bug in Sawada-San's > > patch or (b) this is expected behavior for parallel create index. > > What do you think? > > nbtree CREATE INDEX doesn't even go through the buffer manager. Thanks for clarifying. So IIUC, it will not go through the buffer manager for the index pages, but for the heap pages, it will still go through the buffer manager. > The > difference that Dilip showed is probably due to extra catalog accesses > in the two parallel workers -- pg_amproc lookups, and the like. Those > are rather small differences, overall. > Can Dilip demonstrate the the "extra" buffer accesses are > proportionate to the number of workers launched in some constant, > predictable way? Okay, I will test this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Dilip Kumar
On Thu, Apr 2, 2020 at 9:13 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 8:34 AM Peter Geoghegan <pg@bowt.ie> wrote: > > > > On Wed, Apr 1, 2020 at 7:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Peter, Is this behavior expected? > > > > > > Let me summarize the situation so that it would be easier for Peter to > > > comment. Julien has noticed that parallel vacuum and parallel create > > > index doesn't seem to report correct values for buffer usage stats. > > > Sawada-San wrote a patch to fix the problem for both the cases. We > > > expect that 'total_read_blks' as reported in pg_stat_statements should > > > give the same value for parallel and non-parallel operations. We see > > > that is true for parallel vacuum and previously we have the same > > > observation for the parallel query. Now, for parallel create index > > > this doesn't seem to be true as test results by Dilip show that. We > > > have two possibilities here (a) there is some bug in Sawada-San's > > > patch or (b) this is expected behavior for parallel create index. > > > What do you think? > > > > nbtree CREATE INDEX doesn't even go through the buffer manager. > > Thanks for clarifying. So IIUC, it will not go through the buffer > manager for the index pages, but for the heap pages, it will still go > through the buffer manager. > > > The > > difference that Dilip showed is probably due to extra catalog accesses > > in the two parallel workers -- pg_amproc lookups, and the like. Those > > are rather small differences, overall. > > > Can Dilip demonstrate the the "extra" buffer accesses are > > proportionate to the number of workers launched in some constant, > > predictable way? > > Okay, I will test this. 0-worker query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+-------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 1228.895057 | 8947 | 11 | 8971 | 5 | 0 1-worker query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+-------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 1006.157231 | 8962 | 12 | 8974 | 5 | 0 2-workers query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 949.44663 | 8965 | 12 | 8977 | 5 | 0 3-workers query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+-------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 1037.297196 | 8968 | 12 | 8980 | 5 | 0 4-workers query | total_time | shared_blks_hit | shared_blks_read | total_read_blks | shared_blks_dirtied | shared_blks_written ------------------------------+------------+-----------------+------------------+-----------------+---------------------+--------------------- CREATE INDEX idx1 on test(a) | 889.332782 | 8971 | 12 | 8983 | 6 | 0 You are right, it is increasing with some constant factor. 
-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > 3. Doing some testing with and without parallelism to ensure WAL usage > > data is correct would be great and if possible, share the results? > > > I just saw that Dilip did some testing, but just in case here is some > additional one > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > ------------------------+-------+-----------+-------------+------------- > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > (2 rows) > > - create index, overload t1's parallel_workers, using the 1M line just > vacuumed: > > =# alter table t1 set (parallel_workers = 2); > ALTER TABLE > > =# create index t1_parallel_2 on t1(id); > CREATE INDEX > > =# alter table t1 set (parallel_workers = 0); > ALTER TABLE > > =# create index t1_parallel_0 on t1(id); > CREATE INDEX > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > --------------------------------------+-------+-----------+-------------+------------- > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 > (2 rows) > > It all looks good to me. > Here the wal_num_fpw and wal_bytes are different between parallel and non-parallel versions. Is it due to checkpoint or something else? We can probably rule out checkpoint by increasing checkpoint_timeout and other checkpoint related parameters. > > > 5. > > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; > > - query | calls | rows > > ------------------------------------+-------+------ > > - SELECT $1::TEXT | 1 | 1 > > - SELECT PLUS_ONE($1) | 2 | 2 > > - SELECT PLUS_TWO($1) | 2 | 2 > > - SELECT pg_stat_statements_reset() | 1 | 1 > > +SELECT query, calls, rows, wal_bytes, wal_records FROM > > pg_stat_statements ORDER BY query COLLATE "C"; > > + query | calls | rows | wal_bytes | wal_records > > +-----------------------------------+-------+------+-----------+------------- > > + SELECT $1::TEXT | 1 | 1 | 0 | 0 > > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 > > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 > > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 > > (4 rows) > > > > Again, I am not sure if these modifications make much sense? > > > Those are queries that were previously executed. As those are read-only query, > that are pretty much guaranteed to not cause any WAL activity, I don't see how > it hurts to test at the same time that that's we indeed record with > pg_stat_statements, just to be safe. > On a similar theory, one could have checked bufferusage stats as well. The statements are using some expressions so don't see any value in check all usage data for such statements. > Once again, feel free to drop the extra > wal_* columns from the output if you disagree. > Right now, that particular patch is not getting applied (probably due to recent commit 17e0328224). Can you rebase it? > > > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut > > > > 3. 
> > + if (usage->wal_num_fpw > 0) > > + appendStringInfo(es->str, " full page records=%ld", > > + usage->wal_num_fpw); > > + if (usage->wal_bytes > 0) > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, > > + usage->wal_bytes); > > > > Shall we change to 'full page writes' or 'full page image' instead of > > full page records? > > > Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed. > I don't see this change in the patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 2, 2020 at 11:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > Also, I forgot to mention that let's not base this on buffer usage patch for create index (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as per recent discussion I am not sure about its usefulness. I think we can proceed with this patch without v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > Hi, > > I'm replying here to all reviews that have been sent, thanks a lot! > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > On Wed, Apr 1, 2020 at 1:32 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > So here's a v9, rebased on top of the latest versions of Sawada-san's bug fixes > > > (Amit's v6 for vacuum and Sawada-san's v2 for create index), with all > > > previously mentionned changes. > > > > > > > Few other comments: > > v9-0003-Add-infrastructure-to-track-WAL-usage > > 1. > > static void BufferUsageAdd(BufferUsage *dst, const BufferUsage *add); > > - > > +static void WalUsageAdd(WalUsage *dst, WalUsage *add); > > > > Looks like a spurious line removal > > > Fixed. > > > > 2. > > + /* Report a full page imsage constructed for the WAL record */ > > + *num_fpw += 1; > > > > Typo. /imsage/image > > > Ah sorry I though I fixed it previously, fixed. > > > > 3. Doing some testing with and without parallelism to ensure WAL usage > > data is correct would be great and if possible, share the results? > > > I just saw that Dilip did some testing, but just in case here is some > additional one > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > ------------------------+-------+-----------+-------------+------------- > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > (2 rows) > > - create index, overload t1's parallel_workers, using the 1M line just > vacuumed: > > =# alter table t1 set (parallel_workers = 2); > ALTER TABLE > > =# create index t1_parallel_2 on t1(id); > CREATE INDEX > > =# alter table t1 set (parallel_workers = 0); > ALTER TABLE > > =# create index t1_parallel_0 on t1(id); > CREATE INDEX > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > --------------------------------------+-------+-----------+-------------+------------- > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 > (2 rows) > > It all looks good to me. > > > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > 4. 
> > +-- SELECT usage data, check WAL usage is reported, wal_records equal > > rows count for INSERT/UPDATE/DELETE > > +SELECT query, calls, rows, > > +wal_bytes > 0 as wal_bytes_generated, > > +wal_records > 0 as wal_records_generated, > > +wal_records = rows as wal_records_as_rows > > +FROM pg_stat_statements ORDER BY query COLLATE "C"; > > + query | > > calls | rows | wal_bytes_generated | wal_records_generated | > > wal_records_as_rows > > +------------------------------------------------------------------+-------+------+---------------------+-----------------------+--------------------- > > + DELETE FROM pgss_test WHERE a > $1 | > > 1 | 1 | t | t | t > > + DROP TABLE pgss_test | > > 1 | 0 | t | t | f > > + INSERT INTO pgss_test (a, b) VALUES ($1, $2), ($3, $4), ($5, $6) | > > 1 | 3 | t | t | t > > + INSERT INTO pgss_test VALUES(generate_series($1, $2), $3) | > > 1 | 10 | t | t | t > > + SELECT * FROM pgss_test ORDER BY a | > > 1 | 12 | f | f | f > > + SELECT * FROM pgss_test WHERE a > $1 ORDER BY a | > > 2 | 4 | f | f | f > > + SELECT * FROM pgss_test WHERE a IN ($1, $2, $3, $4, $5) | > > 1 | 8 | f | f | f > > + SELECT pg_stat_statements_reset() | > > 1 | 1 | f | f | f > > + SET pg_stat_statements.track_utility = FALSE | > > 1 | 0 | f | f | t > > + UPDATE pgss_test SET b = $1 WHERE a = $2 | > > 6 | 6 | t | t | t > > + UPDATE pgss_test SET b = $1 WHERE a > $2 | > > 1 | 3 | t | t | t > > +(11 rows) > > + > > > > I am not sure if the above tests make much sense as they are just > > testing that if WAL is generated for these commands. I understand it > > is not easy to make these tests reliable but in that case, we can > > think of some simple tests. It seems to me that the difficulty is due > > to full_page_writes as that depends on the checkpoint. Can we make > > full_page_writes = off for these tests and check some simple > > Insert/Update/Delete cases? Alternatively, if you can present the > > reason why that is unstable or are tricky to write, then we can simply > > get rid of these tests because I don't see tests for BufferUsage. Let > > not write tests for the sake of writing it unless they can detect bugs > > in the future or are meaningfully covering the new code added. > > > I don't think that we can have any hope in a stable amount of WAL bytes > generated, so testing a positive number looks sensible to me. Then testing > that each 1-line-write query generates a WAL record also looks sensible, so I > kept this. I realized that Kirill used an existing set of queries that were > previously added to validate the multi queries commands behavior, so there's no > need to have all of them again. I just kept one of each (insert, update, > delete, select) to make sure that we do record WAL activity there, but I don't > think that more can really be done. I still think that this is better than > nothing, but if you disagree feel free to drop those tests. > > > > 5. 
> > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; > > - query | calls | rows > > ------------------------------------+-------+------ > > - SELECT $1::TEXT | 1 | 1 > > - SELECT PLUS_ONE($1) | 2 | 2 > > - SELECT PLUS_TWO($1) | 2 | 2 > > - SELECT pg_stat_statements_reset() | 1 | 1 > > +SELECT query, calls, rows, wal_bytes, wal_records FROM > > pg_stat_statements ORDER BY query COLLATE "C"; > > + query | calls | rows | wal_bytes | wal_records > > +-----------------------------------+-------+------+-----------+------------- > > + SELECT $1::TEXT | 1 | 1 | 0 | 0 > > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 > > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 > > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 > > (4 rows) > > > > Again, I am not sure if these modifications make much sense? > > > Those are queries that were previously executed. As those are read-only query, > that are pretty much guaranteed to not cause any WAL activity, I don't see how > it hurts to test at the same time that that's we indeed record with > pg_stat_statements, just to be safe. Once again, feel free to drop the extra > wal_* columns from the output if you disagree. > > > > 6. > > static void pgss_shmem_startup(void); > > @@ -313,6 +318,7 @@ static void pgss_store(const char *query, uint64 queryId, > > int query_location, int query_len, > > double total_time, uint64 rows, > > const BufferUsage *bufusage, > > + const WalUsage* walusage, > > pgssJumbleState *jstate); > > > > The alignment for walusage doesn't seem to be correct. Running > > pgindent will fix this. > > > Indeed, fixed. > > > 7. > > + values[i++] = Int64GetDatumFast(tmp.wal_records); > > + values[i++] = UInt64GetDatum(tmp.wal_num_fpw); > > > > Why are they different? I think we should use the same *GetDatum API > > (probably Int64GetDatumFast) for these. > > > Oops, that's a mistake from when I was working on the wal_bytes output, fixed. > > > > v9-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > > > > > > One more comment related to this patch. > > + > > + snprintf(buf, sizeof buf, UINT64_FORMAT, tmp.wal_bytes); > > + > > + /* Convert to numeric. */ > > + wal_bytes = DirectFunctionCall3(numeric_in, > > + CStringGetDatum(buf), > > + ObjectIdGetDatum(0), > > + Int32GetDatum(-1)); > > + > > + values[i++] = wal_bytes; > > > > I see that other places that display uint64 values use BIGINT datatype > > in SQL, so why can't we do the same here? See the usage of queryid in > > pg_stat_statements or internal_pages, *_pages exposed via > > pgstatindex.c. > > > That's because it's harmless to report a signed number for a hash (at least > comapred to the overhead of having it unsigned), while that's certainly not > wanted to report a negative amount of WAL bytes generated if it goes beyond > bigint limit. See the usage of pg_lsn_mi in pg_lsn.c for instance. > > On Wed, Apr 01, 2020 at 07:20:31PM +0530, Dilip Kumar wrote: > > > > I have reviewed 0003 and 0004, I have a few comments. > > v9-0003-Add-infrastructure-to-track-WAL-usage > > > > 1. > > /* Points to buffer usage area in DSM */ > > BufferUsage *buffer_usage; > > + /* Points to WAL usage area in DSM */ > > + WalUsage *wal_usage; > > > > Better to give one blank line between the previous statement/variable > > declaration and the next comment line. > > > Fixed. > > > > 2. 
> > @@ -2154,7 +2157,7 @@ lazy_parallel_vacuum_indexes(Relation *Irel, > > IndexBulkDeleteResult **stats, > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > for (i = 0; i < lps->pcxt->nworkers_launched; i++) > > - InstrAccumParallelQuery(&lps->buffer_usage[i]); > > + InstrAccumParallelQuery(&lps->buffer_usage[i], &lps->wal_usage[i]); > > } > > > > The existing comment above this loop, which just mentions the buffer > > usage, not the wal usage so I guess we need to change that. > > > Ah indeed, I thought I caught all the comments but missed this one. Fixed. > > > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut > > > > 3. > > + if (usage->wal_num_fpw > 0) > > + appendStringInfo(es->str, " full page records=%ld", > > + usage->wal_num_fpw); > > + if (usage->wal_bytes > 0) > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, > > + usage->wal_bytes); > > > > Shall we change to 'full page writes' or 'full page image' instead of > > full page records? > > > Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed. > > > > Apart from this, I have some testing to see the wal_usage with the > > parallel vacuum and the results look fine. > > > > postgres[104248]=# CREATE TABLE test (a int, b int); > > CREATE TABLE > > postgres[104248]=# INSERT INTO test SELECT i, i FROM > > GENERATE_SERIES(1,2000000) as i; > > INSERT 0 2000000 > > postgres[104248]=# CREATE INDEX idx1 on test(a); > > CREATE INDEX > > postgres[104248]=# VACUUM (PARALLEL 1) test; > > VACUUM > > postgres[104248]=# select query , wal_bytes, wal_records, wal_num_fpw > > from pg_stat_statements where query like 'VACUUM%'; > > query | wal_bytes | wal_records | wal_num_fpw > > --------------------------+-----------+-------------+------------- > > VACUUM (PARALLEL 1) test | 72814331 | 8857 | 8855 > > > > > > > > postgres[106479]=# CREATE TABLE test (a int, b int); > > CREATE TABLE > > postgres[106479]=# INSERT INTO test SELECT i, i FROM > > GENERATE_SERIES(1,2000000) as i; > > INSERT 0 2000000 > > postgres[106479]=# CREATE INDEX idx1 on test(a); > > CREATE INDEX > > postgres[106479]=# VACUUM (PARALLEL 0) test; > > VACUUM > > postgres[106479]=# select query , wal_bytes, wal_records, wal_num_fpw > > from pg_stat_statements where query like 'VACUUM%'; > > query | wal_bytes | wal_records | wal_num_fpw > > --------------------------+-----------+-------------+------------- > > VACUUM (PARALLEL 0) test | 72814331 | 8857 | 8855 > > > Thanks! I did some similar testing, with also seq/parallel index creation and > got similar results. > > > > By tomorrow, I will try to finish reviewing 0005 and 0006. I have reviewed these patches and I have a few cosmetic comments. v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements 1. + uint64 wal_bytes; /* total amount of wal bytes written */ + int64 wal_records; /* # of wal records written */ + int64 wal_num_fpw; /* # of full page wal records written */ /s/# of full page wal records written / /* # of WAL full page image produced */ 2. static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment *queryEnv, - DestReceiver *dest, QueryCompletion *qc); + DestReceiver *dest, QueryCompletion * qc); Useless hunk. 3. 
v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum @@ -3105,7 +3105,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage) { ExplainPropertyInteger("WAL records", NULL, usage->wal_records, es); - ExplainPropertyInteger("WAL full page records", NULL, + ExplainPropertyInteger("WAL full page writes", NULL, usage->wal_num_fpw, es); Just noticed that in 0004 you have first added "WAL full page records", which is later corrected to "WAL full page writes" in 0006. I think we better keep this proper in 0004 itself and avoid this hunk in 0006, otherwise, it creates confusion while reviewing. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote: > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > > 3. Doing some testing with and without parallelism to ensure WAL usage > > > data is correct would be great and if possible, share the results? > > > > > > I just saw that Dilip did some testing, but just in case here is some > > additional one > > > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; > > query | calls | wal_bytes | wal_records | wal_num_fpw > > ------------------------+-------+-----------+-------------+------------- > > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > > (2 rows) > > > > - create index, overload t1's parallel_workers, using the 1M line just > > vacuumed: > > > > =# alter table t1 set (parallel_workers = 2); > > ALTER TABLE > > > > =# create index t1_parallel_2 on t1(id); > > CREATE INDEX > > > > =# alter table t1 set (parallel_workers = 0); > > ALTER TABLE > > > > =# create index t1_parallel_0 on t1(id); > > CREATE INDEX > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > query | calls | wal_bytes | wal_records | wal_num_fpw > > --------------------------------------+-------+-----------+-------------+------------- > > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 > > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 > > (2 rows) > > > > It all looks good to me. > > > > Here the wal_num_fpw and wal_bytes are different between parallel and > non-parallel versions. Is it due to checkpoint or something else? We > can probably rule out checkpoint by increasing checkpoint_timeout and > other checkpoint related parameters. I think this is because I did a checkpoint after the VACUUM tests, so the 1st CREATE INDEX (with parallelism) induced some FPW on the catalog blocks. I didn't try to investigate more since: On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote: > > Also, I forgot to mention that let's not base this on buffer usage > patch for create index > (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as > per recent discussion I am not sure about its usefulness. I think we > can proceed with this patch without > v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well. Which is done in attached v11. > > > 5. > > > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C"; > > > - query | calls | rows > > > ------------------------------------+-------+------ > > > - SELECT $1::TEXT | 1 | 1 > > > - SELECT PLUS_ONE($1) | 2 | 2 > > > - SELECT PLUS_TWO($1) | 2 | 2 > > > - SELECT pg_stat_statements_reset() | 1 | 1 > > > +SELECT query, calls, rows, wal_bytes, wal_records FROM > > > pg_stat_statements ORDER BY query COLLATE "C"; > > > + query | calls | rows | wal_bytes | wal_records > > > +-----------------------------------+-------+------+-----------+------------- > > > + SELECT $1::TEXT | 1 | 1 | 0 | 0 > > > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0 > > > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0 > > > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0 > > > (4 rows) > > > > > > Again, I am not sure if these modifications make much sense? 
> > > > > > Those are queries that were previously executed. As those are read-only query, > > that are pretty much guaranteed to not cause any WAL activity, I don't see how > > it hurts to test at the same time that that's we indeed record with > > pg_stat_statements, just to be safe. > > > > On a similar theory, one could have checked bufferusage stats as well. > The statements are using some expressions so don't see any value in > check all usage data for such statements. Dropped. > Right now, that particular patch is not getting applied (probably due > to recent commit 17e0328224). Can you rebase it? Done. > > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut > > > > > > 3. > > > + if (usage->wal_num_fpw > 0) > > > + appendStringInfo(es->str, " full page records=%ld", > > > + usage->wal_num_fpw); > > > + if (usage->wal_bytes > 0) > > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT, > > > + usage->wal_bytes); > > > > > > Shall we change to 'full page writes' or 'full page image' instead of > > > full page records? > > > > > > Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed. > > > > I don't see this change in the patch. Yes, as Dilip reported I fixuped the wrong commit, sorry about that. This version should now be ok. On Thu, Apr 02, 2020 at 12:04:32PM +0530, Dilip Kumar wrote: > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > By tomorrow, I will try to finish reviewing 0005 and 0006. > > I have reviewed these patches and I have a few cosmetic comments. > v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements > > 1. > + uint64 wal_bytes; /* total amount of wal bytes written */ > + int64 wal_records; /* # of wal records written */ > + int64 wal_num_fpw; /* # of full page wal records written */ > > > /s/# of full page wal records written / /* # of WAL full page image produced */ Done, I also consistently s/wal/WAL/. > > 2. > static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, > ProcessUtilityContext context, ParamListInfo params, > QueryEnvironment *queryEnv, > - DestReceiver *dest, QueryCompletion *qc); > + DestReceiver *dest, QueryCompletion * qc); > > Useless hunk. Oops, leftover of a pgindent as QueryCompletion isn't in the typedefs yet. I thought I discarded all the useless hunks but missed this one. Thanks, fixed. > > 3. > > v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum > > @@ -3105,7 +3105,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage) > { > ExplainPropertyInteger("WAL records", NULL, > usage->wal_records, es); > - ExplainPropertyInteger("WAL full page records", NULL, > + ExplainPropertyInteger("WAL full page writes", NULL, > usage->wal_num_fpw, es); > Just noticed that in 0004 you have first added "WAL full page > records", which is later corrected to "WAL full page writes" in 0006. > I think we better keep this proper in 0004 itself and avoid this hunk > in 0006, otherwise, it creates confusion while reviewing. Oh, I didn't realized that I fixuped the wrong commit. Fixed. I also adapted the documentation that mentioned full page records instead of full page images, and integrated Justin's comment: > In 0003: > + /* Provide WAL update data to the instrumentation */ > Remove "data" ?? so changed to "Report WAL traffic to the instrumentation." I didn't change the (auto)vacuum output yet (except fixing the s/full page records/full page writes/ that I previously missed), as it's not clear what the consensus is yet. 
I'll take care of that as soon as we reach a consensus.
Attachment
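For readers following along, here is a minimal standalone sketch of the
counters being discussed. Field names (wal_records, wal_num_fpw, wal_bytes)
follow the patch under review; the real code hooks into WAL record assembly
inside the server and differs in detail, so treat this as a model of the
bookkeeping rather than the implementation:

#include <stdint.h>
#include <stdio.h>

typedef struct WalUsage
{
    long        wal_records;    /* # of WAL records generated */
    long        wal_num_fpw;    /* # of WAL full page writes generated */
    uint64_t    wal_bytes;      /* total amount of WAL bytes generated */
} WalUsage;

/* one instance per backend, analogous to pgWalUsage in the patch */
static WalUsage pgWalUsage;

/* Bump the counters once per inserted WAL record. */
static void
count_wal_record(uint32_t total_len, int num_fpw)
{
    pgWalUsage.wal_records += 1;
    pgWalUsage.wal_num_fpw += num_fpw;
    pgWalUsage.wal_bytes += total_len;
}

int
main(void)
{
    count_wal_record(59, 0);    /* a plain heap INSERT record */
    count_wal_record(8218, 1);  /* a record carrying one full page image */

    printf("records=%ld fpw=%ld bytes=%llu\n",
           pgWalUsage.wal_records, pgWalUsage.wal_num_fpw,
           (unsigned long long) pgWalUsage.wal_bytes);
    return 0;
}

The per-statement numbers exposed through pg_stat_statements can then be
derived as the difference between two snapshots of these counters taken
around each statement.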
On Thu, Apr 2, 2020 at 2:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote: > > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > > > 3. Doing some testing with and without parallelism to ensure WAL usage > > > > data is correct would be great and if possible, share the results? > > > > > > > > > I just saw that Dilip did some testing, but just in case here is some > > > additional one > > > > > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; > > > query | calls | wal_bytes | wal_records | wal_num_fpw > > > ------------------------+-------+-----------+-------------+------------- > > > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > > > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > > > (2 rows) > > > > > > - create index, overload t1's parallel_workers, using the 1M line just > > > vacuumed: > > > > > > =# alter table t1 set (parallel_workers = 2); > > > ALTER TABLE > > > > > > =# create index t1_parallel_2 on t1(id); > > > CREATE INDEX > > > > > > =# alter table t1 set (parallel_workers = 0); > > > ALTER TABLE > > > > > > =# create index t1_parallel_0 on t1(id); > > > CREATE INDEX > > > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > > query | calls | wal_bytes | wal_records | wal_num_fpw > > > --------------------------------------+-------+-----------+-------------+------------- > > > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 > > > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 > > > (2 rows) > > > > > > It all looks good to me. > > > > > > > Here the wal_num_fpw and wal_bytes are different between parallel and > > non-parallel versions. Is it due to checkpoint or something else? We > > can probably rule out checkpoint by increasing checkpoint_timeout and > > other checkpoint related parameters. > > I think this is because I did a checkpoint after the VACUUM tests, so the 1st > CREATE INDEX (with parallelism) induced some FPW on the catalog blocks. I > didn't try to investigate more since: > We need to do this. > On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote: > > > > Also, I forgot to mention that let's not base this on buffer usage > > patch for create index > > (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as > > per recent discussion I am not sure about its usefulness. I think we > > can proceed with this patch without > > v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well. > > > Which is done in attached v11. > Hmm, I haven't suggested removing the WAL usage from the parallel create index. I just told not to use the infrastructure of another patch. We bypass the buffer manager but do write WAL. See _bt_blwritepage->log_newpage. So we need to accumulate WAL usage even if we decide not to do anything about BufferUsage which means we need to investigate the above inconsistency in wal_num_fpw and wal_bytes between parallel and non-parallel version. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 02, 2020 at 02:32:07PM +0530, Amit Kapila wrote: > On Thu, Apr 2, 2020 at 2:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote: > > > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote: > > > > > 3. Doing some testing with and without parallelism to ensure WAL usage > > > > > data is correct would be great and if possible, share the results? > > > > > > > > > > > > I just saw that Dilip did some testing, but just in case here is some > > > > additional one > > > > > > > > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id" > > > > > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%'; > > > > query | calls | wal_bytes | wal_records | wal_num_fpw > > > > ------------------------+-------+-----------+-------------+------------- > > > > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2 > > > > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2 > > > > (2 rows) > > > > > > > > - create index, overload t1's parallel_workers, using the 1M line just > > > > vacuumed: > > > > > > > > =# alter table t1 set (parallel_workers = 2); > > > > ALTER TABLE > > > > > > > > =# create index t1_parallel_2 on t1(id); > > > > CREATE INDEX > > > > > > > > =# alter table t1 set (parallel_workers = 0); > > > > ALTER TABLE > > > > > > > > =# create index t1_parallel_0 on t1(id); > > > > CREATE INDEX > > > > > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > > > query | calls | wal_bytes | wal_records | wal_num_fpw > > > > --------------------------------------+-------+-----------+-------------+------------- > > > > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745 > > > > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758 > > > > (2 rows) > > > > > > > > It all looks good to me. > > > > > > > > > > Here the wal_num_fpw and wal_bytes are different between parallel and > > > non-parallel versions. Is it due to checkpoint or something else? We > > > can probably rule out checkpoint by increasing checkpoint_timeout and > > > other checkpoint related parameters. > > > > I think this is because I did a checkpoint after the VACUUM tests, so the 1st > > CREATE INDEX (with parallelism) induced some FPW on the catalog blocks. I > > didn't try to investigate more since: > > > > We need to do this. > > > On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote: > > > > > > Also, I forgot to mention that let's not base this on buffer usage > > > patch for create index > > > (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as > > > per recent discussion I am not sure about its usefulness. I think we > > > can proceed with this patch without > > > v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well. > > > > > > Which is done in attached v11. > > > > Hmm, I haven't suggested removing the WAL usage from the parallel > create index. I just told not to use the infrastructure of another > patch. We bypass the buffer manager but do write WAL. See > _bt_blwritepage->log_newpage. So we need to accumulate WAL usage even > if we decide not to do anything about BufferUsage which means we need > to investigate the above inconsistency in wal_num_fpw and wal_bytes > between parallel and non-parallel version. 
Oh, I thought that you wanted to wait on that part, as we'll probably change the parallel create index to report buffer access eventually. v12 attached with an adaptation of Sawada-san's original patch but only dealing with WAL activity. I did some more experiment, ensuring as much stability as possible: =# create table t1(id integer); CREATE TABLE =# insert into t1 select * from generate_series(1, 1000000); INSERT 0 1000000 =# select * from pg_stat_statements_reset() ; pg_stat_statements_reset -------------------------- (1 row) =# alter table t1 set (parallel_workers = 0); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_0 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 1); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_1 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 2); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_2 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 3); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_3 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 4); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_4 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 5); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_5 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 6); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_6 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 7); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_7 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 8); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_8 ON t1(id); CREATE INDEX =# alter table t1 set (parallel_workers = 0); ALTER TABLE =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_0_bis ON t1(id); CREATE INDEX =# vacuum;checkpoint; VACUUM CHECKPOINT =# create index t1_idx_parallel_0_ter ON t1(id); CREATE INDEX =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; query | calls | wal_bytes | wal_records | wal_num_fpw ----------------------------------------------+-------+-----------+-------------+------------- create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 (11 rows) =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; relname | pg_relation_size -----------------------+------------------ t1_idx_parallel_0 | 22487040 t1_idx_parallel_0_bis | 22487040 t1_idx_parallel_0_ter | 22487040 t1_idx_parallel_2 | 22487040 t1_idx_parallel_1 
| 22487040 t1_idx_parallel_4 | 22487040 t1_idx_parallel_3 | 22487040 t1_idx_parallel_5 | 22487040 t1_idx_parallel_6 | 22487040 t1_idx_parallel_7 | 22487040 t1_idx_parallel_8 | 22487040 (9 rows) So while the number of WAL records and full page images stay constant, we can see some small fluctuations in the total amount of generated WAL data, even for multiple execution of the sequential create index. I'm wondering if the fluctuations are due to some other internal details or if the WalUsage support is just completely broken (although I don't see any obvious issue ATM).
Attachment
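To make the parallel VACUUM and CREATE INDEX numbers above add up, each
worker has to hand its WAL counters back to the leader. Below is a rough
standalone model of that idea, with the shared-memory chunk reduced to a
plain array and made-up per-worker numbers; the real patch uses PostgreSQL's
shared-memory and instrumentation machinery, and the function names here are
placeholders only:

#include <stdint.h>
#include <stdio.h>

#define MAX_WORKERS 8

typedef struct WalUsage
{
    long        wal_records;
    long        wal_num_fpw;
    uint64_t    wal_bytes;
} WalUsage;

/* stands in for the shared-memory area sized for the launched workers */
static WalUsage shared_wal_usage[MAX_WORKERS];

/* the leader's own counters, analogous to pgWalUsage */
static WalUsage pgWalUsage;

static void
WalUsageAccumDiff(WalUsage *dst, const WalUsage *end, const WalUsage *start)
{
    dst->wal_records += end->wal_records - start->wal_records;
    dst->wal_num_fpw += end->wal_num_fpw - start->wal_num_fpw;
    dst->wal_bytes += end->wal_bytes - start->wal_bytes;
}

/* Worker side: snapshot at start, publish the delta into shmem at exit. */
static void
worker_run(int worker_id, long records, long fpw, uint64_t bytes)
{
    WalUsage    local = {0};    /* each worker process has its own counters */
    WalUsage    start = local;

    /* ... the worker builds its part of the index, generating WAL ... */
    local.wal_records += records;
    local.wal_num_fpw += fpw;
    local.wal_bytes += bytes;

    WalUsageAccumDiff(&shared_wal_usage[worker_id], &local, &start);
}

int
main(void)
{
    /* pretend two workers ran; the per-worker numbers are made up */
    worker_run(0, 1400, 1390, 10200000);
    worker_run(1, 1362, 1368, 10200000);

    /* Leader side: fold every worker's published delta into its counters. */
    for (int i = 0; i < 2; i++)
    {
        pgWalUsage.wal_records += shared_wal_usage[i].wal_records;
        pgWalUsage.wal_num_fpw += shared_wal_usage[i].wal_num_fpw;
        pgWalUsage.wal_bytes += shared_wal_usage[i].wal_bytes;
    }

    printf("leader totals: records=%ld fpw=%ld bytes=%llu\n",
           pgWalUsage.wal_records, pgWalUsage.wal_num_fpw,
           (unsigned long long) pgWalUsage.wal_bytes);
    return 0;
}

Everything runs sequentially here for simplicity; in the server each
worker_run would be a separate process writing into its own slot.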
On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > query | calls | wal_bytes | wal_records | wal_num_fpw > ----------------------------------------------+-------+-----------+-------------+------------- > create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 > create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 > create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 > create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 > create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 > create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 > create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 > create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 > create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 > create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 > create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 > (11 rows) > > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; > relname | pg_relation_size > -----------------------+------------------ > t1_idx_parallel_0 | 22487040 > t1_idx_parallel_0_bis | 22487040 > t1_idx_parallel_0_ter | 22487040 > t1_idx_parallel_2 | 22487040 > t1_idx_parallel_1 | 22487040 > t1_idx_parallel_4 | 22487040 > t1_idx_parallel_3 | 22487040 > t1_idx_parallel_5 | 22487040 > t1_idx_parallel_6 | 22487040 > t1_idx_parallel_7 | 22487040 > t1_idx_parallel_8 | 22487040 > (9 rows) > > > So while the number of WAL records and full page images stay constant, we can > see some small fluctuations in the total amount of generated WAL data, even for > multiple execution of the sequential create index. I'm wondering if the > fluctuations are due to some other internal details or if the WalUsage support > is just completely broken (although I don't see any obvious issue ATM). > I think we need to know the reason for this. Can you try with small size indexes and see if the problem is reproducible? If it is, then it will be easier to debug the same. Few other minor comments ------------------------------------ pg_stat_statements patch 1. +-- +-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to validate WAL generation metrics +-- The word 'non-temp' in the above comment appears out of place. We don't need to specify it. 2. +-- SELECT usage data, check WAL usage is reported, wal_records equal rows count for INSERT/UPDATE/DELETE +SELECT query, calls, rows, +wal_bytes > 0 as wal_bytes_generated, +wal_records > 0 as wal_records_generated, +wal_records = rows as wal_records_as_rows +FROM pg_stat_statements ORDER BY query COLLATE "C"; The comment doesn't seem to match what we are doing in the statement. I think we can simplify it to something like "check WAL is generated for above statements: 3. 
@@ -185,6 +185,9 @@ typedef struct Counters int64 local_blks_written; /* # of local disk blocks written */ int64 temp_blks_read; /* # of temp blocks read */ int64 temp_blks_written; /* # of temp blocks written */ + uint64 wal_bytes; /* total amount of WAL bytes generated */ + int64 wal_records; /* # of WAL records generated */ + int64 wal_num_fpw; /* # of WAL full page image generated */ double blk_read_time; /* time spent reading, in msec */ double blk_write_time; /* time spent writing, in msec */ double usage; /* usage factor */ It is better to keep wal_bytes should be after wal_num_fpw as it is in the main patch. Also, consider changing at other places in this patch. I think we should add these new fields after blk_write_time or at the end after usage. 4. /* # of WAL full page image generated */ Can we change it to "/* # of WAL full page image records generated */"? If you agree, then a similar comment exists in v11-0001-Add-infrastructure-to-track-WAL-usage, consider changing that as well. v11-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-au 5. Specifically, include the + number of records, full page images and bytes generated. How about making the above slightly clear? "Specifically, include the number of records, number of full page image records and amount of WAL bytes generated. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > query | calls | wal_bytes | wal_records | wal_num_fpw > > ----------------------------------------------+-------+-----------+-------------+------------- > > create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 > > create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 > > create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 > > create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 > > create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 > > create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 > > create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 > > create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 > > create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 > > create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 > > create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 > > (11 rows) > > > > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; > > relname | pg_relation_size > > -----------------------+------------------ > > t1_idx_parallel_0 | 22487040 > > t1_idx_parallel_0_bis | 22487040 > > t1_idx_parallel_0_ter | 22487040 > > t1_idx_parallel_2 | 22487040 > > t1_idx_parallel_1 | 22487040 > > t1_idx_parallel_4 | 22487040 > > t1_idx_parallel_3 | 22487040 > > t1_idx_parallel_5 | 22487040 > > t1_idx_parallel_6 | 22487040 > > t1_idx_parallel_7 | 22487040 > > t1_idx_parallel_8 | 22487040 > > (9 rows) > > > > > > So while the number of WAL records and full page images stay constant, we can > > see some small fluctuations in the total amount of generated WAL data, even for > > multiple execution of the sequential create index. I'm wondering if the > > fluctuations are due to some other internal details or if the WalUsage support > > is just completely broken (although I don't see any obvious issue ATM). > > > > I think we need to know the reason for this. Can you try with small > size indexes and see if the problem is reproducible? If it is, then it > will be easier to debug the same. > > Few other minor comments > ------------------------------------ > pg_stat_statements patch > 1. > +-- > +-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to > validate WAL generation metrics > +-- > > The word 'non-temp' in the above comment appears out of place. We > don't need to specify it. > > 2. > +-- SELECT usage data, check WAL usage is reported, wal_records equal > rows count for INSERT/UPDATE/DELETE > +SELECT query, calls, rows, > +wal_bytes > 0 as wal_bytes_generated, > +wal_records > 0 as wal_records_generated, > +wal_records = rows as wal_records_as_rows > +FROM pg_stat_statements ORDER BY query COLLATE "C"; > > The comment doesn't seem to match what we are doing in the statement. > I think we can simplify it to something like "check WAL is generated > for above statements: > > 3. 
> @@ -185,6 +185,9 @@ typedef struct Counters > int64 local_blks_written; /* # of local disk blocks written */ > int64 temp_blks_read; /* # of temp blocks read */ > int64 temp_blks_written; /* # of temp blocks written */ > + uint64 wal_bytes; /* total amount of WAL bytes generated */ > + int64 wal_records; /* # of WAL records generated */ > + int64 wal_num_fpw; /* # of WAL full page image generated */ > double blk_read_time; /* time spent reading, in msec */ > double blk_write_time; /* time spent writing, in msec */ > double usage; /* usage factor */ > > It is better to keep wal_bytes should be after wal_num_fpw as it is in > the main patch. Also, consider changing at other places in this > patch. I think we should add these new fields after blk_write_time or > at the end after usage. > > 4. > /* # of WAL full page image generated */ > Can we change it to "/* # of WAL full page image records generated */"? IMHO, "# of WAL full-page image records" seems like the number of wal record which contains the full-page image. But, actually, this is the total number of the full-page images, not the number of records that have a full-page image. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 02, 2020 at 06:40:51PM +0530, Amit Kapila wrote: > On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > query | calls | wal_bytes | wal_records | wal_num_fpw > > ----------------------------------------------+-------+-----------+-------------+------------- > > create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 > > create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 > > create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 > > create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 > > create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 > > create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 > > create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 > > create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 > > create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 > > create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 > > create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 > > (11 rows) > > > > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; > > relname | pg_relation_size > > -----------------------+------------------ > > t1_idx_parallel_0 | 22487040 > > t1_idx_parallel_0_bis | 22487040 > > t1_idx_parallel_0_ter | 22487040 > > t1_idx_parallel_2 | 22487040 > > t1_idx_parallel_1 | 22487040 > > t1_idx_parallel_4 | 22487040 > > t1_idx_parallel_3 | 22487040 > > t1_idx_parallel_5 | 22487040 > > t1_idx_parallel_6 | 22487040 > > t1_idx_parallel_7 | 22487040 > > t1_idx_parallel_8 | 22487040 > > (9 rows) > > > > > > So while the number of WAL records and full page images stay constant, we can > > see some small fluctuations in the total amount of generated WAL data, even for > > multiple execution of the sequential create index. I'm wondering if the > > fluctuations are due to some other internal details or if the WalUsage support > > is just completely broken (although I don't see any obvious issue ATM). > > > > I think we need to know the reason for this. Can you try with small > size indexes and see if the problem is reproducible? If it is, then it > will be easier to debug the same. 
I did some quick testing using the attached shell script: - one a 1k line base number of lines, scales 1 10 100 1000 (suffix _s) - parallel workers from 0 to 8 (suffix _w) - each index created twice (suffix _pa and _pb) - with a vacuum;checkpoint;pg_switch_wal executed each time I get the following results: query | wal_bytes | wal_records | wal_num_fpw --------------------------------------------+-----------+-------------+------------- CREATE INDEX t1_idx_s001_pa_w0 ON t1 (id) | 61871 | 22 | 18 CREATE INDEX t1_idx_s001_pa_w1 ON t1 (id) | 62394 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w2 ON t1 (id) | 63150 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w3 ON t1 (id) | 63906 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w4 ON t1 (id) | 64662 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w5 ON t1 (id) | 65418 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w6 ON t1 (id) | 65450 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w7 ON t1 (id) | 66206 | 21 | 18 CREATE INDEX t1_idx_s001_pa_w8 ON t1 (id) | 66962 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w0 ON t1 (id) | 67718 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w1 ON t1 (id) | 68474 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w2 ON t1 (id) | 68418 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w3 ON t1 (id) | 69174 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w4 ON t1 (id) | 69930 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w5 ON t1 (id) | 70686 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w6 ON t1 (id) | 71442 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w7 ON t1 (id) | 64922 | 21 | 18 CREATE INDEX t1_idx_s001_pb_w8 ON t1 (id) | 65682 | 21 | 18 CREATE INDEX t1_idx_s010_pa_w0 ON t1 (id) | 250460 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w1 ON t1 (id) | 251216 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w2 ON t1 (id) | 251972 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w3 ON t1 (id) | 252728 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w4 ON t1 (id) | 253484 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w5 ON t1 (id) | 254240 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w6 ON t1 (id) | 253552 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w7 ON t1 (id) | 254308 | 47 | 44 CREATE INDEX t1_idx_s010_pa_w8 ON t1 (id) | 255064 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w0 ON t1 (id) | 255820 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w1 ON t1 (id) | 256576 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w2 ON t1 (id) | 257332 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w3 ON t1 (id) | 258088 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w4 ON t1 (id) | 258844 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w5 ON t1 (id) | 259600 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w6 ON t1 (id) | 260356 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w7 ON t1 (id) | 260012 | 47 | 44 CREATE INDEX t1_idx_s010_pb_w8 ON t1 (id) | 260768 | 47 | 44 CREATE INDEX t1_idx_s1000_pa_w0 ON t1 (id) | 20400595 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w1 ON t1 (id) | 20401351 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w2 ON t1 (id) | 20402107 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w3 ON t1 (id) | 20402863 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w4 ON t1 (id) | 20403619 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w5 ON t1 (id) | 20404375 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w6 ON t1 (id) | 20403687 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w7 ON t1 (id) | 20404443 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pa_w8 ON t1 (id) | 20405199 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w0 ON t1 (id) | 20405955 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w1 ON t1 (id) | 20406711 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w2 ON t1 (id) | 20407467 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w3 ON t1 (id) | 20408223 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w4 ON t1 (id) | 
20408979 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w5 ON t1 (id) | 20409735 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w6 ON t1 (id) | 20410491 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w7 ON t1 (id) | 20410147 | 2762 | 2759 CREATE INDEX t1_idx_s1000_pb_w8 ON t1 (id) | 20410903 | 2762 | 2759 CREATE INDEX t1_idx_s100_pa_w0 ON t1 (id) | 2082194 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w1 ON t1 (id) | 2082950 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w2 ON t1 (id) | 2083706 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w3 ON t1 (id) | 2084462 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w4 ON t1 (id) | 2085218 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w5 ON t1 (id) | 2085974 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w6 ON t1 (id) | 2085286 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w7 ON t1 (id) | 2086042 | 293 | 290 CREATE INDEX t1_idx_s100_pa_w8 ON t1 (id) | 2086798 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w0 ON t1 (id) | 2087554 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w1 ON t1 (id) | 2088310 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w2 ON t1 (id) | 2089066 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w3 ON t1 (id) | 2089822 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w4 ON t1 (id) | 2090578 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w5 ON t1 (id) | 2091334 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w6 ON t1 (id) | 2092090 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w7 ON t1 (id) | 2091746 | 293 | 290 CREATE INDEX t1_idx_s100_pb_w8 ON t1 (id) | 2092502 | 293 | 290 (72 rows) The fluctuations exist for all scales, but doesn't seem to depend on the input size. Just to be sure I tried to measure the amount of WAL for various INSERT size using roughly the same approach, and results are stable: query | wal_bytes | wal_records | wal_num_fpw -----------------------------------------------------+-----------+-------------+------------- INSERT INTO t_001_a SELECT generate_series($1, $2) | 59000 | 1000 | 0 INSERT INTO t_001_b SELECT generate_series($1, $2) | 59000 | 1000 | 0 INSERT INTO t_010_a SELECT generate_series($1, $2) | 590000 | 10000 | 0 INSERT INTO t_010_b SELECT generate_series($1, $2) | 590000 | 10000 | 0 INSERT INTO t_1000_a SELECT generate_series($1, $2) | 59000000 | 1000000 | 0 INSERT INTO t_1000_b SELECT generate_series($1, $2) | 59000000 | 1000000 | 0 INSERT INTO t_100_a SELECT generate_series($1, $2) | 5900000 | 100000 | 0 INSERT INTO t_100_b SELECT generate_series($1, $2) | 5900000 | 100000 | 0 (8 rows) At this point I tend to think that this is somehow due to btbuild specific behavior, or somewhere nearby. > Few other minor comments > ------------------------------------ > pg_stat_statements patch > 1. > +-- > +-- CRUD: INSERT SELECT UPDATE DELETE on test non-temp table to > validate WAL generation metrics > +-- > > The word 'non-temp' in the above comment appears out of place. We > don't need to specify it. Fixed. > 2. > +-- SELECT usage data, check WAL usage is reported, wal_records equal > rows count for INSERT/UPDATE/DELETE > +SELECT query, calls, rows, > +wal_bytes > 0 as wal_bytes_generated, > +wal_records > 0 as wal_records_generated, > +wal_records = rows as wal_records_as_rows > +FROM pg_stat_statements ORDER BY query COLLATE "C"; > > The comment doesn't seem to match what we are doing in the statement. > I think we can simplify it to something like "check WAL is generated > for above statements: Done. > 3. 
> @@ -185,6 +185,9 @@ typedef struct Counters > int64 local_blks_written; /* # of local disk blocks written */ > int64 temp_blks_read; /* # of temp blocks read */ > int64 temp_blks_written; /* # of temp blocks written */ > + uint64 wal_bytes; /* total amount of WAL bytes generated */ > + int64 wal_records; /* # of WAL records generated */ > + int64 wal_num_fpw; /* # of WAL full page image generated */ > double blk_read_time; /* time spent reading, in msec */ > double blk_write_time; /* time spent writing, in msec */ > double usage; /* usage factor */ > > It is better to keep wal_bytes should be after wal_num_fpw as it is in > the main patch. Also, consider changing at other places in this > patch. I think we should add these new fields after blk_write_time or > at the end after usage. Done. > 4. > /* # of WAL full page image generated */ > Can we change it to "/* # of WAL full page image records generated */"? > > If you agree, then a similar comment exists in > v11-0001-Add-infrastructure-to-track-WAL-usage, consider changing that > as well. Agreed, and fixed in both place. > v11-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-au > 5. > Specifically, include the > + number of records, full page images and bytes generated. > > How about making the above slightly clear? "Specifically, include the > number of records, number of full page image records and amount of WAL > bytes generated. Thanks, that's clearer. Done
Attachment
On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > query | calls | wal_bytes | wal_records | wal_num_fpw > > ----------------------------------------------+-------+-----------+-------------+------------- > > create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 > > create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 > > create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 > > create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 > > create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 > > create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 > > create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 > > create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 > > create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 > > create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 > > create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 > > (11 rows) > > > > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; > > relname | pg_relation_size > > -----------------------+------------------ > > t1_idx_parallel_0 | 22487040 > > t1_idx_parallel_0_bis | 22487040 > > t1_idx_parallel_0_ter | 22487040 > > t1_idx_parallel_2 | 22487040 > > t1_idx_parallel_1 | 22487040 > > t1_idx_parallel_4 | 22487040 > > t1_idx_parallel_3 | 22487040 > > t1_idx_parallel_5 | 22487040 > > t1_idx_parallel_6 | 22487040 > > t1_idx_parallel_7 | 22487040 > > t1_idx_parallel_8 | 22487040 > > (9 rows) > > > > > > So while the number of WAL records and full page images stay constant, we can > > see some small fluctuations in the total amount of generated WAL data, even for > > multiple execution of the sequential create index. I'm wondering if the > > fluctuations are due to some other internal details or if the WalUsage support > > is just completely broken (although I don't see any obvious issue ATM). > > > > I think we need to know the reason for this. Can you try with small > size indexes and see if the problem is reproducible? If it is, then it > will be easier to debug the same. I have done some testing to see where these extra WAL size is coming from. First I tried to create new db before every run then the size is consistent. But, then on the same server, I tired as Julien showed in his experiment then I am getting few extra wal bytes from next create index onwards. And, the waldump(attached in the mail) shows that is pg_class insert wal. I still have to check that why we need to write an extra wal size. 
create extension pg_stat_statements; drop table t1; create table t1(id integer); insert into t1 select * from generate_series(1, 10); alter table t1 set (parallel_workers = 0); vacuum;checkpoint; select * from pg_stat_statements_reset() ; create index t1_idx_parallel_0 ON t1(id); select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%';; query | calls | wal_bytes | wal_records | wal_num_fpw ----------------------------------------------------------------------------------+-------+-----------+-------------+------------- create index t1_idx_parallel_0 ON t1(id) | 1 | 49320 | 23 | 15 drop table t1; create table t1(id integer); insert into t1 select * from generate_series(1, 10); --select * from pg_stat_statements_reset() ; alter table t1 set (parallel_workers = 0); vacuum;checkpoint; create index t1_idx_parallel_1 ON t1(id); select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%';; postgres[110383]=# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements; query | calls | wal_bytes | wal_records | wal_num_fpw ----------------------------------------------------------------------------------+-------+-----------+-------------+------------- create index t1_idx_parallel_1 ON t1(id) | 1 | 50040 | 23 | 15 wal_bytes diff = 50040-49320 = 720 Below, WAL record is causing the 720 bytes difference, all other WALs are of the same size. t1_idx_parallel_0: rmgr: Heap len (rec/tot): 54/ 7498, tx: 489, lsn: 0/0167B9B0, prev 0/0167B970, desc: INSERT off 30 flags 0x01, blkref #0: rel 1663/13580/1249 t1_idx_parallel_1: rmgr: Heap len (rec/tot): 54/ 8218, tx: 494, lsn: 0/016B84F8, prev 0/016B84B8, desc: INSERT off 30 flags 0x01, blkref #0: rel 1663/13580/1249 wal diff: 8218 - 7498 = 720 -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Attachment
On Thu, Apr 2, 2020 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > 4. > > /* # of WAL full page image generated */ > > Can we change it to "/* # of WAL full page image records generated */"? > > IMHO, "# of WAL full-page image records" seems like the number of wal > record which contains the full-page image. > I think this resembles what you have written here. > But, actually, this is the > total number of the full-page images, not the number of records that > have a full-page image. > We count this when forming WAL records. As per my understanding, all three counters are about WAL records. This counter tells how many records have full page images and one of the purposes of having this counter is to check what percentage of records contain full page image. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hello. The v13 patch seems failing to apply on the master. At Fri, 3 Apr 2020 06:37:21 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in > On Thu, Apr 2, 2020 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > 4. > > > /* # of WAL full page image generated */ > > > Can we change it to "/* # of WAL full page image records generated */"? > > > > IMHO, "# of WAL full-page image records" seems like the number of wal > > record which contains the full-page image. > > > > I think this resembles what you have written here. > > > But, actually, this is the > > total number of the full-page images, not the number of records that > > have a full-page image. > > > > We count this when forming WAL records. As per my understanding, all > three counters are about WAL records. This counter tells how many > records have full page images and one of the purposes of having this > counter is to check what percentage of records contain full page > image. Aside from which is desirable or useful, acutually XLogRecordAssemble in v13-0001 counts the number of attached images then XLogInsertRecord sums up the number of images in pgWalUsage.wal_num_fpw. FWIW, it seems to me that the main concern here is the source of WAL size. If it is the case I think that the number of full page image is more useful. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Apr 3, 2020 at 6:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > 4. > > > /* # of WAL full page image generated */ > > > Can we change it to "/* # of WAL full page image records generated */"? > > > > IMHO, "# of WAL full-page image records" seems like the number of wal > > record which contains the full-page image. > > > > I think this resembles what you have written here. > > > But, actually, this is the > > total number of the full-page images, not the number of records that > > have a full-page image. > > > > We count this when forming WAL records. As per my understanding, all > three counters are about WAL records. This counter tells how many > records have full page images and one of the purposes of having this > counter is to check what percentage of records contain full page > image. > How about if say "# of full-page writes generated" or "# of WAL full-page writes generated"? I think now I understand your concern because we want to display it as full page writes and the comment doesn't seem to indicate the same. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 8:14 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 6:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Apr 2, 2020 at 8:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 4. > > > > /* # of WAL full page image generated */ > > > > Can we change it to "/* # of WAL full page image records generated */"? > > > > > > IMHO, "# of WAL full-page image records" seems like the number of wal > > > record which contains the full-page image. > > > > > > > I think this resembles what you have written here. > > > > > But, actually, this is the > > > total number of the full-page images, not the number of records that > > > have a full-page image. > > > > > > > We count this when forming WAL records. As per my understanding, all > > three counters are about WAL records. This counter tells how many > > records have full page images and one of the purposes of having this > > counter is to check what percentage of records contain full page > > image. > > > > How about if say "# of full-page writes generated" or "# of WAL > full-page writes generated"? I think now I understand your concern > because we want to display it as full page writes and the comment > doesn't seem to indicate the same. Either of these seem good to me. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 7:15 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> Hello.
>
> The v13 patch seems failing to apply on the master.
>

It is probably due to recent commit ed7a509571. I have briefly studied that,
and I think we should make this patch account for plan-time WAL usage, if
any, similar to what got committed for buffer usage. The reason is that
there is a possibility that we might write WAL during planning due to hint
bits.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
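The sketch below is the idea raised here: snapshot the WAL counters before
and after planning and attribute the difference to the statement, mirroring
how buffer usage during planning is handled. It is a standalone stand-in
only; plan_query() is a placeholder rather than a PostgreSQL function, and
the record it emits merely simulates a hint-bit-induced write:

#include <stdint.h>
#include <stdio.h>

typedef struct WalUsage
{
    long        wal_records;
    long        wal_num_fpw;
    uint64_t    wal_bytes;
} WalUsage;

static WalUsage pgWalUsage;     /* per-backend counters */

/* placeholder planner: pretend a hint-bit update produced one small record */
static void
plan_query(void)
{
    pgWalUsage.wal_records += 1;
    pgWalUsage.wal_bytes += 56;
}

int
main(void)
{
    WalUsage    start = pgWalUsage;     /* snapshot taken before planning */

    plan_query();

    /* the delta is what planning itself generated */
    long        records = pgWalUsage.wal_records - start.wal_records;
    uint64_t    bytes = pgWalUsage.wal_bytes - start.wal_bytes;

    /* usually zero; only worth reporting when something was written */
    if (records > 0)
        printf("planning WAL: records=%ld bytes=%llu\n",
               records, (unsigned long long) bytes);
    return 0;
}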
On Thu, Apr 2, 2020 at 9:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Apr 2, 2020 at 6:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Apr 2, 2020 at 6:18 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; > > > query | calls | wal_bytes | wal_records | wal_num_fpw > > > ----------------------------------------------+-------+-----------+-------------+------------- > > > create index t1_idx_parallel_0 ON t1(id) | 1 | 20389743 | 2762 | 2758 > > > create index t1_idx_parallel_0_bis ON t1(id) | 1 | 20394391 | 2762 | 2758 > > > create index t1_idx_parallel_0_ter ON t1(id) | 1 | 20395155 | 2762 | 2758 > > > create index t1_idx_parallel_1 ON t1(id) | 1 | 20388335 | 2762 | 2758 > > > create index t1_idx_parallel_2 ON t1(id) | 1 | 20389091 | 2762 | 2758 > > > create index t1_idx_parallel_3 ON t1(id) | 1 | 20389847 | 2762 | 2758 > > > create index t1_idx_parallel_4 ON t1(id) | 1 | 20390603 | 2762 | 2758 > > > create index t1_idx_parallel_5 ON t1(id) | 1 | 20391359 | 2762 | 2758 > > > create index t1_idx_parallel_6 ON t1(id) | 1 | 20392115 | 2762 | 2758 > > > create index t1_idx_parallel_7 ON t1(id) | 1 | 20392871 | 2762 | 2758 > > > create index t1_idx_parallel_8 ON t1(id) | 1 | 20393627 | 2762 | 2758 > > > (11 rows) > > > > > > =# select relname, pg_relation_size(oid) from pg_class where relname like '%t1_id%'; > > > relname | pg_relation_size > > > -----------------------+------------------ > > > t1_idx_parallel_0 | 22487040 > > > t1_idx_parallel_0_bis | 22487040 > > > t1_idx_parallel_0_ter | 22487040 > > > t1_idx_parallel_2 | 22487040 > > > t1_idx_parallel_1 | 22487040 > > > t1_idx_parallel_4 | 22487040 > > > t1_idx_parallel_3 | 22487040 > > > t1_idx_parallel_5 | 22487040 > > > t1_idx_parallel_6 | 22487040 > > > t1_idx_parallel_7 | 22487040 > > > t1_idx_parallel_8 | 22487040 > > > (9 rows) > > > > > > > > > So while the number of WAL records and full page images stay constant, we can > > > see some small fluctuations in the total amount of generated WAL data, even for > > > multiple execution of the sequential create index. I'm wondering if the > > > fluctuations are due to some other internal details or if the WalUsage support > > > is just completely broken (although I don't see any obvious issue ATM). > > > > > > > I think we need to know the reason for this. Can you try with small > > size indexes and see if the problem is reproducible? If it is, then it > > will be easier to debug the same. > > I have done some testing to see where these extra WAL size is coming > from. First I tried to create new db before every run then the size > is consistent. But, then on the same server, I tired as Julien showed > in his experiment then I am getting few extra wal bytes from next > create index onwards. And, the waldump(attached in the mail) shows > that is pg_class insert wal. I still have to check that why we need > to write an extra wal size. 
> > create extension pg_stat_statements; > drop table t1; > create table t1(id integer); > insert into t1 select * from generate_series(1, 10); > alter table t1 set (parallel_workers = 0); > vacuum;checkpoint; > select * from pg_stat_statements_reset() ; > create index t1_idx_parallel_0 ON t1(id); > select query, calls, wal_bytes, wal_records, wal_num_fpw from > pg_stat_statements where query ilike '%create index%';; > query > | calls | wal_bytes | wal_records | wal_num_fpw > ----------------------------------------------------------------------------------+-------+-----------+-------------+------------- > create index t1_idx_parallel_0 ON t1(id) > | 1 | 49320 | 23 | 15 > > > drop table t1; > create table t1(id integer); > insert into t1 select * from generate_series(1, 10); > --select * from pg_stat_statements_reset() ; > alter table t1 set (parallel_workers = 0); > vacuum;checkpoint; > create index t1_idx_parallel_1 ON t1(id); > > select query, calls, wal_bytes, wal_records, wal_num_fpw from > pg_stat_statements where query ilike '%create index%';; > postgres[110383]=# select query, calls, wal_bytes, wal_records, > wal_num_fpw from pg_stat_statements; > query > | calls | wal_bytes | wal_records | wal_num_fpw > ----------------------------------------------------------------------------------+-------+-----------+-------------+------------- > create index t1_idx_parallel_1 ON t1(id) > | 1 | 50040 | 23 | 15 > > wal_bytes diff = 50040-49320 = 720 > > Below, WAL record is causing the 720 bytes difference, all other WALs > are of the same size. > t1_idx_parallel_0: > rmgr: Heap len (rec/tot): 54/ 7498, tx: 489, lsn: > 0/0167B9B0, prev 0/0167B970, desc: INSERT off 30 flags 0x01, blkref > #0: rel 1663/13580/1249 > > t1_idx_parallel_1: > rmgr: Heap len (rec/tot): 54/ 8218, tx: 494, lsn: > 0/016B84F8, prev 0/016B84B8, desc: INSERT off 30 flags 0x01, blkref > #0: rel 1663/13580/1249 > > wal diff: 8218 - 7498 = 720 I think now I got the reason. Basically, both of these records are storing the FPW, and FPW size can vary based on the hole size on the page. If hold size is smaller the image length will be more, the image_len= BLCKSZ-hole_size. So in subsequent records, the image size is bigger. You can refer below code in XLogRecordAssemble { .... bimg.length = BLCKSZ - cbimg.hole_length; if (cbimg.hole_length == 0) { .... } else { /* must skip the hole */ rdt_datas_last->data = page; rdt_datas_last->len = bimg.hole_offset; rdt_datas_last->next = ®buf->bkp_rdatas[1]; rdt_datas_last = rdt_datas_last->next; rdt_datas_last->data = page + (bimg.hole_offset + cbimg.hole_length); rdt_datas_last->len = BLCKSZ - (bimg.hole_offset + cbimg.hole_length); } -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
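The XLogRecordAssemble excerpt quoted above is the key to the 720-byte
difference: a full page image is stored as BLCKSZ minus the page's "hole"
(the unused space between the line pointers and the tuples), so the same
logical insert produces a larger image when the target page has less free
space. The toy program below only illustrates that arithmetic; the hole
sizes are hypothetical, chosen to reproduce the 720-byte delta, and the
pg_waldump totals quoted above additionally include record headers:

#include <stdio.h>

#define BLCKSZ 8192

/* backup-block image length: the hole is skipped when the page is copied */
static int
fpw_image_length(int hole_length)
{
    return BLCKSZ - hole_length;
}

int
main(void)
{
    int         first_hole = 748;   /* hypothetical hole sizes */
    int         second_hole = 28;   /* a fuller page means a smaller hole */

    int         first_image = fpw_image_length(first_hole);    /* 7444 */
    int         second_image = fpw_image_length(second_hole);  /* 8164 */

    printf("first image:  %d bytes\n", first_image);
    printf("second image: %d bytes\n", second_image);
    printf("difference:   %d bytes\n", second_image - first_image); /* 720 */
    return 0;
}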
On Fri, Apr 3, 2020 at 8:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> I think now I got the reason. Basically, both of these records are
> storing the FPW, and FPW size can vary based on the hole size on the
> page. If the hole size is smaller, the image length will be more
> (image_len = BLCKSZ - hole_size). So in subsequent records, the image
> size is bigger.
>

This means that if we always re-create the database, or maybe keep
full_page_writes off, then we should get consistent WAL usage data for all
tests.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 8:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I think now I got the reason. Basically, both of these records are > > storing the FPW, and FPW size can vary based on the hole size on the > > page. If hold size is smaller the image length will be more, the > > image_len= BLCKSZ-hole_size. So in subsequent records, the image size > > is bigger. > > > > This means if we always re-create the database or may be keep > full_page_writes to off, then we should get consistent WAL usage data > for all tests. With new database, it is always the same. But, with full-page write, I could see one of the create index is writing extra wal and if we change the older then the new create index at that place will write extra wal. I guess that could be due to a non-in place update in some of the system tables. postgres[58554]=# create extension pg_stat_statements; CREATE EXTENSION postgres[58554]=# postgres[58554]=# create table t1(id integer); CREATE TABLE postgres[58554]=# insert into t1 select * from generate_series(1, 1000000); INSERT 0 1000000 postgres[58554]=# select * from pg_stat_statements_reset() ; pg_stat_statements_reset -------------------------- (1 row) postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 0); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_0 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 1); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_1 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 2); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_2 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 3); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_3 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 4); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_4 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 5); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_5 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 6); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_6 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 7); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_7 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# alter table t1 set (parallel_workers = 8); ALTER TABLE postgres[58554]=# vacuum;checkpoint; VACUUM CHECKPOINT postgres[58554]=# create index t1_idx_parallel_8 ON t1(id); CREATE INDEX postgres[58554]=# postgres[58554]=# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%'; query | calls | wal_bytes | wal_records | wal_num_fpw ------------------------------------------+-------+-----------+-------------+------------- create 
index t1_idx_parallel_0 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_1 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_3 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_2 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_4 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_8 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_6 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_7 ON t1(id) | 1 | 20355953 | 2766 | 2745 create index t1_idx_parallel_5 ON t1(id) | 1 | 20359585 | 2767 | 2745 -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 9:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Apr 3, 2020 at 8:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I think now I got the reason. Basically, both of these records are > > > storing the FPW, and FPW size can vary based on the hole size on the > > > page. If hold size is smaller the image length will be more, the > > > image_len= BLCKSZ-hole_size. So in subsequent records, the image size > > > is bigger. > > > > > > > This means if we always re-create the database or may be keep > > full_page_writes to off, then we should get consistent WAL usage data > > for all tests. > > With new database, it is always the same. But, with full-page write, > I could see one of the create index is writing extra wal and if we > change the older then the new create index at that place will write > extra wal. I guess that could be due to a non-in place update in some > of the system tables. I have analyzed the WAL and there could be multiple reasons for the same. With small data, I have noticed that while inserting in the system index there was a Page Split and that created extra WAL. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 9:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 9:17 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Apr 3, 2020 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Apr 3, 2020 at 8:55 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > I think now I got the reason. Basically, both of these records are > > > > storing the FPW, and FPW size can vary based on the hole size on the > > > > page. If hold size is smaller the image length will be more, the > > > > image_len= BLCKSZ-hole_size. So in subsequent records, the image size > > > > is bigger. > > > > > > > > > > This means if we always re-create the database or may be keep > > > full_page_writes to off, then we should get consistent WAL usage data > > > for all tests. > > > > With new database, it is always the same. But, with full-page write, > > I could see one of the create index is writing extra wal and if we > > change the older then the new create index at that place will write > > extra wal. I guess that could be due to a non-in place update in some > > of the system tables. > > I have analyzed the WAL and there could be multiple reasons for the > same. With small data, I have noticed that while inserting in the > system index there was a Page Split and that created extra WAL. > Thanks for the investigation. I think it is clear that we can't expect the same WAL size even if we repeat the same operation unless it is a fresh database. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 3, 2020 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 9:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I have analyzed the WAL and there could be multiple reasons for the > > same. With small data, I have noticed that while inserting in the > > system index there was a Page Split and that created extra WAL. > > > > Thanks for the investigation. I think it is clear that we can't > expect the same WAL size even if we repeat the same operation unless > it is a fresh database. > Attached find the latest patches. I have modified based on our discussion on user interface thread [1], ran pgindent on all patches, slightly modified one comment based on Dilip's input and added commit messages. I think the patches are in good shape. I would like to commit the first patch in this series tomorrow unless I see more comments or any other objections. The patch-2 might need to be rebased if the other related patch [2] got committed first and we might need to tweak a bit based on the input from other thread [1] where we are discussing user interface for it. [1] - https://www.postgresql.org/message-id/CAA4eK1%2Bo1Vj4Rso09pKOaKhY8QWTA0gWwCL3TGCi1rCLBBf-QQ%40mail.gmail.com [2] - https://www.postgresql.org/message-id/E1jKC4J-0007R3-Bo%40gemulon.postgresql.org -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Fri, Apr 3, 2020 at 7:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 3, 2020 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Apr 3, 2020 at 9:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I have analyzed the WAL and there could be multiple reasons for the > > > same. With small data, I have noticed that while inserting in the > > > system index there was a Page Split and that created extra WAL. > > > > > > > Thanks for the investigation. I think it is clear that we can't > > expect the same WAL size even if we repeat the same operation unless > > it is a fresh database. > > > > Attached find the latest patches. I have modified based on our > discussion on user interface thread [1], ran pgindent on all patches, > slightly modified one comment based on Dilip's input and added commit > messages. I think the patches are in good shape. I would like to > commit the first patch in this series tomorrow unless I see more > comments or any other objections. > Pushed. > The patch-2 might need to be > rebased if the other related patch [2] got committed first and we > might need to tweak a bit based on the input from other thread [1] > where we are discussing user interface for it. > The primary question for patch-2 is whether we want to include WAL usage information for the planning phase as we did for BUFFERS in recent commit ce77abe63c (Include information on buffer usage during planning phase, in EXPLAIN output, take two.). Initially, I thought it might be a good idea to do the same for WAL but after reading the thread that leads to commit, I am not sure if there is any pressing need to include WAL information for the planning phase. Because during planning we might not write much WAL (with the exception of WAL due to setting of hint-bits) so users might not care much. What do you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Apr 04, 2020 at 10:38:14AM +0530, Amit Kapila wrote: > On Fri, Apr 3, 2020 at 7:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Apr 3, 2020 at 9:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Apr 3, 2020 at 9:35 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > I have analyzed the WAL and there could be multiple reasons for the > > > > same. With small data, I have noticed that while inserting in the > > > > system index there was a Page Split and that created extra WAL. > > > > > > > > > > Thanks for the investigation. I think it is clear that we can't > > > expect the same WAL size even if we repeat the same operation unless > > > it is a fresh database. > > > > > > > Attached find the latest patches. I have modified based on our > > discussion on user interface thread [1], ran pgindent on all patches, > > slightly modified one comment based on Dilip's input and added commit > > messages. I think the patches are in good shape. I would like to > > commit the first patch in this series tomorrow unless I see more > > comments or any other objections. > > > > Pushed. Thanks! > > The patch-2 might need to be > > rebased if the other related patch [2] got committed first and we > > might need to tweak a bit based on the input from other thread [1] > > where we are discussing user interface for it. > > > > The primary question for patch-2 is whether we want to include WAL > usage information for the planning phase as we did for BUFFERS in > recent commit ce77abe63c (Include information on buffer usage during > planning phase, in EXPLAIN output, take two.). Initially, I thought > it might be a good idea to do the same for WAL but after reading the > thread that leads to commit, I am not sure if there is any pressing > need to include WAL information for the planning phase. Because > during planning we might not write much WAL (with the exception of WAL > due to setting of hint-bits) so users might not care much. What do > you think? I agree that WAL activity during planning shouldn't be very frequent, but it might still be worthwhile to add it. I'm wondering how stable the normalized WAL information would be in some regression tests, as the counters are only showed if non zero. Maybe it'd be better to remove them from the output, same as the buffers?
On Sat, Apr 4, 2020 at 11:33 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Sat, Apr 04, 2020 at 10:38:14AM +0530, Amit Kapila wrote: > > > > The patch-2 might need to be > > > rebased if the other related patch [2] got committed first and we > > > might need to tweak a bit based on the input from other thread [1] > > > where we are discussing user interface for it. > > > > > > > The primary question for patch-2 is whether we want to include WAL > > usage information for the planning phase as we did for BUFFERS in > > recent commit ce77abe63c (Include information on buffer usage during > > planning phase, in EXPLAIN output, take two.). Initially, I thought > > it might be a good idea to do the same for WAL but after reading the > > thread that leads to commit, I am not sure if there is any pressing > > need to include WAL information for the planning phase. Because > > during planning we might not write much WAL (with the exception of WAL > > due to setting of hint-bits) so users might not care much. What do > > you think? > > > I agree that WAL activity during planning shouldn't be very frequent, but it > might still be worthwhile to add it. > We can add if we want but I am not able to convince myself for that. Do you have any use case in mind? I think in most of the cases (except for hint-bit WAL) it will be zero. If we are not sure of this we can also discuss it separately in a new thread once this patch-series is committed and see if anybody else sees the value of it and if so adding the code should be easy. > I'm wondering how stable the normalized > WAL information would be in some regression tests, as the counters are only > showed if non zero. Maybe it'd be better to remove them from the output, same > as the buffers? > Which regression tests are you referring to? pg_stat_statements? If so, why would it be unstable? It should always generate WAL although the exact values may differ and we have already taken care of that in the patch, no? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Apr 04, 2020 at 02:12:59PM +0530, Amit Kapila wrote: > On Sat, Apr 4, 2020 at 11:33 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Sat, Apr 04, 2020 at 10:38:14AM +0530, Amit Kapila wrote: > > > > > > The patch-2 might need to be > > > > rebased if the other related patch [2] got committed first and we > > > > might need to tweak a bit based on the input from other thread [1] > > > > where we are discussing user interface for it. > > > > > > > > > > The primary question for patch-2 is whether we want to include WAL > > > usage information for the planning phase as we did for BUFFERS in > > > recent commit ce77abe63c (Include information on buffer usage during > > > planning phase, in EXPLAIN output, take two.). Initially, I thought > > > it might be a good idea to do the same for WAL but after reading the > > > thread that leads to commit, I am not sure if there is any pressing > > > need to include WAL information for the planning phase. Because > > > during planning we might not write much WAL (with the exception of WAL > > > due to setting of hint-bits) so users might not care much. What do > > > you think? > > > > > > I agree that WAL activity during planning shouldn't be very frequent, but it > > might still be worthwhile to add it. > > > > We can add if we want but I am not able to convince myself for that. > Do you have any use case in mind? I think in most of the cases > (except for hint-bit WAL) it will be zero. If we are not sure of this > we can also discuss it separately in a new thread once this > patch-series is committed and see if anybody else sees the value of it > and if so adding the code should be easy. I'm mostly thinking of people trying to investigate possible slowdowns on a hot-standby replica with a primary without wal_log_hints. If they explicitly ask for WAL information, we should provide them, even if it's quite unlikely to happen. > > > I'm wondering how stable the normalized > > WAL information would be in some regression tests, as the counters are only > > showed if non zero. Maybe it'd be better to remove them from the output, same > > as the buffers? > > > > Which regression tests are you referring to? pg_stat_statements? If > so, why would it be unstable? It should always generate WAL although > the exact values may differ and we have already taken care of that in > the patch, no? I'm talking about a hypothetical new EXPLAIN (ALAYZE, WAL) regression test, which could be unstable for similar reason to why the first attempt to add BUFFERS in the planning part of EXPLAIN was unstable. I thought that's why you were hesitating of adding it.
On Sat, Apr 4, 2020 at 2:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > We can add if we want but I am not able to convince myself for that. > > Do you have any use case in mind? I think in most of the cases > > (except for hint-bit WAL) it will be zero. If we are not sure of this > > we can also discuss it separately in a new thread once this > > patch-series is committed and see if anybody else sees the value of it > > and if so adding the code should be easy. > > > I'm mostly thinking of people trying to investigate possible slowdowns on a > hot-standby replica with a primary without wal_log_hints. If they explicitly > ask for WAL information, we should provide them, even if it's quite unlikely to > happen. > Yeah, possible but I am not completely sure. I would like to hear the opinion of others if any before adding code for this. How about if we first commit pg_stat_statements and wait for this till Monday and if nobody responds we can commit the current patch but would start a new thread and try to get the opinion of others? > > > > > > I'm wondering how stable the normalized > > > WAL information would be in some regression tests, as the counters are only > > > showed if non zero. Maybe it'd be better to remove them from the output, same > > > as the buffers? > > > > > > > Which regression tests are you referring to? pg_stat_statements? If > > so, why would it be unstable? It should always generate WAL although > > the exact values may differ and we have already taken care of that in > > the patch, no? > > > I'm talking about a hypothetical new EXPLAIN (ALAYZE, WAL) regression test, > which could be unstable for similar reason to why the first attempt to add > BUFFERS in the planning part of EXPLAIN was unstable. > oh, then leave it for now because I don't see much use of those as the code path can anyway be hit by the tests added by pg_stat_statements patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Apr 04, 2020 at 02:39:32PM +0530, Amit Kapila wrote: > On Sat, Apr 4, 2020 at 2:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > We can add if we want but I am not able to convince myself for that. > > > Do you have any use case in mind? I think in most of the cases > > > (except for hint-bit WAL) it will be zero. If we are not sure of this > > > we can also discuss it separately in a new thread once this > > > patch-series is committed and see if anybody else sees the value of it > > > and if so adding the code should be easy. > > > > > > I'm mostly thinking of people trying to investigate possible slowdowns on a > > hot-standby replica with a primary without wal_log_hints. If they explicitly > > ask for WAL information, we should provide them, even if it's quite unlikely to > > happen. > > > > Yeah, possible but I am not completely sure. I would like to hear the > opinion of others if any before adding code for this. How about if we > first commit pg_stat_statements and wait for this till Monday and if > nobody responds we can commit the current patch but would start a new > thread and try to get the opinion of others? I'm fine with it. > > > > > > > > > > I'm wondering how stable the normalized > > > > WAL information would be in some regression tests, as the counters are only > > > > showed if non zero. Maybe it'd be better to remove them from the output, same > > > > as the buffers? > > > > > > > > > > Which regression tests are you referring to? pg_stat_statements? If > > > so, why would it be unstable? It should always generate WAL although > > > the exact values may differ and we have already taken care of that in > > > the patch, no? > > > > > > I'm talking about a hypothetical new EXPLAIN (ALAYZE, WAL) regression test, > > which could be unstable for similar reason to why the first attempt to add > > BUFFERS in the planning part of EXPLAIN was unstable. > > > > oh, then leave it for now because I don't see much use of those as the > code path can anyway be hit by the tests added by pg_stat_statements > patch. > Perfect then!
On Sat, Apr 4, 2020 at 2:50 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Sat, Apr 04, 2020 at 02:39:32PM +0530, Amit Kapila wrote: > > On Sat, Apr 4, 2020 at 2:24 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > We can add if we want but I am not able to convince myself for that. > > > > Do you have any use case in mind? I think in most of the cases > > > > (except for hint-bit WAL) it will be zero. If we are not sure of this > > > > we can also discuss it separately in a new thread once this > > > > patch-series is committed and see if anybody else sees the value of it > > > > and if so adding the code should be easy. > > > > > > > > > I'm mostly thinking of people trying to investigate possible slowdowns on a > > > hot-standby replica with a primary without wal_log_hints. If they explicitly > > > ask for WAL information, we should provide them, even if it's quite unlikely to > > > happen. > > > > > > > Yeah, possible but I am not completely sure. I would like to hear the > > opinion of others if any before adding code for this. How about if we > > first commit pg_stat_statements and wait for this till Monday and if > > nobody responds we can commit the current patch but would start a new > > thread and try to get the opinion of others? > > > I'm fine with it. > I have pushed pg_stat_statements and Explain related patches. I am now looking into (auto)vacuum patch and have few comments. @@ -614,6 +616,9 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params, TimestampDifference(starttime, endtime, &secs, &usecs); + memset(&walusage, 0, sizeof(WalUsage)); + WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start); + read_rate = 0; write_rate = 0; if ((secs > 0) || (usecs > 0)) @@ -666,7 +671,13 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params, (long long) VacuumPageDirty); appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"), read_rate, write_rate); - appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0)); + appendStringInfo(&buf, _("system usage: %s\n"), pg_rusage_show(&ru0)); + appendStringInfo(&buf, + _("WAL usage: %ld records, %ld full page writes, " + UINT64_FORMAT " bytes"), + walusage.wal_records, + walusage.wal_num_fpw, + walusage.wal_bytes); Here, we are not displaying Buffers related data, so why do we think it is important to display WAL data? I see some point in displaying Buffers and WAL data in a vacuum (verbose), but I feel it is better to make a case for both the statistics together rather than just displaying one and leaving other. I think the other change related to autovacuum stats seems okay to me. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
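For context, a hedged sketch of where the discussed output would surface, based on the format string in the snippet above; the log line is illustrative and the final wording (and whether the verbose part stays at all) depends on the outcome of this discussion:

-- Ask autovacuum to log every run so the extra line becomes visible:
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();
-- An autovacuum log entry would then include, alongside the buffer and
-- read/write rate lines, something like:
--   WAL usage: 2087 records, 13 full page writes, 3122814 bytes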
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Tue, 31 Mar 2020 at 14:13, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > The patch for vacuum conflicts with recent changes in vacuum. So I've > > > attached rebased one. > > > > > > > + /* > > + * Next, accumulate buffer usage. (This must wait for the workers to > > + * finish, or we might get incomplete data.) > > + */ > > + for (i = 0; i < nworkers; i++) > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > + > > > > This should be done for launched workers aka > > lps->pcxt->nworkers_launched. I think a similar problem exists in > > create index related patch. > > You're right. Fixed in the new patches. > > On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > Just minor nitpicking: > > > > + int i; > > > > Assert(!IsParallelWorker()); > > Assert(ParallelVacuumIsActive(lps)); > > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats, > > /* Wait for all vacuum workers to finish */ > > WaitForParallelWorkersToFinish(lps->pcxt); > > > > + /* > > + * Next, accumulate buffer usage. (This must wait for the workers to > > + * finish, or we might get incomplete data.) > > + */ > > + for (i = 0; i < nworkers; i++) > > + InstrAccumParallelQuery(&lps->buffer_usage[i]); > > > > We now allow declaring a variable in those loops, so it may be better to avoid > > declaring i outside the for scope? > > We can do that but I was not sure if it's good since other codes > around there don't use that. So I'd like to leave it for committers. > It's a trivial change. > I've updated the buffer usage patch for parallel index creation as the previous patch conflicts with commit df3b181499b40. This comment in commit df3b181499b40 seems the comment which had been replaced by Amit with a better sentence when introducing buffer usage to parallel vacuum. + /* + * Estimate space for WalUsage -- PARALLEL_KEY_WAL_USAGE + * + * WalUsage during execution of maintenance command can be used by an + * extension that reports the WAL usage, such as pg_stat_statements. We + * have no way of knowing whether anyone's looking at pgWalUsage, so do it + * unconditionally. + */ Would the following sentence in lazyvacuum.c be also better for parallel create index? * If there are no extensions loaded that care, we could skip this. We * have no way of knowing whether anyone's looking at pgBufferUsage or * pgWalUsage, so do it unconditionally. The attached patch changes to the above comment and removed the code that is used to un-support only buffer usage accumulation. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Mon, Apr 6, 2020 at 11:19 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > The attached patch changes to the above comment and removed the code > that is used to un-support only buffer usage accumulation. > So, IIUC, the purpose of this patch will be to count the buffer usage due to the heap scan (in heapam_index_build_range_scan) we perform while parallel create index? Because the index creation itself won't use buffer manager. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Mon, 6 Apr 2020 at 16:16, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 6, 2020 at 11:19 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > The attached patch changes to the above comment and removed the code > > that is used to un-support only buffer usage accumulation. > > > > So, IIUC, the purpose of this patch will be to count the buffer usage > due to the heap scan (in heapam_index_build_range_scan) we perform > while parallel create index? Because the index creation itself won't > use buffer manager. Oops, I'd missed Peter's comment. Btree index doesn't use heapam_index_build_range_scan so it's not necessary. Sorry for the noise. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 06, 2020 at 08:55:01AM +0530, Amit Kapila wrote: > On Sat, Apr 4, 2020 at 2:50 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > I have pushed pg_stat_statements and Explain related patches. I am > now looking into (auto)vacuum patch and have few comments. > Thanks! > @@ -614,6 +616,9 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params, > > TimestampDifference(starttime, endtime, &secs, &usecs); > > + memset(&walusage, 0, sizeof(WalUsage)); > + WalUsageAccumDiff(&walusage, &pgWalUsage, &walusage_start); > + > read_rate = 0; > write_rate = 0; > if ((secs > 0) || (usecs > 0)) > @@ -666,7 +671,13 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params, > (long long) VacuumPageDirty); > appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: > %.3f MB/s\n"), > read_rate, write_rate); > - appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0)); > + appendStringInfo(&buf, _("system usage: %s\n"), pg_rusage_show(&ru0)); > + appendStringInfo(&buf, > + _("WAL usage: %ld records, %ld full page writes, " > + UINT64_FORMAT " bytes"), > + walusage.wal_records, > + walusage.wal_num_fpw, > + walusage.wal_bytes); > > Here, we are not displaying Buffers related data, so why do we think > it is important to display WAL data? I see some point in displaying > Buffers and WAL data in a vacuum (verbose), but I feel it is better to > make a case for both the statistics together rather than just > displaying one and leaving other. I think the other change related to > autovacuum stats seems okay to me. One thing is that the amount of WAL, and more precisely FPW, is quite unpredictable wrt. vacuum and even more anti-wraparound vacuum, so this is IMHO a very useful metric. That being said I totally agree with you that both should be displayed. Should I send a patch to also expose it?
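A sketch of the kind of investigation described above, assuming the pg_stat_statements column names used in this thread (wal_records, wal_num_fpw, wal_bytes); the committed column names may differ:

-- On the primary: check whether hint-bit-only changes are being WAL-logged at all,
-- then see which statements dominate WAL (and full page write) volume.
SHOW wal_log_hints;
SELECT query, calls, wal_records, wal_num_fpw, wal_bytes
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 10;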
On Mon, Apr 6, 2020 at 1:53 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, Apr 06, 2020 at 08:55:01AM +0530, Amit Kapila wrote: > > > > Here, we are not displaying Buffers related data, so why do we think > > it is important to display WAL data? I see some point in displaying > > Buffers and WAL data in a vacuum (verbose), but I feel it is better to > > make a case for both the statistics together rather than just > > displaying one and leaving other. I think the other change related to > > autovacuum stats seems okay to me. > > One thing is that the amount of WAL, and more precisely FPW, is quite > unpredictable wrt. vacuum and even more anti-wraparound vacuum, so this is IMHO > a very useful metric. > I agree but we already have a way via pg_stat_statements to find it if the metric is so useful. > That being said I totally agree with you that both > should be displayed. Should I send a patch to also expose it? > I think this should be a separate proposal. Let's not add things unless they are really essential. We can separately discuss of enhancing vacuum verbose for Buffer and WAL usage stats and see if others also find that information useful. I think you can send a patch by removing the code I mentioned above if you agree. Thanks for working on this. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Mon, Apr 6, 2020 at 12:55 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Mon, 6 Apr 2020 at 16:16, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Apr 6, 2020 at 11:19 AM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > The attached patch changes to the above comment and removed the code > > > that is used to un-support only buffer usage accumulation. > > > > > > > So, IIUC, the purpose of this patch will be to count the buffer usage > > due to the heap scan (in heapam_index_build_range_scan) we perform > > while parallel create index? Because the index creation itself won't > > use buffer manager. > > Oops, I'd missed Peter's comment. Btree index doesn't use > heapam_index_build_range_scan so it's not necessary. > AFAIU, it uses heapam_index_build_range_scan but for writing to index, it doesn't use buffer manager. So, I guess probably we can accumulate BufferUsage stats for parallel create index. What I wanted to know is whether the extra lookup for pg_amproc or any other catalog access via parallel workers is fine or we somehow want to eliminate that? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Apr 06, 2020 at 02:34:36PM +0530, Amit Kapila wrote: > On Mon, Apr 6, 2020 at 1:53 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Mon, Apr 06, 2020 at 08:55:01AM +0530, Amit Kapila wrote: > > > > > > Here, we are not displaying Buffers related data, so why do we think > > > it is important to display WAL data? I see some point in displaying > > > Buffers and WAL data in a vacuum (verbose), but I feel it is better to > > > make a case for both the statistics together rather than just > > > displaying one and leaving other. I think the other change related to > > > autovacuum stats seems okay to me. > > > > One thing is that the amount of WAL, and more precisely FPW, is quite > > unpredictable wrt. vacuum and even more anti-wraparound vacuum, so this is IMHO > > a very useful metric. > > > > I agree but we already have a way via pg_stat_statements to find it if > the metric is so useful. > Agreed. > > > That being said I totally agree with you that both > > should be displayed. Should I send a patch to also expose it? > > > > I think this should be a separate proposal. Let's not add things > unless they are really essential. We can separately discuss of > enhancing vacuum verbose for Buffer and WAL usage stats and see if > others also find that information useful. I think you can send a > patch by removing the code I mentioned above if you agree. Thanks for > working on this. Thanks! v15 attached.
Attachment
On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have pushed pg_stat_statements and Explain related patches. I am
> now looking into (auto)vacuum patch and have few comments.
I wasn't paying much attention to this thread. May I suggest changing wal_num_fpw to wal_fpw? wal_records and wal_bytes do not have a prefix 'num'. It seems inconsistent to me.
Regards,
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 06, 2020 at 10:12:55AM -0300, Euler Taveira wrote: > On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > I have pushed pg_stat_statements and Explain related patches. I am > > now looking into (auto)vacuum patch and have few comments. > > > > I wasn't paying much attention to this thread. May I suggest changing > wal_num_fpw to wal_fpw? wal_records and wal_bytes does not have a prefix > 'num'. It seems inconsistent to me. > If we want to be consistent shouldn't we rename it to wal_fpws? FTR I don't like much either version.
On Mon, 6 Apr 2020 at 10:37, Julien Rouhaud <rjuju123@gmail.com> wrote:
On Mon, Apr 06, 2020 at 10:12:55AM -0300, Euler Taveira wrote:
> On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> >
> > I have pushed pg_stat_statements and Explain related patches. I am
> > now looking into (auto)vacuum patch and have few comments.
> >
> > I wasn't paying much attention to this thread. May I suggest changing
> wal_num_fpw to wal_fpw? wal_records and wal_bytes does not have a prefix
> 'num'. It seems inconsistent to me.
>
> If we want to be consistent shouldn't we rename it to wal_fpws? FTR I don't
> like much either version.
Since FPW is an acronym, plural form reads better when you are using uppercase (such as FPWs or FPW's); thus, I prefer singular form because parameter names are lowercase. Function description will clarify that this is "number of WAL full page writes".
Regards,
Euler Taveira http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I noticed in some of the screenshots that were tweeted that for example in WAL: records=1 bytes=56 there are two spaces between pieces of data. This doesn't match the rest of the EXPLAIN output. Can that be adjusted? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 06, 2020 at 05:01:30PM +0200, Peter Eisentraut wrote: > I noticed in some of the screenshots that were tweeted that for example in > > WAL: records=1 bytes=56 > > there are two spaces between pieces of data. This doesn't match the rest of > the EXPLAIN output. Can that be adjusted? We talked about that here: https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com -- Justin
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Peter Geoghegan
On Mon, Apr 6, 2020 at 2:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > AFAIU, it uses heapam_index_build_range_scan but for writing to index, > it doesn't use buffer manager. Right. It doesn't need to use the buffer manager to write to the index, unlike (say) GIN's CREATE INDEX. -- Peter Geoghegan
On Mon, Apr 6, 2020 at 10:01 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Mon, Apr 06, 2020 at 05:01:30PM +0200, Peter Eisentraut wrote: > > I noticed in some of the screenshots that were tweeted that for example in > > > > WAL: records=1 bytes=56 > > > > there are two spaces between pieces of data. This doesn't match the rest of > > the EXPLAIN output. Can that be adjusted? > > We talked about that here: > https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com > Yeah. Just to brief here, the main reason was that one of the fields (full page writes) already had a single space and then we had prior cases as mentioned in Justin's email [1] where we use two spaces which lead us to decide using two spaces in this case. Now, we can change back to one space as suggested by you but I am not sure if that is an improvement over what we have done. Let me know if you think otherwise. [1] - https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Apr 6, 2020 at 7:58 PM Euler Taveira <euler.taveira@2ndquadrant.com> wrote: > > On Mon, 6 Apr 2020 at 10:37, Julien Rouhaud <rjuju123@gmail.com> wrote: >> >> On Mon, Apr 06, 2020 at 10:12:55AM -0300, Euler Taveira wrote: >> > On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote: >> > >> > > >> > > I have pushed pg_stat_statements and Explain related patches. I am >> > > now looking into (auto)vacuum patch and have few comments. >> > > >> > > I wasn't paying much attention to this thread. May I suggest changing >> > wal_num_fpw to wal_fpw? wal_records and wal_bytes does not have a prefix >> > 'num'. It seems inconsistent to me. >> > >> >> If we want to be consistent shouldn't we rename it to wal_fpws? FTR I don't >> like much either version. > > > Since FPW is an acronym, plural form reads better when you are using uppercase (such as FPWs or FPW's); thus, I prefersingular form because parameter names are lowercase. Function description will clarify that this is "number of WALfull page writes". > I like Euler's suggestion to change wal_num_fpw to wal_fpw. It is better if others who didn't like this name can also share their opinion now because changing multiple times the same thing is not a good idea. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Tue, 7 Apr 2020 at 02:40, Peter Geoghegan <pg@bowt.ie> wrote: > > On Mon, Apr 6, 2020 at 2:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > AFAIU, it uses heapam_index_build_range_scan but for writing to index, > > it doesn't use buffer manager. > > Right. It doesn't need to use the buffer manager to write to the > index, unlike (say) GIN's CREATE INDEX. Hmm, after more thoughts and testing, it seems to me that parallel btree index creation uses buffer manager while scanning the table in parallel, i.e in heapam_index_build_range_scan, which affects shared_blks_xxx in pg_stat_statements. I've some parallel create index tests with the current HEAD and with the attached patch. The table has 44248 blocks. HEAD, no workers: -[ RECORD 1 ]-------+---------- total_plan_time | 0 total_plan_time | 0 shared_blks_hit | 148 shared_blks_read | 44281 total_read_blks | 44429 shared_blks_dirtied | 44261 shared_blks_written | 24644 wal_records | 71693 wal_num_fpw | 71682 wal_bytes | 566815038 HEAD, 4 workers: -[ RECORD 1 ]-------+---------- total_plan_time | 0 total_plan_time | 0 shared_blks_hit | 160 shared_blks_read | 8892 total_read_blks | 9052 shared_blks_dirtied | 8871 shared_blks_written | 5342 wal_records | 71693 wal_num_fpw | 71682 wal_bytes | 566815038 The WAL usage statistics are good but the buffer usage statistics seem not correct. Patched, no workers: -[ RECORD 1 ]-------+---------- total_plan_time | 0 total_plan_time | 0 shared_blks_hit | 148 shared_blks_read | 44281 total_read_blks | 44429 shared_blks_dirtied | 44261 shared_blks_written | 24843 wal_records | 71693 wal_num_fpw | 71682 wal_bytes | 566815038 Patched, 4 workers: -[ RECORD 1 ]-------+---------- total_plan_time | 0 total_plan_time | 0 shared_blks_hit | 172 shared_blks_read | 44282 total_read_blks | 44454 shared_blks_dirtied | 44261 shared_blks_written | 26968 wal_records | 71693 wal_num_fpw | 71682 wal_bytes | 566815038 Buffer usage statistics seem correct. The small differences would be catalog lookups Peter mentioned. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
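A sketch of the kind of query that produces numbers like the above, assuming the column names in this patch version and computing total_read_blks as hit plus read; this is not necessarily the exact query used for the test:

SELECT shared_blks_hit,
       shared_blks_read,
       shared_blks_hit + shared_blks_read AS total_read_blks,
       shared_blks_dirtied,
       shared_blks_written,
       wal_records, wal_num_fpw, wal_bytes
FROM pg_stat_statements
WHERE query ILIKE 'create index%';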
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > Buffer usage statistics seem correct. The small differences would be > catalog lookups Peter mentioned. > Agreed, but can you check which part of code does that lookup? I want to see if we can avoid that from buffer usage stats or at least write a comment about it, otherwise, we might have to face this question again and again. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 7, 2020 at 4:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 6, 2020 at 7:58 PM Euler Taveira > <euler.taveira@2ndquadrant.com> wrote: > > > > On Mon, 6 Apr 2020 at 10:37, Julien Rouhaud <rjuju123@gmail.com> wrote: > >> > >> On Mon, Apr 06, 2020 at 10:12:55AM -0300, Euler Taveira wrote: > >> > On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote: > >> > > >> > > > >> > > I have pushed pg_stat_statements and Explain related patches. I am > >> > > now looking into (auto)vacuum patch and have few comments. > >> > > > >> > > I wasn't paying much attention to this thread. May I suggest changing > >> > wal_num_fpw to wal_fpw? wal_records and wal_bytes does not have a prefix > >> > 'num'. It seems inconsistent to me. > >> > > >> > >> If we want to be consistent shouldn't we rename it to wal_fpws? FTR I don't > >> like much either version. > > > > > > Since FPW is an acronym, plural form reads better when you are using uppercase (such as FPWs or FPW's); thus, I prefersingular form because parameter names are lowercase. Function description will clarify that this is "number of WALfull page writes". > > > > I like Euler's suggestion to change wal_num_fpw to wal_fpw. It is > better if others who didn't like this name can also share their > opinion now because changing multiple times the same thing is not a > good idea. +1 About Justin and your comments on the other thread: On Tue, Apr 7, 2020 at 4:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 6, 2020 at 10:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > On Thu, Apr 02, 2020 at 08:29:31AM +0200, Julien Rouhaud wrote: > > > > > "full page records" seems to be showing the number of full page > > > > > images, not the record having full page images. > > > > > > > > I am not sure what exactly is a difference but it is the records > > > > having full page images. Julien correct me if I am wrong. > > > > > Obviously previous complaints about the meaning and parsability of > > > "full page writes" should be addressed here for consistency. > > > > There's a couple places that say "full page image records" which I think is > > language you were trying to avoid. It's the number of pages, not the number of > > records, no ? I see explain and autovacuum say what I think is wanted, but > > these say the wrong thing? Find attached slightly larger patch. > > > > $ git grep 'image record' > > contrib/pg_stat_statements/pg_stat_statements.c: int64 wal_num_fpw; /* # of WAL full page image recordsgenerated */ > > doc/src/sgml/ref/explain.sgml: number of records, number of full page image records and amount of WAL > > > > Few comments: > 1. > - int64 wal_num_fpw; /* # of WAL full page image records generated */ > + int64 wal_num_fpw; /* # of WAL full page images generated */ > > Let's change comment as " /* # of WAL full page writes generated */" > to be consistent with other places like instrument.h. Also, make a > similar change at other places if required. Agreed. That's pg_stat_statements.c and instrument.h. I'll send a patch once we reach consensus with the rest of the comments. > 2. > <entry> > - Total amount of WAL bytes generated by the statement > + Total number of WAL bytes generated by the statement > </entry> > > I feel the previous text was better as this field can give us the size > of WAL with which we can answer "how much WAL data is generated by a > particular statement?". Julien, do you have any thoughts on this? I also prefer "amount" as it feels more natural. 
I'm not a native English speaker though, so maybe I'm just biased.
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Tue, 7 Apr 2020 at 17:42, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > Buffer usage statistics seem correct. The small differences would be > > catalog lookups Peter mentioned. > > > > Agreed, but can you check which part of code does that lookup? I want > to see if we can avoid that from buffer usage stats or at least write > a comment about it, otherwise, we might have to face this question > again and again. Okay, I'll check it. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-04-07 04:12, Amit Kapila wrote: > On Mon, Apr 6, 2020 at 10:01 PM Justin Pryzby <pryzby@telsasoft.com> wrote: >> >> On Mon, Apr 06, 2020 at 05:01:30PM +0200, Peter Eisentraut wrote: >>> I noticed in some of the screenshots that were tweeted that for example in >>> >>> WAL: records=1 bytes=56 >>> >>> there are two spaces between pieces of data. This doesn't match the rest of >>> the EXPLAIN output. Can that be adjusted? >> >> We talked about that here: >> https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com >> > > Yeah. Just to brief here, the main reason was that one of the fields > (full page writes) already had a single space and then we had prior > cases as mentioned in Justin's email [1] where we use two spaces which > lead us to decide using two spaces in this case. We also have existing cases for the other way: actual time=0.050..0.052 Buffers: shared hit=3 dirtied=1 The cases mentioned by Justin are not formatted in a key=value format, so it's not quite the same, but it also raises the question why they are not. Let's figure out a way to consolidate this without making up a third format. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Tue, 7 Apr 2020 at 18:29, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Tue, 7 Apr 2020 at 17:42, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > Buffer usage statistics seem correct. The small differences would be > > > catalog lookups Peter mentioned. > > > > > > > Agreed, but can you check which part of code does that lookup? I want > > to see if we can avoid that from buffer usage stats or at least write > > a comment about it, otherwise, we might have to face this question > > again and again. > > Okay, I'll check it. > I've checked the buffer usage differences when parallel btree index creation. TL;DR; During tuple sorting individual parallel workers read blocks of pg_amproc and pg_amproc_fam_proc_index to get the sort support function. The call flow is like: ParallelWorkerMain() _bt_parallel_scan_and_sort() tuplesort_begin_index_btree() PrepareSortSupportFromIndexRel() FinishSortSupportFunction() get_opfamily_proc() The details are as follows. I populated the test table by the following scripts: create table test (c int) with (autovacuum_enabled = off, parallel_workers = 8); insert into test select generate_series(1,10000000); and create index DDL is: create index test_idx on test (c); Before executing the test script, I've put code at the following 4 places which checks the buffer usage at that point, and calculated the difference between points: (a), (b) and (c). For example, (b) shows the number of blocks read or hit during executing scanning heap and building index. 1. Before executing CREATE INDEX command (at pgss_ProcessUtility()) (a) 2. Before parallel create index (at _bt_begin_parallel()) (b) 3. After parallel create index, after accumlating workers stats (at _bt_end_parallel()) (c) 4. After executing CREATE INDEX command (at pgss_ProcessUtility()) And here is the results: 2 workers: (a) hit: 107, read: 26 (b) hit: 12(=6+3+3), read: 44248(=15538+14453+14527) (c) hit: 13, read: 2 total hit: 132, read:44276 4 workers: (a) hit: 107, read: 26 (b) hit: 18(=6+3+3+3+3), read: 44248(=9368+8582+8544+9250+8504) (c) hit: 13, read: 2 total hit: 138, read:44276 The table 'test' has 44276 blocks. From the above results, the total number of reading blocks (44248 blocks) during parallel index creation is stable and equals to the number of blocks of the test table. And we can see that extra three blocks are read per workers. These three blocks are two for pg_amproc_fam_proc_index and one for pg_amproc. That is, individual parallel workers accesses these relations to get the sort support function. 
The full backtrace is: * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP * frame #0: 0x00007fff779c561a libsystem_kernel.dylib`__select + 10 frame #1: 0x000000010cc9f90d postgres`pg_usleep(microsec=20000000) at pgsleep.c:56:10 frame #2: 0x000000010ca5a668 postgres`ReadBuffer_common(smgr=0x00007fe872848f70, relpersistence='p', forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0000000000000000, hit=0x00007ffee363071b) at bufmgr.c:685:3 frame #3: 0x000000010ca5a4b6 postgres`ReadBufferExtended(reln=0x000000010d58f790, forkNum=MAIN_FORKNUM, blockNum=3, mode=RBM_NORMAL, strategy=0x0000000000000000) at bufmgr.c:628:8 frame #4: 0x000000010ca5a397 postgres`ReadBuffer(reln=0x000000010d58f790, blockNum=3) at bufmgr.c:560:9 frame #5: 0x000000010c67187e postgres`_bt_getbuf(rel=0x000000010d58f790, blkno=3, access=1) at nbtpage.c:792:9 frame #6: 0x000000010c670507 postgres`_bt_getroot(rel=0x000000010d58f790, access=1) at nbtpage.c:294:13 frame #7: 0x000000010c679393 postgres`_bt_search(rel=0x000000010d58f790, key=0x00007ffee36312d0, bufP=0x00007ffee3631bec, access=1, snapshot=0x00007fe8728388e0) at nbtsearch.c:107:10 frame #8: 0x000000010c67b489 postgres`_bt_first(scan=0x00007fe86f814998, dir=ForwardScanDirection) at nbtsearch.c:1355:10 frame #9: 0x000000010c676869 postgres`btgettuple(scan=0x00007fe86f814998, dir=ForwardScanDirection) at nbtree.c:253:10 frame #10: 0x000000010c6656ad postgres`index_getnext_tid(scan=0x00007fe86f814998, direction=ForwardScanDirection) at indexam.c:530:10 frame #11: 0x000000010c66585b postgres`index_getnext_slot(scan=0x00007fe86f814998, direction=ForwardScanDirection, slot=0x00007fe86f814880) at indexam.c:622:10 frame #12: 0x000000010c663eac postgres`systable_getnext(sysscan=0x00007fe86f814828) at genam.c:454:7 frame #13: 0x000000010cc0be41 postgres`SearchCatCacheMiss(cache=0x00007fe872818e80, nkeys=4, hashValue=3052139574, hashIndex=6, v1=1976, v2=23, v3=23, v4=2) at catcache.c:1368:9 frame #14: 0x000000010cc0bced postgres`SearchCatCacheInternal(cache=0x00007fe872818e80, nkeys=4, v1=1976, v2=23, v3=23, v4=2) at catcache.c:1299:9 frame #15: 0x000000010cc0baa8 postgres`SearchCatCache4(cache=0x00007fe872818e80, v1=1976, v2=23, v3=23, v4=2) at catcache.c:1191:9 frame #16: 0x000000010cc27c82 postgres`SearchSysCache4(cacheId=5, key1=1976, key2=23, key3=23, key4=2) at syscache.c:1156:9 frame #17: 0x000000010cc105dd postgres`get_opfamily_proc(opfamily=1976, lefttype=23, righttype=23, procnum=2) at lsyscache.c:751:7 frame #18: 0x000000010cc72e1d postgres`FinishSortSupportFunction(opfamily=1976, opcintype=23, ssup=0x00007fe86f8147d0) at sortsupport.c:99:24 frame #19: 0x000000010cc73100 postgres`PrepareSortSupportFromIndexRel(indexRel=0x000000010d5ced48, strategy=1, ssup=0x00007fe86f8147d0) at sortsupport.c:176:2 frame #20: 0x000000010cc75463 postgres`tuplesort_begin_index_btree(heapRel=0x000000010d5cf808, indexRel=0x000000010d5ced48, enforceUnique=false, workMem=21845, coordinate=0x00007fe872839248, randomAccess=false) at tuplesort.c:1114:3 frame #21: 0x000000010c681ffc postgres`_bt_parallel_scan_and_sort(btspool=0x00007fe872839738, btspool2=0x0000000000000000, btshared=0x000000010d56c4c0, sharedsort=0x000000010d56c460, sharedsort2=0x0000000000000000, sortmem=21845, progress=false) at nbtsort.c:1941:23 frame #22: 0x000000010c681eb2 postgres`_bt_parallel_build_main(seg=0x00007fe87280a058, toc=0x000000010d56c000) at nbtsort.c:1889:2 frame #23: 0x000000010c6b7358 postgres`ParallelWorkerMain(main_arg=1169089032) at parallel.c:1471:2 frame #24: 
0x000000010c9da86f postgres`StartBackgroundWorker at bgworker.c:813:2 frame #25: 0x000000010c9efbc0 postgres`do_start_bgworker(rw=0x00007fe86f419290) at postmaster.c:5852:4 frame #26: 0x000000010c9eff9f postgres`maybe_start_bgworkers at postmaster.c:6078:9 frame #27: 0x000000010c9eee99 postgres`sigusr1_handler(postgres_signal_arg=30) at postmaster.c:5247:3 frame #28: 0x00007fff77a74b5d libsystem_platform.dylib`_sigtramp + 29 frame #29: 0x00007fff779c561b libsystem_kernel.dylib`__select + 11 frame #30: 0x000000010c9ea48c postgres`ServerLoop at postmaster.c:1691:13 frame #31: 0x000000010c9e9e06 postgres`PostmasterMain(argc=5, argv=0x00007fe86f4036f0) at postmaster.c:1400:11 frame #32: 0x000000010c8ee399 postgres`main(argc=<unavailable>, argv=<unavailable>) at main.c:210:3 frame #33: 0x00007fff778893d5 libdyld.dylib`start + 1 Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
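The per-worker lookup boils down to fetching the btree sort support function for the index column's operator family. As a rough SQL equivalent of what get_opfamily_proc() retrieves in this case (opfamily 1976 is the btree integer_ops family and support procedure 2 is the sort support function), and not the code path itself:

SELECT p.amproc
FROM pg_amproc p
JOIN pg_opfamily f ON f.oid = p.amprocfamily
JOIN pg_am am ON am.oid = f.opfmethod
WHERE am.amname = 'btree'
  AND f.opfname = 'integer_ops'
  AND p.amproclefttype = 'int4'::regtype
  AND p.amprocrighttype = 'int4'::regtype
  AND p.amprocnum = 2;  -- btree sort support procedure

Each worker performs this syscache lookup independently, so with cold caches it reads a couple of pg_amproc and pg_amproc_fam_proc_index blocks on its own, which matches the extra three blocks per worker observed above.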
On Tue, Apr 7, 2020 at 12:00 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2020-04-07 04:12, Amit Kapila wrote: > > On Mon, Apr 6, 2020 at 10:01 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > >> > >> On Mon, Apr 06, 2020 at 05:01:30PM +0200, Peter Eisentraut wrote: > >>> I noticed in some of the screenshots that were tweeted that for example in > >>> > >>> WAL: records=1 bytes=56 > >>> > >>> there are two spaces between pieces of data. This doesn't match the rest of > >>> the EXPLAIN output. Can that be adjusted? > >> > >> We talked about that here: > >> https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com > >> > > > > Yeah. Just to brief here, the main reason was that one of the fields > > (full page writes) already had a single space and then we had prior > > cases as mentioned in Justin's email [1] where we use two spaces which > > lead us to decide using two spaces in this case. > > We also have existing cases for the other way: > > actual time=0.050..0.052 > Buffers: shared hit=3 dirtied=1 > > The cases mentioned by Justin are not formatted in a key=value format, > so it's not quite the same, but it also raises the question why they are > not. > > Let's figure out a way to consolidate this without making up a third format. The parsability problem Justin was mentioning is only due to "full page writes", so we could use "full_page_writes" or "fpw" instead and remove the extra spaces. There would be a small discrepancy with the verbose autovacuum log, but there are others differences already. I'd slightly in favor of "fpw" to be more concise. Would that be ok?
On Tue, Apr 07, 2020 at 12:00:29PM +0200, Peter Eisentraut wrote: > We also have existing cases for the other way: > > actual time=0.050..0.052 > Buffers: shared hit=3 dirtied=1 > > The cases mentioned by Justin are not formatted in a key=value format, so > it's not quite the same, but it also raises the question why they are not. > > Let's figure out a way to consolidate this without making up a third format. So this re-raises my suggestion here to use colons, Title Case Field Names, and "Size: ..kB" rather than "bytes=": |https://www.postgresql.org/message-id/20200403054451.GN14618%40telsasoft.com As I see it, the sort/hashjoin style is being used for cases with fields with different units: Sort Method: quicksort Memory: 931kB Buckets: 1024 Batches: 1 Memory Usage: 16kB ..which is distinguished from the case where the units are the same, like buffers (hit=Npages read=Npages dirtied=Npages written=Npages). Note, as of 1f39bce021, we have hashagg_disk, which looks like this: template1=# explain analyze SELECT a, COUNT(1) FROM generate_series(1,99999) a GROUP BY 1 ORDER BY 1; ... -> HashAggregate (cost=1499.99..1501.99 rows=200 width=12) (actual time=166.883..280.943 rows=99999 loops=1) Group Key: a Peak Memory Usage: 4913 kB Disk Usage: 1848 kB HashAgg Batches: 8 Incremental sort adds yet another variation, which I've mentioned that thread. I'm hoping to come to some resolution here, first. https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com -- Justin
On Tue, Apr 7, 2020 at 3:30 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2020-04-07 04:12, Amit Kapila wrote: > > On Mon, Apr 6, 2020 at 10:01 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > >> > >> On Mon, Apr 06, 2020 at 05:01:30PM +0200, Peter Eisentraut wrote: > >>> I noticed in some of the screenshots that were tweeted that for example in > >>> > >>> WAL: records=1 bytes=56 > >>> > >>> there are two spaces between pieces of data. This doesn't match the rest of > >>> the EXPLAIN output. Can that be adjusted? > >> > >> We talked about that here: > >> https://www.postgresql.org/message-id/20200402054120.GC14618%40telsasoft.com > >> > > > > Yeah. Just to brief here, the main reason was that one of the fields > > (full page writes) already had a single space and then we had prior > > cases as mentioned in Justin's email [1] where we use two spaces which > > lead us to decide using two spaces in this case. > > We also have existing cases for the other way: > > actual time=0.050..0.052 > Buffers: shared hit=3 dirtied=1 > Buffers case is not the same because 'shared' is used for 'hit', 'read', 'dirtied', etc. However, I think it is arguable. > The cases mentioned by Justin are not formatted in a key=value format, > so it's not quite the same, but it also raises the question why they are > not. > > Let's figure out a way to consolidate this without making up a third format. > Sure, I think my intention is to keep the format of WAL stats as close to Buffers stats as possible because both depict I/O and users would probably be interested to check/read both together. There is a point to keep things in a format so that it is easier for someone to parse but I guess as these as fixed 'words', it shouldn't be difficult either way and we should give more weightage to consistency. Any suggestions? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Tue, Apr 7, 2020 at 5:17 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Tue, 7 Apr 2020 at 18:29, Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Tue, 7 Apr 2020 at 17:42, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > Buffer usage statistics seem correct. The small differences would be > > > > catalog lookups Peter mentioned. > > > > > > > > > > Agreed, but can you check which part of code does that lookup? I want > > > to see if we can avoid that from buffer usage stats or at least write > > > a comment about it, otherwise, we might have to face this question > > > again and again. > > > > Okay, I'll check it. > > > > I've checked the buffer usage differences when parallel btree index creation. > > TL;DR; > > During tuple sorting individual parallel workers read blocks of > pg_amproc and pg_amproc_fam_proc_index to get the sort support > function. The call flow is like: > > ParallelWorkerMain() > _bt_parallel_scan_and_sort() > tuplesort_begin_index_btree() > PrepareSortSupportFromIndexRel() > FinishSortSupportFunction() > get_opfamily_proc() > Thanks for the investigation. I don't see we can do anything special about this. In an ideal world, this should be done once and not for each worker but I guess it doesn't matter too much. I am not sure if it is worth adding a comment for this, what do you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Wed, 8 Apr 2020 at 14:44, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 5:17 PM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Tue, 7 Apr 2020 at 18:29, Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > On Tue, 7 Apr 2020 at 17:42, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada > > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > > > Buffer usage statistics seem correct. The small differences would be > > > > > catalog lookups Peter mentioned. > > > > > > > > > > > > > Agreed, but can you check which part of code does that lookup? I want > > > > to see if we can avoid that from buffer usage stats or at least write > > > > a comment about it, otherwise, we might have to face this question > > > > again and again. > > > > > > Okay, I'll check it. > > > > > > > I've checked the buffer usage differences when parallel btree index creation. > > > > TL;DR; > > > > During tuple sorting individual parallel workers read blocks of > > pg_amproc and pg_amproc_fam_proc_index to get the sort support > > function. The call flow is like: > > > > ParallelWorkerMain() > > _bt_parallel_scan_and_sort() > > tuplesort_begin_index_btree() > > PrepareSortSupportFromIndexRel() > > FinishSortSupportFunction() > > get_opfamily_proc() > > > > Thanks for the investigation. I don't see we can do anything special > about this. In an ideal world, this should be done once and not for > each worker but I guess it doesn't matter too much. I am not sure if > it is worth adding a comment for this, what do you think? > I agree with you. If the differences were considerably large probably we would do something but I think we don't need to anything at this time. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Wed, Apr 8, 2020 at 11:53 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Wed, 8 Apr 2020 at 14:44, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Thanks for the investigation. I don't see we can do anything special > > about this. In an ideal world, this should be done once and not for > > each worker but I guess it doesn't matter too much. I am not sure if > > it is worth adding a comment for this, what do you think? > > > > I agree with you. If the differences were considerably large probably > we would do something but I think we don't need to anything at this > time. > Fair enough, can you once check this in back-branches as this needs to be backpatched? I will do that once by myself as well. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Julien Rouhaud
On Wed, Apr 8, 2020 at 8:23 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Wed, 8 Apr 2020 at 14:44, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Apr 7, 2020 at 5:17 PM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > On Tue, 7 Apr 2020 at 18:29, Masahiko Sawada > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > On Tue, 7 Apr 2020 at 17:42, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Tue, Apr 7, 2020 at 1:30 PM Masahiko Sawada > > > > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > > > > > > > Buffer usage statistics seem correct. The small differences would be > > > > > > catalog lookups Peter mentioned. > > > > > > > > > > > > > > > > Agreed, but can you check which part of code does that lookup? I want > > > > > to see if we can avoid that from buffer usage stats or at least write > > > > > a comment about it, otherwise, we might have to face this question > > > > > again and again. > > > > > > > > Okay, I'll check it. > > > > > > > > > > I've checked the buffer usage differences when parallel btree index creation. > > > > > > TL;DR; > > > > > > During tuple sorting individual parallel workers read blocks of > > > pg_amproc and pg_amproc_fam_proc_index to get the sort support > > > function. The call flow is like: > > > > > > ParallelWorkerMain() > > > _bt_parallel_scan_and_sort() > > > tuplesort_begin_index_btree() > > > PrepareSortSupportFromIndexRel() > > > FinishSortSupportFunction() > > > get_opfamily_proc() > > > > > > > Thanks for the investigation. I don't see we can do anything special > > about this. In an ideal world, this should be done once and not for > > each worker but I guess it doesn't matter too much. I am not sure if > > it is worth adding a comment for this, what do you think? > > > > I agree with you. If the differences were considerably large probably > we would do something but I think we don't need to anything at this > time. +1
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Masahiko Sawada
On Wed, 8 Apr 2020 at 16:04, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 8, 2020 at 11:53 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Wed, 8 Apr 2020 at 14:44, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Thanks for the investigation. I don't see we can do anything special
> > > about this. In an ideal world, this should be done once and not for
> > > each worker but I guess it doesn't matter too much. I am not sure if
> > > it is worth adding a comment for this, what do you think?
> > >
> >
> > I agree with you. If the differences were considerably large probably
> > we would do something but I think we don't need to anything at this
> > time.
> >
>
> Fair enough, can you once check this in back-branches as this needs to
> be backpatched? I will do that once by myself as well.

I've done the same test with HEAD of both REL_12_STABLE and
REL_11_STABLE. I think the patch needs to be backpatched to PG11 where
parallel index creation was introduced. I've attached the patches for
PG12 and PG11 I used for this test for reference.

Here are the results:

* PG12

With no worker:
-[ RECORD 1 ]-------+-------------
shared_blks_hit     | 119
shared_blks_read    | 44283
total_read_blks     | 44402
shared_blks_dirtied | 44262
shared_blks_written | 24925

With 4 workers:
-[ RECORD 1 ]-------+------------
shared_blks_hit     | 128
shared_blks_read    | 8844
total_read_blks     | 8972
shared_blks_dirtied | 8822
shared_blks_written | 5393

With 4 workers after patching:
-[ RECORD 1 ]-------+------------
shared_blks_hit     | 140
shared_blks_read    | 44284
total_read_blks     | 44424
shared_blks_dirtied | 44262
shared_blks_written | 26574

* PG11

With no worker:
-[ RECORD 1 ]-------+------------
shared_blks_hit     | 124
shared_blks_read    | 44284
total_read_blks     | 44408
shared_blks_dirtied | 44263
shared_blks_written | 24908

With 4 workers:
-[ RECORD 1 ]-------+-------------
shared_blks_hit     | 132
shared_blks_read    | 8910
total_read_blks     | 9042
shared_blks_dirtied | 8888
shared_blks_written | 5370

With 4 workers after patching:
-[ RECORD 1 ]-------+-------------
shared_blks_hit     | 144
shared_blks_read    | 44285
total_read_blks     | 44429
shared_blks_dirtied | 44263
shared_blks_written | 26861

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
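The figures above are internally consistent (shared_blks_hit + shared_blks_read = total_read_blks in every record), so they were presumably gathered with something like the query below under psql's expanded display (\x). This is a guess at the query, not content taken from the attachment.

```sql
-- Hypothetical reconstruction of the query behind the numbers above.
SELECT shared_blks_hit,
       shared_blks_read,
       shared_blks_hit + shared_blks_read AS total_read_blks,
       shared_blks_dirtied,
       shared_blks_written
FROM pg_stat_statements
WHERE query LIKE 'CREATE INDEX%';
```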
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Wed, Apr 8, 2020 at 1:49 PM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Wed, 8 Apr 2020 at 16:04, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Apr 8, 2020 at 11:53 AM Masahiko Sawada > > <masahiko.sawada@2ndquadrant.com> wrote: > > > > > > On Wed, 8 Apr 2020 at 14:44, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Thanks for the investigation. I don't see we can do anything special > > > > about this. In an ideal world, this should be done once and not for > > > > each worker but I guess it doesn't matter too much. I am not sure if > > > > it is worth adding a comment for this, what do you think? > > > > > > > > > > I agree with you. If the differences were considerably large probably > > > we would do something but I think we don't need to anything at this > > > time. > > > > > > > Fair enough, can you once check this in back-branches as this needs to > > be backpatched? I will do that once by myself as well. > > I've done the same test with HEAD of both REL_12_STABLE and > REL_11_STABLE. I think the patch needs to be backpatched to PG11 where > parallel index creation was introduced. I've attached the patches > for PG12 and PG11 I used for this test for reference. > Thanks, I will once again verify and push this tomorrow if there are no other comments. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 7, 2020 at 2:48 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 4:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Apr 6, 2020 at 7:58 PM Euler Taveira > > <euler.taveira@2ndquadrant.com> wrote: > > > > > > On Mon, 6 Apr 2020 at 10:37, Julien Rouhaud <rjuju123@gmail.com> wrote: > > >> > > >> On Mon, Apr 06, 2020 at 10:12:55AM -0300, Euler Taveira wrote: > > >> > On Mon, 6 Apr 2020 at 00:25, Amit Kapila <amit.kapila16@gmail.com> wrote: > > >> > > > >> > > > > >> > > I have pushed pg_stat_statements and Explain related patches. I am > > >> > > now looking into (auto)vacuum patch and have few comments. > > >> > > > > >> > > I wasn't paying much attention to this thread. May I suggest changing > > >> > wal_num_fpw to wal_fpw? wal_records and wal_bytes does not have a prefix > > >> > 'num'. It seems inconsistent to me. > > >> > > > >> > > >> If we want to be consistent shouldn't we rename it to wal_fpws? FTR I don't > > >> like much either version. > > > > > > > > > Since FPW is an acronym, plural form reads better when you are using uppercase (such as FPWs or FPW's); thus, I prefersingular form because parameter names are lowercase. Function description will clarify that this is "number of WALfull page writes". > > > > > > > I like Euler's suggestion to change wal_num_fpw to wal_fpw. It is > > better if others who didn't like this name can also share their > > opinion now because changing multiple times the same thing is not a > > good idea. > > +1 > > About Justin and your comments on the other thread: > > On Tue, Apr 7, 2020 at 4:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Apr 6, 2020 at 10:04 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > On Thu, Apr 02, 2020 at 08:29:31AM +0200, Julien Rouhaud wrote: > > > > > > "full page records" seems to be showing the number of full page > > > > > > images, not the record having full page images. > > > > > > > > > > I am not sure what exactly is a difference but it is the records > > > > > having full page images. Julien correct me if I am wrong. > > > > > > > Obviously previous complaints about the meaning and parsability of > > > > "full page writes" should be addressed here for consistency. > > > > > > There's a couple places that say "full page image records" which I think is > > > language you were trying to avoid. It's the number of pages, not the number of > > > records, no ? I see explain and autovacuum say what I think is wanted, but > > > these say the wrong thing? Find attached slightly larger patch. > > > > > > $ git grep 'image record' > > > contrib/pg_stat_statements/pg_stat_statements.c: int64 wal_num_fpw; /* # of WAL full page imagerecords generated */ > > > doc/src/sgml/ref/explain.sgml: number of records, number of full page image records and amount of WAL > > > > > > > Few comments: > > 1. > > - int64 wal_num_fpw; /* # of WAL full page image records generated */ > > + int64 wal_num_fpw; /* # of WAL full page images generated */ > > > > Let's change comment as " /* # of WAL full page writes generated */" > > to be consistent with other places like instrument.h. Also, make a > > similar change at other places if required. > > Agreed. That's pg_stat_statements.c and instrument.h. I'll send a > patch once we reach consensus with the rest of the comments. > Would you like to send a consolidated patch that includes Euler's suggestion and Justin's patch (by making changes for points we discussed.)? 
I think we can keep the point related to number of spaces before each field open? > > 2. > > <entry> > > - Total amount of WAL bytes generated by the statement > > + Total number of WAL bytes generated by the statement > > </entry> > > > > I feel the previous text was better as this field can give us the size > > of WAL with which we can answer "how much WAL data is generated by a > > particular statement?". Julien, do you have any thoughts on this? > > I also prefer "amount" as it feels more natural. > As we see no other opinion on this matter, we can use "amount" here. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Apr 10, 2020 at 8:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 2:48 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Tue, Apr 7, 2020 at 4:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Apr 6, 2020 at 7:58 PM Euler Taveira > > > <euler.taveira@2ndquadrant.com> wrote: > > > Few comments: > > > 1. > > > - int64 wal_num_fpw; /* # of WAL full page image records generated */ > > > + int64 wal_num_fpw; /* # of WAL full page images generated */ > > > > > > Let's change comment as " /* # of WAL full page writes generated */" > > > to be consistent with other places like instrument.h. Also, make a > > > similar change at other places if required. > > > > Agreed. That's pg_stat_statements.c and instrument.h. I'll send a > > patch once we reach consensus with the rest of the comments. > > > > Would you like to send a consolidated patch that includes Euler's > suggestion and Justin's patch (by making changes for points we > discussed.)? I think we can keep the point related to number of > spaces before each field open? Sure, I'll take care of that tomorrow! > > > 2. > > > <entry> > > > - Total amount of WAL bytes generated by the statement > > > + Total number of WAL bytes generated by the statement > > > </entry> > > > > > > I feel the previous text was better as this field can give us the size > > > of WAL with which we can answer "how much WAL data is generated by a > > > particular statement?". Julien, do you have any thoughts on this? > > > > I also prefer "amount" as it feels more natural. > > > > As we see no other opinion on this matter, we can use "amount" here. Ok.
On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Fri, Apr 10, 2020 at 8:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Would you like to send a consolidated patch that includes Euler's
> > suggestion and Justin's patch (by making changes for points we
> > discussed.)? I think we can keep the point related to number of
> > spaces before each field open?
>
> Sure, I'll take care of that tomorrow!

I tried to take into account all that has been discussed, but I have
to admit that I'm absolutely not sure what was actually decided here.
I went with those changes:

- rename wal_num_fpw to wal_fpw for consistency, both in the pgss view
  field name and everywhere in the code
- change comments to consistently mention "full page writes generated"
- changed pgss and explain documentation to mention "full page images
  generated", from Justin's patch on another thread
- kept "amount" of WAL bytes
- no change to the explain output as I have no idea what the consensus
  is (one or two spaces, use semicolon or equal, show unit or not)
Attachment
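With the naming agreed at this point (wal_fpw, which is later in this thread renamed to wal_fpi), the new counters would be read from pg_stat_statements along these lines. This is an illustrative query, not part of the attached patch.

```sql
-- Illustrative only: column names as proposed at this point in the thread
-- (wal_fpw was later renamed to wal_fpi).
SELECT query, calls, wal_records, wal_fpw, wal_bytes
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 5;
```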
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Justin Pryzby
On Sat, Mar 28, 2020 at 04:17:21PM +0100, Julien Rouhaud wrote: > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > I see some basic problems with the patch. The way it tries to compute > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > or point me to any test done where we have computed WAL for parallel > > > operations like Parallel Vacuum or Parallel Create Index? > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > workers won't be accounted for. That being said, I think that an argument > > could be made that proper infrastructure should have been added in the original > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > usage in parallel utility, unless I'm missing something. > > Just to be sure I did a quick test with pg_stat_statements behavior using > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > doesn't reflect parallel workers' activity. > > I added an open for that, and adding Robert in Cc as 9da0cc352 is the first > commit adding parallel maintenance. I believe this is resolved for parallel vacuum in master and parallel create index back to PG11. I marked this as closed. https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34802&oldid=34781 -- Justin
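The fix referenced above follows the usual leader/worker accounting pattern for parallel maintenance: each worker publishes its counters into a shared-memory slot, and the leader folds those slots into its own totals once the workers finish. The sketch below is a simplified, self-contained illustration of that pattern, not the actual PostgreSQL code (which uses its BufferUsage/WalUsage structs and DSM machinery).

```c
/* Simplified stand-in for the per-backend usage counters. */
typedef struct UsageCounters
{
	long		blks_hit;
	long		blks_read;
	long		wal_records;
	long		wal_bytes;
} UsageCounters;

/* Worker side: publish the local counters into this worker's shared slot. */
static void
worker_report_usage(UsageCounters *shared_slot, const UsageCounters *local)
{
	*shared_slot = *local;		/* the slot is private to this worker */
}

/* Leader side: fold every worker's slot into the leader's own totals. */
static void
leader_accumulate_usage(UsageCounters *total,
						const UsageCounters *slots, int nworkers)
{
	for (int i = 0; i < nworkers; i++)
	{
		total->blks_hit += slots[i].blks_hit;
		total->blks_read += slots[i].blks_read;
		total->wal_records += slots[i].wal_records;
		total->wal_bytes += slots[i].wal_bytes;
	}
}
```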
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Julien Rouhaud
On Sun, 12 Apr 2020 at 00:33, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Sat, Mar 28, 2020 at 04:17:21PM +0100, Julien Rouhaud wrote:
>
> Just to be sure I did a quick test with pg_stat_statements behavior using
> parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage
> doesn't reflect parallel workers' activity.
>
> I added an open for that, and adding Robert in Cc as 9da0cc352 is the first
> commit adding parallel maintenance.
I believe this is resolved for parallel vacuum in master and parallel create
index back to PG11.
indeed, I was about to take care of this too
I marked this as closed.
https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34802&oldid=34781
thanks a lot!
Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
From: Amit Kapila
On Sun, Apr 12, 2020 at 4:03 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Sat, Mar 28, 2020 at 04:17:21PM +0100, Julien Rouhaud wrote: > > On Sat, Mar 28, 2020 at 02:38:27PM +0100, Julien Rouhaud wrote: > > > On Sat, Mar 28, 2020 at 04:14:04PM +0530, Amit Kapila wrote: > > > > > > > > I see some basic problems with the patch. The way it tries to compute > > > > WAL usage for parallel stuff doesn't seem right to me. Can you share > > > > or point me to any test done where we have computed WAL for parallel > > > > operations like Parallel Vacuum or Parallel Create Index? > > > > > > Ah, that's indeed a good point and AFAICT WAL records from parallel utility > > > workers won't be accounted for. That being said, I think that an argument > > > could be made that proper infrastructure should have been added in the original > > > parallel utility patches, as pg_stat_statement is already broken wrt. buffer > > > usage in parallel utility, unless I'm missing something. > > > > Just to be sure I did a quick test with pg_stat_statements behavior using > > parallel/non-parallel CREATE INDEX and VACUUM, and unsurprisingly buffer usage > > doesn't reflect parallel workers' activity. > > > > I added an open for that, and adding Robert in Cc as 9da0cc352 is the first > > commit adding parallel maintenance. > > I believe this is resolved for parallel vacuum in master and parallel create > index back to PG11. > > I marked this as closed. > https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&diff=34802&oldid=34781 > Okay, thanks. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Apr 11, 2020 at 6:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Fri, Apr 10, 2020 at 8:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Would you like to send a consolidated patch that includes Euler's > > > suggestion and Justin's patch (by making changes for points we > > > discussed.)? I think we can keep the point related to number of > > > spaces before each field open? > > > > Sure, I'll take care of that tomorrow! > > I tried to take into account all that have been discussed, but I have > to admit that I'm absolutely not sure of what was actually decided > here. I went with those changes: > > - rename wal_num_fpw to wal_fpw for consistency, both in pgss view > fiel name but also everywhere in the code > - change comments to consistently mention "full page writes generated" > - changed pgss and explain documentation to mention "full page images > generated", from Justin's patch on another thread > I think it is better to use "full page writes" to be consistent with other places. > - kept "amount" of WAL bytes > Okay, but I would like to make another change suggested by Justin which is to replace "count" with "number" at a few places. I have made the above two changes in the attached. Let me know what you think about attached? > - no change to the explain output as I have no idea what is the > consensus (one or two spaces, use semicolon or equal, show unit or > not) > Yeah, let's do this separately once we have consensus. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Mon, Apr 13, 2020 at 8:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Sat, Apr 11, 2020 at 6:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > I tried to take into account all that have been discussed, but I have > > to admit that I'm absolutely not sure of what was actually decided > > here. I went with those changes: > > > > - rename wal_num_fpw to wal_fpw for consistency, both in pgss view > > fiel name but also everywhere in the code > > - change comments to consistently mention "full page writes generated" > > - changed pgss and explain documentation to mention "full page images > > generated", from Justin's patch on another thread > > > > I think it is better to use "full page writes" to be consistent with > other places. > > > - kept "amount" of WAL bytes > > > > Okay, but I would like to make another change suggested by Justin > which is to replace "count" with "number" at a few places. Ah sorry I missed this one. +1 it also sounds better. > I have made the above two changes in the attached. Let me know what > you think about attached? It all looks good to me! > > - no change to the explain output as I have no idea what is the > > consensus (one or two spaces, use semicolon or equal, show unit or > > not) > > > > Yeah, let's do this separately once we have consensus. Agreed.
On Mon, Apr 13, 2020 at 1:10 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, Apr 13, 2020 at 8:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Sat, Apr 11, 2020 at 6:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > I tried to take into account all that have been discussed, but I have > > > to admit that I'm absolutely not sure of what was actually decided > > > here. I went with those changes: > > > > > > - rename wal_num_fpw to wal_fpw for consistency, both in pgss view > > > fiel name but also everywhere in the code > > > - change comments to consistently mention "full page writes generated" > > > - changed pgss and explain documentation to mention "full page images > > > generated", from Justin's patch on another thread > > > > > > > I think it is better to use "full page writes" to be consistent with > > other places. > > > > > - kept "amount" of WAL bytes > > > > > > > Okay, but I would like to make another change suggested by Justin > > which is to replace "count" with "number" at a few places. > > Ah sorry I missed this one. +1 it also sounds better. > > > I have made the above two changes in the attached. Let me know what > > you think about attached? > > It all looks good to me! > Pushed. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, 13 Apr 2020 at 13:47, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Apr 13, 2020 at 1:10 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Mon, Apr 13, 2020 at 8:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Apr 11, 2020 at 6:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > On Fri, Apr 10, 2020 at 9:37 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > > >
> > > I tried to take into account all that have been discussed, but I have
> > > to admit that I'm absolutely not sure of what was actually decided
> > > here. I went with those changes:
> > >
> > > - rename wal_num_fpw to wal_fpw for consistency, both in pgss view
> > > fiel name but also everywhere in the code
> > > - change comments to consistently mention "full page writes generated"
> > > - changed pgss and explain documentation to mention "full page images
> > > generated", from Justin's patch on another thread
> > >
> >
> > I think it is better to use "full page writes" to be consistent with
> > other places.
> >
> > > - kept "amount" of WAL bytes
> > >
> >
> > Okay, but I would like to make another change suggested by Justin
> > which is to replace "count" with "number" at a few places.
>
> Ah sorry I missed this one. +1 it also sounds better.
>
> > I have made the above two changes in the attached. Let me know what
> > you think about attached?
>
> It all looks good to me!
>
Pushed.
Thanks a lot Amit!
On Wed, Apr 8, 2020 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 7, 2020 at 3:30 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > > > > > We also have existing cases for the other way: > > > > actual time=0.050..0.052 > > Buffers: shared hit=3 dirtied=1 > > > > Buffers case is not the same because 'shared' is used for 'hit', > 'read', 'dirtied', etc. However, I think it is arguable. > > > The cases mentioned by Justin are not formatted in a key=value format, > > so it's not quite the same, but it also raises the question why they are > > not. > > > > Let's figure out a way to consolidate this without making up a third format. > > > > Sure, I think my intention is to keep the format of WAL stats as close > to Buffers stats as possible because both depict I/O and users would > probably be interested to check/read both together. There is a point > to keep things in a format so that it is easier for someone to parse > but I guess as these as fixed 'words', it shouldn't be difficult > either way and we should give more weightage to consistency. Any > suggestions? > Peter E, others, any suggestions on how to move forward? I think here we should follow the rule "follow the style of nearby code" which in this case would be to have one space after each field as we would like it to be closer to the "Buffers" format. It would be good if we have a unified format among all Explain stuff but we might not want to change the existing things and even if we want to do that it might be a broader/bigger change and we should do that as a PG14 change. What do you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 2020-04-14 05:57, Amit Kapila wrote:
> Peter E, others, any suggestions on how to move forward? I think here
> we should follow the rule "follow the style of nearby code" which in
> this case would be to have one space after each field as we would like
> it to be closer to the "Buffers" format. It would be good if we have
> a unified format among all Explain stuff but we might not want to
> change the existing things and even if we want to do that it might be
> a broader/bigger change and we should do that as a PG14 change. What
> do you think?

It looks like shortening to fpw= and using one space is the easiest way
to solve this issue.

--
Peter Eisentraut
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Apr 17, 2020 at 6:45 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2020-04-14 05:57, Amit Kapila wrote: > > Peter E, others, any suggestions on how to move forward? I think here > > we should follow the rule "follow the style of nearby code" which in > > this case would be to have one space after each field as we would like > > it to be closer to the "Buffers" format. It would be good if we have > > a unified format among all Explain stuff but we might not want to > > change the existing things and even if we want to do that it might be > > a broader/bigger change and we should do that as a PG14 change. What > > do you think? > > If looks like shortening to fpw= and using one space is the easiest way > to solve this issue. > I am fine with this approach and will change accordingly. I will wait for a few days (3-4 days) to see if someone shows up with either an objection to this or with a better idea for the display of WAL usage information. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sat, Apr 18, 2020 at 6:16 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 17, 2020 at 6:45 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > > > On 2020-04-14 05:57, Amit Kapila wrote: > > > Peter E, others, any suggestions on how to move forward? I think here > > > we should follow the rule "follow the style of nearby code" which in > > > this case would be to have one space after each field as we would like > > > it to be closer to the "Buffers" format. It would be good if we have > > > a unified format among all Explain stuff but we might not want to > > > change the existing things and even if we want to do that it might be > > > a broader/bigger change and we should do that as a PG14 change. What > > > do you think? > > > > If looks like shortening to fpw= and using one space is the easiest way > > to solve this issue. > > > > I am fine with this approach and will change accordingly. I will wait > for a few days (3-4 days) to see if someone shows up with either an > objection to this or with a better idea for the display of WAL usage > information. That was also my preferred alternative. PFA a patch for that. I also changed to "fpw" for the non textual output for consistency.
Attachment
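For reference, with the change attached above the text-format line should come out roughly as below. The numbers are invented, and the "fpw" label is what this patch proposes; it is revisited later in the thread.

```
WAL: records=2014 fpw=12 bytes=1246722
```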
On Sat, Apr 18, 2020 at 05:39:35PM +0200, Julien Rouhaud wrote:
> On Sat, Apr 18, 2020 at 6:16 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Apr 17, 2020 at 6:45 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> > > On 2020-04-14 05:57, Amit Kapila wrote:
> > > > Peter E, others, any suggestions on how to move forward? I think here
> > > > we should follow the rule "follow the style of nearby code" which in
> > > > this case would be to have one space after each field as we would like
> > > > it to be closer to the "Buffers" format. It would be good if we have
> > > > a unified format among all Explain stuff but we might not want to
> > > > change the existing things and even if we want to do that it might be
> > > > a broader/bigger change and we should do that as a PG14 change. What
> > > > do you think?
> > >
> > > If looks like shortening to fpw= and using one space is the easiest way
> > > to solve this issue.
> > >
> >
> > I am fine with this approach and will change accordingly. I will wait
> > for a few days (3-4 days) to see if someone shows up with either an
> > objection to this or with a better idea for the display of WAL usage
> > information.
>
> That was also my preferred alternative. PFA a patch for that. I also
> changed to "fpw" for the non textual output for consistency.

Should we capitalize at least the non-text one? And maybe the text one for
consistency?

+		ExplainPropertyInteger("WAL fpw", NULL,

And add the acronym to the docs:

$ git grep 'full page' '*/explain.sgml'
doc/src/sgml/ref/explain.sgml:      number of records, number of full page writes and amount of WAL bytes

"..full page writes (FPW).."

Should we also change vacuumlazy.c for consistency?

+			_("WAL usage: %ld records, %ld full page writes, "
+			  UINT64_FORMAT " bytes"),

--
Justin
Hi Justin, Thanks for the review! On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > Should capitalize at least the non-text one ? And maybe the text one for > consistency ? > > + ExplainPropertyInteger("WAL fpw", NULL, I think we should keep both version consistent, whether lower or upper case. The uppercase version is probably more correct, but it's a little bit weird to have it being the only upper case label in all output, so I kept it lower case. > And add the acronym to the docs: > > $ git grep 'full page' '*/explain.sgml' > doc/src/sgml/ref/explain.sgml: number of records, number of full page writes and amount of WAL bytes > > "..full page writes (FPW).." Indeed! Fixed (using lowercase to match current output). > Should we also change vacuumlazy.c for consistency ? > > + _("WAL usage: %ld records, %ld full page writes, " > + UINT64_FORMAT " bytes"), I don't think this one should be changed, vacuumlazy output is already entirely different, and is way more verbose so keeping it as is makes sense to me.
Attachment
At Sun, 19 Apr 2020 16:22:26 +0200, Julien Rouhaud <rjuju123@gmail.com> wrote in
> Hi Justin,
>
> Thanks for the review!
>
> On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > Should capitalize at least the non-text one ? And maybe the text one for
> > consistency ?
> >
> > +		ExplainPropertyInteger("WAL fpw", NULL,
>
> I think we should keep both version consistent, whether lower or upper
> case. The uppercase version is probably more correct, but it's a
> little bit weird to have it being the only upper case label in all
> output, so I kept it lower case.

One space followed by an acronym looks perfect. I'd prefer capital
letters, but small letters also work well.

> > And add the acronym to the docs:
> >
> > $ git grep 'full page' '*/explain.sgml'
> > doc/src/sgml/ref/explain.sgml:      number of records, number of full page writes and amount of WAL bytes
> >
> > "..full page writes (FPW).."
>
> Indeed! Fixed (using lowercase to match current output).

I searched through the documentation and AFAICS most of the occurrences
of "full page" are followed by "image", and full_page_writes is used
only as the parameter name.

I'm fine with fpw as the acronym, but "fpw means the number of full
page images" looks odd.

> > Should we also change vacuumlazy.c for consistency ?
> >
> > +			_("WAL usage: %ld records, %ld full page writes, "
> > +			  UINT64_FORMAT " bytes"),
>
> I don't think this one should be changed, vacuumlazy output is already
> entirely different, and is way more verbose so keeping it as is makes
> sense to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Apr 20, 2020 at 1:17 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Sun, 19 Apr 2020 16:22:26 +0200, Julien Rouhaud <rjuju123@gmail.com> wrote in > > Hi Justin, > > > > Thanks for the review! > > > > On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > Should capitalize at least the non-text one ? And maybe the text one for > > > consistency ? > > > > > > + ExplainPropertyInteger("WAL fpw", NULL, > > > > I think we should keep both version consistent, whether lower or upper > > case. The uppercase version is probably more correct, but it's a > > little bit weird to have it being the only upper case label in all > > output, so I kept it lower case. I think we can keep upper-case for all non-text ones in case of WAL usage, something like WAL Records, WAL FPW, WAL Bytes. The buffer usage seems to be following a similar convention. > > One space follwed by an acronym looks perfect. I'd prefer capital > letters but small-letters also works well. > > > > And add the acronym to the docs: > > > > > > $ git grep 'full page' '*/explain.sgml' > > > doc/src/sgml/ref/explain.sgml: number of records, number of full page writes and amount of WAL bytes > > > > > > "..full page writes (FPW).." > > > > Indeed! Fixed (using lowercase to match current output). > > I searched through the documentation and AFAICS most of occurances of > "full page" are follwed by "image" and full_page_writes is used only > as the parameter name. > > I'm fine with fpw as the acronym, but "fpw means the number of full > page images" looks odd.. > I don't understand this. Where are we using such a description of fpw? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 22, 2020 at 09:15:08AM +0530, Amit Kapila wrote: > > > > And add the acronym to the docs: > > > > > > > > $ git grep 'full page' '*/explain.sgml' > > > > doc/src/sgml/ref/explain.sgml: number of records, number of full page writes and amount of WAL bytes > > > > > > > > "..full page writes (FPW).." > > > > > > Indeed! Fixed (using lowercase to match current output). > > > > I searched through the documentation and AFAICS most of occurances of > > "full page" are follwed by "image" and full_page_writes is used only > > as the parameter name. > > > > I'm fine with fpw as the acronym, but "fpw means the number of full > > page images" looks odd.. > > > > I don't understand this. Where are we using such a description of fpw? I suggested to add " (FPW)" to the new docs for "explain(wal)" But, the documentation before this commit mostly refers to "full page images". So the implication is that maybe we should use that language (and FPI acronym). The only pre-existing use of "full page writes" seems to be here: $ git grep -iC2 'full page write' origin doc origin:doc/src/sgml/wal.sgml- Internal data structures such as <filename>pg_xact</filename>, <filename>pg_subtrans</filename>,<filename>pg_multixact</filename>, origin:doc/src/sgml/wal.sgml- <filename>pg_serial</filename>, <filename>pg_notify</filename>, <filename>pg_stat</filename>,<filename>pg_snapshots</filename> are not directly origin:doc/src/sgml/wal.sgml: checksummed, nor are pages protected by full page writes. However, where And we're not using either acronym. -- Justin
On Wed, Apr 22, 2020 at 9:25 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Apr 22, 2020 at 09:15:08AM +0530, Amit Kapila wrote: > > > > > And add the acronym to the docs: > > > > > > > > > > $ git grep 'full page' '*/explain.sgml' > > > > > doc/src/sgml/ref/explain.sgml: number of records, number of full page writes and amount of WAL bytes > > > > > > > > > > "..full page writes (FPW).." > > > > > > > > Indeed! Fixed (using lowercase to match current output). > > > > > > I searched through the documentation and AFAICS most of occurances of > > > "full page" are follwed by "image" and full_page_writes is used only > > > as the parameter name. > > > > > > I'm fine with fpw as the acronym, but "fpw means the number of full > > > page images" looks odd.. > > > > > > > I don't understand this. Where are we using such a description of fpw? > > I suggested to add " (FPW)" to the new docs for "explain(wal)" > But, the documentation before this commit mostly refers to "full page images". > So the implication is that maybe we should use that language (and FPI acronym). > I am not sure if it matters that much. I think we can use "full page writes (FPW)" in this case but we should be consistent wherever we refer it in the WAL usage context and I think we already are, if not then let's be consistent. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Apr 22, 2020 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 20, 2020 at 1:17 PM Kyotaro Horiguchi > <horikyota.ntt@gmail.com> wrote: > > > > At Sun, 19 Apr 2020 16:22:26 +0200, Julien Rouhaud <rjuju123@gmail.com> wrote in > > > Hi Justin, > > > > > > Thanks for the review! > > > > > > On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > > > Should capitalize at least the non-text one ? And maybe the text one for > > > > consistency ? > > > > > > > > + ExplainPropertyInteger("WAL fpw", NULL, > > > > > > I think we should keep both version consistent, whether lower or upper > > > case. The uppercase version is probably more correct, but it's a > > > little bit weird to have it being the only upper case label in all > > > output, so I kept it lower case. > > I think we can keep upper-case for all non-text ones in case of WAL > usage, something like WAL Records, WAL FPW, WAL Bytes. The buffer > usage seems to be following a similar convention. > The attached patch changed the non-text display format as mentioned. Let me know if you have any comments? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
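As a hedged illustration of the non-text labels being settled here, a JSON-format fragment from the attached patch would look something like the following; the values are invented, and "WAL FPW" was later renamed to "WAL FPI".

```
"WAL Records": 2014,
"WAL FPW": 12,
"WAL Bytes": 1246722
```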
On Wed, Apr 22, 2020 at 2:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 22, 2020 at 9:25 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > On Wed, Apr 22, 2020 at 09:15:08AM +0530, Amit Kapila wrote: > > > > > > And add the acronym to the docs: > > > > > > > > > > > > $ git grep 'full page' '*/explain.sgml' > > > > > > doc/src/sgml/ref/explain.sgml: number of records, number of full page writes and amount of WAL bytes > > > > > > > > > > > > "..full page writes (FPW).." > > > > > > > > > > Indeed! Fixed (using lowercase to match current output). > > > > > > > > I searched through the documentation and AFAICS most of occurances of > > > > "full page" are follwed by "image" and full_page_writes is used only > > > > as the parameter name. > > > > > > > > I'm fine with fpw as the acronym, but "fpw means the number of full > > > > page images" looks odd.. > > > > > > > > > > I don't understand this. Where are we using such a description of fpw? > > > > I suggested to add " (FPW)" to the new docs for "explain(wal)" > > But, the documentation before this commit mostly refers to "full page images". > > So the implication is that maybe we should use that language (and FPI acronym). > > > > I am not sure if it matters that much. I think we can use "full page > writes (FPW)" in this case but we should be consistent wherever we > refer it in the WAL usage context and I think we already are, if not > then let's be consistent. I agree that full page writes can be used in this case, but I'm wondering if that can be misleading for some reader which might e.g. confuse with the full_page_writes GUC. And as Justin pointed out, the documentation for now usually mentions "full page image(s)" in such cases.
On Thu, Apr 23, 2020 at 7:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 22, 2020 at 9:15 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Apr 20, 2020 at 1:17 PM Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote: > > > > > > At Sun, 19 Apr 2020 16:22:26 +0200, Julien Rouhaud <rjuju123@gmail.com> wrote in > > > > Hi Justin, > > > > > > > > Thanks for the review! > > > > > > > > On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > > > > > > > > Should capitalize at least the non-text one ? And maybe the text one for > > > > > consistency ? > > > > > > > > > > + ExplainPropertyInteger("WAL fpw", NULL, > > > > > > > > I think we should keep both version consistent, whether lower or upper > > > > case. The uppercase version is probably more correct, but it's a > > > > little bit weird to have it being the only upper case label in all > > > > output, so I kept it lower case. > > > > I think we can keep upper-case for all non-text ones in case of WAL > > usage, something like WAL Records, WAL FPW, WAL Bytes. The buffer > > usage seems to be following a similar convention. > > > > The attached patch changed the non-text display format as mentioned. > Let me know if you have any comments? Assuming that we're fine using full page write(s) / FPW rather than full page image(s) / FPI (see previous mail), I'm fine with this patch.
At Thu, 23 Apr 2020 07:33:13 +0200, Julien Rouhaud <rjuju123@gmail.com> wrote in > > > > > I think we should keep both version consistent, whether lower or upper > > > > > case. The uppercase version is probably more correct, but it's a > > > > > little bit weird to have it being the only upper case label in all > > > > > output, so I kept it lower case. > > > > > > I think we can keep upper-case for all non-text ones in case of WAL > > > usage, something like WAL Records, WAL FPW, WAL Bytes. The buffer > > > usage seems to be following a similar convention. > > > > > > > The attached patch changed the non-text display format as mentioned. > > Let me know if you have any comments? > > Assuming that we're fine using full page write(s) / FPW rather than > full page image(s) / FPI (see previous mail), I'm fine with this > patch. FWIW, I like FPW, and the patch looks good to me. The index in the documentation has the entry for full_page_writes (having underscores) and it would work. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
>
> On 2020-04-23 07:31, Julien Rouhaud wrote:
> > I agree that full page writes can be used in this case, but I'm
> > wondering if that can be misleading for some reader which might e.g.
> > confuse with the full_page_writes GUC. And as Justin pointed out, the
> > documentation for now usually mentions "full page image(s)" in such
> > cases.
>
> ISTM that in the context of this patch, "full-page image" is correct. A
> "full-page write" is what you do to a table or index page when you are
> recovering a full-page image.
>

So what do we call it when we log the page after it is first touched after a
checkpoint? I thought we call that a full-page write.

> The internal symbol for the WAL record is
> XLOG_FPI and xlogdesc.c prints it as "FPI".
>

That is just one way/reason we log the page. There are others as well. I
thought that here we are computing the number of full-page writes that
happened in the system for various reasons, such as (a) a page is operated
upon for the first time after a checkpoint, (b) an XLOG_FPI record is logged,
(c) the GUC for the WAL consistency checker is on, etc. If we look at
XLogRecordAssemble, where we decide to log this information, there is a
comment "... log a full-page write for the current block." and there is an
existing variable named 'fpw_lsn', which indicates to an extent that what we
are computing in this patch is full-page writes. But there is a reference to
full-page image as well. I think that as full_page_writes is an exposed
variable that is well understood, exposing information with a similar name
via this patch doesn't sound illogical to me. Whatever we use here, we need
to be consistent throughout; even pg_stat_statements would need to name the
exposed variable wal_fpi instead of wal_fpw.

To me, full-page writes sounds more appealing alongside the other WAL usage
variables like records and bytes. I might just be more used to the term
'fpw', which is why it seemed better to me. OTOH, if most of us think that
full-page image is better suited here, I am fine with changing it in all
places.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
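For context, the decision being described above (whether a block registered with a WAL record gets a full-page image) is roughly the following. This is a simplified paraphrase, not the actual XLogRecordAssemble() source: the flag values are illustrative, and the real code also tracks fpw_lsn so the decision can be rechecked after the insert position is acquired.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;		/* stand-in for the real typedef */

#define REGBUF_FORCE_IMAGE	0x02	/* flag values are illustrative */
#define REGBUF_NO_IMAGE		0x04

/*
 * Does this registered block need a full-page image?  Simplified paraphrase
 * of the logic in XLogRecordAssemble(); not the actual PostgreSQL source.
 */
static bool
block_needs_image(uint8_t flags, XLogRecPtr page_lsn,
				  XLogRecPtr redo_ptr, bool do_page_writes)
{
	if (flags & REGBUF_FORCE_IMAGE)
		return true;			/* e.g. XLOG_FPI, consistency checking */
	if (flags & REGBUF_NO_IMAGE)
		return false;
	if (!do_page_writes)
		return false;

	/* normal case: first change to this page since the last checkpoint */
	return page_lsn <= redo_ptr;
}
```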
On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: > > > The internal symbol for the WAL record is > > XLOG_FPI and xlogdesc.c prints it as "FPI". > > > > That is just one way/reason we log the page. There are others as > well. I thought here we are computing the number of full-page writes > happened in the system due to various reasons like (a) a page is > operated upon first time after the checkpoint, (b) log the XLOG_FPI > record, (c) Guc for WAL consistency checker is on, etc. If we see in > XLogRecordAssemble where we decide to log this information, there is a > comment " .... log a full-page write for the current block." and there > was an existing variable with 'fpw_lsn' which indicates to an extent > that what we are computing in this patch is full-page writes. But > there is a reference to full-page image as well. I think as > full_page_writes is an exposed variable that is well understood so > exposing information with similar name via this patch doesn't sound > illogical to me. Whatever we use here we need to be consistent all > throughout, even pg_stat_statements need to name exposed variable as > wal_fpi instead of wal_fpw. > > To me, full-page writes sound more appealing with other WAL usage > variables like records and bytes. I might be more used to this term as > 'fpw' that is why it occurred better to me. OTOH, if most of us think > that a full-page image is better suited here, I am fine with changing > it at all places. > Julien, Peter, others do you have any opinion here? I think it is better if we decide on one of FPW or FPI and make the changes at all places for this patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Apr 27, 2020 at 08:35:51AM +0530, Amit Kapila wrote: > On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: >>> The internal symbol for the WAL record is >>> XLOG_FPI and xlogdesc.c prints it as "FPI". > > Julien, Peter, others do you have any opinion here? I think it is > better if we decide on one of FPW or FPI and make the changes at all > places for this patch. It seems to me that Peter is right here. A full-page write is the action to write a full-page image, so if you consider only a way to define the static data of a full-page and/or a quantity associated to it, we should talk about full-page images. -- Michael
Attachment
On Mon, Apr 27, 2020 at 8:12 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, Apr 27, 2020 at 08:35:51AM +0530, Amit Kapila wrote: > > On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > >> On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > >>> The internal symbol for the WAL record is > >>> XLOG_FPI and xlogdesc.c prints it as "FPI". > > > > Julien, Peter, others do you have any opinion here? I think it is > > better if we decide on one of FPW or FPI and make the changes at all > > places for this patch. > > It seems to me that Peter is right here. A full-page write is the > action to write a full-page image, so if you consider only a way to > define the static data of a full-page and/or a quantity associated to > it, we should talk about full-page images. I agree with that definition. I can send a cleanup patch if there's no objection.
On Mon, Apr 27, 2020 at 1:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, Apr 27, 2020 at 8:12 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Mon, Apr 27, 2020 at 08:35:51AM +0530, Amit Kapila wrote: > > > On Thu, Apr 23, 2020 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > >> On Thu, Apr 23, 2020 at 12:16 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > >>> The internal symbol for the WAL record is > > >>> XLOG_FPI and xlogdesc.c prints it as "FPI". > > > > > > Julien, Peter, others do you have any opinion here? I think it is > > > better if we decide on one of FPW or FPI and make the changes at all > > > places for this patch. > > > > It seems to me that Peter is right here. A full-page write is the > > action to write a full-page image, so if you consider only a way to > > define the static data of a full-page and/or a quantity associated to > > it, we should talk about full-page images. > Fair enough, if more people want full-page image terminology in this context then we can do that. > I agree with that definition. I can send a cleanup patch if there's > no objection. > Okay, feel free to send the patch. Thanks for taking the initiative to write a patch for this. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 28, 2020 at 7:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 27, 2020 at 1:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > I agree with that definition. I can send a cleanup patch if there's > > no objection. > > > > Okay, feel free to send the patch. Thanks for taking the initiative > to write a patch for this. > Julien, are you planning to write a cleanup patch for this open item? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 30, 2020 at 5:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 28, 2020 at 7:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Apr 27, 2020 at 1:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > I agree with that definition. I can send a cleanup patch if there's > > > no objection. > > > > > > > Okay, feel free to send the patch. Thanks for taking the initiative > > to write a patch for this. > > > > Julien, are you planning to write a cleanup patch for this open item? Sorry Amit, I've been quite busy at work for the last couple of days. I'll take care of that this morning for sure!
On Thu, Apr 30, 2020 at 9:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Thu, Apr 30, 2020 at 5:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Apr 28, 2020 at 7:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Apr 27, 2020 at 1:22 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > > > > I agree with that definition. I can send a cleanup patch if there's > > > > no objection. > > > > > > > > > > Okay, feel free to send the patch. Thanks for taking the initiative > > > to write a patch for this. > > > > > > > Julien, are you planning to write a cleanup patch for this open item? > > Sorry Amit, I've been quite busy at work for the last couple of days. > I'll take care of that this morning for sure! Here's the patch. I included the content of v3-fix_explain_wal_output.patch you provided before, and tried to consistently replace full page writes/fpw to full page images/fpi everywhere on top of it (so documentation, command output, variable names and comments).
Attachment
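After this rename, the per-backend counter struct in instrument.h should end up looking roughly like the sketch below. This is paraphrased from memory of the committed code, so the comments may differ slightly; uint64 is PostgreSQL's own typedef, stubbed here to keep the sketch self-contained.

```c
typedef unsigned long long uint64;	/* stand-in for PostgreSQL's typedef */

/* Roughly the shape of the committed WAL usage counters. */
typedef struct WalUsage
{
	long		wal_records;	/* # of WAL records produced */
	long		wal_fpi;		/* # of WAL full page images produced */
	uint64		wal_bytes;		/* size of WAL records produced */
} WalUsage;
```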
On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Thu, Apr 30, 2020 at 9:18 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Thu, Apr 30, 2020 at 5:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Julien, are you planning to write a cleanup patch for this open item? > > > > Sorry Amit, I've been quite busy at work for the last couple of days. > > I'll take care of that this morning for sure! > > Here's the patch. > Thanks for the patch. I will look into it early next week. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > Here's the patch. I included the content of > v3-fix_explain_wal_output.patch you provided before, and tried to > consistently replace full page writes/fpw to full page images/fpi > everywhere on top of it (so documentation, command output, variable > names and comments). > Your patch looks mostly good to me. I have made slight modifications which include changing the non-text format in show_wal_usage to use a capital letter for the second word, which makes it similar to Buffer usage stats, and additionally, ran pgindent. Let me know what do you think of attached? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On Mon, May 4, 2020 at 6:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > Here's the patch. I included the content of > > v3-fix_explain_wal_output.patch you provided before, and tried to > > consistently replace full page writes/fpw to full page images/fpi > > everywhere on top of it (so documentation, command output, variable > > names and comments). > > > > Your patch looks mostly good to me. I have made slight modifications > which include changing the non-text format in show_wal_usage to use a > capital letter for the second word, which makes it similar to Buffer > usage stats, and additionally, ran pgindent. > > Let me know what do you think of attached? Thanks a lot Amit. It looks perfect to me!
On Mon, May 4, 2020 at 8:03 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Mon, May 4, 2020 at 6:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > Here's the patch. I included the content of > > > v3-fix_explain_wal_output.patch you provided before, and tried to > > > consistently replace full page writes/fpw to full page images/fpi > > > everywhere on top of it (so documentation, command output, variable > > > names and comments). > > > > > > > Your patch looks mostly good to me. I have made slight modifications > > which include changing the non-text format in show_wal_usage to use a > > capital letter for the second word, which makes it similar to Buffer > > usage stats, and additionally, ran pgindent. > > > > Let me know what do you think of attached? > > Thanks a lot Amit. It looks perfect to me! > Pushed. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, May 5, 2020 at 12:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, May 4, 2020 at 8:03 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > On Mon, May 4, 2020 at 6:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Apr 30, 2020 at 2:19 PM Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > > > > > Here's the patch. I included the content of > > > > v3-fix_explain_wal_output.patch you provided before, and tried to > > > > consistently replace full page writes/fpw to full page images/fpi > > > > everywhere on top of it (so documentation, command output, variable > > > > names and comments). > > > > > > > > > > Your patch looks mostly good to me. I have made slight modifications > > > which include changing the non-text format in show_wal_usage to use a > > > capital letter for the second word, which makes it similar to Buffer > > > usage stats, and additionally, ran pgindent. > > > > > > Let me know what do you think of attached? > > > > Thanks a lot Amit. It looks perfect to me! > > > > Pushed. Thanks!
On Wed, May 6, 2020 at 12:19 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Tue, May 5, 2020 at 12:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > Your patch looks mostly good to me. I have made slight modifications > > > > which include changing the non-text format in show_wal_usage to use a > > > > capital letter for the second word, which makes it similar to Buffer > > > > usage stats, and additionally, ran pgindent. > > > > > > > > Let me know what do you think of attached? > > > > > > Thanks a lot Amit. It looks perfect to me! > > > > > > > Pushed. > > Thanks! > I have updated the open items page to reflect this commit [1]. [1] - https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
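Putting the thread's decisions together (the WAL option for EXPLAIN, the records/fpi/bytes counters, and the single-space key=value text format), a session exercising the feature should look roughly like the sketch below; the table, plan shape, and numbers are invented.

```sql
-- Illustrative only: table, plan shape, and numbers are invented.
EXPLAIN (ANALYZE, WAL, COSTS OFF) UPDATE test SET b = b + 1;
--  Update on test (actual time=... rows=0 loops=1)
--    WAL: records=2000000 fpi=16 bytes=140000000
--    ->  Seq Scan on test (actual time=... rows=1000000 loops=1)
```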