Re: WAL usage calculation patch - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: WAL usage calculation patch
Date
Msg-id 20200317203222.GA44781@nol
Whole thread Raw
In response to Re: WAL usage calculation patch  (Kirill Bychik <kirill.bychik@gmail.com>)
Responses Re: WAL usage calculation patch
List pgsql-hackers
On Tue, Mar 17, 2020 at 10:27:05PM +0300, Kirill Bychik wrote:
> > > Please feel free to work on any extension of this patch idea. I lack
> > > both time and knowledge to do it all by myself.
> >
> > I'm adding a 3rd patch on top of yours to expose the new WAL counters in
> > pg_stat_database, for vacuum and autovacuum.  I'm not really enthiusiastic with
> > this approach but I didn't find better, and maybe this will raise some better
> > ideas.  The only sure thing is that we're not going to add a bunch of new
> > fields in pg_stat_all_tables anyway.
> >
> > We can also drop this 3rd patch entirely if no one's happy about it without
> > impacting the first two.
>
> No objections about 3rd on my side, unless we miss the CF completely.
>
> As for the code, I believe:
> + walusage.wal_records = pgWalUsage.wal_records -
> + walusage_start.wal_records;
> + walusage.wal_fp_records = pgWalUsage.wal_fp_records -
> + walusage_start.wal_fp_records;
> + walusage.wal_bytes = pgWalUsage.wal_bytes - walusage_start.wal_bytes;
>
> Could be done much simpler via the utility:
> WalUsageAccumDiff(walusage, pgWalUsage, walusage_start);


Indeed, but this function is private to instrument.c.  AFAICT
pg_stat_statements is already duplicating similar code for buffers rather than
having BufferUsageAccumDiff being exported, so I chose the same approach.

I'd be in favor of exporting both functions though.


> On a side note, I agree API to the buf/wal usage is far from perfect.


Yes clearly.


> > > Test had been reworked, and I believe it should be stable now, the
> > > part which checks WAL is written and there is a correlation between
> > > affected rows and WAL records. I still have no idea how to test
> > > full-page writes against regular updates, it seems very unstable.
> > > Please share ideas if any.
> >
> >
> > I just reviewed the patches, and it globally looks good to me.  The way to
> > detect full page images looks sensible, but I'm really not familiar with that
> > code so additional review would be useful.
> >
> > I noticed that the new wal_write_fp_records field in pg_stat_statements wasn't
> > used in the test.  Since I have to add all the patches to make the cfbot happy,
> > I slightly adapted the tests to reference the fp column too.  There was also a
> > minor issue in the documentation, as wal_records and wal_bytes were copy/pasted
> > twice while wal_write_fp_records wasn't documented, so I also changed it.
> >
> > Let me know if you're ok with those changes.
>
> Sorry for not getting wal_fp_usage into the docs, my fault.
>
> As for the tests, please get somebody else to review this. I strongly
> believe checking full page writes here could be a source of
> instability.


I'm also a little bit dubious about it.  The initial checkpoint should make
things stable (of course unless full_page_writes is disabled), and Cfbot also
seems happy about it.  At least keeping it for the temporary tables test
shouldn't be a problem.



pgsql-hackers by date:

Previous
From: Sergei Kornilov
Date:
Subject: Re: allow online change primary_conninfo
Next
From: Sergei Kornilov
Date:
Subject: Re: pgsql: walreceiver uses a temporary replication slot by default