Re: pg_walinspect - a new extension to get raw WAL data and WAL stats - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: pg_walinspect - a new extension to get raw WAL data and WAL stats
Date
Msg-id CAE9k0PkgAf-bCdMR6MhNX_3F43RhTa8vafRMvfiNB+QaJA5-Qg@mail.gmail.com
Whole thread Raw
In response to Re: pg_walinspect - a new extension to get raw WAL data and WAL stats  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Fri, Mar 11, 2022 at 8:22 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> Sorry, some minor non-syntactical corrections.
>
> At Fri, 11 Mar 2022 11:38:22 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > I played with this a bit, and would like to share some thoughts on it.
> >
> > It seems to me too rigorous that pg_get_wal_records_info/stats()
> > reject future LSNs as end-LSN and I think WARNING or INFO and stop at
> > the real end-of-WAL is more kind to users.  I think the same with the
> > restriction that start and end LSN are required to be different.
> >
> > The definition of end-lsn is fuzzy here.  If I fed a future LSN to the
> > functions, they tell me the beginning of the current insertion point
> > in error message.  On the other hand they don't accept the same
> > value as end-LSN.  I think it is right that they tell the current
> > insertion point and they should take the end-LSN as the LSN to stop
> > reading.
> >
> > I think pg_get_wal_stats() is worth to have but I think it should be
> > implemented in SQL.  Currently pg_get_wal_records_info() doesn't tell
> > about FPI since pg_waldump doesn't but it is internally collected (of
> > course!) and easily revealed. If we do that, the
> > pg_get_wal_records_stats() would be reduced to the following SQL
> > statement
> >
> > SELECT resource_manager resmgr,
> >        count(*) AS N,
> >          (count(*) * 100 / sum(count(*)) OVER tot)::numeric(5,2) AS "%N",
> >          sum(total_length) AS "combined size",
> >          (sum(total_length) * 100 / sum(sum(total_length)) OVER tot)::numeric(5,2) AS "%combined size",
> >          sum(fpi_len) AS fpilen,
> >          (sum(fpi_len) * 100 / sum(sum(fpi_len)) OVER tot)::numeric(5,2) AS "%fpilen"
> >          FROM pg_get_wal_records_info('0/1000000', '0/175DD7f')
> >          GROUP by resource_manager
> >          WINDOW tot AS ()
> >          ORDER BY "combined size" desc;
> >
> > The only difference with pg_waldump is the statement above doesn't
> > show lines for the resource managers that don't contained in the
> > result of pg_get_wal_records_info(). But I don't think that matters.
> >
> >
> > Sometimes the field description has very long (28kb long) content. It
> > makes the result output almost unreadable and I had a bit hard time
> > struggling with the output full of '-'s.  I would like have a default
> > limit on the length of such fields that can be long but I'm not sure
> > we want that.
> >
> >
> - The difference between pg_get_wal_record_info and _records_ other than
> - the number of argument is the former accepts incorrect LSNs.
>
> The discussion is somewhat confused after some twists and turns..  It
> should be something like the following.
>
> pg_get_wal_record_info and pg_get_wal_records_info are almost same
> since the latter can show a single record.  However it is a bit
> annoying to do that. Since, other than it doens't accept same LSNs for
> start and end, it doesn't show a record when there' no record in the
> specfied LSN range.  But I don't think there's no usefulness of the
> behavior.
>
> The following works,
>  pg_get_wal_record_info('0/1000000');

This does work but it doesn't show any WARNING message for the start
pointer adjustment. I think it should.

>  pg_get_wal_records_info('0/1000000');
>

I think this is fine. It should be working because the user hasn't
specified the end pointer so we assume the default end pointer is
end-of-WAL.

> but this doesn't
>  pg_get_wal_records_info('0/1000000', '0/1000000');
> > ERROR:  WAL start LSN must be less than end LSN
>

I think this behaviour is fine. We cannot have the same start and end
lsn pointers.

> And the following shows no records.
>  pg_get_wal_records_info('0/1000000', '0/1000001');
>  pg_get_wal_records_info('0/1000000', '0/1000028');
>

I think we should be erroring out here saying - couldn't find any
valid WAL record between given start and end lsn because there exists
no valid wal records between the specified start and end lsn pointers.

--
With Regards,
Ashutosh Sharma.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: logical decoding and replication of sequences
Next
From: Ashutosh Sharma
Date:
Subject: Re: pg_walinspect - a new extension to get raw WAL data and WAL stats