Re: Add a new pg_walinspect function to extract FPIs from WAL records - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Add a new pg_walinspect function to extract FPIs from WAL records
Date
Msg-id CALj2ACV-WBN=EUgUPyYOGitp+rn163vMnQd=HcWrnKt-uqFYFA@mail.gmail.com
Whole thread Raw
In response to Re: Add a new pg_walinspect function to extract FPIs from WAL records  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Add a new pg_walinspect function to extract FPIs from WAL records
List pgsql-hackers
On Thu, Jan 12, 2023 at 11:23 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jan 11, 2023 at 06:59:18PM +0530, Bharath Rupireddy wrote:
> > I've done it that way for pg_get_wal_fpi_info. If this format looks
> > okay, I can propose to do the same for other functions (for
> > backpatching too) in a separate thread though.
>
> My vote would be to make that happen first, to have in place cleaner
> basics for the docs.  I could just do it and move on..

Sure. Posted a separate patch here -
https://www.postgresql.org/message-id/CALj2ACVGcUpziGgQrcT-1G3dHWQQfWjYBu1YQ2ypv9y86dgogg%40mail.gmail.com

> This reminds me of the slot advancing, where we discarded this case
> just because it is useful to enforce a LSN far in the future.
> Honestly, I cannot think of any case where I would use this level of
> validation, especially having *two* functions to decide one behavior
> or the other: this stuff could just use one function and use for
> example NULL as a setup to enforce the end of WAL, on top of a LSN far
> ahead..  But well..

I understand. I don't mind discussing something like [1] with the
following behaviour and discarding till_end_of_wal functions
altogether:
If start_lsn is NULL, error out/return NULL.
If end_lsn isn't specified, default to NULL, then determine the end_lsn.
If end_lsn is specified as NULL, then determine the end_lsn.
If end_lsn is specified as non-NULL, then determine if it is greater
than start_lsn if yes, go ahead do the job, otherwise error out.

I'll think a bit more on this and perhaps discuss it separately.

> > Hm. How about having pg_get_wal_fpi_info_till_end_of_wal() then?
>
> I don't really want to make the interface more bloated with more
> functions than necessary, TBH, though I get your point to push for
> consistency in these functions.  This makes me really wonder whether
> we'd better make relax all the existing functions, but I may get
> outvoted.

I'll keep the FPI extract function simple as proposed in the patch and
I'll not go write till_end_of_wal version. If needed to get all the
FPIs till the end of WAL, one can always determine the end_lsn with
pg_current_wal_flush_lsn()/pg_last_wal_replay_lsn() when in recovery,
and use the proposed function.

Note that I kept the FPI extract function test simple - ensure FPI
gets generated and check if the new function can fetch it, discarding
lsn or other FPI sanity checks. Whole FPI sanity check needs
pageinspect and we don't want to create the dependency just for tests.
And checking for FPI lsn with WAL record lsn requires us to fetch lsn
from raw bytea stream. As there's no straightforward way to convert
raw bytes from bytea to pg_lsn, doing that in a platform-agnostic
manner (little-endian and big-endian comes into play here) actually is
a no-go IMO. FWIW, for a little-endian system I had to do [2].

Therefore I stick to the simple test unless there's a better way.

I'm attaching the v6 patch for further review.

[1]
CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn,
    IN end_lsn pg_lsn DEFAULT NULL,
    OUT start_lsn pg_lsn,
    OUT end_lsn pg_lsn,
    OUT prev_lsn pg_lsn,
    OUT xid xid,
    OUT resource_manager text,
    OUT record_type text,
    OUT record_length int4,
    OUT main_data_length int4,
    OUT fpi_length int4,
    OUT description text,
    OUT block_ref text
)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'pg_get_wal_records_info'
LANGUAGE C CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE STRICT PARALLEL SAFE;

CREATE FUNCTION pg_get_wal_stats(IN start_lsn pg_lsn,
    IN end_lsn pg_lsn DEFAULT NULL,
    IN  per_record boolean DEFAULT false,
    OUT "resource_manager/record_type" text,
    OUT count int8,
    OUT count_percentage float8,
    OUT record_size int8,
    OUT record_size_percentage float8,
    OUT fpi_size int8,
    OUT fpi_size_percentage float8,
    OUT combined_size int8,
    OUT combined_size_percentage float8
)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'pg_get_wal_stats'
LANGUAGE C CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE STRICT PARALLEL SAFE;

[2]

select '\x00000000b8078901000000002c00601f00200420df020000e09f3800c09f3800a09f380080'::bytea
AS fpi \gset
select substring(:'fpi' from 3 for 16) AS rlsn \gset
select substring(:'rlsn' from 7 for 2) || substring(:'rlsn' from 5 for
2) || substring(:'rlsn' from 3 for 2) || substring(:'rlsn' from 1 for
2) AS hi \gset
select substring(:'rlsn' from 15 for 2) || substring(:'rlsn' from 13
for 2) || substring(:'rlsn' from 11 for 2) || substring(:'rlsn' from 9
for 2) AS lo \gset
select (:'hi' || '/' || :'lo')::pg_lsn;

postgres=# select (:'hi' || '/' || :'lo')::pg_lsn;
  pg_lsn
-----------
 0/18907B8
(1 row)

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Beautify pg_walinspect docs a bit
Next
From: "Daniel Verite"
Date:
Subject: Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs