Thread: Re: Add contrib/pg_logicalsnapinspect

Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Please find attached a patch to $SUBJECT.
>
> This module provides SQL functions to inspect the contents of serialized logical
> snapshots of a running database cluster, which I think could be useful for
> debugging or educational purposes.
>

+1. I see it could be a good debugging aid.

>
> 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
> we should now pay much more attention when changing their contents but I think
> it's worth it.
>

Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?

--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
> On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Please find attached a patch to $SUBJECT.
> >
> > This module provides SQL functions to inspect the contents of serialized logical
> > snapshots of a running database cluster, which I think could be useful for
> > debugging or educational purposes.
> >
> 
> +1. I see it could be a good debugging aid.

Thanks for the feedback!

> >
> > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
> > we should now pay much more attention when changing their contents but I think
> > it's worth it.
> >
> 
> Is it possible to avoid exposing these structures? Can we expose some
> function from snapbuild.c that provides the required information?

Yeah, that's an option if we don't want to expose those structs to public.

I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.

I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).

What do you think? Did you have something else in mind?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
> > On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
> > > we should now pay much more attention when changing their contents but I think
> > > it's worth it.
> > >
> >
> > Is it possible to avoid exposing these structures? Can we expose some
> > function from snapbuild.c that provides the required information?
>
> Yeah, that's an option if we don't want to expose those structs to public.
>
> I think we could 1/ create a function that would return a formed HeapTuple, or
> 2/ we could create multiple functions (about 15) that would return the values
> we are interested in.
>
> I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
> tuple if one is interested to only one value).
>

True, but OTOH, each time we add a new field to these structures, a
new function has to be exposed. I don't have a strong opinion on this
but seeing current use cases, it seems okay to expose a single
function.

> What do you think? Did you have something else in mind?
>

On similar lines, we can also provide a function to get the slot's
on-disk data. IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Thu, Aug 29, 2024 at 02:51:36PM +0530, Amit Kapila wrote:
> On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
> > > On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
> > > > we should now pay much more attention when changing their contents but I think
> > > > it's worth it.
> > > >
> > >
> > > Is it possible to avoid exposing these structures? Can we expose some
> > > function from snapbuild.c that provides the required information?
> >
> > Yeah, that's an option if we don't want to expose those structs to public.
> >
> > I think we could 1/ create a function that would return a formed HeapTuple, or
> > 2/ we could create multiple functions (about 15) that would return the values
> > we are interested in.
> >
> > I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
> > tuple if one is interested to only one value).
> >
> 
> True, but OTOH, each time we add a new field to these structures, a
> new function has to be exposed. I don't have a strong opinion on this
> but seeing current use cases, it seems okay to expose a single
> function.

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

> > What do you think? Did you have something else in mind?
> >
> 
> On similar lines, we can also provide a function to get the slot's
> on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.
 
> IIRC, Bharath had previously proposed a tool to achieve
> the same. It is fine if we don't want to add that as part of this
> patch but I mentioned it because by having that we can have a set of
> functions to view logical decoding data.
>

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

Regards,
 
-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Bharath Rupireddy
Date:
On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Yeah that's fair. And now I'm wondering if we need an extra module. I think
> we could "simply" expose 2 new functions in core, thoughts?
>
> > > What do you think? Did you have something else in mind?
> > >
> >
> > On similar lines, we can also provide a function to get the slot's
> > on-disk data.
>
> Yeah, having a way to expose the data from the disk makes fully sense to me.
>
> > IIRC, Bharath had previously proposed a tool to achieve
> > the same. It is fine if we don't want to add that as part of this
> > patch but I mentioned it because by having that we can have a set of
> > functions to view logical decoding data.
>
> That's right. I think this one would be simply enough to expose one or two
> functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

FWIW, I wrote pg_replslotdata as a tool, not as an extension for
reading on-disk replication slot data to help when the server is down
-
https://www.postgresql.org/message-id/flat/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com.
When the server is running, pg_get_replication_slots() pretty much
gives the on-disk contents.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Thu, Aug 29, 2024 at 06:33:19PM +0530, Bharath Rupireddy wrote:
> On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > That's right. I think this one would be simply enough to expose one or two
> > functions in core too (and probably would not need an extra module).
> 
> +1 for functions in core unless this extra module
> pg_logicalsnapinspect works as a tool to be helpful even when the
> server is down.

Thanks for the feedback!

I don't see any use case where it could be useful when the server is down. So,
I think I'll move forward with in core functions (unless someone has a different
opinion).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Yeah that's fair. And now I'm wondering if we need an extra module. I think
> > we could "simply" expose 2 new functions in core, thoughts?
> >
> > > > What do you think? Did you have something else in mind?
> > > >
> > >
> > > On similar lines, we can also provide a function to get the slot's
> > > on-disk data.
> >
> > Yeah, having a way to expose the data from the disk makes fully sense to me.
> >
> > > IIRC, Bharath had previously proposed a tool to achieve
> > > the same. It is fine if we don't want to add that as part of this
> > > patch but I mentioned it because by having that we can have a set of
> > > functions to view logical decoding data.
> >
> > That's right. I think this one would be simply enough to expose one or two
> > functions in core too (and probably would not need an extra module).
>
> +1 for functions in core unless this extra module
> pg_logicalsnapinspect works as a tool to be helpful even when the
> server is down.
>

We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category, so why go for the core for these functions?

--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:
> On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Yeah that's fair. And now I'm wondering if we need an extra module. I think
> > > we could "simply" expose 2 new functions in core, thoughts?
> > >
> > > > > What do you think? Did you have something else in mind?
> > > > >
> > > >
> > > > On similar lines, we can also provide a function to get the slot's
> > > > on-disk data.
> > >
> > > Yeah, having a way to expose the data from the disk makes fully sense to me.
> > >
> > > > IIRC, Bharath had previously proposed a tool to achieve
> > > > the same. It is fine if we don't want to add that as part of this
> > > > patch but I mentioned it because by having that we can have a set of
> > > > functions to view logical decoding data.
> > >
> > > That's right. I think this one would be simply enough to expose one or two
> > > functions in core too (and probably would not need an extra module).
> >
> > +1 for functions in core unless this extra module
> > pg_logicalsnapinspect works as a tool to be helpful even when the
> > server is down.
> >
> 
> We have an example of pageinspect which provides low-level functions
> to aid debugging. The proposal for these APIs also seems to fall in
> the same category,

That's right, but...

>  so why go for the core for these functions?

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:
> > On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > Yeah that's fair. And now I'm wondering if we need an extra module. I think
> > > > we could "simply" expose 2 new functions in core, thoughts?
> > > >
> > > > > > What do you think? Did you have something else in mind?
> > > > > >
> > > > >
> > > > > On similar lines, we can also provide a function to get the slot's
> > > > > on-disk data.
> > > >
> > > > Yeah, having a way to expose the data from the disk makes fully sense to me.
> > > >
> > > > > IIRC, Bharath had previously proposed a tool to achieve
> > > > > the same. It is fine if we don't want to add that as part of this
> > > > > patch but I mentioned it because by having that we can have a set of
> > > > > functions to view logical decoding data.
> > > >
> > > > That's right. I think this one would be simply enough to expose one or two
> > > > functions in core too (and probably would not need an extra module).
> > >
> > > +1 for functions in core unless this extra module
> > > pg_logicalsnapinspect works as a tool to be helpful even when the
> > > server is down.
> > >
> >
> > We have an example of pageinspect which provides low-level functions
> > to aid debugging. The proposal for these APIs also seems to fall in
> > the same category,
>
> That's right, but...
>
> >  so why go for the core for these functions?
>
> as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
> and to create/expose 2 new functions in snapbuild.c then the functions in the
> module would do nothing but expose the data coming from the snapbuild.c's
> functions (get the tuple and send it to the client). That sounds weird to me to
> create a module that would "only" do so, that's why I thought that in core
> functions taking care of everything make more sense.
>

I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?

--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:
> On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
> > and to create/expose 2 new functions in snapbuild.c then the functions in the
> > module would do nothing but expose the data coming from the snapbuild.c's
> > functions (get the tuple and send it to the client). That sounds weird to me to
> > create a module that would "only" do so, that's why I thought that in core
> > functions taking care of everything make more sense.
> >
> 
> I see your point. Does anyone else have an opinion on the need for
> these functions and whether to expose them from a contrib module or
> have them as core functions?

I looked at when the SNAPBUILD_VERSION has been changed:

ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)

So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).

So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.

If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).

At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:
> > On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > > as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
> > > and to create/expose 2 new functions in snapbuild.c then the functions in the
> > > module would do nothing but expose the data coming from the snapbuild.c's
> > > functions (get the tuple and send it to the client). That sounds weird to me to
> > > create a module that would "only" do so, that's why I thought that in core
> > > functions taking care of everything make more sense.
> > >
> >
> > I see your point. Does anyone else have an opinion on the need for
> > these functions and whether to expose them from a contrib module or
> > have them as core functions?
>
> I looked at when the SNAPBUILD_VERSION has been changed:
>
> ec5896aed3 (2014)
> a975ff4980 (2021)
> 8bdb1332eb (2021)
> 7f13ac8123 (2022)
> bb19b70081 (2024)
>
> So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
> frequently. Furthermore, those structs are serialized and so we have to preserve
> their on-disk compatibility (means we can change them only in a major release
> if we need to).
>
> So, I think it would not be that much of an issue to expose those structs and
> create a new contrib module (as v1 did propose) instead of in core new functions.
>
> If we want to insist that external modules "should" not rely on those structs then
> we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
> as proposed in v1).
>

Adding snapbuild_internal.h sounds like a good idea.

> At the end, I think that creating a contrib module and exposing those structs in
> internal_snapbuild.h make more sense (as compared to in core functions).
>

Fail enough. We can keep the module name as logicalinspect so that we
can extend it for other logical decoding/replication-related files in
the future.


--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
>
> Yeah, good idea. Done that way in v3 attached.
>

Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:

src/backend/replication/logical/snapbuild.c:

1)
+ fsync_fname("pg_logical/snapshots", true);

Should we use macros PG_LOGICAL_DIR and  PG_LOGICAL_SNAPSHOTS_DIR  in
ValidateSnapshotFile(), instead of hard coding the path


2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()

+ sprintf(path, "pg_logical/snapshots/%X-%X.snap",
+ LSN_FORMAT_ARGS(lsn));                        LSN_FORMAT_ARGS(lsn));

3)
+#include "replication/internal_snapbuild.h"

Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h

4)
+static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
+ const char *path);

Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Mon, Sep 16, 2024 at 8:03 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote:
> > On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > >
> > > Yeah, good idea. Done that way in v3 attached.
> > >
> >
> > Thanks for the patch. +1 on the patch's idea. I have started
> > reviewing/testing it. It is WIP but please find few initial comments:
>
> Thanks for sharing your thoughts and for the review!
>
> >
> > src/backend/replication/logical/snapbuild.c:
> >
> > 1)
> > + fsync_fname("pg_logical/snapshots", true);
> >
> > Should we use macros PG_LOGICAL_DIR and  PG_LOGICAL_SNAPSHOTS_DIR  in
> > ValidateSnapshotFile(), instead of hard coding the path
> >
> >
> > 2)
> > Same as above in pg_get_logical_snapshot_meta() and
> > pg_get_logical_snapshot_info()
> >
> > + sprintf(path, "pg_logical/snapshots/%X-%X.snap",
> > + LSN_FORMAT_ARGS(lsn));                        LSN_FORMAT_ARGS(lsn));
> >
>
> Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf
> which has been introduced after v1 of this patch. I thought I took care of it once
> c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached,
> Thanks!
>
> > 3)
> > +#include "replication/internal_snapbuild.h"
> >
> > Shall we name new file as 'snapbuild_internal.h' instead of
> > 'internal_snapbuild.h'. Please see other files' name under
> > './src/include/replication':
> > worker_internal.h
> > walsender_private.h
>
> Agree, that should be snapbuild_internal.h, done in v4.
>
> >
> > 4)
> > +static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
> > + const char *path);
> >
> > Is it required? We generally don't add declaration unless required by
> > compiler. Since definition is prior to usage, it is not needed?
> >
>
> I personally prefer to add them even if not required by the compiler. I did not
> pay attention that "We generally don't add declaration unless required by compiler"
> and (after a quick check) I did not find any reference in the coding style
> documentation [1]. That said, I don't have a strong opinion about that and so
> removed in v4. Worth to add a mention in the coding convention doc?
>

Okay. I was somehow under the impression that this is the way in the
postgres i.e. not add redundant declarations.  Will be good  to know
what others think on this.

Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:

1)
+#include "port/pg_crc32c.h"

It is not needed in pg_logicalinspect.c as it is already included in
internal_snapbuild.h

2)
+ values[0] = Int16GetDatum(ondisk.builder.state);
........
+ values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
+ values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
+ values[10] = Int64GetDatum(ondisk.builder.committed.xcnt);

We can have values[i++] in all the places and later we can check :
Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);
Then we need not to keep track of number even in later part of code,
as it goes till 14.

3)
Similar change can be done here:

+ values[0] = Int32GetDatum(ondisk.magic);
+ values[1] = Int32GetDatum(ondisk.checksum);
+ values[2] = Int32GetDatum(ondisk.version);

check at the end will be: Assert(i == PG_GET_LOGICAL_SNAPSHOT_META_COLS);

4)
Most of the output columns in pg_get_logical_snapshot_info() look
self-explanatory except 'state'. Should we have meaningful 'text' here
corresponding to SnapBuildState? Similar to what we do for
'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses
for ReplicationSlotInvalidationCause)

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks for addressing the comments. I have not started reviewing v4
> yet, but here are few more comments on v3:
>

I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:

newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
 magic | checksum | version
-------+----------+---------
       |          |

(1 row)

Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:

newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR:  function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.


Shouldn't the new functions have same behavior?

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
"David G. Johnston"
Date:


On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks for addressing the comments. I have not started reviewing v4
> yet, but here are few more comments on v3:
>

I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:

newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
 magic | checksum | version
-------+----------+---------
       |          |

(1 row)

Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:

newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR:  function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.


Shouldn't the new functions have same behavior?

No. Since the name pg_ls_logicalsnapdir has zero single-argument implementations passing a null value as an argument is indeed attempt to invoke a function signature that doesn’t exist.

If there is exactly one single input argument function of the given name the parser is going to cast the null literal to the data type of the single argument and invoke the function.  It will not and cannot be convinced to fail to find a matching function.

I can see an argument that they should produce an empty set instead of a single all-null row, but the idea that they wouldn’t even be found is contrary to a core design of the system.

David J.

Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Mon, Sep 16, 2024 at 10:16:16PM -0700, David G. Johnston wrote:
> On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
> 
> > On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com>
> > wrote:
> > >
> > > Thanks for addressing the comments. I have not started reviewing v4
> > > yet, but here are few more comments on v3:
> > >
> >
> > I just noticed that when we pass NULL input, both the new functions
> > give 1 row as output, all cols as NULL:
> >
> > newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
> >  magic | checksum | version
> > -------+----------+---------
> >        |          |
> >
> > (1 row)
> >
> > Similar behavior with pg_get_logical_snapshot_info(). While the
> > existing 'pg_ls_logicalsnapdir' function gives this error, which looks
> > more meaningful:
> >
> > newdb1=# select * from pg_ls_logicalsnapdir(NULL);
> > ERROR:  function pg_ls_logicalsnapdir(unknown) does not exist
> > LINE 1: select * from pg_ls_logicalsnapdir(NULL);
> > HINT:  No function matches the given name and argument types. You
> > might need to add explicit type casts.
> >
> >
> > Shouldn't the new functions have same behavior?
> >
> 
> No. Since the name pg_ls_logicalsnapdir has zero single-argument
> implementations passing a null value as an argument is indeed attempt to
> invoke a function signature that doesn’t exist.

Agree.

> I can see an argument that they should produce an empty set instead of a
> single all-null row,

Yeah, it's outside the scope of this patch but I've seen different behavior
in this area.

For example:

postgres=# select * from  pg_ls_replslotdir(NULL);
 name | size | modification
------+------+--------------
(0 rows)

as compared to:

postgres=# select * from  pg_walfile_name_offset(NULL);
 file_name | file_offset
-----------+-------------
           |
(1 row)

I thought that it might be linked to the volatility but it is not:

postgres=# select * from pg_stat_get_xact_blocks_fetched(NULL);
 pg_stat_get_xact_blocks_fetched
---------------------------------

(1 row)

postgres=# select * from pg_get_multixact_members(NULL);
 xid | mode
-----+------
(0 rows)

while both are volatile.

I think both make sense: It's "empty" or we "don't know the values of the fields".
I don't know if there is any reason about this "inconsistency".

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Tue, Sep 17, 2024 at 10:46 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
>
>
> On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
>>
>> On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>> >
>> > Thanks for addressing the comments. I have not started reviewing v4
>> > yet, but here are few more comments on v3:
>> >
>>
>> I just noticed that when we pass NULL input, both the new functions
>> give 1 row as output, all cols as NULL:
>>
>> newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
>>  magic | checksum | version
>> -------+----------+---------
>>        |          |
>>
>> (1 row)
>>
>> Similar behavior with pg_get_logical_snapshot_info(). While the
>> existing 'pg_ls_logicalsnapdir' function gives this error, which looks
>> more meaningful:
>>
>> newdb1=# select * from pg_ls_logicalsnapdir(NULL);
>> ERROR:  function pg_ls_logicalsnapdir(unknown) does not exist
>> LINE 1: select * from pg_ls_logicalsnapdir(NULL);
>> HINT:  No function matches the given name and argument types. You
>> might need to add explicit type casts.
>>
>>
>> Shouldn't the new functions have same behavior?
>
>
> No. Since the name pg_ls_logicalsnapdir has zero single-argument implementations passing a null value as an argument
isindeed attempt to invoke a function signature that doesn’t exist. 
>
> If there is exactly one single input argument function of the given name the parser is going to cast the null literal
tothe data type of the single argument and invoke the function.  It will not and cannot be convinced to fail to find a
matchingfunction. 

Okay, understood. Thanks for explaining.

> I can see an argument that they should produce an empty set instead of a single all-null row, but the idea that they
wouldn’teven be found is contrary to a core design of the system. 

Okay, a single row can be investigated if it comes under this scope.
But I see why 'ERROR' is not a possibility here.

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Tue, Sep 17, 2024 at 12:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote:
> > Thanks for addressing the comments. I have not started reviewing v4
> > yet, but here are few more comments on v3:
> >
> > 1)
> > +#include "port/pg_crc32c.h"
> >
> > It is not needed in pg_logicalinspect.c as it is already included in
> > internal_snapbuild.h
>
> Yeap, forgot to remove that one when creating the new "internal".h file, done
> in v5 attached, thanks!
>
> >
> > 2)
> > + values[0] = Int16GetDatum(ondisk.builder.state);
> > ........
> > + values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
> > + values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
> > + values[10] = Int64GetDatum(ondisk.builder.committed.xcnt);
> >
> > We can have values[i++] in all the places and later we can check :
> > Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);
> > Then we need not to keep track of number even in later part of code,
> > as it goes till 14.
>
> Right, let's do it that way (as it is done in pg_walinspect for example).
>
> > 4)
> > Most of the output columns in pg_get_logical_snapshot_info() look
> > self-explanatory except 'state'. Should we have meaningful 'text' here
> > corresponding to SnapBuildState? Similar to what we do for
> > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses
> > for ReplicationSlotInvalidationCause)
>
> Yeah we could. I was not sure about that (and that was my first remark in [1])
> , as the module is mainly for debugging purpose, I was thinking that the one
> using it could refer to "snapbuild.h". Let's see what others think.
>

okay, makes sense. lets wait what others have to say.

Thanks for the patch. Few trivial things:

1)
May be we shall change 'INTERNAL_SNAPBUILD_H' in snapbuild_internal.h
to 'SNAPBUILD_INTERNAL_H'?

2)
ValidateSnapshotFile()

It is not only validating, but loading the content as well. So may be
we can rename to ValidateAndRestoreSnapshotFile?

3) sgml:
a)
+ The pg_logicalinspect functions are called using an LSN argument
that can be extracted from the output name of the
pg_ls_logicalsnapdir() function.

Is it possible to give link to pg_ls_logicalsnapdir function here?

b)
+ Gets logical snapshot metadata about a snapshot file that is located
in the pg_logical/snapshots directory.

located in server's pg_logical/snapshots directory
 (i.e. use server keyword, similar to how pg_ls_logicalsnapdir ,
pg_ls_logicalmapdir explains it)

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
Amit Kapila
Date:
On Tue, Sep 17, 2024 at 12:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote:
> > Thanks for addressing the comments. I have not started reviewing v4
> > yet, but here are few more comments on v3:
> >
>
> > 4)
> > Most of the output columns in pg_get_logical_snapshot_info() look
> > self-explanatory except 'state'. Should we have meaningful 'text' here
> > corresponding to SnapBuildState? Similar to what we do for
> > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses
> > for ReplicationSlotInvalidationCause)
>
> Yeah we could. I was not sure about that (and that was my first remark in [1])
> , as the module is mainly for debugging purpose, I was thinking that the one
> using it could refer to "snapbuild.h". Let's see what others think.
>

Displaying the 'text' for the state column makes it easy to
understand. So, +1 for doing it that way.

--
With Regards,
Amit Kapila.



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Thu, Sep 19, 2024 at 12:17 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> Thanks for the review!
>

Thanks for the patch.

Should we include in the document who can execute these functions and
the required access permissions, similar to how it's done for
pgwalinspect, pg_ls_logicalmapdir(), and other such functions?

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Fri, Sep 20, 2024 at 12:22 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
>
> Please find attached v8, that:
>

Thank You for the patch. In one of my tests, I noticed that I got
negative checksum:

postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/3481F20');
   magic    |  checksum  | version
------------+------------+---------
 1369563137 | -266346460 |       6

But pg_crc32c is uint32. Is it because we are getting it as
Int32GetDatum(ondisk.checksum) in pg_get_logical_snapshot_meta()?
Instead should it be UInt32GetDatum?

Same goes for below:
values[i++] = Int32GetDatum(ondisk.magic);
values[i++] = Int32GetDatum(ondisk.magic);

We need to recheck the rest of the fields in the info() function as well.

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
shveta malik
Date:
On Mon, Sep 23, 2024 at 7:57 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> My review comments for v8-0001
>
> ======
> contrib/pg_logicalinspect/pg_logicalinspect.c
>
> 1.
> +/*
> + * Lookup table for SnapBuildState.
> + */
> +
> +#define SNAPBUILD_STATE_INCR 1
> +
> +static const char *const SnapBuildStateDesc[] = {
> + [SNAPBUILD_START + SNAPBUILD_STATE_INCR] = "start",
> + [SNAPBUILD_BUILDING_SNAPSHOT + SNAPBUILD_STATE_INCR] = "building",
> + [SNAPBUILD_FULL_SNAPSHOT + SNAPBUILD_STATE_INCR] = "full",
> + [SNAPBUILD_CONSISTENT + SNAPBUILD_STATE_INCR] = "consistent",
> +};
> +
> +/*
>
> nit - the SNAPBUILD_STATE_INCR made this code appear more complicated
> than it is.

I agree.

> Please take a look at the attachment for an alternative
> implementation which includes an explanatory comment. YMMV. Feel free
> to ignore it.
>

+1. I find Peter's version with comments easier to understand.

thanks
Shveta



Re: Add contrib/pg_logicalsnapinspect

From
Bertrand Drouvot
Date:
Hi,

On Tue, Sep 24, 2024 at 09:15:31AM +0530, shveta malik wrote:
> On Fri, Sep 20, 2024 at 12:22 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> >
> > Please find attached v8, that:
> >
> 
> Thank You for the patch. In one of my tests, I noticed that I got
> negative checksum:
> 
> postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/3481F20');
>    magic    |  checksum  | version
> ------------+------------+---------
>  1369563137 | -266346460 |       6
> 
> But pg_crc32c is uint32. Is it because we are getting it as
> Int32GetDatum(ondisk.checksum) in pg_get_logical_snapshot_meta()?
> Instead should it be UInt32GetDatum?

Thanks for the testing.

As the checksum could be > 2^31 - 1, then v9 (just shared up-thread) changes it
to an int8 in the pg_logicalinspect--1.0.sql file. So, to avoid CI failure on
the 32bit build, then v9 is using Int64GetDatum() instead of UInt32GetDatum().

> Same goes for below:
> values[i++] = Int32GetDatum(ondisk.magic);
> values[i++] = Int32GetDatum(ondisk.magic);

The 2 others field (magic and version) are unlikely to be > 2^31 - 1, so v9 is
making use of UInt32GetDatum() and keep int4 in the sql file.

> We need to recheck the rest of the fields in the info() function as well.

I think that the pg_get_logical_snapshot_info()'s fields are ok (I did spend some
time to debug CI failing on the 32bit build for some on them before submitting v1).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com