Thread: Re: Add contrib/pg_logicalsnapinspect
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Please find attached a patch to $SUBJECT. > > This module provides SQL functions to inspect the contents of serialized logical > snapshots of a running database cluster, which I think could be useful for > debugging or educational purposes. > +1. I see it could be a good debugging aid. > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means > we should now pay much more attention when changing their contents but I think > it's worth it. > Is it possible to avoid exposing these structures? Can we expose some function from snapbuild.c that provides the required information? -- With Regards, Amit Kapila.
Hi, On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote: > On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Please find attached a patch to $SUBJECT. > > > > This module provides SQL functions to inspect the contents of serialized logical > > snapshots of a running database cluster, which I think could be useful for > > debugging or educational purposes. > > > > +1. I see it could be a good debugging aid. Thanks for the feedback! > > > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means > > we should now pay much more attention when changing their contents but I think > > it's worth it. > > > > Is it possible to avoid exposing these structures? Can we expose some > function from snapbuild.c that provides the required information? Yeah, that's an option if we don't want to expose those structs to public. I think we could 1/ create a function that would return a formed HeapTuple, or 2/ we could create multiple functions (about 15) that would return the values we are interested in. I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole tuple if one is interested to only one value). What do you think? Did you have something else in mind? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote: > > On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means > > > we should now pay much more attention when changing their contents but I think > > > it's worth it. > > > > > > > Is it possible to avoid exposing these structures? Can we expose some > > function from snapbuild.c that provides the required information? > > Yeah, that's an option if we don't want to expose those structs to public. > > I think we could 1/ create a function that would return a formed HeapTuple, or > 2/ we could create multiple functions (about 15) that would return the values > we are interested in. > > I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole > tuple if one is interested to only one value). > True, but OTOH, each time we add a new field to these structures, a new function has to be exposed. I don't have a strong opinion on this but seeing current use cases, it seems okay to expose a single function. > What do you think? Did you have something else in mind? > On similar lines, we can also provide a function to get the slot's on-disk data. IIRC, Bharath had previously proposed a tool to achieve the same. It is fine if we don't want to add that as part of this patch but I mentioned it because by having that we can have a set of functions to view logical decoding data. -- With Regards, Amit Kapila.
Hi, On Thu, Aug 29, 2024 at 02:51:36PM +0530, Amit Kapila wrote: > On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote: > > > On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > 2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means > > > > we should now pay much more attention when changing their contents but I think > > > > it's worth it. > > > > > > > > > > Is it possible to avoid exposing these structures? Can we expose some > > > function from snapbuild.c that provides the required information? > > > > Yeah, that's an option if we don't want to expose those structs to public. > > > > I think we could 1/ create a function that would return a formed HeapTuple, or > > 2/ we could create multiple functions (about 15) that would return the values > > we are interested in. > > > > I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole > > tuple if one is interested to only one value). > > > > True, but OTOH, each time we add a new field to these structures, a > new function has to be exposed. I don't have a strong opinion on this > but seeing current use cases, it seems okay to expose a single > function. Yeah that's fair. And now I'm wondering if we need an extra module. I think we could "simply" expose 2 new functions in core, thoughts? > > What do you think? Did you have something else in mind? > > > > On similar lines, we can also provide a function to get the slot's > on-disk data. Yeah, having a way to expose the data from the disk makes fully sense to me. > IIRC, Bharath had previously proposed a tool to achieve > the same. It is fine if we don't want to add that as part of this > patch but I mentioned it because by having that we can have a set of > functions to view logical decoding data. > That's right. I think this one would be simply enough to expose one or two functions in core too (and probably would not need an extra module). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Yeah that's fair. And now I'm wondering if we need an extra module. I think > we could "simply" expose 2 new functions in core, thoughts? > > > > What do you think? Did you have something else in mind? > > > > > > > On similar lines, we can also provide a function to get the slot's > > on-disk data. > > Yeah, having a way to expose the data from the disk makes fully sense to me. > > > IIRC, Bharath had previously proposed a tool to achieve > > the same. It is fine if we don't want to add that as part of this > > patch but I mentioned it because by having that we can have a set of > > functions to view logical decoding data. > > That's right. I think this one would be simply enough to expose one or two > functions in core too (and probably would not need an extra module). +1 for functions in core unless this extra module pg_logicalsnapinspect works as a tool to be helpful even when the server is down. FWIW, I wrote pg_replslotdata as a tool, not as an extension for reading on-disk replication slot data to help when the server is down - https://www.postgresql.org/message-id/flat/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com. When the server is running, pg_get_replication_slots() pretty much gives the on-disk contents. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Thu, Aug 29, 2024 at 06:33:19PM +0530, Bharath Rupireddy wrote: > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > That's right. I think this one would be simply enough to expose one or two > > functions in core too (and probably would not need an extra module). > > +1 for functions in core unless this extra module > pg_logicalsnapinspect works as a tool to be helpful even when the > server is down. Thanks for the feedback! I don't see any use case where it could be useful when the server is down. So, I think I'll move forward with in core functions (unless someone has a different opinion). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Yeah that's fair. And now I'm wondering if we need an extra module. I think > > we could "simply" expose 2 new functions in core, thoughts? > > > > > > What do you think? Did you have something else in mind? > > > > > > > > > > On similar lines, we can also provide a function to get the slot's > > > on-disk data. > > > > Yeah, having a way to expose the data from the disk makes fully sense to me. > > > > > IIRC, Bharath had previously proposed a tool to achieve > > > the same. It is fine if we don't want to add that as part of this > > > patch but I mentioned it because by having that we can have a set of > > > functions to view logical decoding data. > > > > That's right. I think this one would be simply enough to expose one or two > > functions in core too (and probably would not need an extra module). > > +1 for functions in core unless this extra module > pg_logicalsnapinspect works as a tool to be helpful even when the > server is down. > We have an example of pageinspect which provides low-level functions to aid debugging. The proposal for these APIs also seems to fall in the same category, so why go for the core for these functions? -- With Regards, Amit Kapila.
Hi, On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote: > On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Yeah that's fair. And now I'm wondering if we need an extra module. I think > > > we could "simply" expose 2 new functions in core, thoughts? > > > > > > > > What do you think? Did you have something else in mind? > > > > > > > > > > > > > On similar lines, we can also provide a function to get the slot's > > > > on-disk data. > > > > > > Yeah, having a way to expose the data from the disk makes fully sense to me. > > > > > > > IIRC, Bharath had previously proposed a tool to achieve > > > > the same. It is fine if we don't want to add that as part of this > > > > patch but I mentioned it because by having that we can have a set of > > > > functions to view logical decoding data. > > > > > > That's right. I think this one would be simply enough to expose one or two > > > functions in core too (and probably would not need an extra module). > > > > +1 for functions in core unless this extra module > > pg_logicalsnapinspect works as a tool to be helpful even when the > > server is down. > > > > We have an example of pageinspect which provides low-level functions > to aid debugging. The proposal for these APIs also seems to fall in > the same category, That's right, but... > so why go for the core for these functions? as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public and to create/expose 2 new functions in snapbuild.c then the functions in the module would do nothing but expose the data coming from the snapbuild.c's functions (get the tuple and send it to the client). That sounds weird to me to create a module that would "only" do so, that's why I thought that in core functions taking care of everything make more sense. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote: > > On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > Yeah that's fair. And now I'm wondering if we need an extra module. I think > > > > we could "simply" expose 2 new functions in core, thoughts? > > > > > > > > > > What do you think? Did you have something else in mind? > > > > > > > > > > > > > > > > On similar lines, we can also provide a function to get the slot's > > > > > on-disk data. > > > > > > > > Yeah, having a way to expose the data from the disk makes fully sense to me. > > > > > > > > > IIRC, Bharath had previously proposed a tool to achieve > > > > > the same. It is fine if we don't want to add that as part of this > > > > > patch but I mentioned it because by having that we can have a set of > > > > > functions to view logical decoding data. > > > > > > > > That's right. I think this one would be simply enough to expose one or two > > > > functions in core too (and probably would not need an extra module). > > > > > > +1 for functions in core unless this extra module > > > pg_logicalsnapinspect works as a tool to be helpful even when the > > > server is down. > > > > > > > We have an example of pageinspect which provides low-level functions > > to aid debugging. The proposal for these APIs also seems to fall in > > the same category, > > That's right, but... > > > so why go for the core for these functions? > > as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public > and to create/expose 2 new functions in snapbuild.c then the functions in the > module would do nothing but expose the data coming from the snapbuild.c's > functions (get the tuple and send it to the client). That sounds weird to me to > create a module that would "only" do so, that's why I thought that in core > functions taking care of everything make more sense. > I see your point. Does anyone else have an opinion on the need for these functions and whether to expose them from a contrib module or have them as core functions? -- With Regards, Amit Kapila.
Hi, On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote: > On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public > > and to create/expose 2 new functions in snapbuild.c then the functions in the > > module would do nothing but expose the data coming from the snapbuild.c's > > functions (get the tuple and send it to the client). That sounds weird to me to > > create a module that would "only" do so, that's why I thought that in core > > functions taking care of everything make more sense. > > > > I see your point. Does anyone else have an opinion on the need for > these functions and whether to expose them from a contrib module or > have them as core functions? I looked at when the SNAPBUILD_VERSION has been changed: ec5896aed3 (2014) a975ff4980 (2021) 8bdb1332eb (2021) 7f13ac8123 (2022) bb19b70081 (2024) So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that frequently. Furthermore, those structs are serialized and so we have to preserve their on-disk compatibility (means we can change them only in a major release if we need to). So, I think it would not be that much of an issue to expose those structs and create a new contrib module (as v1 did propose) instead of in core new functions. If we want to insist that external modules "should" not rely on those structs then we could put them into a new internal_snapbuild.h file (instead of snapbuild.h as proposed in v1). At the end, I think that creating a contrib module and exposing those structs in internal_snapbuild.h make more sense (as compared to in core functions). Thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote: > > On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public > > > and to create/expose 2 new functions in snapbuild.c then the functions in the > > > module would do nothing but expose the data coming from the snapbuild.c's > > > functions (get the tuple and send it to the client). That sounds weird to me to > > > create a module that would "only" do so, that's why I thought that in core > > > functions taking care of everything make more sense. > > > > > > > I see your point. Does anyone else have an opinion on the need for > > these functions and whether to expose them from a contrib module or > > have them as core functions? > > I looked at when the SNAPBUILD_VERSION has been changed: > > ec5896aed3 (2014) > a975ff4980 (2021) > 8bdb1332eb (2021) > 7f13ac8123 (2022) > bb19b70081 (2024) > > So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that > frequently. Furthermore, those structs are serialized and so we have to preserve > their on-disk compatibility (means we can change them only in a major release > if we need to). > > So, I think it would not be that much of an issue to expose those structs and > create a new contrib module (as v1 did propose) instead of in core new functions. > > If we want to insist that external modules "should" not rely on those structs then > we could put them into a new internal_snapbuild.h file (instead of snapbuild.h > as proposed in v1). > Adding snapbuild_internal.h sounds like a good idea. > At the end, I think that creating a contrib module and exposing those structs in > internal_snapbuild.h make more sense (as compared to in core functions). > Fail enough. We can keep the module name as logicalinspect so that we can extend it for other logical decoding/replication-related files in the future. -- With Regards, Amit Kapila.
On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Yeah, good idea. Done that way in v3 attached. > Thanks for the patch. +1 on the patch's idea. I have started reviewing/testing it. It is WIP but please find few initial comments: src/backend/replication/logical/snapbuild.c: 1) + fsync_fname("pg_logical/snapshots", true); Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in ValidateSnapshotFile(), instead of hard coding the path 2) Same as above in pg_get_logical_snapshot_meta() and pg_get_logical_snapshot_info() + sprintf(path, "pg_logical/snapshots/%X-%X.snap", + LSN_FORMAT_ARGS(lsn)); LSN_FORMAT_ARGS(lsn)); 3) +#include "replication/internal_snapbuild.h" Shall we name new file as 'snapbuild_internal.h' instead of 'internal_snapbuild.h'. Please see other files' name under './src/include/replication': worker_internal.h walsender_private.h 4) +static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk, + const char *path); Is it required? We generally don't add declaration unless required by compiler. Since definition is prior to usage, it is not needed? thanks Shveta
On Mon, Sep 16, 2024 at 8:03 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote: > > On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > > Yeah, good idea. Done that way in v3 attached. > > > > > > > Thanks for the patch. +1 on the patch's idea. I have started > > reviewing/testing it. It is WIP but please find few initial comments: > > Thanks for sharing your thoughts and for the review! > > > > > src/backend/replication/logical/snapbuild.c: > > > > 1) > > + fsync_fname("pg_logical/snapshots", true); > > > > Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in > > ValidateSnapshotFile(), instead of hard coding the path > > > > > > 2) > > Same as above in pg_get_logical_snapshot_meta() and > > pg_get_logical_snapshot_info() > > > > + sprintf(path, "pg_logical/snapshots/%X-%X.snap", > > + LSN_FORMAT_ARGS(lsn)); LSN_FORMAT_ARGS(lsn)); > > > > Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf > which has been introduced after v1 of this patch. I thought I took care of it once > c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached, > Thanks! > > > 3) > > +#include "replication/internal_snapbuild.h" > > > > Shall we name new file as 'snapbuild_internal.h' instead of > > 'internal_snapbuild.h'. Please see other files' name under > > './src/include/replication': > > worker_internal.h > > walsender_private.h > > Agree, that should be snapbuild_internal.h, done in v4. > > > > > 4) > > +static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk, > > + const char *path); > > > > Is it required? We generally don't add declaration unless required by > > compiler. Since definition is prior to usage, it is not needed? > > > > I personally prefer to add them even if not required by the compiler. I did not > pay attention that "We generally don't add declaration unless required by compiler" > and (after a quick check) I did not find any reference in the coding style > documentation [1]. That said, I don't have a strong opinion about that and so > removed in v4. Worth to add a mention in the coding convention doc? > Okay. I was somehow under the impression that this is the way in the postgres i.e. not add redundant declarations. Will be good to know what others think on this. Thanks for addressing the comments. I have not started reviewing v4 yet, but here are few more comments on v3: 1) +#include "port/pg_crc32c.h" It is not needed in pg_logicalinspect.c as it is already included in internal_snapbuild.h 2) + values[0] = Int16GetDatum(ondisk.builder.state); ........ + values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot); + values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at); + values[10] = Int64GetDatum(ondisk.builder.committed.xcnt); We can have values[i++] in all the places and later we can check : Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS); Then we need not to keep track of number even in later part of code, as it goes till 14. 3) Similar change can be done here: + values[0] = Int32GetDatum(ondisk.magic); + values[1] = Int32GetDatum(ondisk.checksum); + values[2] = Int32GetDatum(ondisk.version); check at the end will be: Assert(i == PG_GET_LOGICAL_SNAPSHOT_META_COLS); 4) Most of the output columns in pg_get_logical_snapshot_info() look self-explanatory except 'state'. Should we have meaningful 'text' here corresponding to SnapBuildState? Similar to what we do for 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses for ReplicationSlotInvalidationCause) thanks Shveta
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote: > > Thanks for addressing the comments. I have not started reviewing v4 > yet, but here are few more comments on v3: > I just noticed that when we pass NULL input, both the new functions give 1 row as output, all cols as NULL: newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL); magic | checksum | version -------+----------+--------- | | (1 row) Similar behavior with pg_get_logical_snapshot_info(). While the existing 'pg_ls_logicalsnapdir' function gives this error, which looks more meaningful: newdb1=# select * from pg_ls_logicalsnapdir(NULL); ERROR: function pg_ls_logicalsnapdir(unknown) does not exist LINE 1: select * from pg_ls_logicalsnapdir(NULL); HINT: No function matches the given name and argument types. You might need to add explicit type casts. Shouldn't the new functions have same behavior? thanks Shveta
On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Thanks for addressing the comments. I have not started reviewing v4
> yet, but here are few more comments on v3:
>
I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:
newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |
(1 row)
Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:
newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.
Shouldn't the new functions have same behavior?
No. Since the name pg_ls_logicalsnapdir has zero single-argument implementations passing a null value as an argument is indeed attempt to invoke a function signature that doesn’t exist.
If there is exactly one single input argument function of the given name the parser is going to cast the null literal to the data type of the single argument and invoke the function. It will not and cannot be convinced to fail to find a matching function.
I can see an argument that they should produce an empty set instead of a single all-null row, but the idea that they wouldn’t even be found is contrary to a core design of the system.
David J.
Hi, On Mon, Sep 16, 2024 at 10:16:16PM -0700, David G. Johnston wrote: > On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote: > > > On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> > > wrote: > > > > > > Thanks for addressing the comments. I have not started reviewing v4 > > > yet, but here are few more comments on v3: > > > > > > > I just noticed that when we pass NULL input, both the new functions > > give 1 row as output, all cols as NULL: > > > > newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL); > > magic | checksum | version > > -------+----------+--------- > > | | > > > > (1 row) > > > > Similar behavior with pg_get_logical_snapshot_info(). While the > > existing 'pg_ls_logicalsnapdir' function gives this error, which looks > > more meaningful: > > > > newdb1=# select * from pg_ls_logicalsnapdir(NULL); > > ERROR: function pg_ls_logicalsnapdir(unknown) does not exist > > LINE 1: select * from pg_ls_logicalsnapdir(NULL); > > HINT: No function matches the given name and argument types. You > > might need to add explicit type casts. > > > > > > Shouldn't the new functions have same behavior? > > > > No. Since the name pg_ls_logicalsnapdir has zero single-argument > implementations passing a null value as an argument is indeed attempt to > invoke a function signature that doesn’t exist. Agree. > I can see an argument that they should produce an empty set instead of a > single all-null row, Yeah, it's outside the scope of this patch but I've seen different behavior in this area. For example: postgres=# select * from pg_ls_replslotdir(NULL); name | size | modification ------+------+-------------- (0 rows) as compared to: postgres=# select * from pg_walfile_name_offset(NULL); file_name | file_offset -----------+------------- | (1 row) I thought that it might be linked to the volatility but it is not: postgres=# select * from pg_stat_get_xact_blocks_fetched(NULL); pg_stat_get_xact_blocks_fetched --------------------------------- (1 row) postgres=# select * from pg_get_multixact_members(NULL); xid | mode -----+------ (0 rows) while both are volatile. I think both make sense: It's "empty" or we "don't know the values of the fields". I don't know if there is any reason about this "inconsistency". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Sep 17, 2024 at 10:46 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > > > On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote: >> >> On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote: >> > >> > Thanks for addressing the comments. I have not started reviewing v4 >> > yet, but here are few more comments on v3: >> > >> >> I just noticed that when we pass NULL input, both the new functions >> give 1 row as output, all cols as NULL: >> >> newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL); >> magic | checksum | version >> -------+----------+--------- >> | | >> >> (1 row) >> >> Similar behavior with pg_get_logical_snapshot_info(). While the >> existing 'pg_ls_logicalsnapdir' function gives this error, which looks >> more meaningful: >> >> newdb1=# select * from pg_ls_logicalsnapdir(NULL); >> ERROR: function pg_ls_logicalsnapdir(unknown) does not exist >> LINE 1: select * from pg_ls_logicalsnapdir(NULL); >> HINT: No function matches the given name and argument types. You >> might need to add explicit type casts. >> >> >> Shouldn't the new functions have same behavior? > > > No. Since the name pg_ls_logicalsnapdir has zero single-argument implementations passing a null value as an argument isindeed attempt to invoke a function signature that doesn’t exist. > > If there is exactly one single input argument function of the given name the parser is going to cast the null literal tothe data type of the single argument and invoke the function. It will not and cannot be convinced to fail to find a matchingfunction. Okay, understood. Thanks for explaining. > I can see an argument that they should produce an empty set instead of a single all-null row, but the idea that they wouldn’teven be found is contrary to a core design of the system. Okay, a single row can be investigated if it comes under this scope. But I see why 'ERROR' is not a possibility here. thanks Shveta
On Tue, Sep 17, 2024 at 12:44 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote: > > Thanks for addressing the comments. I have not started reviewing v4 > > yet, but here are few more comments on v3: > > > > 1) > > +#include "port/pg_crc32c.h" > > > > It is not needed in pg_logicalinspect.c as it is already included in > > internal_snapbuild.h > > Yeap, forgot to remove that one when creating the new "internal".h file, done > in v5 attached, thanks! > > > > > 2) > > + values[0] = Int16GetDatum(ondisk.builder.state); > > ........ > > + values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot); > > + values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at); > > + values[10] = Int64GetDatum(ondisk.builder.committed.xcnt); > > > > We can have values[i++] in all the places and later we can check : > > Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS); > > Then we need not to keep track of number even in later part of code, > > as it goes till 14. > > Right, let's do it that way (as it is done in pg_walinspect for example). > > > 4) > > Most of the output columns in pg_get_logical_snapshot_info() look > > self-explanatory except 'state'. Should we have meaningful 'text' here > > corresponding to SnapBuildState? Similar to what we do for > > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses > > for ReplicationSlotInvalidationCause) > > Yeah we could. I was not sure about that (and that was my first remark in [1]) > , as the module is mainly for debugging purpose, I was thinking that the one > using it could refer to "snapbuild.h". Let's see what others think. > okay, makes sense. lets wait what others have to say. Thanks for the patch. Few trivial things: 1) May be we shall change 'INTERNAL_SNAPBUILD_H' in snapbuild_internal.h to 'SNAPBUILD_INTERNAL_H'? 2) ValidateSnapshotFile() It is not only validating, but loading the content as well. So may be we can rename to ValidateAndRestoreSnapshotFile? 3) sgml: a) + The pg_logicalinspect functions are called using an LSN argument that can be extracted from the output name of the pg_ls_logicalsnapdir() function. Is it possible to give link to pg_ls_logicalsnapdir function here? b) + Gets logical snapshot metadata about a snapshot file that is located in the pg_logical/snapshots directory. located in server's pg_logical/snapshots directory (i.e. use server keyword, similar to how pg_ls_logicalsnapdir , pg_ls_logicalmapdir explains it) thanks Shveta
On Tue, Sep 17, 2024 at 12:44 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote: > > Thanks for addressing the comments. I have not started reviewing v4 > > yet, but here are few more comments on v3: > > > > > 4) > > Most of the output columns in pg_get_logical_snapshot_info() look > > self-explanatory except 'state'. Should we have meaningful 'text' here > > corresponding to SnapBuildState? Similar to what we do for > > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses > > for ReplicationSlotInvalidationCause) > > Yeah we could. I was not sure about that (and that was my first remark in [1]) > , as the module is mainly for debugging purpose, I was thinking that the one > using it could refer to "snapbuild.h". Let's see what others think. > Displaying the 'text' for the state column makes it easy to understand. So, +1 for doing it that way. -- With Regards, Amit Kapila.
On Thu, Sep 19, 2024 at 12:17 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > Thanks for the review! > Thanks for the patch. Should we include in the document who can execute these functions and the required access permissions, similar to how it's done for pgwalinspect, pg_ls_logicalmapdir(), and other such functions? thanks Shveta
On Fri, Sep 20, 2024 at 12:22 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Please find attached v8, that: > Thank You for the patch. In one of my tests, I noticed that I got negative checksum: postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/3481F20'); magic | checksum | version ------------+------------+--------- 1369563137 | -266346460 | 6 But pg_crc32c is uint32. Is it because we are getting it as Int32GetDatum(ondisk.checksum) in pg_get_logical_snapshot_meta()? Instead should it be UInt32GetDatum? Same goes for below: values[i++] = Int32GetDatum(ondisk.magic); values[i++] = Int32GetDatum(ondisk.magic); We need to recheck the rest of the fields in the info() function as well. thanks Shveta
On Mon, Sep 23, 2024 at 7:57 AM Peter Smith <smithpb2250@gmail.com> wrote: > > My review comments for v8-0001 > > ====== > contrib/pg_logicalinspect/pg_logicalinspect.c > > 1. > +/* > + * Lookup table for SnapBuildState. > + */ > + > +#define SNAPBUILD_STATE_INCR 1 > + > +static const char *const SnapBuildStateDesc[] = { > + [SNAPBUILD_START + SNAPBUILD_STATE_INCR] = "start", > + [SNAPBUILD_BUILDING_SNAPSHOT + SNAPBUILD_STATE_INCR] = "building", > + [SNAPBUILD_FULL_SNAPSHOT + SNAPBUILD_STATE_INCR] = "full", > + [SNAPBUILD_CONSISTENT + SNAPBUILD_STATE_INCR] = "consistent", > +}; > + > +/* > > nit - the SNAPBUILD_STATE_INCR made this code appear more complicated > than it is. I agree. > Please take a look at the attachment for an alternative > implementation which includes an explanatory comment. YMMV. Feel free > to ignore it. > +1. I find Peter's version with comments easier to understand. thanks Shveta
Hi, On Tue, Sep 24, 2024 at 09:15:31AM +0530, shveta malik wrote: > On Fri, Sep 20, 2024 at 12:22 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Please find attached v8, that: > > > > Thank You for the patch. In one of my tests, I noticed that I got > negative checksum: > > postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/3481F20'); > magic | checksum | version > ------------+------------+--------- > 1369563137 | -266346460 | 6 > > But pg_crc32c is uint32. Is it because we are getting it as > Int32GetDatum(ondisk.checksum) in pg_get_logical_snapshot_meta()? > Instead should it be UInt32GetDatum? Thanks for the testing. As the checksum could be > 2^31 - 1, then v9 (just shared up-thread) changes it to an int8 in the pg_logicalinspect--1.0.sql file. So, to avoid CI failure on the 32bit build, then v9 is using Int64GetDatum() instead of UInt32GetDatum(). > Same goes for below: > values[i++] = Int32GetDatum(ondisk.magic); > values[i++] = Int32GetDatum(ondisk.magic); The 2 others field (magic and version) are unlikely to be > 2^31 - 1, so v9 is making use of UInt32GetDatum() and keep int4 in the sql file. > We need to recheck the rest of the fields in the info() function as well. I think that the pg_get_logical_snapshot_info()'s fields are ok (I did spend some time to debug CI failing on the 32bit build for some on them before submitting v1). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com