Thread: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
Hi,

At times, users want to know what are the files (snapshot and mapping
files) that are available under pg_logical directory and also the
spill files that are under pg_replslot directory and how much space
they occupy. This will help to better know the storage usage pattern
of these directories. Can we have two new functions pg_ls_logicaldir
and pg_ls_replslotdir on the similar lines of pg_ls_logdir,
pg_ls_logdir,pg_ls_tmpdir, pg_ls_archive_statusdir [1]?

[1] - https://www.postgresql.org/docs/devel/functions-admin.html

Regards,
Bharath Rupireddy.



On Fri, Oct 8, 2021 at 4:39 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Hi,
>
> At times, users want to know what are the files (snapshot and mapping
> files) that are available under pg_logical directory and also the
> spill files that are under pg_replslot directory and how much space
> they occupy.
>

Why can't you use pg_ls_dir to see the contents of pg_replslot? To
know the space taken by spilling, you might want to check
pg_stat_replication_slots[1] as that gives information about
spill_bytes.

[1] - https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-SLOTS-VIEW

-- 
With Regards,
Amit Kapila.



Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
On Fri, Oct 22, 2021 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 8, 2021 at 4:39 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hi,
> >
> > At times, users want to know what are the files (snapshot and mapping
> > files) that are available under pg_logical directory and also the
> > spill files that are under pg_replslot directory and how much space
> > they occupy.
> >
>
> Why can't you use pg_ls_dir to see the contents of pg_replslot? To
> know the space taken by spilling, you might want to check
> pg_stat_replication_slots[1] as that gives information about
> spill_bytes.
>
> [1] - https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-SLOTS-VIEW

Thanks Amit!

pg_ls_dir gives the list of directories and files, but not their
sizes. And it looks like the spill_bytes from
pg_stat_replication_slots is the accumulated byte count (see [1]), not
the current size of the spills files, so it's not representing the
spill files and their size at that moment.

If we have  pg_ls_logicaldir and pg_ls_replslotdir returning the
files, szies, and last modified times, it will be useful in production
environments to see the disk usage of those files at the current
moment. The data from these functions can be fed to an external
analytics tool invoking the functions at regular intervals of time and
report the disk usage of these folders. This will be super useful to
analyze the questions like: Was the disk usage more at time t1? What
happened to my database system at that time? etc. And,  these
functions can run independent of the stats collector process which is
currently required for the pg_stat_replication_slots view.

Thoughts?

I plan to work on a patch if okay.

[1]
postgres=# select
pg_ls_dir('/home/bharath/postgres/inst/bin/data/pg_replslot/mysub');
 pg_ls_dir
-----------
 state
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
stream_count | stream_bytes | total_txns | total_bytes | stats_reset

-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+-------------
 mysub     |          3 |           6 |   396000000 |           0 |
        0 |            0 |          5 |   396001128 |
(1 row)

Regards,
Bharath Rupireddy.



Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
On Fri, Oct 22, 2021 at 9:26 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> I concluded that it's better to add a function to list metadata of an arbitrary
> dir, rather than adding more functions to handle specific, hardcoded dirs:
> https://www.postgresql.org/message-id/flat/20191227170220.GE12890@telsasoft.com

I just had a quick look at the pg_ls_dir_metadata() patch(I didn't
look at the other patches). While it's a good idea to have a single
function for all the PGDATA directories, I'm not sure if one would
ever need the info like type, change, creation path etc. If we do
this, the function will become the linux equivalent command. I don't
see the difference between modification and change time stamps. For
debugging or analytical purposes in production environments, one would
majorly look at the file name, it's size on the disk, modification
time (to decide whether the file is stale or not, creation time (to
decide how old is the file),  file/directory(maybe?). I'm not sure if
your patch has a recursive option for pg_ls_dir_metadata(), if it has,
I think it's more complex from a usability perspective.

And the functions like pg_ls_tmpdir, pg_ls_tmpdir, pg_ls_waldir etc.
(existing) and pg_ls_logicaldir, pg_ls_replslotdir (yet to have) will
provide the better usability compared to a generic function. Having
said this, I don't oppose having a generic function returning the file
name, file size, modification time, creation time, but not other info,
please. If one is interested in knowing the other information file
type, path etc. they can go run linux/windows/OS commands.

In summary what I think at this point is:
1) pg_ls_logicaldir, pg_ls_replslotdir - better for usability and
serving special purpose like their peers
2) modify pg_ls_dir such that it returns the file name, file size,
modification time, creation time, for directories, to be simple, it
shouldn't go recursively over all the directories, it should just
return the directory name, size, modification time, creation time.

If okay, I'm ready to spend time implementing them.

Thoughts?

Regards,
Bharath Rupireddy.



Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
On Sat, Oct 23, 2021 at 11:10 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Fri, Oct 22, 2021 at 9:26 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > I concluded that it's better to add a function to list metadata of an arbitrary
> > dir, rather than adding more functions to handle specific, hardcoded dirs:
> > https://www.postgresql.org/message-id/flat/20191227170220.GE12890@telsasoft.com
>
> I just had a quick look at the pg_ls_dir_metadata() patch(I didn't
> look at the other patches). While it's a good idea to have a single
> function for all the PGDATA directories, I'm not sure if one would
> ever need the info like type, change, creation path etc. If we do
> this, the function will become the linux equivalent command. I don't
> see the difference between modification and change time stamps. For
> debugging or analytical purposes in production environments, one would
> majorly look at the file name, it's size on the disk, modification
> time (to decide whether the file is stale or not, creation time (to
> decide how old is the file),  file/directory(maybe?). I'm not sure if
> your patch has a recursive option for pg_ls_dir_metadata(), if it has,
> I think it's more complex from a usability perspective.
>
> And the functions like pg_ls_tmpdir, pg_ls_tmpdir, pg_ls_waldir etc.
> (existing) and pg_ls_logicaldir, pg_ls_replslotdir (yet to have) will
> provide the better usability compared to a generic function. Having
> said this, I don't oppose having a generic function returning the file
> name, file size, modification time, creation time, but not other info,
> please. If one is interested in knowing the other information file
> type, path etc. they can go run linux/windows/OS commands.
>
> In summary what I think at this point is:
> 1) pg_ls_logicaldir, pg_ls_replslotdir - better for usability and
> serving special purpose like their peers

I've added 3 functions pg_ls_logicalsnapdir, pg_ls_logicalmapdir,
pg_ls_replslotdir, and attached the patch. The sample output looks
like [1]. Please review it further.

Here's the CF entry - https://commitfest.postgresql.org/35/3390/

[1]
postgres=# select pg_ls_logicalsnapdir();
             pg_ls_logicalsnapdir
-----------------------------------------------
 (0-14A50C0.snap,128,"2021-10-30 09:15:56+00")
 (0-14C46D8.snap,128,"2021-10-30 09:16:05+00")
 (0-14C97C8.snap,132,"2021-10-30 09:16:20+00")

postgres=# select pg_ls_logicalmapdir();
                      pg_ls_logicalmapdir
---------------------------------------------------------------
 (map-31d5-4eb-0_CDDDE88-2d9-2db,108,"2021-10-30 09:24:34+00")
 (map-31d5-4eb-0_CDDDE88-2da-2db,108,"2021-10-30 09:24:34+00")
 (map-31d5-4eb-0_CE48038-2dc-2de,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE6BAF0-2dd-2df,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CD97DE0-2d9-2d9,36,"2021-10-30 09:24:30+00")
 (map-31d5-4eb-0_CE24808-2da-2dd,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE01200-2dc-2dc,36,"2021-10-30 09:24:34+00")
 (map-31d5-4eb-0_CDDDE88-2db-2db,36,"2021-10-30 09:24:34+00")
 (map-31d5-4eb-0_CE6BAF0-2dc-2df,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CDBA920-2d9-2da,108,"2021-10-30 09:24:32+00")
 (map-31d5-4eb-0_CE01200-2da-2dc,108,"2021-10-30 09:24:34+00")
 (map-31d5-4eb-0_CE6BAF0-2d9-2df,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE24808-2db-2dd,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE6BAF0-2db-2df,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE24808-2dd-2dd,36,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CE24808-2dc-2dd,108,"2021-10-30 09:24:35+00")
 (map-31d5-4eb-0_CD74E48-2d8-2d8,36,"2021-10-30 09:24:25+00")
 (map-31d5-4eb-0_CE24808-2d9-2dd,108,"2021-10-30 09:24:35+00")

 postgres=# select pg_ls_replslotdir('mysub');
                        pg_ls_replslotdir
-----------------------------------------------------------------
 (xid-722-lsn-0-2000000.spill,36592640,"2021-10-30 09:18:29+00")
 (xid-722-lsn-0-5000000.spill,4577860,"2021-10-30 09:18:32+00")
 (state,200,"2021-10-30 09:18:25+00")
 (xid-722-lsn-0-1000000.spill,25644220,"2021-10-30 09:18:29+00")
 (xid-722-lsn-0-4000000.spill,36592640,"2021-10-30 09:18:32+00")
 (xid-722-lsn-0-3000000.spill,36592640,"2021-10-30 09:18:32+00")

Regards,
Bharath Rupireddy.

Attachment

Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
"Bossart, Nathan"
Date:
On 10/30/21, 2:36 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
> I've added 3 functions pg_ls_logicalsnapdir, pg_ls_logicalmapdir,
> pg_ls_replslotdir, and attached the patch. The sample output looks
> like [1]. Please review it further.

I took a look at the patch.

+    char        path[MAXPGPATH + 11];

Why are you adding 11 to MAXPGPATH here?  I would think that MAXPGPATH
is sufficient.

+    filename = text_to_cstring(filename_t);
+    snprintf(path, sizeof(path), "%s/%s", "pg_replslot", filename);
+    return pg_ls_dir_files(fcinfo, path, false);

I think we need to do some additional input validation here.  It's
pretty easy to use this to see the contents of other directories.

        postgres=# SELECT * FROM pg_ls_replslotdir('../');
                 name         | size  |      modification
        ----------------------+-------+------------------------
         postgresql.conf      | 28995 | 2021-11-17 18:40:33+00
         pg_hba.conf          |  4789 | 2021-11-17 18:40:33+00
         postmaster.opts      |    39 | 2021-11-17 18:43:07+00
         postgresql.auto.conf |    88 | 2021-11-17 18:40:33+00
         pg_ident.conf        |  1636 | 2021-11-17 18:40:33+00
         postmaster.pid       |    95 | 2021-11-17 18:43:07+00
         PG_VERSION           |     3 | 2021-11-17 18:40:33+00
        (7 rows)

Nathan


Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
On Thu, Nov 18, 2021 at 12:16 AM Bossart, Nathan <bossartn@amazon.com> wrote:
>
> On 10/30/21, 2:36 AM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
> > I've added 3 functions pg_ls_logicalsnapdir, pg_ls_logicalmapdir,
> > pg_ls_replslotdir, and attached the patch. The sample output looks
> > like [1]. Please review it further.
>
> I took a look at the patch.
>
> +       char            path[MAXPGPATH + 11];
>
> Why are you adding 11 to MAXPGPATH here?  I would think that MAXPGPATH
> is sufficient.

Yeah, MAXPGPATH is sufficient. Note that the replication slot name be
at max NAMEDATALEN(64 bytes) size
(ReplicationSlotPersistentData->name) and what we pass to the
pg_ls_dir_files is
"pg_replslot/<<user_entered_slot_name_with_max_64_bytes>>", so it can
never cross MAXPGPATH (1024).

> +       filename = text_to_cstring(filename_t);
> +       snprintf(path, sizeof(path), "%s/%s", "pg_replslot", filename);
> +       return pg_ls_dir_files(fcinfo, path, false);
>
> I think we need to do some additional input validation here.  It's
> pretty easy to use this to see the contents of other directories.

Done. Checking if the entered slot exists or not, if not throwing an
error, see [1].

Please review the attached v2.

[1]
postgres=# select * from pg_ls_replslotdir('');
ERROR:  replication slot "" does not exist
postgres=# select * from pg_ls_replslotdir('../');
ERROR:  replication slot "../" does not exist
postgres=# select pg_ls_replslotdir('mysub');
                        pg_ls_replslotdir
-----------------------------------------------------------------
 (xid-722-lsn-0-2000000.spill,36592640,"2021-11-18 07:34:40+00")
 (xid-722-lsn-0-5000000.spill,36592640,"2021-11-18 07:34:43+00")
 (xid-722-lsn-0-A000000.spill,29910720,"2021-11-18 07:34:48+00")
 (xid-722-lsn-0-7000000.spill,36592640,"2021-11-18 07:34:45+00")
 (xid-722-lsn-0-9000000.spill,36592640,"2021-11-18 07:34:47+00")
 (state,200,"2021-11-18 07:34:36+00")
 (xid-722-lsn-0-8000000.spill,36592500,"2021-11-18 07:34:46+00")
 (xid-722-lsn-0-6000000.spill,36592640,"2021-11-18 07:34:44+00")
 (xid-722-lsn-0-1000000.spill,11171300,"2021-11-18 07:34:39+00")
 (xid-722-lsn-0-4000000.spill,36592500,"2021-11-18 07:34:42+00")
 (xid-722-lsn-0-3000000.spill,36592640,"2021-11-18 07:34:42+00")
(11 rows)

Regards,
Bharath Rupireddy.

Attachment

Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
"Bossart, Nathan"
Date:
On 11/17/21, 11:39 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
> Please review the attached v2.

LGTM.  I've marked this one as ready-for-committer.

Nathan


On Sat, Nov 20, 2021 at 12:29:51AM +0000, Bossart, Nathan wrote:
> On 11/17/21, 11:39 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
>> Please review the attached v2.
>
> LGTM.  I've marked this one as ready-for-committer.

One issue that I have with this patch is that there are zero
regression tests.  Could you add a couple of things in
misc_functions.sql (for the negative tests perhaps) or
contrib/test_decoding/, taking advantage of places where slots are
already created?  You may want to look after the non-superuser case
where the calls should fail, and the second case where a role is part
of pg_monitor where the call succeeds.  Note that any roles created in
the tests have to be prefixed with "regress_".

+   snprintf(path, sizeof(path), "%s/%s", "pg_replslot", slotname);
+   return pg_ls_dir_files(fcinfo, path, false);
"pg_replslot" could be part of the third argument here.  There is no
need to separate it.

+        ordinary file in the server's pg_logical/mappings directory.
Paths had better have <filename> markups around them, no?
--
Michael

Attachment

Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir

From
Bharath Rupireddy
Date:
On Sun, Nov 21, 2021 at 6:58 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Nov 20, 2021 at 12:29:51AM +0000, Bossart, Nathan wrote:
> > On 11/17/21, 11:39 PM, "Bharath Rupireddy" <bharath.rupireddyforpostgres@gmail.com> wrote:
> >> Please review the attached v2.
> >
> > LGTM.  I've marked this one as ready-for-committer.
>
> One issue that I have with this patch is that there are zero
> regression tests.  Could you add a couple of things in
> misc_functions.sql (for the negative tests perhaps) or
> contrib/test_decoding/, taking advantage of places where slots are
> already created?  You may want to look after the non-superuser case
> where the calls should fail, and the second case where a role is part
> of pg_monitor where the call succeeds.  Note that any roles created in
> the tests have to be prefixed with "regress_".

I don't think we need to go far to contrib/test_decoding/, even if we
add it there we can't test it for the outputs of these functions, so
I've added the tests in misc_functinos.sql itself.

> +   snprintf(path, sizeof(path), "%s/%s", "pg_replslot", slotname);
> +   return pg_ls_dir_files(fcinfo, path, false);
> "pg_replslot" could be part of the third argument here.  There is no
> need to separate it.

Done.

> +        ordinary file in the server's pg_logical/mappings directory.
> Paths had better have <filename> markups around them, no?

Done.

Attached v3 patch, please review it further.

Regards,
Bharath Rupireddy.

Attachment
On Sun, Nov 21, 2021 at 08:45:52AM +0530, Bharath Rupireddy wrote:
> I don't think we need to go far to contrib/test_decoding/, even if we
> add it there we can't test it for the outputs of these functions, so
> I've added the tests in misc_functinos.sql itself.

+SELECT COUNT(*) >= 0 AS OK FROM pg_ls_replslotdir('slot_dir_funcs');
+ ok
+----
+ t
+(1 row)
Creating a slot within the main regression test suite is something we
should avoid as it impacts the portability of the tests (note that we
don't have tests creating slots in src/test/regress/, and we'd require
max_replication_slots > 0 with this version of the patch).  This was
the point I was trying to make upthread about using test_decoding/
where we already have slots.

A second thing I have noticed is the set of OIDs used by the patch
which was incorrect.  On a development branch, we require new features
to use OIDs between 8000-9999 (unused_oids would recommend a random
range of them).  A third thing was that pg_proc.dat had an incorrect
description for pg_ls_replslotdir(), and that it was in need of
indentation.

I have tweaked a bit the tests and the docs, and the result looked
fine at the end.  Hence, applied.
--
Michael

Attachment