Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir
Date
Msg-id CALj2ACXbkyA0FL7kKsiU6fh+mFHvxx=YOz+L+8a0hhPWgO3SfQ@mail.gmail.com
Whole thread Raw
In response to Re: logical decoding/replication: new functions pg_ls_logicaldir and pg_ls_replslotdir  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Oct 22, 2021 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 8, 2021 at 4:39 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > Hi,
> >
> > At times, users want to know what are the files (snapshot and mapping
> > files) that are available under pg_logical directory and also the
> > spill files that are under pg_replslot directory and how much space
> > they occupy.
> >
>
> Why can't you use pg_ls_dir to see the contents of pg_replslot? To
> know the space taken by spilling, you might want to check
> pg_stat_replication_slots[1] as that gives information about
> spill_bytes.
>
> [1] - https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-SLOTS-VIEW

Thanks Amit!

pg_ls_dir gives the list of directories and files, but not their
sizes. And it looks like the spill_bytes from
pg_stat_replication_slots is the accumulated byte count (see [1]), not
the current size of the spills files, so it's not representing the
spill files and their size at that moment.

If we have  pg_ls_logicaldir and pg_ls_replslotdir returning the
files, szies, and last modified times, it will be useful in production
environments to see the disk usage of those files at the current
moment. The data from these functions can be fed to an external
analytics tool invoking the functions at regular intervals of time and
report the disk usage of these folders. This will be super useful to
analyze the questions like: Was the disk usage more at time t1? What
happened to my database system at that time? etc. And,  these
functions can run independent of the stats collector process which is
currently required for the pg_stat_replication_slots view.

Thoughts?

I plan to work on a patch if okay.

[1]
postgres=# select
pg_ls_dir('/home/bharath/postgres/inst/bin/data/pg_replslot/mysub');
 pg_ls_dir
-----------
 state
(1 row)

postgres=# select * from pg_stat_replication_slots;
 slot_name | spill_txns | spill_count | spill_bytes | stream_txns |
stream_count | stream_bytes | total_txns | total_bytes | stats_reset

-----------+------------+-------------+-------------+-------------+--------------+--------------+------------+-------------+-------------
 mysub     |          3 |           6 |   396000000 |           0 |
        0 |            0 |          5 |   396001128 |
(1 row)

Regards,
Bharath Rupireddy.



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: add retry mechanism for achieving recovery target before emitting FATA error "recovery ended before configured recovery target was reached"
Next
From: Zhihong Yu
Date:
Subject: Re: Multi-Column List Partitioning