Thread: pg_stat_wal: tracking the compression effect

pg_stat_wal: tracking the compression effect

From
Ken Kato
Date:
Hi hackers,

We can specify a compression method (for example, lz4 or zstd), but it is
hard to know how effective compression is with each method. There is
already a way to see the compression effect using pg_waldump. However,
having these statistics in a view would make them more accessible. I am
proposing to add statistics that keep track of the compression effect to
the pg_stat_wal view.

The design I have in mind is below:

 compression_saved | compression_times
-------------------+-------------------
             38741 |                 6


The idea is to accumulate how much space each compression saves
(size before compression - size after compression) and to keep
track of how many times compression has happened, so that one can know
how much space is saved on average.
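
As a rough illustration of how the two counters would be read together, here is a minimal sketch in Python. The function name average_saved_bytes is hypothetical and only for illustration; the inputs are the example values from the table above.

```python
def average_saved_bytes(compression_saved: int, compression_times: int) -> float:
    """Average bytes saved per compression; 0.0 if nothing was compressed yet."""
    if compression_times == 0:
        # Avoid dividing by zero right after a statistics reset.
        return 0.0
    return compression_saved / compression_times


# Example values taken from the proposed view output.
avg = average_saved_bytes(38741, 6)
print(f"{avg:.1f} bytes saved per compression")  # prints "6456.8 bytes saved per compression"
```

The zero check matters because both counters would start at 0 after a reset.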

What do you think?

Regards,

-- 
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: pg_stat_wal: tracking the compression effect

From
Kyotaro Horiguchi
Date:
At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in 
> Accumulating the values, which indicates how much space is saved by
> each compression (size before compression - size after compression),
> and keep track of how many times compression has happened. So that one
> can know how much space is saved on average.

Honestly, I don't think it's very useful.
How about adding them to pg_waldump and pg_walinspect instead?

# It further widens the output of pg_waldump, though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pg_stat_wal: tracking the compression effect

From
Kyotaro Horiguchi
Date:
At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in 
> > Accumulating the values, which indicates how much space is saved by
> > each compression (size before compression - size after compression),
> > and keep track of how many times compression has happened. So that one
> > can know how much space is saved on average.
> 
> Honestly, I don't think its useful much.
> How about adding them to pg_waldump and pg_walinspect instead?
> 
> # It further widens the output of pg_waldump, though..

Sorry, that was apparently too short.

I know you can already see that in the per-record output of pg_waldump,
but maybe we need a summary of saved bytes in the "pg_waldump -b -z"
output and in the corresponding output of pg_walinspect.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pg_stat_wal: tracking the compression effect

From
Andrey Borodin
Date:

> On 25 Aug 2022, at 12:04, Ken Kato <katouknl@oss.nttdata.com> wrote:
>
> What do you think?

I think users will need to choose between LZ4 and Zstd, so they need to know the tradeoff: compression ratio vs. CPU
time spent per page (or any other segment).

I know that Zstd must be kind of "better", but I doubt it has enough runway on one block to show off. If only we could
persist the compression context across many pages...
The compression ratio may differ between workloads, so a system view or something similar could be of use.
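
To make the point about persisting a compression context more concrete, here is a minimal sketch in Python. It is not PostgreSQL code: zlib from the standard library stands in for LZ4/Zstd, the 8 kB pages are synthetic (half shared bytes, half unique random bytes), and the helper names make_pages, size_per_page and size_shared_context are made up for illustration. It only compares the total output size when each page is compressed on its own against the size when one compressor keeps its history across pages.

```python
import random
import zlib

PAGE_SIZE = 8192  # assumed block size for this sketch


def make_pages(count: int, seed: int = 0) -> list[bytes]:
    """Build synthetic pages: a common random half plus a unique random half."""
    rng = random.Random(seed)
    common = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE // 2))
    pages = []
    for _ in range(count):
        unique = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE - len(common)))
        pages.append(common + unique)
    return pages


def size_per_page(pages: list[bytes]) -> int:
    """Total compressed size when every page gets a fresh compressor."""
    return sum(len(zlib.compress(page)) for page in pages)


def size_shared_context(pages: list[bytes]) -> int:
    """Total compressed size when one compressor sees all pages in order."""
    comp = zlib.compressobj()
    total = 0
    for page in pages:
        total += len(comp.compress(page))
        # Z_SYNC_FLUSH emits everything for this page but keeps the history window.
        total += len(comp.flush(zlib.Z_SYNC_FLUSH))
    total += len(comp.flush())
    return total


pages = make_pages(8)
independent = size_per_page(pages)
shared = size_shared_context(pages)
print(f"independent: {independent} bytes, shared context: {shared} bytes")
```

On this synthetic input the shared-context total comes out smaller, because later pages can refer back to bytes already seen in earlier ones. Real WAL pages and real LZ4/Zstd behavior may look quite different, so this says nothing about actual ratios.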

Thanks!

Best regards, Andrey Borodin.


Re: pg_stat_wal: tracking the compression effect

From
Bharath Rupireddy
Date:
On Fri, Aug 26, 2022 at 8:39 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
>
> At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
> > At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in
> > > Accumulating the values, which indicates how much space is saved by
> > > each compression (size before compression - size after compression),
> > > and keep track of how many times compression has happened. So that one
> > > can know how much space is saved on average.
> >
> > Honestly, I don't think its useful much.
> > How about adding them to pg_waldump and pg_walinspect instead?
> >
> > # It further widens the output of pg_waldump, though..
>
> Sorry, that was apparently too short.
>
> I know you already see that in per-record output of pg_waldump, but
> maybe we need the summary of saved bytes in "pg_waldump -b -z" output
> and the corresponding output of pg_walinspect.

+1 for adding compression stats such as type and saved bytes to
pg_waldump and pg_walinspect, given that the WAL records already have
the saved bytes info. Collecting them in the server via pg_stat_wal
would require some extra effort; for instance, that code would have to
be executed on every WAL record insert. When users want to analyze the
compression efforts, they can use either pg_walinspect or pg_waldump
and change the compression type if required.
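
As a sketch of the kind of summary meant here, the per-record compression info could be rolled up per method along these lines (Python). BlockImage and summarize are hypothetical names, and the sample numbers are invented for illustration; this is not the pg_waldump or pg_walinspect implementation, and it does not parse their real output.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class BlockImage(NamedTuple):
    """One compressed full-page image as it might be reported per record."""
    method: str       # compression method name, e.g. "lz4" or "zstd"
    saved_bytes: int  # size before compression - size after compression


def summarize(images: Iterable[BlockImage]) -> dict[str, tuple[int, int]]:
    """Return {method: (number of compressed images, total saved bytes)}."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for image in images:
        totals[image.method][0] += 1
        totals[image.method][1] += image.saved_bytes
    return {method: (count, saved) for method, (count, saved) in totals.items()}


# Invented sample data, only to show the shape of the summary.
sample = [BlockImage("lz4", 5000), BlockImage("lz4", 6200), BlockImage("zstd", 6900)]
for method, (count, saved) in summarize(sample).items():
    print(f"{method}: {count} images, {saved} bytes saved")
```

Since the roll-up works purely on data already present in the WAL records, nothing extra has to run on the insert path.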

-- 
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/



Re: pg_stat_wal: tracking the compression effect

From
Ken Kato
Date:
On 2022-08-27 16:48, Bharath Rupireddy wrote:
> On Fri, Aug 26, 2022 at 8:39 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
>> 
>> At Fri, 26 Aug 2022 11:55:27 +0900 (JST), Kyotaro Horiguchi 
>> <horikyota.ntt@gmail.com> wrote in
>> > At Thu, 25 Aug 2022 16:04:50 +0900, Ken Kato <katouknl@oss.nttdata.com> wrote in
>> > > Accumulating the values, which indicates how much space is saved by
>> > > each compression (size before compression - size after compression),
>> > > and keep track of how many times compression has happened. So that one
>> > > can know how much space is saved on average.
>> >
>> > Honestly, I don't think its useful much.
>> > How about adding them to pg_waldump and pg_walinspect instead?
>> >
>> > # It further widens the output of pg_waldump, though..
>> 
>> Sorry, that was apparently too short.
>> 
>> I know you already see that in per-record output of pg_waldump, but
>> maybe we need the summary of saved bytes in "pg_waldump -b -z" output
>> and the corresponding output of pg_walinspect.
> 
> +1 for adding compression stats such as type and saved bytes to
> pg_waldump and pg_walinspect given that the WAL records already have
> the saved bytes info. Collecting them in the server via pg_stat_wal
> will require some extra effort, for instance, every WAL record insert
> requires that code to be executed. When users want to analyze the
> compression efforts they can either use pg_walinspect or pg_waldump
> and change the compression type if required.

Thank you for all the comments!

I will go with adding the compression stats to pg_waldump and
pg_walinspect.

Regards,
-- 
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION