Re: libpq compression - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: libpq compression
Date
Msg-id e1ab60c4-894e-04e2-b42d-3bb4e48a8256@postgrespro.ru
In response to Re: libpq compression  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers

On 05.11.2020 21:07, Matthias van de Meent wrote:
> On Thu, 5 Nov 2020 at 17:01, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Sorry, I do not understand your point.
>> This view reports network traffic from server's side.
>> But client's traffic information is a "mirror" of this statistic: server_tx=client_rx and vice versa.
>>
>> Yes, the first few bytes exchanged by client and server during the handshake are not compressed.
>> But they are correctly counted as "raw bytes". And certainly these few bytes cannot have any influence on
>> the measured average compression ratio (the main goal of using this network traffic statistic, from my point of view).
> As I understand it, the current metrics are as follows:
>
> Server
>   |<- |<- Xx_raw_bytes
>   |  Compression
>   |   |<- Xx_compressed_bytes
> Client connection
>   |
> Network
>
> From the view's name 'pg_stat_network_traffic', to me 'Xx_raw_bytes'
> would indicate the number of bytes sent/received over the client
> connection (e.g. measured between the Client connection and Network
> part, or between the Server/Client connection and Compression/Client
> connection sections), because that is my natural understanding of
> 'raw tx network traffic'. This is why I proposed 'logical' instead
> of 'raw', as 'raw' is quite apparently understood differently when
> interpreted by different people, whereas 'logical' already implies
> that the value is an application logic-determined value (e.g. size
> before compression).
>
> The current name implies a 'network' viewpoint when observing this
> view, not the 'server'/'backend' viewpoint you describe. If the
> 'server'/'backend' viewpoint is the desired default viewpoint, then
> I suggest renaming the view to `pg_stat_network_compression`, as
> that moves the focus to the compression used, and subsequently
> clarifies `raw` as the raw application command data.
>
> If instead the name `pg_stat_network_traffic` is kept, I suggest
> changing the metrics collected to the following scheme:
>
> Server
>   |<- |<- Xx_logical_bytes
>   |  Compression
>   |   |<- Xx_compressed_bytes (?)
>   |<- |<- Xx_raw_bytes
> Client connection
>   |
> Network
>
> This way, `raw` in the context of 'network_traffic' means
> "sent-over-the-connection"-data, and 'logical' is 'application logic'
> -data (as I'd expect from both a network and an application point of
> view). 'Xx_compressed_bytes' is a nice addition, but not strictly
> necessary, as you can subtract raw from logical to derive the bytes
> saved by compression.
Sorry, but "raw" in this context means "not transformed", i.e. not
compressed.
I have not used the term "uncompressed", because it implies that there are
"compressed" bytes, which is not true if compression is not used.
So "raw" bytes are not the bytes that we send over the network; quite the
opposite: the application writes "raw" (uncompressed) data,
it is compressed, and then the compressed bytes are sent.
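
To illustrate the accounting described above, here is a minimal Python
sketch. It is not the patch's code: the names TrafficCounters, send,
tx_raw_bytes and tx_compressed_bytes are hypothetical, zlib stands in
for whatever compressor is negotiated, and counting the same number in
both counters when compression is off is an assumption made only for
this example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class TrafficCounters:
    # Hypothetical counter names, modelled on the "raw"/"compressed" wording.
    tx_raw_bytes: int = 0         # bytes the application wrote, untransformed
    tx_compressed_bytes: int = 0  # bytes actually handed to the socket


def send(payload: bytes, counters: TrafficCounters, compressor=None) -> bytes:
    """Return the bytes that would go on the wire and update both counters."""
    # "Raw" is counted before any transformation takes place.
    counters.tx_raw_bytes += len(payload)
    if compressor is None:
        # Assumption for this sketch: without compression the wire bytes
        # are simply the raw bytes.
        wire = payload
    else:
        # Z_SYNC_FLUSH makes everything written so far decodable by the peer.
        wire = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
    counters.tx_compressed_bytes += len(wire)
    return wire
```

With such counters, the average compression ratio is simply
tx_raw_bytes / tx_compressed_bytes, and a few uncompressed handshake
bytes barely change it once real traffic has been exchanged.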

Maybe I am wrong, but the term "logical" is much more confusing and
overloaded than "raw",
especially taking into account that it is widely used in Postgres for
logical replication.
The antonym of "logical" is "physical", i.e. something materialized.
But in the case of data exchanged between client and server, which one can
be called physical and which one logical?
Have you ever heard of the logical size of a file (given that it may
contain holes or be compressed by the file system)?
In zfs it is called "apparent" size.

Also, I do not understand from your diagram why Xx_compressed_bytes may be
different from Xx_raw_bytes.




