Re: Session WAL activity - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Session WAL activity
Date
Msg-id f0777b17-52ff-d1a7-dc75-b04df51f41b5@postgrespro.ru
Whole thread Raw
In response to Re: Session WAL activity  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Session WAL activity
List pgsql-hackers

On 05.12.2019 5:37, Kyotaro Horiguchi wrote:
> It seems to be useful to me. We also might want statistics of other
>>> session IOs.  In that case the table name would be
>>> "pg_stat_session/process_activity". We are aleady collecting most
>>> kinds of the IO activity but it loses session information...
>> Well, actually monitoring disk activity for the particular
>> backend/session can be easily done using some external tools
>> (just because now in Postgres session=backend=process). So we can
>> monitor IO of processes, for example using iotop at Unix
>> or Performance Monitor at Windows.
> Operations that completes on shared buffers cannot be monitored that
> way. This is the same with WAL writing.

The questions is what we are going to monitor?
Amount of read/dirtied buffers or amount of disk ops?
>
>> Certainly it is more convenient to have such statstic inside
>> Postgres. But I am not sure if it is really needed.
>> Concerning WAL activity situation is more obscure: records can be
>> added to the WAL by one process, but written by another.
>> This is why it is not possible to use some external tools.
> For clarity, I didn't suggest that this patch should include general
> session IO statistics. Just the view name looked a bit specific.

I am not sure if pg_stat_wal_activity view should be added at all.
We can just add pg_stat_get_wal_activity function and let user specify 
PID of backend himself (for example by performing join with 
pg_stat_activity).
I proposed name pg_stat_wal_activity just for similarity with 
pg_stat_activity but can use any other proposed name.

>
>>> Briefly looking the patch, I have some comments on it.
>>>
>>> As mentioned above, if we are intending future exantion of the
>>> session-stats table, the name should be changed.
>>>
>>> Backend status is more appropriate than PGPROC. See pgstat.c.
>> Do you mean pgstat_fetch_stat_beentry?
>> But why it is better than storing this information directly in PGPROC?
> No it cannot be used there for performance reasons as you are
> saying. I'm not sure it's acceptable, but we can directly access
> backend status the same way if we expose MyBEEntry (and update it
> through a macro or a inline function).  If we don't need per record
> resolution for the value, we can update a process local variable at
> WAL-write time then write it to backend status at commit time or at
> the same timing as pgstat reporting.
>
> According to my faint memory, PGPROC is thought that it must be kept
> as small as possible for the reasons of CPU caches, that is the reason
> for PgBackendStatus.
Why do you think that adding one addition (without any locks and 
function calls) to CopyXLogRecordToWAL is not acceptable.
It is just one instruction added to expensive functions. At least I have 
not noticed any measurable impact on performance.

Concerning keeping PGPROC size as small as possible, I agree that it is 
reasonable argument.
But even now it is very large (816 bytes) and adding extra 8 bytes will 
increase it on less than 1%.

>>
>> This information is updated locally only by backend itself.
>> Certainly update of 64 bit field is not atomic at 32-but
>> architectures.
>> But it is just statistic. I do not think that it will be fatal if for
>> a moment
>> we can see some incorrect value of written WAL bytes (and at most
>> platforms this
>> update will be atomic).
> At least reader needs to take procarray lock to keep PID-WALwrite
> consistency, in order to prevent reading WALwrite values for a wrong
> process.

Sorry, but I still do not understand whats wrong can happen if reader 
will see WAL activity of wrong process.
Yes, correspondent backend may be already terminated and its PGPROC 
entry can be reused for some other process.
In this case we can wrongly attribute WAL traffic generated by 
terminated backend to the new process
or report zero traffic for old process. But this information is mostly 
needed for live (active) backends. So I do not think
that race conditions here are so critical.

Right now pg_stat_activity also accessing PGPROC to obtain wait event 
information and also not taking any locks.
So it can wrongly report backend status. But I never heard that somebody 
complains about it.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Misleading comment in pg_upgrade.c
Next
From: Amit Kapila
Date:
Subject: Re: Windows buildfarm members vs. new async-notify isolation test