Re: About to add WAL write/fsync statistics to pg_stat_wal view - Mailing list pgsql-hackers

From ikedamsh
Subject Re: About to add WAL write/fsync statistics to pg_stat_wal view
Date
Msg-id 82ccde1a-ead9-598c-f472-004a01ff23e9@oss.nttdata.com
Whole thread Raw
In response to Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: About to add WAL write/fsync statistics to pg_stat_wal view  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
On 2021-03-19 16:30, Fujii Masao wrote:
> On 2021/03/15 10:39, Masahiro Ikeda wrote:
>> Thanks, I understood get_sync_bit() checks the sync flags and
>> the write unit of generated wal data and replicated wal data is 
>> different.
>> (It's interesting optimization whether to use kernel cache or not.)
>> 
>> OK. Although I agree to separate the stats for the walrecever,
>> I want to hear opinions from other people too. I didn't change the 
>> patch.
>> 
>> Please feel to your comments.
> 
> What about applying the patch for common WAL write function like
> XLogWriteFile(), separately from the patch for walreceiver's stats?
> Seems the former reaches the consensus, so we can commit it firstly.
> Also even only the former change is useful because which allows
> walreceiver to report WALWrite wait event.

Agreed. I separated the patches.

If only the former is committed, my trivial concern is that there may be
a disadvantage, but no advantage for the standby server. It may lead to
performance degradation to the wal receiver by calling
INSTR_TIME_SET_CURRENT(), but the stats can't visible for users until the
latter patch is committed.

I think it's ok because this not happening in the case to disable the
"track_wal_io_timing" in the standby server. Although some users may start the
standby server using the backup which "track_wal_io_timing" is enabled in the
primary server, they will say it's ok since the users already accept the
performance degradation in the primary server.

>> OK. I agree.
>> 
>> I wonder to change the error check ways depending on who calls this 
>> function?
>> Now, only the walreceiver checks (1)errno==0 and doesn't check 
>> (2)errno==ENITR.
>> Other processes are the opposite.
>> 
>> IIUC, it's appropriate that every process checks (1)(2).
>> Please let me know my understanding is wrong.
> 
> I'm thinking the same. Regarding (2), commit 79ce29c734 introduced
> that code. According to the following commit log, it seems harmless
> to retry on EINTR even walreceiver.
> 
>     Also retry on EINTR. All signals used in the backend are flagged 
> SA_RESTART
>     nowadays, so it shouldn't happen, but better to be defensive.

Thanks, I understood.


>>> BTW, currently XLogWrite() increments IO timing even when pg_pwrite()
>>> reports an error. But this is useless. Probably IO timing should be
>>> incremented after the return code of pg_pwrite() is checked, instead?
>> 
>> Yes, I agree. I fixed it.
>> (v18-0003-Makes-the-wal-receiver-report-WAL-statistics.patch)
> 
> Thanks for the patch!
> 
>              nleft = nbytes;
>              do
>              {
> -                errno = 0;
> +                written = XLogWriteFile(openLogFile, from, nleft, (off_t) 
> startoffset,
> +                                        ThisTimeLineID, openLogSegNo, wal_segment_size);
> 
> Can we merge this do-while loop in XLogWrite() into the loop
> in XLogWriteFile()?
> If we do that, ISTM that the following codes are not necessary in 
> XLogWrite().
> 
>                  nleft -= written;
>                  from += written;

OK, I fixed it.


> + * 'segsize' is a segment size of WAL segment file.
> 
> Since segsize is always wal_segment_size, segsize argument seems
> not necessary in XLogWriteFile().

Right. I fixed it.


> +XLogWriteFile(int fd, const void *buf, size_t nbyte, off_t offset,
> +              TimeLineID timelineid, XLogSegNo segno, int segsize)
> 
> Why did you use "const void *" instead of "char *" for *buf?

I followed the argument of pg_pwrite().
But, I think "char *" is better, so fixed it.


> Regarding 0005 patch, I will review it later.

Thanks.


Regards,
-- 
Masahiro Ikeda
NTT DATA CORPORATION




Attachment

pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Using COPY FREEZE in pgbench
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: shared-memory based stats collector