Re: About to add WAL write/fsync statistics to pg_stat_wal view - Mailing list pgsql-hackers
From | ikedamsh |
---|---|
Subject | Re: About to add WAL write/fsync statistics to pg_stat_wal view |
Date | |
Msg-id | 82ccde1a-ead9-598c-f472-004a01ff23e9@oss.nttdata.com Whole thread Raw |
In response to | Re: About to add WAL write/fsync statistics to pg_stat_wal view (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses |
Re: About to add WAL write/fsync statistics to pg_stat_wal view
|
List | pgsql-hackers |
On 2021-03-19 16:30, Fujii Masao wrote: > On 2021/03/15 10:39, Masahiro Ikeda wrote: >> Thanks, I understood get_sync_bit() checks the sync flags and >> the write unit of generated wal data and replicated wal data is >> different. >> (It's interesting optimization whether to use kernel cache or not.) >> >> OK. Although I agree to separate the stats for the walrecever, >> I want to hear opinions from other people too. I didn't change the >> patch. >> >> Please feel to your comments. > > What about applying the patch for common WAL write function like > XLogWriteFile(), separately from the patch for walreceiver's stats? > Seems the former reaches the consensus, so we can commit it firstly. > Also even only the former change is useful because which allows > walreceiver to report WALWrite wait event. Agreed. I separated the patches. If only the former is committed, my trivial concern is that there may be a disadvantage, but no advantage for the standby server. It may lead to performance degradation to the wal receiver by calling INSTR_TIME_SET_CURRENT(), but the stats can't visible for users until the latter patch is committed. I think it's ok because this not happening in the case to disable the "track_wal_io_timing" in the standby server. Although some users may start the standby server using the backup which "track_wal_io_timing" is enabled in the primary server, they will say it's ok since the users already accept the performance degradation in the primary server. >> OK. I agree. >> >> I wonder to change the error check ways depending on who calls this >> function? >> Now, only the walreceiver checks (1)errno==0 and doesn't check >> (2)errno==ENITR. >> Other processes are the opposite. >> >> IIUC, it's appropriate that every process checks (1)(2). >> Please let me know my understanding is wrong. > > I'm thinking the same. Regarding (2), commit 79ce29c734 introduced > that code. According to the following commit log, it seems harmless > to retry on EINTR even walreceiver. > > Also retry on EINTR. All signals used in the backend are flagged > SA_RESTART > nowadays, so it shouldn't happen, but better to be defensive. Thanks, I understood. >>> BTW, currently XLogWrite() increments IO timing even when pg_pwrite() >>> reports an error. But this is useless. Probably IO timing should be >>> incremented after the return code of pg_pwrite() is checked, instead? >> >> Yes, I agree. I fixed it. >> (v18-0003-Makes-the-wal-receiver-report-WAL-statistics.patch) > > Thanks for the patch! > > nleft = nbytes; > do > { > - errno = 0; > + written = XLogWriteFile(openLogFile, from, nleft, (off_t) > startoffset, > + ThisTimeLineID, openLogSegNo, wal_segment_size); > > Can we merge this do-while loop in XLogWrite() into the loop > in XLogWriteFile()? > If we do that, ISTM that the following codes are not necessary in > XLogWrite(). > > nleft -= written; > from += written; OK, I fixed it. > + * 'segsize' is a segment size of WAL segment file. > > Since segsize is always wal_segment_size, segsize argument seems > not necessary in XLogWriteFile(). Right. I fixed it. > +XLogWriteFile(int fd, const void *buf, size_t nbyte, off_t offset, > + TimeLineID timelineid, XLogSegNo segno, int segsize) > > Why did you use "const void *" instead of "char *" for *buf? I followed the argument of pg_pwrite(). But, I think "char *" is better, so fixed it. > Regarding 0005 patch, I will review it later. Thanks. Regards, -- Masahiro Ikeda NTT DATA CORPORATION
Attachment
pgsql-hackers by date: