On 11.02.2019 21:25, Arthur Zakirov wrote:
> Hello hackers,
>
> Grigory noticed that one of our utilities has very slow performance when
> xlogreader reads zlib archives. We found out that xlogreader sometimes
> reads a WAL file block twice.
>
> zlib has slow performance when you read an archive not in sequential
> order. I think reading a block twice in same position isn't sequential,
> because gzread() moves current position forward and next call gzseek()
> to the same position moves it back.
>
> It seems that the attached patch solves the issue. I think when reqLen
> == state->readLen the requested block already is in the xlogreader's
> buffer.
>
> What do you think?
I looked at the history of the code changes:
---------------------------------------------------------------
7fcbf6a405f (Alvaro Herrera 2013-01-16 16:12:53 -0300 539)
reqLen < state->readLen)
1bb2558046c (Heikki Linnakangas 2010-01-27 15:27:51 +0000 9349)
targetPageOff == readOff && targetRecOff < readLen)
eaef111396e (Tom Lane 2006-04-03 23:35:05 +0000 3842)
len = XLOG_BLCKSZ - RecPtr->xrecoff % XLOG_BLCKSZ;
4d14fe0048c (Tom Lane 2001-03-13 01:17:06 +0000 3843)
if (total_len > len)
---------------------------------------------------------------
In Tom Lane's original code, the condition (total_len > len) triggered a
reread of the page from disk. As I understand it, this is equivalent to
your proposal.
The code line in commit 1bb2558046c looks equivalent to the corresponding
line in commit 7fcbf6a405f, but its semantics differ: the targetPageOff
value cannot be greater than or equal to XLOG_BLCKSZ, whereas the reqLen
value can be. This may be how the possible mistake was introduced by
commit 7fcbf6a405f.
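
To make the difference concrete, here is a minimal standalone sketch of
the two forms of the check. It is not the xlogreader code: ReaderModel,
cached_strict() and cached_inclusive() are names made up for this
example, the page-offset comparison is modelled on the 1bb2558046c line,
and the inclusive form is only my assumption about what the proposed
patch amounts to.

```c
#include <assert.h>
#include <stdbool.h>

#define XLOG_BLCKSZ 8192	/* default WAL block size */

/* Hypothetical, reduced stand-in for the reader state. */
typedef struct
{
	int		readOff;	/* offset of the cached page */
	int		readLen;	/* valid bytes cached from that page */
} ReaderModel;

/* Strict form, as in the line attributed to commit 7fcbf6a405f. */
bool
cached_strict(const ReaderModel *s, int pageOff, int reqLen)
{
	assert(reqLen >= 0 && reqLen <= XLOG_BLCKSZ);
	return pageOff == s->readOff && reqLen < s->readLen;
}

/* Inclusive form: an assumed shape of the proposed fix. */
bool
cached_inclusive(const ReaderModel *s, int pageOff, int reqLen)
{
	assert(reqLen >= 0 && reqLen <= XLOG_BLCKSZ);
	return pageOff == s->readOff && reqLen <= s->readLen;
}
```

With a fully cached page (readLen == XLOG_BLCKSZ) and a request for the
whole page (reqLen == XLOG_BLCKSZ), the strict form reports a miss and
forces a second read of the same block, while the inclusive form treats
the request as already satisfied.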
--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company