Re: [HACKERS] Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write
Date
Msg-id CA+TgmobEGkAuaUENr6TUNO+gOouvnOn8cKqe9J_hiV10YinGag@mail.gmail.com
Whole thread Raw
In response to Re: Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: [HACKERS] Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Sep 9, 2015 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ... How often such a workload actually has to replace a *dirty* clog
>> buffer obviously depends on how often you checkpoint, but if you're
>> getting ~28k TPS you can completely fill 32 clog buffers (1 million
>> transactions) in less than 40 seconds, and you're probably not
>> checkpointing nearly that often.
>
> But by the same token, at that kind of transaction rate, no clog page is
> actively getting dirtied for more than a couple of seconds.  So while it
> might get swapped in and out of the SLRU arena pretty often after that,
> this scenario seems unconvincing as a source of repeated fsyncs.

Well, if you're filling ~1 clog page per second, you're doing ~1 fsync
per second too.  Or if you are not, then you are thrashing the
progressively smaller and smaller number of clean slots ever-harder
until no clean pages remain and you're forced to fsync something -
probably, a bunch of things all at once.

> Like Andres, I'd want to see a more realistic problem case before
> expending a lot of work here.

I think the question here isn't whether the problem case is realistic
- I am quite sure that a pgbench workload is - but rather how much of
a problem it actually causes.  I'm very sure that individual pgbench
backends experience multi-second stallls as a result of this.  What
I'm not sure about is how frequently it happens, and how much of an
effect it has on overall latency.  I think it would be worth someone's
time to try to write some good instrumentation code here and figure
that out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Parallel Seq Scan
Next
From: Robert Haas
Date:
Subject: Re: checkpointer continuous flushing