Re: some longer, larger pgbench tests with various performance-related patches - Mailing list pgsql-hackers

From Robert Haas
Subject Re: some longer, larger pgbench tests with various performance-related patches
Msg-id CA+Tgmob1L+1ROcUX46us9mFcvBuT58UxDq0NZ3+HQWk=QGr-6A@mail.gmail.com
In response to Re: some longer, larger pgbench tests with various performance-related patches  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On Sat, Feb 4, 2012 at 2:13 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> We really need to nail that down.  Could you post the scripts (on the
> wiki) you use for running the benchmark and making the graph?  I'd
> like to see how much work it would be for me to change it to detect
> checkpoints and do something like color the markers blue during
> checkpoints and red elsewhen.

They're pretty crude - I've attached them here.
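
In case it saves you some time: the graphing side is really just "bucket
the pgbench -l output into per-second transaction counts and plot the
result".  A rough sketch of that bucketing step is below; it is not the
attached script, and it assumes the stock per-transaction log format
(client_id xact_no latency_us file_no epoch usec).  With log_checkpoints=on
you could overlay checkpoint windows parsed from the server log to get the
blue/red coloring you describe.

import collections
import sys

# Bucket a pgbench per-transaction log (pgbench -l) into per-second TPS.
# Assumes the stock format: client_id xact_no latency_us file_no epoch usec
def tps_by_second(logfile):
    counts = collections.Counter()
    with open(logfile) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 6:
                continue
            counts[int(fields[4])] += 1   # bucket on completion epoch second
    return counts

if __name__ == "__main__":
    counts = tps_by_second(sys.argv[1])
    start = min(counts)
    for sec in sorted(counts):
        # emit "seconds-into-run  tps" pairs, easy to feed to gnuplot
        print(sec - start, counts[sec])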

> Also, I'm not sure how bad that graph really is.  The overall
> throughput is more variable, and there are some latency spikes, but
> they are few.  The dominant feature is simply that the long-term
> average is less than the initial burst.  Of course the goal is to have
> a high level of throughput with smooth latency under sustained
> conditions.  But expecting that long-sustained, smooth level of
> throughput to be identical to the "initial burst throughput" sounds
> like more of a fantasy than a goal.

That's probably true, but the drop-off is currently quite extreme.
The fact that disabling full_page_writes causes throughput to increase
by >4x is dismaying, at least to me.

> If we want to accept the lowered
> throughput and work on the variability/spikes that are there, I think
> a good approach would be to take the long term TPS average, and dial
> the number of clients back until the initial burst TPS matches that
> long term average.  Then see if the spikes still exist over the long
> term using that dialed back number of clients.

Hmm, I might be able to do that.
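
If it helps, the long-term average is easy to pull out of the same
per-second counts, along the lines of the sketch below; the 600-second
warmup cutoff is an arbitrary placeholder.  Then the test could be rerun
with progressively fewer clients until the first few minutes roughly
match that number.

def steady_state_tps(counts, warmup_secs=600):
    # counts: {epoch_second: transactions}, as built by tps_by_second() above
    start = min(counts)
    steady = [n for sec, n in counts.items() if sec - start >= warmup_secs]
    return sum(steady) / len(steady) if steady else 0.0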

> I don't think the full-page writes are leading to WALInsert
> contention, for example, because that would probably lead to a smooth
> throughput decline, but not those latency spikes in which entire
> seconds go by without transactions.

Right.

> I doubt it is leading to general
> IO contention, as the IO at that point should be pretty much
> sequential (the checkpoint has not yet reached the sync stage, and the
> WAL is sequential).  So I bet those spikes are caused by fsyncs
> occurring at xlog segment switches, and the locking that entails.

That's definitely possible.

> If I
> recall, we can have a segment which is completely written to the OS and
> in the process of being fsynced, and another segment which is partly in
> wal_buffers and partly written out to the OS cache, but we can't start
> reusing the wal_buffers that were already written to the OS for that
> segment (and are therefore theoretically available for reuse by the
> upcoming third segment) until the previous segment's fsync has
> completed.  So all WAL inserts freeze.  Or something like that.  This
> code has changed a bit since the last time I studied it.

Yeah, I need to better characterize where the pauses are coming from,
but I'm reluctant to invest too much effort there until Heikki's xlog
scaling patch goes in, because I think that's going to change things
enough that any work done now will mostly be wasted.

It might be worth trying a run with wal_buffers=32MB or something like
that, just to see whether that mitigates any of the locking pile-ups.
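
For that run, the only change on top of the current setup would be
something like the following sketch; the data directory, client count,
and duration here are placeholders, not the settings from the earlier
runs.

import subprocess

PGDATA = "/path/to/pgdata"     # placeholder data directory
DBNAME = "pgbench"             # placeholder database name

def set_wal_buffers(value):
    # Crude, but fine for a test box: append an override and restart.
    with open(PGDATA + "/postgresql.conf", "a") as conf:
        conf.write("\nwal_buffers = " + value + "\n")
    subprocess.check_call(["pg_ctl", "restart", "-D", PGDATA, "-w"])

def run_pgbench(clients, duration_secs):
    # -l writes the per-transaction latency log the graphs are built from
    subprocess.check_call(["pgbench", "-c", str(clients), "-j", "4",
                           "-T", str(duration_secs), "-l", DBNAME])

if __name__ == "__main__":
    set_wal_buffers("32MB")
    run_pgbench(32, 1800)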

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

