Re: measuring lwlock-related latency spikes - Mailing list pgsql-hackers

From: Kevin Grittner
Subject: Re: measuring lwlock-related latency spikes
Date:
Msg-id: 4F7AA69A0200002500046AE4@gw.wicourts.gov
In response to: measuring lwlock-related latency spikes (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: measuring lwlock-related latency spikes (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
> Robert Haas  wrote:
> Kevin Grittner  wrote:
>  
>> I can't help thinking that the "background hinter" I had ideas
>> about writing would prevent many of the reads of old CLOG pages,
>> taking a lot of pressure off of this area. It just occurred to me
>> that the difference between that idea and having an autovacuum
>> thread which just did first-pass work on dirty heap pages is slim
>> to none.
>
> Yeah. Marking things all-visible in the background seems possibly
> attractive, too. I think the trick is figuring out the control
> mechanism. In this case, the workload fits within shared_buffers,
> so it's not helpful to think about using buffer eviction as the
> trigger for doing these operations, though that might have some
> legs in general. And a simple revolving scan over shared_buffers
> doesn't really figure to work out well either, I suspect, because
> it's too undirected. I think what you'd really like to have is a
> list of buffers that were modified by transactions which have
> recently committed or rolled back.
Yeah, that's what I was thinking.  Since we only care about dirty
unhinted tuples, we need some fairly efficient way to track those to
make this pay.
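To give a rough shape to what I mean by "track those" -- a minimal
sketch, with purely hypothetical names and sizes (none of this is
existing backend code): a bounded ring of buffer numbers that commit
fills and a background hinter drains, falling back to today's
on-access hinting whenever the ring overflows.

/*
 * Illustrative only: a fixed-size ring of buffer numbers dirtied by
 * recently-committed transactions.  A background hinter could drain
 * this and set hint bits while the pages are still hot, instead of
 * leaving that work to the next reader.
 */
#include <stdbool.h>
#include <stdint.h>

#define HINT_QUEUE_SIZE 1024

typedef struct HintQueue
{
    uint32_t    buffers[HINT_QUEUE_SIZE];   /* buffer numbers to revisit */
    uint32_t    head;                       /* next slot to fill */
    uint32_t    tail;                       /* next slot to drain */
} HintQueue;

/* At commit (hypothetically): remember a buffer the transaction dirtied.
 * If the ring is full we simply drop the entry; that buffer gets hinted
 * the slow way, on next access, exactly as it does today. */
static bool
hint_queue_push(HintQueue *q, uint32_t buf)
{
    uint32_t    next = (q->head + 1) % HINT_QUEUE_SIZE;

    if (next == q->tail)
        return false;           /* full: fall back to on-access hinting */
    q->buffers[q->head] = buf;
    q->head = next;
    return true;
}

/* Drained by the background hinter: returns false once the ring is empty. */
static bool
hint_queue_pop(HintQueue *q, uint32_t *buf)
{
    if (q->tail == q->head)
        return false;
    *buf = q->buffers[q->tail];
    q->tail = (q->tail + 1) % HINT_QUEUE_SIZE;
    return true;
}

The interesting policy question is still who pushes (commit versus
buffer eviction) and how the hinter paces itself, which is exactly the
control mechanism you're talking about.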
> but that seems more like a nasty benchmarking kludge than something
> that's likely to solve real-world problems.
I'm not so sure.  Unfortunately, it may be hard to know without
writing at least a crude form of this to test, but there are several
workloads where hint bit rewrites and/or CLOG contention caused by
the slow tapering of usage of old pages contribute to problems.
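The reason early hinting attacks the old-CLOG-page problem is the
order in which visibility gets resolved; roughly (an illustrative
sketch, not the real tqual.c or slru code -- all names here are made
up):

/*
 * Illustrative only: once the "known committed" hint is set on a
 * tuple, later readers never consult CLOG for its inserting
 * transaction, so old CLOG pages stop being pulled back in.
 */
#include <stdbool.h>
#include <stdint.h>

typedef struct TupleStatus
{
    uint32_t    xmin;                   /* inserting transaction id */
    bool        hint_xmin_committed;    /* cached answer, set once */
} TupleStatus;

/* Stand-in for the CLOG lookup; in the real system this is what can
 * force a read of an old clog page and contend on the SLRU locks. */
static bool
clog_xid_committed(uint32_t xid)
{
    (void) xid;
    return true;
}

static bool
tuple_inserter_committed(TupleStatus *tup)
{
    if (tup->hint_xmin_committed)
        return true;                    /* cheap path: no CLOG access */

    if (clog_xid_committed(tup->xmin))
    {
        tup->hint_xmin_committed = true;    /* hint now; every later
                                             * reader skips CLOG */
        return true;
    }
    return false;
}

So the sooner a page's tuples get hinted after commit, the sooner the
old CLOG pages they reference can fall out of the SLRU without being
re-read.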
>> I know how much time good benchmarking can take, so I hesitate to
>> suggest another permutation, but it might be interesting to see
>> what it does to the throughput if autovacuum is configured to what
>> would otherwise be considered insanely aggressive values (just for
>> vacuum, not analyze). To give this a fair shot, the whole database
>> would need to be vacuumed between initial load and the start of
>> the benchmark.
>
> If you would like to provide a chunk of settings that I can splat
> into postgresql.conf, I'm happy to run 'em through a test cycle and
> see what pops out.
Might as well jump in with both feet:
autovacuum_naptime = 1s
autovacuum_vacuum_threshold = 1
autovacuum_vacuum_scale_factor = 0.0
If that smooths the latency peaks and doesn't hurt performance too
much, it's decent evidence that the more refined technique could be a
win.
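For context (my arithmetic, not anything measured): autovacuum
considers a table once its dead-tuple count exceeds
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor *
reltuples, so with the values above that works out to
1 + 0.0 * reltuples = 1 dead tuple regardless of table size -- every
table with any dead rows gets queued on essentially every 1-second
naptime cycle.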
-Kevin

