Re: Seq scans status update - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: Seq scans status update
Msg-id 465DD256.3010702@enterprisedb.com
In response to Re: Seq scans status update  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Seq scans status update  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
Jeff Davis wrote:
> On Tue, 2007-05-29 at 17:43 -0700, Jeff Davis wrote:
>>> Hmm.  But we probably don't want the same buffer in two different
>>> backends' rings, either.  You *sure* the sync-scan patch has no
>>> interaction with this one?
>>>
>> I will run some tests again tonight, I think the interaction needs more
>> testing than I did originally. Also, I'm not sure that the hardware I
>> have is sufficient to test those cases.
>>
>
> I ran some sanity tests last night with cvs head, plus my syncscan20-
> cvshead.patch, plus scan_recycle_buffers.v3.patch.
>
> It passed the sanity tests at least.
>
> I did see that there was more interference with sync_seqscan_threshold=0
> (always on) and scan_recycle_buffers=0 (off) than I had previously seen
> with 8.2.4, so I will test again against 8.2.4 to see why that might be.
> The interference that I saw was still quite small, a scan moving
> concurrently with 9 other scans was about 10% slower than a scan running
> alone -- which is still very good compared with plain cvs head and no
> sync scan -- it's just not ideal.
>
> However, turning scan_recycle_buffers between 0 (off), 16, 32, and 128
> didn't have much effect. At 32 it appeared to be about 1% worse during
> 10 scans, but that may have been noise. The other values I tried didn't
> have any difference that I could see.
>
> This was really just a quick sanity test, I think more hard data would
> be useful.

The interesting question is whether the small buffer ring is big enough
to let all synchronized scans process a page before it is recycled.
Keep an eye on pg_buffercache to see whether it fills up with pages
from the table you're querying.
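
To make the "big enough" question concrete, here is a toy model, not the
patch's actual logic. It assumes one leading scan that recycles its ring in
strict FIFO order and one follower that trails it by a fixed number of
pages. The function name, the fixed lag, and the FIFO policy are all
assumptions made for illustration. Pins and the shared buffer pool are
ignored.

```python
from collections import deque


def follower_hits(ring_size: int, lag: int, n_pages: int) -> int:
    """Count pages a trailing scan still finds in the leader's ring.

    Toy model (assumption): the leader reads pages 0..n_pages-1 in order
    and keeps only the last `ring_size` pages; the follower wants the page
    `lag` positions behind the leader at every step.
    """
    ring = deque(maxlen=ring_size)  # oldest page drops out when full
    hits = 0
    for page in range(n_pages):
        ring.append(page)           # leader reads a page, recycling a slot
        wanted = page - lag         # page the follower is processing now
        if wanted >= 0 and wanted in ring:
            hits += 1
    return hits


if __name__ == "__main__":
    for lag in (10, 31, 32, 40):
        print(lag, follower_hits(ring_size=32, lag=lag, n_pages=1000))
```

In this model the follower finds every page it wants while its lag stays
below the ring size, and none once the lag reaches it. That is the
threshold the pg_buffercache check is meant to probe on a real system.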

I just ran a quick test with 4 concurrent scans on a dual-core system,
and it looks like we do "leak" buffers from the rings because they're
pinned at the time they would be recycled. A full scan of a 30GB table
took just under 7 minutes, and starting after a postmaster restart it
took ~4-5 minutes until all of the 320MB of shared_buffers were used.
That means we're leaking a buffer from the ring very roughly on every
20-40th ReadBuffer call, but I'll put in some proper instrumentation and
test with different configurations to get a better picture.
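
A back-of-envelope version of that estimate can be sketched as follows. It
assumes the default 8 kB block size, a constant scan rate, and that every
shared buffer filled during the warm-up counts as one leaked ring buffer.
The function and parameter names are hypothetical. The result changes
depending on whether ReadBuffer calls are counted once per scan position
or once per backend, which the message does not specify, so treat the
output as an order-of-magnitude figure that need not match the rough
number quoted above.

```python
BLOCK_SIZE = 8 * 1024  # assumption: PostgreSQL's default 8 kB block size
MB = 1024 ** 2
GB = 1024 ** 3


def reads_per_leak(table_bytes: int, scan_seconds: float,
                   fill_seconds: float, shared_buffers_bytes: int,
                   n_scans: int = 1) -> float:
    """Estimate ReadBuffer calls per buffer leaked from the ring.

    Assumes a constant scan rate and that the buffer pool fills only
    through leaked ring buffers (both are simplifications).
    """
    table_pages = table_bytes // BLOCK_SIZE
    pool_buffers = shared_buffers_bytes // BLOCK_SIZE
    pages_per_second = table_pages / scan_seconds
    reads_until_full = pages_per_second * fill_seconds * n_scans
    return reads_until_full / pool_buffers


if __name__ == "__main__":
    # Figures from the message: 30GB table, ~7 minute scan,
    # 320MB shared_buffers filled after ~4-5 minutes, 4 concurrent scans.
    for fill_minutes in (4, 5):
        for n_scans in (1, 4):
            ratio = reads_per_leak(30 * GB, 7 * 60, fill_minutes * 60,
                                   320 * MB, n_scans)
            print(f"fill={fill_minutes}min scans={n_scans}: "
                  f"~1 leak per {ratio:.0f} reads")
```

Proper instrumentation, as proposed above, would replace these
assumptions with measured counts.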

Synchronized scans give such a big benefit when they're applicable that
I think some cache spoiling is acceptable, and in fact unavoidable in
some scenarios. It's much better than the 8.2 behavior anyway.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
