Re: Seq scans roadmap - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Seq scans roadmap
Date
Msg-id 46431F11.9020303@enterprisedb.com
Whole thread Raw
In response to Re: Seq scans roadmap  ("Zeugswetter Andreas ADI SD" <ZeugswetterA@spardat.at>)
Responses Re: Seq scans roadmap
List pgsql-hackers
Zeugswetter Andreas ADI SD wrote:
>> Also, that patch doesn't address the VACUUM issue at all. And 
>> using a small fixed size ring with scans that do updates can 
>> be devastating. I'm experimenting with different ring sizes 
>> for COPY at the moment. Too small ring leads to a lot of WAL 
>> flushes, it's basically the same problem we have with VACUUM 
>> in CVS HEAD.
> 
> My first take on that would be to simply abandon any dirty (and actually
> also any still pinned) buffer from the ring and replace the ring slot
> with a buffer from the freelist.
> If the freelist is empty and LSN allows writing the buffer, write it
> (and maybe try to group these).
> If the LSN does not allow the write, replace the slot with a buffer from
> LRU.

That would effectively disable the ring for COPY and the 2nd phase of 
VACUUM.

One problem with looking at the LSN is that you need the content lock to 
read it, and I wouldn't want to add any new locking. It could be done 
inside FlushBuffer when we hold the lock anyway, but I'm afraid the 
changes would be pretty invasive.

I'm struggling to get a grip of what the optimal ring size is under 
various circumstances. Some thoughts I have this far:
- a small ring gives better L2 cache behavior
- for read-only queries, and for queries that just hint bits, 1 buffer 
is enough
- small ring with query that writes WAL (COPY, mass updates, 2nd phase 
of VACUUM) leads to a lot of WAL flushes, which can become bottleneck.

But all these assumptions need to be validated. I'm setting up tests 
with different ring sizes and queries to get a clear picture of this:
- VACUUM on a clean table
- VACUUM on a table with 1 dead tuple per page
- read-only scan, large table
- read-only scan, table fits in OS cache
- COPY

In addition, I'm going to run VACUUM in a DBT-2 test to see the affect 
on other queries running concurrently.

I think a ring that grows when WAL flushes occur covers all the use 
cases reasonably well, but I need to do the testing...

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: "Zeugswetter Andreas ADI SD"
Date:
Subject: Re: Seq scans roadmap
Next
From: Alvaro Herrera
Date:
Subject: Re: Re: [COMMITTERS] psqlodbc - psqlodbc: Put Autotools-generated files into subdirectory