Re: I/O on select count(*) - Mailing list pgsql-performance

From Decibel!
Subject Re: I/O on select count(*)
Date
Msg-id E18CFDAC-00FC-4107-ADD2-53AC09041BEE@decibel.org
In response to Re: I/O on select count(*)  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-performance
On May 18, 2008, at 1:28 AM, Greg Smith wrote:
> I just collected all the good internals information included in
> this thread and popped it onto http://wiki.postgresql.org/wiki/
> Hint_Bits where I'll continue to hack away at the text until it's
> readable.  Thanks to everyone who answered my questions here,
> that's good progress toward clearing up a very underdocumented area.
>
> I note a couple of potential TODO items not on the official list
> yet that came up during this discussion:
>
> -Smooth latency spikes when switching commit log pages by
> preallocating cleared pages before they are needed
>
> -Improve bulk loading by setting "frozen" hint bits for tuple
> inserts which occur within the same database transaction as the
> creation of the table into which they're being inserted
>
> Did I miss anything?  I think everything brought up falls either
> into one of those two or the existing "Consider having the
> background writer update the transaction status hint bits..." TODO.

-Evaluate impact of improved caching of CLOG per Greenplum:

Per Luke Lonergan:
I'll find out if we can extract our code that did the work. It was
simple but scattered in a few routines. In concept it worked like this:

1 - Ignore if hint bits are unset, use them if set.  This affects
heapam and vacuum I think.
2 - Implement a cache for clog lookups based on the optimistic
assumption that the data was inserted in bulk.  Put the cache one
call away from heapgetnext().

I forget the details of (2).  As I recall, if we fall off of the
assumption, the penalty for long scans gets large-ish (maybe 2X), but
since when do people full table scan when their updates/inserts are
so scattered across TIDs?  It's an obvious big win for DW work.
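
To make the idea in (1) and (2) concrete, here is a rough sketch of how I read it. This is not Greenplum's code (the details of that are exactly what was forgotten above) and it is not PostgreSQL source. The names XidStatus, CachedClog and count_visible are made up for illustration, a plain dict stands in for the commit log, and a tuple is reduced to an (xmin, hint bit) pair. The cache is a single entry holding the last xid looked up, which only pays off if neighbouring tuples tend to share an inserting transaction, i.e. the bulk-load assumption.

```python
from enum import Enum


class XidStatus(Enum):
    IN_PROGRESS = 0
    COMMITTED = 1
    ABORTED = 2


class CachedClog:
    """Hypothetical one-entry cache in front of a commit-log lookup."""

    def __init__(self, clog):
        self._clog = clog          # dict: xid -> XidStatus (stand-in for clog)
        self._last_xid = None
        self._last_status = None
        self.slow_lookups = 0      # number of times we went to the "real" clog

    def status(self, xid):
        # Fast path: same xid as the previous tuple, as after a bulk load.
        if xid == self._last_xid:
            return self._last_status
        self.slow_lookups += 1
        status = self._clog.get(xid, XidStatus.IN_PROGRESS)
        # Only final states are safe to remember; in-progress can still change.
        if status is not XidStatus.IN_PROGRESS:
            self._last_xid, self._last_status = xid, status
        return status


def count_visible(tuples, clog_cache):
    """Count tuples whose inserting transaction committed.

    Each tuple is (xmin, hint_committed).  Per (1), a set hint bit is
    trusted and an unset one is simply ignored (never written back);
    per (2), unset bits fall through to the cached clog lookup.
    """
    visible = 0
    for xmin, hint_committed in tuples:
        if hint_committed or clog_cache.status(xmin) is XidStatus.COMMITTED:
            visible += 1
    return visible
```

With a thousand unhinted tuples from one committed transaction, the scan goes to the clog once. If the xids alternate from tuple to tuple, every tuple misses the cache, which is the "fall off of the assumption" case.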

--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828


