Re: I: About "Our CLUSTER implementation is pessimal" patch - Mailing list pgsql-hackers

From Tom Lane
Subject Re: I: About "Our CLUSTER implementation is pessimal" patch
Msg-id 8092.1286493646@sss.pgh.pa.us
In response to Re: I: About "Our CLUSTER implementation is pessimal" patch  (Josh Kupershmidt <schmiddy@gmail.com>)
List pgsql-hackers
Josh Kupershmidt <schmiddy@gmail.com> writes:
> So I think there are definitely cases where this patch helps, but it
> looks like a seq. scan is being chosen in some cases where it doesn't
> help.

I've been poking through this patch, and have found two different ways
in which it underestimates the cost of the seqscan case:

* it's not setting rel->width, resulting in an underestimate of the
amount of disk space needed for a sort; this would get worse for wider
tables.

* it's not allowing for the cost of recomputing index expression values
during comparisons.  That doesn't matter of course if you're not testing
the index-expression case (which other infelicities suggest hasn't
exactly been stressed yet).

I suspect the first of these might have something to do with your
observation.  AFAIR the width value isn't used in estimating indexscan
cost, so this omission would bias it in favor of seqscans, as soon as
the data volume exceeded maintenance_work_mem.
        regards, tom lane
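
The two underestimates described above can be illustrated with a toy cost model. This is a minimal sketch, not the planner's actual code: the function name `toy_sort_cost`, the fixed `MERGE_ORDER`, and the `extra_comparison_cost` parameter are all hypothetical, and the formula is only loosely shaped like a sort cost estimate. It shows that a width of zero makes the estimated sort volume zero, so the sort never appears to spill past the memory limit, and that ignoring per-comparison expression evaluation lowers the CPU estimate.

```python
import math

BLOCK_SIZE = 8192           # bytes per page (assumed, matches the usual default)
SEQ_PAGE_COST = 1.0         # assumed planner-style cost constants
RANDOM_PAGE_COST = 4.0
CPU_OPERATOR_COST = 0.0025
MERGE_ORDER = 6             # assumed fixed merge fan-in for this toy model


def toy_sort_cost(tuples, width, work_mem_bytes, extra_comparison_cost=0.0):
    """Return (cpu_cost, disk_cost) for sorting `tuples` rows of `width` bytes.

    `extra_comparison_cost` stands in for the cost of recomputing an index
    expression each time two rows are compared (hypothetical parameter).
    """
    comparison_cost = 2.0 * CPU_OPERATOR_COST + extra_comparison_cost
    cpu_cost = comparison_cost * tuples * math.log2(max(tuples, 2))

    input_bytes = tuples * width
    if input_bytes <= work_mem_bytes:
        # Estimated to fit in memory: no disk traffic is charged.
        return cpu_cost, 0.0

    # Estimated to spill: charge a write and a read of every page per merge pass.
    pages = math.ceil(input_bytes / BLOCK_SIZE)
    runs = input_bytes / work_mem_bytes
    merge_passes = max(1, math.ceil(math.log(runs) / math.log(MERGE_ORDER)))
    page_accesses = 2 * pages * merge_passes
    disk_cost = page_accesses * (0.75 * SEQ_PAGE_COST + 0.25 * RANDOM_PAGE_COST)
    return cpu_cost, disk_cost


if __name__ == "__main__":
    mem = 64 * 1024 * 1024  # stand-in for maintenance_work_mem
    rows = 10_000_000
    print("width unset (0):", toy_sort_cost(rows, 0, mem))
    print("width 100 bytes:", toy_sort_cost(rows, 100, mem))
    print("with expression cost:", toy_sort_cost(rows, 100, mem, 0.01))
```

In this sketch the width-zero call reports no disk cost at all, while the same row count at 100 bytes per row exceeds the memory limit and picks up a disk charge. If the competing indexscan estimate does not depend on width, the missing charge tilts the comparison toward the seqscan-plus-sort plan.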

