Re: number of rows estimation for bit-AND operation - Mailing list pgsql-performance

From Robert Haas
Subject Re: number of rows estimation for bit-AND operation
Msg-id 603c8f070908201055g4f1af18webdfad0640140f94@mail.gmail.com
In response to Re: number of rows estimation for bit-AND operation  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re: number of rows estimation for bit-AND operation
List pgsql-performance
On Tue, Aug 18, 2009 at 6:34 PM, Scott Marlowe<scott.marlowe@gmail.com> wrote:
> 2009/8/18 Slava Moudry <smoudry@4info.net>:
>>> increase default stats target, analyze, try again.
>> This field has only 5 values. I had put values/frequencies in my first post.
>
> Sorry, kinda missed that.  Anyway, there's no way for pg to know which
> operation is gonna match.  Without an index on it.  So my guess is
> that it just guesses some fixed value.  With an index it might be able
> to get it right, but you'll need an index for each type of match
> you're looking for.  I think.  Maybe someone else on the list has a
> better idea.

The best way to handle this is probably to not cram multiple values
into a single field.  Just use one boolean for each flag.  It won't
even cost you any space, because right now you are using 8 bytes to
store 5 booleans, and 5 booleans will (I believe) only require 5
bytes.  Even if you were using enough of the bits for the space usage
to be higher with individual booleans, the overall performance is
likely to be better that way.
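
As a rough illustration of the space arithmetic above, here is a minimal
Python sketch.  The flag names and bit positions are hypothetical (the
real schema is not shown in this message), and the byte counts cover the
raw values only; they ignore PostgreSQL's row alignment and per-tuple
overhead.

```python
import struct

# Hypothetical flag names and bit positions; the real schema is not shown in this thread.
FLAG_BITS = {
    "is_active": 0,
    "is_admin": 1,
    "is_verified": 2,
    "is_hidden": 3,
    "is_deleted": 4,
}


def unpack_flags(value: int) -> dict:
    """Split a packed integer into one boolean per named flag."""
    return {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}


# Raw value sizes only: one 8-byte integer versus five 1-byte booleans.
packed_size = struct.calcsize("<q")
boolean_size = struct.calcsize("<5?")

# Example: bits 0, 2 and 4 are set in the packed value.
flags = unpack_flags(0b10101)
```

With one column per flag, a condition becomes a plain test on a boolean
column instead of a bit-AND expression on the packed field.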

This is sort of stating the obvious, but it doesn't make it any less
true.  Unfortunately, PG's selectivity estimator can't handle cases
like this.  Tom Lane recently made some noises about trying to improve
it, but it's not clear whether that will go anywhere, and in any event
it won't happen before 8.5.0 comes out next spring/summer.
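
To make the estimation problem concrete, here is a small Python sketch.
The value frequencies are invented for illustration (the real ones are
in the first post of the thread, not in this message), and
DEFAULT_GUESS is an assumed stand-in for whatever fixed selectivity the
planner falls back to; it is not a figure quoted from this thread.

```python
# Hypothetical frequencies for a flags column with 5 distinct values.
VALUE_FREQUENCIES = {0: 0.50, 1: 0.20, 4: 0.15, 5: 0.10, 16: 0.05}

# Assumed fixed fallback selectivity, used here only for comparison.
DEFAULT_GUESS = 0.005


def bit_and_selectivity(freqs: dict, mask: int) -> float:
    """Fraction of rows where (value & mask) == mask, given per-value frequencies."""
    return sum(freq for value, freq in freqs.items() if (value & mask) == mask)


# Rows with bit 2 set (mask 4) are those holding the values 4 and 5.
actual = bit_and_selectivity(VALUE_FREQUENCIES, 4)

# How far off a fixed guess would be for this made-up distribution.
error_factor = actual / DEFAULT_GUESS
```

With these made-up numbers roughly a quarter of the rows match, so a
fixed guess of that size would be off by well over an order of
magnitude.  With separate boolean columns, the ordinary per-column
statistics cover the same question.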

...Robert
