Re: Planner question - "bit" data types - Mailing list pgsql-performance

From Tom Lane
Subject Re: Planner question - "bit" data types
Date
Msg-id 9504.1252193078@sss.pgh.pa.us
Whole thread Raw
In response to Re: Planner question - "bit" data types  (Karl Denninger <karl@denninger.net>)
Responses Re: Planner question - "bit" data types  (Karl Denninger <karl@denninger.net>)
List pgsql-performance
Karl Denninger <karl@denninger.net> writes:
> Tom Lane wrote:
>> In that case you'd be wasting your time to get it to use an index
>> for the condition anyway.  Maybe you need to take a step back and
>> look at the query as a whole rather than focus on this particular
>> condition.

> The query, sans this condition, is extremely fast and contains a LOT of
> other conditions (none of which cause trouble.)
> It is only attempting to filter the returned tuples on the permission
> bit(s) involved that cause trouble.

My comment stands: asking about how to use an index for this is the
wrong question.

You never showed us any EXPLAIN results, but I suspect what is happening
is that the planner thinks the "permission & mask = permission"
condition is fairly selective (offhand I think it'd default to
DEFAULT_EQ_SEL or 0.005) whereas the true selectivity per your prior
comment is only 0.1 to 0.8.  This is causing it to change to a plan that
would be good for a small number of rows, when it should stick to a plan
that is good for a large number of rows.

So the right question is "how do I fix the bad selectivity estimate?".
Unfortunately there's no non-kluge answer.  What I think I'd try is
wrapping the condition into a function, say

create function permission_match(perms int, mask int) returns bool
as $$begin return perms & mask = mask; end$$ language plpgsql
strict immutable;

The planner won't know what to make of "where permission_match(perms, 64)"
either, but the default selectivity estimate for a boolean function
is 0.333, much closer to what you need.

Or plan B, which I'd recommend, is to forget the mask business and go
over to a boolean column per permission flag.  Then the planner would
actually have decent statistics about the flag selectivities, and the
queries would be a lot more readable too.  Your objection that you'd
need an index per flag column is misguided --- at these selectivities
an index is really pointless.  And I entirely fail to understand the
complaint about it being unportable; you think "&" is more portable than
boolean?  Only one of those things is in the SQL standard.

            regards, tom lane

pgsql-performance by date:

Previous
From: Karl Denninger
Date:
Subject: Re: Planner question - "bit" data types
Next
From: Karl Denninger
Date:
Subject: Re: Planner question - "bit" data types