Josh Berkus wrote:
> Things we've already tried to avoid going over old ground:
>
>1) increasing statistics;
>2) increasing sort_mem (to 256MB, which is overkill)
>3) testing on 8.0 beta, which does not affect the issue.
>
>At this point I'm looking for ideas. Suggestions, anyone?
>
>
>
with respect to query design:
consider instead of:
select
pav1.person_id
from
person_attributes_vertical pav1,
person_attributes_vertical pav2
where
pav1.attribute_id = 1
and pav1.value_id in (2,3)
and pav2.attribute_id = 2
and pav2.value_id in (2,3)
and pav1.person_id = pav2.person_id
try:
select
pav1.person_id
from
person_attributes_vertical pav1
where
( pav1.attribute_id = 1
and pav1.value_id in (2,3))
or ( pav1.attribute_id = 2
and pav1.value_id in (2,3))
I am gambling that the 'or's' might be less expensive than the multiple self joins (particularly in the more general
cases!).
To make access work well you might want to have *several* concatenated indexes of 2 -> 4 attributes - to work around Pg
inabilityto use more than 1 in a given query.
For this query indexing (attribute_id, value_id) is probably good.
Consider playing with 'random_page_cost' and maybe 'effective_cache_size' to encourage the planner to use 'em.
regards
Mark