Re: What is the best way to do attribute/values? - Mailing list pgsql-performance

From Mark Kirkwood
Subject Re: What is the best way to do attribute/values?
Date
Msg-id 412C4C3A.2030001@coretech.co.nz
Whole thread Raw
In response to Re: What is the best way to do attribute/values?  (Josh Berkus <josh@agliodbs.com>)
Responses Re: What is the best way to do attribute/values?  (Jeff <threshar@torgo.978.org>)
Re: What is the best way to do attribute/values?  (Josh Berkus <josh@agliodbs.com>)
List pgsql-performance

Josh Berkus wrote:

> Things we've already tried to avoid going over old ground:
>
>1) increasing statistics;
>2) increasing sort_mem (to 256MB, which is overkill)
>3) testing on 8.0 beta, which does not affect the issue.
>
>At this point I'm looking for ideas.   Suggestions, anyone?
>
>
>
with respect to query design:

consider instead of:

select
    pav1.person_id
from
    person_attributes_vertical pav1,
    person_attributes_vertical pav2
where
    pav1.attribute_id = 1
    and pav1.value_id in (2,3)
    and pav2.attribute_id = 2
    and pav2.value_id in (2,3)
    and pav1.person_id = pav2.person_id

try:

select
    pav1.person_id
from
    person_attributes_vertical pav1
where
       (    pav1.attribute_id = 1
        and pav1.value_id in (2,3))
    or (    pav1.attribute_id = 2
        and pav1.value_id in (2,3))

I am gambling that the 'or's' might be less expensive than the multiple self joins (particularly in the more general
cases!).

To make access work well you might want to have *several* concatenated indexes of 2 -> 4 attributes - to work around Pg
inabilityto use more than 1 in a given query. 
For this query indexing (attribute_id, value_id) is probably good.

Consider playing with 'random_page_cost' and maybe 'effective_cache_size' to encourage the planner to use 'em.

regards

Mark





pgsql-performance by date:

Previous
From: "Gregory S. Williamson"
Date:
Subject: Re: postgresql performance with multimedia
Next
From: Jan Wieck
Date:
Subject: Re: postgresql performance with multimedia