Re: Comparing user attributes with bitwise operators - Mailing list pgsql-performance

From Patrick Clery
Subject Re: Comparing user attributes with bitwise operators
Date
Msg-id 200410061255.03052.patrick@phpforhire.com
Whole thread Raw
In response to Re: Comparing user attributes with bitwise operators  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Comparing user attributes with bitwise operators  (Greg Stark <gsstark@mit.edu>)
List pgsql-performance
Another problem I should note is that when I first insert all the data into
the people_attributes table ("the int[] table"), the GiST index is not used:

THE INDEX:
"people_attributes_search" gist ((ARRAY[age, gender, orientation, children,
drinking, education,
ethnicity, eyecolor, haircolor, hairstyle, height, income, occupation,
relation, religion, smoking, w
ant_children, weight] + seeking + languages))

PART OF THE QUERY PLAN:
Seq Scan on people_attributes pa  (cost=0.00..0.00 rows=1 width=20)
                     Filter: (((ARRAY[age, gender, orientation, children,
drinking, education, ethnicity, eyecolor, haircolor, hairstyle, height,
income, occupation, relation, religion, smoking, want_children, weight] +
seeking) + languages) @@ '( ( 4 | 5 ) | 6 ) & 88 & 48 & ( 69 | 70 ) & 92 &
( ( ( ( ( ( ( ( ( ( ( ( ( 95 | 96 ) | 97 ) | 98 ) | 99 ) | 100 ) | 101 ) |
102 ) | 103 ) | 104 ) | 105 ) | 106 ) | 107 ) | 108 ) &
( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( 190
| 191 ) | 192 ) | 193 ) | 194 ) | 195 ) | 196 ) | 197 ) | 198 ) | 199 ) |
200 ) | 201 ) | 202 ) | 203 ) | 204 ) | 205 ) | 206 ) | 207 ) | 208 ) | 209 )
| 210 ) | 211 ) | 212 ) | 213 ) | 214 ) | 215 ) | 216 ) | 217 ) | 218 ) |
219 ) | 220 ) | 221 ) | 222 ) | 223 ) | 224 ) | 225 ) | 226 ) | 227 ) | 228 )
| 229 ) | 230 ) | 231 ) | 232 ) | 233 ) | 234 ) | 235 ) | 236 ) | 237 ) |
238 ) | 239 ) | 240 ) | 241 ) | 242 ) | 243 )'::query_int)


So I run "VACUUM ANALYZE people_attributes", then run again:

PART OF THE QUERY PLAN:
Index Scan using people_attributes_pkey on people_attributes pa
(cost=0.00..5.32 rows=1 width=20)
         Index Cond: (pa.person_id = "outer".person_id)
         Filter: (((ARRAY[age, gender, orientation, children, drinking,
education, ethnicity, eyecolor, haircolor, hairstyle, height, income,
occupation, relation, religion, smoking, want_children, weight] + seeking) +
languages) @@ '( ( 4 | 5 ) | 6 ) & 88 & 48 & ( 69 | 70 ) & 92 &
( ( ( ( ( ( ( ( ( ( ( ( ( 95 | 96 ) | 97 ) | 98 ) | 99 ) | 100 ) | 101 ) |
102 ) | 103 ) | 104 ) | 105 ) | 106 ) | 107 ) | 108 ) &
( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( 190
| 191 ) | 192 ) | 193 ) | 194 ) | 195 ) | 196 ) | 197 ) | 198 ) | 199 ) |
200 ) | 201 ) | 202 ) | 203 ) | 204 ) | 205 ) | 206 ) | 207 ) | 208 ) | 209 )
| 210 ) | 211 ) | 212 ) | 213 ) | 214 ) | 215 ) | 216 ) | 217 ) | 218 ) |
219 ) | 220 ) | 221 ) | 222 ) | 223 ) | 224 ) | 225 ) | 226 ) | 227 ) | 228 )
| 229 ) | 230 ) | 231 ) | 232 ) | 233 ) | 234 ) | 235 ) | 236 ) | 237 ) |
238 ) | 239 ) | 240 ) | 241 ) | 242 ) | 243 )'::query_int)

Still not using the index. I'm trying to DROP INDEX and recreate it, but the
query just stalls. I remember last time this situation happened that I just
dropped and recreated the index, and voila it was using the index again. Now
I can't seem to get this index to drop. Here's the table structure:


    Column     |   Type    |     Modifiers
---------------+-----------+--------------------
 person_id     | integer   | not null
 askmecount    | integer   | not null default 0
 age           | integer   | not null
 gender        | integer   | not null
 bodytype      | integer   | not null
 children      | integer   | not null
 drinking      | integer   | not null
 education     | integer   | not null
 ethnicity     | integer   | not null
 eyecolor      | integer   | not null
 haircolor     | integer   | not null
 hairstyle     | integer   | not null
 height        | integer   | not null
 income        | integer   | not null
 languages     | integer[] | not null
 occupation    | integer   | not null
 orientation   | integer   | not null
 relation      | integer   | not null
 religion      | integer   | not null
 smoking       | integer   | not null
 want_children | integer   | not null
 weight        | integer   | not null
 seeking       | integer[] | not null
Indexes:
    "people_attributes_pkey" PRIMARY KEY, btree (person_id)
    "people_attributes_search" gist ((ARRAY[age, gender, orientation,
children, drinking, education,
ethnicity, eyecolor, haircolor, hairstyle, height, income, occupation,
relation, religion, smoking, w
ant_children, weight] + seeking + languages))
Foreign-key constraints:
    "people_attributes_weight_fkey" FOREIGN KEY (weight) REFERENCES
attribute_values(value_id) ON DEL
ETE RESTRICT
    "people_attributes_person_id_fkey" FOREIGN KEY (person_id) REFERENCES
people(person_id) ON DELETE
 CASCADE DEFERRABLE INITIALLY DEFERRED
    "people_attributes_age_fkey" FOREIGN KEY (age) REFERENCES
attribute_values(value_id) ON DELETE RE
STRICT
    "people_attributes_gender_fkey" FOREIGN KEY (gender) REFERENCES
attribute_values(value_id) ON DEL
ETE RESTRICT
    "people_attributes_bodytype_fkey" FOREIGN KEY (bodytype) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_children_fkey" FOREIGN KEY (children) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_drinking_fkey" FOREIGN KEY (drinking) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_education_fkey" FOREIGN KEY (education) REFERENCES
attribute_values(value_id)
ON DELETE RESTRICT
    "people_attributes_ethnicity_fkey" FOREIGN KEY (ethnicity) REFERENCES
attribute_values(value_id)
ON DELETE RESTRICT
    "people_attributes_eyecolor_fkey" FOREIGN KEY (eyecolor) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_haircolor_fkey" FOREIGN KEY (haircolor) REFERENCES
attribute_values(value_id)
ON DELETE RESTRICT
    "people_attributes_hairstyle_fkey" FOREIGN KEY (hairstyle) REFERENCES
attribute_values(value_id)
ON DELETE RESTRICT
    "people_attributes_height_fkey" FOREIGN KEY (height) REFERENCES
attribute_values(value_id) ON DELETE RESTRICT
    "people_attributes_income_fkey" FOREIGN KEY (income) REFERENCES
attribute_values(value_id) ON DELETE RESTRICT
    "people_attributes_occupation_fkey" FOREIGN KEY (occupation) REFERENCES
attribute_values(value_id
) ON DELETE RESTRICT
    "people_attributes_orientation_fkey" FOREIGN KEY (orientation) REFERENCES
attribute_values(value_
id) ON DELETE RESTRICT
    "people_attributes_relation_fkey" FOREIGN KEY (relation) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_religion_fkey" FOREIGN KEY (religion) REFERENCES
attribute_values(value_id) ON
 DELETE RESTRICT
    "people_attributes_smoking_fkey" FOREIGN KEY (smoking) REFERENCES
attribute_values(value_id) ON D
ELETE RESTRICT
    "people_attributes_want_children_fkey" FOREIGN KEY (want_children)
REFERENCES attribute_values(va
lue_id) ON DELETE RESTRICT


Is it all the foreign keys that are stalling the drop? I have done VACUUM
ANALYZE on the entire db. Could anyone offer some insight as to why this
index is not being used or why the index is not dropping easily?


On Tuesday 05 October 2004 10:32, you wrote:
> Patrick,
>
> First off, thanks for posting this solution!  I love to see a new demo of
> The Power of Postgres(tm) and have been wondering about this particular
> problem since it came up on IRC.
>
> > The array method works quite nicely, especially for the
> > columns like "languages" and "seeking" that are multiple choice. However,
> > even though this method is fast, I still might opt for caching the
> > results because the "real world" search query involves a lot more and
> > will be executed non-stop. But to have it run this fast the first time
> > certainly helps.
>
> Now, for the bad news:  you need to test having a large load of users
> updating their data.   The drawback to GiST indexes is that they are
> low-concurrency, because the updating process needs to lock the whole index
> (this has been on our TODO list for about a decade, but it's a hard
> problem).

pgsql-performance by date:

Previous
From: Pierre-Frédéric Caillaud
Date:
Subject: Re: sequential scan on select distinct
Next
From: Patrick Clery
Date:
Subject: Re: Comparing user attributes with bitwise operators