Home > mailing lists

Peculiar performance observation.... - Mailing list pgsql-general

From	Net Virtual Mailing Lists
Subject	Peculiar performance observation....
Date	March 12, 2005 13:55:26
Msg-id	20050312105407.593@mail.net-virtual.com Whole thread Raw
Responses	Re: Peculiar performance observation....
List	pgsql-general

Tree view

I have a rather peculiar performance observation and would welcome any
feedback on this.....

First off, the main table (well, part of it.. it is quite large..):


                                             Table "table1"
       Column       |           Type           |
   Modifiers
--------------------+--------------------------
+-----------------------------------------------------------------
 id                 | integer                  | not null default
nextval('master.id_seq'::text)
 user_id            | integer                  |
 ... (skipping about 20 columns)
 category           | ltree[]                  |
 somedata           | text                     | not null



Indexes:
    "table1_pkey" primary key, btree (id)
    "table1_category_full_gist_idx" gist (category)
    "table1_id_idx" btree (id)
    "table1_fti_idx" gist (fti) WHERE ((status)::text = 'open'::text)
    "table1_user_id_idx" btree (user_id)


database=> explain analyze select id from table1 where category <@ 'a.b';
                                                                    QUERY
PLAN
-------------------------------------
-------------------------------------
-------------------------------------------------------------------------
 Index Scan using table1_category_full_gist_idx on jobdata
(cost=0.00..113.48 rows=28 width=4) (actual time=43.814..12201.528
rows=1943 loops=1)
   Index Cond: (category <@ 'a.b'::ltree)
   Filter: (category <@ 'a.b'::ltree)
 Total runtime: 12222.258 ms


If I do this:

create table yuck (id integer, category ltree[]);
insert into yuck select id, category from table1;
create index category_idx on yuck using gist(category);
vacuum analyze yuck;
jobs=> explain analyze select id from table1 where id in (select id from
yuck where category <@ 'a.b');
                                                              QUERY PLAN

-------------------------------------
-------------------------------------
-------------------------------------------------------------
 Nested Loop  (cost=108.64..114.28 rows=1 width=52) (actual
time=654.645..1245.212 rows=1943 loops=1)
   ->  HashAggregate  (cost=108.64..108.64 rows=1 width=4) (actual
time=654.202..690.709 rows=1943 loops=1)
         ->  Index Scan using category_idx on yuck  (cost=0.00..108.57
rows=28 width=4) (actual time=2.046..623.436 rows=1943 loops=1)
               Index Cond: (category <@ 'a.b'::ltree)
               Filter: (category <@ 'a.b'::ltree)
   ->  Index Scan using table1_pkey on jobdata  (cost=0.00..5.64 rows=1
width=52) (actual time=0.219..0.235 rows=1 loops=1943)
         Index Cond: (table1.id = "outer".id)
 Total runtime: 1261.551 ms
(8 rows)


In the first query, my hard disk trashes audibly the entire 12 seconds
(this is actually the best run I could get, it is usually closer to 20
seconds), the second query runs almost effortlessly..  I've tried
reindexing, even dropping the index and recreating it but nothing I do
helps at all.

Now keep in mind that I do all of my development on painfully slow
hardware in order to make any performance issues really stand out.  But,
I've done this on production servers too with an equal performance
improvement noticed.

I just can't figure out why this second query is so much faster, I feel
like I must have done something very wrong in my schema design or
something to be suffering this sort of a performance loss.   Any idea
what I can do about this?

Thanks as always!

- Greg

pgsql-general by date:

From: "ILove TheSpam"
Date: 12 March 2005, 13:41:17
Subject: Re: Unique Indexes

From: "Net Virtual Mailing Lists"
Date: 12 March 2005, 14:12:38
Subject: Re: Peculiar performance observation....

Peculiar performance observation.... - Mailing list pgsql-general

Previous

Next