Re: Peculiar performance observation.... - Mailing list pgsql-general

From Net Virtual Mailing Lists
Subject Re: Peculiar performance observation....
Date
Msg-id 20050314121145.20753@mail.net-virtual.com
Whole thread Raw
In response to Re: Peculiar performance observation....  ("Net Virtual Mailing Lists" <mailinglists@net-virtual.com>)
Responses Re: Peculiar performance observation....
Re: Peculiar performance observation....
List pgsql-general
Hello,


I am sorry to bring this up again.... Does anyone have any idea what
might be going on here?...   I'm very worried about this situation.. ;-(

- Greg


>Something even more peculiar (at least it seems to me..)...
>
>
>If I drop the index table1_category_gist_idx, I get this:
>
>
>jobs=> explain analyze select id from table1 where category <@ 'a.b'
>ORDER BY category;
>                                                    QUERY PLAN
>
>-------------------------------------
>------------------------------------
------------------------------------------
> Sort  (cost=7568.55..7568.62 rows=28 width=52) (actual
>time=4842.691..4854.468 rows=1943 loops=1)
>   Sort Key: category
>   ->  Seq Scan on jobdata  (cost=0.00..7567.88 rows=28 width=52) (actual
>time=11.498..4800.907 rows=1943 loops=1)
>         Filter: (category <@ 'a.b'::ltree)
> Total runtime: 4871.076 ms
>(5 rows)
>
>
>.. no disk thrashing all over the place..
>
>I'm really perplexed about this one..;-(
>
>- Greg
>
>>I have a rather peculiar performance observation and would welcome any
>>feedback on this.....
>>
>>First off, the main table (well, part of it.. it is quite large..):
>>
>>
>>                                             Table "table1"
>>       Column       |           Type           |
>>   Modifiers
>>--------------------+--------------------------
>>+-----------------------------------------------------------------
>> id                 | integer                  | not null default
>>nextval('master.id_seq'::text)
>> user_id            | integer                  |
>> ... (skipping about 20 columns)
>> category           | ltree[]                  |
>> somedata           | text                     | not null
>>
>>
>>
>>Indexes:
>>    "table1_pkey" primary key, btree (id)
>>    "table1_category_full_gist_idx" gist (category)
>>    "table1_id_idx" btree (id)
>>    "table1_fti_idx" gist (fti) WHERE ((status)::text = 'open'::text)
>>    "table1_user_id_idx" btree (user_id)
>>
>>
>>database=> explain analyze select id from table1 where category <@ 'a.b';
>>                                                                    QUERY
>>PLAN
>>-------------------------------------
>>-------------------------------------
>>-------------------------------------------------------------------------
>> Index Scan using table1_category_full_gist_idx on jobdata
>>(cost=0.00..113.48 rows=28 width=4) (actual time=43.814..12201.528
>>rows=1943 loops=1)
>>   Index Cond: (category <@ 'a.b'::ltree)
>>   Filter: (category <@ 'a.b'::ltree)
>> Total runtime: 12222.258 ms
>>
>>
>>If I do this:
>>
>>create table yuck (id integer, category ltree[]);
>>insert into yuck select id, category from table1;
>>create index category_idx on yuck using gist(category);
>>vacuum analyze yuck;
>>jobs=> explain analyze select id from table1 where id in (select id from
>>yuck where category <@ 'a.b');
>>                                                              QUERY PLAN
>>
>>-------------------------------------
>>-------------------------------------
>>-------------------------------------------------------------
>> Nested Loop  (cost=108.64..114.28 rows=1 width=52) (actual
>>time=654.645..1245.212 rows=1943 loops=1)
>>   ->  HashAggregate  (cost=108.64..108.64 rows=1 width=4) (actual
>>time=654.202..690.709 rows=1943 loops=1)
>>         ->  Index Scan using category_idx on yuck  (cost=0.00..108.57
>>rows=28 width=4) (actual time=2.046..623.436 rows=1943 loops=1)
>>               Index Cond: (category <@ 'a.b'::ltree)
>>               Filter: (category <@ 'a.b'::ltree)
>>   ->  Index Scan using table1_pkey on jobdata  (cost=0.00..5.64 rows=1
>>width=52) (actual time=0.219..0.235 rows=1 loops=1943)
>>         Index Cond: (table1.id = "outer".id)
>> Total runtime: 1261.551 ms
>>(8 rows)
>>
>>
>>In the first query, my hard disk trashes audibly the entire 12 seconds
>>(this is actually the best run I could get, it is usually closer to 20
>>seconds), the second query runs almost effortlessly..  I've tried
>>reindexing, even dropping the index and recreating it but nothing I do
>>helps at all.
>>
>>Now keep in mind that I do all of my development on painfully slow
>>hardware in order to make any performance issues really stand out.  But,
>>I've done this on production servers too with an equal performance
>>improvement noticed.
>>
>>I just can't figure out why this second query is so much faster, I feel
>>like I must have done something very wrong in my schema design or
>>something to be suffering this sort of a performance loss.   Any idea
>>what I can do about this?
>>
>>Thanks as always!
>>
>>- Greg


pgsql-general by date:

Previous
From: "Rick Schumeyer"
Date:
Subject: prelimiary performance comparison pgsql vs mysql
Next
From: Miroslav Šulc
Date:
Subject: Re: prelimiary performance comparison pgsql vs mysql