Re: Peculiar performance observation.... - Mailing list pgsql-general

From: Thomas F. O'Connell
Subject: Re: Peculiar performance observation....
Date:
Msg-id: 3643acee233ea5d0e7f4d82588c18fd8@sitening.com
In response to: Re: Peculiar performance observation.... ("Net Virtual Mailing Lists" <mailinglists@net-virtual.com>)
List: pgsql-general
Well, your estimated vs. actual row counts are way off (28 estimated vs.
1943 actual), so running ANALYZE on table1 might help. Otherwise, what is
your sort_mem set to?
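
Something along these lines (the 8192 is just an illustrative value;
sort_mem is measured in kilobytes):

  ANALYZE table1;
  SHOW sort_mem;
  SET sort_mem = 8192;  -- i.e. 8 MB, for the current session only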

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Mar 14, 2005, at 6:11 AM, Net Virtual Mailing Lists wrote:

> Hello,
>
>
> I am sorry to bring this up again.... Does anyone have any idea what
> might be going on here?...   I'm very worried about this situation..
> ;-(
>
> - Greg
>
>
>> Something even more peculiar (at least it seems to me..)...
>>
>>
>> If I drop the index table1_category_gist_idx, I get this:
>>
>>
>> jobs=> explain analyze select id from table1 where category <@ 'a.b'
>> ORDER BY category;
>>                                 QUERY PLAN
>> ------------------------------------------------------------------------
>> Sort  (cost=7568.55..7568.62 rows=28 width=52) (actual time=4842.691..4854.468 rows=1943 loops=1)
>>   Sort Key: category
>>   ->  Seq Scan on jobdata  (cost=0.00..7567.88 rows=28 width=52) (actual time=11.498..4800.907 rows=1943 loops=1)
>>         Filter: (category <@ 'a.b'::ltree)
>> Total runtime: 4871.076 ms
>> (5 rows)
>>
>>
>> .. no disk thrashing all over the place..
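>>
>> (The same plan can also be reproduced without actually dropping the
>> index -- a sketch using the standard planner switch:
>>
>>   SET enable_indexscan = off;
>>   explain analyze select id from table1 where category <@ 'a.b' ORDER BY category;
>>   SET enable_indexscan = on;
>> )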
>>
>> I'm really perplexed about this one..;-(
>>
>> - Greg
>>
>>> I have a rather peculiar performance observation and would welcome
>>> any
>>> feedback on this.....
>>>
>>> First off, the main table (well, part of it.. it is quite large..):
>>>
>>>
>>>                        Table "table1"
>>>   Column   |   Type    |                    Modifiers
>>> -----------+-----------+--------------------------------------------------
>>>  id        | integer   | not null default nextval('master.id_seq'::text)
>>>  user_id   | integer   |
>>>  ... (skipping about 20 columns)
>>>  category  | ltree[]   |
>>>  somedata  | text      | not null
>>>
>>>
>>> Indexes:
>>>    "table1_pkey" primary key, btree (id)
>>>    "table1_category_full_gist_idx" gist (category)
>>>    "table1_id_idx" btree (id)
>>>    "table1_fti_idx" gist (fti) WHERE ((status)::text = 'open'::text)
>>>    "table1_user_id_idx" btree (user_id)
>>>
>>>
>>> database=> explain analyze select id from table1 where category <@
>>> 'a.b';
>>>
>>>                                 QUERY PLAN
>>> ------------------------------------------------------------------------
>>> Index Scan using table1_category_full_gist_idx on jobdata  (cost=0.00..113.48 rows=28 width=4) (actual time=43.814..12201.528 rows=1943 loops=1)
>>>   Index Cond: (category <@ 'a.b'::ltree)
>>>   Filter: (category <@ 'a.b'::ltree)
>>> Total runtime: 12222.258 ms
>>>
>>>
>>> If I do this:
>>>
>>> create table yuck (id integer, category ltree[]);
>>> insert into yuck select id, category from table1;
>>> create index category_idx on yuck using gist(category);
>>> vacuum analyze yuck;
>>> jobs=> explain analyze select id from table1 where id in
>>>        (select id from yuck where category <@ 'a.b');
>>>                                 QUERY PLAN
>>> ------------------------------------------------------------------------
>>> Nested Loop  (cost=108.64..114.28 rows=1 width=52) (actual time=654.645..1245.212 rows=1943 loops=1)
>>>   ->  HashAggregate  (cost=108.64..108.64 rows=1 width=4) (actual time=654.202..690.709 rows=1943 loops=1)
>>>         ->  Index Scan using category_idx on yuck  (cost=0.00..108.57 rows=28 width=4) (actual time=2.046..623.436 rows=1943 loops=1)
>>>               Index Cond: (category <@ 'a.b'::ltree)
>>>               Filter: (category <@ 'a.b'::ltree)
>>>   ->  Index Scan using table1_pkey on jobdata  (cost=0.00..5.64 rows=1 width=52) (actual time=0.219..0.235 rows=1 loops=1943)
>>>         Index Cond: (table1.id = "outer".id)
>>> Total runtime: 1261.551 ms
>>> (8 rows)
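>>>
>>> (The copy in yuck goes stale as table1 changes, of course; something
>>> along these lines would refresh it -- just a sketch:
>>>
>>>   TRUNCATE yuck;
>>>   INSERT INTO yuck SELECT id, category FROM table1;
>>>   VACUUM ANALYZE yuck;
>>> )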
>>>
>>>
>>> In the first query, my hard disk thrashes audibly for the entire 12
>>> seconds (and this is actually the best run I could get; it is usually
>>> closer to 20 seconds), while the second query runs almost
>>> effortlessly. I've tried reindexing, even dropping the index and
>>> recreating it, but nothing I do helps at all.
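>>> (Specifically, things along these lines -- the exact statements are
>>> from memory:
>>>
>>>   REINDEX INDEX table1_category_full_gist_idx;
>>>   DROP INDEX table1_category_full_gist_idx;
>>>   CREATE INDEX table1_category_full_gist_idx ON table1 USING gist (category);
>>> )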
>>>
>>> Now keep in mind that I do all of my development on painfully slow
>>> hardware in order to make any performance issues really stand out.
>>> But I've tried this on production servers too and saw an equally
>>> large improvement there.
>>>
>>> I just can't figure out why the second query is so much faster. I
>>> feel like I must have done something very wrong in my schema design
>>> to be suffering this sort of performance loss. Any idea what I can
>>> do about this?
>>>
>>> Thanks as always!
>>>
>>> - Greg
>
>

