Thread: Beginner Question

Beginner Question

From
"s d"
Date:
Hi,
I am trying to figure out how to debug a performance problem / use psql explain.
The table in question is:
# \d word_association;
           Table "public.word_association"
 Column |          Type          |     Modifiers
--------+------------------------+--------------------
 word1  | character varying(128) | not null
 word2  | character varying(128) | not null
 count  | integer                | not null default 0
Indexes:
    "word1_word2_comb_unique" unique, btree (word1, word2)
    "word1_hash_index" hash (word1)
    "word2_hash_index" hash (word2)
    "word_association_count_index" btree (count)
    "word_association_index1_1" btree (word1)
    "word_association_index2_1" btree (word2)

It has multiple indices since i wanted to see which one the planner choses.


# explain select * FROM word_association WHERE (word1 = 'bdss' OR
word2 = 'bdss')  AND count >= 10;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on word_association  (cost=11.53..1192.09 rows=155 width=22)
   Recheck Cond: (((word1)::text = 'bdss'::text) OR ((word2)::text =
'bdss'::text))
   Filter: (count >= 10)
   ->  BitmapOr  (cost=11.53..11.53 rows=364 width=0)
         ->  Bitmap Index Scan on word_association_index1_1
(cost=0.00..5.79 rows=190 width=0)
               Index Cond: ((word1)::text = 'bdss'::text)
         ->  Bitmap Index Scan on word_association_index2_1
(cost=0.00..5.67 rows=174 width=0)
               Index Cond: ((word2)::text = 'bdss'::text)
(8 rows)

The questions:
1. i can undestand where the cost=11.53 came from but where did the
1192.09 come form? The values are in milli right ?
2. the query takes  in reality much longer than 1 second.

In short, it feels like something is very wrong here (i tried vacuum
analyze and it didn't do much diff).
any ideas ?

Re: Beginner Question

From
Tom Lane
Date:
"s d" <s.d.sauron@gmail.com> writes:
> 1. i can undestand where the cost=11.53 came from but where did the
> 1192.09 come form? The values are in milli right ?

No, the unit of estimated cost is 1 disk page fetch.  See
http://www.postgresql.org/docs/8.2/static/using-explain.html

            regards, tom lane

Re: Beginner Question

From
"s d"
Date:
Hi Jan,
Adding this Index slowed down things by a factor of 4.

Also, the performance is so horrible (example bellow) that i am
certain i am doing something wrong.

Does the following explain gives any ideas ?

Thanks

=#  EXPLAIN ANALYZE select * from word_association where (word1 ='the'
or word2='the') and count > 10;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on word_association  (cost=250.86..7256.59 rows=4624
width=22) (actual time=13.461..211.568 rows=6601 loops=1)
   Recheck Cond: (((word1)::text = 'the'::text) OR ((word2)::text =
'the'::text))
   Filter: (count > 10)
   ->  BitmapOr  (cost=250.86..250.86 rows=12243 width=0) (actual
time=9.052..9.052 rows=0 loops=1)
         ->  Bitmap Index Scan on word_association_index1_1
(cost=0.00..153.20 rows=7579 width=0) (actual time=5.786..5.786
rows=7232 loops=1)
               Index Cond: ((word1)::text = 'the'::text)
         ->  Bitmap Index Scan on word_association_index2_1
(cost=0.00..95.34 rows=4664 width=0) (actual time=3.253..3.253
rows=4073 loops=1)
               Index Cond: ((word2)::text = 'the'::text)
  Total runtime: 219.987 ms
(9 rows)


On 4/9/07, Jan de Visser <jdevisser@digitalfairway.com> wrote:
> On Monday 09 April 2007 05:09:53 s d wrote:
> > Hi,
> > I am trying to figure out how to debug a performance problem / use psql
> > explain. The table in question is:
> > # \d word_association;
> >            Table "public.word_association"
> >  Column |          Type          |     Modifiers
> > --------+------------------------+--------------------
> >  word1  | character varying(128) | not null
> >  word2  | character varying(128) | not null
> >  count  | integer                | not null default 0
> > Indexes:
> >     "word1_word2_comb_unique" unique, btree (word1, word2)
> >     "word1_hash_index" hash (word1)
> >     "word2_hash_index" hash (word2)
> >     "word_association_count_index" btree (count)
> >     "word_association_index1_1" btree (word1)
> >     "word_association_index2_1" btree (word2)
> >
> > It has multiple indices since i wanted to see which one the planner choses.
> >
> >
> > # explain select * FROM word_association WHERE (word1 = 'bdss' OR
> > word2 = 'bdss')  AND count >= 10;
> >                                            QUERY PLAN
> > ---------------------------------------------------------------------------
> >--------------------- Bitmap Heap Scan on word_association
> > (cost=11.53..1192.09 rows=155 width=22) Recheck Cond: (((word1)::text =
> > 'bdss'::text) OR ((word2)::text = 'bdss'::text))
> >    Filter: (count >= 10)
> >    ->  BitmapOr  (cost=11.53..11.53 rows=364 width=0)
> >          ->  Bitmap Index Scan on word_association_index1_1
> > (cost=0.00..5.79 rows=190 width=0)
> >                Index Cond: ((word1)::text = 'bdss'::text)
> >          ->  Bitmap Index Scan on word_association_index2_1
> > (cost=0.00..5.67 rows=174 width=0)
> >                Index Cond: ((word2)::text = 'bdss'::text)
> > (8 rows)
> >
> > The questions:
> > 1. i can undestand where the cost=11.53 came from but where did the
> > 1192.09 come form? The values are in milli right ?
> > 2. the query takes  in reality much longer than 1 second.
> >
> > In short, it feels like something is very wrong here (i tried vacuum
> > analyze and it didn't do much diff).
> > any ideas ?
>
> You need an index on (word1, word2, count). In your current setup it will have
> to scan all rows that satisfy word1 and word2 to see if count >= 10.
>
> jan
>
>
> --
> --------------------------------------------------------------
> Jan de Visser jdevisser@digitalfairway.com
>
>   Baruk Khazad! Khazad ai-menu!
> --------------------------------------------------------------
>

Re: Beginner Question

From
Mike Gargano
Date:
Yeah, I have a lot of similar problems where an index that I have to
speed up one query is used in another query where it actually slows
it down.  Is there any way to ignore indexes for certain queries?
We've been appending empty strings and adding zero's to the column
data to force it into a filter, but it's a messy hack.  I've tried
ordering the joins in the the most efficent way with a
join_collapse_limit of 1, but it still does uses this index in
parallel with searching an index on another table (i guess the
planner figures it's saving some time up front).

-Mike

On Apr 9, 2007, at 8:45 PM, s d wrote:

> Hi Jan,
> Adding this Index slowed down things by a factor of 4.
>
> Also, the performance is so horrible (example bellow) that i am
> certain i am doing something wrong.
>
> Does the following explain gives any ideas ?
>
> Thanks
>
> =#  EXPLAIN ANALYZE select * from word_association where (word1 ='the'
> or word2='the') and count > 10;
>
> QUERY PLAN
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> ----
> Bitmap Heap Scan on word_association  (cost=250.86..7256.59 rows=4624
> width=22) (actual time=13.461..211.568 rows=6601 loops=1)
>   Recheck Cond: (((word1)::text = 'the'::text) OR ((word2)::text =
> 'the'::text))
>   Filter: (count > 10)
>   ->  BitmapOr  (cost=250.86..250.86 rows=12243 width=0) (actual
> time=9.052..9.052 rows=0 loops=1)
>         ->  Bitmap Index Scan on word_association_index1_1
> (cost=0.00..153.20 rows=7579 width=0) (actual time=5.786..5.786
> rows=7232 loops=1)
>               Index Cond: ((word1)::text = 'the'::text)
>         ->  Bitmap Index Scan on word_association_index2_1
> (cost=0.00..95.34 rows=4664 width=0) (actual time=3.253..3.253
> rows=4073 loops=1)
>               Index Cond: ((word2)::text = 'the'::text)
>  Total runtime: 219.987 ms
> (9 rows)
>
>
> On 4/9/07, Jan de Visser <jdevisser@digitalfairway.com> wrote:
>> On Monday 09 April 2007 05:09:53 s d wrote:
>> > Hi,
>> > I am trying to figure out how to debug a performance problem /
>> use psql
>> > explain. The table in question is:
>> > # \d word_association;
>> >            Table "public.word_association"
>> >  Column |          Type          |     Modifiers
>> > --------+------------------------+--------------------
>> >  word1  | character varying(128) | not null
>> >  word2  | character varying(128) | not null
>> >  count  | integer                | not null default 0
>> > Indexes:
>> >     "word1_word2_comb_unique" unique, btree (word1, word2)
>> >     "word1_hash_index" hash (word1)
>> >     "word2_hash_index" hash (word2)
>> >     "word_association_count_index" btree (count)
>> >     "word_association_index1_1" btree (word1)
>> >     "word_association_index2_1" btree (word2)
>> >
>> > It has multiple indices since i wanted to see which one the
>> planner choses.
>> >
>> >
>> > # explain select * FROM word_association WHERE (word1 = 'bdss' OR
>> > word2 = 'bdss')  AND count >= 10;
>> >                                            QUERY PLAN
>> >
>> ---------------------------------------------------------------------
>> ------
>> >--------------------- Bitmap Heap Scan on word_association
>> > (cost=11.53..1192.09 rows=155 width=22) Recheck Cond:
>> (((word1)::text =
>> > 'bdss'::text) OR ((word2)::text = 'bdss'::text))
>> >    Filter: (count >= 10)
>> >    ->  BitmapOr  (cost=11.53..11.53 rows=364 width=0)
>> >          ->  Bitmap Index Scan on word_association_index1_1
>> > (cost=0.00..5.79 rows=190 width=0)
>> >                Index Cond: ((word1)::text = 'bdss'::text)
>> >          ->  Bitmap Index Scan on word_association_index2_1
>> > (cost=0.00..5.67 rows=174 width=0)
>> >                Index Cond: ((word2)::text = 'bdss'::text)
>> > (8 rows)
>> >
>> > The questions:
>> > 1. i can undestand where the cost=11.53 came from but where did the
>> > 1192.09 come form? The values are in milli right ?
>> > 2. the query takes  in reality much longer than 1 second.
>> >
>> > In short, it feels like something is very wrong here (i tried
>> vacuum
>> > analyze and it didn't do much diff).
>> > any ideas ?
>>
>> You need an index on (word1, word2, count). In your current setup
>> it will have
>> to scan all rows that satisfy word1 and word2 to see if count >= 10.
>>
>> jan
>>
>>
>> --
>> --------------------------------------------------------------
>> Jan de Visser jdevisser@digitalfairway.com
>>
>>   Baruk Khazad! Khazad ai-menu!
>> --------------------------------------------------------------
>>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                http://www.postgresql.org/about/donate
>


Re: Beginner Question

From
"Dave Dutcher"
Date:
In your first post you said that the query is taking much longer than a
second, and in your second post you say the performance is horrible, but
explain analyze shows the query runs in 219 milliseconds, which doesn't seem
too bad to me.  I wonder if the slow part for you is returning all the rows
to the client?  How are you running this query?  (JDBC, ODBC, C library?)
Do you really need all the rows?  Maybe you could use a cursor to page
through the rows?

Dave


> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of s d
> Sent: Monday, April 09, 2007 7:46 PM
> To: Jan de Visser
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Beginner Question
>
>
> Hi Jan,
> Adding this Index slowed down things by a factor of 4.
>
> Also, the performance is so horrible (example bellow) that i am
> certain i am doing something wrong.
>
> Does the following explain gives any ideas ?
>
> Thanks
>
> =#  EXPLAIN ANALYZE select * from word_association where (word1 ='the'
> or word2='the') and count > 10;
>
> QUERY PLAN
> --------------------------------------------------------------
> --------------------------------------------------------------
> --------------------
>  Bitmap Heap Scan on word_association  (cost=250.86..7256.59 rows=4624
> width=22) (actual time=13.461..211.568 rows=6601 loops=1)
>    Recheck Cond: (((word1)::text = 'the'::text) OR ((word2)::text =
> 'the'::text))
>    Filter: (count > 10)
>    ->  BitmapOr  (cost=250.86..250.86 rows=12243 width=0) (actual
> time=9.052..9.052 rows=0 loops=1)
>          ->  Bitmap Index Scan on word_association_index1_1
> (cost=0.00..153.20 rows=7579 width=0) (actual time=5.786..5.786
> rows=7232 loops=1)
>                Index Cond: ((word1)::text = 'the'::text)
>          ->  Bitmap Index Scan on word_association_index2_1
> (cost=0.00..95.34 rows=4664 width=0) (actual time=3.253..3.253
> rows=4073 loops=1)
>                Index Cond: ((word2)::text = 'the'::text)
>   Total runtime: 219.987 ms
> (9 rows)


Re: Beginner Question

From
"Jan de Visser"
Date:
On Monday 09 April 2007 05:09:53 s d wrote:
> Hi,
> I am trying to figure out how to debug a performance problem / use psql
> explain. The table in question is:
> # \d word_association;
>            Table "public.word_association"
>  Column |          Type          |     Modifiers
> --------+------------------------+--------------------
>  word1  | character varying(128) | not null
>  word2  | character varying(128) | not null
>  count  | integer                | not null default 0
> Indexes:
>     "word1_word2_comb_unique" unique, btree (word1, word2)
>     "word1_hash_index" hash (word1)
>     "word2_hash_index" hash (word2)
>     "word_association_count_index" btree (count)
>     "word_association_index1_1" btree (word1)
>     "word_association_index2_1" btree (word2)
>
> It has multiple indices since i wanted to see which one the planner choses.
>
>
> # explain select * FROM word_association WHERE (word1 = 'bdss' OR
> word2 = 'bdss')  AND count >= 10;
>                                            QUERY PLAN
> ---------------------------------------------------------------------------
>--------------------- Bitmap Heap Scan on word_association
> (cost=11.53..1192.09 rows=155 width=22) Recheck Cond: (((word1)::text =
> 'bdss'::text) OR ((word2)::text = 'bdss'::text))
>    Filter: (count >= 10)
>    ->  BitmapOr  (cost=11.53..11.53 rows=364 width=0)
>          ->  Bitmap Index Scan on word_association_index1_1
> (cost=0.00..5.79 rows=190 width=0)
>                Index Cond: ((word1)::text = 'bdss'::text)
>          ->  Bitmap Index Scan on word_association_index2_1
> (cost=0.00..5.67 rows=174 width=0)
>                Index Cond: ((word2)::text = 'bdss'::text)
> (8 rows)
>
> The questions:
> 1. i can undestand where the cost=11.53 came from but where did the
> 1192.09 come form? The values are in milli right ?
> 2. the query takes  in reality much longer than 1 second.
>
> In short, it feels like something is very wrong here (i tried vacuum
> analyze and it didn't do much diff).
> any ideas ?

You need an index on (word1, word2, count). In your current setup it will have
to scan all rows that satisfy word1 and word2 to see if count >= 10.

jan


--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------