Thread: Very poor performance

Very poor performance

From

"Aaron Burnett"

Date:

16 August 2010, 22:19:39

Hi,

I'm hoping someone can offer some help here. The query and explain analyze and table layout are below and attached in a text file if the formatting is bad.

The query is part of a bigger query that our front end runs. This is the part that takes forever (84 minutes in this case) to finish and more often than not the front end times out. The table (answerselectinstance) has 168664317 rows while the member table has 626435 rows.

Postgres Version 8.25
CentOs 5.2
16 Gig RAM
192MB work_mem (increasing to 400MB didn't change the outcome)
very light use on this server, it ais a slave to a slony replicated master/slave setup.

Again, apologies if the formatting got munged, the attached text file has the same info.

Thanking you in advance for any help and suggestions.

Aaron

explain analyze select distinct(id) from member where id in (select memberid from answerselectinstance where   nswerid = 127443 OR answerid = 127444 OR answerid = 127445 OR answerid = 127446 OR answerid = 127447 OR answerid = 127448   ) ;

LOG: duration: 5076038.709 ms statement: explain analyze select distinct(id) from member where id in (select memberid from answerselectinstance where   answerid = 127443 OR answerid = 127444 OR answerid = 127445 OR answerid = 127446 OR answerid = 127447 OR answerid = 127448   ) ;
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=101357.24..101357.28 rows=9 width=4) (actual time=5075511.974..5075911.077 rows=143520 loops=1)
   -> Sort (cost=101357.24..101357.26 rows=9 width=4) (actual time=5075511.971..5075644.323 rows=143520 loops=1)
         Sort Key: member.id
         -> Nested Loop IN Join (cost=0.00..101357.10 rows=9 width=4) (actual time=19.867..5075122.724 rows=143520 loops=1)
               -> Seq Scan on member (cost=0.00..78157.65 rows=626265 width=4) (actual time=3.338..2003.582 rows=626410 loops=1)
               -> Index Scan using asi_memberid_idx on answerselectinstance (cost=0.00..444.46 rows=9 width=4) (actual time=8.096..8.096 rows=0 loops=626410)
                     Index Cond: (member.id = answerselectinstance.memberid)
                     Filter: ((answerid = 127443) OR (answerid = 127444) OR (answerid = 127445) OR (answerid = 127446) OR (answerid = 127447) OR (answerid = 127448))
Total runtime: 5076034.203 ms
(9 rows)

     Column     |            Type             |                         Modifiers
----------------+-----------------------------+------------------------------------------------------------
memberid       | integer                     | not null
answerid       | integer                     | not null
taskinstanceid | integer                     | not null default 0
created        | timestamp without time zone | default "timestamp"('now'::text)
id             | integer                     | not null default nextval(('"asi_id_seq"'::text)::regclass)
Indexes:
    "asi_pkey" PRIMARY KEY, btree (id)
    "asi_answerid_idx" btree (answerid)
    "asi_memberid_idx" btree (memberid)
    "asi_taskinstanceid_idx" btree (taskinstanceid)
Triggers:
    _bzzprod_cluster_denyaccess_301 BEFORE INSERT OR DELETE OR UPDATE ON answerselectinstance FOR EACH ROW EXECUTE PROCEDURE _bzzprod_cluster.denyaccess('_bzzprod_cluster')

Attachment

asi_stats.txt

Re: Very poor performance

From

Mark Rostron

Date:

16 August 2010, 22:51:42

This is weird – is there a particular combination of memberid/answered in answerselectindex that has a very high rowcount?

First change I would suggest looking into would be to try changing sub-query logic to check existence and limit the result set of the sub-query to a single row

Select distinct(m.id)

From member m

Where exists (

Select 1

From answerselectinstance a

Where a.member_id = m.id

And a.answerid between 127443 and 127448

Limit 1

)

If member.id is a primary key, you can eliminate the “distinct” i.e. the sort.

Second would be to build a partial index on answersselectindex to index only the memberid’s you are interested in:

“Create index <new_index_name> on answersselectindex(memberid) where answerid between 127443 and 127448”

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Aaron Burnett
Sent: Monday, August 16, 2010 6:07 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Very poor performance

Re: Very poor performance

From

"Aaron Burnett"

Date:

16 August 2010, 23:20:01

Thanks Mark,

Yeah, I apologize, I forgot to mention a couple of things.

m.id is the primary key but the biggest problem is that the query loops 626410 times because at one time people were allowed to delete member.id rows which now will break the application if the a.memberid comes out and it doesn't exist in the member table.

The version you sent me yields pretty much the same results.

All I really SHOULD have to do is query the a.memberid column to get distinct memberid and the query takes less than 2 seconds. The join to the member table and subsequnt 600K loops are the killer. The answerselectinstance table has 166 million rows... so the math is pretty easy on why it's painfully slow.

Other than delting data in the answerselectinstance table to get rid of the orphan memberid's I was hoping someone had a better way to do this.

-----Original Message-----
From: Mark Rostron [mailto:mrostron@ql2.com]
Sent: Mon 8/16/2010 9:51 PM
To: Aaron Burnett; pgsql-performance@postgresql.org
Subject: RE: Very poor performance

This is weird - is there a particular combination of memberid/answered in answerselectindex that has a very high rowcount?

First change I would suggest looking into would be to try changing sub-query logic to check existence and limit the result set of the sub-query to a single row

Select distinct(m.id)
From member m
Where exists (
      Select 1
      From answerselectinstance a
      Where a.member_id = m.id
      And a.answerid between 127443 and 127448
      Limit 1
)

If member.id is a primary key, you can eliminate the "distinct" i.e. the sort.

Second would be to build a partial index on answersselectindex to index only the memberid's you are interested in:

"Create index <new_index_name> on answersselectindex(memberid) where answerid between 127443 and 127448"

Mr

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Aaron Burnett
Sent: Monday, August 16, 2010 6:07 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Very poor performance

Hi,

I'm hoping someone can offer some help here. The query and explain analyze and table layout are below and attached in a text file if the formatting is bad.

The query is part of a bigger query that our front end runs. This is the part that takes forever (84 minutes in this case) to finish and more often than not the front end times out. The table (answerselectinstance) has 168664317 rows while the member table has 626435 rows.

Postgres Version 8.25
CentOs 5.2
16 Gig RAM
192MB work_mem (increasing to 400MB didn't change the outcome)
very light use on this server, it ais a slave to a slony replicated master/slave setup.

Again, apologies if the formatting got munged, the attached text file has the same info.

Thanking you in advance for any help and suggestions.

Aaron

explain analyze select distinct(id) from member where id in (select memberid from answerselectinstance where   nswerid = 127443 OR answerid = 127444 OR answerid = 127445 OR answerid = 127446 OR answerid = 127447 OR answerid = 127448   ) ;

LOG: duration: 5076038.709 ms statement: explain analyze select distinct(id) from member where id in (select memberid from answerselectinstance where   answerid = 127443 OR answerid = 127444 OR answerid = 127445 OR answerid = 127446 OR answerid = 127447 OR answerid = 127448   ) ;
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=101357.24..101357.28 rows=9 width=4) (actual time=5075511.974..5075911.077 rows=143520 loops=1)
   -> Sort (cost=101357.24..101357.26 rows=9 width=4) (actual time=5075511.971..5075644.323 rows=143520 loops=1)
         Sort Key: member.id
         -> Nested Loop IN Join (cost=0.00..101357.10 rows=9 width=4) (actual time=19.867..5075122.724 rows=143520 loops=1)
               -> Seq Scan on member (cost=0.00..78157.65 rows=626265 width=4) (actual time=3.338..2003.582 rows=626410 loops=1)
               -> Index Scan using asi_memberid_idx on answerselectinstance (cost=0.00..444.46 rows=9 width=4) (actual time=8.096..8.096 rows=0 loops=626410)
                     Index Cond: (member.id = answerselectinstance.memberid)
                     Filter: ((answerid = 127443) OR (answerid = 127444) OR (answerid = 127445) OR (answerid = 127446) OR (answerid = 127447) OR (answerid = 127448))
Total runtime: 5076034.203 ms
(9 rows)

     Column     |            Type             |                         Modifiers
----------------+-----------------------------+------------------------------------------------------------
memberid       | integer                     | not null
answerid       | integer                     | not null
taskinstanceid | integer                     | not null default 0
created        | timestamp without time zone | default "timestamp"('now'::text)
id             | integer                     | not null default nextval(('"asi_id_seq"'::text)::regclass)
Indexes:
    "asi_pkey" PRIMARY KEY, btree (id)
    "asi_answerid_idx" btree (answerid)
    "asi_memberid_idx" btree (memberid)
    "asi_taskinstanceid_idx" btree (taskinstanceid)
Triggers:
    _bzzprod_cluster_denyaccess_301 BEFORE INSERT OR DELETE OR UPDATE ON answerselectinstance FOR EACH ROW EXECUTE PROCEDURE _bzzprod_cluster.denyaccess('_bzzprod_cluster')

Re: Very poor performance

From

"Kevin Grittner"

Date:

17 August 2010, 11:18:39

"Aaron Burnett" <aburnett@bzzagent.com> wrote:

> Postgres Version 8.25

Do you mean 8.2.5?  (There is no PostgreSQL version 8.25.)

If you're concerned about performance and you're still on 8.2, you
might want to consider updating to a new major version.

> 16 Gig RAM
> 192MB work_mem  (increasing to 400MB didn't change the outcome)

What other non-default settings do you have?

> explain analyze select distinct(id) from member  where id in
> (select memberid from answerselectinstance where   nswerid =
> 127443  OR answerid = 127444  OR  answerid = 127445  OR  answerid
> = 127446  OR  answerid = 127447  OR  answerid = 127448   ) ;

How does this do?:

explain analyze
select distinct(m.id)
  from answerselectinstance a
  join member m
    on m.id = a.memberid
  where a.answerid between 127443 and 127448
;

-Kevin

Re: Very poor performance

From

"Kevin Grittner"

Date:

17 August 2010, 12:43:48

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:
> "Aaron Burnett" <aburnett@bzzagent.com> wrote:

>>  Postgres Version 8.25
>
> Do you mean 8.2.5?  (There is no PostgreSQL version 8.25.)

I just noticed that there's an 8.0.25 -- if that's what you're
running, it's a bit silly trying to optimize individual slow queries
-- performance has improved dramatically since then.  Upgrade and
see if you still have any issues, and tune from there.

By the way, 8.0 is going out of support as soon as the 9.0 release
comes out; likely next month.

-Kevin

Re: Very poor performance

From

Tom Lane

Date:

17 August 2010, 12:57:41

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> By the way, 8.0 is going out of support as soon as the 9.0 release
> comes out; likely next month.

Small clarification on that: the plan is that there will be exactly
one more minor update of 8.0 (and 7.4).  So it'll go out of support
after the next set of back-branch update releases, which will most
likely *not* be synchronized with 9.0.0 release.

"Next month" might be an accurate statement anyway.  I think we're
probably overdue to make updates.

            regards, tom lane

Re: Very poor performance

From

Mark Rostron

Date:

17 August 2010, 13:21:39

So, building the partial index will avoid the table lookup.

Currently answerselectindex only has single-column indexes on memberid and answerid, so any query with a predicate on both columns is gonna be forced to do an index lookup on one column followed by a table lookup to get the other one (which is what the plan shows).

This will be slower than if you can get it to lookup only an index.

I suggested a partial index (and not a two-column index) to keep it small, and to reduce the likelihood that it will screw up another query.

Anyway – good luck man.

From: Aaron Burnett [mailto:aburnett@bzzagent.com]
Sent: Monday, August 16, 2010 7:20 PM
To: Mark Rostron; pgsql-performance@postgresql.org
Subject: RE: Very poor performance

Re: Very poor performance

From

Aaron Burnett

Date:

17 August 2010, 14:55:37

Thanks for the response kevin. Answers interspersed below.


On 8/17/10 10:18 AM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

> "Aaron Burnett" <aburnett@bzzagent.com> wrote:
>
>> Postgres Version 8.25
>
> Do you mean 8.2.5?  (There is no PostgreSQL version 8.25.)
>

Yeah, missed a '.', it's 8.2.5

> If you're concerned about performance and you're still on 8.2, you
> might want to consider updating to a new major version.
>
>> 16 Gig RAM
>> 192MB work_mem  (increasing to 400MB didn't change the outcome)
>
> What other non-default settings do you have?

maintenance_work_mem = 1024MB
max_stack_depth = 8MB
max_fsm_pages = 8000000
max_fsm_relations = 2000

>
>> explain analyze select distinct(id) from member  where id in
>> (select memberid from answerselectinstance where   nswerid =
>> 127443  OR answerid = 127444  OR  answerid = 127445  OR  answerid
>> = 127446  OR  answerid = 127447  OR  answerid = 127448   ) ;
>
> How does this do?:
>
> explain analyze
> select distinct(m.id)
>   from answerselectinstance a
>   join member m
>     on m.id = a.memberid
>   where a.answerid between 127443 and 127448
> ;
>
> -Kevin

Unfortunately because of the way the application does the building of the
variables (answerid) and the query, these were only coincidentally in
numeric order, so the query and resulting plan will look more like this:
(and it finishes fast)

LOG:  duration: 4875.943 ms  statement: explain analyze select
distinct(m.id)
  from answerselectinstance a
  join member m
    on m.id = a.memberid
  where a.answerid = 127443  OR  answerid = 127444  OR
a.answerid = 127445  OR a.answerid = 127446  OR a.answerid = 127447 OR
a.answerid = 127448;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
--------------------
 Unique  (cost=265346.57..265884.69 rows=107623 width=4) (actual
time=4362.948..4751.042 rows=143563 loops=1)
   ->  Sort  (cost=265346.57..265615.63 rows=107623 width=4) (actual
time=4362.945..4489.002 rows=143820 loops=1)
         Sort Key: m.id
         ->  Hash Join  (cost=112462.72..256351.64 rows=107623 width=4)
(actual time=2246.333..4134.240 rows=143820 loops=1)
               Hash Cond: (a.memberid = m.id)
               ->  Bitmap Heap Scan on answerselectinstance a
(cost=1363.57..142561.92 rows=107623 width=4) (actual time=84.082..1447.093
rows=143820 loops=1)
                     Recheck Cond: ((answerid = 127443) OR (answerid =
127444) OR (answerid = 127445) OR (answerid = 127446) OR (answerid = 127447)
OR (answerid = 127448))
                     ->  BitmapOr  (cost=1363.57..1363.57 rows=107651
width=0) (actual time=41.723..41.723 rows=0 loops=1)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=8.133..8.133 rows=32614
loops=1)
                                 Index Cond: (answerid = 127443)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=6.498..6.498 rows=23539
loops=1)
                                 Index Cond: (answerid = 127444)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=5.935..5.935 rows=20368
loops=1)
                                 Index Cond: (answerid = 127445)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=6.619..6.619 rows=21812
loops=1)
                                 Index Cond: (answerid = 127446)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=3.039..3.039 rows=9562
loops=1)
                                 Index Cond: (answerid = 127447)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=11.490..11.490
rows=35925 loops=1)
                                 Index Cond: (answerid = 127448)
               ->  Hash  (cost=103267.40..103267.40 rows=626540 width=4)
(actual time=2161.933..2161.933 rows=626626 loops=1)
                     ->  Seq Scan on member m  (cost=0.00..103267.40
rows=626540 width=4) (actual time=0.009..1467.145 rows=626626 loops=1)
 Total runtime: 4875.015 ms

I got it to run a million times faster than in it's original form simply by
removing the 'distinct' on the m.id because m.id is a primary key and adding
the distinct to a.memberid, but by changing the query in any way it breaks
some other part of the application as this is just a small part of the total
"building process".

I may be stuck between a rock and a very hard place as we don't have the
resources at this time for someone to rewite the whole building (this is
just a tiny part of the process that does what we call 'group building')
procedure.

Thanks to everyone that has responded thus far. Your input is appreciated
and welcomed.

Aaron

Re: Very poor performance

From

"Kevin Grittner"

Date:

17 August 2010, 15:19:16

Aaron Burnett <aburnett@bzzagent.com> wrote:

>>> 16 Gig RAM

>>> 192MB work_mem  (increasing to 400MB didn't change the outcome)
>>
>> What other non-default settings do you have?
>
> maintenance_work_mem = 1024MB
> max_stack_depth = 8MB
> max_fsm_pages = 8000000
> max_fsm_relations = 2000

Since you haven't set effective_cache_size, you're discouraging some
types of plans which might be worth considering.  This should
normally be set to the sum of your shared_buffers setting and
whatever is cached by the OS; try setting effective_cache_size to
15MB.  Speaking of shared_buffers, are you really at the default for
that, too?  If so, try setting it to somewhere between 1GB and 4GB.
(I would test at 1, 2, and 4 if possible, since the best setting is
dependent on workload.)

You may also want to try adjustments to random_page_cost and
seq_page_cost to see if you get a better plan.  How large is the
active (frequently accessed) portion of your database?  If your RAM
is large enough to cover that, you should probably set both to equal
values somewhere in the range of 0.1 to 0.005.  (Again, testing with
your queries is important.)  If your caching is significant (which I
would expect) but not enough to cover the active portion, you might
want to leave seq_page_cost alone and bring random_page_cost down to
somewhere around 2.

All of these except shared_buffers can be set in your session and
tested quickly and easily, without any need to restart PostgreSQL.

For more information, check the manual and this Wiki page:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

-Kevin

Re: Very poor performance

From

Mark Kirkwood

Date:

17 August 2010, 19:28:01

On 18/08/10 06:19, Kevin Grittner wrote:
>
>
> Since you haven't set effective_cache_size, you're discouraging some
> types of plans which might be worth considering.  This should
> normally be set to the sum of your shared_buffers setting and
> whatever is cached by the OS; try setting effective_cache_size to
> 15MB.  Speaking of shared_buffers, are you really at the default for
> that, too?  If so, try setting it to somewhere between 1GB and 4GB.
> (I would test at 1, 2, and 4 if possible, since the best setting is
> dependent on workload.)
>
>
>

Kevin - I'm guessing you meant to suggest setting effective_cache_size
to 15GB (not 15MB)....

Cheers

Mark

Re: Very poor performance

From

"Kevin Grittner"

Date:

18 August 2010, 00:08:42

Mark Kirkwood  wrote:

> I'm guessing you meant to suggest setting effective_cache_size
> to 15GB (not 15MB)....

Yes.  Sorry about that.

-Kevin

Re: Very poor performance

From

Hannes Frederic Sowa

Date:

18 August 2010, 08:56:09

On Tue, Aug 17, 2010 at 7:54 PM, Aaron Burnett <aburnett@bzzagent.com> wrote:
> Yeah, missed a '.', it's 8.2.5

Centos 5.5 has postgresql 8.4.4 available from the main repository.
You might consider an upgrade.