Re: Improve Query - Mailing list pgsql-performance

From Zach Calvert
Subject Re: Improve Query
Date
Msg-id 89af81ab0905270925r23e381e5tc6395c9d215f6ae4@mail.gmail.com
Whole thread Raw
In response to Re: Improve Query  (Nikolas Everett <nik9000@gmail.com>)
Responses Re: Improve Query
List pgsql-performance
I'm running the inserts now via a JDBC call I have, which is then
followed up by the query I'm showing and a few others.  I have run
tests on all of the others, and all others run index scans and are
very fast, 10 ms or less.  This one started at 2 milliseconds when the
table is empty and is up to 40 milliseconds with 40K inserts.  It is
degrading fast and I can't imagine what will happen with 400K, let
alone 400 million.

It is getting slower at a fairly fast clip and I need it to remain
fast.  Does postgre just not do count(*) with index scans?  Is that my
problem?

I'm still running the exact same query.  Here are the indexes I have tried
CREATE INDEX idx_score_score
  ON score
  USING btree
  (score);

CREATE INDEX idx_score_ldbscore
  ON score
  USING btree
  (leaderboardid, score);

CREATE INDEX idx_score_ldbactive
  ON score
  USING btree
  (leaderboardid, active);


CREATE INDEX idx_score_ldbactivescore
  ON score
  USING btree
  (leaderboardid, active, score);

CREATE INDEX idx_score_scoreactiveldb
  ON score
  USING btree
  (score, active, leaderboardid);

Yet still I run
explain analyze
select count(*)
from score
where leaderboardid=35 and active and score <= 6841

and get
"Aggregate  (cost=2641.29..2641.30 rows=1 width=0) (actual
time=134.826..134.826 rows=1 loops=1)"
"  ->  Seq Scan on score  (cost=0.00..2536.44 rows=41938 width=0)
(actual time=0.011..126.250 rows=40918 loops=1)"
"        Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
"Total runtime: 48.891 ms"




On Wed, May 27, 2009 at 11:06 AM, Nikolas Everett <nik9000@gmail.com> wrote:
> The plan ought to be different when there are more scores and the table is
> analyzed and your statistics target is high enough.  At this point you don't
> have enough data to merit doing anything but a seq scan.  The overhead is
> simply not worth it.
>
> You could try inserting a lot more rows.  I'd create a function to do
> several million inserts with random numbers and then analyze and rerun.
>
> I think in the end your probably going to see a couple of bitmap index
> scans, anding the results together, and then a bitmap scan.
>
> Keep in mind that postgres stores statistics information about each column,
> but doesn't at this point store statistics about two columns together.
>
> Preparing the query might actually hurt performance because postgres treats
> a prepare as "plan this query but I'm not going to tell you value of the
> parameters".  If you actually let the query replan every time then you will
> get a different plan for the leaderboards or score ranges that are more
> popular.
>
> On Wed, May 27, 2009 at 8:09 AM, Zach Calvert
> <zachcalvert@hemerasoftware.com> wrote:
>>
>> So Google hasn't been helpful and I'm not entirely sure what  to look
>> for in the mailing lists to find the answer to my problem, so here
>> goes.
>>
>> I have a query and I have run
>> explain analyze
>> select count(*)
>> from score
>> where leaderboardid=35 and score <= 6841 and active
>>
>> The result is
>> "Aggregate  (cost=2491.06..2491.07 rows=1 width=0) (actual
>> time=38.878..38.878 rows=1 loops=1)"
>> "  ->  Seq Scan on score  (cost=0.00..2391.17 rows=39954 width=0)
>> (actual time=0.012..30.760 rows=38571 loops=1)"
>> "        Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
>> "Total runtime: 38.937 ms"
>>
>> I have an index on score, I have an index on score and leaderboard and
>> active.  I can't seem to figure out how to create an index that will
>> turn that "Seq Scan" into an index scan. The biggest problem is that
>> the query degrades very quickly with a lot more rows and I will be
>> getting A LOT MORE rows.  What can I do to improve the performance of
>> this query?
>>
>>
>>
>>
>>
>> Thank you so much,
>> ZC
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
>

pgsql-performance by date:

Previous
From: Nikolas Everett
Date:
Subject: Re: Improve Query
Next
From: Grzegorz Jaśkiewicz
Date:
Subject: Re: Improve Query