Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL
Date
Msg-id 4B597319.8010504@archonet.com
Whole thread Raw
In response to Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL  (Tory M Blue <tmblue@gmail.com>)
Responses Re: Data Set Growth causing 26+hour runtime, on what we believe to be very simple SQL  (Tory M Blue <tmblue@gmail.com>)
List pgsql-performance
On 21/01/10 22:15, Tory M Blue wrote:
> ·         Data distribution = In the 98mill records, there are 7000 unique
> makeid's, and 21mill unique UID's. About 41mill of the records have
> tagged=true
>
> ·         Time to execute the following query with indices on makeid and
> tagged = 90-120 seconds. The planner uses the webid index and filters on
> tagged and then rechecks the webid index
>
> *                SELECT COUNT(DISTINCT uid )  AS active_users FROM
> pixelpool.userstats    WHERE makeid ='bmw-ferman' AND tagged =true*
>
> ·         Time to execute the the same query with a combined index on makeid
> and tagged = 60-100 seconds. The planner uses the combined index and then
> filters tagged.

Two things:

1. You have got the combined index on (makeid, tagged) and not (tagged,
makeid) haven't you? Just checking.

2. If it's mostly tagged=true you are interested in you can always use a
partial index: CREATE INDEX ... (makeid) WHERE tagged
This might be a win even if you need a second index with WHERE NOT tagged.


Also, either I've not had enough cofee yet, or a bitmap scan is an odd
choice for only ~ 13000 rows out of 100 million.

> *                "        ->   Bitmap Index Scan on
> idx_retargetuserstats_makeidtag  (cost=0.00..360.20 rows=13175 width=0)"*
>
> *                "              Index Cond: ((makeid = 'b1mw-ferman'::text)
> AND (tagged = true))"*

Otherwise, see what Craig said.

I'm assuming this isn't the query that is CPU bound for a long time.
Unless your table is horribly bloated, there's no reason for that
judging by this plan.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: DM
Date:
Subject: Fragmentation/Vacuum, Analyze, Re-Index
Next
From: "elias ghanem"
Date:
Subject: Slow update query