Re: Simple Join - Mailing list pgsql-performance

From Kevin Brown
Subject Re: Simple Join
Date
Msg-id 200512141752.45866.blargity@gmail.com
In response to Re: Simple Join  (Mark Kirkwood <markir@paradise.net.nz>)
Responses Re: Simple Join
List pgsql-performance
On Wednesday 14 December 2005 17:30, Mark Kirkwood wrote:
> You scan 600000 rows from to_ship to get about 25000 - so some way to
> cut this down would help.

Yup.  I'm open to anything, too, as this is the only part of the system that
really cares.  So maintaining a denormalized copy of the column, or whatever
else, would be fine.  We're doing far more reads than writes.

> Try out an explicit INNER JOIN which includes the filter info for paid
> and suspended_sub in the join condition (you may need indexes on each of
> id, paid and suspended_sub, so that the 8.1 optimizer can use a bitmap
> scan):

I only had two explicit indexes.  One was on to_ship.ordered_product_id and
the other was on ordered_products.paid.  ordered_products.id is a primary
key.  This is on your query with an index added on suspended_sub:
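
For concreteness, the index setup described above amounts to something like
this (the index names other than paid_index don't appear in the thread and
are assumed for illustration):

```sql
-- Existing indexes; only paid_index shows up by name in the plan below.
CREATE INDEX to_ship_opid_index ON to_ship (ordered_product_id);
CREATE INDEX paid_index ON ordered_products (paid);
-- ordered_products.id is the primary key, so it's indexed implicitly.

-- Newly added for this test:
CREATE INDEX suspended_sub_index ON ordered_products (suspended_sub);
```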

dli=# explain analyze SELECT ordered_products.product_id
dli-# FROM to_ship INNER JOIN ordered_products
dli-# ON (to_ship.ordered_product_id = ordered_products.id
dli(#      AND ordered_products.paid = TRUE      AND
dli(# ordered_products.suspended_sub = FALSE);
                                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=5126.19..31528.40 rows=20591 width=8) (actual time=4554.190..23519.618 rows=14367 loops=1)
   Hash Cond: ("outer".ordered_product_id = "inner".id)
   ->  Seq Scan on to_ship  (cost=0.00..11529.12 rows=611612 width=8) (actual time=11.254..15192.042 rows=611612 loops=1)
   ->  Hash  (cost=4954.79..4954.79 rows=21759 width=16) (actual time=4494.900..4494.900 rows=18042 loops=1)
         ->  Index Scan using paid_index on ordered_products  (cost=0.00..4954.79 rows=21759 width=16) (actual time=72.431..4414.697 rows=18042 loops=1)
               Index Cond: (paid = true)
               Filter: (paid AND (NOT suspended_sub))
 Total runtime: 23532.785 ms
(8 rows)

So what's the best way to get this info out of the db without killing
performance?  The list of values is only about 30 tuples long out of this
query, so I was figuring I could add a trigger on insert to to_ship to place
the value into another table if it didn't already exist.  I'd rather the
writing be slow than the reading.
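
A rough sketch of that trigger idea, assuming a summary table and a
product_id column on ordered_products (the table and trigger names here are
made up, not from the thread):

```sql
-- Hypothetical summary table holding the ~30 distinct values.
CREATE TABLE shipped_products (product_id integer PRIMARY KEY);

CREATE OR REPLACE FUNCTION record_shipped_product() RETURNS trigger AS $$
DECLARE
    pid integer;
BEGIN
    -- Look up the product for the row being shipped, applying the same
    -- paid / not-suspended filter as the big query.
    SELECT product_id INTO pid
      FROM ordered_products
     WHERE id = NEW.ordered_product_id
       AND paid AND NOT suspended_sub;

    IF FOUND AND NOT EXISTS
        (SELECT 1 FROM shipped_products WHERE product_id = pid) THEN
        INSERT INTO shipped_products (product_id) VALUES (pid);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER to_ship_record_product
    AFTER INSERT ON to_ship
    FOR EACH ROW EXECUTE PROCEDURE record_shipped_product();
```

With something like that in place, the read side collapses to a plain
SELECT product_id FROM shipped_products, at the cost of one extra lookup per
insert -- which fits a workload with far more reads than writes.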
