Re: Table partition for very large table - Mailing list pgsql-general

From Yudie Pg
Subject Re: Table partition for very large table
Date
Msg-id e460d0c05032816073a4100b7@mail.gmail.com
Whole thread Raw
In response to Re: Table partition for very large table  (Scott Marlowe <smarlowe@g2switchworks.com>)
List pgsql-general
> > create index prdt_new_url_dx on prdt_new (url)
> > create index prdt_new_sku_dx on prdt_new (sku)
> > create index prdt_old_sku_dx on prdt_old (sku)
> > create index prdt_new_url_null_dx on prdt_new (url) where prdt_new.url
> > IS NULL

I added indexes & redo the analyze - Query plan looks better,
But when I execute the query it still can't finish all at once. (i've
been waiting more than 30 minutes) and seems it time out (Error
"canceling query due to user request") :-B
Maybe you can help to analyze this query plan (the second one) to see
what make it slow?

QUERY PLAN (BEFORE ANALYZE):
Merge Left Join  (cost=1886617.54..1960855.12 rows=4979571 width=19)
  Merge Cond: (("outer"."?column3?" = "inner"."?column4?") AND
("outer".groupnum = "inner".groupnum))
  Filter: (("inner".url IS NULL) OR (("inner".url)::text = ''::text))
  ->  Sort  (cost=969258.98..981707.91 rows=4979571 width=19)
        Sort Key: (mc.sku)::text, mc.groupnum
        ->  Seq Scan on prdt_old mc  (cost=0.00..297611.71
rows=4979571 width=19)
  ->  Sort  (cost=917358.56..928785.51 rows=4570779 width=82)
        Sort Key: (mi.sku)::text, mi.groupnum
        ->  Seq Scan on prdt_new mi  (cost=0.00..126438.79
rows=4570779 width=82)

QUERY PLAN AFTER ANALYZE:
Nested Loop Left Join  (cost=0.00..233277.42 rows=23747 width=17)
  Join Filter: ("inner".groupnum = "outer".groupnum)
  Filter: (("inner".url IS NULL) OR (("inner".url)::text = ''::text))
  ->  Index Scan using idx_prdtold_sku on prdt_old mc
(cost=0.00..112211.85 rows=23747 width=17)
  ->  Index Scan using idx_prdtnew_sku on prdt_new mi
(cost=0.00..5.08 rows=1 width=82)
        Index Cond: ((mi.sku)::text = ("outer".sku)::text)

I work around with indexes before and I did not sure to create index
for the url field because it is a text field and not the main key for
this query. But I just know that we can create index with condition
(in this case where url is null), I guess it may cut some of the query
time.

I seperate the query by groupnum instead querying the whole table that
cause the bottle neck effect. It works but yes it's slow (less than 1
hour) but goes thru.
I wish could do this simultaniously.

pgsql-general by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Table partition for very large table
Next
From: Richard Villalón
Date:
Subject: Error en la instalacion pgsql 8 en xp