> > create index prdt_new_url_dx on prdt_new (url)
> > create index prdt_new_sku_dx on prdt_new (sku)
> > create index prdt_old_sku_dx on prdt_old (sku)
> > create index prdt_new_url_null_dx on prdt_new (url) where prdt_new.url
> > IS NULL
I added indexes & redo the analyze - Query plan looks better,
But when I execute the query it still can't finish all at once. (i've
been waiting more than 30 minutes) and seems it time out (Error
"canceling query due to user request") :-B
Maybe you can help to analyze this query plan (the second one) to see
what make it slow?
QUERY PLAN (BEFORE ANALYZE):
Merge Left Join (cost=1886617.54..1960855.12 rows=4979571 width=19)
Merge Cond: (("outer"."?column3?" = "inner"."?column4?") AND
("outer".groupnum = "inner".groupnum))
Filter: (("inner".url IS NULL) OR (("inner".url)::text = ''::text))
-> Sort (cost=969258.98..981707.91 rows=4979571 width=19)
Sort Key: (mc.sku)::text, mc.groupnum
-> Seq Scan on prdt_old mc (cost=0.00..297611.71
rows=4979571 width=19)
-> Sort (cost=917358.56..928785.51 rows=4570779 width=82)
Sort Key: (mi.sku)::text, mi.groupnum
-> Seq Scan on prdt_new mi (cost=0.00..126438.79
rows=4570779 width=82)
QUERY PLAN AFTER ANALYZE:
Nested Loop Left Join (cost=0.00..233277.42 rows=23747 width=17)
Join Filter: ("inner".groupnum = "outer".groupnum)
Filter: (("inner".url IS NULL) OR (("inner".url)::text = ''::text))
-> Index Scan using idx_prdtold_sku on prdt_old mc
(cost=0.00..112211.85 rows=23747 width=17)
-> Index Scan using idx_prdtnew_sku on prdt_new mi
(cost=0.00..5.08 rows=1 width=82)
Index Cond: ((mi.sku)::text = ("outer".sku)::text)
I work around with indexes before and I did not sure to create index
for the url field because it is a text field and not the main key for
this query. But I just know that we can create index with condition
(in this case where url is null), I guess it may cut some of the query
time.
I seperate the query by groupnum instead querying the whole table that
cause the bottle neck effect. It works but yes it's slow (less than 1
hour) but goes thru.
I wish could do this simultaniously.