improving performance for a delete - Mailing list pgsql-performance
From | kevin kempter |
---|---|
Subject | improving performance for a delete |
Date | |
Msg-id | 1F6D96EA-31C8-47D0-9D72-664639C387C0@kevinkempterllc.com Whole thread Raw |
Responses |
Re: improving performance for a delete
|
List | pgsql-performance |
Hi all; I have 2 tables where I basically want to delete from the first table (seg_id_tmp7) any rows where the entire row already exists in the second table (sl_cd_segment_dim) I have a query that looks like this (and it's slow): delete from seg_id_tmp7 where customer_srcid::text || show_srcid::text || show_name::text || season_srcid::text || season_name::text || episode_srcid::text || episode_name::text || segment_type_id::text || segment_type::text || segment_srcid::text || segment_name::text in ( select customer_srcid::text || show_srcid::text || show_name::text || season_srcid::text || season_name::text || episode_srcid::text || episode_name::text || segment_type_id::text || segment_type::text || segment_srcid::text || segment_name::text from sl_cd_location_dim ) ; Here's the query plan for it: QUERY PLAN ----------------------------------------------------------------------------------- Seq Scan on seg_id_tmp7 (cost=0.00..138870701.56 rows=2136 width=6) Filter: (subplan) SubPlan -> Seq Scan on sl_cd_location_dim (cost=0.00..63931.60 rows=433040 width=8) (4 rows) I also tried this: delete from seg_id_tmp7 where ( customer_srcid , show_srcid , show_name , season_srcid , season_name , episode_srcid , episode_name , segment_type_id , segment_type , segment_srcid , segment_name ) in ( select customer_srcid , show_srcid , show_name , season_srcid , season_name , episode_srcid , episode_name , segment_type_id , segment_type , segment_srcid , segment_name from sl_cd_location_dim ) ; and I get this query plan: QUERY PLAN ----------------------------------------------------------------------------------- Seq Scan on seg_id_tmp7 (cost=0.00..87997034.20 rows=2136 width=6) Filter: (subplan) SubPlan -> Seq Scan on sl_cd_location_dim (cost=0.00..40114.40 rows=433040 width=8) (4 rows) If it helps here's the describe's (including indexes) for both tables: # \d seg_id_tmp7 Table "public.seg_id_tmp7" Column | Type | Modifiers -----------------+-----------------------------+----------- customer_srcid | bigint | show_srcid | bigint | show_name | character varying | season_srcid | bigint | season_name | character varying | episode_srcid | bigint | episode_name | character varying | segment_type_id | bigint | segment_type | character varying | segment_srcid | bigint | segment_name | character varying | create_dt | timestamp without time zone | # \d sl_cd_segment_dim Table "public.sl_cd_segment_dim" Column | Type | Modifiers ----------------------+----------------------------- +------------------------------------------------------------- sl_cd_segment_dim_id | bigint | not null default nextval('sl_cd_segment_dim_seq'::regclass) customer_srcid | bigint | not null show_srcid | bigint | not null show_name | character varying(500) | not null season_srcid | bigint | not null season_name | character varying(500) | not null episode_srcid | bigint | not null episode_name | character varying(500) | not null segment_type_id | integer | segment_type | character varying(500) | segment_srcid | bigint | segment_name | character varying(500) | effective_dt | timestamp without time zone | not null default now() inactive_dt | timestamp without time zone | last_update_dt | timestamp without time zone | not null default now() Indexes: "sl_cd_segment_dim_pk" PRIMARY KEY, btree (sl_cd_segment_dim_id) "seg1" btree (customer_srcid) "seg2" btree (show_srcid) "seg3" btree (season_srcid) "seg4" btree (episode_srcid) "seg5" btree (segment_srcid) "sl_cd_segment_dim_ix1" btree (customer_srcid) Any thoughts, suggestions, etc on how to improve performance for this delete ? Thanks in advance.. /Kevin
pgsql-performance by date: