Re: improving performance for a delete - Mailing list pgsql-performance

From kevin kempter
Subject Re: improving performance for a delete
Date
Msg-id 51E11327-BBEB-4463-BB68-5C94589BF663@kevinkempterllc.com
Whole thread Raw
In response to improving performance for a delete  (kevin kempter <kevin@kevinkempterllc.com>)
Responses Re: improving performance for a delete
List pgsql-performance
Version 8.3.1


On May 20, 2008, at 1:51 PM, kevin kempter wrote:

> Hi all;
>
> I have 2 tables where I basically want to delete from the first
> table  (seg_id_tmp7) any rows where the entire row already exists in
> the second table (sl_cd_segment_dim)
>
> I have a query that looks like this (and it's slow):
>
>
> delete from seg_id_tmp7
> where
>     customer_srcid::text ||
>     show_srcid::text ||
>     show_name::text ||
>     season_srcid::text ||
>     season_name::text ||
>     episode_srcid::text ||
>     episode_name::text ||
>     segment_type_id::text ||
>     segment_type::text ||
>     segment_srcid::text ||
>     segment_name::text
> in
>     ( select
>         customer_srcid::text ||
>         show_srcid::text ||
>         show_name::text ||
>         season_srcid::text ||
>         season_name::text ||
>         episode_srcid::text ||
>         episode_name::text ||
>         segment_type_id::text ||
>         segment_type::text ||
>         segment_srcid::text ||
>         segment_name::text
>         from sl_cd_location_dim )
> ;
>
>
>
>
>
> Here's the query plan for it:
>
>                                    QUERY PLAN
> -----------------------------------------------------------------------------------
> Seq Scan on seg_id_tmp7  (cost=0.00..138870701.56 rows=2136 width=6)
>   Filter: (subplan)
>   SubPlan
>     ->  Seq Scan on sl_cd_location_dim  (cost=0.00..63931.60
> rows=433040 width=8)
> (4 rows)
>
>
>
>
>
>
>
>
> I also tried this:
>
> delete from seg_id_tmp7
> where
>     ( customer_srcid ,
>     show_srcid ,
>     show_name ,
>     season_srcid ,
>     season_name ,
>     episode_srcid ,
>     episode_name ,
>     segment_type_id ,
>     segment_type ,
>     segment_srcid ,
>     segment_name )
> in
>     ( select
>         customer_srcid ,
>         show_srcid ,
>         show_name ,
>         season_srcid ,
>         season_name ,
>         episode_srcid ,
>         episode_name ,
>         segment_type_id ,
>         segment_type ,
>         segment_srcid ,
>         segment_name
>         from sl_cd_location_dim )
> ;
>
>
> and I get this query plan:
>
>                                    QUERY PLAN
> -----------------------------------------------------------------------------------
> Seq Scan on seg_id_tmp7  (cost=0.00..87997034.20 rows=2136 width=6)
>   Filter: (subplan)
>   SubPlan
>     ->  Seq Scan on sl_cd_location_dim  (cost=0.00..40114.40
> rows=433040 width=8)
> (4 rows)
>
>
>
> If it helps here's the describe's (including indexes) for both tables:
>
> # \d seg_id_tmp7
>                Table "public.seg_id_tmp7"
>     Column      |            Type             | Modifiers
> -----------------+-----------------------------+-----------
> customer_srcid  | bigint                      |
> show_srcid      | bigint                      |
> show_name       | character varying           |
> season_srcid    | bigint                      |
> season_name     | character varying           |
> episode_srcid   | bigint                      |
> episode_name    | character varying           |
> segment_type_id | bigint                      |
> segment_type    | character varying           |
> segment_srcid   | bigint                      |
> segment_name    | character varying           |
> create_dt       | timestamp without time zone |
>
>
>
>
> # \d sl_cd_segment_dim
>                                         Table
> "public.sl_cd_segment_dim"
>        Column        |            Type
> |                          Modifiers
> ----------------------+-----------------------------
> +-------------------------------------------------------------
> sl_cd_segment_dim_id | bigint                      | not null
> default nextval('sl_cd_segment_dim_seq'::regclass)
> customer_srcid       | bigint                      | not null
> show_srcid           | bigint                      | not null
> show_name            | character varying(500)      | not null
> season_srcid         | bigint                      | not null
> season_name          | character varying(500)      | not null
> episode_srcid        | bigint                      | not null
> episode_name         | character varying(500)      | not null
> segment_type_id      | integer                     |
> segment_type         | character varying(500)      |
> segment_srcid        | bigint                      |
> segment_name         | character varying(500)      |
> effective_dt         | timestamp without time zone | not null
> default now()
> inactive_dt          | timestamp without time zone |
> last_update_dt       | timestamp without time zone | not null
> default now()
> Indexes:
>    "sl_cd_segment_dim_pk" PRIMARY KEY, btree (sl_cd_segment_dim_id)
>    "seg1" btree (customer_srcid)
>    "seg2" btree (show_srcid)
>    "seg3" btree (season_srcid)
>    "seg4" btree (episode_srcid)
>    "seg5" btree (segment_srcid)
>    "sl_cd_segment_dim_ix1" btree (customer_srcid)
>
>
>
>
>
>
> Any thoughts, suggestions, etc on how to improve performance for
> this delete ?
>
>
> Thanks in advance..
>
> /Kevin
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


pgsql-performance by date:

Previous
From: kevin kempter
Date:
Subject: improving performance for a delete
Next
From: PFC
Date:
Subject: Re: improving performance for a delete