Re: need some help with a delete statement - Mailing list pgsql-sql

From scott.marlowe
Subject Re: need some help with a delete statement
Date
Msg-id Pine.LNX.4.33.0307010646111.16127-100000@css120.ihs.com
Whole thread Raw
In response to Re: need some help with a delete statement  (Matthew Hixson <hixson@poindextrose.org>)
Responses Re: need some help with a delete statement  (Matthew Hixson <hixson@poindextrose.org>)
List pgsql-sql
On Tue, 1 Jul 2003, Matthew Hixson wrote:

> 
> On Monday, June 30, 2003, at 12:00 PM, scott.marlowe wrote:
> 
> > On Mon, 30 Jun 2003, Matthew Hixson wrote:
> >
> >> On Monday, June 30, 2003, at 05:06 AM, scott.marlowe wrote:
> >>
> >>> On Fri, 27 Jun 2003, Matthew Hixson wrote:
> >>>
> >>>> Hi, I have a bunch of records that I need to delete from our  
> >>>> database.
> >>>> These records represent shopping carts for visitors to our website.
> >>>> The shopping carts I'd like to delete are the ones without anything  
> >>>> in
> >>>> them.  Here is the schema:
> >>>>
> >>>> create sequence carts_sequence;
> >>>> create table carts(
> >>>> cart_id integer default nextval('carts_sequence') primary key,
> >>>> cart_cookie varchar(24));
> >>>>
> >>>> create sequence cart_contents_sequence;
> >>>> create table cart_contents(
> >>>> cart_contents_id integer default nextval('cart_contents_sequence')
> >>>> primary key,
> >>>> cart_id integer not null,
> >>>> content_id integer not null,
> >>>> expire_time timestamp);
> >>>>
> >>>> I'm trying to use this query to delete the carts that are not
> >>>> referenced from the cart_contents table.
> >>>>
> >>>> delete from carts where cart_id in (select cart_id from carts except
> >>>> (select distinct cart_id from cart_contents));
> >>>>
> >>>> My dev machine is running Postgres 7.3.3 and is a 550Mhz Titanium
> >>>> running MacOS X 10.2.6.  It has 1GB of RAM.  I have 266777 entries  
> >>>> in
> >>>> v_carts and only 3746 entries in v_cart_contents.  Clearly there  
> >>>> are a
> >>>> very large number of empty carts.  Running the delete statement  
> >>>> above
> >>>> runs for over 15 minutes on this machine.  I just cancelled it  
> >>>> because
> >>>> I want to find a faster query to use in case I ever need to do this
> >>>> again.  While the query is running the disk does not thrash at all.
> >>>> It
> >>>> is definitely CPU bound.
> >>>>    Limiting the statement to 1 item takes about 12 seconds to run:
> >>>>
> >>>> delete from carts where cart_id in (select cart_id from carts except
> >>>> (select distinct cart_id from cart_contents) limit 1);
> >>>> Time: 12062.16 ms
> >>>
> >>> While in() is notoriously slow, this sounds more like a problem where
> >>> your
> >>> query is having to seq scan due to mismatching or missing indexes.
> >>>
> >>> So, what kind of index do you have on cart_id,
> >>
> >> Its is a btree index.
> >>
> >>                                   Table "public.carts"
> >>     Column    |         Type          |                    Modifiers
> >> -------------+-----------------------
> >> +--------------------------------------------------
> >>   cart_id     | integer               | not null default
> >> nextval('carts_sequence'::text)
> >>   cart_cookie | character varying(24) |
> >> Indexes: v_carts_pkey primary key btree (cart_id),
> >>           cart_cart_cookie btree (cart_cookie)
> >>
> >>
> >>> and what happens if you:
> >>>
> >>> select cart_id from carts except
> >>> (select distinct cart_id from cart_contents) limit 1;
> >>>
> >>> then feed the cart_id into
> >>>
> >>> explain analyze delete from carts where cart_id=id_from_above;
> >>>
> >>> from psql?
> >>
> >> #explain analyze delete from carts where cart_id=2700;
> >>                                                         QUERY PLAN
> >> ---------------------------------------------------------------------- 
> >> --
> >> ------------------------------------------------
> >>   Index Scan using carts_pkey on carts  (cost=0.00..3.16 rows=1  
> >> width=6)
> >> (actual time=162.14..162.17 rows=1 loops=1)
> >>     Index Cond: (cart_id = 2700)
> >>   Total runtime: 162.82 msec
> >> (3 rows)
> >
> > what does the output of psql say if you have the /timing switch on?
> 
> # select cart_id from carts except (select distinct cart_id from  
> cart_contents) limit 1;
>   cart_id
> ---------
>      2701
> (1 row)
> Time: 10864.89 ms
> 
> # explain analyze delete from carts where cart_id=2701;
>                                                       QUERY PLAN
> ------------------------------------------------------------------------ 
> --------------------------------------------
>   Index Scan using carts_pkey on carts  (cost=0.00..3.16 rows=1 width=6)  
> (actual time=0.50..0.52 rows=1 loops=1)
>     Index Cond: (cart_id = 2701)
>   Total runtime: 1.06 msec
> (3 rows)
> Time: 257.83 ms

Well, it looks like the fks are running really slow, which may well mean 
that they are seq scanning.  Examine your table definition and make sure 
that they are the same types on both ends, and if not, recreate the table 
so that they are either the same types or one is coerced to the other when
referencing it.



pgsql-sql by date:

Previous
From: Troels Arvin
Date:
Subject: Immutable attributes?
Next
From: Rado Petrik
Date:
Subject: Failed to initialize lc_messages to ''