Re: need some help with a delete statement - Mailing list pgsql-sql
From | scott.marlowe |
---|---|
Subject | Re: need some help with a delete statement |
Date | |
Msg-id | Pine.LNX.4.33.0306301259400.14457-100000@css120.ihs.com Whole thread Raw |
In response to | Re: need some help with a delete statement (Matthew Hixson <hixson@poindextrose.org>) |
Responses |
Re: need some help with a delete statement
|
List | pgsql-sql |
On Mon, 30 Jun 2003, Matthew Hixson wrote: > On Monday, June 30, 2003, at 05:06 AM, scott.marlowe wrote: > > > On Fri, 27 Jun 2003, Matthew Hixson wrote: > > > >> Hi, I have a bunch of records that I need to delete from our database. > >> These records represent shopping carts for visitors to our website. > >> The shopping carts I'd like to delete are the ones without anything in > >> them. Here is the schema: > >> > >> create sequence carts_sequence; > >> create table carts( > >> cart_id integer default nextval('carts_sequence') primary key, > >> cart_cookie varchar(24)); > >> > >> create sequence cart_contents_sequence; > >> create table cart_contents( > >> cart_contents_id integer default nextval('cart_contents_sequence') > >> primary key, > >> cart_id integer not null, > >> content_id integer not null, > >> expire_time timestamp); > >> > >> I'm trying to use this query to delete the carts that are not > >> referenced from the cart_contents table. > >> > >> delete from carts where cart_id in (select cart_id from carts except > >> (select distinct cart_id from cart_contents)); > >> > >> My dev machine is running Postgres 7.3.3 and is a 550Mhz Titanium > >> running MacOS X 10.2.6. It has 1GB of RAM. I have 266777 entries in > >> v_carts and only 3746 entries in v_cart_contents. Clearly there are a > >> very large number of empty carts. Running the delete statement above > >> runs for over 15 minutes on this machine. I just cancelled it because > >> I want to find a faster query to use in case I ever need to do this > >> again. While the query is running the disk does not thrash at all. > >> It > >> is definitely CPU bound. > >> Limiting the statement to 1 item takes about 12 seconds to run: > >> > >> delete from carts where cart_id in (select cart_id from carts except > >> (select distinct cart_id from cart_contents) limit 1); > >> Time: 12062.16 ms > > > > While in() is notoriously slow, this sounds more like a problem where > > your > > query is having to seq scan due to mismatching or missing indexes. > > > > So, what kind of index do you have on cart_id, > > Its is a btree index. > > Table "public.carts" > Column | Type | Modifiers > -------------+----------------------- > +-------------------------------------------------- > cart_id | integer | not null default > nextval('carts_sequence'::text) > cart_cookie | character varying(24) | > Indexes: v_carts_pkey primary key btree (cart_id), > cart_cart_cookie btree (cart_cookie) > > > > and what happens if you: > > > > select cart_id from carts except > > (select distinct cart_id from cart_contents) limit 1; > > > > then feed the cart_id into > > > > explain analyze delete from carts where cart_id=id_from_above; > > > > from psql? > > #explain analyze delete from carts where cart_id=2700; > QUERY PLAN > ------------------------------------------------------------------------ > ------------------------------------------------ > Index Scan using carts_pkey on carts (cost=0.00..3.16 rows=1 width=6) > (actual time=162.14..162.17 rows=1 loops=1) > Index Cond: (cart_id = 2700) > Total runtime: 162.82 msec > (3 rows) what does the output of psql say if you have the /timing switch on? > > > > > > Is cart_id a fk to another table (or is another table using it as a > > fk?) > > cart_id is the pk of the carts table. cart_contents also has a cart_id > and that is the fk pointing to its entry in the carts table. There is > nothing else using cart_id in either of those tables as a fk. > Thanks for the reply,