Thread: delete query taking way too long

delete query taking way too long

From
Ivan Sergio Borgonovo
Date:
I've
delete from catalog_items where ItemID in (select id from
import.Articoli_delete);

id and ItemID have an index.

catalog_items is ~1M rows
Articoli_delete is less than 2K rows.

This query has been running for roughly 50min.
Right now it is the only query running.

PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC)
4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: delete query taking way too long

From
Glyn Astill
Date:
What's the output of explain?

--- On Thu, 12/8/10, Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:

> From: Ivan Sergio Borgonovo <mail@webthatworks.it>
> Subject: [GENERAL] delete query taking way too long
> To: pgsql-general@postgresql.org
> Date: Thursday, 12 August, 2010, 12:14
> I've
> delete from catalog_items where ItemID in (select id from
> import.Articoli_delete);
>
> id and ItemID have an index.
>
> catalog_items is ~1M rows
> Articoli_delete is less than 2K rows.
>
> This query has been running for roughly 50min.
> Right now it is the only query running.
>
> PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc
> (GCC)
> 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
>
> --
> Ivan Sergio Borgonovo
> http://www.webthatworks.it
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>




Re: delete query taking way too long

From
Jayadevan M
Date:
> From: Ivan Sergio Borgonovo <mail@webthatworks.it>
> To: pgsql-general@postgresql.org
> Date: 12/08/2010 16:43
> Subject: [GENERAL] delete query taking way too long
> Sent by: pgsql-general-owner@postgresql.org
>
> I've
> delete from catalog_items where ItemID in (select id from
> import.Articoli_delete);
Does catalog_items have child tables where the FK columns are not indexed?
Regards,
Jayadevan





DISCLAIMER:

"The information in this e-mail and any attachment is intended only for
the person to whom it is addressed and may contain confidential and/or
privileged material. If you have received this e-mail in error, kindly
contact the sender and destroy all copies of the original communication.
IBS makes no warranty, express or implied, nor guarantees the accuracy,
adequacy or completeness of the information contained in this email or any
attachment and is not liable for any errors, defects, omissions, viruses
or for resultant loss or damage, if any, direct or indirect."






Re: delete query taking way too long

From
Thom Brown
Date:
On 12 August 2010 12:14, Ivan Sergio Borgonovo <mail@webthatworks.it> wrote:
> I've
> delete from catalog_items where ItemID in (select id from
> import.Articoli_delete);
>
> id and ItemID have an index.
>
> catalog_items is ~1M rows
> Articoli_delete is less than 2K rows.
>
> This query has been running for roughly 50min.
> Right now it is the only query running.
>
> PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC)
> 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
>
> --
You can try to do deletes in batches of 10,000:

DELETE FROM catalog_items WHERE ItemID IN (SELECT id FROM
import.Articoli_delete LIMIT 10000);

But an EXPLAIN would tell us a lot more.

--
Thom Brown
Registered Linux user: #516935

Re: delete query taking way too long

From
tv@fuzzy.cz
Date:
> I've
> delete from catalog_items where ItemID in (select id from
> import.Articoli_delete);
>
> id and ItemID have an index.
>
> catalog_items is ~1M rows
> Articoli_delete is less than 2K rows.
>
> This query has been running for roughly 50min.
> Right now it is the only query running.
>
> PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC)
> 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

1) Run the query with EXPLAIN, e.g. something like EXPLAIN DELETE FROM ...

This won't actually execute the query, it will just prepare an execution
plan and print it. That might show some problems with the query.

Post the output of explain to explain.depesz.com and then send link to the
forum (you could post the explain output here, but it's difficult to
read).

2) Run ANALYZE on the tables involved in the query and then the EXPLIAIN
again (this might show some problems with obsolete statistics).

3) Are there any tables depending on the "catalog_items" table? I mean are
there any foreign keys referencing it through a foreign key? How large are
those tables? Are the FK columns indexed?

regards
Tomas


Re: delete query taking way too long

From
Ivan Sergio Borgonovo
Date:
On Thu, 12 Aug 2010 17:14:17 +0530
Jayadevan M <Jayadevan.Maymala@ibsplc.com> wrote:

> > I've
> > delete from catalog_items where ItemID in (select id from
> > import.Articoli_delete);
> Does catalog_items have child tables where the FK columns are not
> indexed? Regards,

Possibly, but very small ones.

What I missed to say is... that query always worked reasonably fast
in the past. The size of the DB didn't grow substantially recently.

I'd say the query shouldn't be the problem... the question should
have been:
what should I look when postgresql start to behave strangely?
eg. missing resources, locks, solved bug (it is a reasonably old
version)...

--
Ivan Sergio Borgonovo
http://www.webthatworks.it


Re: delete query taking way too long

From
Ivan Sergio Borgonovo
Date:
On Thu, 12 Aug 2010 12:50:49 +0100
Thom Brown <thom@linux.com> wrote:

> On 12 August 2010 12:14, Ivan Sergio Borgonovo
> <mail@webthatworks.it> wrote:
> > I've
> > delete from catalog_items where ItemID in (select id from
> > import.Articoli_delete);
> >
> > id and ItemID have an index.
> >
> > catalog_items is ~1M rows
> > Articoli_delete is less than 2K rows.
> >
> > This query has been running for roughly 50min.
> > Right now it is the only query running.
> >
> > PostgreSQL 8.3.4 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC)
> > 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
> >
> > --
> You can try to do deletes in batches of 10,000:

ItemID is a PK.
Even if there is no constraint on Articoli_delete.id they *should*
be unique.

There are some other tables with indexes... but I don't expect that
more than 4K row for each table will be deleted.
There are a couple of other large (1M rows) table where an on delete
cascade is defined.

This is the query plan
Nested Loop  (cost=30.07..10757.29 rows=1766 width=6)
 ->  HashAggregate  (cost=30.07..47.73 rows=1766 width=8)
   ->  Seq Scan on articoli_delete  (cost=0.00..25.66 rows=1766
       width=8)
 ->  Index Scan using catalog_items_pkey on catalog_items
  (cost=0.00..6.05 rows=1 width=14)
    Index Cond: (catalog_items.itemid = articoli_delete.id)

BTW it is happening again... after I stopped pg, restarted the whole
server and re-run the query.

This query get generally unnoticed in a longer process but I doubt
it ever lasted more than a couple of minutes in the past.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it