Re: Deleting 173000 records takes forever, blocks async queries for unrelated records - Mailing list pgsql-general

From Albe Laurenz
Subject Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
Date
Msg-id A737B7A37273E048B164557ADEF4A58B057AE0D2@ntex2010a.host.magwien.gv.at
Whole thread Raw
In response to Deleting 173000 records takes forever, blocks async queries for unrelated records  (Alexander Farber <alexander.farber@gmail.com>)
List pgsql-general
Alexander Farber wrote:
> in a Facebook game running on
> PostgreSQL 8.4.13 and having so many players:
>=20
> #  select count(*) from pref_users;
>  count
> --------
>  223964
>=20
> I am trying to get rid of inactive users,
> who just visited the canvas page, but
> never played (I'm sure, Facebook has
> a clever-sounding name for them):
>=20
> # select count(*) from pref_users
> where id not in (select distinct id from pref_money);
>  count
> --------
>  173936
> (1 row)
>=20
> So I call:
>=20
> # delete from pref_users
> where id not in (select distinct id from pref_money);
>=20
> but that query lasts forever and
> what's more troubling me - it blocks
> the async queries of my game daemon
> (the Perl function pg_ready starts returning
> false all the time and my game accumulates
> thousands of yet-to-be-executed SQL queries).
>=20
> The good news is, that my quad server
> doesn't hang - I just see 1 postmaster
> process at 90-100% CPU but total load is 20%.

I hope it's a backend and not the postmaster.

> Also my game daemon in Perl recovers
> and executes the thousands of queued
> up async queries, when I interrupt the
> above DELETE query with CTRL-C at
> the pgsql prompt - i.e. my game is not buggy.
>=20
> My question is how handle this?
>=20
> Why does deleting takes so long,
> is it because of CASCADES?
>=20
> And why does it make the pg_ready
> calls of my game daemon return false?
> The users I'm deleting aren't active,
> they shouldn't "intersect" with the
> async queries of my game daemon.
>=20
> Below are the both SQL tables involved,
> thank you for any insights.

It would help if you send EXPLAIN output for the DELETE statement.

> #  \d pref_money
>                         Table "public.pref_money"
>  Column |         Type          |                Modifiers
> --------+-----------------------+----------------------------------------=
-
>  id     | character varying(32) |
>  money  | integer               | not null
>  yw     | character(7)          | default to_char(now(), 'IYYY-IW'::text)
> Indexes:
>     "pref_money_money_index" btree (money DESC)
>     "pref_money_yw_index" btree (yw)
> Foreign-key constraints:
>     "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
> DELETE CASCADE

The problem is very likely that you have no index on the "id" column.

That means that for each id deleted from pref_users, the cascading
delete has to perform a full table scan on pref_money to find the
corresponding rows.  This is very likely the cause of your problem.

Indeed, this table does not have a primary key.  That is usually a bad
idea.  How about PRIMARY KEY (id, yw)?
That should speed up the query.

> #  \d pref_users
>                    Table "public.pref_users"
>    Column   |            Type             |     Modifiers
> ------------+-----------------------------+--------------------
>  id         | character varying(32)       | not null
[...]
> Indexes:
>     "pref_users_pkey" PRIMARY KEY, btree (id)
> Referenced by:
[many tables]

While you are at it, check all the other tables referencing pref_users
and make sure that they have an index on the referencing column.

Yours,
Laurenz Albe

pgsql-general by date:

Previous
From: hamann.w@t-online.de
Date:
Subject: Re Deleting 173000 records takes forever
Next
From: Michal Politowski
Date:
Subject: Re: Deleting 173000 records takes forever, blocks async queries for unrelated records