Thread: Deleting 173000 records takes forever, blocks async queries for unrelated records
Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Alexander Farber
Date:
Hello, in a Facebook game running on PostgreSQL 8.4.13 and having so many players: # select count(*) from pref_users; count -------- 223964 I am trying to get rid of inactive users, who just visited the canvas page, but never played (I'm sure, Facebook has a clever-sounding name for them): # select count(*) from pref_users where id not in (select distinct id from pref_money); count -------- 173936 (1 row) So I call: # delete from pref_users where id not in (select distinct id from pref_money); but that query lasts forever and what's more troubling me - it blocks the async queries of my game daemon (the Perl function pg_ready starts returning false all the time and my game accumulates thousands of yet-to-be-executed SQL queries). The good news is, that my quad server doesn't hang - I just see 1 postmaster process at 90-100% CPU but total load is 20%. Also my game daemon in Perl recovers and executes the thousands of queued up async queries, when I interrupt the above DELETE query with CTRL-C at the pgsql prompt - i.e. my game is not buggy. My question is how handle this? Why does deleting takes so long, is it because of CASCADES? And why does it make the pg_ready calls of my game daemon return false? The users I'm deleting aren't active, they shouldn't "intersect" with the async queries of my game daemon. Below are the both SQL tables involved, thank you for any insights. Regards Alex # \d pref_money Table "public.pref_money" Column | Type | Modifiers --------+-----------------------+----------------------------------------- id | character varying(32) | money | integer | not null yw | character(7) | default to_char(now(), 'IYYY-IW'::text) Indexes: "pref_money_money_index" btree (money DESC) "pref_money_yw_index" btree (yw) Foreign-key constraints: "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE # \d pref_users Table "public.pref_users" Column | Type | Modifiers ------------+-----------------------------+-------------------- id | character varying(32) | not null first_name | character varying(64) | last_name | character varying(64) | female | boolean | avatar | character varying(128) | city | character varying(64) | login | timestamp without time zone | default now() last_ip | inet | logout | timestamp without time zone | vip | timestamp without time zone | mail | character varying(254) | medals | integer | not null default 0 Indexes: "pref_users_pkey" PRIMARY KEY, btree (id) Referenced by: TABLE "pref_cards" CONSTRAINT "pref_cards_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_catch" CONSTRAINT "pref_catch_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_chat" CONSTRAINT "pref_chat_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_discuss" CONSTRAINT "pref_discuss_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_game" CONSTRAINT "pref_game_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_luck" CONSTRAINT "pref_luck_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_match" CONSTRAINT "pref_match_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_misere" CONSTRAINT "pref_misere_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_money" CONSTRAINT "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_pass" CONSTRAINT "pref_pass_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_payment" CONSTRAINT "pref_payment_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_rep" CONSTRAINT "pref_rep_author_fkey" FOREIGN KEY (author) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_rep" CONSTRAINT "pref_rep_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_scores" CONSTRAINT "pref_scores_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_status" CONSTRAINT "pref_status_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE TABLE "pref_votes" CONSTRAINT "pref_votes_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Albe Laurenz
Date:
Alexander Farber wrote: > in a Facebook game running on > PostgreSQL 8.4.13 and having so many players: >=20 > # select count(*) from pref_users; > count > -------- > 223964 >=20 > I am trying to get rid of inactive users, > who just visited the canvas page, but > never played (I'm sure, Facebook has > a clever-sounding name for them): >=20 > # select count(*) from pref_users > where id not in (select distinct id from pref_money); > count > -------- > 173936 > (1 row) >=20 > So I call: >=20 > # delete from pref_users > where id not in (select distinct id from pref_money); >=20 > but that query lasts forever and > what's more troubling me - it blocks > the async queries of my game daemon > (the Perl function pg_ready starts returning > false all the time and my game accumulates > thousands of yet-to-be-executed SQL queries). >=20 > The good news is, that my quad server > doesn't hang - I just see 1 postmaster > process at 90-100% CPU but total load is 20%. I hope it's a backend and not the postmaster. > Also my game daemon in Perl recovers > and executes the thousands of queued > up async queries, when I interrupt the > above DELETE query with CTRL-C at > the pgsql prompt - i.e. my game is not buggy. >=20 > My question is how handle this? >=20 > Why does deleting takes so long, > is it because of CASCADES? >=20 > And why does it make the pg_ready > calls of my game daemon return false? > The users I'm deleting aren't active, > they shouldn't "intersect" with the > async queries of my game daemon. >=20 > Below are the both SQL tables involved, > thank you for any insights. It would help if you send EXPLAIN output for the DELETE statement. > # \d pref_money > Table "public.pref_money" > Column | Type | Modifiers > --------+-----------------------+----------------------------------------= - > id | character varying(32) | > money | integer | not null > yw | character(7) | default to_char(now(), 'IYYY-IW'::text) > Indexes: > "pref_money_money_index" btree (money DESC) > "pref_money_yw_index" btree (yw) > Foreign-key constraints: > "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON > DELETE CASCADE The problem is very likely that you have no index on the "id" column. That means that for each id deleted from pref_users, the cascading delete has to perform a full table scan on pref_money to find the corresponding rows. This is very likely the cause of your problem. Indeed, this table does not have a primary key. That is usually a bad idea. How about PRIMARY KEY (id, yw)? That should speed up the query. > # \d pref_users > Table "public.pref_users" > Column | Type | Modifiers > ------------+-----------------------------+-------------------- > id | character varying(32) | not null [...] > Indexes: > "pref_users_pkey" PRIMARY KEY, btree (id) > Referenced by: [many tables] While you are at it, check all the other tables referencing pref_users and make sure that they have an index on the referencing column. Yours, Laurenz Albe
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Michal Politowski
Date:
On Fri, 1 Feb 2013 10:38:26 +0100, Alexander Farber wrote: > Hello, >=20 > in a Facebook game running on > PostgreSQL 8.4.13 and having so many players: >=20 > # select count(*) from pref_users; > count > -------- > 223964 >=20 > I am trying to get rid of inactive users, > who just visited the canvas page, but > never played (I'm sure, Facebook has > a clever-sounding name for them): >=20 > # select count(*) from pref_users > where id not in (select distinct id from pref_money); > count > -------- > 173936 > (1 row) >=20 > So I call: >=20 > # delete from pref_users > where id not in (select distinct id from pref_money); >=20 > but that query lasts forever and > what's more troubling me - it blocks > the async queries of my game daemon > (the Perl function pg_ready starts returning > false all the time and my game accumulates > thousands of yet-to-be-executed SQL queries). >=20 > The good news is, that my quad server > doesn't hang - I just see 1 postmaster > process at 90-100% CPU but total load is 20%. >=20 > Also my game daemon in Perl recovers > and executes the thousands of queued > up async queries, when I interrupt the > above DELETE query with CTRL-C at > the pgsql prompt - i.e. my game is not buggy. >=20 > My question is how handle this? >=20 > Why does deleting takes so long, > is it because of CASCADES? >=20 > And why does it make the pg_ready > calls of my game daemon return false? > The users I'm deleting aren't active, > they shouldn't "intersect" with the > async queries of my game daemon. >=20 > Below are the both SQL tables involved, > thank you for any insights. >=20 > Regards > Alex >=20 > # \d pref_money > Table "public.pref_money" > Column | Type | Modifiers > --------+-----------------------+----------------------------------------- > id | character varying(32) | > money | integer | not null > yw | character(7) | default to_char(now(), 'IYYY-IW'::text) > Indexes: > "pref_money_money_index" btree (money DESC) > "pref_money_yw_index" btree (yw) > Foreign-key constraints: > "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON > DELETE CASCADE I believe an index on pref_money.id could help. Without it Postgres is, if I understand correctly, making 173936 table scan= s on pref_money to try (and fail) to find for each deleted row the referencing row in that = table. =20 > # \d pref_users [...] --=20 Micha=B3 Politowski
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Bèrto ëd Sèra
Date:
Hi, What Albe said about PKs. I'm also not very fond of people using text in keys (even if it's a varchar or a char field). Test the same thing with a numeric key and you are likely to see a difference. Apart from making sure your design is ok, you might want to keep this stuff well in the background, firing it as a frequent (and small) job. Sort of "your own private vacuum job". Something like delete from pref_users where id not in ( select distinct id from pref_money ) limit 100; Where 100 may be any number of records you find to be compatible with a smooth performance. Keep calling this stuff at a suitable interval (X secs in between each call), and it will silently do the cleaning without creating giant transactions. You definitely want to make a good design BEFORE doing this, though. Cheers Bèrto On 1 February 2013 09:38, Alexander Farber <alexander.farber@gmail.com> wrote: > Hello, > > in a Facebook game running on > PostgreSQL 8.4.13 and having so many players: > > # select count(*) from pref_users; > count > -------- > 223964 > > I am trying to get rid of inactive users, > who just visited the canvas page, but > never played (I'm sure, Facebook has > a clever-sounding name for them): > > # select count(*) from pref_users > where id not in (select distinct id from pref_money); > count > -------- > 173936 > (1 row) > > So I call: > > # delete from pref_users > where id not in (select distinct id from pref_money); > > but that query lasts forever and > what's more troubling me - it blocks > the async queries of my game daemon > (the Perl function pg_ready starts returning > false all the time and my game accumulates > thousands of yet-to-be-executed SQL queries). > > The good news is, that my quad server > doesn't hang - I just see 1 postmaster > process at 90-100% CPU but total load is 20%. > > Also my game daemon in Perl recovers > and executes the thousands of queued > up async queries, when I interrupt the > above DELETE query with CTRL-C at > the pgsql prompt - i.e. my game is not buggy. > > My question is how handle this? > > Why does deleting takes so long, > is it because of CASCADES? > > And why does it make the pg_ready > calls of my game daemon return false? > The users I'm deleting aren't active, > they shouldn't "intersect" with the > async queries of my game daemon. > > Below are the both SQL tables involved, > thank you for any insights. > > Regards > Alex > > # \d pref_money > Table "public.pref_money" > Column | Type | Modifiers > --------+-----------------------+----------------------------------------- > id | character varying(32) | > money | integer | not null > yw | character(7) | default to_char(now(), 'IYYY-IW'::text) > Indexes: > "pref_money_money_index" btree (money DESC) > "pref_money_yw_index" btree (yw) > Foreign-key constraints: > "pref_money_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON > DELETE CASCADE > > # \d pref_users > Table "public.pref_users" > Column | Type | Modifiers > ------------+-----------------------------+-------------------- > id | character varying(32) | not null > first_name | character varying(64) | > last_name | character varying(64) | > female | boolean | > avatar | character varying(128) | > city | character varying(64) | > login | timestamp without time zone | default now() > last_ip | inet | > logout | timestamp without time zone | > vip | timestamp without time zone | > mail | character varying(254) | > medals | integer | not null default 0 > Indexes: > "pref_users_pkey" PRIMARY KEY, btree (id) > Referenced by: > TABLE "pref_cards" CONSTRAINT "pref_cards_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_catch" CONSTRAINT "pref_catch_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_chat" CONSTRAINT "pref_chat_id_fkey" FOREIGN KEY (id) > REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_discuss" CONSTRAINT "pref_discuss_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_game" CONSTRAINT "pref_game_id_fkey" FOREIGN KEY (id) > REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_luck" CONSTRAINT "pref_luck_id_fkey" FOREIGN KEY (id) > REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_match" CONSTRAINT "pref_match_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_misere" CONSTRAINT "pref_misere_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_money" CONSTRAINT "pref_money_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_pass" CONSTRAINT "pref_pass_id_fkey" FOREIGN KEY (id) > REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_payment" CONSTRAINT "pref_payment_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_rep" CONSTRAINT "pref_rep_author_fkey" FOREIGN KEY > (author) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_rep" CONSTRAINT "pref_rep_id_fkey" FOREIGN KEY (id) > REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_scores" CONSTRAINT "pref_scores_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_status" CONSTRAINT "pref_status_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > TABLE "pref_votes" CONSTRAINT "pref_votes_id_fkey" FOREIGN KEY > (id) REFERENCES pref_users(id) ON DELETE CASCADE > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- ============================== If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Alexander Farber
Date:
Thanks, I will add an index(id, yw) and check. The limit suggestion (yes, I already have a "vacuum" cronjob to purge spam-users from drupal_users etc.) doesn't work: # delete from pref_users where id not in (select distinct id from pref_money) limit 10; ERROR: syntax error at or near "limit" LINE 2: ...ere id not in (select distinct id from pref_money) limit 10;
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
Bèrto ëd Sèra
Date:
righto. You need a CTE to do that create table deletable ( id bigint not null primary key); create table condition ( id bigint not null primary key); insert into deletable select generate_series(1,500); insert into condition select generate_series(1,50); WITH target AS ( select id from deletable d where d.id not in (select id from condition) limit 10 ) delete from deletable where id in (select id from target); This will open a hole from 51 to 60, next time from 61 to 70 etc... Cheers Bèrto On 1 February 2013 11:30, Alexander Farber <alexander.farber@gmail.com> wrote: > Thanks, I will add an index(id, yw) and check. > > The limit suggestion (yes, I already have a "vacuum" > cronjob to purge spam-users from drupal_users etc.) > doesn't work: > > # delete from pref_users > where id not in (select distinct id from pref_money) limit 10; > ERROR: syntax error at or near "limit" > LINE 2: ...ere id not in (select distinct id from pref_money) limit 10; > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- ============================== If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.
Re: Deleting 173000 records takes forever, blocks async queries for unrelated records
From
"Carlo Stonebanks"
Date:
>># delete from pref_users >>where id not in (select distinct id from pref_money) limit 10; >>ERROR: syntax error at or near "limit" >>LINE 2: ...ere id not in (select distinct id from pref_money) limit 10; Or this? DELETE FROM pref_users WHERE id IN ( SELECT id FROM pref_users WHERE id NOT IN ( SELECT DISTINCT id FROM pref_money ) LIMIT 10 ); Saw elsewhere another suggestion with EXCEPT, which is also sexy.