Thread: delete and select with IN clause issues

delete and select with IN clause issues

From
Jeff Frost
Date:
I'm having problem with a cleanup script that runs nightly.  The script calls 
the following query:

delete from visit where id not in (select distinct visit_id from page_view);

This yields the following error:

ERROR:  update or delete on "visit" violates foreign key constraint 
"fk34afd255fbacabec" on "page_view"
DETAIL:  Key (id)=(38635629) is still referenced from table "page_view".

Ok, that seems fine, but when I do a select instead of delete, I do not 
find the referenced id in my list:

select id  from visit where id not in (select distinct visit_id from page_view) and id = 38635629 ;
 id
----
(0 rows)

Also, if I don't specify the id = bit at the end, I still don't find it in the 
output when I search through with less, so why is it trying to delete that 
row?

Rewriting the query like so, yields the same problem:

delete from visit where NOT EXISTS ( select * from page_view WHERE visit.id = 
page_view.visit_id);

ERROR:  update or delete on "visit" violates foreign key constraint 
"fk34afd255fbacabec" on "page_view"
DETAIL:  Key (id)=(38638264) is still referenced from table "page_view".

The plan looks like this:

explain analyze delete from visit where id not in (select distinct visit_id 
from page_view);

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
SeqScan on visit  (cost=165027.49..189106.89 rows=211976 width=6) (actual time=4789.595..5330.367 rows=150677 loops=1)
Filter: (NOT (hashed subplan))   SubPlan     ->  Unique  (cost=0.00..165017.77 rows=3889 width=8) (actual
time=2.717..4388.928rows=273285 loops=1)           ->  Index Scan using page_view_visit_idx on page_view
(cost=0.00..159625.41rows=2156946 width=8) (actual time=2.713..2914.944 rows=2249576 loops=1) Trigger for constraint
fk34afd255fbacabec:time=7174.540 calls=150677 Total runtime: 32772.345 ms
 
(7 rows)

\d page_view                      Table "public.page_view"     Column     |            Type             |
Modifiers
----------------+-----------------------------+--------------------- id             | bigint                      | not
nullvisit_id       | bigint                      | not null uri            | character varying(255)      | params
 | text                        | stamp          | timestamp without time zone | cindex         | integer
    | not null default -1 tindex         | integer                     | not null default -1 method         | character
varying(7)       | not null source_address | character varying(16)       | server_name    | character varying(255)
|
Indexes:    "page_view_pkey" PRIMARY KEY, btree (id)    "page_view_stamp_idx" btree (stamp)    "page_view_uri_idx"
btree(uri)    "page_view_visit_idx" btree (visit_id)
 
Foreign-key constraints:    "fk34afd255fbacabec" FOREIGN KEY (visit_id) REFERENCES visit(id)

What kind of silliness am I forgetting?

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Thu, 2 Nov 2006, Jeff Frost wrote:

> I'm having problem with a cleanup script that runs nightly.  The script calls 
> the following query:
>
> delete from visit where id not in (select distinct visit_id from page_view);
>
> This yields the following error:
>
> ERROR:  update or delete on "visit" violates foreign key constraint 
> "fk34afd255fbacabec" on "page_view"
> DETAIL:  Key (id)=(38635629) is still referenced from table "page_view".
>
> Ok, that seems fine, but when I do a select instead of delete, I do not find 
> the referenced id in my list:
>
> select id  from visit where id not in (select distinct visit_id from 
> page_view) and id = 38635629 ;
>
> id
> ----
> (0 rows)
>
> Also, if I don't specify the id = bit at the end, I still don't find it in 
> the output when I search through with less, so why is it trying to delete 
> that row?
>
> Rewriting the query like so, yields the same problem:
>
> delete from visit where NOT EXISTS ( select * from page_view WHERE visit.id = 
> page_view.visit_id);
>
> ERROR:  update or delete on "visit" violates foreign key constraint 
> "fk34afd255fbacabec" on "page_view"
> DETAIL:  Key (id)=(38638264) is still referenced from table "page_view".
>
> The plan looks like this:
>
> explain analyze delete from visit where id not in (select distinct visit_id 
> from page_view);
>
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
> Seq Scan on visit  (cost=165027.49..189106.89 rows=211976 width=6) (actual 
> time=4789.595..5330.367 rows=150677 loops=1)
>   Filter: (NOT (hashed subplan))
>   SubPlan
>     ->  Unique  (cost=0.00..165017.77 rows=3889 width=8) (actual 
> time=2.717..4388.928 rows=273285 loops=1)
>           ->  Index Scan using page_view_visit_idx on page_view 
> (cost=0.00..159625.41 rows=2156946 width=8) (actual time=2.713..2914.944 
> rows=2249576 loops=1)
> Trigger for constraint fk34afd255fbacabec: time=7174.540 calls=150677
> Total runtime: 32772.345 ms
> (7 rows)
>
> \d page_view
>                      Table "public.page_view"
>     Column     |            Type             |      Modifiers
> ----------------+-----------------------------+---------------------
> id             | bigint                      | not null
> visit_id       | bigint                      | not null
> uri            | character varying(255)      |
> params         | text                        |
> stamp          | timestamp without time zone |
> cindex         | integer                     | not null default -1
> tindex         | integer                     | not null default -1
> method         | character varying(7)        | not null
> source_address | character varying(16)       |
> server_name    | character varying(255)      |
> Indexes:
>    "page_view_pkey" PRIMARY KEY, btree (id)
>    "page_view_stamp_idx" btree (stamp)
>    "page_view_uri_idx" btree (uri)
>    "page_view_visit_idx" btree (visit_id)
> Foreign-key constraints:
>    "fk34afd255fbacabec" FOREIGN KEY (visit_id) REFERENCES visit(id)
>
> What kind of silliness am I forgetting?
>
>

Almost forgot:
 server_version
---------------- 8.1.4
(1 row)


-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Tom Lane
Date:
Jeff Frost <jeff@frostconsultingllc.com> writes:
> delete from visit where id not in (select distinct visit_id from page_view);

> This yields the following error:

> ERROR:  update or delete on "visit" violates foreign key constraint 
> "fk34afd255fbacabec" on "page_view"
> DETAIL:  Key (id)=(38635629) is still referenced from table "page_view".

This seems pretty darn weird.  I am wondering about corrupt indexes ---
can you find the indicated key in either table if you set
enable_indexscan and enable_bitmapscan to 0?

Also, this is a long shot, but does visit by any chance have a cascading
deletion self-reference?
        regards, tom lane


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Thu, 2 Nov 2006, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> delete from visit where id not in (select distinct visit_id from page_view);
>
>> This yields the following error:
>
>> ERROR:  update or delete on "visit" violates foreign key constraint
>> "fk34afd255fbacabec" on "page_view"
>> DETAIL:  Key (id)=(38635629) is still referenced from table "page_view".
>
> This seems pretty darn weird.  I am wondering about corrupt indexes ---
> can you find the indicated key in either table if you set
> enable_indexscan and enable_bitmapscan to 0?

test_tracking=# begin;
BEGIN
test_tracking=# set enable_seqscan TO false;
SET
test_tracking=# set enable_bitmapscan to false;
SET
test_tracking=# delete from visit where id not in (select distinct visit_id 
from
test_tracking(# page_view);
DELETE 150660
test_tracking=# ROLLBACK ;

So, it seems everything worked fine with those two set to false.

with seqscan enabled, it fails:

test_tracking=# begin;
BEGIN
test_tracking=# set enable_bitmapscan to false;
SET
test_tracking=# show enable_seqscan ; enable_seqscan
---------------- on
(1 row)

test_tracking=# delete from visit where id not in (select distinct visit_id 
from page_view );
ERROR:  update or delete on "visit" violates foreign key constraint 
"fk34afd255fbacabec" on "page_view"
DETAIL:  Key (id)=(38710245) is still referenced from table "page_view".

Looks like with just enable_seqscan disabled it works:

test_tracking=# begin;
BEGIN
test_tracking=# set enable_seqscan to false;
SET
test_tracking=# show enable_bitmapscan ; enable_bitmapscan
------------------- on
(1 row)

test_tracking=# delete from visit where id not in (select distinct visit_id 
from page_view );
DELETE 150661
test_tracking=# ROLLBACK ;
ROLLBACK

However, this doesn't yield anything:

select id  from visit where id not in (select distinct visit_id from page_view) and id = 38710245 ; id
----
(0 rows)

with them set to true or false.


>
> Also, this is a long shot, but does visit by any chance have a cascading
> deletion self-reference?

Nope, I guess I didn't \d visit..thought I did...it's below.  I'm going to 
guess that there is index corruption and a reindex page_view will fix it.  Do 
you want me to gather any information in case this is a reproducible bug 
before I issue the reindex?
                        Table "public.visit"   Column    |            Type             |       Modifiers
-------------+-----------------------------+------------------------ id          | bigint                      | not
nullvisitor_id  | bigint                      | not null campaign_id | bigint                      | session_id  |
charactervarying(32)       | not null uri         | character varying(2000)     | referer     | character varying(2000)
   | user_agent  | character varying(2000)     | remote_host | bigint                      | not null outcome     |
character(1)               | not null stamp       | timestamp without time zone | email_key   | character varying(16)
   | bot         | boolean                     | not null default false status      | character(1)                | not
null
Indexes:    "visit_pkey" PRIMARY KEY, btree (id)    "visit_un" UNIQUE, btree (session_id)    "visit_bot_idx" btree
(bot)   "visit_remote_host_ix" btree (remote_host)    "visit_stamp_ix" btree (stamp)    "visit_visitor_ix" btree
(visitor_id)
Foreign-key constraints:    "fk6b04d4bbfc2fa3d" FOREIGN KEY (remote_host) REFERENCES remote_host(id)
"fk6b04d4be5dc468"FOREIGN KEY (campaign_id) REFERENCES campaign(id)
 


-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Tom Lane
Date:
Jeff Frost <jeff@frostconsultingllc.com> writes:
> On Thu, 2 Nov 2006, Tom Lane wrote:
>> This seems pretty darn weird.  I am wondering about corrupt indexes ---
>> can you find the indicated key in either table if you set
>> enable_indexscan and enable_bitmapscan to 0?

> test_tracking=# begin;
> BEGIN
> test_tracking=# set enable_seqscan TO false;
> SET
> test_tracking=# set enable_bitmapscan to false;
> SET

That's not actually the test case I asked for ...

> Nope, I guess I didn't \d visit..thought I did...it's below.  I'm going to 
> guess that there is index corruption and a reindex page_view will fix it.  Do
> you want me to gather any information in case this is a reproducible bug 
> before I issue the reindex?

I'm thinking the same --- please save a physical copy of
page_view_visit_idx before you reindex it, and if that fixes the
problem, please send me copies of both the before and after states
of the index.
        regards, tom lane


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Fri, 3 Nov 2006, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> On Thu, 2 Nov 2006, Tom Lane wrote:
>>> This seems pretty darn weird.  I am wondering about corrupt indexes ---
>>> can you find the indicated key in either table if you set
>>> enable_indexscan and enable_bitmapscan to 0?
>
> That's not actually the test case I asked for ...

hahah, I wondered why you wanted me to disable seqscan...turns out you didn't! 
I just misread it. :-)

>
>> Nope, I guess I didn't \d visit..thought I did...it's below.  I'm going to
>> guess that there is index corruption and a reindex page_view will fix it.  Do
>> you want me to gather any information in case this is a reproducible bug
>> before I issue the reindex?
>
> I'm thinking the same --- please save a physical copy of
> page_view_visit_idx before you reindex it, and if that fixes the
> problem, please send me copies of both the before and after states
> of the index.

Well, interestingly, it seems to work now for some reason even though the 
cleanup script has failed three nights in a row and I haven't reindexed yet. 
Seems quite strange.  I guess now I'll have to wait and see if it fails again 
tonight.

BTW, just to make sure I get the right file to ship over if we have this 
again, it would be: /var/lib/pgsql/data/base/9366228/16204210 yes?

select * from pg_stat_user_indexes where indexrelname = 'page_view_visit_idx';  relid  | indexrelid | schemaname |
relname |    indexrelname     | 
 
idx_scan | idx_tup_read | idx_tup_fetch
---------+------------+------------+-----------+---------------------+----------+--------------+--------------- 9366257
|  16204210 | public     | page_view | page_view_visit_idx | 
 
5652735 |   1540722403 |    1524420263
(1 row)


-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Tom Lane
Date:
Jeff Frost <jeff@frostconsultingllc.com> writes:
> BTW, just to make sure I get the right file to ship over if we have this 
> again, it would be: /var/lib/pgsql/data/base/9366228/16204210 yes?

Not necessarily --- the filename is initially the same as the index OID,
but it wouldn't be anymore after a REINDEX.  Instead use what you get from

select relfilenode from pg_class where relname = 'page_view_visit_idx';

and note you will need to recheck it after reindexing.

Also, it's a good idea to do CHECKPOINT right before copying the
physical file, to make sure you've got a consistent file image on-disk.
If you want to be 100% certain, shut down the postmaster while copying,
but unless the index file is pretty large I think that's not necessary.
        regards, tom lane


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Fri, 3 Nov 2006, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> BTW, just to make sure I get the right file to ship over if we have this
>> again, it would be: /var/lib/pgsql/data/base/9366228/16204210 yes?
>
> Not necessarily --- the filename is initially the same as the index OID,
> but it wouldn't be anymore after a REINDEX.  Instead use what you get from
>
> select relfilenode from pg_class where relname = 'page_view_visit_idx';
>
> and note you will need to recheck it after reindexing.
>
> Also, it's a good idea to do CHECKPOINT right before copying the
> physical file, to make sure you've got a consistent file image on-disk.
> If you want to be 100% certain, shut down the postmaster while copying,
> but unless the index file is pretty large I think that's not necessary.

Well, I spoke to soon on the it all works front.  So, it's been reindexed and 
appears to be working properly now.  I guess I'll keep an eye on it for a 
while.  I didn't get your query suggestion in time, so hopefully I grabbed the 
right binary file..though it did seem to disappear after the reindex, so I 
think it's likely the correct one.  Definitely got the correct second one.

You can grab them here:

http://www.frostconsultingllc.com/fortom.tar.gz

And thanks again for the help!

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Tom Lane
Date:
Jeff Frost <jeff@frostconsultingllc.com> writes:
> Well, I spoke to soon on the it all works front.  So, it's been
> reindexed and appears to be working properly now.  I guess I'll keep
> an eye on it for a while.  I didn't get your query suggestion in time,
> so hopefully I grabbed the right binary file..though it did seem to
> disappear after the reindex, so I think it's likely the correct one.
> Definitely got the correct second one.

Well, I can't find anything wrong :-(.  There are some differences in
the list of contained keys, but they're all up near the end of the
range, which is consistent with the assumption that the table is live
and had some changes between your two dumps of the index.  In
particular, there's no difference in the entries for the troublesome
key value:

38635629                24080   25
38635629                24080   26
38635629                24080   27

So I dunno what to make of it.  If it happens again, we need to look
more closely.
        regards, tom lane


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Fri, 3 Nov 2006, Tom Lane wrote:

> Jeff Frost <jeff@frostconsultingllc.com> writes:
>> Well, I spoke to soon on the it all works front.  So, it's been
>> reindexed and appears to be working properly now.  I guess I'll keep
>> an eye on it for a while.  I didn't get your query suggestion in time,
>> so hopefully I grabbed the right binary file..though it did seem to
>> disappear after the reindex, so I think it's likely the correct one.
>> Definitely got the correct second one.
>
> Well, I can't find anything wrong :-(.  There are some differences in
> the list of contained keys, but they're all up near the end of the
> range, which is consistent with the assumption that the table is live
> and had some changes between your two dumps of the index.  In
> particular, there's no difference in the entries for the troublesome
> key value:
>
> 38635629                24080   25
> 38635629                24080   26
> 38635629                24080   27
>
> So I dunno what to make of it.  If it happens again, we need to look
> more closely.

Tom, I know we shouldn't have to REINDEX in the 8.1.x days.  Do you have any 
idea what might have allowed this to happen?  A while back this particular 
server was unable to send e-mail and so we weren't getting the vacuum verbose 
output.  As a consequence the FSM settings were too low.  That has been 
remedied, but I'm wondering if it's possible that the FSM settings being too 
low would allow the INDEX to somehow get damaged?

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Fri, 3 Nov 2006, Tom Lane wrote:

> Well, I can't find anything wrong :-(.  There are some differences in
> the list of contained keys, but they're all up near the end of the
> range, which is consistent with the assumption that the table is live
> and had some changes between your two dumps of the index.  In
> particular, there's no difference in the entries for the troublesome
> key value:
>
> 38635629                24080   25
> 38635629                24080   26
> 38635629                24080   27
>
> So I dunno what to make of it.  If it happens again, we need to look
> more closely.

Well, it's been working wonderfully since the REINDEX, so I don't know what to 
say.  Any idea if having a too small max_fsm_pages could hose an index, 
because I know that happened not too long before we started seeing this 
problem.  The fsm settings were increased prior to the problem occurring, but 
it's possible the index was already damaged?

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: delete and select with IN clause issues

From
Jeff Frost
Date:
On Tue, 7 Nov 2006, Jeff Frost wrote:

> Well, it's been working wonderfully since the REINDEX, so I don't know what 
> to say.  Any idea if having a too small max_fsm_pages could hose an index, 
> because I know that happened not too long before we started seeing this 
> problem.  The fsm settings were increased prior to the problem occurring, but 
> it's possible the index was already damaged?

Well, once again I speak too soon:

ERROR:  update or delete on "visit" violates foreign key constraint 
"fk34afd255fbacabec" on "page_view"
DETAIL:  Key (id)=(40276078) is still referenced from table "page_view".

and again, it's in both tables, so it shouldn't be trying to remove it from 
visit:

engage_tracking=# select id from visit where id = 40276078;    id
---------- 40276078
(1 row)

engage_tracking=# select visit_id from page_view where visit_id = 40276078; visit_id
---------- 40276078

Reminder: the delete query is:

delete from visit where id not in (select distinct visit_id from page_view);

and if I do this:

select id from visit where id not in (select distinct visit_id from page_view);

I get a result set without that id in it.

And now after looking at the data, I realize what's going on.  I believe I 
have a race condition in which the visit entry is created before the select 
begins, but the page_view entry has not yet been created.  I then try to 
delete the visit entry after the page_view row is created and then my fkey 
stops me.

I added a WHERE clause in my script that looks like this:

DELETE FROM visit WHERE stamp < now() - INTERVAL '30 days' AND id NOT IN (SELECT DISTINCT visit_id FROM page_view);

I think we can put this one to bed and sorry for wasting everyone's cycles.

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954