Thread: BUG #15737: Unexpectedly Deleting full table when referring CTE (With Clause ) data,in a Subquery in another CTE

The following bug has been logged on the website:

Bug reference:      15737
Logged by:          Chandan Ahuja
Email address:      chandanahuja7@gmail.com
PostgreSQL version: 11.1
Operating system:   RHEL
Description:

Hi , 

I faced  this unexpected behavior when i use CTE (WITH clause) query to
delete the rows from multiple tables. Because of which my entire data-set
has been deleted. It looks like this is a bug.

I have 3 tables , (1)Contract , (2)ContractComponent  and (3)ContractRole.
Lets call them CT, CC and CR respectively.
CT has One-To-Many relation with CC  
CC   has One-To-Many relation with CR.
Table Structure:
(1)Contract (contract_id(PK)  )  , (2)ContractComponent
(Contractcomponent_id(PK) , contract_id(FK) ), (3)ContractRole
(ContractRole_id(PK) ,  ContractComponent_id (FK),
party_source_system_record_pk   ).

I have applied Foreign Key constraint in these 3 tables with On-Delete
Cascade option.

I wanted to delete ONLY one particular Contract (CT) record and its
corresponding records in CC and CR tables so I made the following query.
This Query is giving 2 strange results which i am not able to fathom, and it
appears to me that it is a BUG.

ISSUES/BUG:

1. Contract_id column is not specified in the Select clause of "deletedata "
, still PostgreSQL does not complains in the "delct" and goes on to execute
the SQL.
2. But the bigger issue is that on execution it deletes all the records from
the Contract and ContractComponent table, and  not just the one meeting the
filter clause specified  (where cr.party_source_system_record_pk  =
'20000151686').


------------------------------------------------------------------------------------------------------------------------------------------------------------

with deletedata as
(
select  cc.contractcomponent_id   
---- Contract_id column is not specified in the Select clause of
"deletedata"  , still PostgreSQL does  not complains in the "delct"
from 
core.contractrole cr
join core.contractcomponent cc on  cc.contractcomponent_id =
cr.contractcomponent_id
join core.contract ct on  cc.contract_id = ct.contract_id
where 
cr.party_source_system_record_pk  = '20000151686'
)
, delcc as
(
delete from core.contractcomponent where contractcomponent_id in ( select
distinct contractcomponent_id from deletedata)
    -- returning contractcomponent_id
)
, delct as
(
-- I am referring contract_id column here from deletedata but it does not
exist
delete from core.contract where contract_id in ( select distinct contract_id
from deletedata) -- returning contract_id
)
select distinct contractcomponent_id from deletedata

------------------------------------------------------------------------------------------------------------------------------------------------------------

Thanks an anticipation.


On Thu, Apr 4, 2019 at 1:08 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      15737
Logged by:          Chandan Ahuja
Email address:      chandanahuja7@gmail.com
PostgreSQL version: 11.1
Operating system:   RHEL
Description:       

 ....
 

-- I am referring contract_id column here from deletedata but it does not
exist
delete from core.contract where contract_id in ( select distinct contract_id
from deletedata) -- returning contract_id

No bugs; its just the hard (but common) way to really learn the difference between a correlated subquery and an independent one; and to test your deletion queries thoroughly before running them live.

Since the contract_id column has not been provided a specific table qualifier the planner is free to choose any contract_id column it can find.  Since core.contract has a contract_id column that one is chosen; and so you've created a correlated subquery that is the the equivalent of:

delete from core.contract where contract_id = contract_id;

Which you should agree deletes every row.

Writing the following would provoke the error you are expecting:

delete from core.contract where contract_id in (select distinct deletedata.contract_id from deletedata);

Then, since you implemented ON CASCADE DELETE (which is good, why are you bothering to perform a manual cascade per the above?), the deletion from core.contract cascades to core.contractcomponent.

David J.

On Fri, 5 Apr 2019 at 09:26, David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Thu, Apr 4, 2019 at 1:08 PM PG Bug reporting form <noreply@postgresql.org> wrote:
>> -- I am referring contract_id column here from deletedata but it does not
>> exist
>> delete from core.contract where contract_id in ( select distinct contract_id
>> from deletedata) -- returning contract_id
>
>
> No bugs; its just the hard (but common) way to really learn the difference between a correlated subquery and an
independentone; and to test your deletion queries thoroughly before running them live.
 

Ouch! ... the hard way to learn to always give your tables an alias
and prefix the column names with them.

There are cases where it could also happen if a column is dropped.
Best not to leave these landmines laying around:

postgres=# delete from t1 where b in(select b from t2);
DELETE 0
postgres=# alter table t2 drop column b;
ALTER TABLE
postgres=# delete from t1 where b in(select b from t2);
DELETE 1000

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Thanks much Mr David Johnston and Mr David Rowley.

Apologies for  not replying sooner. 
Yes this issue made me learn the importance of Aliasing the tables and prefixing the column names in SQL. 
I wont forget it now.
Thankfully the mistake didn't prove to be costly as i was able to restore the DB from the backup :-)

Best Regards
Chandan Ahuja






On Fri, Apr 5, 2019 at 2:44 AM David Rowley <david.rowley@2ndquadrant.com> wrote:
On Fri, 5 Apr 2019 at 09:26, David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Thu, Apr 4, 2019 at 1:08 PM PG Bug reporting form <noreply@postgresql.org> wrote:
>> -- I am referring contract_id column here from deletedata but it does not
>> exist
>> delete from core.contract where contract_id in ( select distinct contract_id
>> from deletedata) -- returning contract_id
>
>
> No bugs; its just the hard (but common) way to really learn the difference between a correlated subquery and an independent one; and to test your deletion queries thoroughly before running them live.

Ouch! ... the hard way to learn to always give your tables an alias
and prefix the column names with them.

There are cases where it could also happen if a column is dropped.
Best not to leave these landmines laying around:

postgres=# delete from t1 where b in(select b from t2);
DELETE 0
postgres=# alter table t2 drop column b;
ALTER TABLE
postgres=# delete from t1 where b in(select b from t2);
DELETE 1000

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Best Regards,
Chandan Ahuja