BUG #15737: Unexpectedly deleting full table when referring to CTE (WITH clause) data in a subquery in another CTE
The following bug has been logged on the website:
Bug reference: 15737
Logged by: Chandan Ahuja
Email address: chandanahuja7@gmail.com
PostgreSQL version: 11.1
Operating system: RHEL
Description:
Hi,
I ran into unexpected behavior when using a CTE (WITH clause) query to
delete rows from multiple tables. As a result, my entire data set
was deleted. It looks like this is a bug.
I have 3 tables: (1) Contract, (2) ContractComponent and (3) ContractRole.
Let's call them CT, CC and CR respectively.
CT has a one-to-many relation with CC.
CC has a one-to-many relation with CR.
Table structure:
(1) Contract (contract_id (PK))
(2) ContractComponent (contractcomponent_id (PK), contract_id (FK))
(3) ContractRole (contractrole_id (PK), contractcomponent_id (FK),
    party_source_system_record_pk)
I have applied foreign key constraints to these 3 tables with the ON DELETE
CASCADE option.
I wanted to delete ONLY one particular Contract (CT) record and its
corresponding records in the CC and CR tables, so I wrote the following query.
This query gives 2 strange results which I am not able to fathom, and it
appears to me that it is a BUG.
ISSUES/BUG:
1. The contract_id column is not specified in the select clause of
"deletedata", yet PostgreSQL does not complain in "delct" and goes on to
execute the SQL.
2. The bigger issue is that on execution it deletes all the records from
the Contract and ContractComponent tables, and not just the ones meeting the
filter clause specified (where cr.party_source_system_record_pk =
'20000151686').
------------------------------------------------------------------------------------------------------------------------------------------------------------
with deletedata as
(
    select cc.contractcomponent_id
    -- Contract_id column is not specified in the select clause of
    -- "deletedata", still PostgreSQL does not complain in "delct"
    from core.contractrole cr
    join core.contractcomponent cc
        on cc.contractcomponent_id = cr.contractcomponent_id
    join core.contract ct on cc.contract_id = ct.contract_id
    where cr.party_source_system_record_pk = '20000151686'
)
, delcc as
(
    delete from core.contractcomponent
    where contractcomponent_id in (select distinct contractcomponent_id from deletedata)
    -- returning contractcomponent_id
)
, delct as
(
    -- I am referring to the contract_id column from deletedata here, but it
    -- does not exist
    delete from core.contract
    where contract_id in (select distinct contract_id from deletedata)
    -- returning contract_id
)
select distinct contractcomponent_id from deletedata
------------------------------------------------------------------------------------------------------------------------------------------------------------
Thanks in anticipation.
On Thu, Apr 4, 2019 at 1:08 PM PG Bug reporting form <noreply@postgresql.org>
wrote:
-- I am referring contract_id column here from deletedata but it does not exist
delete from core.contract where contract_id in ( select distinct contract_id
from deletedata) -- returning contract_id
No bugs; it's just the hard (but common) way to really learn the difference
between a correlated subquery and an independent one, and to test your
deletion queries thoroughly before running them live.
Since the contract_id column in the subquery has no table qualifier, the
name is looked up first among the tables in the subquery's own FROM clause
and then in the enclosing query. deletedata has no contract_id column but
core.contract does, so that one is chosen, and you've created a correlated
subquery that (as long as deletedata returns at least one row) is the
equivalent of:
delete from core.contract where contract_id = contract_id;
Which you should agree deletes every row.
Writing the following would provoke the error you are expecting:
delete from core.contract where contract_id in (select distinct
deletedata.contract_id from deletedata);
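The two forms can be compared in a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite resolves the unqualified name to the outer table in the same way), a plain table named deletedata in place of the CTE, and three made-up contract rows; none of that comes from the original report.

```python
import sqlite3

# Hypothetical miniature of the thread's tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (contract_id INTEGER PRIMARY KEY);
    -- Stand-in for the CTE's result: note that it has no contract_id column.
    CREATE TABLE deletedata (contractcomponent_id INTEGER);
    INSERT INTO contract VALUES (1), (2), (3);
    INSERT INTO deletedata VALUES (10);
""")

# Unqualified name: contract_id is not found in deletedata, so it binds to
# the outer table's column and the predicate is true for every row.
cur = conn.execute(
    "DELETE FROM contract "
    "WHERE contract_id IN (SELECT DISTINCT contract_id FROM deletedata)"
)
deleted_unqualified = cur.rowcount
conn.rollback()  # undo the accidental full-table delete

# Qualified name: the lookup is restricted to deletedata, so the statement
# is rejected instead of silently matching the outer column.
try:
    conn.execute(
        "DELETE FROM contract "
        "WHERE contract_id IN (SELECT DISTINCT deletedata.contract_id FROM deletedata)"
    )
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

remaining = conn.execute("SELECT count(*) FROM contract").fetchone()[0]
print(deleted_unqualified, qualified_error, remaining)
```

The unqualified form removes all three rows, while the qualified form fails before any row is touched.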
Then, since you implemented ON DELETE CASCADE (which is good, but why are you
bothering to perform a manual cascade per the above?), the deletion from
core.contract cascades to core.contractcomponent.
David J.
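Because the foreign keys already cascade, deleting the parent row alone should be enough. The following is a rough sketch of that idea, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL, invented sample rows, and the core. schema prefix dropped; the table and column names follow the report, everything else is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE contract (
        contract_id INTEGER PRIMARY KEY
    );
    CREATE TABLE contractcomponent (
        contractcomponent_id INTEGER PRIMARY KEY,
        contract_id INTEGER NOT NULL
            REFERENCES contract (contract_id) ON DELETE CASCADE
    );
    CREATE TABLE contractrole (
        contractrole_id INTEGER PRIMARY KEY,
        contractcomponent_id INTEGER NOT NULL
            REFERENCES contractcomponent (contractcomponent_id) ON DELETE CASCADE,
        party_source_system_record_pk TEXT
    );
    -- Made-up sample data: two contracts, one component and one role each.
    INSERT INTO contract VALUES (1), (2);
    INSERT INTO contractcomponent VALUES (10, 1), (20, 2);
    INSERT INTO contractrole VALUES (100, 10, '20000151686'), (200, 20, 'other');
""")

# One DELETE on the parent table; every column reference is qualified, so a
# missing column would raise an error instead of binding to an outer table.
conn.execute(
    """
    DELETE FROM contract
    WHERE contract.contract_id IN (
        SELECT cc.contract_id
        FROM contractrole AS cr
        JOIN contractcomponent AS cc
          ON cc.contractcomponent_id = cr.contractcomponent_id
        WHERE cr.party_source_system_record_pk = ?
    )
    """,
    ("20000151686",),
)
conn.commit()

# Row counts per table after the cascading delete.
counts = {
    table: conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
    for table in ("contract", "contractcomponent", "contractrole")
}
left = [row[0] for row in conn.execute("SELECT contract_id FROM contract")]
print(counts, left)
```

Only contract 1 and its dependent rows disappear; contract 2 and its children are left in place.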
On Fri, 5 Apr 2019 at 09:26, David G. Johnston
<david.g.johnston@gmail.com> wrote:
On Thu, Apr 4, 2019 at 1:08 PM PG Bug reporting form <noreply@postgresql.org> wrote:
-- I am referring contract_id column here from deletedata but it does not exist
delete from core.contract where contract_id in ( select distinct contract_id
from deletedata) -- returning contract_id
No bugs; it's just the hard (but common) way to really learn the difference
between a correlated subquery and an independent one; and to test your
deletion queries thoroughly before running them live.
Ouch! ... the hard way to learn to always give your tables an alias
and prefix the column names with them.
The same thing can also happen when a column is dropped later.
Best not to leave these landmines lying around:
postgres=# delete from t1 where b in(select b from t2);
DELETE 0
postgres=# alter table t2 drop column b;
ALTER TABLE
postgres=# delete from t1 where b in(select b from t2);
DELETE 1000
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
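The dropped-column case can be reproduced in a small sketch that contrasts bare column names with alias-prefixed ones. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, made-up rows in t1 and t2, and a hypothetical helper try_delete; the column drop is simulated by rebuilding t2 without b, since not every SQLite version supports ALTER TABLE ... DROP COLUMN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b INTEGER);
    CREATE TABLE t2 (a INTEGER, b INTEGER);
    INSERT INTO t1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO t2 VALUES (1, 99);
""")

BARE = "DELETE FROM t1 WHERE b IN (SELECT b FROM t2)"
QUALIFIED = "DELETE FROM t1 WHERE t1.b IN (SELECT x.b FROM t2 AS x)"


def try_delete(sql):
    """Run a DELETE, return the rows affected or the error text, then roll back."""
    try:
        return conn.execute(sql).rowcount
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"
    finally:
        conn.rollback()


# While t2.b exists, both statements behave identically.
before = (try_delete(BARE), try_delete(QUALIFIED))

# Stand-in for "ALTER TABLE t2 DROP COLUMN b": rebuild t2 without column b.
conn.executescript("""
    DROP TABLE t2;
    CREATE TABLE t2 (a INTEGER);
    INSERT INTO t2 VALUES (1);
""")

# Now the bare name silently binds to t1.b; the qualified one is rejected.
after = (try_delete(BARE), try_delete(QUALIFIED))
print(before)
print(after)
```

Before the column is removed, neither statement deletes anything. Afterwards the bare version wipes t1, while the alias-prefixed version fails with an error, which is the safer outcome.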
Thanks very much, Mr. David Johnston and Mr. David Rowley.
Apologies for not replying sooner.
Yes, this issue taught me the importance of aliasing tables and
prefixing column names in SQL.
I won't forget it now.
Thankfully, the mistake didn't prove costly, as I was able to restore
the DB from the backup :-)
Best Regards
Chandan Ahuja