BUG #17413: update of partitioned table via postgres_fdw updates to much rows

Started by PG Bug reporting formabout 4 years ago4 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17413
Logged by: Stepan Yankevych
Email address: stepya@ukr.net
PostgreSQL version: 14.2
Operating system: CentOS
Description:

We noticed that update foreign table in some cases passes following update
to the remote DB
update part_update_test
set field=$2
where ctid=$1
In that case one row from each partition can be updated.
See steps to reproduce

-- pgprod1
drop table if exists trash.part_update_test;

CREATE TABLE trash.part_update_test (
id serial,
date_id int4 NOT NULL,
simple_text varchar
) PARTITION BY RANGE (date_id);

CREATE TABLE trash.part_update_test_20220221 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
CREATE TABLE trash.part_update_test_20220222 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
CREATE TABLE trash.part_update_test_20220223 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);

insert into trash.part_update_test (date_id, simple_text)
values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
am the third row ');

select ctid, *
from trash.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+-------------------+
(0,1)| 1|20220221|Im 20220221 |
(0,1)| 2|20220222|I amd 20220222 |
(0,1)| 3|20220223|I am the third row |

-- pgprod2
DROP FOREIGN TABLE if EXISTS staging.part_update_test;
IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
FROM SERVER postgresprod
into staging;

with ids as materialized (select 1 as id, 20220221 as date_id )
update staging.part_update_test t
set simple_text = 'I am updated version of 20220221 '
from ids
where t.id = ids.id
and t.date_id = ids.date_id ;

select ctid, *
from staging.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+----------------------------------+
(0,2)| 1|20220221|I am updated version of 20220221 |
(0,2)| 2|20220222|I am updated version of 20220221 |
(0,2)| 3|20220223|I am updated version of 20220221 |

As you can see all of rows that had (0,1) where updated , but the only first
row with ID =1 had to be updated .
The same was reproducible at least in PG14.1

#2Japin Li
japinli@hotmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 17413
Logged by: Stepan Yankevych
Email address: stepya@ukr.net
PostgreSQL version: 14.2
Operating system: CentOS
Description:

We noticed that update foreign table in some cases passes following update
to the remote DB
update part_update_test
set field=$2
where ctid=$1
In that case one row from each partition can be updated.
See steps to reproduce

-- pgprod1
drop table if exists trash.part_update_test;

CREATE TABLE trash.part_update_test (
id serial,
date_id int4 NOT NULL,
simple_text varchar
) PARTITION BY RANGE (date_id);

CREATE TABLE trash.part_update_test_20220221 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
CREATE TABLE trash.part_update_test_20220222 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
CREATE TABLE trash.part_update_test_20220223 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);

insert into trash.part_update_test (date_id, simple_text)
values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
am the third row ');

select ctid, *
from trash.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+-------------------+
(0,1)| 1|20220221|Im 20220221 |
(0,1)| 2|20220222|I amd 20220222 |
(0,1)| 3|20220223|I am the third row |

-- pgprod2
DROP FOREIGN TABLE if EXISTS staging.part_update_test;
IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
FROM SERVER postgresprod
into staging;

with ids as materialized (select 1 as id, 20220221 as date_id )
update staging.part_update_test t
set simple_text = 'I am updated version of 20220221 '
from ids
where t.id = ids.id
and t.date_id = ids.date_id ;

select ctid, *
from staging.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+----------------------------------+
(0,2)| 1|20220221|I am updated version of 20220221 |
(0,2)| 2|20220222|I am updated version of 20220221 |
(0,2)| 3|20220223|I am updated version of 20220221 |

As you can see all of rows that had (0,1) where updated , but the only first
row with ID =1 had to be updated .
The same was reproducible at least in PG14.1

Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple
id is as unique, however, it ignores the partitioned table, in which the tuple
id may be duplicated.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

#3Stepan Yankevych
Stepan_Yankevych@epam.com
In reply to: Japin Li (#2)
RE: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

I propose to add secondary check of all field values from the where condition.
So in that particular case it could be something like that

update part_update_test
set field=$2
where ctid=$1
and id = $3
and date_id = $4;

Generally speaking it can even improve performance dur to partition pruning.

-----Original Message-----
From: Japin Li <japinli@hotmail.com>
Sent: Tuesday, February 22, 2022 7:09 PM
To: stepya@ukr.net; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 17413
Logged by: Stepan Yankevych
Email address: stepya@ukr.net
PostgreSQL version: 14.2
Operating system: CentOS
Description:

We noticed that update foreign table in some cases passes following
update to the remote DB update part_update_test set field=$2 where
ctid=$1 In that case one row from each partition can be updated.
See steps to reproduce

-- pgprod1
drop table if exists trash.part_update_test;

CREATE TABLE trash.part_update_test (
id serial,
date_id int4 NOT NULL,
simple_text varchar
) PARTITION BY RANGE (date_id);

CREATE TABLE trash.part_update_test_20220221 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
CREATE TABLE trash.part_update_test_20220222 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
CREATE TABLE trash.part_update_test_20220223 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);

insert into trash.part_update_test (date_id, simple_text) values
(20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
am the third row ');

select ctid, *
from trash.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+-------------------+
(0,1)| 1|20220221|Im 20220221 |
(0,1)| 2|20220222|I amd 20220222 |
(0,1)| 3|20220223|I am the third row |

-- pgprod2
DROP FOREIGN TABLE if EXISTS staging.part_update_test; IMPORT FOREIGN
SCHEMA "trash" LIMIT TO (part_update_test) FROM SERVER postgresprod
into staging;

with ids as materialized (select 1 as id, 20220221 as date_id )
update staging.part_update_test t set simple_text = 'I am updated
version of 20220221 '
from ids
where t.id = ids.id
and t.date_id = ids.date_id ;

select ctid, *
from staging.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+----------------------------------+
(0,2)| 1|20220221|I am updated version of 20220221 | (0,2)|
2|20220222|I am updated version of 20220221 | (0,2)| 3|20220223|I am
updated version of 20220221 |

As you can see all of rows that had (0,1) where updated , but the only
first row with ID =1 had to be updated .
The same was reproducible at least in PG14.1

Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple id is as unique, however, it ignores the partitioned table, in which the tuple id may be duplicated.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Japin Li (#2)
Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

On Wed, Feb 23, 2022 at 2:09 AM Japin Li <japinli@hotmail.com> wrote:

On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:

The following bug has been logged on the website:

See steps to reproduce

-- pgprod1
drop table if exists trash.part_update_test;

CREATE TABLE trash.part_update_test (
id serial,
date_id int4 NOT NULL,
simple_text varchar
) PARTITION BY RANGE (date_id);

CREATE TABLE trash.part_update_test_20220221 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
CREATE TABLE trash.part_update_test_20220222 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
CREATE TABLE trash.part_update_test_20220223 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);

insert into trash.part_update_test (date_id, simple_text)
values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
am the third row ');

select ctid, *
from trash.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+-------------------+
(0,1)| 1|20220221|Im 20220221 |
(0,1)| 2|20220222|I amd 20220222 |
(0,1)| 3|20220223|I am the third row |

-- pgprod2
DROP FOREIGN TABLE if EXISTS staging.part_update_test;
IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
FROM SERVER postgresprod
into staging;

with ids as materialized (select 1 as id, 20220221 as date_id )
update staging.part_update_test t
set simple_text = 'I am updated version of 20220221 '
from ids
where t.id = ids.id
and t.date_id = ids.date_id ;

select ctid, *
from staging.part_update_test;

ctid |id|date_id |simple_text |
-----+--+--------+----------------------------------+
(0,2)| 1|20220221|I am updated version of 20220221 |
(0,2)| 2|20220222|I am updated version of 20220221 |
(0,2)| 3|20220223|I am updated version of 20220221 |

As you can see all of rows that had (0,1) where updated , but the only first
row with ID =1 had to be updated .
The same was reproducible at least in PG14.1

Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple
id is as unique, however, it ignores the partitioned table, in which the tuple
id may be duplicated.

Yeah, this is a known issue [1]/messages/by-id/CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS+OxcQo=aBDn1COywmcg@mail.gmail.com. I created a patch for the issue, but
the patch needs more work. It’s on my TODO list to revisit the patch.

Best regards,
Etsuro Fujita

[1]: /messages/by-id/CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS+OxcQo=aBDn1COywmcg@mail.gmail.com