Allow batched insert during cross-partition updates
Based on the discussion at:
/messages/by-id/6929d485-2d2a-da46-3681-4a400a3d794f@enterprisedb.com
I'm posting the patch for $subject here in this new thread and I'll
add it to the next CF per Tomas' advice.
With 927f453a94106 committed earlier today, we limit insert batching
only to the cases where the query's main command is also insert,
because allowing it to be used in other cases can hit some limitations
of the current code.
One such case is cross-partition updates of a partitioned table which
internally uses insert. postgres_fdw supports some cases where a row
is moved from a local partition to a foreign partition. When doing
so, the moved row is inserted into the latter, but those inserts can't
use batching due to the aforementioned commit.
As described in the thread linked above, to make batching possible for
those internal inserts, we'd need to make some changes to both the
core code and postgres_fdw, which the attached patch implements.
Details are in the commit message.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
v1-0001-Allow-batching-of-inserts-during-cross-partition-.patchapplication/octet-stream; name=v1-0001-Allow-batching-of-inserts-during-cross-partition-.patchDownload+112-94
On Thu, Feb 18, 2021 at 6:52 PM Amit Langote <amitlangote09@gmail.com> wrote:
Based on the discussion at:
/messages/by-id/6929d485-2d2a-da46-3681-4a400a3d794f@enterprisedb.com
I'm posting the patch for $subject here in this new thread and I'll
add it to the next CF per Tomas' advice.
Done: https://commitfest.postgresql.org/32/2992/
--
Amit Langote
EDB: http://www.enterprisedb.com
Hi,
thanks for the patch. I had a first look and played around with the code.
The code seems clean, complete, and does what it says on the tin. I will
need a bit more time to acclimatise with all the use cases for a more
thorough review.
I small question though is why expose PartitionTupleRouting and not add
a couple of functions to get the necessary info? If I have read the code
correctly the only members actually needed to be exposed are num_partitions
and partitions. Not a critique, I am just curious.
Cheers,
//Georgios
Hi,
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, February 18, 2021 10:54 AM, Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Feb 18, 2021 at 6:52 PM Amit Langote amitlangote09@gmail.com wrote:
Based on the discussion at:
/messages/by-id/6929d485-2d2a-da46-3681-4a400a3d794f@enterprisedb.com
I'm posting the patch for $subject here in this new thread and I'll
add it to the next CF per Tomas' advice.Done:https://commitfest.postgresql.org/32/2992/
--------------------------------------------------
apparently I did not receive the review comment I sent via the commitfest app.
Apologies for the chatter. Find the message-id here:
Show quoted text
Amit Langote
EDB: http://www.enterprisedb.com
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, March 10, 2021 1:23 PM, Georgios <gkokolatos@protonmail.com> wrote:
Hi,
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, February 18, 2021 10:54 AM, Amit Langote amitlangote09@gmail.com wrote:On Thu, Feb 18, 2021 at 6:52 PM Amit Langote amitlangote09@gmail.com wrote:
Based on the discussion at:
/messages/by-id/6929d485-2d2a-da46-3681-4a400a3d794f@enterprisedb.com
I'm posting the patch for $subject here in this new thread and I'll
add it to the next CF per Tomas' advice.apparently I did not receive the review comment I sent via the commitfest app.
Apologies for the chatter. Find the message-id here:
/messages/by-id/161530518971.29967.9368488207318158252.pgcf@coridan.postgresql.org
I continued looking a bit at the patch, yet I am either failing to see fix or I am
looking at the wrong thing. Please find attached a small repro of what my expectetions
were.
As you can see in the repro, I would expect the
UPDATE local_root_remote_partitions SET a = 2;
to move the tuples to remote_partition_2 on the same transaction.
However this is not the case, with or without the patch.
Is my expectation of this patch wrong?
Cheers,
//Georgios
Show quoted text
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
Hi Georgios,
On Wed, Mar 10, 2021 at 12:54 AM Georgios Kokolatos
<gkokolatos@protonmail.com> wrote:
Hi,
thanks for the patch. I had a first look and played around with the code.
The code seems clean, complete, and does what it says on the tin. I will
need a bit more time to acclimatise with all the use cases for a more
thorough review.
Thanks for checking.
I small question though is why expose PartitionTupleRouting and not add
a couple of functions to get the necessary info? If I have read the code
correctly the only members actually needed to be exposed are num_partitions
and partitions. Not a critique, I am just curious.
I had implemented accessor functions in an earlier unposted version of
the patch, but just exposing PartitionTupleRouting does not sound so
harmful, so I switched to that approach. Maybe if others agree with
you that accessor functions would be better, I will change the patch
that way.
--
Amit Langote
EDB: http://www.enterprisedb.com
Hi Georgios,
On Wed, Mar 10, 2021 at 9:30 PM Georgios <gkokolatos@protonmail.com> wrote:
I continued looking a bit at the patch, yet I am either failing to see fix or I am
looking at the wrong thing. Please find attached a small repro of what my expectetions
were.As you can see in the repro, I would expect the
UPDATE local_root_remote_partitions SET a = 2;
to move the tuples to remote_partition_2 on the same transaction.
However this is not the case, with or without the patch.Is my expectation of this patch wrong?
I think yes. We currently don't have the feature you are looking for
-- moving tuples from one remote partition to another remote
partition. This patch is not for adding that feature.
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.
--
Amit Langote
EDB: http://www.enterprisedb.com
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, March 11, 2021 9:42 AM, Amit Langote <amitlangote09@gmail.com> wrote:
Hi Georgios,
On Wed, Mar 10, 2021 at 9:30 PM Georgios gkokolatos@protonmail.com wrote:
I continued looking a bit at the patch, yet I am either failing to see fix or I am
looking at the wrong thing. Please find attached a small repro of what my expectetions
were.
As you can see in the repro, I would expect the
UPDATE local_root_remote_partitions SET a = 2;
to move the tuples to remote_partition_2 on the same transaction.
However this is not the case, with or without the patch.
Is my expectation of this patch wrong?I think yes. We currently don't have the feature you are looking for
-- moving tuples from one remote partition to another remote
partition. This patch is not for adding that feature.
Thank you for correcting me.
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.
So, if I understand correctly then in my previously attached repro I
should have written instead:
CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
CREATE TABLE
local_root_local_partition_1
PARTITION OF
local_root_remote_partitions FOR VALUES IN (1);
CREATE FOREIGN TABLE
local_root_remote_partition_2
PARTITION OF
local_root_remote_partitions FOR VALUES IN (2)
SERVER
remote_server
OPTIONS (
table_name 'remote_partition_2',
batch_size '10'
);
INSERT INTO local_root_remote_partitions VALUES (1), (1);
-- Everything should be on local_root_local_partition_1 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
UPDATE local_root_remote_partitions SET a = 2;
-- Everything should be on remote_partition_2 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
I am guessing that I am still wrong because the UPDATE operation above will
fail due to the restrictions imposed in postgresBeginForeignInsert regarding
UPDATES.
Would it be too much to ask for the addition of a test case that will
demonstrate the change of behaviour found in patch?
Cheers,
//Georgios
Show quoted text
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Amit Langote
EDB: http://www.enterprisedb.com
On Thu, Mar 11, 2021 at 8:36 PM <gkokolatos@pm.me> wrote:
On Thursday, March 11, 2021 9:42 AM, Amit Langote <amitlangote09@gmail.com> wrote:
On Wed, Mar 10, 2021 at 9:30 PM Georgios gkokolatos@protonmail.com wrote:
I continued looking a bit at the patch, yet I am either failing to see fix or I am
looking at the wrong thing. Please find attached a small repro of what my expectetions
were.
As you can see in the repro, I would expect the
UPDATE local_root_remote_partitions SET a = 2;
to move the tuples to remote_partition_2 on the same transaction.
However this is not the case, with or without the patch.
Is my expectation of this patch wrong?I think yes. We currently don't have the feature you are looking for
-- moving tuples from one remote partition to another remote
partition. This patch is not for adding that feature.Thank you for correcting me.
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.So, if I understand correctly then in my previously attached repro I
should have written instead:CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
CREATE TABLE
local_root_local_partition_1
PARTITION OF
local_root_remote_partitions FOR VALUES IN (1);CREATE FOREIGN TABLE
local_root_remote_partition_2
PARTITION OF
local_root_remote_partitions FOR VALUES IN (2)
SERVER
remote_server
OPTIONS (
table_name 'remote_partition_2',
batch_size '10'
);INSERT INTO local_root_remote_partitions VALUES (1), (1);
-- Everything should be on local_root_local_partition_1 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;UPDATE local_root_remote_partitions SET a = 2;
-- Everything should be on remote_partition_2 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;I am guessing that I am still wrong because the UPDATE operation above will
fail due to the restrictions imposed in postgresBeginForeignInsert regarding
UPDATES.
Yeah, for the move to work without hitting the restriction you
mention, you will need to write the UPDATE query such that
local_root_remote_partition_2 is not updated. For example, as
follows:
UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
With this query, the remote partition is not one of the result
relations to be updated, so is able to escape that restriction.
Would it be too much to ask for the addition of a test case that will
demonstrate the change of behaviour found in patch.
Hmm, I don't think there's a way to display whether the INSERT done on
the remote partition as a part of an (tuple-moving) UPDATE used
batching or not. That's because that INSERT's state is hidden from
EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
INSERT's state (especially its batch size) under the original UPDATE
node, but I am not sure.
By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.
--
Amit Langote
EDB: http://www.enterprisedb.com
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Friday, March 12, 2021 3:45 AM, Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Mar 11, 2021 at 8:36 PM gkokolatos@pm.me wrote:
On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09@gmail.com wrote:
On Wed, Mar 10, 2021 at 9:30 PM Georgios gkokolatos@protonmail.com wrote:
I continued looking a bit at the patch, yet I am either failing to see fix or I am
looking at the wrong thing. Please find attached a small repro of what my expectetions
were.
As you can see in the repro, I would expect the
UPDATE local_root_remote_partitions SET a = 2;
to move the tuples to remote_partition_2 on the same transaction.
However this is not the case, with or without the patch.
Is my expectation of this patch wrong?I think yes. We currently don't have the feature you are looking for
-- moving tuples from one remote partition to another remote
partition. This patch is not for adding that feature.Thank you for correcting me.
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.So, if I understand correctly then in my previously attached repro I
should have written instead:CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
CREATE TABLE
local_root_local_partition_1
PARTITION OF
local_root_remote_partitions FOR VALUES IN (1);CREATE FOREIGN TABLE
local_root_remote_partition_2
PARTITION OF
local_root_remote_partitions FOR VALUES IN (2)
SERVER
remote_server
OPTIONS (
table_name 'remote_partition_2',
batch_size '10'
);INSERT INTO local_root_remote_partitions VALUES (1), (1);
-- Everything should be on local_root_local_partition_1 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;UPDATE local_root_remote_partitions SET a = 2;
-- Everything should be on remote_partition_2 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;I am guessing that I am still wrong because the UPDATE operation above will
fail due to the restrictions imposed in postgresBeginForeignInsert regarding
UPDATES.Yeah, for the move to work without hitting the restriction you
mention, you will need to write the UPDATE query such that
local_root_remote_partition_2 is not updated. For example, as
follows:UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
With this query, the remote partition is not one of the result
relations to be updated, so is able to escape that restriction.
Excellent. Thank you for the explanation and patience.
Would it be too much to ask for the addition of a test case that will
demonstrate the change of behaviour found in patch.Hmm, I don't think there's a way to display whether the INSERT done on
the remote partition as a part of an (tuple-moving) UPDATE used
batching or not. That's because that INSERT's state is hidden from
EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
INSERT's state (especially its batch size) under the original UPDATE
node, but I am not sure.
Yeah, there does not seem to be a way for explain to do show that information
with the current code.
By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.
I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:
INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.
I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.
Or is this a wrong assumption of mine?
Cheers,
//Georgios
Show quoted text
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
Hi Georgios,
On Fri, Mar 12, 2021 at 7:59 PM <gkokolatos@pm.me> wrote:
On Friday, March 12, 2021 3:45 AM, Amit Langote <amitlangote09@gmail.com> wrote:
On Thu, Mar 11, 2021 at 8:36 PM gkokolatos@pm.me wrote:
On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09@gmail.com wrote:
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.So, if I understand correctly then in my previously attached repro I
should have written instead:CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
CREATE TABLE
local_root_local_partition_1
PARTITION OF
local_root_remote_partitions FOR VALUES IN (1);CREATE FOREIGN TABLE
local_root_remote_partition_2
PARTITION OF
local_root_remote_partitions FOR VALUES IN (2)
SERVER
remote_server
OPTIONS (
table_name 'remote_partition_2',
batch_size '10'
);INSERT INTO local_root_remote_partitions VALUES (1), (1);
-- Everything should be on local_root_local_partition_1 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;UPDATE local_root_remote_partitions SET a = 2;
-- Everything should be on remote_partition_2 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;I am guessing that I am still wrong because the UPDATE operation above will
fail due to the restrictions imposed in postgresBeginForeignInsert regarding
UPDATES.Yeah, for the move to work without hitting the restriction you
mention, you will need to write the UPDATE query such that
local_root_remote_partition_2 is not updated. For example, as
follows:UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
With this query, the remote partition is not one of the result
relations to be updated, so is able to escape that restriction.Excellent. Thank you for the explanation and patience.
Would it be too much to ask for the addition of a test case that will
demonstrate the change of behaviour found in patch.Hmm, I don't think there's a way to display whether the INSERT done on
the remote partition as a part of an (tuple-moving) UPDATE used
batching or not. That's because that INSERT's state is hidden from
EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
INSERT's state (especially its batch size) under the original UPDATE
node, but I am not sure.Yeah, there does not seem to be a way for explain to do show that information
with the current code.By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.
The test case "works" both before and after the patch, with the
difference being in the form of the remote query. It seems to me
though that you do get that.
I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.Or is this a wrong assumption of mine?
No, I think you have a good idea here.
I've adjusted that test case to confirm that the batching indeed works
by checking cmin of the moved rows, as you suggest. Please check the
attached updated patch.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
v2-0001-Allow-batching-of-inserts-during-cross-partition-.patchapplication/octet-stream; name=v2-0001-Allow-batching-of-inserts-during-cross-partition-.patchDownload+133-96
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 16, 2021 6:13 AM, Amit Langote <amitlangote09@gmail.com> wrote:
Hi Georgios,
On Fri, Mar 12, 2021 at 7:59 PM gkokolatos@pm.me wrote:
On Friday, March 12, 2021 3:45 AM, Amit Langote amitlangote09@gmail.com wrote:
On Thu, Mar 11, 2021 at 8:36 PM gkokolatos@pm.me wrote:
On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09@gmail.com wrote:
What we do support however is moving rows from a local partition to a
remote partition and that involves performing an INSERT on the latter.
This patch is for teaching those INSERTs to use batched mode if
allowed, which is currently prohibited. So with this patch, if an
UPDATE moves 10 rows from a local partition to a remote partition,
then they will be inserted with a single INSERT command containing all
10 rows, instead of 10 separate INSERT commands.So, if I understand correctly then in my previously attached repro I
should have written instead:CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
CREATE TABLE
local_root_local_partition_1
PARTITION OF
local_root_remote_partitions FOR VALUES IN (1);CREATE FOREIGN TABLE
local_root_remote_partition_2
PARTITION OF
local_root_remote_partitions FOR VALUES IN (2)
SERVER
remote_server
OPTIONS (
table_name 'remote_partition_2',
batch_size '10'
);INSERT INTO local_root_remote_partitions VALUES (1), (1);
-- Everything should be on local_root_local_partition_1 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;UPDATE local_root_remote_partitions SET a = 2;
-- Everything should be on remote_partition_2 and on the same transaction
SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;I am guessing that I am still wrong because the UPDATE operation above will
fail due to the restrictions imposed in postgresBeginForeignInsert regarding
UPDATES.Yeah, for the move to work without hitting the restriction you
mention, you will need to write the UPDATE query such that
local_root_remote_partition_2 is not updated. For example, as
follows:
UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
With this query, the remote partition is not one of the result
relations to be updated, so is able to escape that restriction.Excellent. Thank you for the explanation and patience.
Would it be too much to ask for the addition of a test case that will
demonstrate the change of behaviour found in patch.Hmm, I don't think there's a way to display whether the INSERT done on
the remote partition as a part of an (tuple-moving) UPDATE used
batching or not. That's because that INSERT's state is hidden from
EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
INSERT's state (especially its batch size) under the original UPDATE
node, but I am not sure.Yeah, there does not seem to be a way for explain to do show that information
with the current code.By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:
INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.The test case "works" both before and after the patch, with the
difference being in the form of the remote query. It seems to me
though that you do get that.I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.
Or is this a wrong assumption of mine?No, I think you have a good idea here.
Thank you.
I've adjusted that test case to confirm that the batching indeed works
by checking cmin of the moved rows, as you suggest. Please check the
attached updated patch.
Excellent. The patch in the current version with the added test seems
ready to me.
I would still vote to have accessor functions instead of exposing the
whole PartitionTupleRouting struct, but I am not going to hold a too
strong stance about it.
If you agree with me, please provide an updated version of the patch.
Otherwise let it be known and I will flag the patch as RfC in the
commitfest app.
Cheers,
//Georgios
Show quoted text
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Amit Langote
EDB: http://www.enterprisedb.com
Hi Georgios,
On Tue, Mar 16, 2021 at 5:12 PM <gkokolatos@pm.me> wrote:
On Tuesday, March 16, 2021 6:13 AM, Amit Langote <amitlangote09@gmail.com> wrote:
On Fri, Mar 12, 2021 at 7:59 PM gkokolatos@pm.me wrote:
On Friday, March 12, 2021 3:45 AM, Amit Langote amitlangote09@gmail.com wrote:
By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:
INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.The test case "works" both before and after the patch, with the
difference being in the form of the remote query. It seems to me
though that you do get that.I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.
Or is this a wrong assumption of mine?No, I think you have a good idea here.
I've adjusted that test case to confirm that the batching indeed works
by checking cmin of the moved rows, as you suggest. Please check the
attached updated patch.Excellent. The patch in the current version with the added test seems
ready to me.
Thanks for quickly checking that.
I would still vote to have accessor functions instead of exposing the
whole PartitionTupleRouting struct, but I am not going to hold a too
strong stance about it.
I as well, although I would wait for others to chime in before
updating the patch that way.
--
Amit Langote
EDB: http://www.enterprisedb.com
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 16, 2021 9:59 AM, Amit Langote <amitlangote09@gmail.com> wrote:
Hi Georgios,
On Tue, Mar 16, 2021 at 5:12 PM gkokolatos@pm.me wrote:
On Tuesday, March 16, 2021 6:13 AM, Amit Langote amitlangote09@gmail.com wrote:
On Fri, Mar 12, 2021 at 7:59 PM gkokolatos@pm.me wrote:
On Friday, March 12, 2021 3:45 AM, Amit Langote amitlangote09@gmail.com wrote:
By the way, the test case added by commit 927f453a94106 does exercise
the code added by this patch, but as I said in the last paragraph, we
can't verify that by inspecting EXPLAIN output.I never doubted that. However, there is a difference. The current patch
changes the query to be executed in the remote from:
INSERT INTO <snip> VALUES ($1);
to:
INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
When this patch gets in, it would be very helpful to know that subsequent
code changes will not cause regressions. So I was wondering if there is
a way to craft a test case that would break for the code in 927f453a94106
yet succeed with the current patch.The test case "works" both before and after the patch, with the
difference being in the form of the remote query. It seems to me
though that you do get that.I attach version 2 of my small reproduction. I am under the impression that
in this version, examining the value of cmin in the remote table should
give an indication of whether the remote received a multiple insert queries
with a single value, or a single insert query with multiple values.
Or is this a wrong assumption of mine?No, I think you have a good idea here.
I've adjusted that test case to confirm that the batching indeed works
by checking cmin of the moved rows, as you suggest. Please check the
attached updated patch.Excellent. The patch in the current version with the added test seems
ready to me.Thanks for quickly checking that.
A pleasure.
I would still vote to have accessor functions instead of exposing the
whole PartitionTupleRouting struct, but I am not going to hold a too
strong stance about it.I as well, although I would wait for others to chime in before
updating the patch that way.
Fair enough.
Status updated to RfC in the commitfest app.
Show quoted text
----------------------------------------------------------------------------------------------
Amit Langote
EDB: http://www.enterprisedb.com
On Tue, Mar 16, 2021 at 6:13 PM <gkokolatos@pm.me> wrote:
Status updated to RfC in the commitfest app.
Patch fails to apply per cfbot, so rebased.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
v3-0001-Allow-batching-of-inserts-during-cross-partition-.patchapplication/octet-stream; name=v3-0001-Allow-batching-of-inserts-during-cross-partition-.patchDownload+133-96
Hi,
In the description:
cross-partition update of partitioned tables can't use batching
because ExecInitRoutingInfo() which initializes the insert target
'which' should be dropped since 'because' should start a sentence.
+-- Check that batched inserts also works for inserts made during
inserts also works -> inserts also work
+ Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind ==
+ RELKIND_PARTITIONED_TABLE);
The level of nested field accesses is quite deep. If the assertion fails,
it would be hard to know which field is null.
Maybe use several assertions:
Assert(node->rootResultRelInfo)
Assert(node->rootResultRelInfo->ri_RelationDesc)
Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind ==
...
Cheers
On Sun, Apr 4, 2021 at 8:06 AM Amit Langote <amitlangote09@gmail.com> wrote:
Show quoted text
On Tue, Mar 16, 2021 at 6:13 PM <gkokolatos@pm.me> wrote:
Status updated to RfC in the commitfest app.
Patch fails to apply per cfbot, so rebased.
--
Amit Langote
EDB: http://www.enterprisedb.com
On Mon, Apr 5, 2021 at 1:16 AM Zhihong Yu <zyu@yugabyte.com> wrote:
Hi,
In the description:cross-partition update of partitioned tables can't use batching
because ExecInitRoutingInfo() which initializes the insert target'which' should be dropped since 'because' should start a sentence.
+-- Check that batched inserts also works for inserts made during
inserts also works -> inserts also work
+ Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == + RELKIND_PARTITIONED_TABLE);The level of nested field accesses is quite deep. If the assertion fails, it would be hard to know which field is null.
Maybe use several assertions:
Assert(node->rootResultRelInfo)
Assert(node->rootResultRelInfo->ri_RelationDesc)
Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == ...
Thanks for taking a look at this.
While I agree about having the 1st Assert you suggest, I don't think
this code needs the 2nd one, because its failure would very likely be
due to a problem in some totally unrelated code.
Updated patch attached. I had to adjust the test case a little bit to
account for the changes of 86dc90056d, something I failed to notice
yesterday. Also, I expanded the test case added in postgres_fdw.sql a
bit to show the batching in action more explicitly.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
v4-0001-Allow-batching-of-inserts-during-cross-partition-.patchapplication/octet-stream; name=v4-0001-Allow-batching-of-inserts-during-cross-partition-.patchDownload+191-103
On Tue, Apr 6, 2021 at 3:08 PM Amit Langote <amitlangote09@gmail.com> wrote:
On Mon, Apr 5, 2021 at 1:16 AM Zhihong Yu <zyu@yugabyte.com> wrote:
Hi,
In the description:cross-partition update of partitioned tables can't use batching
because ExecInitRoutingInfo() which initializes the insert target'which' should be dropped since 'because' should start a sentence.
+-- Check that batched inserts also works for inserts made during
inserts also works -> inserts also work
+ Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == + RELKIND_PARTITIONED_TABLE);The level of nested field accesses is quite deep. If the assertion fails, it would be hard to know which field is null.
Maybe use several assertions:
Assert(node->rootResultRelInfo)
Assert(node->rootResultRelInfo->ri_RelationDesc)
Assert(node->rootResultRelInfo->ri_RelationDesc->rd_rel->relkind == ...Thanks for taking a look at this.
While I agree about having the 1st Assert you suggest, I don't think
this code needs the 2nd one, because its failure would very likely be
due to a problem in some totally unrelated code.Updated patch attached. I had to adjust the test case a little bit to
account for the changes of 86dc90056d, something I failed to notice
yesterday. Also, I expanded the test case added in postgres_fdw.sql a
bit to show the batching in action more explicitly.
Some minor comments:
1) don't we need order by clause for the selects in the tests added?
+SELECT tableoid::regclass, * FROM batch_cp_upd_test;
+SELECT cmin, * FROM batch_cp_upd_test1;
2) typo - it is "should" not "shoud"
+-- cmin shoud be different across rows, because each one would be inserted
3) will the cmin in the output always be the same?
+SELECT cmin, * FROM batch_cp_upd_test3;
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Apr 6, 2021 at 6:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Tue, Apr 6, 2021 at 3:08 PM Amit Langote <amitlangote09@gmail.com> wrote:
Updated patch attached. I had to adjust the test case a little bit to
account for the changes of 86dc90056d, something I failed to notice
yesterday. Also, I expanded the test case added in postgres_fdw.sql a
bit to show the batching in action more explicitly.Some minor comments:
Thanks for the review.
1) don't we need order by clause for the selects in the tests added?
+SELECT tableoid::regclass, * FROM batch_cp_upd_test;
Good point. It wasn't necessary before, but it is after the test
expansion, so added.
3) will the cmin in the output always be the same?
+SELECT cmin, * FROM batch_cp_upd_test3;
TBH, I am not so sure. Maybe it's not a good idea to rely on cmin
after all. I've rewritten the tests to use a different method of
determining if a single or multiple insert commands were used in
moving rows into foreign partitions.
--
Amit Langote
EDB: http://www.enterprisedb.com
Attachments:
v5-0001-Allow-batching-of-inserts-during-cross-partition-.patchapplication/octet-stream; name=v5-0001-Allow-batching-of-inserts-during-cross-partition-.patchDownload+191-107
On Tue, Apr 6, 2021 at 6:37 PM Amit Langote <amitlangote09@gmail.com> wrote:
3) will the cmin in the output always be the same?
+SELECT cmin, * FROM batch_cp_upd_test3;TBH, I am not so sure. Maybe it's not a good idea to rely on cmin
after all. I've rewritten the tests to use a different method of
determining if a single or multiple insert commands were used in
moving rows into foreign partitions.
Thanks! It looks good!
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com