postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

Started by Etsuro Fujita · over 4 years ago · 46 messages · pgsql-hackers
#1Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp

Hi,

As I said before [1], I’m working on $SUBJECT. Attached is a WIP
patch for that. The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled, 1) asynchronously send
COMMIT TRANSACTION (RELEASE SAVEPOINT) to all remote servers involved
in a local (sub)transaction, then 2) wait for the results from the
remote servers in the order that the command was sent to the remote
servers, when called from pgfdw_xact_callback (pgfdw_subxact_callback)
during pre-commit. The patch also parallelizes clearing prepared
statements the same way during pre-commit. (The option is false by
default.)
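
For illustration, the flow is roughly the following (a simplified sketch, not the patch itself; ConnectionHash, ConnCacheEntry, and the do_sql_command_begin/do_sql_command_end helpers follow postgres_fdw's connection.c and the patch, but the exact fields checked and error handling are assumptions):

/*
 * Sketch of the pre-commit path with parallel_commit enabled.  The real
 * logic lives in pgfdw_xact_callback() (and pgfdw_subxact_callback()).
 */
HASH_SEQ_STATUS scan;
ConnCacheEntry *entry;
List       *pending_entries = NIL;
ListCell   *lc;

/* Phase 1: send COMMIT TRANSACTION (or RELEASE SAVEPOINT) asynchronously. */
hash_seq_init(&scan, ConnectionHash);
while ((entry = (ConnCacheEntry *) hash_seq_search(&scan)) != NULL)
{
    if (entry->conn == NULL || entry->xact_depth == 0)
        continue;               /* nothing opened on this connection */

    if (entry->parallel_commit)
    {
        do_sql_command_begin(entry->conn, "COMMIT TRANSACTION");
        pending_entries = lappend(pending_entries, entry);
        continue;
    }

    /* parallel_commit disabled: commit serially, as before. */
    do_sql_command(entry->conn, "COMMIT TRANSACTION");
}

/* Phase 2: wait for the results in the order the commands were sent. */
foreach(lc, pending_entries)
{
    entry = (ConnCacheEntry *) lfirst(lc);
    do_sql_command_end(entry->conn, "COMMIT TRANSACTION");
}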

I evaluated the effectiveness of the patch using a simple
multi-statement transaction:

BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
RELEASE SAVEPOINT s;
COMMIT;

where ft1 and ft2 are foreign tables created on different foreign
servers hosted on different machines. I ran the transaction five
times using the patch with the option enabled/disabled, and measured
the latencies for the RELEASE and COMMIT commands in each run. The
average latencies for these commands over the five runs are:

* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms

* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms

With the option enabled, the average latencies for both commands are
reduced significantly!

I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.

Best regards,
Etsuro Fujita

[1]: /messages/by-id/CAPmGK177E6HPcCQB4-s+m9AcCZDHCC2drZy+FKnnvEXw9kXoXQ@mail.gmail.com

Attachments:

postgres-fdw-parallel-commit-1.patch (application/octet-stream, +269 -16)
#2Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#1)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2021/10/31 18:05, Etsuro Fujita wrote:

Hi,

As I said before [1], I’m working on $SUBJECT. Attached is a WIP
patch for that.

Thanks for the patch!

The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,

Could you tell me why the parameter is necessary?
Can't we always enable the feature?

* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms

* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms

With the option enabled, the average latencies for both commands are
reduced significantly!

Sounds great!

I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.

Thanks!

+	/* Consume whatever data is available from the socket */
+	if (!PQconsumeInput(conn))
+		pgfdw_report_error(ERROR, NULL, conn, false, sql);

Without the patch, PQconsumeInput() is not called before pgfdw_get_result().
But could you tell me why you added PQconsumeInput() there?

When the ignore_errors argument is true, should the error reported by
PQconsumeInput() be ignored?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#3David Zhang
david.zhang@highgo.ca
In reply to: Etsuro Fujita (#1)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

I evaluated the effectiveness of the patch using a simple
multi-statement transaction:

BEGIN;
SAVEPOINT s;
INSERT INTO ft1 VALUES (10, 10);
INSERT INTO ft2 VALUES (20, 20);
RELEASE SAVEPOINT s;
COMMIT;

where ft1 and ft2 are foreign tables created on different foreign
servers hosted on different machines. I ran the transaction five
times using the patch with the option enabled/disabled, and measured
the latencies for the RELEASE and COMMIT commands in each run. The
average latencies for these commands over the five runs are:

* RELEASE
parallel_commit=0: 0.385 ms
parallel_commit=1: 0.221 ms

* COMMIT
parallel_commit=0: 1.660 ms
parallel_commit=1: 0.861 ms

With the option enabled, the average latencies for both commands are
reduced significantly!

Following your instructions, I performed some basic tests to compare the
performance before and after the patch. In my testing environment (two
foreign servers on the same local machine), the performance varies;
sometimes the time spent on RELEASE and COMMIT without the patch is close
to the patched numbers, but the patched version always seems to perform
better. I then ran a 1-million-tuple insert; the averages over 5 runs were
roughly as follows:

Before
    RELEASE 0.171 ms, COMMIT 1.861 ms

After
    RELEASE 0.147 ms, COMMIT 1.305 ms

Best regards,
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca

#4Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#2)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/10/31 18:05, Etsuro Fujita wrote:

The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,

Could you tell me why the parameter is necessary?
Can't we always enable the feature?

I don’t think parallel commit would cause performance degradation,
even in the case when there is just a single remote (sub)transaction
to commit when called from pgfdw_xact_callback
(pgfdw_subxact_callback), so I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) would, because the
requests to commit those (sub)transactions are sent to the remote
server at the same time, which some users might want to avoid.

I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.

Thanks!

I'll update the patch as such in the next version.

+       /* Consume whatever data is available from the socket */
+       if (!PQconsumeInput(conn))
+               pgfdw_report_error(ERROR, NULL, conn, false, sql);

Without the patch, PQconsumeInput() is not called before pgfdw_get_result().
But could you tell me why you added PQconsumeInput() there?

The reason is that the result might already be available before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.
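
For illustration, the fast path described here looks roughly like this (a sketch, not the exact patch code; "sql" is the command whose result we are waiting for):

PGresult   *res;

/* Consume whatever data is already available on the socket. */
if (!PQconsumeInput(conn))
    pgfdw_report_error(ERROR, NULL, conn, false, sql);

if (!PQisBusy(conn))
{
    /* The full result has arrived; PQgetResult() will not block. */
    res = PQgetResult(conn);
}
else
{
    /* Not complete yet; fall back to waiting via WaitLatchOrSocket(). */
    res = pgfdw_get_result(conn, sql);
}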

When the ignore_errors argument is true, should the error reported by
PQconsumeInput() be ignored?

I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.

Thanks for reviewing!

Best regards,
Etsuro Fujita

#5Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#3)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Tue, Nov 2, 2021 at 7:47 AM David Zhang <david.zhang@highgo.ca> wrote:

Following your instructions, I performed some basic tests to compare the
performance before and after the patch. In my testing environment (two
foreign servers on the same local machine), the performance varies;
sometimes the time spent on RELEASE and COMMIT without the patch is close
to the patched numbers, but the patched version always seems to perform
better. I then ran a 1-million-tuple insert; the averages over 5 runs were
roughly as follows:

Before
RELEASE 0.171 ms, COMMIT 1.861 ms

After
RELEASE 0.147 ms, COMMIT 1.305 ms

Good to know! Thanks for testing!

Best regards,
Etsuro Fujita

#6Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#4)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2021/11/07 18:06, Etsuro Fujita wrote:

On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/10/31 18:05, Etsuro Fujita wrote:

The patch is pretty simple: if a server option added
by the patch “parallel_commit” is enabled,

Could you tell me why the parameter is necessary?
Can't we always enable the feature?

I don’t think parallel commit would cause performance degradation,
even in the case when there is just a single remote (sub)transaction
to commit when called from pgfdw_xact_callback
(pgfdw_subxact_callback), so I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) would, because the
requests to commit those (sub)transactions are sent to the remote
server at the same time, which some users might want to avoid.

Thanks for explaining this! But I probably failed to get your point.
Sorry... Whichever approach is used, parallel or serial commit, the
number of transactions to commit on the remote server is the same,
isn't it? For example, please imagine the case where a client requests
ten transactions per second to the local server. Each transaction
accesses the foreign table, which means that ten transaction commit
operations per second are requested to the remote server. Unless I'm
missing something, this number doesn't change whether the foreign
transactions are committed in a parallel way or not.
Thoughts?

I think we could extend this to abort cleanup of remote
(sub)transactions during post-abort. Anyway, I think this is useful,
so I’ll add this to the upcoming commitfest.

Thanks!

I'll update the patch as such in the next version.

IMO it's better to implement and commit these features gradually
if possible. Which would simplify the patch and make it
easier-to-review. So I think that it's better to implement
this feature as a separate patch.

+       /* Consume whatever data is available from the socket */
+       if (!PQconsumeInput(conn))
+               pgfdw_report_error(ERROR, NULL, conn, false, sql);

Without the patch, PQconsumeInput() is not called before pgfdw_get_result().
But could you tell me why you added PQconsumeInput() there?

The reason is that the result might already be available before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.

Understood. It would be helpful to add a comment about why PQconsumeInput()
is called there.

Also, could you tell me how expensive it is?

When the ignore_errors argument is true, should the error reported by
PQconsumeInput() be ignored?

I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.

But the existing code ignores the error entirely, doesn't it?
If it's unsafe to do that, probably we should fix that first?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#7Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Fujii Masao (#6)
RE: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

Dear Fujita-san,

I love your proposal because it will remove a bottleneck
for PostgreSQL built-in sharding.

I read your patch briefly and I think basically it's good.
Currently I have only one comment.

In your patch, postgres_fdw sends a COMMIT command to all entries in the hash table
and then waits for the results, without a timeout, starting from the first entry.
I think this specification is good because it's very simple,
but if the COMMIT for a particular foreign server takes some time,
I thought it might be more efficient to stop waiting for its result and move on to the next entry.
This is how it would work. First, we define a function similar to pgfdw_get_result()
so that we can pass a timeout as an argument to WaitLatchOrSocket().
Then we change the function called by do_sql_command_end() to the new one,
and change the callback function to skip an entry if its result has not yet arrived.
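
For illustration, such a timeout-aware wait could be sketched as below (pgfdw_get_result_timed is a hypothetical name; the exact wait-event flags and error handling are assumptions, not part of any posted patch):

/*
 * Wait for a result on conn, but give up after timeout_ms milliseconds.
 * Returns NULL if the result is not ready yet, so the caller can retry
 * this entry later.  (Hypothetical helper, sketched after pgfdw_get_result.)
 */
static PGresult *
pgfdw_get_result_timed(PGconn *conn, long timeout_ms)
{
    while (PQisBusy(conn))
    {
        int     wc;

        /* Sleep until there's something to do, or the timeout expires. */
        wc = WaitLatchOrSocket(MyLatch,
                               WL_LATCH_SET | WL_SOCKET_READABLE |
                               WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                               PQsocket(conn),
                               timeout_ms,
                               PG_WAIT_EXTENSION);
        ResetLatch(MyLatch);

        CHECK_FOR_INTERRUPTS();

        if (wc & WL_TIMEOUT)
            return NULL;        /* not ready yet; caller moves on */

        /* Data available on the socket? */
        if (wc & WL_SOCKET_READABLE)
        {
            if (!PQconsumeInput(conn))
                pgfdw_report_error(ERROR, NULL, conn, false, NULL);
        }
    }

    return PQgetResult(conn);
}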

What do you think? Is it an unnecessary assumption that a COMMIT takes time? Or could this be a next step?
I will put a PoC if needed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#8Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#6)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Mon, Nov 8, 2021 at 1:13 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/11/07 18:06, Etsuro Fujita wrote:

On Mon, Nov 1, 2021 at 3:22 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

Could you tell me why the parameter is necessary?
Can't we always enable the feature?

I think it might be OK to enable it by
default. But my concern about doing so is the remote side: during
those functions, if there are a lot of (sub)transactions on a single
remote server that need to be committed, parallel commit would
increase the remote server’s load at (sub)transaction end more than
serial commit (the existing implementation) would, because the
requests to commit those (sub)transactions are sent to the remote
server at the same time, which some users might want to avoid.

Thanks for explaining this! But I probably failed to get your point.
Sorry... Whichever approach is used, parallel or serial commit, the
number of transactions to commit on the remote server is the same,
isn't it? For example, please imagine the case where a client requests
ten transactions per second to the local server. Each transaction
accesses the foreign table, which means that ten transaction commit
operations per second are requested to the remote server. Unless I'm
missing something, this number doesn't change whether the foreign
transactions are committed in a parallel way or not.

Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled, but I might be too worried about that, so I want to hear
the opinions of people.

IMO it's better to implement and commit these features gradually
if possible. Which would simplify the patch and make it
easier-to-review. So I think that it's better to implement
this feature as a separate patch.

Ok, I'll create a patch for abort cleanup separately.

+       /* Consume whatever data is available from the socket */
+       if (!PQconsumeInput(conn))
+               pgfdw_report_error(ERROR, NULL, conn, false, sql);

Without the patch, PQconsumeInput() is not called before pgfdw_get_result().
But could you tell me why you added PQconsumeInput() there?

The reason is that the result might already be available before calling
pgfdw_get_result(), in which case PQconsumeInput() followed by
PQisBusy() would allow us to call PQgetResult() without doing
WaitLatchOrSocket(), which I think is rather expensive.

Understood. It would be helpful to add a comment about why PQconsumeInput()
is called there.

Ok.

Also, could you tell me how expensive it is?

IIUC I think the overheads of WaitLatchOrSocket() incurred by a series
of epoll system calls are much larger compared to the overheads of
PQconsumeInput() incurred by a recv system call in non-blocking mode
when no data is available. I didn’t do testing, though.

Actually, we already use this optimization in libpqrcv_receive() for
the caller of that function to avoid doing WaitLatchOrSocket()?

When the ignore_errors argument is true, should the error reported by
PQconsumeInput() be ignored?

I’m not sure about that, because the error might be caused by e.g.,
OOM in the local server, in which case I don’t think it is safe to
ignore it and continue the (sub)transaction-end processing.

But the existing code ignores the error entirely, doesn't it?
If it's unsafe to do that, probably we should fix that first?

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.

Thanks!

Best regards,
Etsuro Fujita

#9Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Hayato Kuroda (Fujitsu) (#7)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

Kuroda-san,

On Thu, Nov 11, 2021 at 11:27 AM kuroda.hayato@fujitsu.com
<kuroda.hayato@fujitsu.com> wrote:

I love your proposal because it will remove a bottleneck
for PostgreSQL built-in sharding.

I read your patch briefly and I think basically it's good.

Great! Thanks for reviewing!

Currently I have only one comment.

In your patch, postgres_fdw sends a COMMIT command to all entries in the hash table
and then waits for the results, without a timeout, starting from the first entry.
I think this specification is good because it's very simple,
but if the COMMIT for a particular foreign server takes some time,
I thought it might be more efficient to stop waiting for its result and move on to the next entry.
This is how it would work. First, we define a function similar to pgfdw_get_result()
so that we can pass a timeout as an argument to WaitLatchOrSocket().
Then we change the function called by do_sql_command_end() to the new one,
and change the callback function to skip an entry if its result has not yet arrived.

What do you think? Is it an unnecessary assumption that a COMMIT takes time? Or could this be a next step?
I will put a PoC if needed.

Hmm, I'm not sure the cost-effectiveness of this optimization is
really high, because if the timeout expired, it means that something
unusual would have happened, and that it would take a long time for
the COMMIT command to complete (or abort at worst). So even if we
processed the rest of the entries while waiting for the command
result, we cannot reduce the total time very much. Maybe I'm missing
something, though.

Best regards,
Etsuro Fujita

#10Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#8)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2021/11/16 18:55, Etsuro Fujita wrote:

Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled,

I understand your point. But even if the option is disabled (i.e.,
the commit command is sent to each foreign server in a serial way),
multiple queries can still run on the server concurrently,
which may cause a performance "spike". Other clients may open several
sessions to the server and issue queries at the same time. Other
sessions using postgres_fdw may send the commit command at the same
time. If we want to avoid that "spike", probably we need to decrease
max_connections or use connection pooling, etc. So ISTM that it's
half-baked and not enough to provide an option that only controls
whether postgres_fdw issues the commit command in a parallel or serial
way.

but I might be too worried about that, so I want to hear
the opinions of people.

Yes.

IIUC I think the overheads of WaitLatchOrSocket() incurred by a series
of epoll system calls are much larger compared to the overheads of
PQconsumeInput() incurred by a recv system call in non-blocking mode
when no data is available. I didn’t do testing, though.

Understood.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#11Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#10)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Thu, Nov 18, 2021 at 1:09 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/11/16 18:55, Etsuro Fujita wrote:

Sorry, my explanation was not enough, but I don’t think this is always
true. Let me explain using an example:

create server loopback foreign data wrapper postgres_fdw options
(dbname 'postgres', parallel_commit 'true');
create user mapping for current_user server loopback;
create table t1 (a int, b int);
create table t2 (a int, b int);
create foreign table ft1 (a int, b int) server loopback options
(table_name 't1');
create foreign table ft2 (a int, b int) server loopback options
(table_name 't2');
create role view_owner superuser;
create user mapping for view_owner server loopback;
grant SELECT on ft1 to view_owner;
create view v1 as select * from ft1;
alter view v1 owner to view_owner;

begin;
insert into v1 values (10, 10);
insert into ft2 values (20, 20);
commit;

For this transaction, since the first insert is executed as the view
owner while the second insert is executed as the current user, we
create a connection to the foreign server for each of the users to
execute the inserts. This leads to sending two commit commands to the
foreign server at the same time during pre-commit.

To avoid spike loads on a remote server induced by such a workload, I
think it’s a good idea to have a server option to control whether this
is enabled,

I understand your point. But even if the option is disabled (i.e.,
the commit command is sent to each foreign server in a serial way),
multiple queries can still run on the server concurrently,
which may cause a performance "spike". Other clients may open several
sessions to the server and issue queries at the same time. Other
sessions using postgres_fdw may send the commit command at the same
time. If we want to avoid that "spike", probably we need to decrease
max_connections or use connection pooling, etc.

I think that what you are discussing here would be a related but
different issue, because the patch doesn't increase the number of
connections to the remote server that are needed for processing a
single transaction beyond what it was before.

My concern about the patch is that in parallel-commit mode,
transactions like the above example might increase the remote server's
load at transaction end compared to before, while using the same number
of connections to the remote server as before, because multiple COMMIT
commands are sent to the remote server at the same time rather than
sequentially as before. The option could be used to avoid such a
spike load without changing any settings on the remote server if
necessary. Also, the option can be added at no extra cost, so there
seems to me to be no reason to remove it.

Anyway, I'd like to hear the opinions of others.

Thanks!

Best regards,
Etsuro Fujita

#12Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#8)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2021/11/16 18:55, Etsuro Fujita wrote:

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.

Are you planning to update the patch? In addition to this change,
at least documentation about new parallel_commit parameter needs
to be included in the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#13Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#12)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Fri, Dec 3, 2021 at 6:07 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/11/16 18:55, Etsuro Fujita wrote:

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.

Done.

Are you planning to update the patch? In addition to this change,
at least documentation about new parallel_commit parameter needs
to be included in the patch.

Done. Attached is a new version.

* 0001
This is an updated version of the previous patch. In addition to the
above, I expanded a comment in do_sql_command_end() a bit to explain
why we do PQconsumeInput() before doing pgfdw_get_result(), to address
your comment. Also, I moved the code to finish closing pending
(sub)transactions in pgfdw_xact_callback() (pgfdw_subxact_callback())
into separate functions. Also, I modified regression test cases a bit
to access multiple foreign servers.

* 0002
This is a WIP patch for parallel abort. I added an option
parallel_abort for this, because I thought it would be good to
enable/disable these separately. I didn’t do any performance tests
yet.

Sorry for the long delay.

Best regards,
Etsuro Fujita

Attachments:

v2-0001-postgres-fdw-Add-support-for-parallel-commit.patch (application/octet-stream, +343 -19)
v2-0002-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream, +401 -22)
#14Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#13)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2022/01/06 17:29, Etsuro Fujita wrote:

On Fri, Dec 3, 2021 at 6:07 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

On 2021/11/16 18:55, Etsuro Fujita wrote:

I changed my mind; I’ll update the patch to ignore the error as
before, because 1) as far as I know, there are no reports from the
field concerning that we ignore all kinds of errors in cleaning up the
prepared statements, so maybe we don’t need to change that, and 2) we
already committed at least one of the remote transactions, so it’s not
good to abort the local transaction unless we really have to.

Done.

Are you planning to update the patch? In addition to this change,
at least documentation about new parallel_commit parameter needs
to be included in the patch.

Done. Attached is a new version.

* 0001
This is an updated version of the previous patch. In addition to the
above, I expanded a comment in do_sql_command_end() a bit to explain
why we do PQconsumeInput() before doing pgfdw_get_result(), to address
your comment. Also, I moved the code to finish closing pending
(sub)transactions in pgfdw_xact_callback() (pgfdw_subxact_callback())
into separate functions. Also, I modified regression test cases a bit
to access multiple foreign servers.

Thanks for updating the patch!

At first I'm reading the 0001 patch. Here are the comments for the patch.

0001 patch failed to be applied. Could you rebase the patch?

+			entry->changing_xact_state = true;
+			do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+			pending_deallocs = lappend(pending_deallocs, entry);

Originally entry->changing_xact_state is not set to true when executing DEALLOCATE ALL, but the patch does that. Why do we need this change?

The source comment explains that we intentionally ignore errors in the DEALLOCATE. But the patch changes DEALLOCATE ALL so that it's executed via do_sql_command_begin(), which can cause an error. Is this OK?

+		if (ignore_errors)
+			pgfdw_report_error(WARNING, res, conn, true, sql);

When DEALLOCATE fails, originally not even a warning message is logged. But the patch changes DEALLOCATE so that its result is received via do_sql_command_end(), which can log a warning message even when the ignore_errors argument is enabled. Why do we need to change the behavior?

+      <para>
+       This option controls whether <filename>postgres_fdw</filename> commits
+       multiple remote (sub)transactions opened in a local (sub)transaction
+       in parallel when the local (sub)transaction commits.

Since parallel_commit is an option for a foreign server, shouldn't the documentation instead describe how postgres_fdw handles a server with this option enabled?

+   <para>
+    Note that if many remote (sub)transactions are opened on a remote server
+    in a local (sub)transaction, this option might increase the remote
+    server’s load significantly when those remote (sub)transactions are
+    committed.  So be careful when using this option.
+   </para>

This paragraph should be inside the listitem for parallel_commit, shouldn't it?

async_capable=true may also cause a similar issue? If so, should this kind of note also be documented for async_capable?

This explains that the remote server's load will be increased *significantly*. But is the "significantly" part really true? I'd like to know how much parallel_commit=true can actually increase the load on a remote server.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#15Julien Rouhaud
rjuju123@gmail.com
In reply to: Etsuro Fujita (#13)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

Hi,

On Thu, Jan 06, 2022 at 05:29:23PM +0900, Etsuro Fujita wrote:

Done. Attached is a new version.

The patchset doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3392.log
=== Applying patches on top of PostgreSQL commit ID 43c2175121c829c8591fc5117b725f1f22bfb670 ===
=== applying patch ./v2-0001-postgres-fdw-Add-support-for-parallel-commit.patch
patching file contrib/postgres_fdw/connection.c
patching file contrib/postgres_fdw/expected/postgres_fdw.out
Hunk #2 FAILED at 10825.
1 out of 2 hunks FAILED -- saving rejects to file contrib/postgres_fdw/expected/postgres_fdw.out.rej
patching file contrib/postgres_fdw/option.c
patching file contrib/postgres_fdw/sql/postgres_fdw.sql
Hunk #1 FAILED at 3452.
1 out of 1 hunk FAILED -- saving rejects to file contrib/postgres_fdw/sql/postgres_fdw.sql.rej
patching file doc/src/sgml/postgres-fdw.sgml

I also see that Fujii-san raised some questions, so for now I will simply change
the patch status to Waiting on Author.

#16Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#14)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Thu, Jan 13, 2022 at 11:54 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:

At first I'm reading the 0001 patch. Here are the comments for the patch.

Thanks for reviewing!

0001 patch failed to be applied. Could you rebase the patch?

Done. Attached is an updated version of the patch set.

+                       entry->changing_xact_state = true;
+                       do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+                       pending_deallocs = lappend(pending_deallocs, entry);

Originally entry->changing_xact_state is not set to true when executing DEALLOCATE ALL, but the patch does that. Why do we need this change?

The source comment explains that we intentionally ignore errors in the DEALLOCATE. But the patch changes DEALLOCATE ALL so that it's executed via do_sql_command_begin(), which can cause an error. Is this OK?

+               if (ignore_errors)
+                       pgfdw_report_error(WARNING, res, conn, true, sql);

When DEALLOCATE fails, originally not even a warning message is logged. But the patch changes DEALLOCATE so that its result is received via do_sql_command_end(), which can log a warning message even when the ignore_errors argument is enabled. Why do we need to change the behavior?

Yeah, we don’t need to change the behavior, as discussed before, so I
fixed these. I came back to the patch after a while, so I had forgotten
about that. :-(

+      <para>
+       This option controls whether <filename>postgres_fdw</filename> commits
+       multiple remote (sub)transactions opened in a local (sub)transaction
+       in parallel when the local (sub)transaction commits.

Since parallel_commit is an option for a foreign server, shouldn't the documentation instead describe how postgres_fdw handles a server with this option enabled?

Agreed. I rewrote this slightly like the attached. Does that make sense?

+   <para>
+    Note that if many remote (sub)transactions are opened on a remote server
+    in a local (sub)transaction, this option might increase the remote
+    server’s load significantly when those remote (sub)transactions are
+    committed.  So be careful when using this option.
+   </para>

This paragraph should be inside the listitem for parallel_commit, shouldn't it?

I put this note outside, because it’s rewritten to a note about both
the parallel_commit and parallel_abort options in the following patch.
But it would be good to make the parallel-commit patch independent, so
I moved it into the list.

async_capable=true may also cause a similar issue? If so, should this kind of note also be documented for async_capable?

That’s right. I think it would be good to add a similar note about
that, but I’d like to leave that for another patch.

This explains that the remote server's load will be increased *significantly*. But is the "significantly" part really true?

I think that that would depend on how many transactions are committed
on the remote side at the same time. But the word “significantly”
might be too strong, so I dropped the word.

I'd like to know how much parallel_commit=true can actually increase the load on a remote server.

Ok, I’ll do a load test.

About the #0002 patch:
This is in preparation for the parallel-abort patch (#0003), but I’d
like to propose a minor cleanup for commit 85c696112: 1) build an
abort command to be sent to the remote in pgfdw_abort_cleanup(), using
a macro, only when/if necessary, as before, and 2) add/modify comments
a little bit.

Sorry for the delay again.

Best regards,
Etsuro Fujita

Attachments:

v3-0001-postgres-fdw-Add-support-for-parallel-commit.patch (application/octet-stream, +356 -18)
v3-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patch (application/octet-stream, +22 -14)
v3-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream, +406 -33)
#17Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Julien Rouhaud (#15)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

Hi Julien,

On Sun, Jan 16, 2022 at 1:07 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Thu, Jan 06, 2022 at 05:29:23PM +0900, Etsuro Fujita wrote:

Done. Attached is a new version.

I also see that Fuji-san raised some questions, so for now I will simply change
the patch status to Waiting on Author.

Thanks for letting me know!

I posted a new version of the patchset which addresses Fujii-san’s
comments. I changed the status and moved the patchset to the next
commitfest.

Best regards,
Etsuro Fujita

#18Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#16)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2022/02/07 14:35, Etsuro Fujita wrote:

0001 patch failed to be applied. Could you rebase the patch?

Done. Attached is an updated version of the patch set.

Thanks for updating the patch! Here are the review comments for 0001 patch.

I got the following compiler warning.

[16:58:07.120] connection.c: In function ‘pgfdw_finish_pre_commit_cleanup’:
[16:58:07.120] connection.c:1726:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
[16:58:07.120] 1726 | PGresult *res;
[16:58:07.120] | ^~~~~~~~

+			/* Ignore errors in the DEALLOCATE (see note above) */
+			if ((res = PQgetResult(entry->conn)) != NULL)

Doesn't PQgetResult() need to be called repeatedly until it returns NULL or the connection is lost, because there can be more than one message to receive?

+	if (pending_deallocs)
+	{
+		foreach(lc, pending_deallocs)

If pending_deallocs is NIL, we don't enter this foreach loop. So the "if (pending_deallocs)" check probably isn't necessary.

  			entry->keep_connections = defGetBoolean(def);
+		if (strcmp(def->defname, "parallel_commit") == 0)
+			entry->parallel_commit = defGetBoolean(def);

Isn't it better to use "else if" here, instead?

+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql);

To simplify the code more, I'm tempted to change do_sql_command() so that it just calls the above two functions, instead of calling PQsendQuery() and pgfdw_get_result() directly. Thoughts? If we do this, probably we also need to change do_sql_command_end() so that it accepts a boolean flag which specifies whether PQconsumeInput() is called or not, as follows.

do_sql_command_end(PGconn *conn, const char *sql, bool consumeInput)
{
    /*
     * If any data is expected to be available from the socket, consume it.
     * ...
     * When parallel_commit is enabled, since there can be a time window between
     * sending query and receiving result, we can expect data is already available
     * from the socket. In this case we try to consume it at first.... Otherwise..
     */
    if (consumeInput && !PQconsumeInput(conn))
    ...
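
For reference, the overall shape of that refactoring might look like the following (a sketch based on the suggestion above; the actual patch may differ in details):

static void
do_sql_command(PGconn *conn, const char *sql)
{
    do_sql_command_begin(conn, sql);
    do_sql_command_end(conn, sql, false);
}

static void
do_sql_command_begin(PGconn *conn, const char *sql)
{
    if (!PQsendQuery(conn, sql))
        pgfdw_report_error(ERROR, NULL, conn, false, sql);
}

static void
do_sql_command_end(PGconn *conn, const char *sql, bool consume_input)
{
    PGresult   *res;

    /*
     * If the command was sent some time ago (parallel commit), data may
     * already be waiting on the socket; consume it first so that
     * pgfdw_get_result() can often return without sleeping in
     * WaitLatchOrSocket().
     */
    if (consume_input && !PQconsumeInput(conn))
        pgfdw_report_error(ERROR, NULL, conn, false, sql);

    res = pgfdw_get_result(conn, sql);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        pgfdw_report_error(ERROR, res, conn, true, sql);
    PQclear(res);
}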

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#19Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#18)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On Tue, Feb 8, 2022 at 3:49 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:

Here are the review comments for 0001 patch.

I got the following compiler warning.

[16:58:07.120] connection.c: In function ‘pgfdw_finish_pre_commit_cleanup’:
[16:58:07.120] connection.c:1726:4: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
[16:58:07.120] 1726 | PGresult *res;
[16:58:07.120] | ^~~~~~~~

Sorry, I didn’t notice this, because my compiler doesn’t produce it.
I tried to fix it. Attached is an updated version of the patch set.
I hope this works for you.

+                       /* Ignore errors in the DEALLOCATE (see note above) */
+                       if ((res = PQgetResult(entry->conn)) != NULL)

Doesn't PQgetResult() need to be called repeatedly until it returns NULL or the connection is lost, because there can be more than one message to receive?

Yeah, we would receive a single message here, but PQgetResult must be
called repeatedly until it returns NULL (see the documentation note
about it in libpq.sgml); else the PQtransactionStatus of the
connection would remain PQTRANS_ACTIVE, causing the connection to be
closed at transaction end, because we do this in
pgfdw_reset_xact_state called from pgfdw_xact_callback:

    /*
     * If the connection isn't in a good idle state, it is marked as
     * invalid or keep_connections option of its server is disabled, then
     * discard it to recover.  Next GetConnection will open a new
     * connection.
     */
    if (PQstatus(entry->conn) != CONNECTION_OK ||
        PQtransactionStatus(entry->conn) != PQTRANS_IDLE ||
        entry->changing_xact_state ||
        entry->invalidated ||
        !entry->keep_connections)
    {
        elog(DEBUG3, "discarding connection %p", entry->conn);
        disconnect_pg_server(entry);
    }

But I noticed a brown-paper-bag bug in the bit you showed above: the
if test should be changed to a while loop. :-( I fixed this in the
attached.
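
In other words, the drain loop would look roughly like this (sketch):

    PGresult   *res;

    /* Ignore errors in the DEALLOCATE (see note above). */
    while ((res = PQgetResult(entry->conn)) != NULL)
        PQclear(res);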

+       if (pending_deallocs)
+       {
+               foreach(lc, pending_deallocs)

If pending_deallocs is NIL, we don't enter this foreach loop. So the "if (pending_deallocs)" check probably isn't necessary.

Yeah, I think we could omit the if test, but I added it to match other
places (see e.g., foreign_grouping_ok() in postgres_fdw.c). It looks
cleaner to me to have it before the loop.

entry->keep_connections = defGetBoolean(def);
+               if (strcmp(def->defname, "parallel_commit") == 0)
+                       entry->parallel_commit = defGetBoolean(def);

Isn't it better to use "else if" here, instead?

Yeah, that would be better. Done.

+static void do_sql_command_begin(PGconn *conn, const char *sql);
+static void do_sql_command_end(PGconn *conn, const char *sql);

To simplify the code more, I'm tempted to change do_sql_command() so that it just calls the above two functions, instead of calling PQsendQuery() and pgfdw_get_result() directly. Thoughts? If we do this, probably we also need to change do_sql_command_end() so that it accepts a boolean flag which specifies whether PQconsumeInput() is called or not, as follows.

Done. Actually, I was planning to do this for consistency with a
similar refactoring for pgfdw_cancel_query and
pgfdw_exec_cleanup_query that had been done
in the parallel-abort patch.

I tweaked comments/docs a little bit as well.

Thanks for reviewing!

Best regards,
Etsuro Fujita

Attachments:

v4-0001-postgres-fdw-Add-support-for-parallel-commit.patch (application/octet-stream, +379 -19)
v4-0002-postgres_fdw-Minor-cleanup-for-pgfdw_abort_cleanup.patch (application/octet-stream, +22 -14)
v4-0003-postgres-fdw-Add-support-for-parallel-abort.patch (application/octet-stream, +412 -35)
#20Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#19)
Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

On 2022/02/11 21:59, Etsuro Fujita wrote:

I tweaked comments/docs a little bit as well.

Thanks for updating the patches!

I reviewed the 0001 patch. It looks good to me except for the following minor things. If these are addressed, I think that the 0001 patch can be marked as ready for committer.

+	 * Also determine to commit (sub)transactions opened on the remote server
+	 * in parallel at (sub)transaction end.

Like the comment "Determine whether to keep the connection ...", "determine to commit" should be "determine whether to commit"?

"remote server" should be "remote servers"?

+	curlevel = GetCurrentTransactionNestLevel();
+	snprintf(sql, sizeof(sql), "RELEASE SAVEPOINT s%d", curlevel);

Why does pgfdw_finish_pre_subcommit_cleanup() need to call GetCurrentTransactionNestLevel() and construct the "RELEASE SAVEPOINT" query string again? pgfdw_subxact_callback() already does them and probably we can make it pass either of them to pgfdw_finish_pre_subcommit_cleanup() as its argument.

+       This option controls whether <filename>postgres_fdw</filename> commits
+       remote (sub)transactions opened on a foreign server in a local
+       (sub)transaction in parallel when the local (sub)transaction commits.

"a foreign server" should be "foreign servers"?

"in a local (sub)transaction" part seems not to be necessary.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#21David Zhang
david.zhang@highgo.ca
In reply to: Fujii Masao (#20)
#22Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#20)
#23Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#21)
#24Fujii Masao
masao.fujii@gmail.com
In reply to: Etsuro Fujita (#22)
#25Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Fujii Masao (#24)
#26Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#25)
#27Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#26)
#28Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#27)
#29David Zhang
david.zhang@highgo.ca
In reply to: Etsuro Fujita (#28)
#30Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#29)
#31Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#28)
#32Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#31)
#33David Zhang
david.zhang@highgo.ca
In reply to: Etsuro Fujita (#32)
#34Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#33)
#35David Zhang
david.zhang@highgo.ca
In reply to: Etsuro Fujita (#34)
#36Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#35)
#37Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#36)
#38Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#37)
#39Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Etsuro Fujita (#38)
#40Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Jacob Champion (#39)
#41David Zhang
david.zhang@highgo.ca
In reply to: Etsuro Fujita (#40)
#42Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: David Zhang (#41)
#43vignesh C
vignesh21@gmail.com
In reply to: Etsuro Fujita (#42)
#44Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: vignesh C (#43)
#45Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#44)
#46Etsuro Fujita
fujita.etsuro@lab.ntt.co.jp
In reply to: Etsuro Fujita (#45)