Segment fault when excuting SPI function On PG with commit 41c6a5be
Hi, all
When I used SPI_execute_plan function on PG12 which commit 41c6a5be is used,
Segment fault occurred.
PS: If commit 41c6a5be is not used, this phenomenon will not happen.
Reproduce:
In a background process, the following steps are executed.
--------------------------
StartTransactionCommand();
SPI_connect();
plan = SPI_prepare(query,0,NULL); ★the query is a SELECT SQL.
SPI_keepplan(plan);
SPI_finish();
CommitTransactionCommand();
StartTransactionCommand();
SPI_connect();
SPI_execute_plan(plan, NULL, NULL, true, 0); ★Segment fault
--------------------------
Core stack:
Stack trace of thread 43926:
#0 0x0000000000772f19 EnsurePortalSnapshotExists (postgres)
#1 0x000000000064f85c _SPI_execute_plan (postgres)
#2 0x000000000064fbd1 SPI_execute_plan (postgres)
#3 0x00007fbee784402e xxx (xxx.so)
#4 0x00007fbee78424ae xxxx (xxxx.so)
#5 0x00000000006e91f5 StartBackgroundWorker (postgres)
#6 0x00000000006f5e25 maybe_start_bgworkers (postgres)
#7 0x00000000006f6121 reaper (postgres)
#8 0x00007fbeedf48dc0 __restore_rt (libpthread.so.0)
#9 0x00007fbeebadf75b __select (libc.so.6)
#10 0x00000000006f6ea6 ServerLoop (postgres)
#11 0x00000000006f8c30 PostmasterMain (postgres)
#12 0x0000000000485d99 main (postgres)
#13 0x00007fbeeba0f873 __libc_start_main (libc.so.6)
#14 0x0000000000485e3e _start (postgres)
Is there a bug in the commit 41c6a5be? Please confirm it.
Regards, Liu Huailing
On Fri, Jul 30, 2021 at 6:57 PM liuhuailing@fujitsu.com
<liuhuailing@fujitsu.com> wrote:
When I used SPI_execute_plan function on PG12 which commit 41c6a5be is used,
Segment fault occurred.
I'm not seeing any such commit.
Can you provide a link to the commit in GitHub?
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Jul 30, 2021 at 3:30 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Fri, Jul 30, 2021 at 6:57 PM liuhuailing@fujitsu.com
<liuhuailing@fujitsu.com> wrote:When I used SPI_execute_plan function on PG12 which commit 41c6a5be is
used,
Segment fault occurred.
I'm not seeing any such commit.
Can you provide a link to the commit in GitHub?Regards,
Greg Nancarrow
Fujitsu AustraliaHi,
I think Huailing was referring to:
commit 41c6a5bec25e720d98bd60d77dd5c2939189ed3c
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Fri May 21 14:03:53 2021 -0400
Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.
Cheers
Em sex., 30 de jul. de 2021 às 05:57, liuhuailing@fujitsu.com <
liuhuailing@fujitsu.com> escreveu:
Hi, all
When I used SPI_execute_plan function on PG12 which commit 41c6a5be is
used,
Segment fault occurred.PS: If commit 41c6a5be is not used, this phenomenon will not happen.
Reproduce:
In a background process, the following steps are executed.
--------------------------
StartTransactionCommand();
SPI_connect();
plan = SPI_prepare(query,0,NULL); ★the query is a SELECT SQL.
SPI_keepplan(plan);
SPI_finish();
CommitTransactionCommand();
StartTransactionCommand();
SPI_connect();
SPI_execute_plan(plan, NULL, NULL, true, 0); ★Segment fault
--------------------------Core stack:
Stack trace of thread 43926:
#0 0x0000000000772f19 EnsurePortalSnapshotExists
(postgres)
#1 0x000000000064f85c _SPI_execute_plan (postgres)
#2 0x000000000064fbd1 SPI_execute_plan (postgres)
#3 0x00007fbee784402e xxx (xxx.so)
#4 0x00007fbee78424ae xxxx (xxxx.so)
#5 0x00000000006e91f5 StartBackgroundWorker (postgres)
#6 0x00000000006f5e25 maybe_start_bgworkers (postgres)
#7 0x00000000006f6121 reaper (postgres)
#8 0x00007fbeedf48dc0 __restore_rt (libpthread.so.0)
#9 0x00007fbeebadf75b __select (libc.so.6)
#10 0x00000000006f6ea6 ServerLoop (postgres)
#11 0x00000000006f8c30 PostmasterMain (postgres)
#12 0x0000000000485d99 main (postgres)
#13 0x00007fbeeba0f873 __libc_start_main (libc.so.6)
#14 0x0000000000485e3e _start (postgres)Is there a bug in the commit 41c6a5be? Please confirm it.
Did you test it on the HEAD too?
regards,
Ranier Vilela
On Fri, Jul 30, 2021 at 08:57:35AM +0000, liuhuailing@fujitsu.com wrote:
When I used SPI_execute_plan function on PG12 which commit 41c6a5be is used,
Segment fault occurred.PS: If commit 41c6a5be is not used, this phenomenon will not happen.
I see nothing wrong in this commit, FWIW.
Reproduce:
In a background process, the following steps are executed.
--------------------------
StartTransactionCommand();
SPI_connect();
plan = SPI_prepare(query,0,NULL); ★the query is a SELECT SQL.
SPI_keepplan(plan);
SPI_finish();
CommitTransactionCommand();
StartTransactionCommand();
SPI_connect();
SPI_execute_plan(plan, NULL, NULL, true, 0); ★Segment fault
--------------------------
But I see an issue with your code. It seems to me that you should
push a snapshot before doing SPI_prepare() and SPI_execute_plan(),
as of:
PushActiveSnapshot(GetTransactionSnapshot());
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Fri, Jul 30, 2021 at 08:57:35AM +0000, liuhuailing@fujitsu.com wrote:
When I used SPI_execute_plan function on PG12 which commit 41c6a5be is used,
Segment fault occurred.
I see nothing wrong in this commit, FWIW.
But I see an issue with your code. It seems to me that you should
push a snapshot before doing SPI_prepare() and SPI_execute_plan(),
as of:
PushActiveSnapshot(GetTransactionSnapshot());
Yes. What that commit did was to formalize the requirement that
a snapshot exist *before* entering SPI. Before that, you might
have gotten away without one, depending on what you were trying
to do (in particular, detoasting a toasted output Datum would
fail if you lack an external snapshot).
This isn't the first complaint we've seen from somebody whose
code used to work and now fails there, however. I wonder if
we should convert the Assert into an actual test-and-elog, say
/* Otherwise, we'd better have an active Portal */
portal = ActivePortal;
- Assert(portal != NULL);
+ if (unlikely(portal == NULL))
+ elog(ERROR, "must have an outer snapshot or portal");
Assert(portal->portalSnapshot == NULL);
Perhaps that would help people to realize that the bug is theirs
not EnsurePortalSnapshotExists's.
I gather BTW that the OP is trying to test C code in a non-assert
build. Not a great approach.
regards, tom lane
On 30 Jul 2021, at 17:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder if we should convert the Assert into an actual test-and-elog, say
/* Otherwise, we'd better have an active Portal */ portal = ActivePortal; - Assert(portal != NULL); + if (unlikely(portal == NULL)) + elog(ERROR, "must have an outer snapshot or portal"); Assert(portal->portalSnapshot == NULL);Perhaps that would help people to realize that the bug is theirs
not EnsurePortalSnapshotExists's.
+1, that would probably be quite helpful.
--
Daniel Gustafsson https://vmware.com/
Daniel Gustafsson <daniel@yesql.se> writes:
On 30 Jul 2021, at 17:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder if we should convert the Assert into an actual test-and-elog, say/* Otherwise, we'd better have an active Portal */ portal = ActivePortal; - Assert(portal != NULL); + if (unlikely(portal == NULL)) + elog(ERROR, "must have an outer snapshot or portal"); Assert(portal->portalSnapshot == NULL);Perhaps that would help people to realize that the bug is theirs
not EnsurePortalSnapshotExists's.
+1, that would probably be quite helpful.
Happy to make it so. Anyone have suggestions about the wording of
the message?
regards, tom lane
On Fri, Jul 30, 2021 at 11:22:43AM -0400, Tom Lane wrote:
Happy to make it so. Anyone have suggestions about the wording of
the message?
For the archives, this has been applied as of ef12f32, and the new
message seems explicit enough:
+ if (unlikely(portal == NULL))
+ elog(ERROR, "cannot execute SQL without an outer snapshot or portal");
--
Michael