Problem with accessing TOAST data in stored procedures
Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does somebody
working on it.
While the added check itself is trivial (just one line) the total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case (it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker may be
useful in other cases.
In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we need to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee fork.
If it is desirable, I can add this patch to commitfest.
Thanks in advance,
Konstantin
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does somebody
working on it.
While the added check itself is trivial (just one line) the total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case (it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we need to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the walker for
every execution. Introducing a walker just for this case looks like
overengineering.
Personally I am not sure if a walker for plpgsql is a good idea (I thought
about it more times, when I wrote plpgsql_check). But anyway - there should
be good reason for introducing the walker and clean use case.
If you want to introduce stmt walker, then it should be a separate patch
with some benefit on plpgsql environment length.
Regards
Pavel
Show quoted text
Thanks in advance,
Konstantin
Attachments:
plpgsql.patchtext/x-patch; charset=US-ASCII; name=plpgsql.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index d4a3d58daa..ccbddd4d7a 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -5997,6 +5997,12 @@ exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
*/
PinPortal(portal);
+ /*
+ * Disable prefetch if procedure contains COMMIT or ROLLBACK statements
+ */
+ if (prefetch_ok && estate->func->fn_xactctrl)
+ prefetch_ok = false;
+
/*
* Fetch the initial tuple(s). If prefetching is allowed then we grab a
* few more rows to avoid multiple trips through executor startup
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 5a7e1a4444..5b27311b95 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -2215,6 +2215,8 @@ stmt_commit : K_COMMIT opt_transaction_chain ';'
new->stmtid = ++plpgsql_curr_compile->nstatements;
new->chain = $2;
+ plpgsql_curr_compile->fn_xactctrl = true;
+
$$ = (PLpgSQL_stmt *)new;
}
;
@@ -2229,6 +2231,8 @@ stmt_rollback : K_ROLLBACK opt_transaction_chain ';'
new->stmtid = ++plpgsql_curr_compile->nstatements;
new->chain = $2;
+ plpgsql_curr_compile->fn_xactctrl = true;
+
$$ = (PLpgSQL_stmt *)new;
}
;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 0c3d30fb13..e9b9b0d335 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -1006,6 +1006,7 @@ typedef struct PLpgSQL_function
bool fn_retisdomain;
bool fn_retset;
bool fn_readonly;
+ bool fn_xactctrl;
char fn_prokind;
int fn_nargs;
On 19.08.2020 21:50, Pavel Stehule wrote:
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does
somebody
working on it.
While the added check itself is trivial (just one line) the total
patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case
(it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker
may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we
need to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee
fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the walker for
every execution. Introducing a walker just for this case looks like
overengineering.
Personally I am not sure if a walker for plpgsql is a good idea (I
thought about it more times, when I wrote plpgsql_check). But anyway -
there should be good reason for introducing the walker and clean use case.If you want to introduce stmt walker, then it should be a separate
patch with some benefit on plpgsql environment length.
If you think that plpgsql statement walker is not needed, then I do not
insist.
Are you going to commit your version of the patch?
st 19. 8. 2020 v 20:59 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 19.08.2020 21:50, Pavel Stehule wrote:
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does somebody
working on it.
While the added check itself is trivial (just one line) the total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case (it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we need to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the walker for
every execution. Introducing a walker just for this case looks like
overengineering.
Personally I am not sure if a walker for plpgsql is a good idea (I thought
about it more times, when I wrote plpgsql_check). But anyway - there should
be good reason for introducing the walker and clean use case.If you want to introduce stmt walker, then it should be a separate patch
with some benefit on plpgsql environment length.If you think that plpgsql statement walker is not needed, then I do not
insist.
Are you going to commit your version of the patch?
I am afraid so it needs significantly much more work :(. My version is
correct just for the case that you describe, but it doesn't fix the
possibility of the end of the transaction inside the nested CALL.
Some like
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data) CALL check_and_commit();END
LOOP;END;$$;
Probably my patch (or your patch) will fix on 99%, but still there will be
a possibility of this issue. It is very similar to fixing releasing expr
plans inside CALL statements. Current design of CALL statement is ugly
workaround - it is slow, and there is brutal memory leak. Fixing memory
leak is not hard. Fixing every time replaning (and sometimes useless) needs
depper fix. Please check patch
/messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch
Maybe this mechanism can be used for a clean fix of the problem mentioned
in this thread.
Regards
Pavel
On 19.08.2020 22:20, Pavel Stehule wrote:
st 19. 8. 2020 v 20:59 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 19.08.2020 21:50, Pavel Stehule wrote:
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>
napsal:Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and
does somebody
working on it.
While the added check itself is trivial (just one line) the
total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this
case (it is
possible to find some other way to determine that stored
procedure
contains transaction control statements), but I hope such
walker may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and
we need to
provide some fix.
It is better to include it in vanilla, rather than in our
pgpro-ee fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the
walker for every execution. Introducing a walker just for this
case looks like overengineering.
Personally I am not sure if a walker for plpgsql is a good idea
(I thought about it more times, when I wrote plpgsql_check). But
anyway - there should be good reason for introducing the walker
and clean use case.If you want to introduce stmt walker, then it should be a
separate patch with some benefit on plpgsql environment length.If you think that plpgsql statement walker is not needed, then I
do not insist.
Are you going to commit your version of the patch?I am afraid so it needs significantly much more work :(. My version is
correct just for the case that you describe, but it doesn't fix the
possibility of the end of the transaction inside the nested CALL.Some like
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted
LOOP INSERT INTO toasted(data) VALUES(v_r.data) CALL
check_and_commit();END LOOP;END;$$;Probably my patch (or your patch) will fix on 99%, but still there
will be a possibility of this issue. It is very similar to fixing
releasing expr plans inside CALL statements. Current design of CALL
statement is ugly workaround - it is slow, and there is brutal memory
leak. Fixing memory leak is not hard. Fixing every time replaning (and
sometimes useless) needs depper fix. Please check patch
/messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch
Maybe this mechanism can be used for a clean fix of the problem
mentioned in this thread.
Sorry for delay with answer.
Today we have received another bug report from the client.
And now as you warned, there was no direct call of COMMIT/ROLLBACK
statements in stored procedures, but instead of it it is calling some
other pprocedures
which I suspect contains some transaction control statements.
I looked at the plpgsql-stmt_call-fix-2.patch
</messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch>
It invalidates prepared plan in case of nested procedure call.
But here invalidation approach will not work. We have already prefetched
rows and to access them we need snapshot.
We can not restore the same snapshot after CALL - it will be not correct.
In case of ATX (autonomous transactions supported by PgPro) we really
save/restore context after ATX. But transaction control semantic in
procedures is different:
we commit current transaction and start new one.
So I didn't find better solution than just slightly extend you patch and
consider any procedures containing CALLs as potentially performing
transaction control.
I updated version of your patch.
What do you think about it?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
plpgsql_stored_procs-2.patchtext/x-patch; name=plpgsql_stored_procs-2.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index b4c70aa..3c074cd 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -5827,6 +5827,12 @@ exec_for_query(PLpgSQL_execstate *estate, PLpgSQL_stmt_forq *stmt,
PinPortal(portal);
/*
+ * Disable prefetch if procedure contains COMMIT, ROLLBACK or CALL statements
+ */
+ if (estate->func->fn_xactctrl)
+ prefetch_ok = false;
+
+ /*
* Fetch the initial tuple(s). If prefetching is allowed then we grab a
* few more rows to avoid multiple trips through executor startup
* overhead.
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 34e0520..ae57475 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -935,6 +935,8 @@ stmt_perform : K_PERFORM
check_sql_expr(new->expr->query, new->expr->parseMode,
startloc + 1);
+ plpgsql_curr_compile->fn_xactctrl = true;
+
$$ = (PLpgSQL_stmt *)new;
}
;
@@ -2249,6 +2251,8 @@ stmt_commit : K_COMMIT opt_transaction_chain ';'
new->stmtid = ++plpgsql_curr_compile->nstatements;
new->chain = $2;
+ plpgsql_curr_compile->fn_xactctrl = true;
+
$$ = (PLpgSQL_stmt *)new;
}
;
@@ -2263,6 +2267,8 @@ stmt_rollback : K_ROLLBACK opt_transaction_chain ';'
new->stmtid = ++plpgsql_curr_compile->nstatements;
new->chain = $2;
+ plpgsql_curr_compile->fn_xactctrl = true;
+
$$ = (PLpgSQL_stmt *)new;
}
;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index d501086..3210116 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -992,6 +992,7 @@ typedef struct PLpgSQL_function
bool fn_retisdomain;
bool fn_retset;
bool fn_readonly;
+ bool fn_xactctrl;
char fn_prokind;
int fn_nargs;
čt 18. 2. 2021 v 16:01 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 19.08.2020 22:20, Pavel Stehule wrote:
st 19. 8. 2020 v 20:59 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.08.2020 21:50, Pavel Stehule wrote:
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does somebody
working on it.
While the added check itself is trivial (just one line) the total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case (it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we need to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the walker for
every execution. Introducing a walker just for this case looks like
overengineering.
Personally I am not sure if a walker for plpgsql is a good idea (I
thought about it more times, when I wrote plpgsql_check). But anyway -
there should be good reason for introducing the walker and clean use case.If you want to introduce stmt walker, then it should be a separate patch
with some benefit on plpgsql environment length.If you think that plpgsql statement walker is not needed, then I do not
insist.
Are you going to commit your version of the patch?I am afraid so it needs significantly much more work :(. My version is
correct just for the case that you describe, but it doesn't fix the
possibility of the end of the transaction inside the nested CALL.Some like
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data) CALL check_and_commit();END
LOOP;END;$$;Probably my patch (or your patch) will fix on 99%, but still there will be
a possibility of this issue. It is very similar to fixing releasing expr
plans inside CALL statements. Current design of CALL statement is ugly
workaround - it is slow, and there is brutal memory leak. Fixing memory
leak is not hard. Fixing every time replaning (and sometimes useless) needs
depper fix. Please check patch
/messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch
Maybe this mechanism can be used for a clean fix of the problem mentioned
in this thread.Sorry for delay with answer.
Today we have received another bug report from the client.
And now as you warned, there was no direct call of COMMIT/ROLLBACK
statements in stored procedures, but instead of it it is calling some other
pprocedures
which I suspect contains some transaction control statements.I looked at the plpgsql-stmt_call-fix-2.patch
</messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch>
It invalidates prepared plan in case of nested procedure call.
But here invalidation approach will not work. We have already prefetched
rows and to access them we need snapshot.
We can not restore the same snapshot after CALL - it will be not correct.
In case of ATX (autonomous transactions supported by PgPro) we really
save/restore context after ATX. But transaction control semantic in
procedures is different:
we commit current transaction and start new one.So I didn't find better solution than just slightly extend you patch and
consider any procedures containing CALLs as potentially performing
transaction control.
I updated version of your patch.
What do you think about it?
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.
I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.
Regards
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
čt 18. 2. 2021 v 18:10 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:
čt 18. 2. 2021 v 16:01 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.08.2020 22:20, Pavel Stehule wrote:
st 19. 8. 2020 v 20:59 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.08.2020 21:50, Pavel Stehule wrote:
Hi
st 19. 8. 2020 v 19:22 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Hi hackers,
More than month ago I have sent bug report to pgsql-bugs:
/messages/by-id/5d335911-fb25-60cd-4aa7-a5bd0954aea0@postgrespro.ru
with the proposed patch but have not received any response.
I wonder if there is some other way to fix this issue and does somebody
working on it.
While the added check itself is trivial (just one line) the total patch
is not so small because I have added walker for
plpgsql statements tree. It is not strictly needed in this case (it is
possible to find some other way to determine that stored procedure
contains transaction control statements), but I hope such walker may be
useful in other cases.In any case, I will be glad to receive any response,
because this problem was reported by one of our customers and we need
to
provide some fix.
It is better to include it in vanilla, rather than in our pgpro-ee fork.If it is desirable, I can add this patch to commitfest.
I don't like this design. It is not effective to repeat the walker for
every execution. Introducing a walker just for this case looks like
overengineering.
Personally I am not sure if a walker for plpgsql is a good idea (I
thought about it more times, when I wrote plpgsql_check). But anyway -
there should be good reason for introducing the walker and clean use case.If you want to introduce stmt walker, then it should be a separate patch
with some benefit on plpgsql environment length.If you think that plpgsql statement walker is not needed, then I do not
insist.
Are you going to commit your version of the patch?I am afraid so it needs significantly much more work :(. My version is
correct just for the case that you describe, but it doesn't fix the
possibility of the end of the transaction inside the nested CALL.Some like
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data) CALL check_and_commit();END
LOOP;END;$$;Probably my patch (or your patch) will fix on 99%, but still there will
be a possibility of this issue. It is very similar to fixing releasing expr
plans inside CALL statements. Current design of CALL statement is ugly
workaround - it is slow, and there is brutal memory leak. Fixing memory
leak is not hard. Fixing every time replaning (and sometimes useless) needs
depper fix. Please check patch
/messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch
Maybe this mechanism can be used for a clean fix of the problem mentioned
in this thread.Sorry for delay with answer.
Today we have received another bug report from the client.
And now as you warned, there was no direct call of COMMIT/ROLLBACK
statements in stored procedures, but instead of it it is calling some other
pprocedures
which I suspect contains some transaction control statements.I looked at the plpgsql-stmt_call-fix-2.patch
</messages/by-id/attachment/112489/plpgsql-stmt_call-fix-2.patch>
It invalidates prepared plan in case of nested procedure call.
But here invalidation approach will not work. We have already prefetched
rows and to access them we need snapshot.
We can not restore the same snapshot after CALL - it will be not correct.
In case of ATX (autonomous transactions supported by PgPro) we really
save/restore context after ATX. But transaction control semantic in
procedures is different:
we commit current transaction and start new one.So I didn't find better solution than just slightly extend you patch and
consider any procedures containing CALLs as potentially performing
transaction control.
I updated version of your patch.
What do you think about it?This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.
I am sorry, maybe my reply was not (is not) correct - this issue was
reported four months ago, and now I think more about your words about ATX,
and I have no idea, how much it is related to community pg or to pgpro.
I am sure so implementation of autonomous transactions is pretty hard, but
the described issue is related to PgPro implementation of ATX, and then it
should be fixed there. Disabling prefetching doesn't look like a good idea.
You try to fix the result, not the source of the problem - but I have not
any idea, what is possible and what not, because I don't know how PgPro ATX
is implemented.
Show quoted text
Regards
Pavel
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI
is redesigned, and I hope so this issue is fixed there. For older
releases, I don't know. Is this issue related to Postgres or it is
related to PgPro only? If it is related to community pg, then we
should fix and we should accept not too good performance, because
there is no better non invasive solution. If it is PgPro issue
(because there are ATX support) you can fix it (or you can try
backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.
Sorry, it is not PgPro specific problem and recent master suffers from
this bug as well.
In the original bug report there was simple scenario of reproducing the
problem:
CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master suffers from
this bug as well.
In the original bug report there was simple scenario of reproducing the
problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;
can you use new procedure_resowner?
Regards
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
I am sorry, maybe my reply was not (is not) correct - this issue was
reported four months ago, and now I think more about your words about
ATX, and I have no idea, how much it is related to community pg or to
pgpro.I am sure so implementation of autonomous transactions is pretty hard,
but the described issue is related to PgPro implementation of ATX, and
then it should be fixed there. Disabling prefetching doesn't look like
a good idea. You try to fix the result, not the source of the problem
- but I have not any idea, what is possible and what not, because I
don't know how PgPro ATX is implemented.
I think there is some misunderstanding.
Sorry if I my explanation was not clear.
This problem is not related with ATX and PgPro. Actually ATX correctly
handle this case (when iteration through query results cross transaction
commit).
It is the problem of transaction control in stored procedures in vanilla
Postgres and it is not yet resolved.
I refer to ATX in PgPro just as example of how this problem can be
solved with different transaction control model.
But this approach is not (IMHO) applicable to stored procedures.
I do not think that this problem is so critical.
Not so many people are using stored procedures (which were added to the
Postgres not so long time ago),
not all of them are performing transaction control inside them and even
less of them interleave loop over query results with transactions commits.
But there are such people and we have received correspondent bug reports.
So I think it should be somehow fixed.
I do not know good solution of the problem.
There are three possibilities:
1. Disable prefetch
2. Keep snapshot (which seems to be incorrect)
3. Materialize prefetched tuples before commit (seems to be non-trivial)
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users
use procedures without transaction control. So it doesn't look
like a good solution.I am more concentrated on the Pg 14 release, where the work with
SPI is redesigned, and I hope so this issue is fixed there. For
older releases, I don't know. Is this issue related to Postgres
or it is related to PgPro only? If it is related to community pg,
then we should fix and we should accept not too good performance,
because there is no better non invasive solution. If it is PgPro
issue (because there are ATX support) you can fix it (or you can
try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master suffers
from this bug as well.
In the original bug report there was simple scenario of
reproducing the problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM
toasted LOOP INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END
LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 19. 2. 2021 v 8:17 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
I am sorry, maybe my reply was not (is not) correct - this issue was
reported four months ago, and now I think more about your words about ATX,
and I have no idea, how much it is related to community pg or to pgpro.I am sure so implementation of autonomous transactions is pretty hard, but
the described issue is related to PgPro implementation of ATX, and then it
should be fixed there. Disabling prefetching doesn't look like a good idea.
You try to fix the result, not the source of the problem - but I have not
any idea, what is possible and what not, because I don't know how PgPro ATX
is implemented.I think there is some misunderstanding.
Sorry if I my explanation was not clear.This problem is not related with ATX and PgPro. Actually ATX correctly
handle this case (when iteration through query results cross transaction
commit).
It is the problem of transaction control in stored procedures in vanilla
Postgres and it is not yet resolved.
I refer to ATX in PgPro just as example of how this problem can be solved
with different transaction control model.
But this approach is not (IMHO) applicable to stored procedures.I do not think that this problem is so critical.
Not so many people are using stored procedures (which were added to the
Postgres not so long time ago),
not all of them are performing transaction control inside them and even
less of them interleave loop over query results with transactions commits.
But there are such people and we have received correspondent bug reports.
So I think it should be somehow fixed.I do not know good solution of the problem.
There are three possibilities:
1. Disable prefetch
2. Keep snapshot (which seems to be incorrect)
3. Materialize prefetched tuples before commit (seems to be non-trivial)
I am not sure if disabling prefetch for this case is the correct solution.
Probably not if you got a new snapshot, then the cursor will be
"sensitive", but other Postgres cursors are "insensitive".
Implementation of materialization should not be very hard - you will do
only copy tuples to some local buffers, but it doesn't say if the result
will be correct, because you mix more snapshots.
So keeping snapshots looks like a more correct solution - although there
can be inconsistency against current snapshot, the result is very similar
to full materialization.
Regards
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master suffers from
this bug as well.
In the original bug report there was simple scenario of reproducing the
problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?
This is just an idea - I think the most correct with zero performance
impact is keeping snapshot, and this can be stored in procedure_resowner.
The fundamental question is if we want or allow more snapshots per query.
The implementation is a secondary issue.
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 19.02.2021 10:47, Pavel Stehule wrote:
pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>
napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of
users use procedures without transaction control. So it
doesn't look like a good solution.I am more concentrated on the Pg 14 release, where the work
with SPI is redesigned, and I hope so this issue is fixed
there. For older releases, I don't know. Is this issue
related to Postgres or it is related to PgPro only? If it is
related to community pg, then we should fix and we should
accept not too good performance, because there is no better
non invasive solution. If it is PgPro issue (because there
are ATX support) you can fix it (or you can try backport the
patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master
suffers from this bug as well.
In the original bug report there was simple scenario of
reproducing the problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM
toasted LOOP INSERT INTO toasted(data)
VALUES(v_r.data);COMMIT;END LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?This is just an idea - I think the most correct with zero performance
impact is keeping snapshot, and this can be stored in procedure_resowner.The fundamental question is if we want or allow more snapshots per
query. The implementation is a secondary issue.
I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually
implicitly start new transaction.
And new transaction should have new snapshot. Otherwise its behavior
will change.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 19.02.2021 10:47, Pavel Stehule wrote:
pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master suffers from
this bug as well.
In the original bug report there was simple scenario of reproducing the
problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP
INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?This is just an idea - I think the most correct with zero performance
impact is keeping snapshot, and this can be stored in procedure_resowner.The fundamental question is if we want or allow more snapshots per query.
The implementation is a secondary issue.I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually implicitly
start new transaction.
And new transaction should have new snapshot. Otherwise its behavior will
change.
I have no problem with this. I have a problem with cycle implementation -
when I iterate over some result, then this result should be consistent over
all cycles. In other cases, the behaviour is not deterministic.
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 19.02.2021 11:12, Pavel Stehule wrote:
pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 19.02.2021 10:47, Pavel Stehule wrote:
pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>
napsal:On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru
<mailto:k.knizhnik@postgrespro.ru>> napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot
of users use procedures without transaction control. So
it doesn't look like a good solution.I am more concentrated on the Pg 14 release, where the
work with SPI is redesigned, and I hope so this issue
is fixed there. For older releases, I don't know. Is
this issue related to Postgres or it is related to
PgPro only? If it is related to community pg, then we
should fix and we should accept not too good
performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX
support) you can fix it (or you can try backport the
patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent
master suffers from this bug as well.
In the original bug report there was simple scenario of
reproducing the problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1,
1000)));
INSERT INTO toasted(data) VALUES((SELECT
string_agg(random()::text,':') FROM generate_series(1,
1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data
FROM toasted LOOP INSERT INTO toasted(data)
VALUES(v_r.data);COMMIT;END LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?This is just an idea - I think the most correct with zero
performance impact is keeping snapshot, and this can be stored in
procedure_resowner.The fundamental question is if we want or allow more snapshots
per query. The implementation is a secondary issue.I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually
implicitly start new transaction.
And new transaction should have new snapshot. Otherwise its
behavior will change.I have no problem with this. I have a problem with cycle
implementation - when I iterate over some result, then this result
should be consistent over all cycles. In other cases, the behaviour
is not deterministic.
I have investigated more the problem with toast data in stored
procedures and come to very strange conclusion:
to fix the problem it is enough to pass expand_external=false to
expanded_record_set_tuple instead of !estate->atomic:
{
/* Only need to assign a new
tuple value */
expanded_record_set_tuple(rec->erh, tuptab->vals[i],
- true, !estate->atomic);
+ true, false);
}
Why it is correct?
Because in assign_simple_var we already forced detoasting for data:
/*
* In non-atomic contexts, we do not want to store TOAST pointers in
* variables, because such pointers might become stale after a commit.
* Forcibly detoast in such cases. We don't want to detoast (flatten)
* expanded objects, however; those should be OK across a transaction
* boundary since they're just memory-resident objects. (Elsewhere in
* this module, operations on expanded records likewise need to request
* detoasting of record fields when !estate->atomic. Expanded
arrays are
* not a problem since all array entries are always detoasted.)
*/
if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
{
MemoryContext oldcxt;
Datum detoasted;
/*
* Do the detoasting in the eval_mcontext to avoid long-term
leakage
* of whatever memory toast fetching might leak. Then we have
to copy
* the detoasted datum to the function's main context, which is a
* pain, but there's little choice.
*/
oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
detoasted = PointerGetDatum(detoast_external_attr((struct
varlena *) DatumGetPointer(newvalue)));
So, there is no need to initialize TOAST snapshot and "no known
snapshots" error is false alarm.
What do you think about it?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 19. 2. 2021 v 16:19 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 19.02.2021 11:12, Pavel Stehule wrote:
pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.02.2021 10:47, Pavel Stehule wrote:
pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 19.02.2021 10:14, Pavel Stehule wrote:
pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use
procedures without transaction control. So it doesn't look like a good
solution.I am more concentrated on the Pg 14 release, where the work with SPI is
redesigned, and I hope so this issue is fixed there. For older releases, I
don't know. Is this issue related to Postgres or it is related to PgPro
only? If it is related to community pg, then we should fix and we should
accept not too good performance, because there is no better non invasive
solution. If it is PgPro issue (because there are ATX support) you can fix
it (or you can try backport the patch
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf
). You have more possibilities on PgPro code base.Sorry, it is not PgPro specific problem and recent master suffers from
this bug as well.
In the original bug report there was simple scenario of reproducing the
problem:CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':')
FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted
LOOP INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;can you use new procedure_resowner?
Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?This is just an idea - I think the most correct with zero performance
impact is keeping snapshot, and this can be stored in procedure_resowner.The fundamental question is if we want or allow more snapshots per query.
The implementation is a secondary issue.I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually implicitly
start new transaction.
And new transaction should have new snapshot. Otherwise its behavior will
change.I have no problem with this. I have a problem with cycle implementation -
when I iterate over some result, then this result should be consistent over
all cycles. In other cases, the behaviour is not deterministic.I have investigated more the problem with toast data in stored procedures
and come to very strange conclusion:
to fix the problem it is enough to pass expand_external=false to
expanded_record_set_tuple instead of !estate->atomic:{
/* Only need to assign a new tuple
value */expanded_record_set_tuple(rec->erh, tuptab->vals[i],
-
true, !estate->atomic);
+
true, false);
}Why it is correct?
Because in assign_simple_var we already forced detoasting for data:/*
* In non-atomic contexts, we do not want to store TOAST pointers in
* variables, because such pointers might become stale after a commit.
* Forcibly detoast in such cases. We don't want to detoast (flatten)
* expanded objects, however; those should be OK across a transaction
* boundary since they're just memory-resident objects. (Elsewhere in
* this module, operations on expanded records likewise need to request
* detoasting of record fields when !estate->atomic. Expanded arrays
are
* not a problem since all array entries are always detoasted.)
*/
if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
{
MemoryContext oldcxt;
Datum detoasted;/*
* Do the detoasting in the eval_mcontext to avoid long-term
leakage
* of whatever memory toast fetching might leak. Then we have to
copy
* the detoasted datum to the function's main context, which is a
* pain, but there's little choice.
*/
oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
detoasted = PointerGetDatum(detoast_external_attr((struct varlena
*) DatumGetPointer(newvalue)));So, there is no need to initialize TOAST snapshot and "no known snapshots"
error is false alarm.
What do you think about it?
This is Tom's code, so important is his opinion.
Regards
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company