slow SELECT expr INTO var in plpgsql

Started by Pavel Stehule2 months ago7 messageshackers
Jump to latest
#1Pavel Stehule
pavel.stehule@gmail.com

Hi

I found a article
https://medium.com/google-cloud/postgresql-performance-the-context-switch-trap-that-slows-down-t-sql-migrations-7b8c6f518fd8

It compare T-SQL and PLpgSQL performance on some simple benchmark

do $$ declare x int; begin for i in 1..10000000 loop x := 0; end loop; end
$$;
do $$ declare x int; begin for i in 1..10000000 loop select 0 into x; end
loop; end $$;

SELECT expr INTO var is syntax used on old sybase and mssql systems. The
positive result in this article is fact, so Postgres in all tests are very
well comparable. More - the assignment is really fast and significantly
faster than on MSSQL.

I remember the old discussion about this issue, and I thought that the
performance of SELECT INTO and assignment should be almost the same. I
repeated these tests on pg 9.4, 11 and master (asserts are disabled) with
interesting results

release, assign time, select into time
9.4, 2900 ms, 20800 ms
11, 2041 ms, 16243 ms
master, 534ms, 15438 ms

Originally, I used gcc with O0, and master is really slow without O2
optimization

9.4, 2600 ms, 20800 ms --<< 9.4 is faster with O0
11, 2177 ms, 19128 ms
master, 1395 ms, 70060 ms -- << master is very slow with O0

Using SELECT expr INTO var is plpgsql's antipattern. plpgsql_check can
detect it now. But it will still be nice if there will not be too big a
difference like now. I didn't check the code yet, and I have no idea if
there are some possibilities on how to execute this case better.

Regards

Pavel

tested on Fedora 43

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: slow SELECT expr INTO var in plpgsql

Pavel Stehule <pavel.stehule@gmail.com> writes:

I remember the old discussion about this issue, and I thought that the
performance of SELECT INTO and assignment should be almost the same. I
repeated these tests on pg 9.4, 11 and master (asserts are disabled) with
interesting results

release, assign time, select into time
9.4, 2900 ms, 20800 ms
11, 2041 ms, 16243 ms
master, 534ms, 15438 ms

Yeah, we've sweated a good deal about optimizing plpgsql assignment,
but SELECT INTO is always done the hard way.

I experimented a little bit with converting simple-expression
SELECT INTO into an assignment, as attached. It does reclaim
nearly all of the performance difference: for me, these two
test cases now take about 276 vs 337 ms. However, I'm concerned
about the side-effects of substituting this other code path;
there's a lot of potential minor differences in behavior.
Two that you can see in the regression test changes are:

* SELECT INTO is tracked by pg_stat_statements, assignments aren't.

* The context report for an error can be different, because
_SPI_error_callback() doesn't get used.

We could probably eliminate the context-report difference by setting
up a custom error context callback in this new code path, but the
difference in pg_stat_statements output would be hard to mask.
There may be other discrepancies as well, such as variations in
error message wording.

Probably no one would notice such details if it had been like that
all along, but would they complain about a change? I dunno.

regards, tom lane

Attachments:

optimize-select-into-v1.patchtext/x-diff; charset=us-ascii; name=optimize-select-into-v1.patchDownload+43-11
#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: slow SELECT expr INTO var in plpgsql

so 31. 1. 2026 v 21:58 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I remember the old discussion about this issue, and I thought that the
performance of SELECT INTO and assignment should be almost the same. I
repeated these tests on pg 9.4, 11 and master (asserts are disabled) with
interesting results

release, assign time, select into time
9.4, 2900 ms, 20800 ms
11, 2041 ms, 16243 ms
master, 534ms, 15438 ms

Yeah, we've sweated a good deal about optimizing plpgsql assignment,
but SELECT INTO is always done the hard way.

I experimented a little bit with converting simple-expression
SELECT INTO into an assignment, as attached. It does reclaim
nearly all of the performance difference: for me, these two
test cases now take about 276 vs 337 ms. However, I'm concerned
about the side-effects of substituting this other code path;
there's a lot of potential minor differences in behavior.
Two that you can see in the regression test changes are:

* SELECT INTO is tracked by pg_stat_statements, assignments aren't.

* The context report for an error can be different, because
_SPI_error_callback() doesn't get used.

We could probably eliminate the context-report difference by setting
up a custom error context callback in this new code path, but the
difference in pg_stat_statements output would be hard to mask.
There may be other discrepancies as well, such as variations in
error message wording.

Probably no one would notice such details if it had been like that
all along, but would they complain about a change? I dunno.

This patch looks well. I can confirm massive speedup.

I don't remember any report related to change of implementation of assign
statement before, and I think it can be similar with this patch.

In this specific case, I think so users suppose SELECT INTO is translated
to assignment by default. And there are a lot of documents on the net that
describe the transformation of the assignment statement to SELECT - so I
think there is some grey zone where optimization can do some magic. More -
the statistics for function execution can be covered by track_functions.

Regards

Pavel

Show quoted text

regards, tom lane

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#3)
Re: slow SELECT expr INTO var in plpgsql

Hi

ne 1. 2. 2026 v 6:09 odesílatel Pavel Stehule <pavel.stehule@gmail.com>
napsal:

so 31. 1. 2026 v 21:58 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

I remember the old discussion about this issue, and I thought that the
performance of SELECT INTO and assignment should be almost the same. I
repeated these tests on pg 9.4, 11 and master (asserts are disabled)

with

interesting results

release, assign time, select into time
9.4, 2900 ms, 20800 ms
11, 2041 ms, 16243 ms
master, 534ms, 15438 ms

Yeah, we've sweated a good deal about optimizing plpgsql assignment,
but SELECT INTO is always done the hard way.

I experimented a little bit with converting simple-expression
SELECT INTO into an assignment, as attached. It does reclaim
nearly all of the performance difference: for me, these two
test cases now take about 276 vs 337 ms. However, I'm concerned
about the side-effects of substituting this other code path;
there's a lot of potential minor differences in behavior.
Two that you can see in the regression test changes are:

* SELECT INTO is tracked by pg_stat_statements, assignments aren't.

* The context report for an error can be different, because
_SPI_error_callback() doesn't get used.

We could probably eliminate the context-report difference by setting
up a custom error context callback in this new code path, but the
difference in pg_stat_statements output would be hard to mask.
There may be other discrepancies as well, such as variations in
error message wording.

Probably no one would notice such details if it had been like that
all along, but would they complain about a change? I dunno.

This patch looks well. I can confirm massive speedup.

I don't remember any report related to change of implementation of assign
statement before, and I think it can be similar with this patch.

In this specific case, I think so users suppose SELECT INTO is translated
to assignment by default. And there are a lot of documents on the net that
describe the transformation of the assignment statement to SELECT - so I
think there is some grey zone where optimization can do some magic. More -
the statistics for function execution can be covered by track_functions.

Do you plan to push this patch? Unfortunately there is not any discussion
about side effects.

I wrote a version with dedicated error context callback, so there will be
differences only in pg_stat_statements. It is true, so this should be hard
to mask. Maybe this difference can be just documented - like "`SELECT expr
INTO variable` can be optimized and executed by a direct expression
executor, and then this query will not be visible in pg_stat_statement."

Regards

Pavel

Show quoted text

Regards

Pavel

regards, tom lane

Attachments:

0001-optimize-select-into-v2.patchtext/x-patch; charset=US-ASCII; name=0001-optimize-select-into-v2.patchDownload+72-12
#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#4)
Re: slow SELECT expr INTO var in plpgsql

Hi

I am sorry, wrong patch

Regards

Pavel

Attachments:

optimize-select-into-v2.patchtext/x-patch; charset=US-ASCII; name=optimize-select-into-v2.patchDownload+71-9
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#4)
Re: slow SELECT expr INTO var in plpgsql

Pavel Stehule <pavel.stehule@gmail.com> writes:

so 31. 1. 2026 v 21:58 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

I experimented a little bit with converting simple-expression
SELECT INTO into an assignment, as attached. It does reclaim
nearly all of the performance difference: for me, these two
test cases now take about 276 vs 337 ms. However, I'm concerned
about the side-effects of substituting this other code path;
there's a lot of potential minor differences in behavior.
Two that you can see in the regression test changes are:

* SELECT INTO is tracked by pg_stat_statements, assignments aren't.

* The context report for an error can be different, because
_SPI_error_callback() doesn't get used.

We could probably eliminate the context-report difference by setting
up a custom error context callback in this new code path, but the
difference in pg_stat_statements output would be hard to mask.
There may be other discrepancies as well, such as variations in
error message wording.

Do you plan to push this patch? Unfortunately there is not any discussion
about side effects.

Yeah, general radio silence out there. After thinking about it for
awhile, I've decided to go ahead with the patch. It'll be easy enough
to revert if people are unhappy.

I wrote a version with dedicated error context callback,

Thanks for doing that. I found though that it wasn't quite enough,
because the existing code path applies _SPI_error_callback() during
evaluation of the expression but not during assignment to the target
variable. So for example, errors during type conversion to match
the target variable don't get a context line claiming they happened
during evaluation of the expression, which seems correct to me.
I was able to fix it by not using exec_assign_expr() but instead
copying that code in-line, so that we can pop the error context stack
at the right point. (See added tests in the committed patch,
ce8d5fe0e2802158b65699aeae1551d489948167.)

... so there will be
differences only in pg_stat_statements. It is true, so this should be hard
to mask. Maybe this difference can be just documented - like "`SELECT expr
INTO variable` can be optimized and executed by a direct expression
executor, and then this query will not be visible in pg_stat_statement."

We don't document that "var := expression" isn't captured, so I don't
think this needs to be either. Possibly Bruce will pick up the change
as a release-note item.

regards, tom lane

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#1)
Re: slow SELECT expr INTO var in plpgsql

pá 20. 3. 2026 v 23:33 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

Pavel Stehule <pavel.stehule@gmail.com> writes:

so 31. 1. 2026 v 21:58 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

I experimented a little bit with converting simple-expression
SELECT INTO into an assignment, as attached. It does reclaim
nearly all of the performance difference: for me, these two
test cases now take about 276 vs 337 ms. However, I'm concerned
about the side-effects of substituting this other code path;
there's a lot of potential minor differences in behavior.
Two that you can see in the regression test changes are:

* SELECT INTO is tracked by pg_stat_statements, assignments aren't.

* The context report for an error can be different, because
_SPI_error_callback() doesn't get used.

We could probably eliminate the context-report difference by setting
up a custom error context callback in this new code path, but the
difference in pg_stat_statements output would be hard to mask.
There may be other discrepancies as well, such as variations in
error message wording.

Do you plan to push this patch? Unfortunately there is not any discussion
about side effects.

Yeah, general radio silence out there. After thinking about it for
awhile, I've decided to go ahead with the patch. It'll be easy enough
to revert if people are unhappy.

I wrote a version with dedicated error context callback,

Thanks for doing that. I found though that it wasn't quite enough,
because the existing code path applies _SPI_error_callback() during
evaluation of the expression but not during assignment to the target
variable. So for example, errors during type conversion to match
the target variable don't get a context line claiming they happened
during evaluation of the expression, which seems correct to me.
I was able to fix it by not using exec_assign_expr() but instead
copying that code in-line, so that we can pop the error context stack
at the right point. (See added tests in the committed patch,
ce8d5fe0e2802158b65699aeae1551d489948167.)

... so there will be
differences only in pg_stat_statements. It is true, so this should be

hard

to mask. Maybe this difference can be just documented - like "`SELECT

expr

INTO variable` can be optimized and executed by a direct expression
executor, and then this query will not be visible in pg_stat_statement."

We don't document that "var := expression" isn't captured, so I don't
think this needs to be either. Possibly Bruce will pick up the change
as a release-note item.

Thank you very much

Regards

Pavel

Show quoted text

regards, tom lane