How embarrassing: optimization of a one-shot query doesn't work

Started by Tom Laneabout 18 years ago28 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them. For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

Some investigation showed that the planner is using the passed values
for estimation purposes, but not for any purposes where the value *must*
be correct (not only this LIKE-optimization, but constraint exclusion,
for instance). The reason is that the parameter values are made
available to estimate_expression_value but not to eval_const_expressions.
This is a thinko in a cleanup patch I made early in 8.3 development:
http://archives.postgresql.org/pgsql-committers/2007-02/msg00352.php
I said to myself "eval_const_expressions doesn't need any context,
because a constant expression's value must be independent of context,
so I can avoid changing its API". Silly me.

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value. I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function. I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice. Comments/objections?

regards, tom lane

#2Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#1)
Re: How embarrassing: optimization of a one-shot query doesn't work

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value. I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function. I doubt there
is currently any production-grade code doing so, but if anyone out there
is actively using those planner hooks we put into 8.3, it's conceivable
this would affect them.

Still, the performance regression here is bad enough that I think there
is little choice. Comments/objections?

I agree that we should update stable to fix this performance regression,
given the gravity of it. I also feel that we need to do so ASAP, to
minimize the risk of people being affected by it. The longer we wait on
it the more likely someone will write something which is affected.

The constraint-exclusion not being used will be a huge impact and
problem for people moving partitioned-data with dynamic pl/pgsql
generation functions to 8.3.

Robert, I'm suprised you weren't affected, or have you just not noticed
yet due to your fancy new-to-you hardware? ;)

Stephen

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#2)
Re: How embarrassing: optimization of a one-shot query doesn't work

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Still, the performance regression here is bad enough that I think there
is little choice. Comments/objections?

I agree that we should update stable to fix this performance regression,
given the gravity of it. I also feel that we need to do so ASAP, to
minimize the risk of people being affected by it. The longer we wait on
it the more likely someone will write something which is affected.

In the normal course of events I'd expect that we'd put out 8.3.2
in a month or so. I'm not quite convinced that this issue is worth
speeding the cycle all by itself. We've done that for data-loss
issues but never before for a mere performance problem.

The main reason I wanted to make some noise about this is to find out
if anyone is actually trying to call eval_const_expressions (or
relation_excluded_by_constraints, which it turned out needed to change
also) from 8.3 add-on code. If anyone squawks we could think about a
faster update ...

regards, tom lane

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: How embarrassing: optimization of a one-shot query doesn't work

Tom Lane wrote:

The main reason I wanted to make some noise about this is to find out
if anyone is actually trying to call eval_const_expressions (or
relation_excluded_by_constraints, which it turned out needed to change
also) from 8.3 add-on code. If anyone squawks we could think about a
faster update ...

That assumes that someone working on using the planner hooks will read
this thread - which might be reasonable - I guess they number of likely
users is fairly small. But if they might miss it then it would be best
to fix it ASAP, ISTM.

cheers

andrew

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

If anyone squawks we could think about a
faster update ...

That assumes that someone working on using the planner hooks will read
this thread - which might be reasonable - I guess they number of likely
users is fairly small. But if they might miss it then it would be best
to fix it ASAP, ISTM.

Well, it's not like we have never before changed internal APIs in a
minor update. (There have been security-related cases where we gave
*zero* notice of such changes.) Nor am I willing to surrender the
option to do so again. If there's somebody out there with a real
problem with this change, they need to speak up.

regards, tom lane

#6Guillaume Smet
guillaume.smet@gmail.com
In reply to: Tom Lane (#5)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Tue, Apr 1, 2008 at 7:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, it's not like we have never before changed internal APIs in a
minor update. (There have been security-related cases where we gave
*zero* notice of such changes.) Nor am I willing to surrender the
option to do so again. If there's somebody out there with a real
problem with this change, they need to speak up.

+1 to fix it ASAP. A lot of people usually wait for 8.x.1 before
considering a migration and people are usually quite cautious with the
8.3 migration because of all the cast errors in the existing app.

Another question is how we can be sure it doesn't happen again. The
easiest way to test this is probably to have a JDBC test testing this
exact feature in the future benchfarm. Any comment?

--
Guillaume

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Guillaume Smet (#6)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

"Guillaume Smet" <guillaume.smet@gmail.com> writes:

Another question is how we can be sure it doesn't happen again. The
easiest way to test this is probably to have a JDBC test testing this
exact feature in the future benchfarm. Any comment?

Yeah, the lack of any formal testing of the extended-Query protocol
is a real problem. I'm not sure of a good fix, but it bears some
thinking about. Not only do we not have an automated way to notice
if we broke functionality, but we don't really notice for either
extended or basic protocol if we hurt performance.

regards, tom lane

#8Guillaume Smet
guillaume.smet@gmail.com
In reply to: Tom Lane (#7)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Tue, Apr 1, 2008 at 8:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yeah, the lack of any formal testing of the extended-Query protocol
is a real problem. I'm not sure of a good fix, but it bears some
thinking about. Not only do we not have an automated way to notice
if we broke functionality, but we don't really notice for either
extended or basic protocol if we hurt performance.

I just posted something to -hackers about the availability of boxes
for QA purposes. It doesn't solve the problem by itself though.

A good answer is probably to plan optional JDBC benchmarks in the
benchfarm design - not all people want to run Java on their boxes but
we have servers of our own to do so. Andrew?

--
Guillaume

#9Michael Paesold
mpaesold@gmx.at
In reply to: Tom Lane (#1)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Am 01.04.2008 um 01:26 schrieb Tom Lane:

While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them. For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value. I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function. I doubt
there
is currently any production-grade code doing so, but if anyone out
there
is actively using those planner hooks we put into 8.3, it's
conceivable
this would affect them.

Still, the performance regression here is bad enough that I think
there
is little choice. Comments/objections?

Yeah, please fix this performance regression in the 8.3 branch. This
would affect most of the JDBC applications out there, I think.

Best Regards
Michael Paesold

#10Dave Cramer
pg@fastcrypt.com
In reply to: Michael Paesold (#9)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:

Am 01.04.2008 um 01:26 schrieb Tom Lane:

While testing the changes I was making to Pavel's EXECUTE USING patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them. For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value. I am
slightly hesitant to do this in a stable branch, since it would break
any third-party code that might be calling that function. I doubt
there
is currently any production-grade code doing so, but if anyone out
there
is actively using those planner hooks we put into 8.3, it's
conceivable
this would affect them.

Still, the performance regression here is bad enough that I think
there
is little choice. Comments/objections?

Yeah, please fix this performance regression in the 8.3 branch. This
would affect most of the JDBC applications out there, I think.

Was the driver ever changed to take advantage of the above strategy?

Dave

Show quoted text

Best Regards
Michael Paesold

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#10)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Dave Cramer <pg@fastcrypt.com> writes:

Was the driver ever changed to take advantage of the above strategy?

Well, it's automatic as long as you use the unnamed statement. About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane

#12Michael Paesold
mpaesold@gmx.at
In reply to: Dave Cramer (#10)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Am 01.04.2008 um 13:14 schrieb Dave Cramer:

On 1-Apr-08, at 6:25 AM, Michael Paesold wrote:

Am 01.04.2008 um 01:26 schrieb Tom Lane:

While testing the changes I was making to Pavel's EXECUTE USING
patch
to ensure that parameter values were being provided to the planner,
it became painfully obvious that the planner wasn't actually *doing*
anything with them. For example

execute 'select count(*) from foo where x like $1' into c using $1;

wouldn't generate an indexscan when $1 was of the form 'prefix%'.

...

The implication of this is that 8.3 is significantly worse than 8.2
in optimizing unnamed statements in the extended-Query protocol;
a feature that JDBC, at least, relies on.

The fix is simple: add PlannerInfo to eval_const_expressions's
parameter list, as was done for estimate_expression_value. I am
slightly hesitant to do this in a stable branch, since it would
break
any third-party code that might be calling that function. I doubt
there
is currently any production-grade code doing so, but if anyone out
there
is actively using those planner hooks we put into 8.3, it's
conceivable
this would affect them.

Still, the performance regression here is bad enough that I think
there
is little choice. Comments/objections?

Yeah, please fix this performance regression in the 8.3 branch.
This would affect most of the JDBC applications out there, I think.

Was the driver ever changed to take advantage of the above strategy?

IIRC, it is used in most cases with the v3 protocol, as long as you
don't set a prepare-threshold.

Best Regards
Michael Paesold

#13Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#11)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

So if I write

conn.prepareStatement("select col from table where col like ?")

then setString(1,'hello%')

The driver will do

prepare foo as select col from table where col like $1

and then

execute foo('hello%')

this will take advantage of the strategy automatically ?

If so this should be changed. The driver does this all the time.

Dave

On 1-Apr-08, at 10:06 AM, Tom Lane wrote:

Show quoted text

Dave Cramer <pg@fastcrypt.com> writes:

Was the driver ever changed to take advantage of the above strategy?

Well, it's automatic as long as you use the unnamed statement. About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14PFC
lists@peufeu.com
In reply to: Tom Lane (#11)
Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Tue, 01 Apr 2008 16:06:01 +0200, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dave Cramer <pg@fastcrypt.com> writes:

Was the driver ever changed to take advantage of the above strategy?

Well, it's automatic as long as you use the unnamed statement. About
all that might need to be done on the client side is to use unnamed
statements more often in preference to named ones, and I believe that
something like that did get done in JDBC.

regards, tom lane

PHP is also affected if you use pg_query_params...
Syntax : pg_query_params( "SQL with $ params", array( parameters )

Note that value is TEXT, indexed, there are 100K rows in table.

pg_query( "SELECT * FROM test WHERE id =12345" ); 1 rows in
0.15931844711304 ms
pg_query( "SELECT * FROM test WHERE value LIKE '1234%'" ); 11 rows in
0.26795864105225 ms

pg_query_params( "SELECT * FROM test WHERE id =$1", array( 12345 ) ); 1
rows in 0.16618013381958 ms
pg_query_params( "SELECT * FROM test WHERE value LIKE $1", array( '1234%'
)); 11 rows in 40.66633939743 ms

Last query does not use index.
However since noone uses pg_query_params in PHP (since PHP coders just
LOVE to manually escape their strings, or worse use magicquotes), noone
should notice ;)

#15Greg Smith
gsmith@gregsmith.com
In reply to: Guillaume Smet (#8)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Tue, 1 Apr 2008, Guillaume Smet wrote:

A good answer is probably to plan optional JDBC benchmarks in the
benchfarm design - not all people want to run Java on their boxes but
we have servers of our own to do so.

The original pgbench was actually based on an older test named JDBCbench.
That code is kind of old and buggy at this point. But with some care and
cleanup it's possible to benchmark not only relative Java performance with
it, but you can compare it with pgbench running the same queries on the
same tables to see how much overhead going through Java is adding.

Original code at http://mmmysql.sourceforge.net/performance/ , there's
also some improved versions at
http://developer.mimer.com/features/feature_16.htm

I'm not sure if all of those changes are net positive for PostgreSQL
though, they weren't last time I played with this.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#16Guillaume Smet
guillaume.smet@gmail.com
In reply to: Greg Smith (#15)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <gsmith@gregsmith.com> wrote:

I'm not sure if all of those changes are net positive for PostgreSQL
though, they weren't last time I played with this.

I fixed most of the bugs of JDBCBench I found when I benchmarked
Sequoia a long time ago. Totally forgot about it. I'll see if I can
find the sources somewhere.

--
Guillaume

#17Dave Cramer
pg@fastcrypt.com
In reply to: Guillaume Smet (#16)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Guillaume,

I for one would be very interested in the JDBCBench code.

Dave
On 1-Apr-08, at 8:35 PM, Guillaume Smet wrote:

Show quoted text

On Wed, Apr 2, 2008 at 2:05 AM, Greg Smith <gsmith@gregsmith.com>
wrote:

I'm not sure if all of those changes are net positive for PostgreSQL
though, they weren't last time I played with this.

I fixed most of the bugs of JDBCBench I found when I benchmarked
Sequoia a long time ago. Totally forgot about it. I'll see if I can
find the sources somewhere.

--
Guillaume

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Guillaume Smet
guillaume.smet@gmail.com
In reply to: Dave Cramer (#17)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

On Wed, Apr 2, 2008 at 2:53 AM, Dave Cramer <pg@fastcrypt.com> wrote:

I for one would be very interested in the JDBCBench code.

OK, I didn't make anything fancy, I just fixed the problem I
encountered when profiling Sequoia (I mostly used it as an injector).

I'll post the code tomorrow if I can find it somewhere (I lost a
couple of disks and I don't remember the box I used to run the tests).

--
Guillaume

#19Thomas Burdairon
tburdairon@entelience.com
In reply to: Guillaume Smet (#18)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

Is there any patch available for this one?

I'm encountering troubles with some JDBC queries and I'd like to test
it before asking some help on the JDBC list.

Thanks.
Tom

#20Dave Cramer
pg@fastcrypt.com
In reply to: Thomas Burdairon (#19)
Re: Re: [HACKERS] How embarrassing: optimization of a one-shot query doesn't work

It's pretty easy to test.

prepare the query
and
run explain analyze on the prepared statement.

Dave
On 10-Apr-08, at 5:47 AM, Thomas Burdairon wrote:

Show quoted text

Is there any patch available for this one?

I'm encountering troubles with some JDBC queries and I'd like to
test it before asking some help on the JDBC list.

Thanks.
Tom

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Thomas Burdairon
tburdairon@entelience.com
In reply to: Dave Cramer (#20)
#22Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#1)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#22)
#24Dave Cramer
pg@fastcrypt.com
In reply to: Tom Lane (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Cramer (#24)
#26Jignesh K. Shah
J.K.Shah@Sun.COM
In reply to: Dave Cramer (#20)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jignesh K. Shah (#26)
#28Jignesh K. Shah
J.K.Shah@Sun.COM
In reply to: Tom Lane (#27)