'prepare' is not quite schema-safe
Hello,
I'm seeking advice on an issue that we hit recently
(it cost me a sleepless night after a production server upgrade).
The environment is Apache + mod_perl and PostgreSQL 8.0.2. After
upgrading DBD::Pg to version 1.41 (which supports preparing queries
on the server side) we hit a series of strange issues. After digging
into the problem for two days I can provide a minimal example to
illustrate what happens:
CREATE SCHEMA one;
SET search_path TO one;
CREATE TABLE test ( item VARCHAR(20) );
INSERT INTO test VALUES( 'one' );
CREATE SCHEMA two;
SET search_path TO two;
CREATE TABLE test ( item VARCHAR(20) );
INSERT INTO test VALUES( 'two' );
SET search_path TO one;
PREPARE st( VARCHAR(20) ) AS SELECT * FROM test WHERE item = $1;
EXECUTE st( 'one' );
SET search_path TO two;
-- next statement fails because st selects from one.test, not from two.test
EXECUTE st( 'two' );
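For illustration, the example can be made to behave as the programmer intended by re-preparing after each path change (a workaround sketch only; the replies below discuss whether this should be necessary):

```sql
SET search_path TO two;
-- Throw away the plan bound to one.test and build a new one,
-- which will now resolve "test" to two.test.
DEALLOCATE st;
PREPARE st( VARCHAR(20) ) AS SELECT * FROM test WHERE item = $1;
EXECUTE st( 'two' );  -- now reads two.test
```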
I'm not sure whether this is a bug, a feature, or something else, but
I'm looking either for some way to make a prepared statement bind to
the current schema at execution time (unless a schema is specified in
the statement), or for advice from the list on how the above situation
can be avoided. As a quick workaround we reverted to DBD::Pg 1.32,
which has no server-side prepare support.
p.s. I'm cross-posting to both the pgsql-general and dbd-pg lists,
because I'm not sure whether a possible workaround lies on the
DBD::Pg side or the PostgreSQL side.
--
Vlad
Vlad <marchenko@gmail.com> writes:
SET search_path TO one;
PREPARE st( VARCHAR(20) ) AS SELECT * FROM test WHERE item = $1;
EXECUTE st( 'one' );
SET search_path TO two;
-- next statement fails because st selects from one.test, not from two.test
EXECUTE st( 'two' );
That's what it is supposed to do. It would hardly be possible to
"prepare" a query at all if we had to wait till EXECUTE to find out
which tables it was supposed to use.
regards, tom lane
On Sun, 1 May 2005, Vlad wrote:
Hello,
I'm seeking advice on an issue that we hit recently
(it cost me a sleepless night after a production server upgrade).
the first problem you have is that you have a critical production system
that you upgraded without going through proper testing first.
That's just bad change control.
In any case, if the new DBD::Pg blew up in your face why did you not
immediately revert to the previous working one? Even if you didn't have it
on disk anymore you can just re-download it explicitly.
If it were no longer on CPAN it is on the 'backpan'.
Just because it's Linux, or PostgreSQL, or Perl, doesn't mean you don't
have to follow proper operational procedures.
the first problem you have is that you have a critical production system
that you upgraded without going through proper testing first.
That's just bad change control.
In any case, if the new DBD::Pg blew up in your face why did you not
immediately revert to the previous working one? Even if you didn't have it
on disk anymore you can just re-download it explicitly.
If it were no longer on CPAN it is on the 'backpan'.
Just because it's Linux, or PostgreSQL, or Perl, doesn't mean you don't
have to follow proper operational procedures.
there were a lot of other things upgraded at the same time, not only
DBD::Pg (to minimize total downtime):
- our own Perl code (with major changes; it's a 4 MB project)
- PostgreSQL itself
- DBD::Pg
...
All of the above had been tested on a dedicated testing server prior to
the upgrade and appeared stable. Unfortunately it's not always possible
to simulate exactly the same conditions as in the production
environment, and as a result the problem didn't show up until we did
the upgrade.
Because a lot of things were upgraded, it wasn't immediately obvious
that the problem was the result of the new DBD::Pg feature, so it took
us an hour or two to narrow it down to DBD::Pg (which was downgraded
immediately), and then we worked the rest of the time to find the
actual cause and make sure everything else was OK.
Anyway, with all that said, I wasn't blaming anyone here, and I wasn't
actually looking for an analysis of how we should have done the
upgrade. Rather, I'm interested in the list's opinion on a possible
workaround. Thanks for your point anyway.
--
Vlad
Tom,
thanks for your reply.
That's what it is supposed to do. It would hardly be possible to
"prepare" a query at all if we had to wait till EXECUTE to find out
which tables it was supposed to use.
I understand that from PostgreSQL's point of view everything is
logical. But for an application that serves multiple (identical)
queries over the same DB connection, switching schemas depending on
which account a query is for, it turns into an oddity with the switch
from DBD::Pg 1.32 (which caches prepared queries internally, AFAIK) to
DBD::Pg 1.41, which has PostgreSQL prepare the query...
i.e. the following perl code won't work correctly with DBD::Pg 1.40+
$dbh->do("SET search_path TO one");
my $sth1 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth1->execute("one");
$dbh->do("set search_path to two");
my $sth2 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth2->execute("two");
in the last call the query prepared for $sth1 will actually be
executed, i.e. the "one.test" table is used, not "two.test" as a
programmer would expect!
--
Vlad
Vlad <marchenko@gmail.com> writes:
i.e. the following perl code won't work correctly with DBD::Pg 1.40+
$dbh->do("SET search_path TO one");
my $sth1 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth1->execute("one");
$dbh->do("set search_path to two");
my $sth2 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth2->execute("two");
in the last call the query prepared for $sth1 will actually be
executed, i.e. the "one.test" table is used, not "two.test" as a
programmer would expect!
Hmm. The above is arguably a DBD::Pg bug: it should not expect that
it's okay to use the same prepared statement in both cases. I do not
know what the spec is for "prepare_cached", but it sure seems that the
concept is fraught with danger --- the client-side driver has very
little hope of knowing what server-side events might be reasons to
invalidate the query cache. (Not that the server side is presently
all that good about it, but at least the server side is fixable
in principle ;-))
regards, tom lane
Tom Lane wrote:
That's what it is supposed to do. It would hardly be possible to
"prepare" a query at all if we had to wait till EXECUTE to find out
which tables it was supposed to use.
An alternative would be to flush dependent plans when the schema search
path is changed. In effect this would mean flushing *all* prepared plans
whenever the search path changes: we could perhaps keep plans that only
contain explicit namespace references, but that seems fragile.
Flushing all plans might well be a cure that is worse than the disease,
at least for a lot of users.
-Neil
Neil Conway <neilc@samurai.com> writes:
An alternative would be to flush dependent plans when the schema search
path is changed.
I think this would actually be the Wrong Thing. It's certainly a
debatable point --- but the best analogy we have is the behavior of
plpgsql functions in the face of search-path changes, and I think that
most people who have thought about that carefully are in favor of
changing plpgsql functions to follow a search path frozen at function
creation time. The fact that we haven't gotten around to making that
happen isn't an argument for breaking PREPARE in the same way that
plpgsql is broken ;-)
regards, tom lane
yeah, I agree.
Perhaps a more correct solution would be to adjust DBD::Pg to detect
changes of the active schema, and to key its store of server-side
prepared queries on query + current schema, not only on the query as
it does now (as I understand it)...
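To make the idea concrete, here is a rough application-level sketch of such a cache, keyed on (schema, SQL) instead of SQL alone. The %sth_cache hash and prepare_for_schema() helper are hypothetical, not part of DBD::Pg:

```perl
# Cache statement handles per (search_path, SQL) pair, so that a
# schema switch never reuses a handle planned against another schema.
my %sth_cache;

sub prepare_for_schema {
    my ($dbh, $schema, $sql) = @_;
    my $key = "$schema\0$sql";            # schema is part of the cache key
    $sth_cache{$key} ||= $dbh->prepare($sql);
    return $sth_cache{$key};
}

$dbh->do("SET search_path TO one");
my $sth1 = prepare_for_schema($dbh, "one", "SELECT * FROM test WHERE item = ?");
$sth1->execute("one");

$dbh->do("SET search_path TO two");
my $sth2 = prepare_for_schema($dbh, "two", "SELECT * FROM test WHERE item = ?");
$sth2->execute("two");   # distinct handle, planned against two.test
```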
On 5/2/05, Neil Conway <neilc@samurai.com> wrote:
Tom Lane wrote:
That's what it is supposed to do. It would hardly be possible to
"prepare" a query at all if we had to wait till EXECUTE to find out
which tables it was supposed to use.
An alternative would be to flush dependent plans when the schema search
path is changed. In effect this would mean flushing *all* prepared plans
whenever the search path changes: we could perhaps keep plans that only
contain explicit namespace references, but that seems fragile.
Flushing all plans might well be a cure that is worse than the disease,
at least for a lot of users.
-Neil
--
Vlad
btw, after re-reading the second part of your comment once again, I
have a (clarification) question:
so is it possible that a successfully prepared (and possibly a couple
of times already executed) query will be invalidated by postgresql
for some reason (like lack of memory for processing/caching other
queries)? Assuming that no database structure changes have been
performed.
If the answer is YES, then it's important to double-check that the
DBD::Pg driver handles such a situation appropriately - e.g. by
re-preparing the query with PostgreSQL.
Hmm. The above is arguably a DBD::Pg bug: it should not expect that
it's okay to use the same prepared statement in both cases. I do not
know what the spec is for "prepare_cached", but it sure seems that the
concept is fraught with danger --- the client-side driver has very
little hope of knowing what server-side events might be reasons to
invalidate the query cache. (Not that the server side is presently
all that good about it, but at least the server side is fixable
in principle ;-))
regards, tom lane
--
Vlad
Vlad <marchenko@gmail.com> writes:
so is it possible that a successfully prepared (and possibly a couple
of times already executed) query will be invalidated by postgresql
for some reason (like lack of memory for processing/caching other
queries)? Assuming that no database structure changes have been
performed.
Well, that assumption is wrong to start with: what if the query plan
uses an index that someone else has chosen to drop? Or the plan
depends on an inlined copy of a SQL function that someone has since
changed? Or the plan was chosen on the basis of particular settings
of planner parameters like random_page_cost, but the user has changed
these via SET? (The last is a pretty close analogy to changing
search_path, I think.)
I am not claiming that the backend handles all these cases nicely
today: it certainly doesn't. But we understand in principle how
to fix these problems by invalidating plans inside the backend.
I don't see how the DBD::Pg driver can hope to deal with any of
these situations :-(
regards, tom lane
On Sun, May 01, 2005 at 11:19:16PM -0400, Tom Lane wrote:
Vlad <marchenko@gmail.com> writes:
i.e. the following perl code won't work correctly with DBD::Pg 1.40+
$dbh->do("SET search_path TO one");
my $sth1 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth1->execute("one");
$dbh->do("set search_path to two");
my $sth2 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth2->execute("two");
in the last call $sth1 prepared query will be actually executed, i.e.
"one.test" table used, not "two.test" as a programmer would expect!
Hmm. The above is arguably a DBD::Pg bug: it should not expect that
it's okay to use the same prepared statement in both cases. I do not
know what the spec is for "prepare_cached", but it sure seems that the
concept is fraught with danger --- the client-side driver has very
little hope of knowing what server-side events might be reasons to
invalidate the query cache. (Not that the server side is presently
all that good about it, but at least the server side is fixable
in principle ;-))
Isn't this behaving as documented? prepare_cached() is supposed to
return the original statement handle when you pass it the same string
a second time.
The docs for prepare_cached() are littered with "Don't do this unless
you understand the implications" warnings, as well as some kludges to
differentiate different cases.
Cheers,
Steve
ok, since there is no guarantee that a server-side prepared query is
still active, perhaps the PostgreSQL interface library could provide a
way to check whether a previously prepared query is still alive before
running EXECUTE, so that the DBD::Pg driver can make sure it's still
there right before executing?
If there is no such function (and I can't find one), then it will be
hard for a driver to make things work right with server-side prepared
queries!
On 5/2/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vlad <marchenko@gmail.com> writes:
so is it possible that a successfully prepared (and possibly a couple
of times already executed) query will be invalidated by postgresql
for some reason (like lack of memory for processing/caching other
queries)? Assuming that no database structure changes have been
performed.
Well, that assumption is wrong to start with: what if the query plan
uses an index that someone else has chosen to drop? Or the plan
depends on an inlined copy of a SQL function that someone has since
changed? Or the plan was chosen on the basis of particular settings
of planner parameters like random_page_cost, but the user has changed
these via SET? (The last is a pretty close analogy to changing
search_path, I think.)
I am not claiming that the backend handles all these cases nicely
today: it certainly doesn't. But we understand in principle how
to fix these problems by invalidating plans inside the backend.
I don't see how the DBD::Pg driver can hope to deal with any of
these situations :-(
--
Vlad
Vlad wrote:
ok, since there is no guarantee that a server-side prepared query is
still active, perhaps the PostgreSQL interface library could provide a
way to check whether a previously prepared query is still alive before
running EXECUTE
I'm not sure I quite follow you -- in some future version of the backend
in which prepared queries are invalidated, this would be invisible to
the client. The client wouldn't need to explicitly check for the
"liveness" of the prepared query, they could just execute it -- if
necessary, the backend will re-plan the query before executing it.
-Neil
Vlad [marchenko@gmail.com] wrote:
ok, since there is no guarantee that a server-side prepared query is
still active, perhaps the PostgreSQL interface library could provide a
way to check whether a previously prepared query is still alive before
running EXECUTE, so that the DBD::Pg driver can make sure it's still
there right before executing?
If there is no such function (and I can't find one), then it will be
hard for a driver to make things work right with server-side prepared
queries!
You can always use fully qualified class (table) names in your prepared
queries, i.e. explicitly specify the schema name.
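Applied to the example upthread, that means qualifying the table so the plan no longer depends on search_path at all (the statement names st_one/st_two are just illustrative):

```sql
-- Schema-qualified: each plan is pinned to a specific table,
-- regardless of the session's current search_path.
PREPARE st_one( VARCHAR(20) ) AS SELECT * FROM one.test WHERE item = $1;
PREPARE st_two( VARCHAR(20) ) AS SELECT * FROM two.test WHERE item = $1;

SET search_path TO two;
EXECUTE st_one( 'one' );  -- still reads one.test
EXECUTE st_two( 'two' );  -- reads two.test
```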
Vlad wrote:
i.e. the following perl code won't work correctly with DBD::Pg 1.40+
$dbh->do("SET search_path TO one");
my $sth1 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth1->execute("one");
$dbh->do("set search_path to two");
my $sth2 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth2->execute("two");
in the last call $sth1 prepared query will be actually executed, i.e.
"one.test" table used, not "two.test" as a programmer would expect!
Correctness seems to be in the eye of the beholder.
It does what I as a programmer would expect. The behaviour you
previously saw was an unfortunate byproduct of the fact that up to now
DBD::Pg has emulated proper prepared statements, whereas now it uses
them for real. Any application that relies on that broken byproduct is
simply erroneous, IMNSHO.
If you really need this, then as previously discussed on list, there is
a way to turn off use of server-side prepared statements.
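For reference, a sketch of that switch, based on the pg_server_prepare attribute documented for DBD::Pg (the connection details here are placeholders):

```perl
use DBI;

# Hypothetical DSN and credentials, for illustration only.
my $dbh = DBI->connect("dbi:Pg:dbname=mydb", "user", "password");

# Fall back to client-side emulated prepares for the whole connection...
$dbh->{pg_server_prepare} = 0;

# ...or disable server-side prepare for one statement only:
my $sth = $dbh->prepare("SELECT * FROM test WHERE item = ?",
                        { pg_server_prepare => 0 });
```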
cheers
andrew
On 5/2/05, Neil Conway <neilc@samurai.com> wrote:
I'm not sure I quite follow you -- in some future version of the backend
in which prepared queries are invalidated, this would be invisible to
the client. The client wouldn't need to explicitly check for the
"liveness" of the prepared query, they could just execute it -- if
necessary, the backend will re-plan the query before executing it.
as I understood Tom's message, he's advising the DBD::Pg driver not to
rely on the fact that an earlier prepared query is still valid. I don't
actually care about the cases where the DB structure has been changed
and postgres invalidated prepares because of that.
--
Vlad
Andrew Dunstan wrote:
Vlad wrote:
i.e. the following perl code won't work correctly with DBD::Pg 1.40+
$dbh->do("SET search_path TO one");
my $sth1 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth1->execute("one");
$dbh->do("set search_path to two");
my $sth2 = $dbh->prepare_cached("SELECT * FROM test WHERE item = ?");
$sth2->execute("two");
in the last call $sth1 prepared query will be actually executed, i.e.
"one.test" table used, not "two.test" as a programmer would expect!
Correctness seems to be in the eye of the beholder.
It does what I as a programmer would expect. The behaviour you
previously saw was an unfortunate byproduct of the fact that up to now
DBD::Pg has emulated proper prepared statements, whereas now it uses
them for real. Any application that relies on that broken byproduct is
simply erroneous, IMNSHO.
If you really need this, then as previously discussed on list, there
is a way to turn off use of server-side prepared statements.
Oops. I missed that the code used prepare_cached() rather than just
prepare().
I am not sure this is reasonably fixable. Invalidating the cache is not
a pleasant solution - the query might not be affected by the change in
search path at all. I'd be inclined to say that this is just a
limitation of prepare_cached() which should be documented.
cheers
andrew
On May 1, 2005, at 22:44 , Tom Lane wrote:
I am not claiming that the backend handles all these cases nicely
today: it certainly doesn't. But we understand in principle how
to fix these problems by invalidating plans inside the backend.
I don't see how the DBD::Pg driver can hope to deal with any of
these situations :-(
It can't. So if you need to be able to switch schemas or do any of
the evil(tm) things Tom suggests, then I recommend that you use
prepare() instead of prepare_cached(). Or do the caching yourself.
Regards,
David
On May 1, 2005, at 21:30 , Neil Conway wrote:
An alternative would be to flush dependent plans when the schema
search path is changed. In effect this would mean flushing *all*
prepared plans whenever the search path changes: we could perhaps
keep plans that only contain explicit namespace references, but
that seems fragile.
Yes, but this would be invisible to DBD::Pg and other clients, no?
Regards,
David