PL/pgSQL bug?
It seems that sometimes uncommitted data (dirty data?) could be seen
in PL/pgSQL function.
Below is a sample script to reproduce the problem: If you execute
"SELECT myftest(1)" concurrently, you will see the subselect in the
SELECT INTO... will produce:
ERROR: More than one tuple returned by a subselect used as an expression.
This is odd, since the coulum i is a primary key, and should never has
duplicate values.
If you comment out the SELECT INTO... statement, you could see a line
something like:
NOTICE: ctid (0,5) xmin 645188 xmax 645190 cmin 2 cmax 2
This is odd too, since xmax > 0 or cmax > 0 should never happen with
visible tuples, in my understanding.
I see these in 7.0.3, 7.1.2 and current.
--
Tatsuo Ishii
----------------------------------------------------------------------
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
DECLARE myid INT;
DECLARE rec RECORD;
key ALIAS FOR $1;
BEGIN
UPDATE t1 SET i = 1 WHERE i = 1;
SELECT INTO tid,myid ctid,i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
FOR rec IN SELECT ctid,xmin,xmax,cmin,cmax from t1 LOOP
RAISE NOTICE ''ctid % xmin % xmax % cmin % cmax %'', rec.ctid,rec.xmin,rec.xmax,rec.cmin,rec.cmax;
END LOOP;
RETURN 0;
END;
'
LANGUAGE 'plpgsql';
Tatsuo Ishii wrote:
It seems that sometimes uncommitted data (dirty data?) could be seen
in PL/pgSQL function.Below is a sample script to reproduce the problem: If you execute
"SELECT myftest(1)" concurrently, you will see the subselect in the
SELECT INTO... will produce:ERROR: More than one tuple returned by a subselect used as an expression.
This is odd, since the coulum i is a primary key, and should never has
duplicate values.If you comment out the SELECT INTO... statement, you could see a line
something like:NOTICE: ctid (0,5) xmin 645188 xmax 645190 cmin 2 cmax 2
This is odd too, since xmax > 0 or cmax > 0 should never happen with
visible tuples, in my understanding.I see these in 7.0.3, 7.1.2 and current.
If that's the case, it must be a general problem with SPI
that'll apply to any procedural language as well as user
defined C function using SPI.
When scans or functions are involved, PL/pgSQL uses SPI
functionality to evaluate the expression.
Jan
--
Tatsuo Ishii----------------------------------------------------------------------
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
DECLARE myid INT;
DECLARE rec RECORD;
key ALIAS FOR $1;
BEGIN
UPDATE t1 SET i = 1 WHERE i = 1;
SELECT INTO tid,myid ctid,i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
FOR rec IN SELECT ctid,xmin,xmax,cmin,cmax from t1 LOOP
RAISE NOTICE ''ctid % xmin % xmax % cmin % cmax %'', rec.ctid,rec.xmin,rec.xmax,rec.cmin,rec.cmax;
END LOOP;
RETURN 0;
END;
'
LANGUAGE 'plpgsql';---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
NOTICE: ctid (0,5) xmin 645188 xmax 645190 cmin 2 cmax 2
This is odd too, since xmax > 0 or cmax > 0 should never happen with
visible tuples, in my understanding.
That's what the docs presently say, but they're in error --- nonzero
xmax could represent a not-yet-committed deleting xact (or one that
did commit, but not in your snapshot); or it could be from a deleting
xact that rolled back.
I get
regression=# SELECT myftest(1);
NOTICE: Error occurred while executing PL/pgSQL function myftest
NOTICE: line 6 at SQL statement
ERROR: parser: parse error at or near "ctid"
regression=#
so there's something wrong with the function as posted.
regards, tom lane
Jan Wieck <JanWieck@Yahoo.com> writes:
If that's the case, it must be a general problem with SPI
that'll apply to any procedural language as well as user
defined C function using SPI.
Not necessarily. It looks to me like someone is forgetting to do a
CommandCounterIncrement() between plpgsql statements. Is this something
that plpgsql should do, or should SPI do it? Not clear.
regards, tom lane
I said:
Not necessarily. It looks to me like someone is forgetting to do a
CommandCounterIncrement() between plpgsql statements.
It's worse than that: someone is caching an out-of-date command counter
value.
Load the attached variant of Tatsuo's script, and then do this:
regression=# SELECT myftest(1);
NOTICE: i 1 ctid (0,30) xmin 5687 xmax 0 cmin 2 cmax 0
NOTICE: i 2 ctid (0,31) xmin 5687 xmax 0 cmin 4 cmax 0
myftest
---------
0
(1 row)
regression=# SELECT myftest(1);
NOTICE: i 1 ctid (0,32) xmin 5688 xmax 0 cmin 1 cmax 0
myftest
---------
0
(1 row)
regression=#
Neat eh? What happened to the i=2 line? If you start a fresh backend,
the first execution of the function works.
regards, tom lane
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
insert into t1 values(1);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
DECLARE myid INT;
DECLARE rec RECORD;
key ALIAS FOR $1;
BEGIN
UPDATE t1 SET i = 1 WHERE i = 1;
INSERT INTO t1 VALUES (2);
FOR rec IN SELECT i,ctid,xmin,xmax,cmin,cmax from t1 LOOP
RAISE NOTICE ''i % ctid % xmin % xmax % cmin % cmax %'', rec.i,rec.ctid,rec.xmin,rec.xmax,rec.cmin,rec.cmax;
END LOOP;
SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
DELETE FROM t1 WHERE i = 2;
RETURN 0;
END;
'
LANGUAGE 'plpgsql';
Okay, I understand Tatsuo's original complaint, and I don't think it's
a bug exactly --- it's MVCC/Read Committed operating as designed. Using
the variant script I just posted and two *freshly started* backends, do:
Backend 1:
regression=# begin;
BEGIN
regression=# SELECT myftest(1);
NOTICE: i 1 ctid (0,42) xmin 5701 xmax 0 cmin 3 cmax 0
NOTICE: i 2 ctid (0,43) xmin 5701 xmax 0 cmin 5 cmax 0
myftest
---------
0
(1 row)
Backend 2:
regression=# SELECT myftest(1);
[ backend 2 hangs; now go back and commit backend 1 ]
NOTICE: i 1 ctid (0,40) xmin 5696 xmax 5701 cmin 1 cmax 3
NOTICE: i 1 ctid (0,44) xmin 5702 xmax 0 cmin 2 cmax 0
NOTICE: i 2 ctid (0,45) xmin 5702 xmax 0 cmin 4 cmax 0
NOTICE: Error occurred while executing PL/pgSQL function myftest
NOTICE: line 10 at select into variables
ERROR: More than one tuple returned by a subselect used as an expression.
regression=#
The second backend finds that it wants to update the same row backend 1
did, so it waits to see if 1 commits. After the commit, it decides it
can do the update. Now, what will we see later in that same
transaction? SELECT will consider the original row (ctid 40, here)
to be still good --- it was deleted, sure enough, but by a transaction
that has not committed as far as the current transaction is concerned.
And the row inserted earlier in our own transaction is good too. So
you see two rows with i=1. The only way to avoid this is to use
Serializable mode, which would mean that backend 2 would've gotten an
error on its UPDATE.
However, if you do the same experiment a second time in the same
backends, you get different results. This I think is a SPI bug:
SPI is doing CommandCounterIncrements at bizarre times, and in
particular you get fewer CommandCounterIncrements while planning
and executing a plpgsql function than you do while re-executing
an already-planned one. Not sure yet exactly how it should be
changed.
regards, tom lane
I said:
SPI is doing CommandCounterIncrements at bizarre times, and in
particular you get fewer CommandCounterIncrements while planning
and executing a plpgsql function than you do while re-executing
an already-planned one.
s/fewer/more/ ... guess I'm not fully awake yet ... but anyway,
SPI's handling of CommandCounterIncrement is certainly broken.
Particularly for cursors --- a CCI for every FETCH will not do,
you want the whole scan to be run with the same commandId.
regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
NOTICE: ctid (0,5) xmin 645188 xmax 645190 cmin 2 cmax 2
This is odd too, since xmax > 0 or cmax > 0 should never happen with
visible tuples, in my understanding.That's what the docs presently say, but they're in error --- nonzero
xmax could represent a not-yet-committed deleting xact (or one that
did commit, but not in your snapshot); or it could be from a deleting
xact that rolled back.I get
regression=# SELECT myftest(1);
NOTICE: Error occurred while executing PL/pgSQL function myftest
NOTICE: line 6 at SQL statement
ERROR: parser: parse error at or near "ctid"
regression=#so there's something wrong with the function as posted.
regards, tom lane
Sorry, please try new one attatched below.
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
INSERT INTO t1 VALUES(1);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
DECLARE myid INT;
DECLARE rec RECORD;
key ALIAS FOR $1;
BEGIN
UPDATE t1 SET i = 1 WHERE i = 1;
SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
FOR rec IN SELECT ctid,xmin,xmax,cmin,cmax from t1 LOOP
RAISE NOTICE ''ctid % xmin % xmax % cmin % cmax %'', rec.ctid,rec.xmin,rec.xmax,rec.cmin,rec.cmax;
END LOOP;
RETURN 0;
END;
'
LANGUAGE 'plpgsql';
Okay, I understand Tatsuo's original complaint, and I don't think it's
a bug exactly --- it's MVCC/Read Committed operating as designed. Using
the variant script I just posted and two *freshly started* backends, do:
I don't think so. Seems the problem is specific to PL/pgSQL (or SPI?).
The second backend finds that it wants to update the same row backend 1
did, so it waits to see if 1 commits. After the commit, it decides it
can do the update. Now, what will we see later in that same
transaction? SELECT will consider the original row (ctid 40, here)
to be still good --- it was deleted, sure enough, but by a transaction
that has not committed as far as the current transaction is concerned.
And the row inserted earlier in our own transaction is good too. So
you see two rows with i=1. The only way to avoid this is to use
Serializable mode, which would mean that backend 2 would've gotten an
error on its UPDATE.
If your theory is like that, I could see same effect without using
PL/pgSQL. But I see following in a session using psql (original row's
ctid = (0,2))
[T1] begin;
[T2] begin;
[T1] update t1 set i = 1 where i = 1;
[T2] update t1 set i = 1 where i = 1; <-- waiting for T1 committed/aborted
[T1] end;
[T2] select ctid, i from t1;
test=# select ctid,i from t1;
ctid | i
-------+---
(0,4) | 1
(1 row)
So I only see one row from the last select in T2?
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
If your theory is like that, I could see same effect without using
PL/pgSQL. But I see following in a session using psql (original row's
ctid = (0,2))
[T1] begin;
[T2] begin;
[T1] update t1 set i = 1 where i = 1;
[T2] update t1 set i = 1 where i = 1; <-- waiting for T1 committed/aborted
[T1] end;
[T2] select ctid, i from t1;
test=# select ctid,i from t1;
ctid | i
-------+---
(0,4) | 1
(1 row)
So I only see one row from the last select in T2?
I believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.
One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call? I doubt it.
The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction. This could perhaps be defended on logical grounds,
but considering your complaint I'm not sure it would make people
happier.
regards, tom lane
I believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.
Oh I see. So the "problem" is not specific to PL/pgSQL, but exists in
all our procedual languages.
One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call? I doubt it.The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction. This could perhaps be defended on logical grounds,
but considering your complaint I'm not sure it would make people
happier.
Ok, maybe another workaround might be adding a checking for cmax in
the subselect:
SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
to make sure that cmax > 0?
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
Ok, maybe another workaround might be adding a checking for cmax in
the subselect:
SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
to make sure that cmax > 0?
Huh? How would that help?
regards, tom lane
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
Ok, maybe another workaround might be adding a checking for cmax in
the subselect:SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
to make sure that cmax > 0?
Huh? How would that help?
According to the doc, tuples with cmax > 0 should not be visible to
the current transaction, no?
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
According to the doc, tuples with cmax > 0 should not be visible to
the current transaction, no?
The docs are wrong --- my mistake originally, and in fact I just fixed
it in current sources. cmax != 0 only indicates that someone tried to
delete the tuple; not that the someone ever committed, much less that
their commit should be visible to you under MVCC rules. (Also, I
believe the command counter starts at 0, so this test would only catch
deletes that weren't the first command in their transaction, anyway.
Testing xmax != 0 would avoid that issue, but not the fundamental
problem of commit status.)
regards, tom lane
-----Original Message-----
From: Tom LaneI believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call?
Yes I do.
I doubt it.
The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction.
What's different from SERIALIZABLE mode ?
regards,
Hiroshi Inoue
One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call?Yes I do.
Me too. Current behavior of procedural languages seem hard to
understand for users.
BTW, why must we restore the previous QuerySnapshot? We already break
the rule (if it's a rule). For example, COPY TO calls SetQuerySnapshot
(see tcop/utility.c). So, below produces "ERROR: More than one tuple
returned by a subselect used as an expression":
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
UPDATE t1 SET i = 1 WHERE i = 1;
SELECT i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
'
LANGUAGE 'sql';
while below does not throw an error:
DROP TABLE t1;
CREATE TABLE t1 (i INT PRIMARY KEY);
DROP FUNCTION myftest(INT);
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
UPDATE t1 SET i = 1 WHERE i = 1;
COPY t1 TO ''/tmp/t1.data'';
SELECT i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
'
LANGUAGE 'sql';
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
BTW, why must we restore the previous QuerySnapshot? We already break
the rule (if it's a rule).
No we don't. There are no SetQuerySnapshot calls occuring *within*
a query. An example of why that would be unacceptable: consider
select myfunc(f1) from table where f2 = 42;
Suppose executing myfunc() causes an unrestored SetQuerySnapshot call.
Then, if other transactions are busy changing f2 values, the set of
rows that this query returns could depend on the order in which rows
are visited --- since whether it thinks a row with f2 = 42 is visible
might depend on whether any previous rows had been matched (causing
myfunc() to be evaluated). For that matter, it could depend on the
form of the query plan used --- in some plans, myfunc() might be called
while the scan is in progress, in others not till afterward.
For example, COPY TO calls SetQuerySnapshot
(see tcop/utility.c).
That's just because postgres.c doesn't do it automatically for utility
statements. There's still just one SetQuerySnapshot per query.
regards, tom lane
No we don't. There are no SetQuerySnapshot calls occuring *within*
a query. An example of why that would be unacceptable: considerselect myfunc(f1) from table where f2 = 42;
Suppose executing myfunc() causes an unrestored SetQuerySnapshot call.
Then, if other transactions are busy changing f2 values, the set of
rows that this query returns could depend on the order in which rows
are visited --- since whether it thinks a row with f2 = 42 is visible
might depend on whether any previous rows had been matched (causing
myfunc() to be evaluated). For that matter, it could depend on the
form of the query plan used --- in some plans, myfunc() might be called
while the scan is in progress, in others not till afterward.
If so, FROM clause-less SELECT (select myfunc();) might be ok.
For example, COPY TO calls SetQuerySnapshot
(see tcop/utility.c).That's just because postgres.c doesn't do it automatically for utility
statements. There's still just one SetQuerySnapshot per query.
I'm confused. In my example:
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
UPDATE t1 SET i = 1 WHERE i = 1;
COPY t1 TO ''/tmp/t1.data'';
SELECT i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
'
LANGUAGE 'sql';
When COPY is invoked in the function, I thought SetQuerySnapshot is
called.
So "SELECT myftest(1);" would call SetQuerySnapshot twice, no?
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
CREATE FUNCTION myftest(INT)
RETURNS INT
AS '
UPDATE t1 SET i = 1 WHERE i = 1;
COPY t1 TO ''/tmp/t1.data'';
SELECT i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
'
LANGUAGE 'sql';
When COPY is invoked in the function, I thought SetQuerySnapshot is
called.
Hmm, I think you are right. This means that calling SetQuerySnapshot
in ProcessUtility is the *wrong* place to do it; or that there should
be additional logic to suppress the call in this context. IMHO, anyway.
regards, tom lane
Tatsuo Ishii wrote:
One way to make the results equivalent is to compute a new
QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call?Yes I do.
Me too. Current behavior of procedural languages seem hard to
understand for users.
Yes it's a siginificant point. I've referred to the
impropriety to use a unique snapshot thoughout a
function call when this kind of bug(?) was reported.
Who could take care of it in writing PL/pgSQL ?
BTW, why must we restore the previous QuerySnapshot?
For example, in the case such as
select .., some_func(item1), .. from a_table;
SELECT always uses the same snapshot for all its
internal fetch operations, so it seems reasonable
for each some_func() to be called in the same snapshot.
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.. We should be
careful to handle such functions which have side
effects. IMHO we shouldn't call such functions or
shouldn't expect consistent results with the use
of such funtions. OTOH
select some_func(..);
is a procedure call not a function call in reality.
There seems to be no necessity to restore the previous
QuerySnapshot when calling procedures and we could
call any function as a procedure.
regards,
Hiroshi Inoue
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.
You are confusing snapshots (which determine visibility of the results
of OTHER transactions) with command-counter incrementing (which
determines visibility of the results of OUR OWN transaction). I agree
that plpgsql's handling of command-counter changes is broken, but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.
regards, tom lane
Jan Wieck <JanWieck@yahoo.com> writes:
that plpgsql's handling of command-counter changes is broken, but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.
Why do you blame PL/pgSQL for that? I don't see a single
reference to the command counter from the PL/pgSQL sources.
All it does is using SPI. So does "using SPI" by itself count
as "boken"?
Sorry: SPI is broken, not plpgsql. Does that make you feel better?
regards, tom lane
Import Notes
Reply to msg id not found: 200108131411.f7DEBNm07263@jupiter.us.greatbridge.comReference msg id not found: 200108131411.f7DEBNm07263@jupiter.us.greatbridge.com | Resolved by subject fallback
Tom Lane wrote:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.You are confusing snapshots (which determine visibility of the results
of OTHER transactions) with command-counter incrementing (which
determines visibility of the results of OUR OWN transaction). I agree
that plpgsql's handling of command-counter changes is broken, but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.
Why do you blame PL/pgSQL for that? I don't see a single
reference to the command counter from the PL/pgSQL sources.
All it does is using SPI. So does "using SPI" by itself count
as "boken"?
If so, uh-oh, referential integrity is using SPI ...
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
Tom Lane wrote:
Jan Wieck <JanWieck@yahoo.com> writes:
that plpgsql's handling of command-counter changes is broken, but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.Why do you blame PL/pgSQL for that? I don't see a single
reference to the command counter from the PL/pgSQL sources.
All it does is using SPI. So does "using SPI" by itself count
as "boken"?Sorry: SPI is broken, not plpgsql. Does that make you feel better?
Not that it "makes my day". But it makes me feel better,
thanks.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
Tom Lane wrote:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.You are confusing
No.
snapshots (which determine visibility of the results
of OTHER transactions)
Yes.
with command-counter incrementing (which
determines visibility of the results of OUR OWN transaction).
Yes.
I agree
that plpgsql's handling of command-counter changes is broken,
Probably yes but
but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.
Should both command counter and snapshots be changed
properly ? Please explain me why/how we could do with
no snapshot change in read committed mode.
regards,
Hiroshi Inoue
Tom Lane wrote:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.You are confusing
No.
snapshots (which determine visibility of the results
of OTHER transactions)
Yes.
with command-counter incrementing (which
determines visibility of the results of OUR OWN transaction).
Yes.
I agree
that plpgsql's handling of command-counter changes is broken,
Probably yes but
but it
does not follow that sprinkling the code with SetQuerySnapshot is wise.
Should both command counter and snapshots be changed
properly ? Please explain me why/how we could do with
no snapshot change in read committed mode.
regards,
Hiroshi Inoue
Tom Lane wrote:
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
It's possible for a function to use a unique snapshot
if there are only SELECT statements in the function
but it's impossible if there are UPDATE/DELETE or
SELECT .. FOR UPDATE statements etc.You are confusing snapshots (which determine visibility of the results
of OTHER transactions)
Please note that the meaning of snapshots for statements
other than SELECT is different from that for SELECT.
For example,
1) The result of SELECT .. FOR UPDATE may be different
from that of SELECT for the same snapshot.
2) Once a snapshot given, the result of SELECT is dicisive
but that of SELECT .. FOR UPDATE isn't.
SELECT and SELECT .. FOR UPDATE are alike in appearance
but quite different in nature. There's no real snapshot
for SELECT .. FOR UPDATE in the current implementation.
I suggested the implementation with the real snapshot
(without the word snapshot though) once before 6.5.
The implementation seems hard and the possibility isn't
confirmed. Even though the implementation is possible,
it requires the repeated computation of snapshot until
the consisteny is satisfied, and so arbitrary snapshots
aren't allowed.
There's little meaning for SELECT statements and subsequent
other statements like UPDATE/DELETE/SELECT .. FOR UPDATE to
use the same snapshot in read committed mode.
There's no consistency with the current handling of snapshots
inside a function call.
regards,
Hiroshi Inoue
Tom Lane wrote:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
NOTICE: ctid (0,5) xmin 645188 xmax 645190 cmin 2 cmax 2
This is odd too, since xmax > 0 or cmax > 0 should never happen with
visible tuples, in my understanding.That's what the docs presently say, but they're in error --- nonzero
xmax could represent a not-yet-committed deleting xact (or one that
did commit, but not in your snapshot); or it could be from a deleting
xact that rolled back.
or it can come from referential integrity triggers:
hannu=# create table parent(parid integer primary key);
NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index
'parent_pkey' for table 'parent'
CREATE
hannu=# create table child(cldid integer references parent on update
cascade);
NOTICE: CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
check(s)
CREATE
hannu=# insert into parent values(1);
INSERT 20652 1
hannu=# insert into child values(1);
INSERT 20653 1
hannu=# update parent set parid=2;
UPDATE 1
hannu=# select xmin,xmax,cmin,cmax,parid from parent;
xmin | xmax | cmin | cmax | parid
------+------+------+------+-------
731 | 731 | 0 | 4 | 2
(1 row)
Now I have a question: if xmax is not used in determining tuple
visibility
(as I had assumed earlier) then what is ? How does postgres decide that
a
tuple is deleted ?
--------------------
Hannu
Hannu Krosing <hannu@tm.ee> writes:
Tom Lane wrote:
That's what the docs presently say, but they're in error --- nonzero
xmax could represent a not-yet-committed deleting xact (or one that
did commit, but not in your snapshot); or it could be from a deleting
xact that rolled back.
or it can come from referential integrity triggers:
Mmm, yeah, SELECT FOR UPDATE uses xmax to record the identity of a
transaction that has a row locked for update. In this case the xact
hasn't actually deleted the old row yet (and may never do so), but xmax
is set as though it has.
Now I have a question: if xmax is not used in determining tuple
visibility (as I had assumed earlier) then what is ?
There are additional status bits in each tuple (t_infomask) that
distinguish these various situations. The xmax field alone doesn't
tell you much, since you can't interpret it without context.
I'm not sure why we bother to make xmin/xmax/etc visible to
applications. They're really of no value to an app AFAICS.
regards, tom lane
Tom Lane wrote:
Hannu Krosing <hannu@tm.ee> writes:
Tom Lane wrote:
That's what the docs presently say, but they're in error --- nonzero
xmax could represent a not-yet-committed deleting xact (or one that
did commit, but not in your snapshot); or it could be from a deleting
xact that rolled back.or it can come from referential integrity triggers:
Mmm, yeah, SELECT FOR UPDATE uses xmax to record the identity of a
transaction that has a row locked for update. In this case the xact
hasn't actually deleted the old row yet (and may never do so), but xmax
is set as though it has.Now I have a question: if xmax is not used in determining tuple
visibility (as I had assumed earlier) then what is ?There are additional status bits in each tuple (t_infomask) that
distinguish these various situations. The xmax field alone doesn't
tell you much, since you can't interpret it without context.
As I understood it it should tell the trx id that invalidated this
tuple, no ?
If you must write t_infomask in the tuple anyhow, then why not clean up
xmax
on abort ?
I'm not sure why we bother to make xmin/xmax/etc visible to
applications. They're really of no value to an app AFAICS.
I guess they used to be of value at the time when time travel was
possible
and people did use xmax for documented purposes, i.e. recording tuple's
lifetime
and not for "other" stuff, especially without cleaning up after trx
abort ;)
I agree that they are losing their utility as we are moving away from
the
original notion of transaction ids (and oids) as something permanent
that could
be used for time travel or system auditing and recommending peole who
need such
features to reimplement those at application level, with triggers and
explicitly
defined fields.
------------------
Hannu
Is this something that still needs fixing?
I believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.Oh I see. So the "problem" is not specific to PL/pgSQL, but exists in
all our procedual languages.One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call? I doubt it.The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction. This could perhaps be defended on logical grounds,
but considering your complaint I'm not sure it would make people
happier.Ok, maybe another workaround might be adding a checking for cmax in
the subselect:SELECT INTO myid i FROM t1 WHERE i = (SELECT i FROM t1 WHERE i = 1);
to make sure that cmax > 0?
--
Tatsuo Ishii---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I am not sure if there is a TODO item here, but if there is, please let
me know. Thanks.
-----Original Message-----
From: Tom LaneI believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.One way to make the results equivalent is to compute a new QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call?Yes I do.
I doubt it.
The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction.What's different from SERIALIZABLE mode ?
regards,
Hiroshi Inoue---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]I am not sure if there is a TODO item here, but if there is, please let
me know. Thanks.
There seems to be no consensus on this item currently.
IMHO both the command counters and the snapshots
in a function should advance except the leading SELECT
statements.
Note that SELECT .. FOR UPDATE statements aren't
SELECT statements.
regards,
Hiroshi Inoue
Show quoted text
-----Original Message-----
From: Tom LaneI believe the reason for this is that in Read Committed mode,
each separate query from the client computes a new snapshot (see
SetQuerySnapshot calls in postgres.c). So, when your
"select ctid, i from t1" query executes, it computes a snapshot
that says T1 is committed, and then it doesn't see the row left
over from T1. On the other hand, your plpgsql function operates
inside a single client query and so it's using just one QuerySnaphot.One way to make the results equivalent is to compute a new
QuerySnapshot
for each SPI query. Quite aside from the cost of doing so, I do not
think it makes sense, considering that the previous QuerySnapshot must
be restored when we return from the function. Do we really want
functions to see transaction status different from what's seen outside
the function call?Yes I do.
I doubt it.
The other way to make the results the same is to omit the
SetQuerySnapshot calls for successive client-issued queries in one
transaction.What's different from SERIALIZABLE mode ?
regards,
Hiroshi Inoue---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026