Postgres 7.2 - Updating rows in cursor problem

Started by Vaclav Kulakovskyalmost 24 years ago8 messages
#1Vaclav Kulakovsky
vaclav.kulakovsky@definity.cz

general description:
--------------------

I've a problem in PG 7.2. If you update rows which are included in plpgsql
RECORD , updated rows are again added to the RECORD, so you will get into
infinite loop.

details:
--------

- 7.2 problem
- problem in PLPGSQL
- problem arise only if count in RECORD exceed some number of rows
- problem arise only if RECORD is not sorted

question:
---------
Is it feature or bug ?

example:
--------
If you execute this script, you will enter to infinite loop. If you add
"order by a" it will work fine.

------------<< cut here << ---------------------
create sequence tmp_seq;

create table tmp_test ( a int4 default nextval('tmp_seq'), b int4 );

insert into tmp_test (b) values( 1 );
insert into tmp_test (b) values( 1 );
insert into tmp_test (b) values( 1 );
insert into tmp_test (b) select b from tmp_test;
insert into tmp_test (b) select b from tmp_test;
insert into tmp_test (b) select b from tmp_test;
insert into tmp_test (b) select b from tmp_test;
insert into tmp_test (b) select b from tmp_test;
select count(*) from tmp_test;

drop function ftmp_test();
create function ftmp_test() RETURNS varchar AS'
DECLARE
_grp varchar;

sql varchar;
R RECORD;
i integer;
BEGIN

i = 0;
FOR R IN select * from tmp_test
LOOP
i = i + 1;
if i%100 = 0 then
raise notice ''% - val: %'', i, R.a;
end if;
UPDATE tmp_test SET a=1000 WHERE a = R.a;
END LOOP;
RETURN '''';
END;' LANGUAGE 'plpgsql';

select ftmp_test();

drop table tmp_test;
------------<< cut here << --------------------

Vaclav Kulakovsky
DEFINITY Systems, s.r.o.
Tyrsova 2071
25601 Benesov
Czech Republic
Tel: +420 301 727975,724456
Fax: +420 301 724456
vaclav.kulakovsky@definity.cz
http://www.definity.cz

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vaclav Kulakovsky (#1)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

Vaclav Kulakovsky <vaclav.kulakovsky@definity.cz> writes:

I've a problem in PG 7.2. If you update rows which are included in plpgsql
RECORD , updated rows are again added to the RECORD, so you will get into
infinite loop.

This is a bug in plgsql, or more precisely in SPI, I think. The FOR
statement needs to restore its initial value of scanCommandId each time
it resumes execution of the SELECT. Seems like that should be done down
inside SPI. Comments?

regards, tom lane

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

I wrote:

This is a bug in plgsql, or more precisely in SPI, I think. The FOR
statement needs to restore its initial value of scanCommandId each time
it resumes execution of the SELECT. Seems like that should be done down
inside SPI. Comments?

More specifically, the problem is that plpgsql's FOR-over-a-select now
depends on a SPI cursor, and both SPI cursors and regular cursors are
broken in this regard. Observe the following misbehavior with a plain
cursor:

regression=# select * from foo;
f1 | f2
----+----
1 | 1
2 | 2
3 | 3
(3 rows)

regression=# begin;
BEGIN
regression=# declare c cursor for select * from foo;
SELECT
regression=# fetch 2 from c;
f1 | f2
----+----
1 | 1
2 | 2
(2 rows)

regression=# update foo set f2 = f2 + 1;
UPDATE 3
regression=# fetch all from c;
f1 | f2
----+----
1 | 2
2 | 3
3 | 4
(3 rows)

IMHO the cursor should not be able to see the rows inserted by the
subsequent UPDATE. (Certainly it should not return the updated versions
of rows it's already returned.) The SQL spec says that cursors declared
INSENSITIVE shall not observe changes made after they are opened --- and
it gives the implementation the option to make all cursors behave that
way. I think we should choose to do so.

I believe the correct fix for this is that Portal objects should store
the scanCommandId that was current when they were created, and restore
this scanCommandId whenever they are asked to run their plan. Comments?

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vaclav Kulakovsky (#1)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

Vaclav Kulakovsky <vaclav.kulakovsky@definity.cz> writes:

I've a problem in PG 7.2. If you update rows which are included in plpgsql
RECORD , updated rows are again added to the RECORD, so you will get into
infinite loop.

The attached patch against 7.2 appears to fix this problem, as well as
the cursor misbehavior that I exhibited in my followup on pghackers.

If I don't hear any complaints I will commit this soon.

regards, tom lane

*** src/backend/commands/command.c.orig	Thu Jan  3 18:19:30 2002
--- src/backend/commands/command.c	Wed Feb 13 19:24:00 2002
***************
*** 103,108 ****
--- 103,109 ----
  	QueryDesc  *queryDesc;
  	EState	   *estate;
  	MemoryContext oldcontext;
+ 	CommandId	savedId;
  	bool		temp_desc = false;

/*
***************
*** 156,162 ****
}

  	/*
! 	 * tell the destination to prepare to receive some tuples.
  	 */
  	BeginCommand(name,
  				 queryDesc->operation,
--- 157,163 ----
  	}
  	/*
! 	 * Tell the destination to prepare to receive some tuples.
  	 */
  	BeginCommand(name,
  				 queryDesc->operation,
***************
*** 169,174 ****
--- 170,183 ----
  				 queryDesc->dest);
  	/*
+ 	 * Restore the scanCommandId that was current when the cursor was
+ 	 * opened.  This ensures that we see the same tuples throughout the
+ 	 * execution of the cursor.
+ 	 */
+ 	savedId = GetScanCommandId();
+ 	SetScanCommandId(PortalGetCommandId(portal));
+ 
+ 	/*
  	 * Determine which direction to go in, and check to see if we're
  	 * already at the end of the available tuples in that direction.  If
  	 * so, do nothing.	(This check exists because not all plan node types
***************
*** 213,218 ****
--- 222,232 ----
  				portal->atStart = true; /* we retrieved 'em all */
  		}
  	}
+ 
+ 	/*
+ 	 * Restore outer command ID.
+ 	 */
+ 	SetScanCommandId(savedId);
  	/*
  	 * Clean up and switch back to old context.
*** src/backend/executor/spi.c.orig	Thu Jan  3 15:30:47 2002
--- src/backend/executor/spi.c	Wed Feb 13 19:23:55 2002
***************
*** 740,748 ****
  	_SPI_current->processed = 0;
  	_SPI_current->tuptable = NULL;
- 	/* Make up a portal name if none given */
  	if (name == NULL)
  	{
  		for (;;)
  		{
  			unnamed_portal_count++;
--- 740,748 ----
  	_SPI_current->processed = 0;
  	_SPI_current->tuptable = NULL;

if (name == NULL)
{
+ /* Make up a portal name if none given */
for (;;)
{
unnamed_portal_count++;
***************
*** 755,765 ****

name = portalname;
}
!
! /* Ensure the portal doesn't exist already */
! portal = GetPortalByName(name);
! if (portal != NULL)
! elog(ERROR, "cursor \"%s\" already in use", name);

  	/* Create the portal */
  	portal = CreatePortal(name);
--- 755,767 ----

name = portalname;
}
! else
! {
! /* Ensure the portal doesn't exist already */
! portal = GetPortalByName(name);
! if (portal != NULL)
! elog(ERROR, "cursor \"%s\" already in use", name);
! }

  	/* Create the portal */
  	portal = CreatePortal(name);
***************
*** 1228,1233 ****
--- 1230,1236 ----
  	QueryDesc  *querydesc;
  	EState	   *estate;
  	MemoryContext oldcontext;
+ 	CommandId	savedId;
  	CommandDest olddest;
  	/* Check that the portal is valid */
***************
*** 1245,1250 ****
--- 1248,1254 ----

/* Switch to the portals memory context */
oldcontext = MemoryContextSwitchTo(PortalGetHeapMemory(portal));
+
querydesc = PortalGetQueryDesc(portal);
estate = PortalGetState(portal);

***************
*** 1253,1258 ****
--- 1257,1270 ----
  	olddest = querydesc->dest;
  	querydesc->dest = dest;
+ 	/*
+ 	 * Restore the scanCommandId that was current when the cursor was
+ 	 * opened.  This ensures that we see the same tuples throughout the
+ 	 * execution of the cursor.
+ 	 */
+ 	savedId = GetScanCommandId();
+ 	SetScanCommandId(PortalGetCommandId(portal));
+ 
  	/* Run the executor like PerformPortalFetch and remember states */
  	if (forward)
  	{
***************
*** 1278,1283 ****
--- 1290,1300 ----
  				portal->atStart = true;
  		}
  	}
+ 
+ 	/*
+ 	 * Restore outer command ID.
+ 	 */
+ 	SetScanCommandId(savedId);
  	/* Restore the old command destination and switch back to callers */
  	/* memory context */
*** src/backend/utils/mmgr/portalmem.c.orig	Thu Oct 25 01:49:51 2001
--- src/backend/utils/mmgr/portalmem.c	Wed Feb 13 19:23:48 2002
***************
*** 168,173 ****
--- 168,174 ----
  	portal->queryDesc = queryDesc;
  	portal->attinfo = attinfo;
+ 	portal->commandId = GetScanCommandId();
  	portal->state = state;
  	portal->atStart = true;		/* Allow fetch forward only */
  	portal->atEnd = false;
***************
*** 213,218 ****
--- 214,220 ----
  	/* initialize portal query */
  	portal->queryDesc = NULL;
  	portal->attinfo = NULL;
+ 	portal->commandId = 0;
  	portal->state = NULL;
  	portal->atStart = true;		/* disallow fetches until query is set */
  	portal->atEnd = true;
*** src/include/utils/portal.h.orig	Mon Nov  5 14:44:35 2001
--- src/include/utils/portal.h	Wed Feb 13 19:23:42 2002
***************
*** 31,36 ****
--- 31,37 ----
  	MemoryContext heap;			/* subsidiary memory */
  	QueryDesc  *queryDesc;		/* Info about query associated with portal */
  	TupleDesc	attinfo;
+ 	CommandId	commandId;		/* Command counter value for query */
  	EState	   *state;			/* Execution state of query */
  	bool		atStart;		/* T => fetch backwards is not allowed */
  	bool		atEnd;			/* T => fetch forwards is not allowed */
***************
*** 48,55 ****
   */
  #define PortalGetQueryDesc(portal)	((portal)->queryDesc)
  #define PortalGetTupleDesc(portal)	((portal)->attinfo)
! #define PortalGetState(portal)	((portal)->state)
! #define PortalGetHeapMemory(portal)  ((portal)->heap)
  /*
   * estimate of the maximum number of open portals a user would have,
--- 49,57 ----
   */
  #define PortalGetQueryDesc(portal)	((portal)->queryDesc)
  #define PortalGetTupleDesc(portal)	((portal)->attinfo)
! #define PortalGetCommandId(portal)	((portal)->commandId)
! #define PortalGetState(portal)		((portal)->state)
! #define PortalGetHeapMemory(portal)	((portal)->heap)

/*
* estimate of the maximum number of open portals a user would have,

#5Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#3)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

-----Original Message-----
From: Tom Lane

I wrote:

This is a bug in plgsql, or more precisely in SPI, I think. The FOR
statement needs to restore its initial value of scanCommandId each time
it resumes execution of the SELECT. Seems like that should be done down
inside SPI. Comments?

More specifically, the problem is that plpgsql's FOR-over-a-select now
depends on a SPI cursor, and both SPI cursors and regular cursors are
broken in this regard. Observe the following misbehavior with a plain
cursor:

This is a known issue. We should implement INSENSITIVE cursors
to avoid this behavior. The keyword INSENSITIVE is there but isn't
used long. I plan to implement this feature as the first step toward
cross transaction cursors. Saving the xid and commandid in the
portal or snapshot and restoring them at fetch(move) time would
solve it.

regards,
Hiroshi Inoue

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#5)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

This is a known issue. We should implement INSENSITIVE cursors
to avoid this behavior. The keyword INSENSITIVE is there but isn't
used long. I plan to implement this feature as the first step toward
cross transaction cursors. Saving the xid and commandid in the
portal or snapshot and restoring them at fetch(move) time would
solve it.

For the moment I've arranged to save commandId in portals. (xid isn't
needed since we don't have cross-transaction portals ... yet)

It occurs to me though that scanCommandId should not be part of the
xact.c global status at all. It should be stored in heapscan and
indexscan state structs, instead. I have been thinking about trying
to clean up the API for heapscans and indexscans, and maybe I'll see
if that can be done as part of that work.

regards, tom lane

#7Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#6)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

Bruce Momjian wrote:

More specifically, the problem is that plpgsql's FOR-over-a-select now
depends on a SPI cursor, and both SPI cursors and regular cursors are
broken in this regard. Observe the following misbehavior with a plain
cursor:

This is a known issue. We should implement INSENSITIVE cursors
to avoid this behavior. The keyword INSENSITIVE is there but isn't
used long. I plan to implement this feature as the first step toward
cross transaction cursors. Saving the xid and commandid in the
portal or snapshot and restoring them at fetch(move) time would
solve it.

If I read the CVS logs correctly, I think Tom just fixed it.

Oh I see it now.
I'm not sure if it's good that all cursurs are INSENSITIVE.
Now PostgreSQL is the same as Oracle at the point. Though
there's even a SENSITIVE keyword in SQL statndard, INSENSITIVE
may be what MVCC requires.

regards,
Hiroshi Inoue

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#7)
Re: [GENERAL] Postgres 7.2 - Updating rows in cursor problem

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

I'm not sure if it's good that all cursurs are INSENSITIVE.

Perhaps not, but I think implementing a reasonable non-insensitive
behavior under MVCC will be difficult. Do you see a way to do it?

regards, tom lane