Re: [HACKERS] v6.5 release ToDo

Started by Bruce Momjianalmost 27 years ago11 messageshackers
Jump to latest
#1Bruce Momjian
bruce@momjian.us

Hey,
Got a couple of problems that could hopefully be fixed before 6.5 gets
released. This is straight from CVS april 5th.

Prob #1:
DROP TABLE <table> doesn't removed "extended" files.
i.e., if you have 2GB db on Linux(i386), it will only delete the main
file and not the .1, .2, etc table files.

Added to TODO list.

Prob #2:
While running postmaster at the command line like this:
/home/postgres/bin/postmaster -B 1024 -D/home/postgres/data -d 9 -o "-S
4096 -s -d 9 -A"

the current backend for the following CREATE TABLE would
die(consistently):
CREATE TABLE oletest (
id serial,
number int,
string varchar(255)
);

Not sure how to address this.

This was the only query it died on however. I have made huge indexes and
regular selects while running it the same way. There was only one
postgres backend running at the time.

Thanks,
Ole Gjerde

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2Ole Gjerde
gjerde@icebox.org
In reply to: Bruce Momjian (#1)

Prob #1:
DROP TABLE <table> doesn't removed "extended" files.
i.e., if you have 2GB db on Linux(i386), it will only delete the main
file and not the .1, .2, etc table files.

I have looked at this.
I made it so it rolled over files at 1MB. My table ended up with 120
segments, and my indexes had 3(Yes, it DOES work!).
DROP TABLE removed ALL segments from the table, but only the main index
segment.

So it looks like removing the table itself is using mdunlink in md.c,
while removing indexes uses FileNameUnlink() which only unlinks 1 file.
As far as I can tell, calling FileNameUnlink() and mdunlink() is basically
the same, except mdunlink() deletes any extra segments.

I've done some testing and it seems to work. It also passes regression
tests(except float8, geometry and rules, but that's normal).

If this patch is right, this fixes all known multi-segment problems on
Linux.

Ole Gjerde

Patch for index drop:
--- src/backend/catalog/index.c	1999/05/10 00:44:55	1.71
+++ src/backend/catalog/index.c	1999/05/15 06:42:27
@@ -1187,7 +1187,7 @@
 	 */
 	ReleaseRelationBuffers(userindexRelation);
-	if (FileNameUnlink(relpath(userindexRelation->rd_rel->relname.data)) < 0)
+	if (mdunlink(userindexRelation) != SM_SUCCESS)
 		elog(ERROR, "amdestroyr: unlink: %m");

index_close(userindexRelation);

#3Bruce Momjian
bruce@momjian.us
In reply to: Ole Gjerde (#2)

Applied. You have the correct patch. All other references to
FileNameUnlink look correct, but the index.c one is clearly wrong.
Thanks.

I believe we still have vacuuming of multi-segment tables as a problem.

Prob #1:
DROP TABLE <table> doesn't removed "extended" files.
i.e., if you have 2GB db on Linux(i386), it will only delete the main
file and not the .1, .2, etc table files.

I have looked at this.
I made it so it rolled over files at 1MB. My table ended up with 120
segments, and my indexes had 3(Yes, it DOES work!).
DROP TABLE removed ALL segments from the table, but only the main index
segment.

So it looks like removing the table itself is using mdunlink in md.c,
while removing indexes uses FileNameUnlink() which only unlinks 1 file.
As far as I can tell, calling FileNameUnlink() and mdunlink() is basically
the same, except mdunlink() deletes any extra segments.

I've done some testing and it seems to work. It also passes regression
tests(except float8, geometry and rules, but that's normal).

If this patch is right, this fixes all known multi-segment problems on
Linux.

Ole Gjerde

Patch for index drop:
--- src/backend/catalog/index.c	1999/05/10 00:44:55	1.71
+++ src/backend/catalog/index.c	1999/05/15 06:42:27
@@ -1187,7 +1187,7 @@
*/
ReleaseRelationBuffers(userindexRelation);
-	if (FileNameUnlink(relpath(userindexRelation->rd_rel->relname.data)) < 0)
+	if (mdunlink(userindexRelation) != SM_SUCCESS)
elog(ERROR, "amdestroyr: unlink: %m");

index_close(userindexRelation);

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4Bruce Momjian
bruce@momjian.us
In reply to: Ole Gjerde (#2)

I believe also we have:

DROP TABLE/RENAME TABLE doesn't remove extended files, *.1, *.2

as an open item. Do you see these problems there?

Prob #1:
DROP TABLE <table> doesn't removed "extended" files.
i.e., if you have 2GB db on Linux(i386), it will only delete the main
file and not the .1, .2, etc table files.

I have looked at this.
I made it so it rolled over files at 1MB. My table ended up with 120
segments, and my indexes had 3(Yes, it DOES work!).
DROP TABLE removed ALL segments from the table, but only the main index
segment.

So it looks like removing the table itself is using mdunlink in md.c,
while removing indexes uses FileNameUnlink() which only unlinks 1 file.
As far as I can tell, calling FileNameUnlink() and mdunlink() is basically
the same, except mdunlink() deletes any extra segments.

I've done some testing and it seems to work. It also passes regression
tests(except float8, geometry and rules, but that's normal).

If this patch is right, this fixes all known multi-segment problems on
Linux.

Ole Gjerde

Patch for index drop:
--- src/backend/catalog/index.c	1999/05/10 00:44:55	1.71
+++ src/backend/catalog/index.c	1999/05/15 06:42:27
@@ -1187,7 +1187,7 @@
*/
ReleaseRelationBuffers(userindexRelation);
-	if (FileNameUnlink(relpath(userindexRelation->rd_rel->relname.data)) < 0)
+	if (mdunlink(userindexRelation) != SM_SUCCESS)
elog(ERROR, "amdestroyr: unlink: %m");

index_close(userindexRelation);

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Ole Gjerde
gjerde@icebox.org
In reply to: Bruce Momjian (#4)

On Sat, 15 May 1999, Bruce Momjian wrote:

I believe also we have:
DROP TABLE/RENAME TABLE doesn't remove extended files, *.1, *.2
as an open item. Do you see these problems there?

DROP TABLE worked, ALTER TABLE didn't.

CREATE TABLE
DROP TABLE
CREATE INDEX
DROP INDEX
ALTER TABLE old RENAME TO new

All works on linux now by my tests and regression(with patch below).

Perhaps a mdrename() should be created? Or is something like this good
enough?

Another thing. Should error messages from file related(or all system
calls) use strerror() to print out errno?

Ole Gjerde

--- src/backend/commands/rename.c	1999/05/10 00:44:59	1.23
+++ src/backend/commands/rename.c	1999/05/15 23:42:49
@@ -201,10 +201,13 @@
 void
 renamerel(char *oldrelname, char *newrelname)
 {
+	int		i;
 	Relation	relrelation;	/* for RELATION relation */
 	HeapTuple	oldreltup;
 	char		oldpath[MAXPGPATH],
-				newpath[MAXPGPATH];
+				newpath[MAXPGPATH],
+				toldpath[MAXPGPATH + 10],
+				tnewpath[MAXPGPATH + 10];
 	Relation	irelations[Num_pg_class_indices];
 	if (!allowSystemTableMods && IsSystemRelationName(oldrelname))
@@ -229,6 +232,14 @@
 	strcpy(newpath, relpath(newrelname));
 	if (rename(oldpath, newpath) < 0)
 		elog(ERROR, "renamerel: unable to rename file: %s", oldpath);
+
+	for (i = 1;; i++)
+	{
+		sprintf(toldpath, "%s.%d", oldpath, i);
+		sprintf(tnewpath, "%s.%d", newpath, i);
+		if(rename(toldpath, tnewpath) < 0)
+			break;
+	}

StrNCpy((((Form_pg_class) GETSTRUCT(oldreltup))->relname.data),
newrelname, NAMEDATALEN);

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ole Gjerde (#5)

gjerde@icebox.org wrote (a couple weeks ago):

While running postmaster at the command line like this:
/home/postgres/bin/postmaster -B 1024 -D/home/postgres/data -d 9 -o "-S
4096 -s -d 9 -A"

the current backend for the following CREATE TABLE would
die(consistently):
CREATE TABLE oletest (
id serial,
number int,
string varchar(255)
);

Are you still seeing this with current sources? I'm not able to
replicate it here...

regards, tom lane

#7Ole Gjerde
gjerde@icebox.org
In reply to: Tom Lane (#6)

On Sun, 16 May 1999, Tom Lane wrote:

gjerde@icebox.org wrote (a couple weeks ago):

While running postmaster at the command line like this:
/home/postgres/bin/postmaster -B 1024 -D/home/postgres/data -d 9 -o "-S
4096 -s -d 9 -A"

Are you still seeing this with current sources? I'm not able to
replicate it here...

Yep..
It appears to die while making the index for serial.
Postmaster keeps running, but the current backend dies in pqReadData().

gdb and postmaster output below.

Ole Gjerde

gdb of core:
#0 0x4013a30a in _IO_default_xsputn (f=0xbf8006f4, data=0x81377e0, n=20)
at genops.c:382
#1 0x40129980 in _IO_vfprintf (s=0xbf8006f4,
format=0x81377e0 " COLUMNDEF :colname %s :typename ", ap=0xbf800c08)
at vfprintf.c:1048
#2 0x40137d16 in _IO_vsnprintf (string=0xbf8007f8 "", maxlen=1024,
format=0x81377e0 " COLUMNDEF :colname %s :typename ", args=0xbf800c08)
at vsnprintf.c:129
#3 0x809ccfb in appendStringInfo ()
#4 0x80b637b in _outColumnDef ()
#5 0x80b8107 in _outNode ()
#6 0x80b7d79 in _outNode ()
#7 0x80b7cab in _outConstraint ()
#8 0x80b84b7 in _outNode ()
#9 0x80b7d79 in _outNode ()
#10 0x80b63bb in _outColumnDef ()
#11 0x80b8107 in _outNode ()
#12 0x80b7d79 in _outNode ()
And this keeps going and going and going..

---------------------------------------------
postmaster log:
StartTransactionCommand
query: create table oletest (
id serial,
number int,
string varchar(255)
);
NOTICE: CREATE TABLE will create implicit sequence 'oletest_id_seq' for SERIAL column 'oletest.id'
NOTICE: CREATE TABLE/UNIQUE will create implicit index 'oletest_id_key' for table 'oletest'
parser outputs:
{ QUERY
:command 5
:utility ?
:resultRelation 0
:into <>
:isPortal false
:isBinary false
:isTemp false
:unionall false
:unique <>
:sortClause <>
:rtable <>
:targetlist <>
:qual <>
:groupClause <>
:havingQual <>
:hasAggs false
:hasSubLinks false
:unionClause <>
:intersectClause <>
:limitOffset <>
:limitCount <>
:rowMark <>
}
parser outputs:
bin/postmaster: reaping dead processes...
bin/postmaster: CleanupProc: pid 593 exited with status 139
bin/postmaster: CleanupProc: reinitializing shared memory and semaphores
shmem_exit(0) [#0]
binding ShmemCreate(key=52e325, size=9343632)

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ole Gjerde (#7)

Ole Gjerde <gjerde@icebox.org> writes:

gdb of core:
#0 0x4013a30a in _IO_default_xsputn (f=0xbf8006f4, data=0x81377e0, n=20)
at genops.c:382
#1 0x40129980 in _IO_vfprintf (s=0xbf8006f4,
format=0x81377e0 " COLUMNDEF :colname %s :typename ", ap=0xbf800c08)
at vfprintf.c:1048
#2 0x40137d16 in _IO_vsnprintf (string=0xbf8007f8 "", maxlen=1024,
format=0x81377e0 " COLUMNDEF :colname %s :typename ", args=0xbf800c08)
at vsnprintf.c:129
#3 0x809ccfb in appendStringInfo ()
#4 0x80b637b in _outColumnDef ()
#5 0x80b8107 in _outNode ()
#6 0x80b7d79 in _outNode ()
#7 0x80b7cab in _outConstraint ()
#8 0x80b84b7 in _outNode ()
#9 0x80b7d79 in _outNode ()
#10 0x80b63bb in _outColumnDef ()
#11 0x80b8107 in _outNode ()
#12 0x80b7d79 in _outNode ()
And this keeps going and going and going..

Hmm, that looks like a column constraint has somehow gotten recursively
linked back to its parent column definition node.

I poked around in the code for serial-column constraints, and found that
Lockhart's last patch had a subtle bug --- he wrote more characters in
the constraint text without increasing the space palloc'd for the
string. That could maybe cause such a problem, depending on what
happened to be living next door to the string... But I don't really
think this explains your complaint, because according to the cvs log
that change was made on 5/13, and you reported a problem quite some time
before that. Still, please fetch the current cvs sources or apply this
patch:

*** src/backend/parser/analyze.c.orig	Sun May 16 10:29:33 1999
--- src/backend/parser/analyze.c	Mon May 17 00:50:07 1999
***************
*** 545,551 ****
  					constraint = makeNode(Constraint);
  					constraint->contype = CONSTR_DEFAULT;
  					constraint->name = sname;
! 					cstring = palloc(9 + strlen(constraint->name) + 2 + 1);
  					strcpy(cstring, "nextval('\"");
  					strcat(cstring, constraint->name);
  					strcat(cstring, "\"')");
--- 545,551 ----
  					constraint = makeNode(Constraint);
  					constraint->contype = CONSTR_DEFAULT;
  					constraint->name = sname;
! 					cstring = palloc(10 + strlen(constraint->name) + 3 + 1);
  					strcpy(cstring, "nextval('\"");
  					strcat(cstring, constraint->name);
  					strcat(cstring, "\"')");

and let us know if anything changes...

regards, tom lane

#9Ole Gjerde
gjerde@icebox.org
In reply to: Tom Lane (#8)

On Mon, 17 May 1999, Tom Lane wrote:

I poked around in the code for serial-column constraints, and found that
Lockhart's last patch had a subtle bug --- he wrote more characters in
the constraint text without increasing the space palloc'd for the
string. That could maybe cause such a problem, depending on what
happened to be living next door to the string... But I don't really
think this explains your complaint, because according to the cvs log
that change was made on 5/13, and you reported a problem quite some time
before that. Still, please fetch the current cvs sources or apply this
patch:

[snip]

and let us know if anything changes...

Yep, that takes care of it! Thanks

Ole Gjerde

#10Bruce Momjian
bruce@momjian.us
In reply to: Ole Gjerde (#5)

On Sat, 15 May 1999, Bruce Momjian wrote:

I believe also we have:
DROP TABLE/RENAME TABLE doesn't remove extended files, *.1, *.2
as an open item. Do you see these problems there?

DROP TABLE worked, ALTER TABLE didn't.

CREATE TABLE
DROP TABLE
CREATE INDEX
DROP INDEX
ALTER TABLE old RENAME TO new

All works on linux now by my tests and regression(with patch below).

Applied.

Perhaps a mdrename() should be created? Or is something like this good
enough?

I think this is good enough for now. Do people want an mdrename?

Another thing. Should error messages from file related(or all system
calls) use strerror() to print out errno?

Seems like in the code you have, you just keep renaming until you can't
find any more files, so printing out any errno would be a problem,
right?

I assume you are taling about the initial rename. Not sure if strerror
would help. We really try and insulate the user from knowing how we are
doing the SQL we do, so it is possible it may be confusing. However, it
may be very helpful too. Not sure. Comments?

Ole Gjerde

--- src/backend/commands/rename.c	1999/05/10 00:44:59	1.23
+++ src/backend/commands/rename.c	1999/05/15 23:42:49
@@ -201,10 +201,13 @@
void
renamerel(char *oldrelname, char *newrelname)
{
+	int		i;
Relation	relrelation;	/* for RELATION relation */
HeapTuple	oldreltup;
char		oldpath[MAXPGPATH],
-				newpath[MAXPGPATH];
+				newpath[MAXPGPATH],
+				toldpath[MAXPGPATH + 10],
+				tnewpath[MAXPGPATH + 10];
Relation	irelations[Num_pg_class_indices];
if (!allowSystemTableMods && IsSystemRelationName(oldrelname))
@@ -229,6 +232,14 @@
strcpy(newpath, relpath(newrelname));
if (rename(oldpath, newpath) < 0)
elog(ERROR, "renamerel: unable to rename file: %s", oldpath);
+
+	for (i = 1;; i++)
+	{
+		sprintf(toldpath, "%s.%d", oldpath, i);
+		sprintf(tnewpath, "%s.%d", newpath, i);
+		if(rename(toldpath, tnewpath) < 0)
+			break;
+	}

StrNCpy((((Form_pg_class) GETSTRUCT(oldreltup))->relname.data),
newrelname, NAMEDATALEN);

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: Tom Lane (#8)

Lockhart's last patch had a subtle bug

:( Thanks for fixing it...

- Thomas

--
Thomas Lockhart lockhart@alumni.caltech.edu
South Pasadena, California