Re: Tuple too big

Started by Bruce Momjianalmost 27 years ago5 messageshackers
Jump to latest
#1Bruce Momjian
bruce@momjian.us

I've encountered a strange behavior of the VACUUM ANALYZE command.
It seems that this command works only if the size of a text field
does not exceed approximately 4050 bytes! So the real limit on
tuple size is a half of the max tuple size. I've checked this effect
on Postgres 6.4.2 (Sparc Solaris 2.5.1) and Postgres 6.5 (SUSE 6.1
Linux, kernel 2.2.5). Is this a bug or known feature?
The python script used to reproduce this problem and results for
v6.4.2 and v6.5 are follows.
size= 4059
size= 4060
size= 4061
size= 4062
size= 4063
size= 4064
size= 4065
ERROR: Tuple is too big: size 8188

I have always suspected these default values where wrong, but no one
reported it as a bug.

Here is a patch for 6.5 which will prevent the creation of these too big
tuples in certain cases. Seems we should also check for max length at
the time we create the table, but it doesn't look like there is any code
to do that yet.

I am not going to apply this to 6.5.1 because it may have some unknown
side-affects.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: [HACKERS] Re: Tuple too big

I have always suspected these default values where wrong, but no one
reported it as a bug.

Here is a patch for 6.5 which will prevent the creation of these too big
tuples in certain cases. Seems we should also check for max length at
the time we create the table, but it doesn't look like there is any code
to do that yet.

I am not going to apply this to 6.5.1 because it may have some unknown
side-affects.

Oops, forgot the patch. If people want this in 6.5.1, let me know. I
am going to try and add real tuple check in the places that need it.

---------------------------------------------------------------------------

? src/Makefile.custom
? src/config.log
? src/log
? src/config.cache
? src/config.status
? src/GNUmakefile
? src/Makefile.global
? src/backend/fmgr.h
? src/backend/parse.h
? src/backend/postgres
? src/backend/global1.bki.source
? src/backend/local1_template1.bki.source
? src/backend/global1.description
? src/backend/local1_template1.description
? src/backend/bootstrap/bootparse.c
? src/backend/bootstrap/bootstrap_tokens.h
? src/backend/bootstrap/bootscanner.c
? src/backend/catalog/genbki.sh
? src/backend/catalog/global1.bki.source
? src/backend/catalog/global1.description
? src/backend/catalog/local1_template1.bki.source
? src/backend/catalog/local1_template1.description
? src/backend/port/Makefile
? src/backend/utils/Gen_fmgrtab.sh
? src/backend/utils/fmgr.h
? src/backend/utils/fmgrtab.c
? src/bin/cleardbdir/cleardbdir
? src/bin/createdb/createdb
? src/bin/createlang/createlang
? src/bin/createuser/createuser
? src/bin/destroydb/destroydb
? src/bin/destroylang/destroylang
? src/bin/destroyuser/destroyuser
? src/bin/initdb/initdb
? src/bin/initlocation/initlocation
? src/bin/ipcclean/ipcclean
? src/bin/pg_dump/Makefile
? src/bin/pg_dump/pg_dump
? src/bin/pg_id/pg_id
? src/bin/pg_passwd/pg_passwd
? src/bin/pg_version/Makefile
? src/bin/pg_version/pg_version
? src/bin/pgtclsh/mkMakefile.tcldefs.sh
? src/bin/pgtclsh/mkMakefile.tkdefs.sh
? src/bin/pgtclsh/Makefile.tkdefs
? src/bin/pgtclsh/Makefile.tcldefs
? src/bin/pgtclsh/pgtclsh
? src/bin/pgtclsh/pgtksh
? src/bin/psql/Makefile
? src/bin/psql/psql
? src/include/version.h
? src/include/config.h
? src/interfaces/ecpg/lib/Makefile
? src/interfaces/ecpg/lib/libecpg.so.3.0.0
? src/interfaces/ecpg/preproc/ecpg
? src/interfaces/libpgtcl/Makefile
? src/interfaces/libpgtcl/libpgtcl.so.2.0
? src/interfaces/libpq/Makefile
? src/interfaces/libpq/libpq.so.2.0
? src/interfaces/libpq++/Makefile
? src/interfaces/libpq++/libpq++.so.3.0
? src/interfaces/odbc/GNUmakefile
? src/interfaces/odbc/Makefile.global
? src/lextest/lex.yy.c
? src/lextest/lextest
? src/pl/plpgsql/src/Makefile
? src/pl/plpgsql/src/mklang.sql
? src/pl/plpgsql/src/pl_gram.c
? src/pl/plpgsql/src/pl.tab.h
? src/pl/plpgsql/src/pl_scan.c
? src/pl/plpgsql/src/libplpgsql.so.1.0
? src/pl/tcl/mkMakefile.tcldefs.sh
? src/pl/tcl/Makefile.tcldefs
? src/template/linux_m68k
Index: src/backend/access/heap/stats.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/access/heap/stats.c,v
retrieving revision 1.15
diff -c -r1.15 stats.c
*** src/backend/access/heap/stats.c	1999/02/13 23:14:25	1.15
--- src/backend/access/heap/stats.c	1999/07/02 23:34:45
***************
*** 16,21 ****
--- 16,22 ----
   */

#include <stdio.h>
+ #include <time.h>

#include <postgres.h>

Index: src/backend/catalog/index.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.79
diff -c -r1.79 index.c
*** src/backend/catalog/index.c	1999/06/19 04:54:11	1.79
--- src/backend/catalog/index.c	1999/07/02 23:34:47
***************
*** 20,25 ****
--- 20,26 ----
  #include "postgres.h"

#include "access/genam.h"
+ #include "access/htup.h"
#include "access/heapam.h"
#include "access/istrat.h"
#include "access/xact.h"
***************
*** 56,62 ****
/*
* macros used in guessing how many tuples are on a page.
*/
! #define AVG_TUPLE_SIZE 8
#define NTUPLES_PER_PAGE(natts) (BLCKSZ/((natts)*AVG_TUPLE_SIZE))

  /* non-export function prototypes */
--- 57,63 ----
  /*
   * macros used in guessing how many tuples are on a page.
   */
! #define AVG_TUPLE_SIZE MinTupleSize
  #define NTUPLES_PER_PAGE(natts) (BLCKSZ/((natts)*AVG_TUPLE_SIZE))
  /* non-export function prototypes */
Index: src/backend/commands/copy.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.80
diff -c -r1.80 copy.c
*** src/backend/commands/copy.c	1999/06/12 20:41:25	1.80
--- src/backend/commands/copy.c	1999/07/02 23:34:52
***************
*** 1073,1079 ****
  	}
  }

! #define EXT_ATTLEN 5*BLCKSZ

  /*
     returns 1 is c is in s
--- 1073,1079 ----
  	}
  }

! #define EXT_ATTLEN (5 * BLCKSZ)

  /*
     returns 1 is c is in s
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.109
diff -c -r1.109 vacuum.c
*** src/backend/commands/vacuum.c	1999/06/11 09:35:08	1.109
--- src/backend/commands/vacuum.c	1999/07/02 23:35:02
***************
*** 624,630 ****
  				empty_end_pages;
  	Size		free_size,
  				usable_free_size;
! 	Size		min_tlen = MAXTUPLEN;
  	Size		max_tlen = 0;
  	int32		i;
  	struct rusage ru0,
--- 624,630 ----
  				empty_end_pages;
  	Size		free_size,
  				usable_free_size;
! 	Size		min_tlen = MaxTupleSize;
  	Size		max_tlen = 0;
  	int32		i;
  	struct rusage ru0,
Index: src/backend/optimizer/path/_deadcode/xfunc.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/optimizer/path/_deadcode/xfunc.c,v
retrieving revision 1.4
diff -c -r1.4 xfunc.c
*** src/backend/optimizer/path/_deadcode/xfunc.c	1999/05/25 22:41:36	1.4
--- src/backend/optimizer/path/_deadcode/xfunc.c	1999/07/02 23:35:08
***************
*** 20,25 ****
--- 20,26 ----

#include "postgres.h"

+ #include "access/htup.h"
#include "access/heapam.h"
#include "catalog/pg_language.h"
#include "catalog/pg_proc.h"
***************
*** 1094,1100 ****
RelOptInfo outerrel = get_parent((Path) get_outerjoinpath(joinnode));
RelOptInfo innerrel = get_parent((Path) get_innerjoinpath(joinnode));
Count outerwidth = get_width(outerrel);
! Count outers_per_page = ceil(BLCKSZ / (outerwidth + sizeof(HeapTupleData)));

  	if (IsA(joinnode, HashPath))
  	{
--- 1095,1101 ----
  	RelOptInfo	outerrel = get_parent((Path) get_outerjoinpath(joinnode));
  	RelOptInfo	innerrel = get_parent((Path) get_innerjoinpath(joinnode));
  	Count		outerwidth = get_width(outerrel);
! 	Count		outers_per_page = ceil(BLCKSZ / (outerwidth + MinTupleSize));
  	if (IsA(joinnode, HashPath))
  	{
Index: src/backend/parser/gram.y
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/parser/gram.y,v
retrieving revision 2.84
diff -c -r2.84 gram.y
*** src/backend/parser/gram.y	1999/06/07 14:28:25	2.84
--- src/backend/parser/gram.y	1999/07/02 23:35:24
***************
*** 36,41 ****
--- 36,42 ----
  #include <ctype.h>

#include "postgres.h"
+ #include "access/htup.h"
#include "nodes/parsenodes.h"
#include "nodes/print.h"
#include "parser/gramparse.h"
***************
*** 3384,3391 ****

if ($3 < 1)
elog(ERROR,"length for '%s' type must be at least 1",$1);
! else if ($3 > BLCKSZ - 128)
! elog(ERROR,"length for type '%s' cannot exceed %d",$1, BLCKSZ-128);

  					/* we actually implement this sort of like a varlen, so
  					 * the first 4 bytes is the length. (the difference
--- 3385,3393 ----

if ($3 < 1)
elog(ERROR,"length for '%s' type must be at least 1",$1);
! else if ($3 > MaxTupleSize)
! elog(ERROR,"length for type '%s' cannot exceed %d",$1,
! MaxTupleSize);

  					/* we actually implement this sort of like a varlen, so
  					 * the first 4 bytes is the length. (the difference
Index: src/backend/storage/page/bufpage.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/storage/page/bufpage.c,v
retrieving revision 1.22
diff -c -r1.22 bufpage.c
*** src/backend/storage/page/bufpage.c	1999/05/25 16:11:25	1.22
--- src/backend/storage/page/bufpage.c	1999/07/02 23:35:28
***************
*** 45,51 ****

Assert(pageSize == BLCKSZ);
Assert(pageSize >
! specialSize + sizeof(PageHeaderData) - sizeof(ItemIdData));

specialSize = DOUBLEALIGN(specialSize);

--- 45,51 ----

Assert(pageSize == BLCKSZ);
Assert(pageSize >
! specialSize + sizeof(PageHeaderData) - sizeof(ItemIdData));

specialSize = DOUBLEALIGN(specialSize);

Index: src/backend/utils/adt/varchar.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/utils/adt/varchar.c,v
retrieving revision 1.46
diff -c -r1.46 varchar.c
*** src/backend/utils/adt/varchar.c	1999/05/25 16:12:21	1.46
--- src/backend/utils/adt/varchar.c	1999/07/02 23:35:29
***************
*** 14,19 ****
--- 14,20 ----
  #include <stdio.h>				/* for sprintf() */
  #include <string.h>
  #include "postgres.h"
+ #include "access/htup.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
  #include "catalog/pg_type.h"
***************
*** 81,88 ****
  	else
  		len = atttypmod - VARHDRSZ;

! if (len > BLCKSZ - 128)
! elog(ERROR, "bpcharin: length of char() must be less than %d", BLCKSZ - 128);

  	result = (char *) palloc(atttypmod);
  	VARSIZE(result) = atttypmod;
--- 82,90 ----
  	else
  		len = atttypmod - VARHDRSZ;

! if (len > MaxTupleSize)
! elog(ERROR, "bpcharin: length of char() must be less than %d",
! MaxTupleSize);

result = (char *) palloc(atttypmod);
VARSIZE(result) = atttypmod;
***************
*** 151,158 ****

rlen = len - VARHDRSZ;

! if (rlen > BLCKSZ - 128)
! elog(ERROR, "bpchar: length of char() must be less than %d", BLCKSZ - 128);

  #ifdef STRINGDEBUG
  	printf("bpchar- convert string length %d (%d) ->%d (%d)\n",
--- 153,161 ----

rlen = len - VARHDRSZ;

! if (rlen > MaxTupleSize)
! elog(ERROR, "bpchar: length of char() must be less than %d",
! MaxTupleSize);

#ifdef STRINGDEBUG
printf("bpchar- convert string length %d (%d) ->%d (%d)\n",
***************
*** 332,339 ****
if (atttypmod != -1 && len > atttypmod)
len = atttypmod; /* clip the string at max length */

! if (len > BLCKSZ - 128)
! elog(ERROR, "varcharin: length of char() must be less than %d", BLCKSZ - 128);

  	result = (char *) palloc(len);
  	VARSIZE(result) = len;
--- 335,343 ----
  	if (atttypmod != -1 && len > atttypmod)
  		len = atttypmod;		/* clip the string at max length */

! if (len > MaxTupleSize)
! elog(ERROR, "varcharin: length of char() must be less than %d",
! MaxTupleSize);

result = (char *) palloc(len);
VARSIZE(result) = len;
***************
*** 403,410 ****
len = slen - VARHDRSZ;
#endif

! if (len > BLCKSZ - 128)
! elog(ERROR, "varchar: length of varchar() must be less than BLCKSZ-128");

  	result = (char *) palloc(slen);
  	VARSIZE(result) = slen;
--- 407,415 ----
  	len = slen - VARHDRSZ;
  #endif

! if (len > MaxTupleSize)
! elog(ERROR, "varchar: length of varchar() must be less than %d",
! MaxTupleSize);

  	result = (char *) palloc(slen);
  	VARSIZE(result) = slen;
Index: src/include/access/htup.h
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/include/access/htup.h,v
retrieving revision 1.16
diff -c -r1.16 htup.h
*** src/include/access/htup.h	1999/05/25 22:42:32	1.16
--- src/include/access/htup.h	1999/07/02 23:35:34
***************
*** 13,19 ****
  #ifndef HTUP_H
  #define HTUP_H

! #include <utils/nabstime.h>
#include <storage/itemptr.h>

  #define MinHeapTupleBitmapSize	32		/* 8 * 4 */
--- 13,19 ----
  #ifndef HTUP_H
  #define HTUP_H

! #include <storage/bufpage.h>
#include <storage/itemptr.h>

  #define MinHeapTupleBitmapSize	32		/* 8 * 4 */
***************
*** 51,56 ****
--- 51,61 ----
  } HeapTupleHeaderData;
  typedef HeapTupleHeaderData *HeapTupleHeader;
+ 
+ #define MinTupleSize	(sizeof (PageHeaderData) + \
+ 						 sizeof(HeapTupleHeaderData) + sizeof(int4))
+ 
+ #define MaxTupleSize	(BLCKSZ/2 - MinTupleSize)
  #define SelfItemPointerAttributeNumber			(-1)
  #define ObjectIdAttributeNumber					(-2)
Index: src/include/storage/bufpage.h
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/include/storage/bufpage.h,v
retrieving revision 1.22
diff -c -r1.22 bufpage.h
*** src/include/storage/bufpage.h	1999/05/25 16:14:40	1.22
--- src/include/storage/bufpage.h	1999/07/02 23:35:36
***************
*** 133,150 ****
  	OverwritePageManagerMode
  } PageManagerMode;
- /* ----------------
-  *		misc support macros
-  * ----------------
-  */
- 
- /*
-  * XXX this is wrong -- ignores padding/alignment, variable page size,
-  * AM-specific opaque space at the end of the page (as in btrees), ...
-  * however, it at least serves as an upper bound for heap pages.
-  */
- #define MAXTUPLEN		(BLCKSZ - sizeof (PageHeaderData))
- 
  /* ----------------------------------------------------------------
   *						page support macros
   * ----------------------------------------------------------------
--- 133,138 ----
Index: src/interfaces/ecpg/preproc/preproc.y
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/interfaces/ecpg/preproc/preproc.y,v
retrieving revision 1.57
diff -c -r1.57 preproc.y
*** src/interfaces/ecpg/preproc/preproc.y	1999/06/16 18:25:50	1.57
--- src/interfaces/ecpg/preproc/preproc.y	1999/07/02 23:35:52
***************
*** 3,8 ****
--- 3,11 ----
  #include <stdio.h>
  #include <string.h>
  #include <stdlib.h>
+ 
+ #include "postgres.h"
+ #include "access/htup.h"
  #include "catalog/catname.h"
  #include "utils/numeric.h"

***************
*** 3351,3358 ****
sprintf(errortext, "length for '%s' type must be at least 1",$1);
yyerror(errortext);
}
! else if (atol($3) > BLCKSZ - 128) {
! sprintf(errortext, "length for type '%s' cannot exceed %d",$1,BLCKSZ - 128);
yyerror(errortext);
}

--- 3354,3361 ----
  						sprintf(errortext, "length for '%s' type must be at least 1",$1);
  						yyerror(errortext);
  					}
! 					else if (atol($3) > MaxTupleSize) {
! 						sprintf(errortext, "length for type '%s' cannot exceed %d",$1,MaxTupleSize);
  						yyerror(errortext);
  					}
-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#3Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: [HACKERS] Re: Tuple too big

Here is a patch for 6.5 which will prevent the creation of these too big
tuples in certain cases. Seems we should also check for max length at
the time we create the table, but it doesn't look like there is any code
to do that yet.

I am not going to apply this to 6.5.1 because it may have some unknown
side-affects.

On second thought, the patch looks harmless, so I am going to apply it.
It is better than what is currently there.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#3)

I've encountered a strange behavior of the VACUUM ANALYZE command.
It seems that this command works only if the size of a text field
does not exceed approximately 4050 bytes! So the real limit on
tuple size is a half of the max tuple size. I've checked this effect
on Postgres 6.4.2 (Sparc Solaris 2.5.1) and Postgres 6.5 (SUSE 6.1
Linux, kernel 2.2.5). Is this a bug or known feature?
The python script used to reproduce this problem and results for
v6.4.2 and v6.5 are follows.

OK, looks like the new code works:

test=> create table test (x char(2000), y char(2000), z char(2000))\g
CREATE
test=> insert into test values ('1','2','3');
ERROR: Tuple is too big: size 6044, max size 4044
test=> create table test2 (x varchar(2000), y varchar(2000), z
varchar(2000))\g
CREATE
test=> insert into test2 values ('1','2','3');
INSERT 21303 1

char() is fixed length, while varchar() is variable. Now, we could
prevent creation of the first table, but not the second because only the
inserted data will show if it over the limit. Much easier just to test
in one place.

Here is the new patch:

---------------------------------------------------------------------------

Index: hio.c
===================================================================
RCS file: /usr/local/cvsroot/pgsql/src/backend/access/heap/hio.c,v
retrieving revision 1.20
retrieving revision 1.22
diff -c -r1.20 -r1.22
*** hio.c	1999/05/25 16:07:07	1.20
--- hio.c	1999/07/03 01:56:16	1.22
***************
*** 16,21 ****
--- 16,22 ----
  #include <storage/bufpage.h>
  #include <access/hio.h>
+ #include <access/htup.h>
  #include <access/heapam.h>
  #include <storage/bufmgr.h>
  #include <utils/memutils.h>
***************
*** 164,169 ****
--- 165,173 ----
  		if (len > PageGetFreeSpace(pageHeader))
  			elog(ERROR, "Tuple is too big: size %d", len);
  	}
+ 
+ 	if (len > MaxTupleSize)
+ 		elog(ERROR, "Tuple is too big: size %d, max size %d", len, MaxTupleSize);

if (!relation->rd_myxactonly)
UnlockPage(relation, 0, ExclusiveLock);

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#5Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#4)

I've encountered a strange behavior of the VACUUM ANALYZE command.
It seems that this command works only if the size of a text field
does not exceed approximately 4050 bytes! So the real limit on
tuple size is a half of the max tuple size. I've checked this effect
on Postgres 6.4.2 (Sparc Solaris 2.5.1) and Postgres 6.5 (SUSE 6.1
Linux, kernel 2.2.5). Is this a bug or known feature?
The python script used to reproduce this problem and results for
v6.4.2 and v6.5 are follows.

OK, I have again written the code to allow tuples to take up a while
block, rather than the 1/2 block limit you were seeing. It consists of
a bunch of patches, so I can't send them to you, but it will be in
6.5.1, due out July 15th.

Also, the snapshot on ftp.postgresql.org has the changes too.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026