vacuum analyze corrupts db with larger tuples (< 8k)
ok, here is what I have found out on 6.5.3, Linux 2.2.10:
DROP TABLE buf;
CREATE TABLE buf (s varchar(5000)); -- type TEXT is the same prob
COPY buf FROM stdin;
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[... almost 5k ...]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[... almost 5k ...]
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\
[ 10 times ]
\.
# vacuumdb --analyze test
ERROR: Tuple is too big: size 9604
vacuumdb: database vacuum failed on test.
- this is repeatable
- in this test the db isn't corrupted, but it is in our production db :(
- Tom, you made a remark that you found a bug in access/hio.c.
Does the bug also hit here?
Dirk
Dirk Lutzebaeck writes:
- Tom, you made a remark that you found a bug in access/hio.c.
Does the bug also hit here?
I applied this patch and the corruption is gone but vacuumdb --analyze
still complains about the tuple being too large.
Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
ok, here is what I have found out on 6.5.3, Linux 2.2.10:
[ make table with a bunch of almost-5K varchar fields ]
# vacuumdb --analyze test
ERROR: Tuple is too big: size 9604
vacuumdb: database vacuum failed on test.
Ohhh ... I know what's going on. The oversize tuple is the one that
VACUUM is attempting to store in pg_statistic, containing the min and
max values for your varchar column. In this example, both the min and
max are just shy of 5K characters, so the pg_statistic tuple is too
big to fit on a page.
I had already patched this in current sources, by the expedient of not
trying to store a pg_statistic tuple at all if it's too big. (Then
you don't get stats for that particular column, but the stats probably
wouldn't be useful anyway.)
I suppose I should make up a back-patch for REL6_5 with this fix.
regards, tom lane
I wrote:
Ohhh ... I know what's going on. The oversize tuple is the one that
VACUUM is attempting to store in pg_statistic, containing the min and
max values for your varchar column. In this example, both the min and
max are just shy of 5K characters, so the pg_statistic tuple is too
big to fit on a page.
I have applied the attached patch to the REL6_5 branch. (There is
already similar code in the current branch.) I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.
regards, tom lane
*** src/backend/commands/vacuum.c.orig Wed Aug 25 08:01:45 1999
--- src/backend/commands/vacuum.c Tue Jan 4 12:15:17 2000
***************
*** 2405,2414 ****
stup = heap_formtuple(sd->rd_att, values, nulls);
/* ----------------
! * insert the tuple in the relation and get the tuple's oid.
* ----------------
*/
! heap_insert(sd, stup);
pfree(DatumGetPointer(values[3]));
pfree(DatumGetPointer(values[4]));
pfree(stup);
--- 2405,2425 ----
stup = heap_formtuple(sd->rd_att, values, nulls);
/* ----------------
! * Watch out for oversize tuple, which can happen if
! * both of the saved data values are long.
! * Our fallback strategy is just to not store the
! * pg_statistic tuple at all in that case. (We could
! * replace the values by NULLs and still store the
! * numeric stats, but presently selfuncs.c couldn't
! * do anything useful with that case anyway.)
* ----------------
*/
! if (MAXALIGN(stup->t_len) <= MaxTupleSize)
! {
! /* OK to store tuple */
! heap_insert(sd, stup);
! }
!
pfree(DatumGetPointer(values[3]));
pfree(DatumGetPointer(values[4]));
pfree(stup);
Date: Tue, 4 Jan 2000 19:07:38 +0100 (MET)
From: Olivier PRENANT <ohp@pyrenet.fr>
Reply-To: ohp@pyrenet.fr
To: Thomas Lockhart <lockhart@alumni.caltech.edu>
cc: Tom Lane <tgl@sss.pgh.pa.us>, Tulassay Zsolt <zsolt@tek.bke.hu>,
pgsql-hackers@postgreSQL.org
Subject: Re: [BUGS] Date calc bug
Thank you!!
Patch applied and tested.
Great job.
That's exactly why I love the Internet and PostgreSQL.
You find a bug, send a message, and it gets fixed.
Best wishes for 2000.
Regards
On Tue, 4 Jan 2000, Thomas Lockhart wrote:
forum=> select datetime(now())+'74565 days'::timespan as ido;
Thu Jan 19 14:07:30 2068
and
select '12-01-1999'::datetime + '@ 1 month - 1 sec' ;
Thu Dec 30 23:59:59 1999 EST
I've repaired both problems in both the development and release trees.
Thanks for the reports and analysis. Patch enclosed...
- Thomas
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
Quartier d'Harraud Turrou +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp@pyrenet.fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)
[Attachment: dt.c.patch]
*** ../src/backend/utils/adt/dt.c.orig Mon Jan 3 08:27:24 2000
--- ../src/backend/utils/adt/dt.c Mon Jan 3 16:41:08 2000
***************
*** 787,792 ****
--- 787,793 ----
* To add a month, increment the month, and use the same day of month.
* Then, if the next month has fewer days, set the day of month
* to the last day of month.
+ * Lastly, add in the "quantitative time".
*/
DateTime *
datetime_pl_span(DateTime *datetime, TimeSpan *span)
***************
*** 815,826 ****
{
dt = (DATETIME_IS_RELATIVE(*datetime) ? SetDateTime(*datetime) : *datetime);
- #ifdef ROUND_ALL
- dt = JROUND(dt + span->time);
- #else
- dt += span->time;
- #endif
-
if (span->month != 0)
{
struct tm tt,
--- 816,821 ----
***************
*** 853,858 ****
--- 848,859 ----
DATETIME_INVALID(dt);
}
+ #ifdef ROUND_ALL
+ dt = JROUND(dt + span->time);
+ #else
+ dt += span->time;
+ #endif
+
*result = dt;
}
***************
*** 2441,2447 ****
tm2timespan(struct tm * tm, double fsec, TimeSpan *span)
{
span->month = ((tm->tm_year * 12) + tm->tm_mon);
! span->time = ((((((tm->tm_mday * 24) + tm->tm_hour) * 60) + tm->tm_min) * 60) + tm->tm_sec);
span->time = JROUND(span->time + fsec);
return 0;
--- 2442,2451 ----
tm2timespan(struct tm * tm, double fsec, TimeSpan *span)
{
span->month = ((tm->tm_year * 12) + tm->tm_mon);
! span->time = ((((((tm->tm_mday * 24.0)
! + tm->tm_hour) * 60.0)
! + tm->tm_min) * 60.0)
! + tm->tm_sec);
span->time = JROUND(span->time + fsec);
return 0;
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
ok, here is what I have found out on 6.5.3, Linux 2.2.10:
[ make table with a bunch of almost-5K varchar fields ]
# vacuumdb --analyze test
ERROR: Tuple is too big: size 9604
vacuumdb: database vacuum failed on test.
Ohhh ... I know what's going on. The oversize tuple is the one that
VACUUM is attempting to store in pg_statistic, containing the min and
max values for your varchar column. In this example, both the min and
max are just shy of 5K characters, so the pg_statistic tuple is too
big to fit on a page.
I had already patched this in current sources, by the expedient of not
trying to store a pg_statistic tuple at all if it's too big. (Then
you don't get stats for that particular column, but the stats probably
wouldn't be useful anyway.)
I suppose I should make up a back-patch for REL6_5 with this fix.
Oh, good, we know the cause. Seems we should wait for 7.0 for this.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Tom Lane writes:
I have applied the attached patch to the REL6_5 branch. (There is
already similar code in the current branch.) I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.
How can I recover from this? Simply delete the entries in pg_statistic?
Is this possible?
Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
Tom Lane writes:
I have applied the attached patch to the REL6_5 branch. (There is
already similar code in the current branch.) I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.
How can I recover from this? Simply delete the entries in pg_statistic?
Is this possible?
It's worth a try, anyway: just DELETE FROM pg_statistic
and see what happens... but, depending on how badly pg_statistic is
messed up, that might itself crash...
regards, tom lane
Tom Lane writes:
I have applied the attached patch to the REL6_5 branch. (There is
already similar code in the current branch.) I'm afraid it won't
help you recover from an already-corrupted pg_statistic table, though.
How can I recover from this? Simply delete the entries in pg_statistic?
Is this possible?
The only fix I know is pg_upgrade. You may have to enable the script.