BUG #4793: Segmentation fault when doing vacuum analyze
The following bug has been logged online:
Bug reference: 4793
Logged by: Dennis Noordsij
Email address: dennis.noordsij@helsinki.fi
PostgreSQL version: snapshot/beta1
Operating system: 64bit arch Linux
Description: Segmentation fault when doing vacuum analyze
Details:
Seen on both 8.4beta1 and daily snapshot downloaded 5 May 2009.
Table used (effects also seen on similar tables):
osm=# \d way_tags
  Table "public.way_tags"
 Column |  Type  | Modifiers
--------+--------+-----------
 way    | bigint | not null
 key    | text   | not null
 value  | text   |
Indexes:
    "way_tags_pkey" PRIMARY KEY, btree (way, key)
    "way_tag_kv_idx" btree (key, value)
    "way_tag_wkv_idx" btree (way, key, value)
osm=# select count(*) from way_tags;
count
---------
4154315
(1 row)
osm=# select count(*) from (select distinct key from way_tags) as foo;
count
-------
322
(1 row)
osm=# select count(*) from (select distinct value from way_tags) as foo;
count
---------
1124909
(1 row)
Statement:
alter table way_tags alter column way set statistics 5000;
alter table way_tags alter column key set statistics 5000;
vacuum analyze;
results in a segmentation fault with the following backtrace:
(gdb) bt
#0 0x00000000004eecf7 in compute_scalar_stats (stats=0x1abd878,
fetchfunc=0x4f0f30 <std_fetch_func>, samplerows=<value optimized out>,
totalrows=4154315)
at analyze.c:2321
#1 0x00000000004efbf5 in analyze_rel (relid=16484, vacstmt=0x1aaf140,
bstrategy=<value optimized out>, update_reltuples=1 '\001') at
analyze.c:433
#2 0x0000000000538681 in vacuum (vacstmt=0x1aaf140, relid=<value optimized
out>,
do_toast=1 '\001', bstrategy=<value optimized out>, for_wraparound=0
'\0',
isTopLevel=<value optimized out>) at vacuum.c:466
#3 0x00000000005e8bc7 in PortalRunUtility (portal=0x1ae00c0,
utilityStmt=0x1aaf140,
isTopLevel=64 '@', dest=0xabc160, completionTag=0x7fff39e7f0e0 "") at
pquery.c:1192
#4 0x00000000005e9d0d in PortalRunMulti (portal=0x1ae00c0,
isTopLevel=<value optimized out>,
dest=0xabc160, altdest=0xabc160, completionTag=0x7fff39e7f0e0 "") at
pquery.c:1297
#5 0x00000000005ea482 in PortalRun (portal=0x1ae00c0,
count=9223372036854775807,
isTopLevel=1 '\001', dest=0xabc160, altdest=0xabc160,
completionTag=0x7fff39e7f0e0 "")
at pquery.c:823
#6 0x00000000005e5807 in exec_simple_query (query_string=0x1aae880 "vacuum
analyze way_tags;\n")
at postgres.c:991
#7 0x00000000005e6dc7 in PostgresMain (argc=1, argv=<value optimized out>,
username=0x1a07560 "dennis") at postgres.c:3606
#8 0x00000000005676af in main (argc=5, argv=0x1a05b70) at main.c:186
Note that if I run vacuum analyze between the two alter table statements,
everything completes fine.
After the segfault, restarting the server and then running
"vacuum analyze way_tags" is enough to trigger the segfault again.
"Dennis Noordsij" <dennis.noordsij@helsinki.fi> writes:
(gdb) bt
#0 0x00000000004eecf7 in compute_scalar_stats (stats=0x1abd878,
fetchfunc=0x4f0f30 <std_fetch_func>, samplerows=<value optimized out>,
totalrows=4154315)
at analyze.c:2321
#1 0x00000000004efbf5 in analyze_rel (relid=16484, vacstmt=0x1aaf140,
bstrategy=<value optimized out>, update_reltuples=1 '\001') at
analyze.c:433
Hmm, that code hasn't changed in quite some time, so I doubt this is a
new bug in 8.4. You'll need to either poke into it yourself, or supply
a dump of the table to someone who can.
regards, tom lane
On Tuesday 05 May 2009 17:16:03 Tom Lane wrote:
"Dennis Noordsij" <dennis.noordsij@helsinki.fi> writes:
(gdb) bt
#0 0x00000000004eecf7 in compute_scalar_stats (stats=0x1abd878,
fetchfunc=0x4f0f30 <std_fetch_func>, samplerows=<value optimized
out>, totalrows=4154315)
at analyze.c:2321
#1 0x00000000004efbf5 in analyze_rel (relid=16484, vacstmt=0x1aaf140,
bstrategy=<value optimized out>, update_reltuples=1 '\001') at
analyze.c:433
Hmm, that code hasn't changed in quite some time, so I doubt this is a
new bug in 8.4. You'll need to either poke into it yourself, or supply
a dump of the table to someone who can.
regards, tom lane
As an update on the bug (sorry if this arrived twice):
for (i = 0; i < num_hist; i++)
{
    int pos;

    pos = (i * (nvals - 1)) / (num_hist - 1);
    hist_values[i] = datumCopy(values[pos].value,
                               stats->attr->attbyval,
                               stats->attr->attlen);
}
What happens is that:
(gdb) print i
$17 = 1458
(gdb) print nvals
$18 = 1473527
(gdb) print num_hist
$19 = 5001
(gdb) print pos
$20 = -429313
(gdb) print samplerows
$22 = 1500000
(gdb) print values_cnt
$34 = 1500000
(gdb) print ndistinct
$35 = 904980
(gdb) print nmultiple
$36 = 435290
(gdb) print num_hist
$37 = 5001
(gdb) print dups_cnt
$38 = 0
(gdb) print slot_idx
$39 = 1
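The product i * (nvals - 1) = 1458 * 1473526 = 2148400908 exceeds INT_MAX
(2147483647), so it wraps around to a negative value.
Without the overflow, the result of
pos = (i * (nvals - 1)) / (num_hist - 1);
would be 429680, which would be a valid index into "values".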
Cheers
Dennis
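The arithmetic in the gdb session above can be checked with a small standalone sketch. This is not PostgreSQL code: the helper names pos_exact, pos_wrapped and first_overflow are hypothetical, and it assumes a 32-bit int with two's-complement wraparound. The wraparound is emulated rather than triggered, because signed overflow is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/* Exact histogram index, with the product formed in 64-bit arithmetic. */
static int64_t pos_exact(int i, int nvals, int num_hist)
{
    assert(num_hist > 1);
    return ((int64_t) i * (nvals - 1)) / (num_hist - 1);
}

/*
 * Index as it would come out if the product were truncated to a 32-bit
 * two's-complement int before the division (assumption: this is what
 * the crashing build did in practice).
 */
static int64_t pos_wrapped(int i, int nvals, int num_hist)
{
    assert(num_hist > 1);
    int64_t prod = (int64_t) i * (nvals - 1);
    uint32_t low = (uint32_t) prod;     /* reduce modulo 2^32, well-defined */
    int64_t wrapped = (low >= 0x80000000u)
        ? (int64_t) low - 4294967296LL  /* reinterpret as a negative int32 */
        : (int64_t) low;

    return wrapped / (num_hist - 1);    /* C division truncates toward zero */
}

/* Smallest loop counter i for which i * (nvals - 1) exceeds INT32_MAX. */
static int first_overflow(int nvals)
{
    assert(nvals > 1);
    return (int) (INT32_MAX / (int64_t) (nvals - 1)) + 1;
}
```

With the values from the session (i = 1458, nvals = 1473527, num_hist = 5001), pos_exact gives 429680 and pos_wrapped gives -429313, matching the reported pos. first_overflow(1473527) is 1458, so the crash occurs on the first iteration whose product no longer fits in 32 bits.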
Tom Lane wrote:
"Dennis Noordsij" <dennis.noordsij@helsinki.fi> writes:
(gdb) bt
#0 0x00000000004eecf7 in compute_scalar_stats (stats=0x1abd878,
fetchfunc=0x4f0f30 <std_fetch_func>, samplerows=<value optimized out>,
totalrows=4154315)
at analyze.c:2321
#1 0x00000000004efbf5 in analyze_rel (relid=16484, vacstmt=0x1aaf140,
bstrategy=<value optimized out>, update_reltuples=1 '\001') at
analyze.c:433
Hmm, that code hasn't changed in quite some time, so I doubt this is a
new bug in 8.4. You'll need to either poke into it yourself, or supply
a dump of the table to someone who can.
Hmm, not sure if it is related in any way, but setting statistics to more
than 1000 is "new" in 8.4...
Stefan
Dennis Noordsij <dennis.noordsij@movial.fi> writes:
What happens is [ integer overflow ]
Doh ... so the increase in the maximum histogram size *is* related.
Thanks, will fix.
regards, tom lane
I wrote:
Dennis Noordsij <dennis.noordsij@movial.fi> writes:
What happens is [ integer overflow ]
Doh ... so the increase in the maximum histogram size *is* related.
Thanks, will fix.
Patch is here if you need it right away:
http://archives.postgresql.org/message-id/20090505180211.B2A59754069@cvs.postgresql.org
regards, tom lane
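For readers who want to see how the subscript can be computed without overflow, here is one possible approach, sketched under assumptions. It never forms the product i * (nvals - 1). Instead it advances pos by (nvals - 1) / (num_hist - 1) on each step and carries the remainder separately. The helper name histogram_positions is hypothetical and the code is standalone, not taken from analyze.c. The patch linked above is the authoritative fix and may differ in detail. Widening the product to a 64-bit integer would be another option.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fill out[0 .. num_hist-1] with evenly spaced indexes into an array of
 * nvals entries, so that out[i] == (i * (nvals - 1)) / (num_hist - 1),
 * without ever computing the potentially overflowing product.
 */
static void histogram_positions(int nvals, int num_hist, int *out)
{
    assert(nvals >= 1 && num_hist >= 2);

    int delta = (nvals - 1) / (num_hist - 1);      /* whole step per slot */
    int deltafrac = (nvals - 1) % (num_hist - 1);  /* leftover per slot */
    int pos = 0;
    int posfrac = 0;

    for (int i = 0; i < num_hist; i++)
    {
        out[i] = pos;

        pos += delta;
        posfrac += deltafrac;
        if (posfrac >= num_hist - 1)
        {
            /* accumulated remainder reached one whole step: carry it */
            pos++;
            posfrac -= num_hist - 1;
        }
    }
}
```

With nvals = 1473527 and num_hist = 5001, this yields out[1458] == 429680 and out[5000] == 1473526 (the last valid index), agreeing with the 64-bit formula for every slot.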