8.4.0 data loss / HOT-related bug
Hello,
8.4.0 installation on i686 modern Linux, a simple production app with ~4
users entering data; normalised, no weird stuff happening there. Recently:
ERROR: missing chunk number 0 for toast value 120741 in pg_toast_29644
This table has table_log and a "before update" trigger setting
"NEW.last_modified = now()". The broken toast is a HTML blob, stored in
a "text" column.
This particular row currently has last_modified="2009-08-11", but table_log
says it has been updated on "2009-08-12" -- so, data loss.
Please see the attachment for the xmin,xmax,cmin,cmax,ctid values of
these rows. Andrew Gireth has asked for the hexdump of page 539 of the
"promocje" table: http://radek.cc/promocje.hexdump
About a week after I've upgraded to 8.4, the users of this application
have started to complain, that some rows are disappearing.
--
Radosław Zieliński <radek@pld-linux.org>
Attachments:
pg-84-hot-bug.txttext/plain; charset=us-asciiDownload
On Fri, Aug 21, 2009 at 4:04 PM, Radoslaw Zielinski<radek@pld-linux.org> wrote:
This particular row currently has last_modified="2009-08-11", but table_log
says it has been updated on "2009-08-12" -- so, data loss.
These are all very low transaction ids. you say it was a week of
production before this happened?
How did you migrate your data? Was it a normal pg_dump restore? Or did
you do anything funny with slony or pg_migrate?
Please see the attachment for the xmin,xmax,cmin,cmax,ctid values of
these rows. Andrew Gireth has asked for the hexdump of page 539 of the
"promocje" table: http://radek.cc/promocje.hexdump
Here's a dump of that page btw.
postgres=# select (h).* from (select heap_page_items(p) as h from p) as x;
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 |
t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits
| t_oid
----+--------+----------+--------+--------+--------+----------+----------+-------------+------------+--------+------------------------------------------+-------
1 | 6896 | 1 | 1296 | 3747 | 0 | 0 |
(539,1) | 15 | 10499 | 28 |
1011111000001110000000000000000000000000 |
2 | 5 | 2 | 0 | | | |
| | | |
|
3 | 1164 | 1 | 504 | 3842 | 3843 | 0 |
(539,3) | 15 | 2439 | 28 |
1011111000001110000000000000000000000000 |
4 | 6396 | 1 | 500 | 3806 | 0 | 0 |
(539,4) | 15 | 10503 | 28 |
1011111000001110000000000000000000000000 |
5 | 5928 | 1 | 468 | 3810 | 0 | 0 |
(539,5) | -32753 | 10503 | 28 |
1011111000001110000000000000000000000000 |
6 | 5176 | 1 | 752 | 3812 | 3813 | 0 |
(539,6) | 15 | 2439 | 28 |
1011111000001110000000000000000000000000 |
7 | 4600 | 1 | 576 | 3819 | 6179 | 0 |
(539,12) | 16399 | 10503 | 28 |
1011101000001110000000000000000000000000 |
8 | 4076 | 1 | 524 | 3823 | 3825 | 0 |
(539,8) | 15 | 2439 | 28 |
1011111000001110000000000000000000000000 |
9 | 2872 | 1 | 1204 | 3830 | 0 | 0 |
(539,9) | 15 | 10499 | 28 |
1011111000001110000000000000000000000000 |
10 | 1668 | 1 | 1204 | 3831 | 3832 | 0 |
(539,10) | 15 | 2435 | 28 |
1011111000001110000000000000000000000000 |
11 | 0 | 3 | 0 | | | |
| | | |
|
12 | 0 | 0 | 0 | | | |
| | | |
|
(12 rows)
On Fri, Aug 21, 2009 at 5:03 PM, Greg Stark<gsstark@mit.edu> wrote:
On Fri, Aug 21, 2009 at 4:04 PM, Radoslaw Zielinski<radek@pld-linux.org> wrote:
Here's a dump of that page btw.
... [crap word-wrapped by gmail]
Sorry, clearly gmail sucks for posting things that you actually care
about how they're formatted. I've attached the page dump.
It looks like the row *was* updated by transaction 6179 and the new
version was stored in line pointer 12. However it's marked
XMAX_INVALID which means at least somebody at some point thought 6179
had aborted and marked that hint bit.
Do we have any tools to look up a transaction id status in the clog?
Or alternatively do you have all the wal logs and could you run
xlogdump on them looking for transaction id 6179?
Attachments:
Radoslaw Zielinski wrote:
Hello,
8.4.0 installation on i686 modern Linux, a simple production app with ~4
users entering data; normalised, no weird stuff happening there. Recently:ERROR: missing chunk number 0 for toast value 120741 in pg_toast_29644
bug=# select id, last_modified, xmin,xmax,cmin,cmax,ctid from public.promocje where id = 4300;
id | last_modified | xmin | xmax | cmin | cmax | ctid
------+----------------------------+------+------+------+------+---------
4300 | 2009-08-11 09:54:11.007351 | 3819 | 6179 | 0 | 0 | (539,7)
(1 row)
I wonder if this could be explained by xid=6179 not marked as committed
in clog. I'd try flipping that bit and see what happens ...
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Greg Stark wrote:
It looks like the row *was* updated by transaction 6179 and the new
version was stored in line pointer 12. However it's marked
XMAX_INVALID which means at least somebody at some point thought 6179
had aborted and marked that hint bit.
Hmm, but LP 12 shouldn't have len=0 then, no? Unless it has been
clobbered by vacuum or HOT pruning, I guess ...
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
I wonder if this could be explained by xid=6179 not marked as committed
in clog. I'd try flipping that bit and see what happens ...
Once the row's been marked XMAX_INVALID, changing clog won't change that
marking ...
regards, tom lane
On Fri, Aug 21, 2009 at 5:15 PM, Alvaro
Herrera<alvherre@commandprompt.com> wrote:
I wonder if this could be explained by xid=6179 not marked as committed
in clog. I'd try flipping that bit and see what happens ...
Well nothing's going to help much now. Firstly, once the hint bit gets
set nothing second-guesses that and checks the clog anyways. And
secondly the new version of the tuple is already vacuumed.
Either of two things are true.
Either transaction 6179 committed, which would explain why the toast
tuples are missing. In which case sometime later this hint bit became
set and the new version pruned. I don't know if bad memory could cause
all that to happen, would the HOT pruning logic see the hint bit and
decide to prune based on that? I suppose a bad bit hitting the clog
could cause everything though.
Alternatively 6179 aborted but somebody along the way got that wrong
and marked the toast tuples dead (and maybe vacuumed them) thinking it
had committed. That's going to be harder to tell if that's what
happened because we don't have any pointers to the specific page in
the toast table. Not unless you can dump the whole index and find
pointers in there or can find the details in the wal log.
Greg Stark <gsstark@mit.edu> [2009-08-21 18:03]:
On Fri, Aug 21, 2009 at 4:04 PM, Radoslaw Zielinski<radek@pld-linux.org> wrote:
This particular row currently has last_modified="2009-08-11", but table_log
says it has been updated on "2009-08-12" -- so, data loss.
These are all very low transaction ids. you say it was a week of
production before this happened?
Upgraded around:
$ rpm -q postgresql --qf '%{INSTALLTIME}' | xargs perl -le 'print~~localtime shift'
Fri Aug 7 10:06:44 2009
The first mention of the error in the app's output:
[info] *** Request 5532 (0.008/s) [7283] [Wed Aug 19 13:06:35 2009] ***
User has reported "disappearing rows" at 12 Aug 2009, 16:58.
How did you migrate your data? Was it a normal pg_dump restore? Or did
you do anything funny with slony or pg_migrate?
Plain regular dump/restore (from text dump). There's no slony or
anything on this database except for table_log and the trigger to update
last_modified.
In case it might be relevant: locale C, encoding utf8.
--
Radosław Zieliński <radek@pld-linux.org>
Greg Stark <gsstark@mit.edu> [2009-08-21 18:12]:
[...]
Or alternatively do you have all the wal logs and could you run
xlogdump on them looking for transaction id 6179?
Nope, I don't have the wal logs.
--
Radosław Zieliński <radek@pld-linux.org>
2009/8/21 Radoslaw Zielinski <radek@pld-linux.org>:
Greg Stark <gsstark@mit.edu> [2009-08-21 18:03]:
On Fri, Aug 21, 2009 at 4:04 PM, Radoslaw Zielinski<radek@pld-linux.org> wrote:
This particular row currently has last_modified="2009-08-11", but table_log
says it has been updated on "2009-08-12" -- so, data loss.These are all very low transaction ids. you say it was a week of
production before this happened?Upgraded around:
$ rpm -q postgresql --qf '%{INSTALLTIME}' | xargs perl -le 'print~~localtime shift'
Fri Aug 7 10:06:44 2009
Is this, perchance, new hardware? Did you test the memory in it?
Do you have other records that have "disappeared"?
Greg Stark <gsstark@mit.edu> [2009-08-21 18:55]:
[...]
Is this, perchance, new hardware? Did you test the memory in it?
It's a "virtual private server"; the hosting provider is swearing
everything's fine. I can't vouch for it myself, obviously.
Do you have other records that have "disappeared"?
I haven't found any (yet). So far, verified with:
create function exec(text) returns void language plpgsql as $f$ begin execute $1; end; $f$;
select exec(
'select log_id, id from ( select distinct ''' || relname || ''', case first_value(trigger_mode) over w when ''DELETE'' then null else id end as log_id, first_value(last_modified) over w as lm from log.' || relname || ' window w as (partition by id order by id desc,trigger_id desc ) )x left join public.' || relname || ' on id=log_id where ( id is null or last_modified <> x.lm ) and log_id is not null'
) from pg_stat_user_tables where schemaname = 'log';
(Looks at the log.* table, finds the largest last_modified of the
non-deleted rows, joins the public.* table on id, compares
last_modified).
--
Radosław Zieliński <radek@pld-linux.org>
Greg Stark <gsstark@mit.edu> [2009-08-21 18:38]:
[...]
the toast table. Not unless you can dump the whole index and find
pointers in there or can find the details in the wal log.
You mean the primary key index? How do I dump it?
--
Radosław Zieliński <radek@pld-linux.org>
2009/8/21 Radoslaw Zielinski <radek@pld-linux.org>:
Greg Stark <gsstark@mit.edu> [2009-08-21 18:38]:
[...]the toast table. Not unless you can dump the whole index and find
pointers in there or can find the details in the wal log.You mean the primary key index? How do I dump it?
Actually I mean the key for the toast table.
Let me ask firstly do you get anything if you just do select * from
pg_toast.pg_toast_29644 where chunk_id = 120741 ?
And secondly, what do you get if you do "select length(htmlblob) from
tab where id = ?" which should work even if the toast data is broken.
To dump the index -- this will be tedious, but:
Load the pageinspect contrib module and run this:
select (x).* from (select bt_page_items('t_pkey',1) as x ) as y;
select (x).* from (select bt_page_items('t_pkey',2) as x ) as y;
...
Until you get an error. Then look through the output for any pointers
The hex at the end is the index key which in the case of a toast table
starts with the chunk id, you're looking for 120741 which is 0001D7A5
in hex or A5D70100 in little-endian...
If you find any such rows send them and they might tell us what's
going on or where to look for the missing toast records to see what's
going on.
However I'm kind of skeptical that you'll find any. IIRC Btree marks
pointers that it finds point to dead records and reuses them as soon
as possible. Still, maybe nothing's been inserted on that page yet.
Radoslaw Zielinski <radek@pld-linux.org> writes:
Greg Stark <gsstark@mit.edu> [2009-08-21 18:55]:
Is this, perchance, new hardware? Did you test the memory in it?
It's a "virtual private server"; the hosting provider is swearing
everything's fine. I can't vouch for it myself, obviously.
Hmmm ... have you had any crashes? I'd wonder how trustworthy fsync is
on such a setup.
regards, tom lane
Greg Stark <gsstark@mit.edu> [2009-08-21 20:30]:
[...]
Actually I mean the key for the toast table.
Let me ask firstly do you get anything if you just do select * from
pg_toast.pg_toast_29644 where chunk_id = 120741 ?
0 rows.
And secondly, what do you get if you do "select length(htmlblob) from
tab where id = ?" which should work even if the toast data is broken.
bug=# select length(opis_szczeg) from promocje where id = 4300;
ERROR: missing chunk number 0 for toast value 120741 in pg_toast_29644
To dump the index -- this will be tedious, but:
$ seq 11 | xargs -ti psql bug -c "select {}, (x).* from (select bt_page_items('promocje_pkey',{}) as x ) as y"
[...]
psql bug -c select 11, (x).* from (select bt_page_items('promocje_pkey',11) as x ) as y
ERROR: block number out of range
Output: http://radek.cc/promocje_bpi.txt.gz
[...]
If you find any such rows send them and they might tell us what's
going on or where to look for the missing toast records to see what's
going on.
No luck.
--
Radosław Zieliński <radek@pld-linux.org>
Tom Lane <tgl@sss.pgh.pa.us> [2009-08-21 20:35]:
Radoslaw Zielinski <radek@pld-linux.org> writes:
Greg Stark <gsstark@mit.edu> [2009-08-21 18:55]:
Is this, perchance, new hardware? Did you test the memory in it?
It's a "virtual private server"; the hosting provider is swearing
everything's fine. I can't vouch for it myself, obviously.
Hmmm ... have you had any crashes? I'd wonder how trustworthy fsync is
on such a setup.
Nope (at least nothing I have noticed). BTW: it's OpenVZ.
--
Radosław Zieliński <radek@pld-linux.org>
2009/8/22 Radoslaw Zielinski <radek@pld-linux.org>:
bug=# select length(opis_szczeg) from promocje where id = 4300;
ERROR: missing chunk number 0 for toast value 120741 in pg_toast_29644
Sorry, what datatype is this again? And what encoding?
Perhaps I should have said octet_length() instead of length().
2009/8/22 Radoslaw Zielinski <radek@pld-linux.org>:
$ seq 11 | xargs -ti psql bug -c "select {}, (x).* from (select bt_page_items('promocje_pkey',{}) as x ) as y"
[...]
psql bug -c select 11, (x).* from (select bt_page_items('promocje_pkey',11) as x ) as y
ERROR: block number out of range
Sorry, I forgot to adjust the index name in the example. We don't want
the primary key index of your table we want the index of the toast
table.
Try the same thing with bt_page_items('pg_toast.pg_toast_29644_index',...)
Greg Stark <gsstark@mit.edu> [2009-08-23 03:40]:
2009/8/22 Radoslaw Zielinski <radek@pld-linux.org>:
bug=# select length(opis_szczeg) from promocje where id = 4300;
ERROR: missing chunk number 0 for toast value 120741 in pg_toast_29644Sorry, what datatype is this again? And what encoding?
text, UTF-8 (locale C).
Perhaps I should have said octet_length() instead of length().
bug=# select octet_length(opis_szczeg) from promocje where id = 4300;
octet_length
--------------
3081
(1 row)
--
Radosław Zieliński <radek@pld-linux.org>
Greg Stark <gsstark@mit.edu> [2009-08-23 03:44]:
2009/8/22 Radoslaw Zielinski <radek@pld-linux.org>:
$ seq 11 | xargs -ti psql bug -c "select {}, (x).* from (select bt_page_items('promocje_pkey',{}) as x ) as y"
[...]
psql bug -c select 11, (x).* from (select bt_page_items('promocje_pkey',11) as x ) as y
ERROR: block number out of range
Sorry, I forgot to adjust the index name in the example. We don't want
the primary key index of your table we want the index of the toast
table.
Sorry, I got confused by the index not being shown in \d pg_toast.pg_toast_29644.
Try the same thing with bt_page_items('pg_toast.pg_toast_29644_index',...)
bug=# select * from bt_page_items('pg_toast.pg_toast_29644_index', 8) where data ~ 'a5 d7 01 00';
-[ RECORD 1 ]-----------------------
itemoffset | 121
ctid | (458,3)
itemlen | 16
nulls | f
vars | f
data | a5 d7 01 00 00 00 00 00
Full output, without the WHERE: http://radek.cc/promocje_bpi2.txt.gz
get_raw_page( 'pg_toast.pg_toast_29644', 458 ): http://radek.cc/toast_29644_page_458.txt
bug=# select * from heap_page_items( get_raw_page( 'pg_toast.pg_toast_29644', 458 ) );
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
----+--------+----------+--------+--------+--------+----------+---------+-------------+------------+--------+--------+-------
1 | 6156 | 1 | 2036 | 3816 | 0 | 1 | (458,1) | 3 | 2306 | 24 | |
2 | 4176 | 1 | 1979 | 3816 | 0 | 1 | (458,2) | 3 | 2306 | 24 | |
3 | 0 | 3 | 0 | | | | | | | | |
4 | 2140 | 1 | 2036 | 3823 | 0 | 0 | (458,4) | 3 | 2306 | 24 | |
(4 rows)
bug=# select * from bt_page_stats('pg_toast.pg_toast_29644_index', 8);
-[ RECORD 1 ]-+-----
blkno | 8
type | l
live_items | 366
dead_items | 1
avg_item_size | 16
page_size | 8192
free_size | 808
btpo_prev | 7
btpo_next | 9
btpo | 0
btpo_flags | 65
--
Radosław Zieliński <radek@pld-linux.org>