(Updated) Table File Format

Started by Michael Richardsabout 25 years ago5 messages
#1Michael Richards
miker@interchange.ca

I need a little help on the format of the postgres tables.

I've got this wonderfully corrupted database where just about everything is
fubar. I've tried a number of things to get it back using postgres and
related tools with no success. It looks like most of the data is there, but
there may be a small amount of corruption that's causing all kinds of
problems.

I've broken down and begin development of a tool to allow examination of
the
data within the table files. This could actually be useful for recovering
and undoing changes (or at least until the row-reuse code goes into
production).

I've been hacking the file format and trying to find stuff in the source
and
docs as much as possible, but here goes...

a) tuples cannot span multiple pages (yet).
b) the data is not platform independant??? Ie the data from a sun looks
different from an intel?

For every page, I see that the first 2 words are for the end of the tuple
pointers and the beginning of the tuple data.

What are the next 2 words used for? In all my cases they appear to be set
to
0x2000.

Following that I find the 2 word tuple pointers.
The first word appears to be the offset in the page where the tuple can be
found but the MSB has to be stripped off (haven't found it's function in the
source yet).
The second is the transactionid that, if comitted gives this tuple
visibility???

Are these tuple pointers always stored in order of last to first? Or should
I be loading and sorting them according to offset?

Now on to the tuple data... I have my tool to the point where it extracts
all the tuple data from the table, but I haven't been able to find the place
in the postgres source that explains the format. I assume a tuple contains a
number of attributes (referencing pg_attribute). Those not found in the
tuple would be assumed to be NULL.

Since I'm ignoring transaction ids right now, I'm planning on extracting
all
the tuple and ordering them by oid so you can see all the comitted and
uncomitted changes. I may even make it look good once I've recovered my
data...

-Michael

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Richards (#1)
Re: (Updated) Table File Format

"Michael Richards" <miker@interchange.ca> writes:

Following that I find the 2 word tuple pointers.
The first word appears to be the offset in the page where the tuple can be
found but the MSB has to be stripped off (haven't found it's function in the
source yet).
The second is the transactionid that, if comitted gives this tuple
visibility???

No, offset and length --- there is also a 2-bit flags field. Look at
the page and item declarations in src/include/storage/

Someone else was recently working on a bit-level dump tool, but I've
forgotten who.

regards, tom lane

#3Hannu Krosing
hannu@tm.ee
In reply to: Michael Richards (#1)
2 attachment(s)
Re: (Updated) Table File Format

Michael Richards wrote:

I need a little help on the format of the postgres tables.

I've got this wonderfully corrupted database where just about everything is
fubar. I've tried a number of things to get it back using postgres and
related tools with no success. It looks like most of the data is there, but
there may be a small amount of corruption that's causing all kinds of
problems.

Find attached a python script that I used to get deleted (actually all
;)
records from a database table.

It was not corrupted, just a simple programming error in client software
had deleted more than needed.

Fortunately it was not vacuumed so the data itself (info for web-based
paper
postcard sending system) was there

It works as-is only for my table as the field extraction code is
hard-coded, but
it should be quite easy to modify for your needs

It worked 1 year ago probably on 6.4.x . I hope that the structure had
not
changed since.

sendcard.py is the actual script used, pgtabdump.py is a somewhat
cleaned-up version

---------------
Hannu

Attachments:

pgtabdump.pytext/plain; charset=us-ascii; name=pgtabdump.pyDownload
sendcard.pytext/plain; charset=us-ascii; name=sendcard.pyDownload
#4Michael Richards
miker@interchange.ca
In reply to: Michael Richards (#1)
Re: (Updated) Table File Format

Okay,

Where would I find a definition of the tuple data? I didn't see anything
promising in include/storage?

I've found a definition for the page inside pagebuf.h That clears up all the
page stuff. I'm still having a little trouble decoding the tuple data
within. Hannu Krosing sent me a python script to do the extract, but having
never seen a line of Python before in my life, I'm having a little trouble
with the actual tuple data. I can see where the actual transaction
visibility info is in the tuple data, but the actual data... nope. My
program (c++) is at the point where it will create tuple objects for every
block of "tuple" data within the page.

thanks
-Michael

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Michael Richards" <miker@interchange.ca>
Cc: <pgsql-hackers@postgresql.org>
Sent: Thursday, December 14, 2000 12:44 AM
Subject: Re: [HACKERS] (Updated) Table File Format

"Michael Richards" <miker@interchange.ca> writes:

Following that I find the 2 word tuple pointers.
The first word appears to be the offset in the page where the tuple can

be

found but the MSB has to be stripped off (haven't found it's function in

the

Show quoted text

source yet).
The second is the transactionid that, if comitted gives this tuple
visibility???

No, offset and length --- there is also a 2-bit flags field. Look at
the page and item declarations in src/include/storage/

Someone else was recently working on a bit-level dump tool, but I've
forgotten who.

regards, tom lane

#5Hannu Krosing
hannu@tm.ee
In reply to: Michael Richards (#1)
Re: (Updated) Table File Format

Michael Richards wrote:

Okay,

Where would I find a definition of the tuple data? I didn't see anything
promising in include/storage?

I've found a definition for the page inside pagebuf.h That clears up all the
page stuff. I'm still having a little trouble decoding the tuple data
within. Hannu Krosing sent me a python script to do the extract, but having
never seen a line of Python before in my life, I'm having a little trouble
with the actual tuple data. I can see where the actual transaction
visibility info is in the tuple data, but the actual data... nope. My
program (c++) is at the point where it will create tuple objects for every
block of "tuple" data within the page.

IIRC, the data field format for individual fields is the same as defined
in
the back-end/front-end protocol for binary cursors.

if there are any NULL fields in the record then there is a flag
somewhere in
the tuple header and a bitmap of N*32 bits (N=no_of_fields/32) for
missing .
It is possible that there is no flag and you must deduce the presence of
bitmap from the tuple-header length, im not sure which way it was.

The actual fields in a table and their order must be extracted from
pg_class
and pg_attribute tables.

------------
Hannu