Page layout footprint

Started by Zdenek Kotala over 17 years ago, 10 messages
#1Zdenek Kotala
Zdenek.Kotala@Sun.COM

Hi Heikki,

I'm sorry for the lack of explanation; that is my fault.

Heikki says (on commit fest wiki):
------------
I believe I debunked this patch enough already. Apparently there's some
compatibility issue between 32-bit and 64-bit Sparcs, but this patch
didn't catch that. It doesn't seem like this provides any extra safeness
or better error messages. If I'm missing something, please provide more
details on what scenario we currently have a problem, and how this helps
with it.
------------

The original proposal
(http://archives.postgresql.org/message-id/489FC8E1.9090307@sun.com)
contains two parts. The first part implements a --footprint command-line
switch that prints the footprint of the page layout structures. It is
useful for development (mostly for in-place upgrade) and for manual data
recovery, when you need to know the exact structures. The second part
added this information to the pg_control file, but as you correctly
pointed out, there is no real use case for that at the moment.

However, the --footprint switch is still useful, which is why I put it
on the wiki for review and feedback. The switch could also be used in
the build farm to collect footprints from build farm members.
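
To illustrate what a structure footprint might contain, here is a
minimal C sketch. It is not the patch from the proposal: DemoPageHeader
is a hypothetical stand-in (not PostgreSQL's real PageHeaderData), and
FieldFootprint, demo_fields, demo_field_count and print_footprint are
names invented for this example. It only shows the general idea of
reporting each member's offset and size as the compiler laid them out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a page header. This is NOT PostgreSQL's
 * real PageHeaderData; the members are made up for illustration. */
typedef struct DemoPageHeader {
    uint64_t lsn;
    uint16_t tli;
    uint16_t flags;
    uint16_t lower;
    uint16_t upper;
    uint16_t special;
    uint16_t pagesize_version;
} DemoPageHeader;

/* One footprint entry: member name, byte offset, byte size. */
typedef struct FieldFootprint {
    const char *name;
    size_t offset;
    size_t size;
} FieldFootprint;

/* sizeof does not evaluate its operand, so the null pointer is never
 * dereferenced here. */
#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *) 0)->member) }

const FieldFootprint demo_fields[] = {
    FIELD(DemoPageHeader, lsn),
    FIELD(DemoPageHeader, tli),
    FIELD(DemoPageHeader, flags),
    FIELD(DemoPageHeader, lower),
    FIELD(DemoPageHeader, upper),
    FIELD(DemoPageHeader, special),
    FIELD(DemoPageHeader, pagesize_version),
};

size_t demo_field_count(void)
{
    return sizeof(demo_fields) / sizeof(demo_fields[0]);
}

/* Print the layout as seen by the compiler that built this binary. */
void print_footprint(FILE *out)
{
    assert(out != NULL);
    fprintf(out, "DemoPageHeader: %zu bytes\n", sizeof(DemoPageHeader));
    for (size_t i = 0; i < demo_field_count(); i++)
        fprintf(out, "  %-18s offset %2zu size %zu\n",
                demo_fields[i].name,
                demo_fields[i].offset,
                demo_fields[i].size);
}
```

Calling print_footprint(stdout) from a small main() prints the table for
the build in question. The total struct size is deliberately not fixed
in the sketch, because trailing padding depends on the platform ABI.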

The 32/64-bit issue is a slightly different story, and it is general
(not specific to SPARC, though the impact is bigger there). The problem
is that CRC32 probably gives a different result when compiled as 32-bit
versus 64-bit. I'm going to examine it further.

Zdenek

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Zdenek Kotala (#1)
Re: Page layout footprint

Zdenek Kotala wrote:

Hi Heikki,

I'm sorry for lack of explanation. It is my fault.

Heikki says (on commit fest wiki):
------------
I believe I debunked this patch enough already. Apparently there's some
compatibility issue between 32-bit and 64-bit Sparcs, but this patch
didn't catch that. It doesn't seem like this provides any extra safeness
or better error messages. If I'm missing something, please provide more
details on what scenario we currently have a problem, and how this helps
with it.
------------

The original proposal
(http://archives.postgresql.org/message-id/489FC8E1.9090307@sun.com)
contains two parts. First part is implementation of --footprint cmd line
switch which shows you page layout structures footprint. It is useful
for development (mostly for in-place upgrade) and also for manual data
recovery when you need to know exact structures. Second part was add
this information also into pg_control.file, but how you correctly
mentioned there is not real use case to do it at this moment.

However, there is still --footprint switch which is useful and it is
reason why I put it on wiki for review and feedback. The switch could
also use in build farm for collecting footprints from build farm members.

Ok, understood. I'll take another look from that point of view.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Zdenek Kotala (#1)
Re: Page layout footprint

Zdenek Kotala wrote:

32/64 bit issue is little bit different story and it is general (not
only SPARC but on SPARC has bigger impact). Problem is that CRC32 gives
probably different result when it is compiled 32bit or 64bit. I'm going
to examine it more.

I'm sorry about the noise. Everything works as expected. I tested 8.3
but looked at the 8.4 code :(. The problem with 8.3 is that the control
file uses time_t, which is defined as long and is therefore not portable
between 32-bit and 64-bit server builds. Version 8.4 is OK because it
uses pg_time_t, which is always 64 bits wide.
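
The effect can be sketched in C as follows. The two structs are
hypothetical and far smaller than the real control file layout; the
names DemoControlNative, DemoControlFixed and the helper functions are
assumptions made for this example. The sketch uses long directly to
mirror the "time_t defined as long" case described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8.3-style record: the timestamp has the platform's
 * native long width, so the on-disk layout depends on the build. */
typedef struct DemoControlNative {
    uint32_t version;
    long     checkpoint_time;
    uint32_t crc;
} DemoControlNative;

/* Hypothetical 8.4-style record: the timestamp is a fixed 64-bit type,
 * so its width is the same in 32-bit and 64-bit builds. */
typedef struct DemoControlFixed {
    uint32_t version;
    int64_t  checkpoint_time;
    uint32_t crc;
} DemoControlFixed;

/* Width of the timestamp in the build-dependent record. */
size_t native_time_width(void)
{
    return sizeof(((DemoControlNative *) 0)->checkpoint_time);
}

/* Width of the timestamp in the fixed-width record: always 8. */
size_t fixed_time_width(void)
{
    return sizeof(((DemoControlFixed *) 0)->checkpoint_time);
}

/* Nonzero when this build's long happens to match the fixed width. */
int native_matches_fixed(void)
{
    assert(fixed_time_width() == 8);
    return native_time_width() == fixed_time_width();
}
```

On a build where long is 4 bytes, native_matches_fixed() returns 0, and
every member after the timestamp sits at a different offset than in a
build where long is 8 bytes. A file written by one build then cannot be
read correctly by the other.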

Zdenek

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Zdenek Kotala (#1)
Re: Page layout footprint

Zdenek Kotala wrote:

The original proposal
(http://archives.postgresql.org/message-id/489FC8E1.9090307@sun.com)
contains two parts. First part is implementation of --footprint cmd line
switch which shows you page layout structures footprint. It is useful
for development (mostly for in-place upgrade) and also for manual data
recovery when you need to know exact structures.

I'm afraid I still fail to see the usefulness of this. gdb knows how to
deal with structs, and for manual data recovery you need so much more
than the page header structure. And if you're working at such a low
level, it's not that hard to calculate the offsets within the struct
manually.
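
The manual calculation referred to here can be sketched in C. This
follows the usual C layout rule (each member is placed at the next
offset that is a multiple of its alignment, and the total size is
rounded up to the largest member alignment). The names FieldSpec,
align_up and layout_struct are invented for this example, and the
alignment values must come from the platform ABI in question.

```c
#include <assert.h>
#include <stddef.h>

/* Size and required alignment of one struct member, in bytes. */
typedef struct FieldSpec {
    size_t size;
    size_t align;
} FieldSpec;

/* Round offset up to the next multiple of align (align must be > 0). */
size_t align_up(size_t offset, size_t align)
{
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

/* Compute member offsets for n fields laid out in declaration order.
 * Writes each offset into offsets[] and returns the padded total size. */
size_t layout_struct(const FieldSpec *fields, size_t n, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        offset = align_up(offset, fields[i].align);
        offsets[i] = offset;
        offset += fields[i].size;
        if (fields[i].align > max_align)
            max_align = fields[i].align;
    }
    /* Trailing padding so that arrays of the struct stay aligned. */
    return align_up(offset, max_align);
}
```

For a struct holding a 1-byte, an 8-byte and a 2-byte member, an ABI
that aligns the 8-byte member to 8 gives offsets 0, 8, 16 and a total
of 24 bytes. An ABI that aligns it to 4 gives offsets 0, 4, 12 and a
total of 16 bytes.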

BTW, this makes me wonder if it would be possible to use the
upgrade-in-place machinery to convert a data directory from one
architecture to another? Just a thought..

However, there is still --footprint switch which is useful and it is
reason why I put it on wiki for review and feedback. The switch could
also use in build farm for collecting footprints from build farm members.

If we needed more information about the architectures, we could just
collect the output of pg_controldata. But I think the configure logs
already contain all the useful information.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#5Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Heikki Linnakangas (#4)
Re: Page layout footprint

Heikki Linnakangas wrote:

Zdenek Kotala wrote:

The original proposal
(http://archives.postgresql.org/message-id/489FC8E1.9090307@sun.com)
contains two parts. First part is implementation of --footprint cmd
line switch which shows you page layout structures footprint. It is
useful for development (mostly for in-place upgrade) and also for
manual data recovery when you need to know exact structures.

I'm afraid I still fail to see the usefulness of this. gdb knows how to
deal with structs,

If I'm correct, GDB knows the structures only if you have a debug build. But
usually you don't have a debug build on a production system. Another small
advantage is that the --footprint switch is easy to use. It is easier than
working with gdb, and you can easily get the information from users who are not
familiar with gdb.

and for manual data recovery you need so much more
than the page header structure.

Yeah, I know, but I didn't want to spend several days on full coding before the
idea was approved. Special data, meta pages and so on would have to be added.

And if you're working at such a low
level, it's not that hard to calculate the offsets within the struct
manually.

I'm not sure it is so easy. Are you able to do it for SPARC, PPC or other
non-x86 CPUs?

BTW, this makes me wonder if it would be possible to use the
upgrade-in-place machinery to convert a data directory from one
architecture to another? Just a thought..

Hmm, good question. For example, ZFS is platform independent: you can take a
disk from a SPARC machine, plug it into an x86 machine, and ZFS works perfectly.
ZFS converts its own data during reads, and any new block is written in the new
format. You are able to read all platform-independent binary data such as MP3,
JPEG and so on. But PostgreSQL will not work: it fails while reading the
pg_control file because the endianness is different.
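
The byte-order problem can be sketched in C. The helpers below are
hypothetical (swap_u32, host_is_little_endian and read_u32_be are names
invented for this example, not PostgreSQL functions). They show why a
field written in one machine's native byte order reads as a different
value on a machine with the opposite byte order, and how a format with
a defined byte order avoids that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Nonzero when the least significant byte is stored first. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Decode a 32-bit big-endian field from raw bytes. The result is the
 * same on every host because it never reinterprets memory directly. */
uint32_t read_u32_be(const unsigned char *buf)
{
    assert(buf != NULL);
    return ((uint32_t) buf[0] << 24) |
           ((uint32_t) buf[1] << 16) |
           ((uint32_t) buf[2] << 8)  |
            (uint32_t) buf[3];
}
```

A reader that copies the raw bytes straight into a uint32_t gets the
intended value only when its byte order matches the writer's. Otherwise
it sees the swap_u32() of that value, so a version or CRC check on such
a field fails.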

Converting data structures like PageHeader and so on could be possible, but you
don't have control over user data types.

I think in this case it is better to develop platform-independent replication
and use that mechanism for data transfer.

However, there is still --footprint switch which is useful and it is
reason why I put it on wiki for review and feedback. The switch could
also use in build farm for collecting footprints from build farm members.

If we needed more information about the architectures, we could just
collect the output of pg_controldata. But I think the configure logs
already contains all the useful information.

It seems to be a good idea. All we need is to extend the buildfarm to parse
config.log and show this data for each build machine. It could also report
changes in alignment.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

#6Gregory Stark
stark@enterprisedb.com
In reply to: Zdenek Kotala (#5)
Re: Page layout footprint

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Hmm, good question. For example ZFS is platform independent, you can take disk
from SPARC machine and plug it into x86 and ZFS works perfectly.

FWIW as far as I know *all* filesystems are platform independent. (Of course
now someone is surely going to find some counter-example) Doesn't really
change the argument though.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

#7Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Gregory Stark (#6)
Re: Page layout footprint

Gregory Stark wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Hmm, good question. For example ZFS is platform independent, you can take disk
from SPARC machine and plug it into x86 and ZFS works perfectly.

FWIW as far as I know *all* filesystems are platform independent. (Of course
now someone is surely going to find some counter-example) Doesn't really
change the argument though.

Yeah, of course. I chose a bad word. ZFS writes data structures in native endian
format, so it does not need to convert data to a fixed byte order. Only if you
move a hard disk from SPARC to x86 does the byte order need converting. Other
filesystems pay a penalty for endian conversion on the opposite platform, or
conversion is not supported. However, I'm not sure whether an ext3 filesystem
created on an x86 machine is readable under Linux on a SPARC machine. FAT32
works fine :-) everywhere.

Maybe "optimized platform independence" is a better term.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

#8Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Zdenek Kotala (#5)
Re: Page layout footprint

Zdenek Kotala wrote:

Heikki Linnakangas wrote:

I'm afraid I still fail to see the usefulness of this. gdb knows how
to deal with structs,

If I correct that GDB knows structure only if you have debug version.
But usually you don't have debug version on production system.

Using gdb without debug symbols is pretty much a lost cause anyway.

And
another small advantage is that --footprint switch is easy to use. It is
easier that work with gdb and you can easy get info from users who are
not familiar with gdb.

AFAICS you can get all the same information from pg_controldata. We have
a pretty good idea of the alignments of all the usual platforms anyway.
If someone says in a bug report that they're running on x86_64 or 32-bit
Sparc, we know what the alignments on those platforms are.

And if you're working at such a low level, it's not that hard to
calculate the offsets within the struct manually.

I'm not sure if it is so easy. Are you able do it for SPARC, PPC or
other non x86 CPUs?

Not off the top of my head. But I am able to do it on x86 and x86_64
which are the platforms I work and debug on.

If we needed more information about the architectures, we could just
collect the output of pg_controldata. But I think the configure logs
already contains all the useful information.

It seems to be good idea. Only what we need is extend buildfarm to parse
config.log and shows this data for each build machine.

Well, the information is already there. I'm not convinced it's such an
important issue that it's worth the effort to add special handling to
extract that information from the log. Of course, if someone feels
otherwise and does it, I won't object.

It could also report changes in alignment.

Huh? I would be pretty surprised if the alignment changed randomly on
some platform.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#9Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Heikki Linnakangas (#8)
Re: Page layout footprint

Heikki Linnakangas wrote:

Zdenek Kotala wrote:

Heikki Linnakangas wrote:

<snip>

AFAICS you can get all the same information from pg_controldata. We have
a pretty good idea of the alignments of all the usual platforms anyway.
If someone says in a bug report that they're running on x86_64 or 32-bit
Sparc, we know what the alignments on those platforms are.

And if you're working at such a low level, it's not that hard to
calculate the offsets within the struct manually.

I'm not sure if it is so easy. Are you able do it for SPARC, PPC or
other non x86 CPUs?

Not off the top of my head. But I am able to do it on x86 and x86_64
which are the platforms I work and debug on.

OK. You convinced me that the information can be collected from other sources.

<snip>

It could also report changes in alignment.

Huh? I would be pretty surprised if the alignment changed randomly on
some platform.

I was thinking of someone changing compiler switches, for example 32/64-bit
compilation. But that is probably a rare case.

Zdenek

--
Zdenek Kotala Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Zdenek Kotala (#9)
Re: Page layout footprint

Zdenek Kotala wrote:

OK. You convinced me that information could be collected from other
sources.

Great, I'm glad we're in agreement.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com