Possible to go without page headers?

Started by Chris Cleveland, almost 4 years ago, 5 messages
#1 Chris Cleveland
ccleve+github@dieselpoint.com

I'm writing an index access method with its own unique file format. It
involves storing large blobs that break across pages.

The file format itself doesn't need or use page headers. There's no need
for a checksum or to manage free space within the page.

Can I treat pages as just a flat, open 8k buffer and fill them with
arbitrary data?

The reason I ask is that I see some reference to an LSN, used to determine
when to dump a dirty buffer to disk, and don't know whether that is
actually required. I plan to write a large number of pages all at once and
I'm not yet quite sure how WAL logging will work. I also see some
suggestion that the vacuum process uses page headers, but I haven't quite
figured that out either.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Chris Cleveland (#1)
Re: Possible to go without page headers?

Chris Cleveland <ccleve+github@dieselpoint.com> writes:

> Can I treat pages as just a flat, open 8k buffer and fill them with
> arbitrary data?

No, at least not unless you plan to reimplement much of the WAL
mechanism. You do need at least an LSN in the right place.
I kinda doubt that you can get away with ignoring checksumming,
either. On the whole, I think you'd be best off to use a standard
page header; the amount you're saving by avoiding that will be
minuscule, and the amount of work you cause for yourself probably
not so much.
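
To put numbers on the two points above, here is a minimal sketch in Python (not PostgreSQL code) that models the fixed 24-byte standard page header and the LSN at its start. The field order follows `PageHeaderData`; the helper names `pack_header`, `page_lsn` and `can_flush` are made up for illustration, and `can_flush` is only a simplified picture of the WAL-before-data rule, assuming the page may be written once WAL has been flushed at least up to the page's LSN.

```python
import struct

BLCKSZ = 8192  # default PostgreSQL block size

# Field order of the fixed header: pd_lsn (two uint32 halves), then
# pd_checksum, pd_flags, pd_lower, pd_upper, pd_special,
# pd_pagesize_version (uint16 each), then pd_prune_xid (uint32).
# "=" selects native byte order with no padding.
HEADER = struct.Struct("=IIHHHHHHI")

# Fraction of each page spent on the fixed header.
HEADER_OVERHEAD = HEADER.size / BLCKSZ


def pack_header(lsn, lower, upper, special, checksum=0, flags=0):
    """Hypothetical helper: build the 24-byte header for one page."""
    pagesize_version = BLCKSZ | 4  # page size combined with layout version 4
    return HEADER.pack(lsn >> 32, lsn & 0xFFFFFFFF, checksum, flags,
                       lower, upper, special, pagesize_version, 0)


def page_lsn(page):
    """Read the 64-bit LSN stored in the first 8 bytes of the page."""
    hi, lo = struct.unpack_from("=II", page, 0)
    return (hi << 32) | lo


def can_flush(page, wal_flushed_upto):
    """Simplified WAL-before-data check (an assumption of this sketch)."""
    return page_lsn(page) <= wal_flushed_upto
```

With these definitions `HEADER.size` is 24, so the header costs 24 of 8192 bytes per page, roughly 0.3%. A flat buffer with arbitrary bytes in the first 8 positions would be read as a nonsense LSN by anything that follows this rule.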

BTW, there are also tools such as pg_filedump that expect that index
pages can be identified by some sort of magic number kept in the
"special space" at the page tail. You're not absolutely bound to make
that work, but you'll be cutting yourself off from some potentially
handy support.
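
As a rough illustration of the "special space" idea, the following Python sketch (again a byte-layout model, not PostgreSQL code) reserves a small opaque struct at the page tail and keeps an identifying number in its last two bytes. The opaque layout, the value `MY_AM_PAGE_ID` and the helper names are all hypothetical; only the header offsets of `pd_lower`, `pd_upper` and `pd_special` are taken from the standard page header.

```python
import struct

BLCKSZ = 8192
HEADER_SIZE = 24        # size of the fixed standard page header
PD_LOWER_OFFSET = 12    # pd_lower, pd_upper, pd_special follow each other
PD_SPECIAL_OFFSET = 16  # byte offset of pd_special inside the header

# Hypothetical opaque struct for this sketch: next block (uint32),
# flags (uint16), page id (uint16). 8 bytes, so it stays 8-byte aligned.
OPAQUE = struct.Struct("=IHH")
MY_AM_PAGE_ID = 0xFF90  # made-up magic number for illustration


def init_page(next_block=0xFFFFFFFF):
    """Return an empty page whose tail holds the opaque struct."""
    page = bytearray(BLCKSZ)
    special = BLCKSZ - OPAQUE.size
    # Empty page: free space runs from the header to the special space.
    struct.pack_into("=HHH", page, PD_LOWER_OFFSET,
                     HEADER_SIZE, special, special)
    OPAQUE.pack_into(page, special, next_block, 0, MY_AM_PAGE_ID)
    return page


def read_opaque(page):
    """Locate the special space via pd_special and decode it."""
    (special,) = struct.unpack_from("=H", page, PD_SPECIAL_OFFSET)
    return OPAQUE.unpack_from(page, special)
```

Because the page id sits in the final two bytes of the block, a tool can look there without knowing anything else about the access method.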

regards, tom lane

#3 David Steele
david@pgmasters.net
In reply to: Tom Lane (#2)
Re: Possible to go without page headers?

On 2/14/22 16:19, Tom Lane wrote:

> Chris Cleveland <ccleve+github@dieselpoint.com> writes:
>
>> Can I treat pages as just a flat, open 8k buffer and fill them with
>> arbitrary data?
>
> No, at least not unless you plan to reimplement much of the WAL
> mechanism. You do need at least an LSN in the right place.
> I kinda doubt that you can get away with ignoring checksumming,
> either. On the whole, I think you'd be best off to use a standard
> page header; the amount you're saving by avoiding that will be
> minuscule, and the amount of work you cause for yourself probably
> not so much.
>
> BTW, there are also tools such as pg_filedump that expect that index
> pages can be identified by some sort of magic number kept in the
> "special space" at the page tail. You're not absolutely bound to make
> that work, but you'll be cutting yourself off from some potentially
> handy support.

You'll also get errors from external tools (like pgBackRest) that
validate checksums and headers.
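
A hedged sketch of why a headerless page tends to fail such validation: the Python function below is loosely modeled on the basic header sanity tests PostgreSQL applies when reading a page. What any particular external tool checks is an assumption here, and checksum verification is left out entirely. The name `header_looks_sane` is made up for this example.

```python
import struct

BLCKSZ = 8192
MAXALIGN = 8                 # typical maximum alignment, assumed here
PD_FLAGS_OFFSET = 10         # pd_flags, pd_lower, pd_upper, pd_special
PD_VALID_FLAG_BITS = 0x0007  # flag bits currently defined in pd_flags


def header_looks_sane(page):
    """Return True if the header fields are mutually consistent."""
    if len(page) != BLCKSZ:
        return False
    if not any(page):
        return True  # an all-zero page is accepted as a new, empty page
    flags, lower, upper, special = struct.unpack_from(
        "=HHHH", page, PD_FLAGS_OFFSET)
    return ((flags & ~PD_VALID_FLAG_BITS) == 0
            and lower <= upper <= special <= BLCKSZ
            and special % MAXALIGN == 0)
```

A page filled with arbitrary blob bytes will usually have undefined flag bits set or offsets out of order, so it is rejected even before any checksum is compared.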

Regards,
-David

#4 Peter Geoghegan
pg@bowt.ie
In reply to: Tom Lane (#2)
Re: Possible to go without page headers?

On Mon, Feb 14, 2022 at 2:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> No, at least not unless you plan to reimplement much of the WAL
> mechanism. You do need at least an LSN in the right place.
> I kinda doubt that you can get away with ignoring checksumming,
> either. On the whole, I think you'd be best off to use a standard
> page header; the amount you're saving by avoiding that will be
> minuscule, and the amount of work you cause for yourself probably
> not so much.

It isn't actually necessary for an index AM to use the standard
slotted page format to get the benefits that you mention, of course --
whether or not an index AM that uses standard page headers *also* uses
slotted pages with standard line pointers is a separate question. For
example, GIN posting tree pages don't use standard line pointers, but
still have a standard page header (and a generic GIN special area in
the opaque space).
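
To make that concrete for the original use case (large blobs that break across pages), here is a minimal Python sketch of pages that keep the standard header offsets but store raw bytes instead of line pointers and tuples. It is a layout model only, not PostgreSQL code and not how GIN does it: the 8-byte special area, the helper names, and the choice to let `pd_lower` mark the end of the stored bytes are all assumptions of this sketch.

```python
import struct

BLCKSZ = 8192
HEADER_SIZE = 24      # fixed standard page header
SPECIAL_SIZE = 8      # hypothetical opaque area at the page tail
PD_LOWER_OFFSET = 12  # pd_lower, pd_upper, pd_special are consecutive
PAYLOAD = BLCKSZ - HEADER_SIZE - SPECIAL_SIZE  # raw bytes per page


def blob_to_pages(blob):
    """Split a blob into pages with a header but no line pointers."""
    pages = []
    special = BLCKSZ - SPECIAL_SIZE
    for start in range(0, len(blob), PAYLOAD):
        chunk = blob[start:start + PAYLOAD]
        page = bytearray(BLCKSZ)
        lower = HEADER_SIZE + len(chunk)  # end of data on this page
        struct.pack_into("=HHH", page, PD_LOWER_OFFSET,
                         lower, special, special)
        page[HEADER_SIZE:lower] = chunk
        pages.append(page)
    return pages


def pages_to_blob(pages):
    """Reassemble the blob by reading up to pd_lower on each page."""
    out = bytearray()
    for page in pages:
        (lower,) = struct.unpack_from("=H", page, PD_LOWER_OFFSET)
        out += page[HEADER_SIZE:lower]
    return bytes(out)
```

Under these assumptions each page still carries 8160 bytes of blob data, so the format loses 32 bytes per page compared with a flat 8k buffer.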

I agree that it's hard to imagine that opting out of using the
standard page header format could ever make much sense. Principally
because the restrictions imposed on an index AM that uses the standard
page header format are very minimal, while the benefits are
substantial.

--
Peter Geoghegan

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#4)
Re: Possible to go without page headers?

Peter Geoghegan <pg@bowt.ie> writes:

> It isn't actually necessary for an index AM to use the standard
> slotted page format to get the benefits that you mention, of course --
> whether or not an index AM that uses standard page headers *also* uses
> slotted pages with standard line pointers is a separate question.

Right, you don't need to use a line pointer array if you don't want
to. (IIRC, hash also opts out of that in some pages.) I took the
question to be just about the page header proper.

regards, tom lane