Postgres-R: tuple serialization
Hi,
yesterday, I promised to outline the requirements of Postgres-R for
tuple serialization, which we have been talking about before. There are
basically three ways to serialize a tuple change, depending on whether
it originates from an INSERT, UPDATE or DELETE. For updates and
deletes, the change set stores the old pkey as well as the origin (a global
transaction id) of the tuple (required for consistent serialization on
remote nodes). For inserts and updates, all added or changed attributes
need to be serialized as well.
        pkey+origin   changes
INSERT      -             x
UPDATE      x             x
DELETE      x             -
Note that the pkey attributes may never be null, so an isnull bit field
can be skipped for those attributes. For the insert case, all attributes
(including primary key attributes) are serialized. Updates require an
additional bit field (well, I'm using chars ATM) to store which
attributes have changed. Only those should be transferred.
I'm tempted to unify that, so that inserts are serialized as the
difference against the default values or NULL. That would make things
easier for Postgres-R. However, how about other uses of such a fast
tuple applicator? Does such a use case exist at all? I mean, for
parallelizing COPY FROM STDIN, one certainly doesn't want to serialize
all input tuples into that format before feeding multiple helper
backends. Instead, I'd recommend letting the helper backends do the
parsing and therefore parallelize that as well.
For other features, like parallel pg_dump or even parallel query
execution, this tuple serialization code doesn't help much, IMO. So I'm
thinking that optimizing it for Postgres-R's internal use is the best
way to go.
Comments? Opinions?
Regards
Markus
On Jul 22, 2008, at 3:04 AM, Markus Wanner wrote:
yesterday, I promised to outline the requirements of Postgres-R for
tuple serialization, which we have been talking about before. There
are basically three ways to serialize a tuple change, depending on
whether it originates from an INSERT, UPDATE or DELETE. For updates
and deletes, the change set stores the old pkey as well as the
origin (a global transaction id) of the tuple (required for
consistent serialization on remote nodes). For inserts and updates,
all added or changed attributes need to be serialized as well.

        pkey+origin   changes
INSERT      -             x
UPDATE      x             x
DELETE      x             -

Note that the pkey attributes may never be null, so an isnull bit
field can be skipped for those attributes. For the insert case, all
attributes (including primary key attributes) are serialized.
Updates require an additional bit field (well, I'm using chars ATM)
to store which attributes have changed. Only those should be
transferred.

I'm tempted to unify that, so that inserts are serialized as the
difference against the default values or NULL. That would make
things easier for Postgres-R. However, how about other uses of such
a fast tuple applicator? Does such a use case exist at all? I mean,
for parallelizing COPY FROM STDIN, one certainly doesn't want to
serialize all input tuples into that format before feeding multiple
helper backends. Instead, I'd recommend letting the helper backends
do the parsing and therefore parallelize that as well.

For other features, like parallel pg_dump or even parallel query
execution, this tuple serialization code doesn't help much, IMO. So
I'm thinking that optimizing it for Postgres-R's internal use is
the best way to go.

Comments? Opinions?
ISTM that both londiste and Slony would be able to make use of these
improvements as well. A modular replication system should be able to
use a variety of methods for logging data changes and then applying
them on a subscriber, so long as some kind of common transport can be
agreed upon (such as text). So having a change capture and apply
mechanism that isn't dependent on a lot of extra stuff would be
generally useful to any replication mechanism.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Hi,
Decibel! wrote:
ISTM that both londiste and Slony would be able to make use of these
improvements as well. A modular replication system should be able to use
a variety of methods for logging data changes and then applying them on
a subscriber, so long as some kind of common transport can be agreed
upon (such as text). So having a change capture and apply mechanism that
isn't dependent on a lot of extra stuff would be generally useful to any
replication mechanism.
Hm.. yeah, that's a good hint. However, I'm not sure how londiste and
Slony would interface with these internal methods. That would require
some sort of special replication triggers or something. But when to fire
them? After every statement (sync)? Just before commit (eager)? After
commit (lazy)? (These are the points in Postgres-R where the internal
methods are called.)
I'm claiming that Postgres-R is modular (enough). But I'm unsure what
interface it could provide to the outer world.
Regards
Markus Wanner
On Jul 22, 2008, at 4:43 PM, Markus Wanner wrote:
Decibel! wrote:
ISTM that both londiste and Slony would be able to make use of
these improvements as well. A modular replication system should be
able to use a variety of methods for logging data changes and then
applying them on a subscriber, so long as some kind of common
transport can be agreed upon (such as text). So having a change
capture and apply mechanism that isn't dependent on a lot of extra
stuff would be generally useful to any replication mechanism.

Hm.. yeah, that's a good hint. However, I'm not sure how londiste
and Slony would interface with these internal methods. That would
require some sort of special replication triggers or something. But
when to fire them? After every statement (sync)? Just before commit
(eager)? After commit (lazy)? (These are the points in Postgres-R
where the internal methods are called.)
Currently, Slony triggers are per-row, not deferred. IIRC, londiste
is the same. ISTM it'd be much better if we had per-statement
triggers that could see what data had changed; that'd likely be a
lot more efficient than doing stuff per-row.
In any case, both replication systems should work with either sync or
eager. I can't see them working with lazy.
What about just making all three available?
I'm claiming that Postgres-R is modular (enough). But I'm unsure
what interface it could provide to the outer world.
Yeah. I suspect that Postgres-R could end up taking the place of the
replica-hooks mailing list (and more, of course).
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Hi,
Decibel! wrote:
Currently, Slony triggers are per-row, not deferred. IIRC, londiste
is the same. ISTM it'd be much better if we had per-statement triggers
that could see what data had changed; that'd likely be a lot more
efficient than doing stuff per-row.
Well, now that I think about it... there might be *lots* of changes.
Certainly something you don't want to collect in memory. At the moment,
Postgres-R cannot handle this, but I plan to add an upper limit on the
change set size, and just send it out as soon as it exceeds that limit,
then continue collecting. (Note for the GCS adept: this partial change
set may be sent via reliable multicast, only the very last change set
before the commit needs to be totally ordered.)
That would mean introducing another 'change set full' hook...
In any case, both replication systems should work with either sync or
eager. I can't see them working with lazy.
Huh? AFAIK, londiste as well as Slony-I are both async. So what would
hooks for sync replication be good for? Why not rather only lazy hooks?
(Well, lazy hooks will pose yet another problem: those theoretically
need to run sometime *after* the commit, but at that time we don't have
an open transaction, so where exactly shall we do this?)
What about just making all three available?
Doh. Ehm. That's a lot of work for something we're not even sure is
good for anything. I'm certainly willing to help. And if other projects
show enough interest, I might even add the appropriate triggers myself.
But as long as this is all unclear, I certainly have more important
things on my todo list for Postgres-R (see that TODO list ;-) ).
Yeah. I suspect that Postgres-R could end up taking the place of the
replica-hooks mailing list (and more, of course).
Let's hope so, yeah!
Regards
Markus Wanner