Postgres-R: primary key patches
Hi,
as you might know, Postgres-R relies on primary keys to address tuples
of a table. It cannot replicate tables without a primary key.
Primary keys currently aren't really used within the executor, so I had
to extend and modify Postgres here and there to get the required
information. To ease reviewing, I have split out these modifications and
present them here as two separate little patches.
The first one, get_pkey_index_oid.diff, changes the function
relationHasPrimaryKey into GetPrimaryKeyIndexOid, which works pretty
much the same but now returns the index's OID instead of just a
boolean. (In the current Postgres-R code, I've duplicated that code to
src/backend/replication/recovery.c.)
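For readers who haven't looked at the patch, the shape of the change can be sketched as follows. This is a hypothetical, self-contained mock-up (MockIndex and the array-based lookup are invented for illustration; the real function walks the relation's index list in the catalogs):

```c
#include <stdbool.h>

/* Sketch of the interface change: the old relationHasPrimaryKey()
 * answered only yes/no, whereas GetPrimaryKeyIndexOid() returns the
 * OID of the primary key index, or InvalidOid if there is none. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef struct MockIndex
{
    Oid  indexoid;      /* OID of this index */
    bool is_primary;    /* does it back a PRIMARY KEY constraint? */
} MockIndex;

static Oid
GetPrimaryKeyIndexOid(const MockIndex *indexes, int nindexes)
{
    for (int i = 0; i < nindexes; i++)
        if (indexes[i].is_primary)
            return indexes[i].indexoid;
    return InvalidOid;
}

/* The old boolean answer falls out trivially from the new function. */
static bool
relationHasPrimaryKey(const MockIndex *indexes, int nindexes)
{
    return GetPrimaryKeyIndexOid(indexes, nindexes) != InvalidOid;
}
```

Callers that only need the yes/no answer keep working; callers that need the index itself get its OID without a second lookup.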
And secondly, the add_pkey_info.diff patch adds a boolean field
ii_Primary to the IndexInfo struct and ri_PrimaryKey to the
ResultRelInfo struct, which is an index into the indexInfoArray.
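Again as a rough, self-contained sketch (the struct layouts here are reduced mock-ups of the execnodes.h definitions, and find_primary_key is an invented helper showing how the new field could be filled in):

```c
#include <stdbool.h>

/* Mock-up of the relevant parts of the patched structs. */
typedef struct IndexInfo
{
    bool ii_Unique;
    bool ii_Primary;        /* new field: is this the primary key index? */
} IndexInfo;

typedef struct ResultRelInfo
{
    int         ri_NumIndices;
    IndexInfo **ri_IndexRelationInfo;
    int         ri_PrimaryKey;  /* new field: index into the IndexInfo
                                 * array, or -1 if no primary key */
} ResultRelInfo;

/* Hypothetical helper: scan the IndexInfo array for the one flagged
 * as primary, returning its array position. */
static int
find_primary_key(ResultRelInfo *rri)
{
    for (int i = 0; i < rri->ri_NumIndices; i++)
        if (rri->ri_IndexRelationInfo[i]->ii_Primary)
            return i;
    return -1;
}
```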
I think these are relatively trivial modifications which could be
helpful for other purposes as well, so I suggest applying them to
mainline whenever appropriate (read: choose the appropriate commit fest).
This also raises the more general question of how to start collaborating
on Postgres-R. I realize that it's a pretty huge project. However, I'm
unsure how to ease reviewing for others, so if you have any ideas or
questions, please don't hesitate to ask.
Regards
Markus
markus@bluegap.ch (Markus Wanner) writes:
as you might know, Postgres-R relies on primary keys to address tuples
of a table. It cannot replicate tables without a primary key.
Slony-I does the same, with the "variation" that it permits the option
of using a "candidate primary key," namely an index that is unique+NOT
NULL.
If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.
Primary keys currently aren't really used within the executor, so I
had to extend and modify Postgres here and there to get the
required information. To ease reviewing, I have split out these
modifications and present them here as two separate little patches.
I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements. For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not
feature exposed pipes. While they add to the gloomy atmosphere, they
are good conductors of vibrations and a lot of prisoners know Morse
code." <http://www.eviloverlord.com/>
Hello Chris,
chris wrote:
Slony-I does the same, with the "variation" that it permits the option
of using a "candidate primary key," namely an index that is unique+NOT
NULL.
If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.
Well, yeah, that's technically not much different, so it would probably
be very easy to extend Postgres-R to work on any arbitrary index.
But what do we have primary keys for, in the first place? Isn't it
exactly the *primary* key into the table which you want to use for
replication? Or do we need an additional per-table configuration option
for that? A REPLICATION KEY besides the PRIMARY KEY?
I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements. For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.
Definitely. The remote backend does exactly that for Postgres-R: it
takes a change set, which consists of one or more tuple collections, and
then applies these collections. See ExecProcessCollection() in execMain.c.
(Although, I'm still less than thrilled about the internal storage
format of these tuple collections. That can certainly be improved and
simplified.)
Regards
Markus
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
Hello Chris,
chris wrote:
Slony-I does the same, with the "variation" that it permits the
option of using a "candidate primary key," namely an index that is
unique+NOT NULL.
If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.
Well, yeah, that's technically not much different, so it would
probably be very easy to extend Postgres-R to work on any arbitrary
index.
But what do we have primary keys for, in the first place? Isn't it
exactly the *primary* key into the table which you want to use for
replication? Or do we need an additional per-table configuration
option for that? A REPLICATION KEY besides the PRIMARY KEY?
We have them because people are used to thinking in terms of a
"PRIMARY KEY," not because that concept is actually distinguishable
from a non-partial UNIQUE NOT NULL constraint.
While I'm a "chicken" rather than a "pig" on this project
<http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig>, I believe that
covering the more general case right from the start would be a much
better plan.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
markus@bluegap.ch (Markus Wanner) writes:
Hello Chris,
chris wrote:
Slony-I does the same, with the "variation" that it permits the option
of using a "candidate primary key," namely an index that is unique+NOT
NULL.
If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.
Well, yeah, that's technically not much different, so it would
probably be very easy to extend Postgres-R to work on any arbitrary
index.
But what do we have primary keys for, in the first place? Isn't it
exactly the *primary* key into the table which you want to use for
replication? Or do we need an additional per-table configuration
option for that? A REPLICATION KEY besides the PRIMARY KEY?
I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.
Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are "less than perfect."
The "happy happenstance" is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a "candidate primary
key" rather than a true PRIMARY KEY.
It seems to me that this extension can cover a number of "design
sins," which looks like a very kind accommodation, even though it is
surely preferable to design it in earlier rather than later.
I know Jan Wieck has in mind the idea of adding an interface to enable
doing highly efficient IUD (Insert/Update/Delete) via generating a way
to do direct heap updates, which would be *enormously* more efficient
than the present need (in Slony-I, for instance) to parse, plan and
execute thousands of IUD statements. For UPDATE/DELETE to work
requires utilizing (candidate) primary keys, so there is some
seemingly relevant similarity there.
Definitely. The remote backend does exactly that for Postgres-R: it
takes a change set, which consists of one or more tuple collections,
and then applies these collections. See ExecProcessCollection() in
execMain.c.
(Although, I'm still less than thrilled about the internal storage
format of these tuple collections. That can certainly be improved and
simplified.)
You may want to have a chat with Jan; he's got some thoughts on a more
general purpose mechanism that would be good for this as well as for
(we think) extremely efficient bulk data loading.
David Fetter <david@fetter.org> writes:
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
But what do we have primary keys for, in the first place?
We have them because people are used to thinking in terms of a
"PRIMARY KEY," not because that concept is actually distinguishable
from a non-partial UNIQUE NOT NULL constraint.
No, we have them because the SQL standard actually assigns a distinct
meaning to a primary key. (It's the default foreign key reference
column(s) for the table --- and in that context it's clear that
There Can Be Only One.)
regards, tom lane
"David Fetter" <david@fetter.org> writes:
On Fri, Jul 18, 2008 at 03:04:08PM +0200, Markus Schiltknecht wrote:
Hello Chris,
chris wrote:
Slony-I does the same, with the "variation" that it permits the
option of using a "candidate primary key," namely an index that is
unique+NOT NULL.
If it is possible to support that broader notion, that might make
addition of these sorts of logic more widely useful.
Well, yeah, that's technically not much different, so it would
probably be very easy to extend Postgres-R to work on any arbitrary
index.
But what do we have primary keys for, in the first place? Isn't it
exactly the *primary* key into the table which you want to use for
replication? Or do we need an additional per-table configuration
option for that? A REPLICATION KEY besides the PRIMARY KEY?
Hm, it occurs to me that really Slony should be saying
WHERE (col1,col2,...) = ('x','y','z',...)
and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).
It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.
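A sketch of the matching rule being described, with tuples reduced to string arrays and candidate keys to column-number lists (all names and types here are invented for illustration): a row is accepted as soon as any one unique+NOT NULL column combination matches, and the remaining keys are skipped.

```c
#include <stdbool.h>
#include <string.h>

/* A candidate key: the set of column numbers of one unique + NOT NULL
 * index on the table. */
typedef struct CandidateKey
{
    int ncols;
    int cols[4];
} CandidateKey;

/* Return true as soon as ANY candidate key's columns all agree between
 * the stored row and the wanted values; later keys are not checked. */
static bool
row_matches(const char **row, const char **wanted,
            const CandidateKey *keys, int nkeys)
{
    for (int k = 0; k < nkeys; k++)
    {
        bool all_equal = true;
        for (int c = 0; c < keys[k].ncols; c++)
        {
            int col = keys[k].cols[c];
            if (strcmp(row[col], wanted[col]) != 0)
            {
                all_equal = false;
                break;
            }
        }
        if (all_equal)
            return true;    /* one candidate key matched: done */
    }
    return false;
}
```

This is exactly the semantics that seem hard to express in a single SQL query: "match on whichever unique combination happens to agree."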
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Hi,
David Fetter wrote:
While I'm a "chicken" rather than a "pig" on this project
<http://en.wikipedia.org/wiki/The_Chicken_and_the_Pig>, I believe that
covering the more general case right from the start would be a much
better plan.
I was trying to say that Postgres-R internally relies only on a unique
index with not null constraint. It doesn't care if you name it PRIMARY
KEY or REPLICATION KEY or whatever.
So, it's just a question of the syntax. We already have PRIMARY KEYs,
and those are pretty much what I think is needed in 99% of all cases as
the pointer to the replication
While I'm normally an absolute fan of generality,
I think you didn't quite get the point.
Hi,
sorry, some strange key-combination made my mail client send too early...
I myself wrote:
I was trying to say that Postgres-R internally relies only on a unique
index with not null constraint. It doesn't care if you name it PRIMARY
KEY or REPLICATION KEY or whatever.
So, it's just a question of the syntax. We already have PRIMARY KEYs,
and those are pretty much what I think is needed in 99% of all cases as
the pointer to the replication
.. as the pointer to the index to use for replication.
Offering the user a possibility to choose another (index + not null)
would require something like ALTER TABLE ... ADD REPLICATION KEY ... or
some such. Mostly syntactic sugar, which can be added as soon as we
really need it.
While I'm normally an absolute fan of generality,
I think you didn't quite get the point.
Iiik.. that's what I didn't want to send and wanted to delete before
sending... :-) Sorry.
Regards
Markus
Hi,
Tom Lane wrote:
It's the default foreign key reference
column(s) for the table
That's why I think it makes for a pretty good replication key as well.
Regards
Markus
Hi,
I realize that you are talking about Slony; let me answer for the
Postgres-R case anyway.
Gregory Stark wrote:
Hm, it occurs to me that really Slony should be saying
WHERE (col1,col2,...) = ('x','y','z',...)
Hm.. that would mean increasing the amount of work for the remote
backend, which applies remote transactions. For scalability reasons, I'm
trying to keep that minimal.
and letting the server figure out what access method is best for finding the
candidate record. That could mean using the primary key index, or it could
mean using some other index (perhaps a partial index for example).
For Postgres-R, I think that would only be a gain in those cases where
all tuples of a collection (or even the entire change set) affect only
tuples from a partial index. That doesn't look like it's worth the
trouble, IMO. Or do you think that's a frequent case?
Thinking about it, I'd even say that using only one index is frequently
favorable because of caching effects. Dunno.
It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.
I don't quite get your point here. For UPDATEs which change the PRIMARY
KEY, the sender currently sends the *old* values plus the changes. In
that case, you certainly don't want to send the entire old tuple, but
only the fields for *one* KEY. That's what I'm calling the replication
key (which currently equals the PRIMARY KEY).
Maybe I'm thinking too much in terms of Postgres-R instead of Slony,
which is what you are talking about.
Regards
Markus
Markus Wanner wrote:
Gregory Stark wrote:
It would be nice if there was a way for Slony to express to the server that
really, it only needs any UNIQUE NOT NULL combination of columns to match.
Once the server has any such combination which matches it can skip checking
the rest. I can't think of any way to write such a query in SQL.
I don't quite get your point here. For UPDATEs which change the PRIMARY
KEY, the sender currently sends the *old* values plus the changes. In
that case, you certainly don't want to send the entire old tuple, but
only the fields for *one* KEY. That's what I'm calling the replication
key (which currently equals the PRIMARY KEY).
I think the point here is that you need to distinguish which tuple you
need to update. For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null). It would
certainly be possible to use a different candidate key, but as far as I
know no customer has ever requested this.
(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the "update mask" in terms of
heap_modify_tuple.)
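The idea of an "update mask" in the spirit of heap_modify_tuple can be illustrated with a minimal sketch (plain int values instead of Datums, no null handling; the function name and simplifications are mine):

```c
#include <stdbool.h>

/* For each column, take the replacement value where the mask is set
 * and keep the old value otherwise. The real heap_modify_tuple works
 * on Datum arrays plus null flags, but the principle is the same:
 * only changed columns travel with the replication message. */
static void
apply_update_mask(int natts, const int *oldvals, const int *newvals,
                  const bool *replace, int *result)
{
    for (int i = 0; i < natts; i++)
        result[i] = replace[i] ? newvals[i] : oldvals[i];
}
```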
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Hi,
Alvaro Herrera wrote:
I think the point here is that you need to distinguish which tuple you
need to update. For this, our Replicator uses the primary key only;
there's no way to use another candidate key (unique not null). It would
certainly be possible to use a different candidate key,
Yeah, and for this to work, the *sender* needs to decide on a key to use.
but as far as I
know no customer has ever requested this.
I can't see the use case for a separate REPLICATION KEY, different from
the PRIMARY KEY, either..
(FWIW we don't send the old values -- only the original PK columns, the
values of columns that changed, and the "update mask" in terms of
heap_modify_tuple.)
Yup, that's pretty much the same what I'm doing for Postgres-R.
Regards
Markus
Hi,
chris wrote:
I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.
Ah, I see, so you are not concerned about tables with a PRIMARY KEY for
which one wants another REPLICATION KEY, but only about tables without a
PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the first place.
However, that's a general limitation of replication at tuple level: you
need to be able to uniquely identify tuples. (Unlike replication on
storage level, which can use the storage location for that).
Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are "less than perfect."
The "happy happenstance" is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a "candidate primary
key" rather than a true PRIMARY KEY.
I cannot see any reason for not wanting a PRIMARY KEY, but wanting
replication, and therefore a REPLICATION KEY.
Or are you saying we should add a hidden REPLICATION KEY for people who
are afraid of schema changes and dislike a visible primary key? Would
you want to hide the underlying index as well?
It seems to me that this extension can cover a number of "design
sins," which looks like a very kind accommodation, even though it is
surely preferable to design it in earlier rather than later.
Sorry, but I fail to see any real advantage of that covering of "sins".
I would find it rather confusing to have keys and indices hidden from
the admin. It's not like an additional index or a primary key would lead
to functional changes.
That's certainly different for additional columns, where a SELECT *
could all of a sudden return more columns than before. So that's the
exception where I agree that hiding such an additional column like we
already do for system columns would make sense. That's for example the
situation where you add an 'id' column later on and make that the new
primary (and thus replication) key. Maybe that's what you meant?
However, even in that case, I wouldn't hide the index nor the primary
key, but only the column.
Regards
Markus
Hi,
chris wrote:
You may want to have a chat with Jan; he's got some thoughts on a more
general purpose mechanism that would be good for this as well as for
(we think) extremely efficient bulk data loading.
Jan, would you mind sharing your thoughts? What use cases for such a
general purpose mechanism do you see?
What I can imagine doing on top of Postgres-R is: splitting up the data
and feeding multiple backends with it. Unlike Postgres-R's internal use,
you'd still have to check the data against constraints, I think.
It would involve the origin backend asking for help from the manager.
That one checks for available helper backends and then serves as a
message dispatcher between the origin and helper backends (as it does
for replication purposes). Please note that it already uses shared
memory extensively, so the manager doesn't need to copy around the data
itself.
Regards
Markus
Markus Wanner wrote:
(Although, I'm still less than thrilled about the internal storage
format of these tuple collections. That can certainly be improved and
simplified.)
Care to expand more on what it is? On Replicator we're using the binary
send/recv routines to transmit tuples. (Obviously this fails when the
master and slave have differing binary output, but currently we just
punt on this point).
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Hi,
Alvaro Herrera wrote:
Markus Wanner wrote:
(Although, I'm still less than thrilled about the internal storage
format of these tuple collections. That can certainly be improved and
simplified.)
Care to expand more on what it is?
Well, what I really dislike is the overhead in code to first transform
tuples into a string-based internal change set representation, which
then gets serialized again. That's two conversion steps, both prone to
error.
I'm about to merge those into a simpler tuple serializer, which shares
code with the initializer (or recovery provider/subscriber) part. This
is where I'd like to know what requirements Jan or others have. I will
try to outline the current implementation and requirements of Postgres-R
in a new thread soon.
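As a rough illustration of what such a single-step serializer might look like (attributes reduced to C strings, no null handling, all names invented), each attribute is written directly into a buffer as a length-prefixed byte string instead of going through an intermediate text representation:

```c
#include <stdint.h>
#include <string.h>

/* Write natts attributes into buf as (uint32 length, bytes) pairs and
 * return the number of bytes produced. The real code would deal with
 * Datums, null flags and type send/output functions; the point here is
 * only the single conversion step. */
static size_t
serialize_tuple(const char **atts, int natts, char *buf)
{
    char *p = buf;
    for (int i = 0; i < natts; i++)
    {
        uint32_t len = (uint32_t) strlen(atts[i]);
        memcpy(p, &len, sizeof(len));
        p += sizeof(len);
        memcpy(p, atts[i], len);
        p += len;
    }
    return (size_t) (p - buf);
}
```

The same length-prefixed layout works for both the change-set path and an initializer/recovery data stream, which is what makes sharing the code attractive.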
On Replicator we're using the binary
send/recv routines to transmit tuples. (Obviously this fails when the
master and slave have differing binary output, but currently we just
punt on this point).
Yeah, that's another point. I'm currently using the textual input/output
functions, but would like to switch to the binary ones as well. However,
that's an optimization, whereas the above is a simplification of the
code, which is more important to me at the moment.
Regards
Markus
markus@bluegap.ch (Markus Wanner) writes:
chris wrote:
I agree with you that tables are *supposed* to have primary keys;
that's proper design, and if tables are missing them, then something
is definitely broken.
Ah, I see, so you are not concerned about tables with a PRIMARY KEY
for which one wants another REPLICATION KEY, but only about tables
without a PRIMARY KEY, for which one doesn't want a PRIMARY KEY in the
first place.
"Doesn't want" is probably overstating the matter.
I'll describe a scenario to suggest where it might happen.
- A system is implemented, using the database, and, for some reason,
no PRIMARY KEY is defined for a table. Someone forgot; it got
misconfigured; a mistake was probably made.
- The system then goes into production, and runs for a while. The
table has data added to it, and starts to grow rather large.
- At THIS point, we decide to introduce replication, only to discover
that there isn't a PRIMARY KEY on the table.
Ideally, we'd take an outage and add the primary key. But suppose we
can't afford to do so?
The "add indexes concurrently" feature (CREATE INDEX CONCURRENTLY,
added in 8.2) *would* allow us to create a *candidate* primary key
without forcing an outage.
In theory, we'd like to have a true primary key. Sometimes
operational issues get in the way.
However, that's a general limitation of replication at tuple level:
you need to be able to uniquely identify tuples. (Unlike replication
on storage level, which can use the storage location for that).
No disagreement; yes, we certainly do need a way to uniquely identify
tuples, otherwise we can't replicate UPDATE or DELETE.
Sometimes, unfortunately, people make errors in design, and we wind up
needing to accommodate situations that are "less than perfect."
The "happy happenstance" is that, in modern versions of PostgreSQL, a
unique index may be added in the background so that this may be
rectified without outage if you can live with a "candidate primary
key" rather than a true PRIMARY KEY.
I cannot see any reason for not wanting a PRIMARY KEY, but wanting
replication, and therefore a REPLICATION KEY.
Or are you saying we should add a hidden REPLICATION KEY for people
who are afraid of schema changes and dislike a visible primary key?
Would you want to hide the underlying index as well?
The scenario I outline above hopefully answers this. It's not a
matter that I expect people to specifically desire not to have a
primary key. Instead, I expect cases where mistakes compound with
operational issues to make them say "Ow - I can't do that now!"
Hi,
chris wrote:
I'll describe a scenario to suggest where it might happen.
- A system is implemented, using the database, and, for some reason,
no PRIMARY KEY is defined for a table. Someone forgot; it got
misconfigured; a mistake was probably made.
- The system then goes into production, and runs for a while. The
table has data added to it, and starts to grow rather large.
- At THIS point, we decide to introduce replication, only to discover
that there isn't a PRIMARY KEY on the table.
Yeah, that's the situation I had in mind as well.
Ideally, we'd take an outage and add the primary key. But suppose we
can't afford to do so?
You are assuming that one doesn't need to take an outage to start
replication in the first place. As Postgres-R comes with system catalog
changes, that's not the case. You will at least need to restart the
postmaster; without some sort of system catalog upgrading (which doesn't
currently exist) you even need a full dump/restore cycle.
The "add indexes concurrently" feature (CREATE INDEX CONCURRENTLY,
added in 8.2) *would* allow us to create a *candidate* primary key
without forcing an outage.
Postgres-R is primarily being developed for *future* versions of
Postgres; I don't see any point in backporting something to the current
version while it is still incomplete.
However, that's a general limitation of replication at tuple level:
you need to be able to uniquely identify tuples. (Unlike replication
on storage level, which can use the storage location for that.)
No disagreement; yes, we certainly do need a way to uniquely identify
tuples, otherwise we can't replicate UPDATE or DELETE.
Yup, that's the real issue here.
The scenario I outline above hopefully answers this.
I see the problem of wanting to replicate tables which didn't have a
PRIMARY KEY before. But I still cannot see a use case for hiding indices
or keys.
It's not a
matter that I expect people to specifically desire not to have a
primary key. Instead, I expect cases where mistakes compound with
operational issues to make them say "Ow - I can't do that now!"
Yeah, these issues certainly need to be addressed. I think the ability
to add a hidden column and a (visible!) primary key on that column
should help in that case.
Thinking about index creation time doesn't make sense, as long as we
still need a dump/restore cycle to setup replication. And even then,
that operational issue has nothing to do with the question of hiding the
newly generated index or not.
Regards
Markus
Markus Wanner <markus@bluegap.ch> writes:
Thinking about index creation time doesn't make sense, as long as we
still need a dump/restore cycle to setup replication. And even then,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
that operational issue has nothing to do with the question of hiding
the newly generated index or not.
Let me note that one of the design criteria for Slony-I was to
explicitly NOT have such a requirement.
Making the assumption that it *is* acceptable to disrupt operations
for the duration of a dump/restore cycle is certain to limit interest
in a replication system.
A most pointed case where that will cause heartburn of the "I refuse
to use this" sort is if that disruption needs to take place when
recovering from the failure of a node. That sort of disruption is
certainly counterproductive to the usual goal of replication enhancing
system availability.
Maybe I am misreading you; I rather hope so.