Using non-sequential timelines in order to help with possible collisions

Started by Brian Faherty, about 9 years ago, 5 messages, hackers

anothergenericuser@gmail.com

about 9 years ago

Hey hackers,
I was working with replication and recovery the other day and noticed
that there were scenarios where I could cause multiple servers to enter the
same timeline while possibly having divergent data. One such scenario:
Master A and Replica B are both on timeline 1. An event causes Replica B
to be promoted, which moves it to timeline 2.
Following this, you perform a restore on Master A to a point before the
event happened. Once Postgres completes this recovery on Master A, it will
switch over to timeline 2. There are now WAL files that have been written
to timeline 2 from both servers.

From this scenario, I would like to suggest considering using
non-sequential timelines. From what I have investigated so far, I believe
the *.history files in the WAL directory already contain all the timeline
IDs, in order. If we could make those timeline IDs a bit more
unique/random, and still rely on the ordering in the *.history
file, I think this would help prevent multiple servers on the same timeline
with divergent data.
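
To illustrate what "relying on the ordering in the *.history file" could look like, here is a minimal Python sketch. It assumes the tab-separated line layout (parent timeline ID, switch point, reason) with `#` comment lines; the function name `parse_history` and the sample IDs are made up for illustration and are not taken from the PostgreSQL source.

```python
def parse_history(text):
    """Return (timeline_id, switch_point) pairs in the order they appear.

    Assumed layout: one ancestor per line, fields separated by tabs,
    blank lines and lines starting with '#' ignored.
    """
    ancestors = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        ancestors.append((int(fields[0]), fields[1]))
    return ancestors


# Hypothetical history content with a non-sequential ID (90417).
sample = (
    "1\t0/3000000\tno recovery target specified\n"
    "90417\t0/5000060\tno recovery target specified\n"
)

# The lineage comes from line order, not from comparing the numbers.
lineage = [tli for tli, _ in parse_history(sample)]
print(lineage)
```

The point of the sketch is only that ancestry can be read from position in the file, so the numeric values themselves would not need to be increasing.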

I was hoping to begin a conversation on whether or not non-sequential
timelines are a good idea before I looked at the code around timelines.

--
Brian Faherty

Robert Haas

robertmhaas@gmail.com

about 9 years ago

In reply to: Brian Faherty (#1)

Re: Using non-sequential timelines in order to help with possible collisions

On Wed, Jul 19, 2017 at 11:23 AM, Brian Faherty
<anothergenericuser@gmail.com> wrote:

> Hey hackers,
> I was working with replication and recovery the other day and noticed that
> there were scenarios where I could cause multiple servers to enter the same
> timeline while possibly having divergent data. One such scenario: Master A
> and Replica B are both on timeline 1. An event causes Replica B to be
> promoted, which moves it to timeline 2. Following this, you
> perform a restore on Master A to a point before the event happened. Once
> Postgres completes this recovery on Master A, it will switch over to
> timeline 2. There are now WAL files that have been written to timeline 2
> from both servers.
>
> From this scenario, I would like to suggest considering using non-sequential
> timelines. From what I have investigated so far, I believe the *.history
> files in the WAL directory already contain all the timeline IDs, in
> order. If we could make those timeline IDs a bit more
> unique/random, and still rely on the ordering in the *.history file, I think
> this would help prevent multiple servers on the same timeline with divergent
> data.
>
> I was hoping to begin a conversation on whether or not non-sequential
> timelines are a good idea before I looked at the code around timelines.

It's interesting that you bring this up. I've also wondered why we
don't use random TLIs. I suppose I'm internally assuming that it's
because the people who wrote the code are far more brilliant and
knowledgeable of this area than I could ever be and that doing
anything else would create some kind of awful problem, but maybe
that's not so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Michael Paquier

michael@paquier.xyz

about 9 years ago

In reply to: Robert Haas (#2)

Re: Using non-sequential timelines in order to help with possible collisions

On Wed, Jul 19, 2017 at 7:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jul 19, 2017 at 11:23 AM, Brian Faherty
> <anothergenericuser@gmail.com> wrote:
>
>> I was working with replication and recovery the other day and noticed that
>> there were scenarios where I could cause multiple servers to enter the same
>> timeline while possibly having divergent data. One such scenario: Master A
>> and Replica B are both on timeline 1. An event causes Replica B to be
>> promoted, which moves it to timeline 2. Following this, you
>> perform a restore on Master A to a point before the event happened. Once
>> Postgres completes this recovery on Master A, it will switch over to
>> timeline 2. There are now WAL files that have been written to timeline 2
>> from both servers.
>>
>> From this scenario, I would like to suggest considering using non-sequential
>> timelines. From what I have investigated so far, I believe the *.history
>> files in the WAL directory already contain all the timeline IDs, in
>> order. If we could make those timeline IDs a bit more
>> unique/random, and still rely on the ordering in the *.history file, I think
>> this would help prevent multiple servers on the same timeline with divergent
>> data.

It seems to me that you are missing one piece here: the history files
generated at the moment of the timeline bump. When recovery finishes,
an instance scans the archives, or the instances it is streaming
from, for history files, and chooses a timeline number that does not
match any existing one. So you are trying to avoid a problem that can
easily be solved with, for example, a proper archive.
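
The selection step described above can be sketched roughly as follows. This is a simplified Python model, not the server's actual code: `choose_new_timeline` and `promote` are hypothetical names, the `0/0` switch point is a placeholder, and a plain directory stands in for the archive. The only detail taken as given is the history file naming (eight hex digits plus `.history`).

```python
import os
import tempfile


def history_name(tli):
    # Timeline history files are named with eight hex digits plus ".history".
    return f"{tli:08X}.history"


def choose_new_timeline(archive_dir, current_tli):
    """Probe upward from current_tli and return the first ID with no history file."""
    candidate = current_tli + 1
    while os.path.exists(os.path.join(archive_dir, history_name(candidate))):
        candidate += 1
    return candidate


def promote(archive_dir, current_tli):
    """Pick a new timeline and publish its history file to the archive."""
    new_tli = choose_new_timeline(archive_dir, current_tli)
    with open(os.path.join(archive_dir, history_name(new_tli)), "w") as f:
        f.write(f"{current_tli}\t0/0\tillustrative entry\n")  # placeholder line
    return new_tli


# Shared archive: B is promoted first, then A finishes recovery.
with tempfile.TemporaryDirectory() as shared:
    tli_b = promote(shared, 1)
    tli_a = promote(shared, 1)

# No shared archive: each server only sees its own history files.
with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    lone_b = promote(dir_b, 1)
    lone_a = promote(dir_a, 1)

print(tli_b, tli_a, lone_b, lone_a)
```

In this model the two servers end up on different timelines only when they can see each other's history files, which is the situation a shared archive provides.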

>> I was hoping to begin a conversation on whether or not non-sequential
>> timelines are a good idea before I looked at the code around timelines.
>
> It's interesting that you bring this up. I've also wondered why we
> don't use random TLIs. I suppose I'm internally assuming that it's
> because the people who wrote the code are far more brilliant and
> knowledgeable of this area than I could ever be and that doing
> anything else would create some kind of awful problem, but maybe
> that's not so.

I am not the only one who worked on that, but the resulting code is a
tad simpler, as it is easier to guess some hierarchy for
the timelines, of course with the history files at hand.
--
Michael


Tom Lane

tgl@sss.pgh.pa.us

about 9 years ago

In reply to: Michael Paquier (#3)

Re: Using non-sequential timelines in order to help with possible collisions

Michael Paquier <michael.paquier@gmail.com> writes:

> On Wed, Jul 19, 2017 at 7:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> It's interesting that you bring this up. I've also wondered why we
>> don't use random TLIs. I suppose I'm internally assuming that it's
>> because the people who wrote the code are far more brilliant and
>> knowledgeable of this area than I could ever be and that doing
>> anything else would create some kind of awful problem, but maybe
>> that's not so.
>
> I am not the only one who worked on that, but the resulting code is a
> tad simpler, as it is easier to guess some hierarchy for
> the timelines, of course with the history files at hand.

Yeah, right now you have the ability to guess that, say, timeline 42
is a descendant of 41, which you couldn't assume with random TLIs.
Also, the values are only 32 bits, which is not wide enough to allow
imagining that random() could be relied on to produce non-duplicate
values.
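
The 32-bit concern can be made concrete with the standard birthday-bound approximation, under which drawing n values uniformly from a space of size N collides with probability of about 1 - exp(-n(n-1)/(2N)). The Python sketch below is only that textbook estimate; the function names are illustrative, and it assumes IDs are drawn uniformly and independently.

```python
import math


def collision_probability(n, bits=32):
    """Approximate chance that n uniform random IDs contain a duplicate."""
    space = 2 ** bits
    return 1.0 - math.exp(-n * (n - 1) / (2 * space))


def draws_for_probability(p, bits=32):
    """Approximate number of draws at which the collision chance reaches p."""
    space = 2 ** bits
    return math.ceil(math.sqrt(2 * space * math.log(1 / (1 - p))))


for n in (100, 10_000, 100_000):
    print(n, collision_probability(n))

print(draws_for_probability(0.5))
```

Under these assumptions the even-odds point sits at roughly 77,000 draws, far below the 2^32 possible values, and even a few hundred draws leave a small but nonzero chance of a duplicate.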

If we had separate database identifiers for slave installations, which
AFAIR we don't, it'd be possible to consider incorporating part of
the server ID into timeline IDs it creates, which would alleviate
Brian's issue I think. That is, instead of 1, 2, 3, ..., a server
might create 1xyz, 2xyz, 3xyz, ... where "xyz" are random digits
associated with the particular installation. This is obviously
not bulletproof since you could have collisions of the xyz's, but
it would help. Also you could imagine allowing DBAs to assign
distinct xyz codes to every slave in a given community.
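
One way to picture the "1xyz, 2xyz, 3xyz" idea is to pack a per-installation code into the low decimal digits of the timeline ID. The Python sketch below is purely illustrative: `make_tli`, `split_tli`, the three-digit width, and the sample codes 417 and 508 are all assumptions, and the only constraint taken from the discussion is that the result must fit in 32 bits.

```python
MAX_TLI = 2 ** 32 - 1  # timeline IDs are 32-bit values


def make_tli(sequence, node_code, digits=3):
    """Combine a per-server sequence number with a fixed node code."""
    base = 10 ** digits
    if sequence < 1:
        raise ValueError("sequence must be at least 1")
    if not 0 <= node_code < base:
        raise ValueError("node_code does not fit in the reserved digits")
    tli = sequence * base + node_code
    if tli > MAX_TLI:
        raise OverflowError("timeline ID would not fit in 32 bits")
    return tli


def split_tli(tli, digits=3):
    """Recover (sequence, node_code) from a combined timeline ID."""
    return divmod(tli, 10 ** digits)


# Two servers leaving timeline 1 at the same time, with different codes.
tli_a = make_tli(2, 417)
tli_b = make_tli(2, 508)
print(tli_a, tli_b, split_tli(tli_a))
```

With three decimal digits reserved, the sequence part tops out a little above four million, and two installations that happen to share a code would still collide, as noted above.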

regards, tom lane


Michael Paquier

michael@paquier.xyz

about 9 years ago

In reply to: Tom Lane (#4)

Re: Using non-sequential timelines in order to help with possible collisions

On Wed, Jul 19, 2017 at 8:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Michael Paquier <michael.paquier@gmail.com> writes:
>
>> On Wed, Jul 19, 2017 at 7:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>>> It's interesting that you bring this up. I've also wondered why we
>>> don't use random TLIs. I suppose I'm internally assuming that it's
>>> because the people who wrote the code are far more brilliant and
>>> knowledgeable of this area than I could ever be and that doing
>>> anything else would create some kind of awful problem, but maybe
>>> that's not so.
>>
>> I am not the only one who worked on that, but the resulting code is a
>> tad simpler, as it is easier to guess some hierarchy for
>> the timelines, of course with the history files at hand.
>
> Yeah, right now you have the ability to guess that, say, timeline 42
> is a descendant of 41, which you couldn't assume with random TLIs.
> Also, the values are only 32 bits, which is not wide enough to allow
> imagining that random() could be relied on to produce non-duplicate
> values.

pg_backend_random() perhaps? Any new code that uses random() would be
rejected quickly at review.

> If we had separate database identifiers for slave installations, which
> AFAIR we don't, it'd be possible to consider incorporating part of
> the server ID into timeline IDs it creates, which would alleviate
> Brian's issue I think. That is, instead of 1, 2, 3, ..., a server
> might create 1xyz, 2xyz, 3xyz, ... where "xyz" are random digits
> associated with the particular installation. This is obviously
> not bulletproof since you could have collisions of the xyz's, but
> it would help. Also you could imagine allowing DBAs to assign
> distinct xyz codes to every slave in a given community.

I am not much in favor of complicating the timeline name, to be honest :)

Having a unique identifier per node has value for other purposes, like
clustering, and we would get the same information by recording in the
history file the ID of the node that generated the new timeline.
--
Michael
