Why we lost Uber as a user
Hello,
The following article is a very good look at some of our limitations and
highlights some of the pains many of us have been working "around" since
we started using the software.
https://eng.uber.com/mysql-migration/
Specifically:
* Inefficient architecture for writes
* Inefficient data replication
* Issues with table corruption
* Poor replica MVCC support
* Difficulty upgrading to newer releases
It is a very good read and I encourage our hackers to do so with an open
mind.
Sincerely,
JD
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 07/26/2016 09:54 AM, Joshua D. Drake wrote:
Hello,
The following article is a very good look at some of our limitations and
highlights some of the pains many of us have been working "around" since
we started using the software.
They also had other reasons to switch to MySQL, particularly around
changes of staffing (the switch happened after they got a new CTO). And
they encountered that 9.2 bug literally the week we released a fix, per
one of the mailing lists. Even if they switched off, it's still a nice
testimonial that they once ran their entire worldwide fleet off a single
Postgres cluster.
However, the issues they cite as limitations of our current replication
system are real, or we wouldn't have so many people working on
alternatives. We could really use pglogical in 10.0, as well as
OLTP-friendly MM replication.
The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.
I do find it interesting that they mention schema changes in passing,
without actually saying anything about them -- given that schema changes
have been one of MySQL's major limitations. I'll also note that they
don't mention any of MySQL's corresponding weak spots, such as
limitations on table size due to primary key sorting.
One wonders what would have happened if they'd adopted a sharding model
on top of Postgres?
I would like to see someone blog about our testing for replication
corruption issues now, in response to this.
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
On 07/26/2016 01:53 PM, Josh Berkus wrote:
The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.
To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective. Removing the indexes
is equally painful because it means less efficient JOINs.
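As a sketch of that scenario (the table and column names here are hypothetical, purely for illustration):

```sql
-- A small, heavily-referenced lookup table (hypothetical names).
CREATE TABLE lookup (
    id      int PRIMARY KEY,
    code    text,
    status  text,
    updated timestamptz
);
-- Step 3: index most columns to support the JOINs.
CREATE INDEX ON lookup (code);
CREATE INDEX ON lookup (status);
CREATE INDEX ON lookup (updated);

-- Step 4: the application fires statements like this ~500 times/second.
-- Because 'status' and 'updated' are indexed, the update touches indexed
-- columns, so HOT cannot apply: every update writes a whole new row
-- version plus a new entry in each index.
UPDATE lookup SET status = 'busy', updated = now() WHERE id = 42;
```

Each such update therefore amplifies into one heap write plus one write per index, and none of the dead versions can be reclaimed while any minutes-old snapshot is still open.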
The Uber guy is right that InnoDB handles this better as long as you
don't touch the primary key (primary key updates in InnoDB are really bad).
This is a common problem case we don't have an answer for yet.
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
On Tue, Jul 26, 2016 at 02:26:57PM -0700, Josh Berkus wrote:
On 07/26/2016 01:53 PM, Josh Berkus wrote:
The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.

To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective. Removing the indexes
is equally painful because it means less efficient JOINs.

The Uber guy is right that InnoDB handles this better as long as you
don't touch the primary key (primary key updates in InnoDB are really bad).
This is a common problem case we don't have an answer for yet.
Or, basically, we don't have an answer to without making something else
worse.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On Tue, Jul 26, 2016 at 5:26 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 07/26/2016 01:53 PM, Josh Berkus wrote:
The write amplification issue, and its corollary in VACUUM, certainly
continues to plague some users, and doesn't have any easy solutions.

To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective. Removing the indexes
is equally painful because it means less efficient JOINs.

The Uber guy is right that InnoDB handles this better as long as you
don't touch the primary key (primary key updates in InnoDB are really bad).
This is a common problem case we don't have an answer for yet.
This is why I think we need a pluggable heap storage layer, which
could be done either by rebranding foreign data wrappers as data
wrappers (as I have previously proposed) or using the access method
interface (as proposed by Alexander Korotkov) at PGCon. We're
reaching the limits of what can be done using our current heap format,
and we need to enable developers to experiment with new things. Aside
from the possibility of eventually coming up with something that's
good enough to completely (or mostly) replace our current heap storage
format, we need to support specialized data storage formats that are
optimized for particular use cases (columnar, memory-optimized, WORM).
I know that people are worried about ending up with too many heap
storage formats, but I think we should be a lot more worried about not
having enough heap storage formats. Anybody who thinks that the
current design is working for all of our users is not paying very
close attention.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Josh Berkus <josh@agliodbs.com> writes:
To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.
Hm, I'm not following why this is a disaster. OK, you have circa 100%
turnover of the table in the lifespan of the slower transactions, but I'd
still expect vacuuming to be able to hold the bloat to some small integer
multiple of the minimum possible table size. (And if the table is small,
that's still small.) I suppose really long transactions (pg_dump?) could
be pretty disastrous, but there are ways around that, like doing pg_dump
on a slave.
Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.
regards, tom lane
On 07/26/2016 03:07 PM, Tom Lane wrote:
Josh Berkus <josh@agliodbs.com> writes:
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.

Hm, I'm not following why this is a disaster. OK, you have circa 100%
turnover of the table in the lifespan of the slower transactions, but I'd
still expect vacuuming to be able to hold the bloat to some small integer
multiple of the minimum possible table size.
Not in practice. Don't forget that you also have bloat of the indexes
as well. I encountered multiple cases of this particular failure case,
and often bloat ended up at something like 100X of the clean table/index
size, with no stable size (that is, it always kept growing). This was
the original impetus for wanting REINDEX CONCURRENTLY, but really that's
kind of a workaround.
(And if the table is small,
that's still small.) I suppose really long transactions (pg_dump?) could
be pretty disastrous, but there are ways around that, like doing pg_dump
on a slave.
You'd need a dedicated slave for the pg_dump, otherwise you'd hit query
cancel.
Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.
It's considerably more than an annoyance for the people who suffer from
it; for some databases I dealt with, this one issue was responsible for
80% of administrative overhead (cron jobs, reindexing, timeouts ...).
But no, it's not a database-switcher *by itself*. But it is a chronic,
and serious, problem. I don't have even a suggestion of a real solution
for it without breaking something else, though.
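For anyone tending such a system, the statistics views at least make the symptom easy to watch. A minimal sketch (the LIMIT and the exact columns are just one reasonable choice):

```sql
-- Dead-to-live tuple ratio per table; a ratio that keeps climbing on a
-- hot table despite autovacuum running is the failure mode described
-- above.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup::numeric / GREATEST(n_live_tup, 1), 2) AS dead_ratio,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY dead_ratio DESC
LIMIT 10;
```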
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
On Tue, Jul 26, 2016 at 6:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Josh Berkus <josh@agliodbs.com> writes:
To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.

Hm, I'm not following why this is a disaster. OK, you have circa 100%
turnover of the table in the lifespan of the slower transactions, but I'd
still expect vacuuming to be able to hold the bloat to some small integer
multiple of the minimum possible table size. (And if the table is small,
that's still small.) I suppose really long transactions (pg_dump?) could
be pretty disastrous, but there are ways around that, like doing pg_dump
on a slave.

Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.
I've seen multiple cases where this kind of thing causes a
sufficiently large performance regression that the system just can't
keep up. Things are OK when the table is freshly-loaded, but as soon
as somebody runs a query on any table in the cluster that lasts for a
minute or two, so much bloat accumulates that the performance drops to
an unacceptable level. This kind of thing certainly doesn't happen to
everybody, but equally certainly, this isn't the first time I've heard
of it being a problem. Sometimes, with careful tending and a very
aggressive autovacuum configuration, you can live with it, but it's
never a lot of fun.
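In practice, "a very aggressive autovacuum configuration" usually means per-table storage parameters along these lines (the table name and the values are illustrative, not a recommendation):

```sql
-- Vacuum the hot table after ~1% of its rows are dead rather than the
-- default 20%, and remove the cost-based delay so autovacuum keeps up.
ALTER TABLE lookup SET (
    autovacuum_vacuum_threshold    = 50,
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_delay   = 0
);
```

Even tuned this way, vacuum still cannot reclaim anything that a long-running snapshot can see, which is the crux of the complaint.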
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jul 27, 2016 at 7:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I've seen multiple cases where this kind of thing causes a
sufficiently large performance regression that the system just can't
keep up. Things are OK when the table is freshly-loaded, but as soon
as somebody runs a query on any table in the cluster that lasts for a
minute or two, so much bloat accumulates that the performance drops to
an unacceptable level. This kind of thing certainly doesn't happen to
everybody, but equally certainly, this isn't the first time I've heard
of it being a problem. Sometimes, with careful tending and a very
aggressive autovacuum configuration, you can live with it, but it's
never a lot of fun.
Yes, that's not fun at all. And it can take days to do this tuning
properly when testing a product that is supposed to work the way its
spec certifies, to ease the customer experience.
As much as this post is interesting, the comments on HN are a good read as well:
https://news.ycombinator.com/item?id=12166585
Some points raised are that the "flaws" mentioned in this post are
actually advantages. But I guess this depends on how you want to run
your business via your application layer.
--
Michael
* Joshua D. Drake (jd@commandprompt.com) wrote:
Hello,
The following article is a very good look at some of our limitations
and highlights some of the pains many of us have been working
"around" since we started using the software.https://eng.uber.com/mysql-migration/
Specifically:
* Inefficient architecture for writes
* Inefficient data replication
The above are related and there are serious downsides to having an extra
mapping in the middle between the indexes and the heap.
What makes me doubt just how well they understood the issues or what is
happening is the lack of any mention of hint bits or tuple freezing
(requiring additional writes).
* Issues with table corruption
That was a bug that was fixed quite quickly once it was detected. The
implication that MySQL doesn't have similar bugs is entirely incorrect,
as is the idea that logical replication would avoid data corruption
issues (in practice, it actually tends to be quite a bit worse).
* Poor replica MVCC support
Solved through the hot standby feedback system.
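For reference, that is a single setting on the standby; a minimal sketch (shown via ALTER SYSTEM, though a postgresql.conf entry works equally well):

```sql
-- On the standby: ask the master to retain row versions that the
-- standby's running queries still need, avoiding the query
-- cancellations described in the post (at the cost of some extra
-- bloat on the master).
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();
```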
* Difficulty upgrading to newer releases
Their specific issue with these upgrades was solved, years ago, by me
(and it wasn't particularly difficult to do...) through the use of
pg_upgrade's --link option and rsync's ability to construct hard link
trees. Making major release upgrades easier with less downtime is
certainly a good goal, but there's been a solution to the specific issue
they had here for quite a while.
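A rough sketch of that procedure (version numbers, paths, and hostnames are hypothetical; the authoritative recipe, with important caveats about keeping both clusters shut down, is in the pg_upgrade documentation):

```shell
# On the master: run pg_upgrade in hard-link mode, so the new cluster
# shares data files with the old one instead of copying them.
pg_upgrade --link \
    --old-datadir /data/pg9.4 --new-datadir /data/pg9.5 \
    --old-bindir /usr/lib/postgresql/9.4/bin \
    --new-bindir /usr/lib/postgresql/9.5/bin

# Then, with both clusters stopped, rebuild each standby cheaply:
# --hard-links lets rsync recreate the hard-link tree on the standby
# rather than re-sending the (unchanged) underlying data files.
rsync --archive --delete --hard-links --size-only \
    /data/pg9.4 /data/pg9.5 standby:/data
```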
Thanks!
Stephen
On Tue, Jul 26, 2016 at 8:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Joshua D. Drake (jd@commandprompt.com) wrote:
Hello,
The following article is a very good look at some of our limitations
and highlights some of the pains many of us have been working
"around" since we started using the software.https://eng.uber.com/mysql-migration/
Specifically:
* Inefficient architecture for writes
* Inefficient data replication

The above are related and there are serious downsides to having an extra
mapping in the middle between the indexes and the heap.

What makes me doubt just how well they understood the issues or what is
happening is the lack of any mention of hint bits or tuple freezing
(requiring additional writes).
Yeah. A surprising amount of that post seemed to be devoted to
describing how our MVCC architecture works rather than what problem
they had with it. I'm not saying we shouldn't take their bad
experience seriously - we clearly should - but I don't feel like it's
as clear as it could be about exactly where the breakdowns happened.
That's why I found Josh's restatement useful - I am assuming without
proof that his restatement is accurate....
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 27/07/16 05:45, Robert Haas wrote:
On Tue, Jul 26, 2016 at 8:27 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Joshua D. Drake (jd@commandprompt.com) wrote:
Hello,
The following article is a very good look at some of our limitations
and highlights some of the pains many of us have been working
"around" since we started using the software.https://eng.uber.com/mysql-migration/
Specifically:
* Inefficient architecture for writes
* Inefficient data replication

The above are related and there are serious downsides to having an extra
mapping in the middle between the indexes and the heap.

What makes me doubt just how well they understood the issues or what is
happening is the lack of any mention of hint bits or tuple freezing
(requiring additional writes).

Yeah. A surprising amount of that post seemed to be devoted to
describing how our MVCC architecture works rather than what problem
they had with it. I'm not saying we shouldn't take their bad
experience seriously - we clearly should - but I don't feel like it's
as clear as it could be about exactly where the breakdowns happened.
There is some more detailed information in this 30-minute talk:
https://vimeo.com/145842299
--
Vik Fearing +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Jul 26, 2016 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Josh Berkus <josh@agliodbs.com> writes:
To explain this in concrete terms, which the blog post does not:
1. Create a small table, but one with enough rows that indexes make
sense (say 50,000 rows).
2. Make this table used in JOINs all over your database.
3. To support these JOINs, index most of the columns in the small table.
4. Now, update that small table 500 times per second.
That's a recipe for runaway table bloat; VACUUM can't do much because
there's always some minutes-old transaction hanging around (and SNAPSHOT
TOO OLD doesn't really help, we're talking about minutes here), and
because of all of the indexes HOT isn't effective.

Hm, I'm not following why this is a disaster. OK, you have circa 100%
turnover of the table in the lifespan of the slower transactions, but I'd
still expect vacuuming to be able to hold the bloat to some small integer
multiple of the minimum possible table size. (And if the table is small,
that's still small.) I suppose really long transactions (pg_dump?) could
be pretty disastrous, but there are ways around that, like doing pg_dump
on a slave.

Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.
Well, the real annoyance as I understand it is the raw volume of bytes
of WAL traffic a single update of a field can cause. They switched to
statement level replication(!).
merlin
On Wed, Jul 27, 2016 at 08:33:52AM -0500, Merlin Moncure wrote:
Or in short, this seems like an annoyance, not a time-for-a-new-database
kind of problem.

Well, the real annoyance as I understand it is the raw volume of bytes
of WAL traffic a single update of a field can cause. They switched to
statement level replication(!).
Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication. If that type of corruption is your primary worry,
and you can ignore the worries about statement level replication, then
it makes sense. Of course, the big tragedy is that statement level
replication has known unfixable(?) failures, while binary replication
failures are caused by developer-introduced bugs.
In some ways, people worry about the bugs they have seen, not the bugs
they haven't seen.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On 07/26/2016 08:45 PM, Robert Haas wrote:
That's why I found Josh's restatement useful - I am assuming without
proof that his restatement is accurate....
FWIW, my restatement was based on some other sites rather than Uber.
Including folks who didn't abandon Postgres.
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
On 27 July 2016 at 17:04, Bruce Momjian <bruce@momjian.us> wrote:
Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication.
I'm not sure that that makes sense to me. If there's a database bug that
occurs when you run a statement on the master, it seems there's a decent
chance that that same bug is going to occur when you run the same statement
on the slave.
Obviously it depends on the type of bug and how identical the slave is, but
statement-level replication certainly doesn't preclude such a bug from
propagating.
Geoff
On 7/28/16, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
On 27 July 2016 at 17:04, Bruce Momjian <bruce@momjian.us> wrote:
Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication.
I'm not sure that that makes sense to me. If there's a database bug that
occurs when you run a statement on the master, it seems there's a decent
chance that that same bug is going to occur when you run the same statement
on the slave.

Obviously it depends on the type of bug and how identical the slave is, but
statement-level replication certainly doesn't preclude such a bug from
propagating.

Geoff
Please, read the article first! The bug is about wrong visibility of
tuples after applying WAL at slaves.
For example, you can see two different records selecting from a table
by a primary key (moreover, their PKs are the same, but other columns
differ).
From the article (emphasis mine):
The following query illustrates how this bug would affect our users
table example:
SELECT * FROM users WHERE id = 4;
This query would return *TWO* records: ...
And it affected slaves, not master.
Slaves exist to reduce load on the master; if you run all queries (even
read-only ones) on the master, why would you (or anyone) have so many slaves?
--
Best regards,
Vitaly Burovoy
On 28 Jul 2016 12:19, "Vitaly Burovoy" <vitaly.burovoy@gmail.com> wrote:
On 7/28/16, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
On 27 July 2016 at 17:04, Bruce Momjian <bruce@momjian.us> wrote:
Well, their big complaint about binary replication is that a bug can
spread from a master to all slaves, which doesn't happen with statement
level replication.
I'm not sure that that makes sense to me. If there's a database bug that
occurs when you run a statement on the master, it seems there's a decent
chance that that same bug is going to occur when you run the same
statement
on the slave.
Obviously it depends on the type of bug and how identical the slave is,
but
statement-level replication certainly doesn't preclude such a bug from
propagating.

Geoff
Please, read the article first! The bug is about wrong visibility of
tuples after applying WAL at slaves.
For example, you can see two different records selecting from a table
by a primary key (moreover, their PKs are the same, but other columns
differ).
I read the article. It affected slaves as well as the master.
I quote:
"because of the way replication works, this issue has the potential to
spread into all of the databases in a replication hierarchy"
I maintain that this is a nonsense argument. Especially since (as you
pointed out and as I missed first time around) the bug actually occurred at
different records on different slaves, so he invalidates his own point.
Geoff
Statement-based replication has a lot of problems, such as nondeterministic
UDFs. Here is a link to see them all:
https://dev.mysql.com/doc/refman/5.7/en/replication-sbr-rbr.html#replication-sbr-rbr-sbr-disadvantages
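The classic illustration of the nondeterminism problem, sketched here against a hypothetical table:

```sql
-- Under statement-based replication the replica re-executes the
-- statement itself.  Volatile functions like random() are evaluated
-- again there, so master and replica can end up storing different
-- values for the same row, silently diverging the data.
UPDATE accounts
SET api_token = md5(random()::text)
WHERE user_id = 42;
```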
On Thu, Jul 28, 2016 at 8:16 AM, pgwhatever <michael@sqlexec.com> wrote:
Statement-Based replication has a lot of problems with it like indeterminate
UDFs. Here is a link to see them all:
https://dev.mysql.com/doc/refman/5.7/en/replication-sbr-rbr.html#replication-sbr-rbr-sbr-disadvantages
Sure. It's also incredibly efficient with respect to bandwidth -- so,
if your application was engineered to work around those problems
it's a huge win. They could have used pgpool, but I guess the fix was
already in.
Taking a step back, from the outside, it looks like uber:
*) has a very thick middleware, very thin database with respect to
logic and complexity
*) has a very high priority on quick and cheap (in terms of bandwidth)
replication
*) has decided the database needs to be interchangeable
*) is not afraid to make weak or erroneous technical justifications as
a basis of stack selection (the futex vs ipc argument I felt was
particularly awful -- it ignored the fact we use spinlocks)
The very fact that they swapped it out so easily suggests that they
were not utilizing the database as they could have, and a different
technical team might have come to a different result. Postgres is a
very general system and rewards deep knowledge such that it can
outperform even specialty systems in the hands of a capable developer
(for example, myself). I'm just now hammering in the final coffin
nails that will get solr swapped out for jsonb backed postgres.
I guess it's fair to say that they felt mysql is closer to what they
felt a database should do out of the box. That's disappointing, but
life moves on. The takeaways are:
*) people like different choices of replication mechanics -- statement
level sucks a lot of the time, but not all the time
*) hs/sr simplicity of configuration and operation is a big issue;
it has continually gotten better and still needs to improve
*) bad QC can cost you customers. how much regression coverage do we
have of hs/sr?
*) postgres may not be the ideal choice for those who want a thin and
simple database
merlin