Horizontal scalability/sharding
I have recently increased my public statements about the idea of adding
horizontal scaling/sharding to Postgres. I wanted to share with hackers
a timeline of how we got here, and where I think we are going in the
short term:
2012-2013: As part of writing my scaling talk
(http://momjian.us/main/presentations/overview.html#scaling), studying
Oracle RAC, and talking to users, it became clear that an XC-like
architecture (sharding) was the only architecture that was going to allow
for write scaling.
Users and conference attendees I talked to were increasingly concerned
about the ability of Postgres to scale for high write volumes. They didn't
necessarily need that scale now, but they needed to know they could get
it if they wanted it, and wouldn't need to switch to a new database in
the future. This is similar to wanting a car that can get you on a highway
on-ramp fast --- even if you don't need it, you want to know it is there.
2014: I started to shop around the idea that we could use FDWs,
parallelism, and a transaction/snapshot manager to get XC features
as built-in to Postgres. (I don't remember where the original idea
came from.) It was clear that having separate forks of the source code
in XC and XL was never going to achieve critical mass --- there just
aren't enough people who need high write scale right now, and the fork
maintenance overhead is a huge burden.
I realized that we would never get community acceptance to dump the XC
(or XL) code needed for sharding into community Postgres, but with FDWs,
we could add the features as _part_ of improving FDWs, which would benefit
FDWs _and_ would be useful for sharding. (We already see some of those
FDW features in 9.5.)
October, 2014: EDB and NTT started working together in the community
to start improving FDWs as a basis for an FDW-based sharding solution.
Many of the 9.5 FDW improvements that also benefit sharding were developed
by a combined EDB/NTT team. The features improved FDWs independent of
sharding, so they didn't need community buy-in on sharding to get them
accepted.
June, 2015: I attended the PGCon sharding unconference session and
there was a huge discussion about where we should go with sharding.
I think the big take-away was that most people liked the FDW approach,
but had business/customer reasons for wanting to work on XC or XL because
those would be production-ready faster.
July, 2015: Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.
August, 2015: While speaking at SFPUG, Citus Data approached me about
joining the FDW sharding team. They have been invited to the September
1 meeting, as have the XC and XL people.
October, 2015: EDB is sponsoring a free 3-hour summit about FDW sharding
at the PG-EU conference in Vienna. Everyone is invited, but it is hoped
most of the September 1 folks can attend.
February, 2016: Oleg is planning a similar meeting at their February
Moscow conference.
Anyway, I wanted to explain the work that has been happening around
sharding. As things move forward, I am increasingly convinced that write
scaling will be needed soon, that the XC approach is the only reasonable
way to do it, and that FDWs are the cleanest way to get it into community
Postgres.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 30 August 2015 at 03:17, Bruce Momjian <bruce@momjian.us> wrote:
I have recently increased my public statements about the idea of adding
horizontal scaling/sharding to Postgres.
Glad to see it. Many people have been pushing such things for years, so it
is good to finally see some debate about this on Hackers.
I wanted to share with hackers
a timeline of how we got here, and where I think we are going in the
short term:
2012-2013: As part of writing my scaling talk
(http://momjian.us/main/presentations/overview.html#scaling), studying
Oracle RAC, and talking to users, it became clear that an XC-like
architecture (sharding) was the only architecture that was going to allow
for write scaling.
What other architectures were discussed? Where was that discussion?
Users and conference attendees I talked to were increasingly concerned
about the ability of Postgres to scale for high write volumes. They didn't
necessarily need that scale now, but they needed to know they could get
it if they wanted it, and wouldn't need to switch to a new database in
the future. This is similar to wanting a car that can get you on a highway
on-ramp fast --- even if you don't need it, you want to know it is there.
+1
2014: I started to shop around the idea that we could use FDWs,
parallelism, and a transaction/snapshot manager to get XC features
as built-in to Postgres. (I don't remember where the original idea
came from.) It was clear that having separate forks of the source code
in XC and XL was never going to achieve critical mass --- there just
aren't enough people who need high write scale right now, and the fork
maintenance overhead is a huge burden.
I personally support the view that we should put scalability features into
Postgres core, rather than run separate forks.
I realized that we would never get community acceptance to dump the XC
(or XL) code needed for sharding into community Postgres
How or why did you realize that? There has never been any such discussion,
AFAIK. Surely it can be possible to move required subsystems across?
, but with FDWs,
we could add the features as _part_ of improving FDWs, which would benefit
FDWs _and_ would be useful for sharding. (We already see some of those
FDW features in 9.5.)
That is a huge presumption. Not discussed or technically analyzed in any
way with the community.
October, 2014: EDB and NTT started working together in the community
to start improving FDWs as a basis for an FDW-based sharding solution.
Many of the 9.5 FDW improvements that also benefit sharding were developed
by a combined EDB/NTT team. The features improved FDWs independent of
sharding, so they didn't need community buy-in on sharding to get them
accepted.
June, 2015: I attended the PGCon sharding unconference session and
there was a huge discussion about where we should go with sharding.
I think the big take-away was that most people liked the FDW approach,
but had business/customer reasons for wanting to work on XC or XL because
those would be production-ready faster.
Cough, cough. You must surely be joking that "most people liked the FDW
approach"? How did we measure the acceptance of this approach?
What actually is the FDW approach? Since it's not been written down
anywhere, or even explained verbally, how can anyone actually agree to it?
July, 2015: Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.
August, 2015: While speaking at SFPUG, Citus Data approached me about
joining the FDW sharding team. They have been invited to the September
1 meeting, as have the XC and XL people.
2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards.
Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek
also? I'll pass on to others also.
Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
Presumably EDB is invited also? If Koichi is a leading organizer of this,
why are there two meetings?
October, 2015: EDB is sponsoring a free 3-hour summit about FDW sharding
at the PG-EU conference in Vienna. Everyone is invited, but it is hoped
most of the September 1 folks can attend.
February, 2016: Oleg is planning a similar meeting at their February
Moscow conference.
Anyway, I wanted to explain the work that has been happening around
sharding.
Thanks
As things move forward, I am increasingly convinced that write
scaling will be needed soon,
+1
that the XC approach is the only reasonable way to do it,
and that FDWs are the cleanest way to get it into community
Postgres.
Those two things aren't at all obvious to me.
Please don't presume my opposition. If the technical information were made
public, I might understand and agree with "the FDW approach", perhaps
others also. 2ndQuadrant is certainly happy to become involved in any team
aiming to add features to Postgres core, as long as that makes sense. There
may be areas we can all agree upon even if the full architecture remains in
doubt.
Before the community commits to a long term venture together we should see
the plan. Like all IT projects, expensive failure is possible and the lack
of a design is a huge flashing red warning light for me at present. If that
requires a meeting of all Developers, why are the meetings for this
specifically not happening at the agreed Developer meetings?
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 30, 2015 at 5:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 30 August 2015 at 03:17, Bruce Momjian <bruce@momjian.us> wrote:
July, 2015: Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.
A little correction about Postgres Professional. We are concentrating on
the idea of having one distributed transaction manager, originally DTM, now
with the better name XTM, which is neutral with respect to the actual
cluster implementation. For example, we are testing it with XL ported to
9.4, but we were planning to extend the tests to pg_shard and postgres_fdw.
My idea was to have at least XTM committed to 9.6, so all parties could
work on their implementations much more easily.
Before the community commits to a long term venture together we should see
the plan. Like all IT projects, expensive failure is possible and the lack
of a design is a huge flashing red warning light for me at present. If that
requires a meeting of all Developers, why are the meetings for this
specifically not happening at the agreed Developer meetings?
At PGCon we agreed to have such a meeting in Vienna at least. But I think we
should be prepared and try to clear up all our issues beforehand. It looks
like we are already out of time, but perhaps we could meet in Hong Kong?
Honestly, I still don't know which approach is better. We already played
with XL (ported to 9.4) and identified some very serious inconsistency
issues, which scared us, especially considering how easily we found them.
The XC people have fixed them, but I'm not sure whether they were
fundamental, or whether we could construct more sophisticated tests and
find more issues in XC/XL. We were also a bit disappointed by Huawei's
position on the CSN patch, which we had hoped to use for our XTM. The FDW
approach has been actively criticized by the pg_shard people, and that
also made me a bit suspicious. It looks like we are doomed to continue
with several development forks, so we decided to work on a very important
common project, XTM, which we hoped could be accepted by all parties and
eventually committed to 9.6. Now I see we were right, unfortunately.
Again, could we organize a meeting somewhere in September? The US is not
good for us, but other places should be OK. I want to have an agreement at
least on XTM. We are still testing various approaches, though. We could
present the results of our experiments and are open to discussion. It's
not an easy project, but it's something we could do for 9.6.
I'm very glad Bruce started this discussion in -hackers, since it's silly
for me to participate in both threads :) Let's meet in September!
On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote:
On 30 August 2015 at 03:17, Bruce Momjian <bruce@momjian.us> wrote:
I have recently increased my public statements about the idea of adding
horizontal scaling/sharding to Postgres.
Glad to see it. Many people have been pushing such things for years, so it is
good to finally see some debate about this on Hackers.
Agreed. Right now, in our community, we are only seeing users who are
happy with what Postgres offers but think they might need massive
horizontal scalability in the future. I think there is a larger group
that cares about massive horizontal scalability, but those people are
using other software right now, so we don't see them yet.
Without a roadmap for built-in massive horizontal scalability, I think
Postgres adoption will eventually suffer.
I wanted to share with hackers
a timeline of how we got here, and where I think we are going in the
short term:
2012-2013: As part of writing my scaling talk
(http://momjian.us/main/presentations/overview.html#scaling), studying
Oracle RAC, and talking to users, it became clear that an XC-like
architecture (sharding) was the only architecture that was going to allow
for write scaling.
What other architectures were discussed? Where was that discussion?
That was mostly my conclusion. I explained it to small groups at
conferences and Postgres user groups. No one said I was wrong, but that
is about the level of debate I had.
2014: I started to shop around the idea that we could use FDWs,
parallelism, and a transaction/snapshot manager to get XC features
as built-in to Postgres. (I don't remember where the original idea
came from.) It was clear that having separate forks of the source code
in XC and XL was never going to achieve critical mass --- there just
aren't enough people who need high write scale right now, and the fork
maintenance overhead is a huge burden.
I personally support the view that we should put scalability features into
Postgres core, rather than run separate forks.
Good, I do think it is time, but as I stated above, there is limited
interest in our current community, so the tolerance for additional
community code to accomplish this is also limited. This is the big
thing that had me excited about using FDWs --- FDW improvements can get
us closer to sharding without requiring community acceptance of
sharding-only features.
I realized that we would never get community acceptance to dump the XC
(or XL) code needed for sharding into community Postgres
How or why did you realize that? There has never been any such discussion,
AFAIK. Surely it can be possible to move required subsystems across?
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
, but with FDWs,
we could add the features as _part_ of improving FDWs, which would benefit
FDWs _and_ would be useful for sharding. (We already see some of those
FDW features in 9.5.)
That is a huge presumption. Not discussed or technically analyzed in any way
with the community.
True. It seemed pretty obvious to me.
October, 2014: EDB and NTT started working together in the community
to start improving FDWs as a basis for an FDW-based sharding solution.
Many of the 9.5 FDW improvements that also benefit sharding were developed
by a combined EDB/NTT team. The features improved FDWs independent of
sharding, so they didn't need community buy-in on sharding to get them
accepted.
June, 2015: I attended the PGCon sharding unconference session and
there was a huge discussion about where we should go with sharding.
I think the big take-away was that most people liked the FDW approach,
but had business/customer reasons for wanting to work on XC or XL because
those would be production-ready faster.
Cough, cough. You must surely be joking that "most people liked the FDW
approach"? How did we measure the acceptance of this approach?
Well, I didn't have my audience-meter with me at the time. ;-)
The discussion was mostly in the hallway after the unconference session,
"Future of PostgreSQL shared-nothing cluster" by Konstantin Knizhnik,
Alexander Korotkov, and Oleg Bartunov. Again, when I explained the
ability to use FDWs to get sharding into Postgres with minimal
additional code, no one said the idea was crazy, which I took as a big
thumbs-up! When I asked why to continue with XC/XL, I was told those
were more mature and more customer-ready, which is true. I will not
quote people from the hallway discussion for privacy reasons.
What actually is the FDW approach? Since its not been written down anywhere, or
even explained verbally, how can anyone actually agree to it?
Well, my sharding talk just has the outlines of an approach. I think
there are five broad segments:
* FDW push-down of joins, sorts, aggregates
* ability to send FDW requests in parallel
* transaction/snapshot manager to allow ACID transactions on shards
* simpler user partitioning API
* infrastructure to manage shards, including replicated tables used for joins
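As a rough illustration of two of these segments (aggregate push-down and
sending requests to shards in parallel), here is a toy Python model. It is
only a sketch under stated assumptions: the shards are in-memory lists
rather than foreign servers, and every name in it (ROWS, shard_for,
build_shards, remote_partial_sum, pushed_down_avg) is hypothetical and not
part of Postgres or postgres_fdw.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data; each row would live on exactly one shard.
ROWS = [{"id": i, "amount": 10 * i} for i in range(1, 6)]


def shard_for(key, num_shards):
    """Route a key to a shard using simple modulo partitioning."""
    return key % num_shards


def build_shards(rows, num_shards):
    """Distribute rows across in-memory lists that stand in for remote tables."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[shard_for(row["id"], num_shards)].append(row)
    return shards


def remote_partial_sum(rows):
    """Runs 'on the shard': return a partial (sum, count) instead of raw rows."""
    return sum(r["amount"] for r in rows), len(rows)


def pushed_down_avg(shards):
    """Coordinator: ask every shard in parallel, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(remote_partial_sum, shards))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


if __name__ == "__main__":
    print(pushed_down_avg(build_shards(ROWS, 3)))
```

The point of the sketch is only that each shard returns one small partial
result rather than all of its rows, and that the coordinator does not wait
for one shard before contacting the next. It says nothing about
transactions or snapshots across shards, which is the part the
transaction/snapshot manager item covers.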
July, 2015: Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.
August, 2015: While speaking at SFPUG, Citus Data approached me about
joining the FDW sharding team. They have been invited to the September
1 meeting, as have the XC and XL people.
2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards.
Yes, I saw the blog post about that:
http://blog.2ndquadrant.com/working-towards-postgres-xl-9-5/
Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek also?
I'll pass on to others also.
OK, I will send you a separate email and you can then supply their email
addresses.
Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
Presumably EDB is invited also? If Koichi is a leading organizer of this, why
are there two meetings?
I certainly have heard nothing about it, except third-hand people
telling me a meeting is happening. I assumed those meetings were
XC/XL-specific.
that the XC approach is the only reasonable way to do it,
and that FDWs are the cleanest way to get it into community
Postgres.
Those two things aren't at all obvious to me.
Please don't presume my opposition. If the technical information were made
public, I might understand and agree with "the FDW approach", perhaps others
also.
Well, the beauty of my approach is that we didn't need any technical
direction or buy-in on sharding from the community to improve FDWs. I
think now is the right time to try to get that buy-in, or adjust our
approach.
There isn't really much more to my _analysis_ than I presented. There
is certainly a lot more work to do to even decide this is the right
approach. Some of the groups already involved have more experience in
trying this, e.g. Citus Data.
2ndQuadrant is certainly happy to become involved in any team aiming to
add features to Postgres core, as long as that makes sense. There may be areas
we can all agree upon even if the full architecture remains in doubt.
Right.
Before the community commits to a long term venture together we should see the
plan. Like all IT projects, expensive failure is possible and the lack of a
design is a huge flashing red warning light for me at present. If that requires
a meeting of all Developers, why are the meetings for this specifically not
happening at the agreed Developer meetings?
Well, what meetings should it be at? I don't think there was clear
enough direction for the June 2015 PGCon meeting. Is there an
unconference in Vienna? One thing I saw at the last PGCon is that this
is a big topic, so I think having a dedicated room and 3-hour slot for
it is nice.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Sun, Aug 30, 2015 at 10:36:23PM +0300, Oleg Bartunov wrote:
Honestly, I still don't know which approach is better, we already played with
XL (ported to 9.4) and identified some very strong issues with inconsistency,
which scared us, especially taking into account how easy we found them. XC
people have fixed them, but I'm not sure if they were fundamental and if we
could construct more sophisticated tests and find more issues in XC/XL. We also
a bit disappointed by Huawei position about CSN patch, we hoped to use for our
XTM. FDW approach has been actively criticized by pg_shard people and that's
also made me a bit suspicious.
Yep, that has me concerned too. The pg_shard people will be on the
September 1 call and are working on a Google document to explain their
concerns about FDWs for sharding.
It looks like we are doomed to continue
several development forks, so we decided to work on very important common
project, XTM, which we hoped could be accepted by all parties and eventually
committed to 9.6. Now I see we were right, unfortunately.
Yes, the ability to add independent parts that can eventually be used
for sharding is a strong indication that doing this incrementally is a
good approach.
Again, could we organize meeting somewhere in September? US is not good for
us, but other places should be ok. I want to have an agreement at least on
XTM. We still are testing various approaches, though. We could present results
of our experiments and are open to discussion. It's not easy project, but it's
something we could do for 9.6.
Good. XTM is a must-have for several use-cases, including sharding.
I'm very glad Bruce started this discussion in -hackers, since it's silly to me
to participate in both threads :) Let's meet in September!
In summary, I think we need to start working on built-in sharding, and
FDWs are the only way I can see to do it with minimal code changes,
which I think might be a community requirement. It might not work, but
right now, it is the only possible approach I can see.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Mon, Aug 31, 2015 at 7:29 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Aug 30, 2015 at 03:31:10PM +0100, Simon Riggs wrote:
I realized that we would never get community acceptance to dump the XC
(or XL) code needed for sharding into community Postgres
How or why did you realize that? There has never been any such discussion,
AFAIK. Surely it can be possible to move required subsystems across?
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
July, 2015: Oleg Bartunov and his new company Postgres Professional (PP)
started to think about joining the FDW approach, rather than working on
XL, as they had stated at PGCon in June. A joint NTT/EDB/PP phone-in
meeting is scheduled for September 1.
August, 2015: While speaking at SFPUG, Citus Data approached me about
joining the FDW sharding team. They have been invited to the September
1 meeting, as have the XC and XL people.
2ndQuadrant is working in this area, specifically bringing XL 9.5 forwards.
Yes, I saw the blog post about that:
http://blog.2ndquadrant.com/working-towards-postgres-xl-9-5/
Please can invites be posted to myself, Pavan Deolasee and Petr Jelinek also?
I'll pass on to others also.
OK, I will send you a separate email and you can then supply their email
addresses.
FWIW, I would be interested in that as well. I worked in this area for a
couple of years.
Koichi Suzuki is arranging a meeting in Hong Kong for XC/XL discussions.
Presumably EDB is invited also? If Koichi is a leading organizer of this, why
are there two meetings?
I certainly have heard nothing about it, except third-hand people
telling me a meeting is happening. I assumed those meetings were
XC/XL-specific.
Yep, that's my understanding as well, and AFAIK things have been carried on
this way until now, i.e. XC/XL and Postgres core are meant to live as
separate communities.
--
Michael
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.
Of course, people have concerns, and FDWs might need to be improved, but
it is something worth researching. We might find out FDWs can't be used
at all, and that we have to either add much more code to Postgres to do
sharding, do something like pg_shard, or not implement built-in sharding
at all, but at least it is time to research this.
OK, I will send you a separate email and you can then supply their email
addresses.
FWIW, I would be interested in that as well. I worked in this area of things
for a couple of years as well FWIW.
OK, I will send you an email.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.
Actually, there was hope that XC or XL would get popular enough that it
would justify adding their code into community Postgres, but that never
happened.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Mon, Aug 31, 2015 at 11:08 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.
If I recall correctly in terms of numbers, that's indeed 40k of code,
the main areas of XC code being the GTM, the planner changes for
expression and join push down, and the connection pooler for parallel
query execution.
ISTM that FDW is a portion of the puzzle, there are other pieces that
could be used toward an in-core integration, like the parallel stuff
Amit Kapila is working on to allow remote query execution in parallel
of local scans. Also, XC/XL were performing well on OLTP thanks to the
connection pooler: this should indeed be part of the FDW portion
managing the foreign scans. This may sound like a minor issue compared
to the others, but already established connections help a lot when
scaling out with foreign servers.
Of course, people have concerns, and FDWs might need to be improved, but
it is something worth researching. We might find out FDWs can't be used
at all, and that we have to either add much more code to Postgres to do
sharding, do something like pg_shard, or not implement built-in sharding
at all, but at least it is time to research this.
I am really looking forward to hearing the arguments of the authors of
pg_shard on the matter.
OK, I will send you a separate email and you can then supply their email
addresses.
FWIW, I would be interested in that as well. I worked in this area of things
for a couple of years as well FWIW.
OK, I will send you an email.
Thanks.
--
Michael
On Mon, Aug 31, 2015 at 11:48 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.
Actually, there was hope that XC or XL would get popular enough that it
would justify adding their code into community Postgres, but that never
happened.
Forks are doomed to die without proper maintenance resources. Still,
for XC/XL, what does not help is the complexity of the architecture
and of SPOF management, particularly with the GTM, which was
something completely new and not well understood (there is a GTM
standby, but this model is weak IMO: it does not scale the way
regular standbys do, and it impacts the overall performance of the
cluster).
--
Michael
On Sun, Aug 30, 2015 at 7:47 AM, Bruce Momjian <bruce@momjian.us> wrote:
I have recently increased my public statements about the idea of adding
horizontal scaling/sharding to Postgres. I wanted to share with hackers
a timeline of how we got here, and where I think we are going in the
short term:
2012-2013: As part of writing my scaling talk
(http://momjian.us/main/presentations/overview.html#scaling), studying
Oracle RAC, and talking to users, it became clear that an XC-like
architecture (sharding) was the only architecture that was going to allow
for write scaling.
I think a sharding-like architecture is quite useful for certain kinds of
workloads, where users can arrange their queries and data layout in an
optimized way, which I hope users would agree to change if required. One
thing to consider here is what kind of scaling we expect from such a system,
and whether it is sufficient, considering that we will stay focused on this
architecture for horizontal scalability.
Generally speaking, scaling in such systems is limited by the number of
profitable partitions the user can create based on the data, and
cross-partition transactions then drag down performance and scalability.
I understand that there is definitely a benefit in proceeding with a
sharding-like architecture, as there are already some PostgreSQL-based
forks that use such an architecture, so if we follow the same way, we can
save some effort rather than inventing or following some other
architecture. However, there is no harm in discussing the pros and cons of
some other architectures like Oracle RAC, Google F1, or others.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
At PGCon we agreed to have such a meeting in Vienna at least. But I think we
should be prepared and try to clear up all our issues beforehand. It looks like
we are already out of time, but perhaps we could meet in Hong Kong?
Honestly, I still don't know which approach is better. We already played
with XL (ported to 9.4) and identified some very serious issues with
inconsistency, which scared us, especially taking into account how easily we
found them. The XC people have fixed them, but I'm not sure whether they were
fundamental, or whether we could construct more sophisticated tests and find
more issues in XC/XL. We are also a bit disappointed by Huawei's position on
the CSN patch, which we hoped to use for our XTM. The FDW approach has been
actively criticized by the pg_shard people, and that also made me a bit
suspicious. It looks like we are doomed to continue several development forks,
so we decided to work on a very important common project, XTM, which we hoped
could be accepted by all parties and eventually committed to 9.6. Now I see we
were right, unfortunately.
A distributed transaction manager should support at least three things:
1. Atomic commit
2. Atomic visibility
3. Consistent snapshots (e.g. required for repeatable read and higher
isolation levels).
I have submitted a patch implementing the first of these for FDWs. The patch
adds infrastructure to be used by all FDWs, including postgres_fdw. It also adds
postgres_fdw code to use this infrastructure. The same can be used to
achieve atomic commit in postgres_fdw-based sharding. Please see if XTM can
benefit from it. If there are things that are required by XTM, please post
the requirements on that thread and I will work on those. You can find the
latest patch at
/messages/by-id/CAFjFpRfANWL53+x2HdM9TCNe5pup=oPkQSSJ-KGfr-d2efj+CQ@mail.gmail.com
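As a rough illustration of the first requirement (atomic commit), the usual way to get all-or-nothing behavior across several servers is a two-phase commit. The sketch below is a toy model in Python and is not taken from the patch referenced above; the Participant class and the atomic_commit function are hypothetical names invented for this example, and real foreign servers would also need durable prepared state and crash recovery, which are left out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """Hypothetical stand-in for one foreign server taking part in a transaction."""
    name: str
    can_prepare: bool = True
    state: str = "active"

    def prepare(self) -> bool:
        # Phase 1: the server either promises it can commit, or refuses.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def atomic_commit(participants: List[Participant]) -> bool:
    """Commit on every participant, or on none of them."""
    prepared = []
    for p in participants:
        if p.prepare():
            prepared.append(p)
            continue
        # A single refusal aborts the whole transaction.
        for q in prepared:
            q.rollback()
        # Participants that were never asked to prepare are rolled back too.
        for q in participants:
            if q.state == "active":
                q.rollback()
        return False
    # Phase 2: every participant promised, so the decision is "commit".
    for p in participants:
        p.commit()
    return True
```

Calling atomic_commit on a list where one participant has can_prepare=False leaves every participant in the "aborted" state, which is the property item 1 asks for. Items 2 and 3 (atomic visibility and consistent snapshots) are not addressed by this sketch.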
Again, could we organize a meeting somewhere in September? The US is not good
for us, but other places should be OK. I want to have an agreement at
least on XTM. We are still testing various approaches, though. We could
present the results of our experiments and are open to discussion. It's not
an easy project, but it's something we could do for 9.6.
I'm very glad Bruce started this discussion in -hackers, since it's silly
to me to participate in both threads :) Let's meet in September!
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On Mon, Aug 31, 2015 at 5:48 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Aug 30, 2015 at 10:08:06PM -0400, Bruce Momjian wrote:
On Mon, Aug 31, 2015 at 09:53:57AM +0900, Michael Paquier wrote:
Well, I have had many such discussions with XC/XL folks, and that was my
opinion. I have seen almost no public discussion about this because the
idea had almost no chance of success. If it was possible, someone would
have already suggested it on this list.
Or perhaps people invested in this area had other obligations or lacked
motivation and/or time to work to push up for things in core. That's not
possible to know, and what is done is done.
Well, I have talked to everyone privately about this, and concluded that
while horizontal scalability/sharding is useful, it is unlikely that the
code volume of something like XC or XL would be accepted into the
community, and frankly, now that we have FDWs, it is hard to imagine why
we would _not_ go in the FDW direction.
Actually, there was hope that XC or XL would get popular enough that it
would justify adding their code into community Postgres, but that never
happened.
AFAIK, XC/XL already has some customers, and that is additional pressure
on their development team, which is now called X2. I don't know exactly how
Huawei's internal MPPDB is connected to XC/XL.
We need a community test suite for clusters, and our company is working on
this. It's non-trivial work, but the community will never accept any cluster
solution without thorough testing of functionality and performance. Our
XC/XL experience was not good.
We are also a bit disappointed by Huawei's position on the CSN patch, which
we hoped to use for our XTM.
Disappointed in what way? Moving to some sort of CSN approach seems to open
things up for different future ideas. In the short term, it would mean
replacing potentially large snapshots and longer visibility checks. In the
long term, perhaps CSN could help simplify the design of multi-master
replication schemes.
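To make the short-term benefit concrete, here is a toy model of CSN-based visibility, written in Python for brevity. It is a simplified sketch and is neither the Huawei patch nor PostgreSQL's actual implementation; the class and method names (CsnLog, take_snapshot, is_visible) are made up for illustration. What it shows is that a snapshot shrinks to a single number, and a visibility check becomes one lookup plus one comparison instead of a search through a list of in-progress transactions.

```python
class CsnLog:
    """Toy commit log mapping transaction id -> commit sequence number (CSN)."""

    def __init__(self):
        self.next_csn = 1
        self.commit_csn = {}  # xid -> CSN; a missing xid has not committed

    def commit(self, xid):
        # Each commit receives the next value of a monotonically increasing counter.
        self.commit_csn[xid] = self.next_csn
        self.next_csn += 1

    def take_snapshot(self):
        # The whole snapshot is one number: the last CSN handed out so far.
        return self.next_csn - 1

    def is_visible(self, xid, snapshot_csn):
        # Visible only if the transaction committed at or before the snapshot.
        csn = self.commit_csn.get(xid)
        return csn is not None and csn <= snapshot_csn
```

In this model a transaction that commits after take_snapshot() was called gets a larger CSN and is therefore invisible to that snapshot, without the snapshot having to record which transactions were running when it was taken.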
The FDW approach has been actively criticized by the pg_shard people, and that
also made me a bit suspicious. It looks like we are doomed to continue
several development forks, so we decided to work on a very important common
project, XTM, which we hoped could be accepted by all parties and
eventually committed to 9.6. Now I see we were right, unfortunately.
I think the original XC project probably would have taken the FDW approach
as a basis if it had existed, with focus on push-down optimizations.
I assume that future work around PG sharding probably would be more likely
to be accepted with the FDW approach. One could perhaps work on pushing
down joins, aggregates and order by, then look at any optimizations gained
if code is moved outside of FDW. It would make sense if some kind of
generic optimization for foreign tables for SQL-based sources could be
leveraged across all databases, rather than having to re-implement for each
FDW.
There are different approaches and related features that may need to be
improved.
Do we want multiple copies of shards, like the pg_shard approach? Or keep
things simpler and leave it up to the DBA to add standbys?
Do we want to leverage table inheritance? If so, we may want to spend time
improving performance for when the number of shards becomes large with what
currently exists. If using table inheritance, we could add the ability to
specify what node (er, foreign server) the subtable lives on. We could
create top level sharding expressions that allow these to be implicitly
created.
Should we allow arbitrary expressions for shards, not just range, list and
hash?
Maybe the most community-acceptable approach would look something like
- Use FDWs, and continue to optimize push-down operations, also for
non-PostgreSQL databases.
- Use table inheritance for defining the shards. Ideally allow for
specifying that some shards may be replicated to other foreign servers (and
itself) (for pushing down joins with lookup/static tables; at this point it
should be decent for star schema based data warehouses).
- XTM/GTM hooks. Preferably we move to CSN for snapshots in core PostgreSQL
though.
Longer term, efficient internode joins would require a lot more work.
The devil is in the details. There are things that have to be addressed,
for example, if using global XIDs via GTM, not every transaction is on
every node, so we need to make sure that new clog pages get added
properly. There is also the potential to require a lot more code to be
added, like for cursor handling and stored functions. Perhaps some
limitations when using shards to foreign servers are acceptable if it is
desired to minimize code changes. The XC and XL code can help here.
Regards,
Mason
On Mon, Aug 31, 2015 at 02:48:31PM -0400, Mason S wrote:
I assume that future work around PG sharding probably would be more likely to
be accepted with the FDW approach. One could perhaps work on pushing down
joins, aggregates and order by, then look at any optimizations gained if code
is moved outside of FDW. It would make sense if some kind of generic
optimization for foreign tables for SQL-based sources could be leveraged across
all databases, rather than having to re-implement for each FDW.
There are different approaches and related features that may need to be
improved.
Do we want multiple copies of shards, like the pg_shard approach? Or keep
things simpler and leave it up to the DBA to add standbys?
I agree with all of the above.
Do we want to leverage table inheritance? If so, we may want to spend time
improving performance for when the number of shards becomes large with what
currently exists. If using table inheritance, we could add the ability to
specify what node (er, foreign server) the subtable lives on. We could create
top level sharding expressions that allow these to be implicitly created.
Should we allow arbitrary expressions for shards, not just range, list and
hash?
Maybe the most community-acceptable approach would look something like
I think everyone agrees that our current partitioning setup is just too
verbose and error-prone for users, and needs a simpler interface, and
one that can be better optimized internally. I assume FDW-based
sharding will benefit from that work as well.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Mon, Aug 31, 2015 at 9:48 PM, Mason S <masonlists@gmail.com> wrote:
We are also a bit disappointed by Huawei's position on the CSN patch, which
we hoped to use for our XTM.
Disappointed in what way? Moving to some sort of CSN approach seems to
open things up for different future ideas. In the short term, it would mean
replacing potentially large snapshots and longer visibility checks. In the
long term, perhaps CSN could help simplify the design of multi-master
replication schemes.
We are disappointed because, in their PGCon talk, Huawei announced that they
would publish their CSN patch and continue work in this direction together
with the community. However, it has not even been published yet, despite all
the promises. Nobody from Huawei answers the CSN thread on -hackers.
So I think we got nothing from Huawei except teasing, and we should rely only
on ourselves. That is disappointing.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Aug 31, 2015 at 2:12 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
AFAIK, XC/XL already has some customers, and that is additional pressure
on their development team, which is now called X2. I don't know exactly how
Huawei's internal MPPDB is connected to XC/XL.
Huawei's MPPDB is based on PG-XC and tailored more toward OLAP scenarios.
The basic idea is that OLAP needs a shared-nothing scale-out
architecture for reads and writes. It needs OK TP performance and a
restricted set of functionality, and thus avoids some problems like
the GTM being a central scaling bottleneck.
I advocate merging scale-out features into PostgreSQL core, if we are
ready to face some long-term functional discrepancies between the two
deployments.
Regards,
Qingqing
All, Bruce:
First, let me put out there that I think the horizontal scaling project
which has buy-in from the community and we're working on is infinitely
better than the one we're not working on or is an underresourced fork.
So we're in agreement on that. However, I think there's a lot of room
for discussion; I feel like the FDW approach was decided in exclusive
meetings involving a very small number of people. The FDW approach
*may* be the right approach, but I'd like to see some rigorous
questioning of that before it's final.
Particularly, I'm concerned that we already have two projects in process
aimed at horizontal scalability, and it seems like we could bring either
(or both) projects to production quality MUCH faster than we could make
an FDW-based solution work. These are:
* pg_shard
* BDR
It seems worthwhile, just as a thought experiment, to ask whether we can get
where we want using those, faster, or by combining those with new FDW features.
It's also important to recognize that there are three major use-cases
for write-scalable clustering:
* OLTP: small-medium cluster, absolute ACID consistency,
  bottlenecked on small writes per second
* DW: small-large cluster, ACID optional,
  bottlenecked on bulk reads/writes
* Web: medium to very large cluster, ACID optional,
  bottlenecked on # of connections
We cannot possibly solve all of the above at once, but to the extent
that we recognize all 3 use cases, we can build core features which can
be adapted to all of them.
I'm also going to pontificate that, for a future solution, we should not
focus on write *IO*, but rather on CPU and RAM. The reason for this
thinking is that, with the latest improvements in hardware and 9.5
improvements, it's increasingly rare for machines to be bottlenecked on
writes to the transaction log (or the heap). This has some implications
for system design. For example, solutions which require all connections
to go through a single master node do not scale sufficiently to be worth
bothering with.
On some other questions from Mason:
Do we want multiple copies of shards, like the pg_shard approach? Or
keep things simpler and leave it up to the DBA to add standbys?
We want multiple copies of shards created by the sharding system itself.
Having a separate, and completely orthogonal, redundancy system to the
sharding system is overly burdensome on the DBA and makes low-data-loss
HA impossible.
Do we want to leverage table inheritance? If so, we may want to spend
time improving performance for when the number of shards becomes large
with what currently exists. If using table inheritance, we could add the
ability to specify what node (er, foreign server) the subtable lives on.
We could create top level sharding expressions that allow these to be
implicitly created.
IMHO, given that we're looking at replacing inheritance because of its
many documented limitations, building sharding on top of inheritance
seems unwise. For example, many sharding systems are hash-based; how
would an inheritance system transparently use hash keys?
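For readers unfamiliar with hash-based sharding, the routing step in question looks roughly like the sketch below. It is written in Python purely for illustration and makes several assumptions: the function names, the shard count, and the orders_shard_N table naming are all hypothetical, and SHA-256 is used only because it gives a stable value across processes (Python's built-in hash() is randomized for strings).

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a sharding key to a shard number in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a stable hash so every node computes the same shard for the same key.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def child_table_for(key: str, num_shards: int, parent: str = "orders") -> str:
    """Return the (hypothetical) child table name that would hold this key."""
    return f"{parent}_shard_{shard_for(key, num_shards)}"
```

The target shard is a computed function of the key, not a range or list of key values, and that is the part a sharding layer would have to apply transparently on every insert and lookup.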
Should we allow arbitrary expressions for shards, not just range, list
and hash?
That seems like a 2.0 feature. It also doesn't seem necessary to
support it for the moderately skilled user; that is, requiring a special
C sharding function for this seems fine to me.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On 08/31/2015 01:16 PM, Josh Berkus wrote:
All, Bruce:
I'm also going to pontificate that, for a future solution, we should not
focus on write *IO*, but rather on CPU and RAM. The reason for this
thinking is that, with the latest improvements in hardware and 9.5
improvements, it's increasingly rare for machines to be bottlenecked on
writes to the transaction log (or the heap). This has some implications
for system design. For example, solutions which require all connections
to go through a single master node do not scale sufficiently to be worth
bothering with.
We see this already, under very high concurrency (lots of connections,
many cores) we often see a significant drop in performance that is not
related to IO in any meaningful way.
JD
--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.
On Mon, Aug 31, 2015 at 4:16 PM, Josh Berkus <josh@agliodbs.com> wrote:
First, let me put out there that I think the horizontal scaling project
which has buy-in from the community and we're working on is infinitely
better than the one we're not working on or is an underresourced fork.
So we're in agreement on that. However, I think there's a lot of room
for discussion; I feel like the FDW approach was decided in exclusive
meetings involving a very small number of people. The FDW approach
*may* be the right approach, but I'd like to see some rigorous
questioning of that before it's final.
It seems to me that sharding consists of (1) breaking your data set up
into shards, (2) possibly replicating some of those shards onto
multiple machines, and then (3) being able to access the remote data
from local queries. As far as (1) is concerned, we need declarative
partitioning, which is being worked on by Amit Langote. As far as (2)
is concerned, I hope and expect BDR, or technology derived therefrom,
to eventually fill that need. As far as (3) is concerned, why
wouldn't we use the foreign data wrapper interface, and specifically
postgres_fdw? That interface was designed for the explicit purpose of
allowing access to remote data sources, and a lot of work has been put
into it, so it would be highly surprising if we decided to throw that
away and develop something completely new from the ground up.
It's true that postgres_fdw doesn't do everything we need yet. The
new join pushdown hooks aren't used by postgres_fdw yet, and the API
itself has some bugs with EvalPlanQual handling. Aggregate pushdown
is waiting on upper planner path-ification. DML pushdown doesn't
exist yet, and the hooks that would enable pushdown of ORDER BY
clauses to the remote side aren't being used by postgres_fdw. But all
of these things have been worked on. Patches for many of them have
already been posted. They have suffered from a certain amount of
neglect by senior hackers, and perhaps also from a shortage of time on
the part of the authors. But an awful lot of the work that is needed
here has already been done, if only we could get it committed.
Aggregate pushdown is a notable exception, but abandoning the foreign
data wrapper approach in favor of something else won't fix that.
Postgres-XC developed a purpose-built system for talking to other
nodes instead of using the FDW interface, for the very good reason
that the FDW interface did not yet exist at the time that Postgres-XC
was created. But several people associated with the XC project have
said, including one on this thread, that if it had existed, they
probably would have used it. And it's hard to see why you wouldn't:
with XC's approach, the remote data source is presumed to be
PostgreSQL (or Postgres-XC/XL/X2/whatever); and you can only use the
facility as part of a sharding solution. The FDW interface can talk
to anything, and it can be used for stuff other than sharding, like
making one remote table appear local because you just happen to want
that for some reason. This makes the XC approach look rather brittle
by comparison. I don't blame the XC folks for taking the shortest
path between two points, but FDWs are better, and we ought to try to
leverage that.
Particularly, I'm concerned that we already have two projects in process
aimed at horizontal scalability, and it seems like we could bring either
(or both) projects to production quality MUCH faster than we could make
an FDW-based solution work. These are:
* pg_shard
* BDR
It seems worthwhile, just as a thought experiment, to ask whether we can get
where we want using those, faster, or by combining those with new FDW features.
I think it's abundantly clear that we need a logical replication
solution as part of any horizontal scalability story. People will
want to do things like have 10 machines with each piece of data on 3
of them, and there won't be any reasonable way of doing that without
logical replication. I assume that BDR, or some technology derived
from it, will end up in core and solve that problem. I had actually
hoped we were going to get that in 9.5, but it didn't happen that way.
Still, I think that getting first single-master, and then eventually
multi-master, logical replication in core is absolutely critical. And
not just for sharding specifically: replicating your whole database to
several nodes and load-balancing your clients across them isn't
sharding, but it does give you read scalability and is a good fit for
people with geographically dispersed data with good geographical
locality. I think a lot of people will want that.
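The "10 machines with each piece of data on 3 of them" layout can be pictured with a small placement sketch. This is only one possible scheme (simple round-robin), written in Python for illustration; the function name place_shards and the node names are hypothetical, and nothing here says how the copies are kept in sync, which is the job the paragraph above assigns to logical replication.

```python
def place_shards(num_shards, nodes, copies=3):
    """Assign each shard to `copies` distinct nodes, round-robin."""
    if copies > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    placement = {}
    for shard in range(num_shards):
        # Shard s lives on nodes s, s+1, ..., s+copies-1 (wrapping around).
        placement[shard] = [nodes[(shard + i) % len(nodes)] for i in range(copies)]
    return placement


nodes = [f"node{i}" for i in range(10)]
placement = place_shards(20, nodes)
```

With 20 shards spread over 10 nodes this way, every shard has three distinct homes and every node ends up hosting six shard copies, so losing any one machine still leaves two copies of each affected shard.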
I'm not quite sure yet how we can marry declarative partitioning and
better FDW-pushdown and logical replication into one seamless, easy to
deploy solution that requires very low administrator effort. But I am
sure that each of those things, taken individually, is very useful,
and that being able to construct a solution from those building blocks
would be a big improvement over what we have today. I can't imagine
that trying to do one monolithic project that provides all of those
things, but only if you combine them in the specific way that the
designer had in mind, is ever going to be successful. People _will_
want access to each of those features in an unbundled fashion. And,
trying to do them all together leads to trying to solve too many
problems at once. I think the history of Postgres-XC is a cautionary
tale there.
I don't really understand how pg_shard fits into this equation. It
looks to me like it does some interesting things but, for example, it
doesn't support JOIN pushdown, and suggests that you use the
proprietary CitusDB engine if you need that. But I think JOIN
pushdown is something we want to have in core, not something where we
want to point people to proprietary alternatives. And it has some
restrictions on INSERT statements - they have to contain only values
which are constants or which can be folded to constants. I'm just
guessing, but I bet that's probably due to some limitation which
pg_shard, being out of core, has difficulty overcoming, but we can do
better in core. Basically I guess I expect much of what pg_shard does
to be subsumed as we improve FDWs, but maybe not all of it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company