WIP: Upper planner pathification
Those with long memories will recall that I've been waving my arms about
$SUBJECT for more than five years. I started to work seriously on a patch
last summer, and here is a version that I feel comfortable exposing to
public scrutiny (which is not to call it "done"; more below).
The basic point of this patch is to apply the generate-and-compare-Paths
paradigm to the planning steps after query_planner(), which only covers
scan and join processing (the FROM and WHERE parts of a query). These
later steps deal with grouping/aggregation, window functions, SELECT
DISTINCT, ORDER BY, LockRows (SELECT FOR UPDATE), LIMIT/OFFSET, and
ModifyTable. Also UNION/INTERSECT/EXCEPT. Back in the bad old days we
had only one way to do any of that stuff, so there was no real problem
with the approach of converting query_planner's answer into a Plan and
then stacking more Plan nodes atop that. Over time we grew other ways
to do those steps, and chose between those ways with ad-hoc code in
grouping_planner(). That was messy enough in itself, but it had other
disadvantages too: subquery_planner() had to choose and return a single
Plan, without regard to what the outer query might need. (Well, we did
pass down a tuple_fraction parameter, but that is a pretty limited bit of
information.)
An even larger problem is that we had no way to handle addition of new
alternative plan types for these upper-planning steps without fundamental
hacking on grouping_planner(). An example is the code I added in commit
addc42c339208d6a and later (planagg.c and other places) for optimization
of MIN/MAX aggregates: that code had a positively incestuous relationship
with grouping_planner(), and was darn ugly in multiple other ways besides.
Of late, the main way this issue has surfaced is that we have no practical
way to plan pushdown of aggregates or updates on remote tables to the
responsible FDWs, because the FDWs cannot create Paths representing such
operations.
The present patch addresses this problem by inventing Path nodes to
represent every post-scan/join step, and changing the API of
grouping_planner() and subquery_planner() so that they return sets of
Paths rather than single Plans. Creation of a Plan tree happens only
after control returns to the top level of standard_planner(). The Path
nodes for these post-scan/join steps are attached to "upper relation"
RelOptInfos that didn't exist before. There are provisions for FDWs to
inject candidate Paths for these upper-level steps. As proof of concept
for that, planagg.c has been revised to work by injecting a new Path
into the grouping/aggregation upper rel, rather than predetermining what
the answer will be. This vastly decreases its coupling with both
grouping_planner and some other parts of the system such as equivclass.c
(though, the Law of Conservation of Cruft being what it is, I did have to
push some knowledge about planagg.c's work into setrefs.c).
I'm pretty pleased with the way this turned out. grouping_planner() is
about half the length it was before, and much more straightforward IMO.
planagg.c no longer seems like a complete hack; it's a reasonable
prototype for injecting nontraditional implementation paths into
aggregation or other late planner stages, and grouping_planner() doesn't
need to know about it.
The patch does add a lot of net new lines (and it's not done) but
most of the new code is very straightforward boilerplate.
The main thing that makes this WIP and not committable is that I've not
yet bothered to implement outfuncs.c code and some other debug support for
all the new path struct types. A lot of the new function header comments
remain to be fleshed out too, and some more documentation needs to be
written. But I think it's reviewable as-is; the other stuff would just
make it even longer but not more interesting.
There's a lot of future work to be done within this skeleton. Notably,
I did not fix the UNION/INTERSECT/EXCEPT planning code to consider
multiple paths; it still only generates a single Path tree. That code
needs to be rewritten from scratch, probably, and it seems like doing so
is a separate project. I'd also like to do some more refactoring in
createplan.c: some code paths are still doing redundant cost estimation,
and I'm growing increasingly dissatisfied with the "use_physical_tlist"
hack. But that seems like a separable issue as well.
So, where to go from here? I'm acutely aware that we're hard up against
the final 9.6 commitfest, and that we discourage major patches arriving
so late in a devel cycle. But I simply couldn't get this done any faster.
I don't really want to hold it over for the 9.7 devel cycle. It's been
enough trouble maintaining this patch in the face of conflicting commits
over the last year or so (it's probably still got bugs related to parallel
query...), and there definitely are conflicting patches in the upcoming
'fest. And the lack of this infrastructure is blocking progress on FDWs
and some other things.
So I'd really like to get this into 9.6. I'm happy to put it into the
March commitfest if someone will volunteer to review it.
Comments?
regards, tom lane
Attachments:
upper-planner-pathification-1.patch.gzapplication/x-gzip; name=upper-planner-pathification-1.patch.gzDownload+3-1
Hi,
On 2016-02-28 15:03:28 -0500, Tom Lane wrote:
Those with long memories will recall that I've been waving my arms about
$SUBJECT for more than five years. I started to work seriously on a patch
last summer, and here is a version that I feel comfortable exposing to
public scrutiny (which is not to call it "done"; more below).
Yay!
So, where to go from here? I'm acutely aware that we're hard up against
the final 9.6 commitfest, and that we discourage major patches arriving
so late in a devel cycle. But I simply couldn't get this done any faster.
I don't really want to hold it over for the 9.7 devel cycle. It's been
enough trouble maintaining this patch in the face of conflicting commits
over the last year or so (it's probably still got bugs related to parallel
query...), and there definitely are conflicting patches in the upcoming
'fest. And the lack of this infrastructure is blocking progress on FDWs
and some other things.So I'd really like to get this into 9.6. I'm happy to put it into the
March commitfest if someone will volunteer to review it.
Hard. This is likely to cause/trigger a number of bugs, and we don't
have much time to let this mature. It's a change that we're unlikely to
be able to back-out if we discover that it wasn't the right thing to
integrate shortly before the release. On the other hand, this is a
major architectural step forward; one that unblocks a number of nice
features. There's also an argument to be made that integrating this now
is beneficial, because it'll cause less churn for patches being
developed while 9.6 is stabilizing.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 28 February 2016 at 20:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So, where to go from here? I'm acutely aware that we're hard up against
the final 9.6 commitfest, and that we discourage major patches arriving
so late in a devel cycle. But I simply couldn't get this done any faster.
I don't really want to hold it over for the 9.7 devel cycle. It's been
enough trouble maintaining this patch in the face of conflicting commits
over the last year or so (it's probably still got bugs related to parallel
query...), and there definitely are conflicting patches in the upcoming
'fest. And the lack of this infrastructure is blocking progress on FDWs
and some other things.
Thanks for working on this; it is important.
I'm disappointed to see you do this because of FDWs, with the "some other
things" like parallel aggregation not getting a mention by name.
While I wouldn't mind seeing this go in, what worries me is the multiple
other patches that now need to be rewritten to exploit this and since some
aren't mentioned would it be reasonable to imagine those other things won't
be prioritised for this release? Or will we be deciding to elongate the
integration phase to cope with this? Delay or favour forks, which should we
choose?
Anyway, glad to see you will now experience the problems of maintaining
large patches across multiple releases and/or the difficulty of arguing in
favour of patches that still require work going in at the last minute. Not
with relish, just so that understanding isn't limited to the usual suspects
of feature-crime.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Feb 28, 2016 at 3:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So, where to go from here? I'm acutely aware that we're hard up against
the final 9.6 commitfest, and that we discourage major patches arriving
so late in a devel cycle. But I simply couldn't get this done any faster.
I don't really want to hold it over for the 9.7 devel cycle. It's been
enough trouble maintaining this patch in the face of conflicting commits
over the last year or so (it's probably still got bugs related to parallel
query...), and there definitely are conflicting patches in the upcoming
'fest. And the lack of this infrastructure is blocking progress on FDWs
and some other things.So I'd really like to get this into 9.6. I'm happy to put it into the
March commitfest if someone will volunteer to review it.
I'll abstain from the question of whether this patch is too late in
coming (but the answer is probably "yes") and instead volunteer to
review it.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
I'll abstain from the question of whether this patch is too late in
coming (but the answer is probably "yes") and instead volunteer to
review it.
OK, I've put it into the commitfest. Thanks for volunteering!
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
The basic point of this patch is to apply the generate-and-compare-Paths
paradigm to the planning steps after query_planner(), which only covers
...
The present patch addresses this problem by inventing Path nodes to
represent every post-scan/join step
I'm really glad to see that. Separating path nodes for later steps opens a new
ways to optimize queries. For first glance, consider
select * from a left outer join b on a.i = b.i limit 1;
Limit node could be pushed down to scan over 'a' table if b.i is unique.
I tried to look into patch and I had a question (one for now): why LimitPath
doesn't contain actual limit/offset value? I saw a lot of subqueries with LIMIT
1 which could be transformed into EXISTS subquery.
So I'd really like to get this into 9.6.
Me too. I applied the patch and can confirm that 'make test' doesn't fail on
FreeBSD 10.2. Now I will try to run kind of TPC-H with and without patch.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 1, 2016 at 3:11 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
The basic point of this patch is to apply the generate-and-compare-Paths
paradigm to the planning steps after query_planner(), which only covers
...
The present patch addresses this problem by inventing Path nodes to
represent every post-scan/join stepI'm really glad to see that. Separating path nodes for later steps opens a
new ways to optimize queries. For first glance, consider
select * from a left outer join b on a.i = b.i limit 1;
Limit node could be pushed down to scan over 'a' table if b.i is unique.
This patch opens a lot of possibilities to our ongoing project on indexing
subselects, which we plan to use for jsonb. Having it in 9.6 will
certainly facilitate this. So, I'm +1 for this patch, even if we have to
postpone 9.6 a bit. Hope, Robert, Teodor and other reviewers could help Tom
with this patch.
Show quoted text
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW:
http://www.sigaev.ru/--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2/28/16 4:02 PM, Andres Freund wrote:
So, where to go from here? I'm acutely aware that we're hard up against
the final 9.6 commitfest, and that we discourage major patches arriving
so late in a devel cycle. But I simply couldn't get this done any faster.
I don't really want to hold it over for the 9.7 devel cycle. It's been
enough trouble maintaining this patch in the face of conflicting commits
over the last year or so (it's probably still got bugs related to parallel
query...), and there definitely are conflicting patches in the upcoming
'fest. And the lack of this infrastructure is blocking progress on FDWs
and some other things.So I'd really like to get this into 9.6. I'm happy to put it into the
March commitfest if someone will volunteer to review it.Hard. This is likely to cause/trigger a number of bugs, and we don't
have much time to let this mature. It's a change that we're unlikely to
be able to back-out if we discover that it wasn't the right thing to
integrate shortly before the release. On the other hand, this is a
major architectural step forward; one that unblocks a number of nice
features. There's also an argument to be made that integrating this now
is beneficial, because it'll cause less churn for patches being
developed while 9.6 is stabilizing.
Perhaps the best way to handle this would be to commit it to a branch
sooner rather than later. If things work out, that branch can become the
official beta. If not, in can become the basis for 9.7.
If nothing else it means that Tom isn't the only one stuck trying to
maintain this. Even if the branch is nothing but a means to generating a
patch for 9.7, having it in place makes it a lot easier for other
developers that need to to code against it.
While I'm promoting heresy... I imagine that this patch doesn't require
a catversion bump. Perhaps it would be worth doing a short-cycle major
release just to get this in. That might sound insane but since one of
the biggest obstacles to upgrading remains dealing with the on-disk
format, I don't think users would freak out about it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Teodor Sigaev <teodor@sigaev.ru> writes:
I tried to look into patch and I had a question (one for now): why LimitPath
doesn't contain actual limit/offset value? I saw a lot of subqueries with LIMIT
1 which could be transformed into EXISTS subquery.
Oh, yeah, I intended to change that but didn't get to it yet. Consider
it done.
Me too. I applied the patch and can confirm that 'make test' doesn't fail on
FreeBSD 10.2. Now I will try to run kind of TPC-H with and without patch.
I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later. There are a couple of
regression test cases that change plans for the better, but it's sort of
accidental. Those cases look like
select d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;
and what happens in HEAD is that the subquery chooses a hashagg plan
and then the upper query decides a mergejoin would be a good idea ...
so it has to sort the output of the hashagg. With the patch, what
comes back from the subquery is a Path for the hashagg and a Path
for doing the GROUP BY with Sort/Uniq. The second path is more expensive,
but it survives the add_path tournament because it can produce sorted
output. Then the outer level discovers that it can use that to do its
mergejoin without a separate sort step, and that way is cheaper overall.
So instead of
! -> Sort
! Sort Key: s.id
! -> Subquery Scan on s
! -> HashAggregate
! Group Key: b.id
! -> Seq Scan on b
we get
! -> Group
! Group Key: b.id
! -> Index Scan using b_pkey on b
which is noticeably cheaper, and not just because we got rid of the
Subquery Scan node. So that's nice --- but it's more or less accidental,
because the outer level isn't telling the inner level that this sort order
might be interesting.
Once this infrastructure is in place, I want to look at passing down more
information to recursive subquery_planner calls so that we're not leaving
this kind of optimization to chance. But the patch is big enough already,
so that (and a lot of other things) are getting left for later.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 1, 2016 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are a couple of
regression test cases that change plans for the better, but it's sort of
accidental. Those cases look likeselect d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;and what happens in HEAD is that the subquery chooses a hashagg plan
and then the upper query decides a mergejoin would be a good idea ...
so it has to sort the output of the hashagg. With the patch, what
comes back from the subquery is a Path for the hashagg and a Path
for doing the GROUP BY with Sort/Uniq. The second path is more expensive,
but it survives the add_path tournament because it can produce sorted
output. Then the outer level discovers that it can use that to do its
mergejoin without a separate sort step, and that way is cheaper overall.
This doesn't sound accidental at all. It sounds like a perfect example
of exactly the benefits of this approach.
I read through the patch just to get an idea what's changing. But
obviously that's not going to actually turn up anything surprising.
(Actually the first hunk in the patch kind of surprised me. Do we dump
node trees with -> notation currently? I thought they normally all
looked like sexpressions.)
--
greg
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Greg Stark <stark@mit.edu> writes:
On Tue, Mar 1, 2016 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
There are a couple of
regression test cases that change plans for the better, but it's sort of
accidental. Those cases look likeselect d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;and what happens in HEAD is that the subquery chooses a hashagg plan
and then the upper query decides a mergejoin would be a good idea ...
so it has to sort the output of the hashagg. With the patch, what
comes back from the subquery is a Path for the hashagg and a Path
for doing the GROUP BY with Sort/Uniq. The second path is more expensive,
but it survives the add_path tournament because it can produce sorted
output. Then the outer level discovers that it can use that to do its
mergejoin without a separate sort step, and that way is cheaper overall.
This doesn't sound accidental at all. It sounds like a perfect example
of exactly the benefits of this approach.
Well, my point is that no such path would have been generated if the
subquery hadn't had an internal reason to consider sorting on b.id.
The "accidental" part of this is that the subquery's GROUP BY key
matches what the outer query needs as a mergejoin key.
(Actually the first hunk in the patch kind of surprised me. Do we dump
node trees with -> notation currently? I thought they normally all
looked like sexpressions.)
I chose in 19a541143 to not make PathTarget be a subclass of Node,
so that's kind of forced --- we can't print it by recursing to
_outNode(). We could change that but I'm not sure it would be an
improvement. The restarget fields are embedded in RelOptInfo, not
sub-nodes of it, so pretending that they're independent nodes seems
a bit phony in its own way. I'm not wedded to that reasoning though;
if people are more concerned about what pprint() output looks like,
we can change it. Or we could make restarget actually be a subnode,
at the cost of one more palloc per RelOptInfo.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later. There are a couple of
Yep, for now on my notebook (best from 5 tries):
% pgbench -i -s 3000
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60
HEAD 569 tps
patched 542 tps
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60 -S
HEAD 9500 tps
patched 9458 tps
Looks close to measurement error, but may be explained increased amount of work
for planning. Including, may be, more complicated path tree.
this kind of optimization to chance. But the patch is big enough already,
so that (and a lot of other things) are getting left for later.
Agree
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Teodor Sigaev <teodor@sigaev.ru> writes:
I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later. There are a couple of
Yep, for now on my notebook (best from 5 tries):
% pgbench -i -s 3000
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60
HEAD 569 tps
patched 542 tps
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60 -S
HEAD 9500 tps
patched 9458 tps
Looks close to measurement error, but may be explained increased amount of work
for planning. Including, may be, more complicated path tree.
I think the default pgbench queries are too simple to have any possible
benefit from this patch. It does look like you're seeing some extra
planning time, which I think is likely due to redundant construction
of PathTargets. The new function set_pathtarget_cost_width() is not
very cheap, and in order to minimize the delta in this patch I did
not worry much about avoiding duplicate calls of it. That's another
thing in a long list of things to do later ;-). There might be other
pain points I haven't recognized yet.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Teodor Sigaev <teodor@sigaev.ru> writes:
I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later. There are a couple ofYep, for now on my notebook (best from 5 tries):
% pgbench -i -s 3000
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60
HEAD 569 tps
patched 542 tps
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60 -S
HEAD 9500 tps
patched 9458 tpsLooks close to measurement error, but may be explained increased amount of work
for planning. Including, may be, more complicated path tree.I think the default pgbench queries are too simple to have any possible
benefit from this patch. It does look like you're seeing some extra
planning time, which I think is likely due to redundant construction
of PathTargets. The new function set_pathtarget_cost_width() is not
very cheap, and in order to minimize the delta in this patch I did
not worry much about avoiding duplicate calls of it. That's another
thing in a long list of things to do later ;-). There might be other
pain points I haven't recognized yet.
Yikes. The read-only test is an 0.5% hit which isn't great, but the
read-write test is about 5% which I think is clearly not OK. What's
your plan for doing something about that?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2 March 2016 at 13:47, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Teodor Sigaev <teodor@sigaev.ru> writes:
I do not think the patch will make a lot of performance difference
as-is;
its value is more in what it will let us do later. There are a couple
of
Yep, for now on my notebook (best from 5 tries):
% pgbench -i -s 3000
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60
HEAD 569 tps
patched 542 tps
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60 -S
HEAD 9500 tps
patched 9458 tpsLooks close to measurement error, but may be explained increased amount
of work
for planning. Including, may be, more complicated path tree.
I think the default pgbench queries are too simple to have any possible
benefit from this patch. It does look like you're seeing some extra
planning time, which I think is likely due to redundant construction
of PathTargets. The new function set_pathtarget_cost_width() is not
very cheap, and in order to minimize the delta in this patch I did
not worry much about avoiding duplicate calls of it. That's another
thing in a long list of things to do later ;-). There might be other
pain points I haven't recognized yet.Yikes. The read-only test is an 0.5% hit which isn't great, but the
read-write test is about 5% which I think is clearly not OK. What's
your plan for doing something about that?
Whether artefact of test, or real problem, clearly something fixable.
ISTM that we are clearly "going for it"; everybody agrees we should apply
the patch now.
The longer we hold off on applying it, the longer we wait for dependent
changes.
My vote is apply it early (i.e. now!) and clean up as we go.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Mar 1, 2016 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think the default pgbench queries are too simple to have any possible
benefit from this patch. It does look like you're seeing some extra
planning time, which I think is likely due to redundant construction
of PathTargets. The new function set_pathtarget_cost_width() is not
very cheap, and in order to minimize the delta in this patch I did
not worry much about avoiding duplicate calls of it. That's another
thing in a long list of things to do later ;-). There might be other
pain points I haven't recognized yet.
Yikes. The read-only test is an 0.5% hit which isn't great, but the
read-write test is about 5% which I think is clearly not OK. What's
your plan for doing something about that?
I do plan to take a look at it. Obviously, anything that *does* benefit
from this patch is going to see some planning slowdown as a consequence
of considering more Paths. But ideally, a query that has no grouping/
aggregation/later steps wouldn't see any difference. I think we can
get to that --- but I'd rather not complicate v1 with the hacks that
will probably be required.
(My first thought about how to fix that is to not force
set_pathtarget_cost_width to be done immediately on PathTarget
construction, but make it a decouplable step. I believe that
set_pathtarget_cost_width is only expensive if it's run before
query_planner runs, and we can probably finagle things so that we do not
really care about the cost/width attached to targets made before that.
But this all depends on profiling that I've not done yet...)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Simon Riggs wrote:
ISTM that we are clearly "going for it"; everybody agrees we should apply
the patch now.The longer we hold off on applying it, the longer we wait for dependent
changes.
Agreed -- we need this in tree as soon as realistically possible.
There is a a bit a problem here, because this patch conflicts heavily
with at least one other patch that's been in the queue for a long time,
which is Kommi/Rowley's patch for parallel aggregation; the more we
delay applying this one, the worse the deadlines for that one.
I assume they are hard at work updating that patch to apply on top of
Tom's patch. It's not realistic to expect that we would apply any
further planner changes before this one is in.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Agreed -- we need this in tree as soon as realistically possible.
There is a a bit a problem here, because this patch conflicts heavily
with at least one other patch that's been in the queue for a long time,
which is Kommi/Rowley's patch for parallel aggregation; the more we
delay applying this one, the worse the deadlines for that one.
I assume they are hard at work updating that patch to apply on top of
Tom's patch. It's not realistic to expect that we would apply any
further planner changes before this one is in.
I don't think it's quite that bad: the patch doesn't touch scan/join
planning very much, so for instance I doubt that the pending unique-joins
patch is completely broken. But yeah, anything having anything to do
with planning of grouping/aggregation or later stages is going to need
major revision to play with this.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Teodor Sigaev <teodor@sigaev.ru> writes:
Yep, for now on my notebook (best from 5 tries):
% pgbench -i -s 3000
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60
HEAD 569 tps
patched 542 tps
% pgbench -s 3000 -c 4 -j 4 -P 1 -T 60 -S
HEAD 9500 tps
patched 9458 tps
Looks close to measurement error, but may be explained increased amount
of work for planning. Including, may be, more complicated path tree.
Hmmm ... I'm now wondering about the "measurement error" theory.
I tried to repeat this measurement locally, focusing on the select-only
number since that should have a higher ratio of planning time to
execution.
Test setup:
cassert-off build
pgbench -i -s 100
sudo cpupower frequency-set --governor performance
repeat 3 times: pgbench -c 4 -j 4 -P 5 -T 60 -S
HEAD:
tps = 32508.217002 (excluding connections establishing)
tps = 33081.402766
tps = 32520.859913
average of 3: 32703 tps
WITH PATCH:
tps = 32815.922160 (excluding connections establishing)
tps = 33312.149718
tps = 32784.527489
average of 3: 32970 tps
(Hardware: dual quad-core Xeon E5-2609, running current RHEL6)
So I see no evidence for a slowdown on pgbench's SELECT queries.
Anybody else want to check performance on simple scan/join queries?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3 March 2016 at 04:29, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Simon Riggs wrote:
ISTM that we are clearly "going for it"; everybody agrees we should apply
the patch now.The longer we hold off on applying it, the longer we wait for dependent
changes.Agreed -- we need this in tree as soon as realistically possible.
There is a a bit a problem here, because this patch conflicts heavily
with at least one other patch that's been in the queue for a long time,
which is Kommi/Rowley's patch for parallel aggregation; the more we
delay applying this one, the worse the deadlines for that one.I assume they are hard at work updating that patch to apply on top of
Tom's patch. It's not realistic to expect that we would apply any
further planner changes before this one is in.
I agree that it would be good to get this in as soon as possible. I'm
currently very close to being done with writing Parallel Aggregate on
top of the upper planner changes. So far this version is much cleaner
as there's less cruft added compared with the other version, of which
would need to be removed again after the upper planner changes are in
anyway. Putting parallel aggregate in first would be asking Tom to
re-invent parallel aggregate when he rebases the upper planner stuff
on the new master branch, which makes very little sense.
So I agree that it would be nice to get the upper planner changes in
first, but soon! not at the end of March 'fest, as doing so would most
likely kill parallel aggregate for 9.6, and I kinda think that would
be silly as (I think) it's pretty much the biggest missing piece of
the parallel query set.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers