Choosing parallel_degree

Started by David Rowleyabout 10 years ago64 messageshackers
Jump to latest
#1David Rowley
dgrowleyml@gmail.com

Over in [1]/messages/by-id/CANkGpBtUvzpdvF2=_iQ64UjmVrPYcS6d4i9-wepbUsq1sq+AWw@mail.gmail.com James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

Here's the output of the C program.

For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
For 3001 pages there will be 2 workers (rel size 23 MB, 0 GB)
For 9001 pages there will be 3 workers (rel size 70 MB, 0 GB)
For 27001 pages there will be 4 workers (rel size 210 MB, 0 GB)
For 81001 pages there will be 5 workers (rel size 632 MB, 0 GB)
For 243001 pages there will be 6 workers (rel size 1898 MB, 1 GB)
For 729001 pages there will be 7 workers (rel size 5695 MB, 5 GB)
For 2187001 pages there will be 8 workers (rel size 17085 MB, 16 GB)
For 6561001 pages there will be 9 workers (rel size 51257 MB, 50 GB)
For 19683001 pages there will be 10 workers (rel size 153773 MB, 150 GB)
For 59049001 pages there will be 11 workers (rel size 461320 MB, 450 GB)
For 177147001 pages there will be 12 workers (rel size 1383960 MB, 1351 GB)
For 531441001 pages there will be 13 workers (rel size 4151882 MB, 4054 GB)
For 1594323001 pages there will be 14 workers (rel size 12455648 MB, 12163 GB)

[1]: /messages/by-id/CANkGpBtUvzpdvF2=_iQ64UjmVrPYcS6d4i9-wepbUsq1sq+AWw@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

parallel_degree.ctext/x-csrc; charset=US-ASCII; name=parallel_degree.cDownload
#2James Sewell
james.sewell@lisasoft.com
In reply to: David Rowley (#1)
Re: Choosing parallel_degree

Thanks David,

Eventually it would be great to take into account the cost of the function
doing the agg (pg_proc.procost, which is a multiple of CPU units).

This would allow people to mark specific aggregations as needing more CPU
power, therefore needing more workers per page (or should it be tuple in
this case?).

In the meantime some way to manually influence this would be good. I just
did some testing (on an 8VCPU machine) with a 139MB table, which gets 3
workers currently.

For a count(*) I get a time of 131.754 ms. If I increase this to 8 workers
I get around 86.193 ms.

Obviously this doesn't mean much as YMMV - but it does show that the
ability to manually adjust the scaling would be great, especially in
data warehouse or reporting environments.

I did want to test with some really slow aggs, but even when I take out the
small table test in create_parallel_paths I can't seem to get a parallel
plan for a tiny table. Any idea on why this would be David?

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect
______________________________________

Level 2, 50 Queen St, Melbourne VIC 3000

*P *(+61) 3 8370 8000 *W* www.lisasoft.com *F *(+61) 3 8370 8099

On Tue, Mar 15, 2016 at 12:25 PM, David Rowley <david.rowley@2ndquadrant.com

wrote:

Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

Here's the output of the C program.

For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
For 3001 pages there will be 2 workers (rel size 23 MB, 0 GB)
For 9001 pages there will be 3 workers (rel size 70 MB, 0 GB)
For 27001 pages there will be 4 workers (rel size 210 MB, 0 GB)
For 81001 pages there will be 5 workers (rel size 632 MB, 0 GB)
For 243001 pages there will be 6 workers (rel size 1898 MB, 1 GB)
For 729001 pages there will be 7 workers (rel size 5695 MB, 5 GB)
For 2187001 pages there will be 8 workers (rel size 17085 MB, 16 GB)
For 6561001 pages there will be 9 workers (rel size 51257 MB, 50 GB)
For 19683001 pages there will be 10 workers (rel size 153773 MB, 150 GB)
For 59049001 pages there will be 11 workers (rel size 461320 MB, 450 GB)
For 177147001 pages there will be 12 workers (rel size 1383960 MB, 1351 GB)
For 531441001 pages there will be 13 workers (rel size 4151882 MB, 4054 GB)
For 1594323001 pages there will be 14 workers (rel size 12455648 MB, 12163
GB)

[1]
/messages/by-id/CANkGpBtUvzpdvF2=_iQ64UjmVrPYcS6d4i9-wepbUsq1sq+AWw@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--

------------------------------
The contents of this email are confidential and may be subject to legal or
professional privilege and copyright. No representation is made that this
email is free of viruses or other defects. If you have received this
communication in error, you may not copy or distribute any part of it or
otherwise disclose its contents to anyone. Please advise the sender of your
incorrect receipt of this correspondence.

#3David Rowley
dgrowleyml@gmail.com
In reply to: James Sewell (#2)
Re: Choosing parallel_degree

On 15 March 2016 at 15:24, James Sewell <james.sewell@lisasoft.com> wrote:

I did want to test with some really slow aggs, but even when I take out the small table test in create_parallel_paths I can't seem to get a parallel plan for a tiny table. Any idea on why this would be David?

In the test program I attached to the previous email, if I change the
parallel_threshold = 1000; to be parallel_threshold = 1; then I get
the following output:

For 1 pages there will be 1 workers (rel size 0 MB, 0 GB)
For 4 pages there will be 2 workers (rel size 0 MB, 0 GB)

So I'm getting 2 workers for only 4 pages. I've not tested in
Postgres, but if you do this and: SET parallel_setup_cost = 0; then
I'd imagine it should generate a parallel plan.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#1)
Re: Choosing parallel_degree

On Mon, Mar 14, 2016 at 9:25 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#4)
Re: Choosing parallel_degree

On 15/03/2016 21:12, Robert Haas wrote:

On Mon, Mar 14, 2016 at 9:25 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path() ?

Also, global max_parallel_degree still needs to be at least 1 for the
per table value to be considered.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

choose_parallel_degree.difftext/plain; charset=UTF-8; name=choose_parallel_degree.diffDownload+77-7
#6David Rowley
dgrowleyml@gmail.com
In reply to: Julien Rouhaud (#5)
Re: Choosing parallel_degree

On 16 March 2016 at 13:26, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

On 15/03/2016 21:12, Robert Haas wrote:

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path() ?

Also, global max_parallel_degree still needs to be at least 1 for the
per table value to be considered.

Thanks for working on this. I've only skimmed the patch so far, but
will try to look more closely later.

This did get me wondering why we have the parallel_threshold at all,
and not just allow the parallel_setup_cost to make parallel plans look
less favourable for smaller relations. I assume that this is so that
we don't burden the planner with the overhead of generating parallel
paths for smaller relations?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#6)
Re: Choosing parallel_degree

On Tue, Mar 15, 2016 at 8:45 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 16 March 2016 at 13:26, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

On 15/03/2016 21:12, Robert Haas wrote:

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path() ?

Also, global max_parallel_degree still needs to be at least 1 for the
per table value to be considered.

Thanks for working on this. I've only skimmed the patch so far, but
will try to look more closely later.

This did get me wondering why we have the parallel_threshold at all,
and not just allow the parallel_setup_cost to make parallel plans look
less favourable for smaller relations. I assume that this is so that
we don't burden the planner with the overhead of generating parallel
paths for smaller relations?

Right. And, also, we need some heuristic for judging how many workers
to deploy. parallel_setup_cost is of no use in making that decision.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8James Sewell
james.sewell@lisasoft.com
In reply to: Julien Rouhaud (#5)
Re: Choosing parallel_degree

On Wed, Mar 16, 2016 at 11:26 AM, Julien Rouhaud <julien.rouhaud@dalibo.com>
wrote:

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path() ?

Also, global max_parallel_degree still needs to be at least 1 for the
per table value to be considered.

All applies and works from my end.

Is the max_parallel_degree per table of much use here? It allows the max
number of workers per table to be set - but it's still bound by the same
formula (now from the GUC). So in reality it's only really useful for
limiting the number of workers, not raising it.

Would there be a common use case for limiting parallelism on a subset of
tables in a database you've explicitly set to have a higher amount
of parallel operations via the GUC? I struggle to think of one?

I think in practicality the reverse would be more common, you'd want to set
certain tables to a starting point of a certain number of workers (and ramp
up to more if the formula allowed it). You could set this to 0 for
never use parallel
agg on this table.

Another option is to allow access to the the threshold multiplier
(currently hard coded to 3) per table - but this might become pretty hard
to explain succinctly in the documentation.

Cheers,
James

--

------------------------------
The contents of this email are confidential and may be subject to legal or
professional privilege and copyright. No representation is made that this
email is free of viruses or other defects. If you have received this
communication in error, you may not copy or distribute any part of it or
otherwise disclose its contents to anyone. Please advise the sender of your
incorrect receipt of this correspondence.

#9Julien Rouhaud
rjuju123@gmail.com
In reply to: James Sewell (#8)
Re: Choosing parallel_degree

On 16/03/2016 05:45, James Sewell wrote:

On Wed, Mar 16, 2016 at 11:26 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com <mailto:julien.rouhaud@dalibo.com>>wrote:

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path() ?

Also, global max_parallel_degree still needs to be at least 1 for the
per table value to be considered.

All applies and works from my end.

Thanks for testing!

Is the max_parallel_degree per table of much use here? It allows the max
number of workers per table to be set - but it's still bound by the same
formula (now from the GUC). So in reality it's only really useful for
limiting the number of workers, not raising it.

You can set a global max_parallel_degree low, and raise it per table. If
you set up max_parallel_degree to 1, you can "activate" parallel workers
for only a subset of tables.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#5)
Re: Choosing parallel_degree

On Tue, Mar 15, 2016 at 8:26 PM, Julien Rouhaud <julien.rouhaud@dalibo.com>
wrote:

On 15/03/2016 21:12, Robert Haas wrote:

On Mon, Mar 14, 2016 at 9:25 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path()

Hmm. I'm not sure I like the parallel_threshold GUC after all. That's a
little strange. But maybe.

For the reloption, I was thinking it would be parallel_degree, not
max_parallel_degree. max_parallel_degree would still control, so if the
parallel_degree for a given table was greater than max_parallel_degree,
you'd get max_parallel_degree instead. But you could crank up the
parallel_degree for a small table to force more parallelism when querying
it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#10)
Re: Choosing parallel_degree

On 16/03/2016 17:16, Robert Haas wrote:

On Tue, Mar 15, 2016 at 8:26 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com <mailto:julien.rouhaud@dalibo.com>> wrote:

On 15/03/2016 21:12, Robert Haas wrote:

On Mon, Mar 14, 2016 at 9:25 PM, David Rowley
<david.rowley@2ndquadrant.com <mailto:david.rowley@2ndquadrant.com>>

wrote:

Over in [1] James mentioned about wanting more to be able to have more
influence over the partial path's parallel_degree decision. At risk
of a discussion on that hijacking the parallel aggregate thread, I
thought I'd start this for anyone who would want to discuss making
changes to that.

I've attached a simple C program which shows the parallel_degree which
will be chosen at the moment. For now it's based on the size of the
base relation. Perhaps that will need to be rethought later, perhaps
based on costs. But I just don't think it's something for 9.6.

I thought about this a bit more. There are a couple of easy things we
could do here.

The 1000-page threshold could be made into a GUC.

We could add a per-table reloption for parallel-degree that would
override the calculation.

Neither of those things is very smart, but they'd probably both help
some people. If someone is able to produce a patch for either or both
of these things *quickly*, we could possibly try to squeeze it into
9.6 as a cleanup of work already done.

I'm not too familiar with parallel planning, but I tried to implement
both in attached patch. I didn't put much effort into the
parallel_threshold GUC documentation, because I didn't really see a good
way to explain it. I'd e happy to improve it if needed. Also, to make
this parameter easier to tune for users, perhaps we could divide the
default value by 3 and use it as is in the first iteration in
create_parallel_path()

Hmm. I'm not sure I like the parallel_threshold GUC after all. That's
a little strange. But maybe.

FWIW I have two commits in my local branch, so I can remove this one easily.

For the reloption, I was thinking it would be parallel_degree, not
max_parallel_degree. max_parallel_degree would still control, so if the
parallel_degree for a given table was greater than max_parallel_degree,
you'd get max_parallel_degree instead. But you could crank up the
parallel_degree for a small table to force more parallelism when
querying it.

Something like a "min_parallel_degree" then ?

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#11)
Re: Choosing parallel_degree

On Wed, Mar 16, 2016 at 12:47 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Something like a "min_parallel_degree" then ?

Why not just parallel_degree without any prefix? As in, when scanning
this table in parallel, the reloption suggests using N workers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#12)
Re: Choosing parallel_degree

On 16/03/2016 17:55, Robert Haas wrote:

On Wed, Mar 16, 2016 at 12:47 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Something like a "min_parallel_degree" then ?

Why not just parallel_degree without any prefix? As in, when scanning
this table in parallel, the reloption suggests using N workers.

Agreed.

PFA v2 that implements that.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

choose_parallel_degree_v2.difftext/x-patch; name=choose_parallel_degree_v2.diffDownload+80-6
#14Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#13)
Re: Choosing parallel_degree

On Wed, Mar 16, 2016 at 1:23 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

On 16/03/2016 17:55, Robert Haas wrote:

On Wed, Mar 16, 2016 at 12:47 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Something like a "min_parallel_degree" then ?

Why not just parallel_degree without any prefix? As in, when scanning
this table in parallel, the reloption suggests using N workers.

Agreed.

PFA v2 that implements that.

I think create_parallel_paths shouldn't actually run the loop if the
reloption is specified; it should just adopt the specified value (or
max_parallel_degree, whichever is less). Right now, you have it doing
the work to compute the default value but then overriding it.

Also, I think parallel_degree should be down in the section that says
/* information about a base rel (not set for join rels!) */ and I
think it should be called something like rel_parallel_degree, to make
it more clear that it's a value set on the relation level.

Let's leave out the parallel_threshold stuff for now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#14)
Re: Choosing parallel_degree

On 16/03/2016 18:42, Robert Haas wrote:

On Wed, Mar 16, 2016 at 1:23 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

On 16/03/2016 17:55, Robert Haas wrote:

On Wed, Mar 16, 2016 at 12:47 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:

Something like a "min_parallel_degree" then ?

Why not just parallel_degree without any prefix? As in, when scanning
this table in parallel, the reloption suggests using N workers.

Agreed.

PFA v2 that implements that.

I think create_parallel_paths shouldn't actually run the loop if the
reloption is specified; it should just adopt the specified value (or
max_parallel_degree, whichever is less). Right now, you have it doing
the work to compute the default value but then overriding it.

Oh ugly mistake. Fixed.

Also, I think parallel_degree should be down in the section that says
/* information about a base rel (not set for join rels!) */ and I
think it should be called something like rel_parallel_degree, to make
it more clear that it's a value set on the relation level.

You're right, fixed.

Let's leave out the parallel_threshold stuff for now.

attached v3 drops the GUC part.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachments:

rel_parallel_degree_v3.difftext/x-patch; name=rel_parallel_degree_v3.diffDownload+55-10
#16James Sewell
james.sewell@lisasoft.com
In reply to: Julien Rouhaud (#15)
Re: Choosing parallel_degree

On Thu, Mar 17, 2016 at 5:05 AM, Julien Rouhaud <julien.rouhaud@dalibo.com>
wrote:

attached v3 drops the GUC part.

This looks good good. I do think that some threshold control would be good
in the long term - but you are right Robert it just feels strange.

Maybe once the final formula is implemented in 9.7+ and this gets some real
world use cases it can be revisited?

One thing I really, really like about the way the new patch works is that
you can set parallel_degree on an inheritance parent, then that will set
the minimum workers for all associated children (when accessing from the
parent).

Currently this patch will not fire on small tables even when
parallel_degree is set, can we fix this by adding a check for
ref->parallel_degree to the table size condition?

Cheers,
James

--

------------------------------
The contents of this email are confidential and may be subject to legal or
professional privilege and copyright. No representation is made that this
email is free of viruses or other defects. If you have received this
communication in error, you may not copy or distribute any part of it or
otherwise disclose its contents to anyone. Please advise the sender of your
incorrect receipt of this correspondence.

#17Julien Rouhaud
rjuju123@gmail.com
In reply to: James Sewell (#16)
Re: Choosing parallel_degree

On 17/03/2016 02:07, James Sewell wrote:

On Thu, Mar 17, 2016 at 5:05 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com <mailto:julien.rouhaud@dalibo.com>> wrote:

attached v3 drops the GUC part.

This looks good good. I do think that some threshold control would be
good in the long term - but you are right Robert it just feels strange.

Maybe once the final formula is implemented in 9.7+ and this gets some
real world use cases it can be revisited?

One thing I really, really like about the way the new patch works is
that you can set parallel_degree on an inheritance parent, then that
will set the minimum workers for all associated children (when accessing
from the parent).

Currently this patch will not fire on small tables even
when parallel_degree is set, can we fix this by adding a check
for ref->parallel_degree to the table size condition?

Actually, a parallel plan will be created, since in this case the
rel->reloptkind will be RELOPT_OTHER_MEMBER_REL, not RELOPT_BASEREL.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#17)
Re: Choosing parallel_degree

On 17/03/2016 11:23, Julien Rouhaud wrote:

On 17/03/2016 02:07, James Sewell wrote:

On Thu, Mar 17, 2016 at 5:05 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com <mailto:julien.rouhaud@dalibo.com>> wrote:

attached v3 drops the GUC part.

This looks good good. I do think that some threshold control would be
good in the long term - but you are right Robert it just feels strange.

Maybe once the final formula is implemented in 9.7+ and this gets some
real world use cases it can be revisited?

One thing I really, really like about the way the new patch works is
that you can set parallel_degree on an inheritance parent, then that
will set the minimum workers for all associated children (when accessing
from the parent).

Currently this patch will not fire on small tables even
when parallel_degree is set, can we fix this by adding a check
for ref->parallel_degree to the table size condition?

Actually, a parallel plan will be created, since in this case the
rel->reloptkind will be RELOPT_OTHER_MEMBER_REL, not RELOPT_BASEREL.

And after re-reading your mail I see that it was what you meant, sorry :)

With the current threshold, you need a table bigger than 8 MB to be able
to force parallel workers. I'm not sure there'll be benefits for
multiple workers on a table smaller than 8 MB, since setting up all the
parallel stuff takes time.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19David Rowley
dgrowleyml@gmail.com
In reply to: Julien Rouhaud (#18)
Re: Choosing parallel_degree

On 18 March 2016 at 00:13, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

With the current threshold, you need a table bigger than 8 MB to be able
to force parallel workers. I'm not sure there'll be benefits for
multiple workers on a table smaller than 8 MB, since setting up all the
parallel stuff takes time.

It would be really nice if it were possible to drop the setting really
low, so that combined with a low parallel_setup_cost we could enable
parallel query on small tables in the regression test suite.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Julien Rouhaud
rjuju123@gmail.com
In reply to: David Rowley (#19)
Re: Choosing parallel_degree

On 17/03/2016 12:21, David Rowley wrote:

On 18 March 2016 at 00:13, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:

With the current threshold, you need a table bigger than 8 MB to be able
to force parallel workers. I'm not sure there'll be benefits for
multiple workers on a table smaller than 8 MB, since setting up all the
parallel stuff takes time.

It would be really nice if it were possible to drop the setting really
low, so that combined with a low parallel_setup_cost we could enable
parallel query on small tables in the regression test suite.

Indeed. That could also be a use case for moving parallel_threshold to a
GUC, but not sure what'd be best.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21James Sewell
james.sewell@lisasoft.com
In reply to: Julien Rouhaud (#20)
#22Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#14)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#22)
#24Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#23)
#25David Rowley
dgrowleyml@gmail.com
In reply to: James Sewell (#21)
#26James Sewell
james.sewell@lisasoft.com
In reply to: David Rowley (#25)
#27Julien Rouhaud
rjuju123@gmail.com
In reply to: James Sewell (#26)
#28Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#27)
#29Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#28)
#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#29)
#31Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Kapila (#30)
#32Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#30)
#33Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#32)
#34Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#33)
#35Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#34)
#36Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#35)
#37Julien Rouhaud
rjuju123@gmail.com
In reply to: Amit Kapila (#36)
#38Amit Kapila
amit.kapila16@gmail.com
In reply to: Julien Rouhaud (#37)
#39Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#38)
#40Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Robert Haas (#39)
#41Simon Riggs
simon@2ndQuadrant.com
In reply to: Paul Ramsey (#40)
#42Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Simon Riggs (#41)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Paul Ramsey (#42)
#44tushar
tushar.ahuja@enterprisedb.com
In reply to: Robert Haas (#39)
#45Julien Rouhaud
rjuju123@gmail.com
In reply to: tushar (#44)
#46Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#45)
#47Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#46)
#48Simon Riggs
simon@2ndQuadrant.com
In reply to: Paul Ramsey (#42)
#49Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#43)
#50Amit Kapila
amit.kapila16@gmail.com
In reply to: Simon Riggs (#49)
#51tushar
tushar.ahuja@enterprisedb.com
In reply to: Robert Haas (#46)
#52Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Kapila (#50)
#53tushar
tushar.ahuja@enterprisedb.com
In reply to: Julien Rouhaud (#45)
#54Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#49)
#55Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#54)
#56Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#47)
#57Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#56)
#58Julien Rouhaud
rjuju123@gmail.com
In reply to: Robert Haas (#57)
#59Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#57)
#60Robert Haas
robertmhaas@gmail.com
In reply to: Julien Rouhaud (#58)
#61Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#55)
#62Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#61)
#63Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#62)
#64Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#63)