Parallel query and temp_file_limit

Started by Peter Geogheganover 9 years ago20 messages
#1Peter Geoghegan
pg@heroku.com

temp_file_limit "specifies the maximum amount of disk space that a
session can use for temporary files, such as sort and hash temporary
files", according to the documentation. That's not true when parallel
query is in use, since the global variable temporary_files_size
receives no special treatment for parallel query. Enforcement of
temp_file_limit will therefore not work *per-session*.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Geoghegan (#1)
Re: Parallel query and temp_file_limit

On Wed, May 18, 2016 at 12:55 AM, Peter Geoghegan <pg@heroku.com> wrote:

temp_file_limit "specifies the maximum amount of disk space that a
session can use for temporary files, such as sort and hash temporary
files", according to the documentation. That's not true when parallel
query is in use, since the global variable temporary_files_size
receives no special treatment for parallel query.

What kind of special treatment are you expecting for temporary_files_size,
also why do you think it is required? Currently neither we build hash in
parallel nor there is any form of parallel sort work.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3Peter Geoghegan
pg@heroku.com
In reply to: Amit Kapila (#2)
Re: Parallel query and temp_file_limit

On Tue, May 17, 2016 at 1:53 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

What kind of special treatment are you expecting for temporary_files_size,
also why do you think it is required? Currently neither we build hash in
parallel nor there is any form of parallel sort work.

I expect only that temporary_files_size be described accurately, and
have new behavior for parallel query that is not surprising. There are
probably several solutions that would meet that standard, and I am not
attached to any particular one of them.

I wrote a parallel sort patch already (CREATE INDEX for the B-Tree
AM), and will post it at an opportune time. So, I think we can expect
your observations about there not being parallel sort work to no
longer apply in a future release, which we should get ahead of now.
Also, won't parallel workers that build their own copy of the hash
table (for a hash join) also use their own temp files, if there is a
need for temp files? I think parallel query will end up sharing temp
files fairly often, and not just out of convenience to implementers
(that is, not just to avoid using shared memory extensively).

Fundamentally, since temporary_files_size enforcement simply
piggy-backs on low-level fd.c file management, without any
consideration of what the temp files contain, it'll be hard to be sure
that parallel workers will not have issues. I think it'll be far
easier to fix the problem then it would be to figure out if it's
possible to get away with it.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Peter Geoghegan
pg@heroku.com
In reply to: Peter Geoghegan (#3)
Re: Parallel query and temp_file_limit

On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:

Fundamentally, since temporary_files_size enforcement simply
piggy-backs on low-level fd.c file management, without any
consideration of what the temp files contain, it'll be hard to be sure
that parallel workers will not have issues. I think it'll be far
easier to fix the problem then it would be to figure out if it's
possible to get away with it.

I'll write a patch to fix the issue, if there is a consensus on a solution.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#4)
Re: Parallel query and temp_file_limit

On Tue, May 17, 2016 at 6:40 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:

Fundamentally, since temporary_files_size enforcement simply
piggy-backs on low-level fd.c file management, without any
consideration of what the temp files contain, it'll be hard to be sure
that parallel workers will not have issues. I think it'll be far
easier to fix the problem then it would be to figure out if it's
possible to get away with it.

I'll write a patch to fix the issue, if there is a consensus on a solution.

I think for 9.6 we just have to document this issue. In the next
release, we could (and might well want to) try to do something more
clever.

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6David Rowley
david.rowley@2ndquadrant.com
In reply to: Robert Haas (#5)
Re: Parallel query and temp_file_limit

On 18 May 2016 at 22:40, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 17, 2016 at 6:40 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 17, 2016 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote:

Fundamentally, since temporary_files_size enforcement simply
piggy-backs on low-level fd.c file management, without any
consideration of what the temp files contain, it'll be hard to be sure
that parallel workers will not have issues. I think it'll be far
easier to fix the problem then it would be to figure out if it's
possible to get away with it.

I'll write a patch to fix the issue, if there is a consensus on a

solution.

I think for 9.6 we just have to document this issue. In the next
release, we could (and might well want to) try to do something more
clever.

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

I was asked (internally I believe) about abuse of work_mem during my work
on parallel aggregates, at the time I didn't really feel like I was abusing
that any more than parallel hash join was. My thought was that one day it
would be nice if work_mem could be granted to a query and we had some query
marshal system which ensured that the total grants did not exceed the
server wide memory dedicated to work_mem. Of course that's lots of work, as
there's at least one node (HashAgg) which can still blow out work_mem for
bad estimates. For this release, I assumed it wouldn't be too big an issue
if we're shipping with max_parallel_degree = 0 as we could just decorate
the docs with some warnings about work_mem is per node / per worker to
caution users setting this setting any higher. That might be enough to give
us wriggle from for the future where we can make improvements, so I agree
with Robert, the docs seem like the best solution for 9.6.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7Peter Geoghegan
pg@heroku.com
In reply to: Robert Haas (#5)
Re: Parallel query and temp_file_limit

On Wed, May 18, 2016 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I'll write a patch to fix the issue, if there is a consensus on a solution.

I think for 9.6 we just have to document this issue. In the next
release, we could (and might well want to) try to do something more
clever.

Works for me. You may wish to update comments within fd.c at the same time.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Peter Geoghegan
pg@heroku.com
In reply to: Peter Geoghegan (#7)
Re: Parallel query and temp_file_limit

On Wed, May 18, 2016 at 12:01 PM, Peter Geoghegan <pg@heroku.com> wrote:

I think for 9.6 we just have to document this issue. In the next
release, we could (and might well want to) try to do something more
clever.

Works for me. You may wish to update comments within fd.c at the same time.

I've created a 9.6 open issue for this.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#8)
Re: Parallel query and temp_file_limit

On Sun, Jun 5, 2016 at 4:32 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, May 18, 2016 at 12:01 PM, Peter Geoghegan <pg@heroku.com> wrote:

I think for 9.6 we just have to document this issue. In the next
release, we could (and might well want to) try to do something more
clever.

Works for me. You may wish to update comments within fd.c at the same time.

I've created a 9.6 open issue for this.

You previously offered to write a patch for this. Are you still
planning to do that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Peter Geoghegan
pg@heroku.com
In reply to: Robert Haas (#9)
Re: Parallel query and temp_file_limit

On Tue, Jun 7, 2016 at 8:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:

You previously offered to write a patch for this. Are you still
planning to do that?

OK, I'll get to that in the next few days.

I'm slightly concerned that I might have missed a real problem in the
code. I'll need to examine the issue more closely.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Peter Geoghegan
pg@heroku.com
In reply to: Robert Haas (#5)
Re: Parallel query and temp_file_limit

On Wed, May 18, 2016 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#11)
Re: Parallel query and temp_file_limit

Peter Geoghegan <pg@heroku.com> writes:

On Wed, May 18, 2016 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?

Why can't we just note this in the number-of-workers GUCs? It's not like
there even *is* a GUC for many of our per-process resource consumption
behaviors.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: Parallel query and temp_file_limit

On Mon, Jun 20, 2016 at 11:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Geoghegan <pg@heroku.com> writes:

On Wed, May 18, 2016 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?

Why can't we just note this in the number-of-workers GUCs? It's not like
there even *is* a GUC for many of our per-process resource consumption
behaviors.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#13)
1 attachment(s)
Re: Parallel query and temp_file_limit

On Tue, Jun 21, 2016 at 8:15 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Jun 20, 2016 at 11:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Geoghegan <pg@heroku.com> writes:

On Wed, May 18, 2016 at 3:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:

What I'm tempted to do is trying to document that, as a point of
policy, parallel query in 9.6 uses up to (workers + 1) times the
resources that a single session might use. That includes not only CPU
but also things like work_mem and temp file space. This obviously
isn't ideal, but it's what could be done by the ship date.

Where would that be documented, though? Would it need to be noted in
the case of each such GUC?

Why can't we just note this in the number-of-workers GUCs? It's not like
there even *is* a GUC for many of our per-process resource consumption
behaviors.

+1.

Since Peter doesn't seem in a hurry to produce a patch for this issue,
I wrote one. It is attached. I'll commit this in a day or two if
nobody objects.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

parallel-workers-guc-doc.patchinvalid/octet-stream; name=parallel-workers-guc-doc.patchDownload
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 061697b..cf0a071 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2011,6 +2011,22 @@ include_dir 'conf.d'
          be inefficient.  The default value is 2.  Setting this value to 0
          disables parallel query execution.
         </para>
+
+        <para>
+         Note that parallel queries may consume very substantially more
+         resources than non-parallel queries, because each worker process is
+         a completely separate process which has roughly the same impact on the
+         system as an additional user session.  This should be taken into
+         account when choosing a value for this setting, as well as when
+         configuring other settings that control resource utilization, such
+         as <xref linkend="guc-work-mem">.  Resource limits such as
+         <varname>work_mem</> are applied individually to each worker,
+         which means the total utilization may be much higher across all
+         processes than it would normally be for any single process.
+         For example, a parallel query using 4 workers may use up to 5 times
+         as much CPU time, memory, I/O bandwidth, and so forth as a query which
+         uses no workers at all.
+        </para>
        </listitem>
       </varlistentry>
 
#15Peter Geoghegan
pg@heroku.com
In reply to: Robert Haas (#14)
Re: Parallel query and temp_file_limit

On Tue, Jul 5, 2016 at 7:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Since Peter doesn't seem in a hurry to produce a patch for this issue,
I wrote one. It is attached. I'll commit this in a day or two if
nobody objects.

Sorry about the delay.

Your patch seems reasonable, but I thought we'd also want to change
"per session" to "per session (with an additional temp_file_limit
allowance within each parallel worker)" for temp_file_limit.

I think it's worthwhile noting this for temp_file_limit specifically,
since it's explicitly a per session limit, whereas users are quite
used to the idea that work_mem might be doled out multiple times for
multiple executor nodes.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#15)
Re: Parallel query and temp_file_limit

On Tue, Jul 5, 2016 at 1:58 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, Jul 5, 2016 at 7:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Since Peter doesn't seem in a hurry to produce a patch for this issue,
I wrote one. It is attached. I'll commit this in a day or two if
nobody objects.

Sorry about the delay.

Your patch seems reasonable, but I thought we'd also want to change
"per session" to "per session (with an additional temp_file_limit
allowance within each parallel worker)" for temp_file_limit.

I think it's worthwhile noting this for temp_file_limit specifically,
since it's explicitly a per session limit, whereas users are quite
used to the idea that work_mem might be doled out multiple times for
multiple executor nodes.

I think that it is not worth mentioning specifically for
temp_file_limit; to me that seems to be a hole with no bottom. We'll
end up arguing about which GUCs should mention it specifically and
there will be no end to it.

We can see what other people think, but that's my position.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Peter Geoghegan
pg@heroku.com
In reply to: Robert Haas (#16)
Re: Parallel query and temp_file_limit

On Tue, Jul 5, 2016 at 12:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think that it is not worth mentioning specifically for
temp_file_limit; to me that seems to be a hole with no bottom. We'll
end up arguing about which GUCs should mention it specifically and
there will be no end to it.

I don't think that you need it for any other GUC, so I really don't
know why you're concerned about a slippery slope. The only other
resource GUC that is scoped per session that I can see is
temp_buffers, but that doesn't need to change, since parallel workers
cannot use temp_buffers directly in practice. max_files_per_process is
already clearly per process, so no change needed there either.

I don't see a case other than temp_file_limit that appears to be even
marginally in need of a specific note.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#17)
Re: Parallel query and temp_file_limit

Peter Geoghegan <pg@heroku.com> writes:

On Tue, Jul 5, 2016 at 12:00 PM, Robert Haas <robertmhaas@gmail.com> wrote:

I think that it is not worth mentioning specifically for
temp_file_limit; to me that seems to be a hole with no bottom. We'll
end up arguing about which GUCs should mention it specifically and
there will be no end to it.

I don't think that you need it for any other GUC, so I really don't
know why you're concerned about a slippery slope.

FWIW, I agree with Robert on this. It seems just weird to call out
temp_file_limit specifically. Also, I don't agree that that's the
only interesting per-process resource consumption; max_files_per_process
seems much more likely to cause trouble in practice.

Perhaps we could change the wording of temp_file_limit's description
from "space that a session can use" to "space that a process can use"
to help clarify this?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Peter Geoghegan
pg@heroku.com
In reply to: Tom Lane (#18)
Re: Parallel query and temp_file_limit

On Tue, Jul 5, 2016 at 12:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps we could change the wording of temp_file_limit's description
from "space that a session can use" to "space that a process can use"
to help clarify this?

That's all that I was looking for, really.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#19)
Re: Parallel query and temp_file_limit

On Tue, Jul 5, 2016 at 3:59 PM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, Jul 5, 2016 at 12:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps we could change the wording of temp_file_limit's description
from "space that a session can use" to "space that a process can use"
to help clarify this?

That's all that I was looking for, really.

OK, done that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers