GUC parameter cursors_tuple_fraction
This patch adds a GUC parameter for tuple_fraction of cursors (discussed
earlier here:
http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php).
By setting this parameter the planner's favor to use fast-start plans
for cursors can be affected.
Regards,
Robert
Attachments:
cursor_tuple_fraction.patchapplication/octet-stream; name=cursor_tuple_fraction.patchDownload
Index: doc/src/sgml/config.sgml
===================================================================
--- doc/src/sgml/config.sgml (revision 30152)
+++ doc/src/sgml/config.sgml (working copy)
@@ -2281,6 +2281,20 @@
</listitem>
</varlistentry>
+ <varlistentry id="guc-cursor-tuple-fraction" xreflabel="cursor_tuple_fraction">
+ <term><varname>cursor_tuple_fraction</varname> (<type>floating point</type>)</term>
+ <indexterm>
+ <primary><varname>cursor_tuple_fraction</> configuration parameter</primary>
+ </indexterm>
+ <listitem>
+ <para>
+ Sets the planner's estimate of how many rows of a cursor will
+ be retrieved by fetches. This affects the planner's favor to
+ use fast-start plans for cursor. The default is 0.1.
+ </para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
</sect2>
</sect1>
Index: src/backend/optimizer/plan/planner.c
===================================================================
--- src/backend/optimizer/plan/planner.c (revision 30152)
+++ src/backend/optimizer/plan/planner.c (working copy)
@@ -42,6 +42,9 @@
#include "utils/syscache.h"
+/* GUC parameter cursor_tuple_fraction */
+double cursor_tuple_fraction = DEFAULT_CURSOR_TUPLE_FRACTION;
+
/* Hook for plugins to get control in planner() */
planner_hook_type planner_hook = NULL;
@@ -143,10 +146,9 @@
/*
* We have no real idea how many tuples the user will ultimately FETCH
* from a cursor, but it seems a good bet that he doesn't want 'em
- * all. Optimize for 10% retrieval (you gotta better number? Should
- * this be a SETtable parameter?)
+ * all. Optimize for 10% retrieval per default. Is a GUC parameter.
*/
- tuple_fraction = 0.10;
+ tuple_fraction = cursor_tuple_fraction;
}
else
{
Index: src/backend/utils/misc/guc.c
===================================================================
--- src/backend/utils/misc/guc.c (revision 30152)
+++ src/backend/utils/misc/guc.c (working copy)
@@ -1884,6 +1884,15 @@
0.5, 0.0, 1.0, NULL, NULL
},
+ {
+ {"cursor_tuple_fraction", PGC_USERSET, QUERY_TUNING_OTHER,
+ gettext_noop("Sets the tuple fraction for cursors."),
+ NULL
+ },
+ &cursor_tuple_fraction,
+ DEFAULT_CURSOR_TUPLE_FRACTION, 0.0, 1.0, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0.0, 0.0, 0.0, NULL, NULL
Index: src/backend/utils/misc/postgresql.conf.sample
===================================================================
--- src/backend/utils/misc/postgresql.conf.sample (revision 30152)
+++ src/backend/utils/misc/postgresql.conf.sample (working copy)
@@ -223,6 +223,7 @@
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses
+#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#------------------------------------------------------------------------------
Index: src/include/optimizer/planmain.h
===================================================================
--- src/include/optimizer/planmain.h (revision 30152)
+++ src/include/optimizer/planmain.h (working copy)
@@ -18,6 +18,12 @@
#include "nodes/relation.h"
/*
+ * GUC parameter cursor_tuple_fraction
+ */
+#define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
+extern double cursor_tuple_fraction;
+
+/*
* prototypes for plan/planmain.c
*/
extern void query_planner(PlannerInfo *root, List *tlist,
Hell, Robert wrote:
This patch adds a GUC parameter for tuple_fraction of cursors (discussed
earlier here:
http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php).
By setting this parameter the planner's favor to use fast-start plans
for cursors can be affected.
I have added this to the May commitfest page,
http://wiki.postgresql.org/wiki/CommitFest:May
Thanks.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On Thu, 2008-04-03 at 16:45 +0200, Hell, Robert wrote:
This patch adds a GUC parameter for tuple_fraction of cursors (discussed
earlier here:
http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php).
By setting this parameter the planner's favor to use fast-start plans
for cursors can be affected.
I think this patch looks OK coding-wise, but not tested, yet.
If we did apply this patch it would need significantly more
documentation, probably examples and the like. But I think writing that
docs would open up a can of worms, hence copying -hackers.
Other RDBMS allow users to specify whether they want fast-start or
all-data plans. We should discuss whether we want to set the fraction
directly or whether we should have 2 (or more) specific settings such as
"fast" and "all".
Also, if we did have this parameter then I don't think it should be
included in postgresql.conf. I don't see any need to change the default
setting for *all* cursors, but I can see the need to change the cursor
fraction for *one* specific query. Which raises wider issues.
* We could add to DECLARE syntax that says something like OPTIMIZE FOR
FIRST ROWS or OPTIMIZE FOR ALL ROWS. But our policy AIUI is that we do
not want to further decorate SQL Standard commands.
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because its a hint... That isn't my view. I *now*
think we do need hints of various kinds.
Decorating queries with *all* necessary information is not always good,
but there are some kinds of information that *do* belong on specific
queries. The cursor fraction is a great example of information that
really does live on a specific query.
But in a wider sense, I think support of hints is actually the only way
long term of making large applications work within a reasonable
timeframe and cost. If we change information at the database object
level in order to correct one issue, we are likely to find that more
problems are raised elsewhere. Same thing is true of altering optimiser
cost models. Few users can wait 2 years while we solve the problem and
fix it permanently, or even a few days while they resolve the inner
workings of the planner and work out how to re-write it.
I had spoken strongly against hints for general use in Postgres
previously. Many attendees on recent PostgreSQL performance courses have
successfully argued in favour of hints and as a result my viewpoint is
now changed. Though we need a central no-hints-allowed GUC for those
cases where application programmers need restraining...
I think we need Hints. (And not this patch, sorry about that, Robert).
(This could well lead to me losing work doing performance tuning, though
I believe its the wish of the majority that we should support hints).
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote:
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because its a hint... That isn't my view. I *now*
think we do need hints of various kinds.
cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past. We don't want hints that work-around
planner deficiencies, for example where we get the row count of a node
completely wrong. This is different. This is about telling how the
application is going to use the result set. It's relevant even assuming
that the planner got the estimates spot on. Which plan is the best
depends on whether the application can start processing the data as it
comes in, or whether it's loading it all in memory first, for example.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
Simon Riggs wrote:
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because its a hint... That isn't my view. I *now*
think we do need hints of various kinds.
cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past.
More to the point, I think what we've generally meant by "hints" is
nonstandard decoration on individual SQL commands (either explicit
syntax or one of those interpret-some-comments kluges). Simon is
reading the policy in such a way that it would forbid all the planner
cost parameters, which is surely not what is intended.
I see this as being basically another cost parameter, and as such
I don't think it needs more documentation than any of those have.
(Now admittedly you could argue that they could all use a ton more
documentation than they now have, but it's not reasonable to insist
on just this one meeting a different standard.)
regards, tom lane
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
Simon Riggs wrote:
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because its a hint... That isn't my view. I *now*
think we do need hints of various kinds.cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past.More to the point, I think what we've generally meant by "hints" is
nonstandard decoration on individual SQL commands (either explicit
syntax or one of those interpret-some-comments kluges).
Yes, that is definitely an Oracle compatibility thought.
Simon is
reading the policy in such a way that it would forbid all the planner
cost parameters, which is surely not what is intended.
So we're allowed to influence the behaviour of the planner, but just not
by touching the individual statements. OK.
Can we allow a statement like
SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}'
That would allow us to tell a specific SQL statement that it should use
a cost weighting of 0.1 * normal cost for the "my_index" index (etc).
SET enable_seqscan = off; is a blunt instrument that can sometimes
achieve the same thing, but insufficiently exact to be really useful.
Many people use that (Sun, in their first published PostgreSQL
benchmark...)
We/I want to make the planner even better, but the above is roughly what
people want while they're waiting for us to get the planner right.
I see this as being basically another cost parameter, and as such
I don't think it needs more documentation than any of those have.
(Now admittedly you could argue that they could all use a ton more
documentation than they now have, but it's not reasonable to insist
on just this one meeting a different standard.)
OK, seems fair.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2008-05-02 at 16:17 +0100, Heikki Linnakangas wrote:
Simon Riggs wrote:
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that we
"Don't want hints". If that's what we really think, then this patch must
surely be rejected because its a hint... That isn't my view. I *now*
think we do need hints of various kinds.cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past. We don't want hints that work-around
planner deficiencies, for example where we get the row count of a node
completely wrong. This is different. This is about telling how the
application is going to use the result set. It's relevant even assuming
that the planner got the estimates spot on.
Yes, thats what I see.
Which plan is the best
depends on whether the application can start processing the data as it
comes in, or whether it's loading it all in memory first, for example.
Agreed, which is why people want to tell us that also, when they know
the answer in the context of their application.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:
I see this as being basically another cost parameter, and as such
I don't think it needs more documentation than any of those have.
(Now admittedly you could argue that they could all use a ton more
documentation than they now have, but it's not reasonable to insist
on just this one meeting a different standard.)
OK, if that's the view then the patch is ready for commit, AFAICS.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> writes:
OK, if that's the view then the patch is ready for commit, AFAICS.
Use of the plural in the name seems a bit odd to me. Anyone have a
problem with calling it "cursor_tuple_fraction" instead?
regards, tom lane
You're right - that's just a typo in the subject of the post.
It's called cursor_tuple_fraction in the submitted patch.
Regards,
Robert
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Freitag, 02. Mai 2008 22:36
To: Simon Riggs
Cc: Heikki Linnakangas; Hell, Robert; pgsql-patches@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [PATCHES] GUC parameter cursors_tuple_fraction
Simon Riggs <simon@2ndquadrant.com> writes:
OK, if that's the view then the patch is ready for commit, AFAICS.
Use of the plural in the name seems a bit odd to me. Anyone have a
problem with calling it "cursor_tuple_fraction" instead?
regards, tom lane
On Fri, 2008-05-02 at 16:36 -0400, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
OK, if that's the view then the patch is ready for commit, AFAICS.
Use of the plural in the name seems a bit odd to me. Anyone have a
problem with calling it "cursor_tuple_fraction" instead?
Agreed.
--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
"Hell, Robert" <Robert.Hell@fabasoft.com> writes:
You're right - that's just a typo in the subject of the post.
It's called cursor_tuple_fraction in the submitted patch.
Ah, I hadn't actually read the patch yet ;-). As penance for the noise,
I will do so now.
regards, tom lane
"Hell, Robert" <Robert.Hell@fabasoft.com> writes:
This patch adds a GUC parameter for tuple_fraction of cursors (discussed
earlier here:
http://archives.postgresql.org/pgsql-performance/2008-04/msg00018.php).
By setting this parameter the planner's favor to use fast-start plans
for cursors can be affected.
Applied with some documentation cleanup, and also care for the behavior
at the range endpoints 0 and 1. A user expecting it to act like a
simple fraction would have been quite surprised at the endpoints
(see the header comments for grouping_planner).
regards, tom lane
On Friday 02 May 2008 13:35:27 Simon Riggs wrote:
On Fri, 2008-05-02 at 12:01 -0400, Tom Lane wrote:
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
Simon Riggs wrote:
* We've said here http://www.postgresql.org/docs/faqs.TODO.html that
we "Don't want hints". If that's what we really think, then this patch
must surely be rejected because its a hint... That isn't my view. I
*now* think we do need hints of various kinds.cursors_tuple_fraction or OPTIMIZE FOR xxx ROWS isn't the kind of hints
we've said "no" to in the past.More to the point, I think what we've generally meant by "hints" is
nonstandard decoration on individual SQL commands (either explicit
syntax or one of those interpret-some-comments kluges).Yes, that is definitely an Oracle compatibility thought.
Simon is
reading the policy in such a way that it would forbid all the planner
cost parameters, which is surely not what is intended.So we're allowed to influence the behaviour of the planner, but just not
by touching the individual statements. OK.Can we allow a statement like
SET index_weighting = '{{my_index, 0.1},{another_index, 0.5}}'
That would allow us to tell a specific SQL statement that it should use
a cost weighting of 0.1 * normal cost for the "my_index" index (etc).
SET enable_seqscan = off; is a blunt instrument that can sometimes
achieve the same thing, but insufficiently exact to be really useful.
Many people use that (Sun, in their first published PostgreSQL
benchmark...)We/I want to make the planner even better, but the above is roughly what
people want while they're waiting for us to get the planner right.
I think the above would be helpful, but even then I am not sure it goes far
enough, since there might be cases where you need and index wieghted high for
a specific join within the query, but low for a different join in that query.
A further problem with this implementation would be that in general it would
require that you issue a set, run your query, and then issue another set to
put those weightings back to the defaults, which seems like an excessive
amount of overhead. As much as people like to turn their nose to in-line
query hints, the manifestation of deficiencies in the planner always
manifiest themselves at the query level, so it makes it difficult to create a
solid solution that operates somewhere else.
--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL