RTE_NAMEDTUPLESTORE, enrtuples and comments

Started by Noah Mischover 8 years ago14 messages

noah@leadboat.com

over 8 years ago

While completing my annual src/backend/nodes/*funcs.c audit, I noticed defects
in commit 18ce3a4 changes to RangeTblEntry:

1. Field relid is under a comment saying it is valid for RTE_RELATION only.
Fields coltypes, coltypmods and colcollations are under a comment saying
they are valid for RTE_VALUES and RTE_CTE only. But _outRangeTblEntry()
treats all of the above as valid for RTE_NAMEDTUPLESTORE. Which is right?

2. New fields enrname and enrtuples are set only for RTE_NAMEDTUPLESTORE, yet
they're under the comment for RTE_VALUES and RTE_CTE. This pair needs its
own comment.

3. Each of _{copy,equal,out,read}RangeTblEntry() silently ignores enrtuples.
_equalRangeTblEntry() ignores enrname, too. In each case, the function
should either use the field or have a comment to note that skipping the
field is intentional. Which should it be?

This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
the right place for enrtuples. It's a concept regularly seen in planner data
structures but not otherwise seen at parse tree level.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Thomas Munro

thomas.munro@enterprisedb.com

over 8 years ago

In reply to: Noah Misch (#1)

1 attachment(s)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:

While completing my annual src/backend/nodes/*funcs.c audit, I noticed defects
in commit 18ce3a4 changes to RangeTblEntry:

1. Field relid is under a comment saying it is valid for RTE_RELATION only.

The comment is out of date. Here's a fix for that.

Fields coltypes, coltypmods and colcollations are under a comment saying
they are valid for RTE_VALUES and RTE_CTE only. But _outRangeTblEntry()
treats all of the above as valid for RTE_NAMEDTUPLESTORE. Which is right?

The comment is wrong. In passing I also noticed that RTE_TABLEFUNC
also uses coltypes et al and is not mentioned in that comment. Here's
a fix for both omissions.

2. New fields enrname and enrtuples are set only for RTE_NAMEDTUPLESTORE, yet
they're under the comment for RTE_VALUES and RTE_CTE. This pair needs its
own comment.

Right. The attached patch fixes that too.

3. Each of _{copy,equal,out,read}RangeTblEntry() silently ignores enrtuples.
_equalRangeTblEntry() ignores enrname, too. In each case, the function
should either use the field or have a comment to note that skipping the
field is intentional. Which should it be?

Ignoring enrname in _equalRangeTblEntry is certainly a bug, and the
attached adds it. I also noticed that _copyRangeTleEntry had enrname
but not in the same order as the struct's members, which seems to be
an accidental deviation from the convention, so I moved it in the
attached.

Ignoring enrtuples is probably also wrong, but ...

This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
the right place for enrtuples. It's a concept regularly seen in planner data
structures but not otherwise seen at parse tree level.

I agree that this is strange. Perhaps
set_namedtuplestore_size_estimates should instead look up the
EphemeralNamedRelation by rte->enrname to find its way to
enr->md.enrtuples, but I'm not sure off the top of my head how it
should get its hands on the QueryEnvironment required to do that. I
will look into this on Monday, but other ideas/clues welcome...

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

fixes-for-enr-rte-review-v1.patchapplication/octet-stream; name=fixes-for-enr-rte-review-v1.patchDownload

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 36bf1dc92bb..3b54e3d80ce 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2303,10 +2303,10 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_STRING_FIELD(ctename);
 	COPY_SCALAR_FIELD(ctelevelsup);
 	COPY_SCALAR_FIELD(self_reference);
-	COPY_STRING_FIELD(enrname);
 	COPY_NODE_FIELD(coltypes);
 	COPY_NODE_FIELD(coltypmods);
 	COPY_NODE_FIELD(colcollations);
+	COPY_STRING_FIELD(enrname);
 	COPY_NODE_FIELD(alias);
 	COPY_NODE_FIELD(eref);
 	COPY_SCALAR_FIELD(lateral);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5bcf0317dc8..696cd6e7e1b 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2631,6 +2631,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_NODE_FIELD(coltypes);
 	COMPARE_NODE_FIELD(coltypmods);
 	COMPARE_NODE_FIELD(colcollations);
+	COMPARE_STRING_FIELD(enrname);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_NODE_FIELD(eref);
 	COMPARE_SCALAR_FIELD(lateral);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2d2e2c0fbc1..271564fd23b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -944,6 +944,11 @@ typedef struct RangeTblEntry
 
 	/*
 	 * Fields valid for a plain relation RTE (else zero):
+	 *
+	 * As a special case, RTE_NAMEDTUPLESTORE can also set relid to indicate
+	 * that the tuple format of the tuplestore is the same as the referenced
+	 * relation.  This allows plans referencing AFTER trigger transition
+	 * tables to be invalidated if the underlying table is altered.
 	 */
 	Oid			relid;			/* OID of the relation */
 	char		relkind;		/* relation kind (see pg_class.relkind) */
@@ -1004,16 +1009,23 @@ typedef struct RangeTblEntry
 	bool		self_reference; /* is this a recursive self-reference? */
 
 	/*
-	 * Fields valid for values and CTE RTEs (else NIL):
+	 * Fields valid for table functions, values, CTE and ENR RTEs (else NIL):
 	 *
 	 * We need these for CTE RTEs so that the types of self-referential
 	 * columns are well-defined.  For VALUES RTEs, storing these explicitly
 	 * saves having to re-determine the info by scanning the values_lists.
+	 * For ENRs, we store the types explicitly here (we could get the
+	 * information from the catalogs if 'relid' was supplied, but we'd still
+	 * need these for TupleDesc-based ENRs, so we might as well always store
+	 * the type info here).
 	 */
 	List	   *coltypes;		/* OID list of column type OIDs */
 	List	   *coltypmods;		/* integer list of column typmods */
 	List	   *colcollations;	/* OID list of column collation OIDs */
 
+	/*
+	 * Fields valid for ENR RTEs (else NULL/zero):
+	 */
 	char	   *enrname;		/* name of ephemeral named relation */
 	double		enrtuples;		/* estimated or actual from caller */

Thomas Munro

thomas.munro@enterprisedb.com

over 8 years ago

In reply to: Thomas Munro (#2)

1 attachment(s)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:

This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
the right place for enrtuples. It's a concept regularly seen in planner data
structures but not otherwise seen at parse tree level.

I agree that this is strange. Perhaps
set_namedtuplestore_size_estimates should instead look up the
EphemeralNamedRelation by rte->enrname to find its way to
enr->md.enrtuples, but I'm not sure off the top of my head how it
should get its hands on the QueryEnvironment required to do that. I
will look into this on Monday, but other ideas/clues welcome...

Here's some background: If you look at the interface changes
introduced by 18ce3a4 you will see that it is now possible for a
QueryEnvironment object to be injected into the parser and executor.
Currently the only use for it is to inject named tuplestores into the
system via SPI_register_relation or SPI_register_trigger_data. That's
to support SQL standard transition tables, but anyone can now use SPI
to expose data to SQL via an ephemeral named relation in this way.
(In future we could imagine other kinds of objects like table
variables, anonymous functions or streams).

The QueryEnvironment is used by the parser to resolve names and build
RangeTblEntry objects. The planner doesn't currently need it, because
all the information it needs is in the RangeTblEntry, including the
offending row estimate. Then the executor needs it to get its hands
on the tuplestores. So the question is: how can we get it into
costsize.c, so that it can look up the EphermalNamedRelationMetaData
object by name, instead of trafficking statistics through parser data
structures?

Here are a couple of ways forward that I can see:

1. Figure out how to get the QueryEnvironment through more of these
stack frames (possibly inside other objects), so that
set_namedtuplestore_size_estimates can look up enrtuples by enrname:

set_namedtuplestore_size_estimates <-- would need QueryEnvironment
set_namedtuplestore_pathlist
set_rel_size
set_base_rel_sizes
make_one_rel
query_planner
grouping_planner
subquery_planner
standard_planner
planner
pg_plan_query
pg_plan_queries <-- doesn't receive QueryEnvironment
BuildCachedPlan <-- receives QueryEnvironment
GetCachedPlan
_SPI_execute_plan
SPI_execute_plan_with_paramlist

2. Rip the row estimation out for now, use a bogus hard coded
estimate like we do in some other cases, and revisit later. See
attached (including changes from my previous message).
Unsurprisingly, a query plan changes.

Thoughts?

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

fixes-for-enr-rte-review-v2.patchapplication/octet-stream; name=fixes-for-enr-rte-review-v2.patchDownload

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 36bf1dc92bb..3b54e3d80ce 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2303,10 +2303,10 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_STRING_FIELD(ctename);
 	COPY_SCALAR_FIELD(ctelevelsup);
 	COPY_SCALAR_FIELD(self_reference);
-	COPY_STRING_FIELD(enrname);
 	COPY_NODE_FIELD(coltypes);
 	COPY_NODE_FIELD(coltypmods);
 	COPY_NODE_FIELD(colcollations);
+	COPY_STRING_FIELD(enrname);
 	COPY_NODE_FIELD(alias);
 	COPY_NODE_FIELD(eref);
 	COPY_SCALAR_FIELD(lateral);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5bcf0317dc8..696cd6e7e1b 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2631,6 +2631,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_NODE_FIELD(coltypes);
 	COMPARE_NODE_FIELD(coltypmods);
 	COMPARE_NODE_FIELD(colcollations);
+	COMPARE_STRING_FIELD(enrname);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_NODE_FIELD(eref);
 	COMPARE_SCALAR_FIELD(lateral);
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f062c6b9f1e..781cdf8ff99 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -4777,14 +4777,12 @@ set_namedtuplestore_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 	Assert(rte->rtekind == RTE_NAMEDTUPLESTORE);
 
 	/*
-	 * Use the estimate provided by the code which is generating the named
-	 * tuplestore.  In some cases, the actual number might be available; in
-	 * others the same plan will be re-used, so a "typical" value might be
-	 * estimated and used.
+	 * XXX We could use the estimate provided by the code which is generating
+	 * the named tuplestore in EphemeralNamedRelationMetadataData member
+	 * 'enrtuples'.  We don't have access to QueryEnvironment to look it up
+	 * from here yet, so for now we'll use a bogus default estimate.
 	 */
-	rel->tuples = rte->enrtuples;
-	if (rel->tuples < 0)
-		rel->tuples = 1000;
+	rel->tuples = 1000;
 
 	/* Now estimate number of output rows, etc */
 	set_baserel_size_estimates(root, rel);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index e412d0f9d30..f99e547745c 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -2020,7 +2020,6 @@ addRangeTableEntryForENR(ParseState *pstate,
 	rte->eref = makeAlias(refname, NIL);
 	buildRelationAliases(tupdesc, alias, rte->eref);
 	rte->enrname = enrmd->name;
-	rte->enrtuples = enrmd->enrtuples;
 	rte->coltypes = NIL;
 	rte->coltypmods = NIL;
 	rte->colcollations = NIL;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2d2e2c0fbc1..8be5c3a9c09 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -944,6 +944,11 @@ typedef struct RangeTblEntry
 
 	/*
 	 * Fields valid for a plain relation RTE (else zero):
+	 *
+	 * As a special case, RTE_NAMEDTUPLESTORE can also set relid to indicate
+	 * that the tuple format of the tuplestore is the same as the referenced
+	 * relation.  This allows plans referencing AFTER trigger transition
+	 * tables to be invalidated if the underlying table is altered.
 	 */
 	Oid			relid;			/* OID of the relation */
 	char		relkind;		/* relation kind (see pg_class.relkind) */
@@ -1004,18 +1009,24 @@ typedef struct RangeTblEntry
 	bool		self_reference; /* is this a recursive self-reference? */
 
 	/*
-	 * Fields valid for values and CTE RTEs (else NIL):
+	 * Fields valid for table functions, values, CTE and ENR RTEs (else NIL):
 	 *
 	 * We need these for CTE RTEs so that the types of self-referential
 	 * columns are well-defined.  For VALUES RTEs, storing these explicitly
 	 * saves having to re-determine the info by scanning the values_lists.
+	 * For ENRs, we store the types explicitly here (we could get the
+	 * information from the catalogs if 'relid' was supplied, but we'd still
+	 * need these for TupleDesc-based ENRs, so we might as well always store
+	 * the type info here).
 	 */
 	List	   *coltypes;		/* OID list of column type OIDs */
 	List	   *coltypmods;		/* integer list of column typmods */
 	List	   *colcollations;	/* OID list of column collation OIDs */
 
+	/*
+	 * Fields valid for ENR RTEs (else NULL/zero):
+	 */
 	char	   *enrname;		/* name of ephemeral named relation */
-	double		enrtuples;		/* estimated or actual from caller */
 
 	/*
 	 * Fields valid in all RTEs:
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 7ebbde60d30..66a9c9fc0c8 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -5757,13 +5757,17 @@ CREATE TRIGGER transition_table_base_upd_trig
 UPDATE transition_table_base
   SET val = '*' || val || '*'
   WHERE id BETWEEN 2 AND 3;
-INFO:  Hash Full Join
+INFO:  Merge Full Join
   Output: COALESCE(ot.id, nt.id), ot.val, nt.val
-  Hash Cond: (ot.id = nt.id)
-  ->  Named Tuplestore Scan
+  Merge Cond: (ot.id = nt.id)
+  ->  Sort
         Output: ot.id, ot.val
-  ->  Hash
+        Sort Key: ot.id
+        ->  Named Tuplestore Scan
+              Output: ot.id, ot.val
+  ->  Sort
         Output: nt.id, nt.val
+        Sort Key: nt.id
         ->  Named Tuplestore Scan
               Output: nt.id, nt.val

Noah Misch

noah@leadboat.com

over 8 years ago

In reply to: Thomas Munro (#3)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:

On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:

This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
the right place for enrtuples. It's a concept regularly seen in planner data
structures but not otherwise seen at parse tree level.

I agree that this is strange. Perhaps
set_namedtuplestore_size_estimates should instead look up the
EphemeralNamedRelation by rte->enrname to find its way to
enr->md.enrtuples, but I'm not sure off the top of my head how it
should get its hands on the QueryEnvironment required to do that. I
will look into this on Monday, but other ideas/clues welcome...

Here's some background: If you look at the interface changes
introduced by 18ce3a4 you will see that it is now possible for a
QueryEnvironment object to be injected into the parser and executor.
Currently the only use for it is to inject named tuplestores into the
system via SPI_register_relation or SPI_register_trigger_data. That's
to support SQL standard transition tables, but anyone can now use SPI
to expose data to SQL via an ephemeral named relation in this way.
(In future we could imagine other kinds of objects like table
variables, anonymous functions or streams).

The QueryEnvironment is used by the parser to resolve names and build
RangeTblEntry objects. The planner doesn't currently need it, because
all the information it needs is in the RangeTblEntry, including the
offending row estimate. Then the executor needs it to get its hands
on the tuplestores. So the question is: how can we get it into
costsize.c, so that it can look up the EphermalNamedRelationMetaData
object by name, instead of trafficking statistics through parser data
structures?

Here are a couple of ways forward that I can see:

1. Figure out how to get the QueryEnvironment through more of these
stack frames (possibly inside other objects), so that
set_namedtuplestore_size_estimates can look up enrtuples by enrname:

set_namedtuplestore_size_estimates <-- would need QueryEnvironment
set_namedtuplestore_pathlist
set_rel_size
set_base_rel_sizes
make_one_rel
query_planner
grouping_planner
subquery_planner
standard_planner
planner
pg_plan_query
pg_plan_queries <-- doesn't receive QueryEnvironment
BuildCachedPlan <-- receives QueryEnvironment
GetCachedPlan
_SPI_execute_plan
SPI_execute_plan_with_paramlist

2. Rip the row estimation out for now, use a bogus hard coded
estimate like we do in some other cases, and revisit later. See
attached (including changes from my previous message).
Unsurprisingly, a query plan changes.

Thoughts?

I'm not sufficiently familiar with the relevant code to judge this one. Let's
see if planner experts voice an opinion. Absent more opinions, the current
design stands.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Thomas Munro

thomas.munro@enterprisedb.com

over 8 years ago

In reply to: Noah Misch (#4)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Tue, Jun 13, 2017 at 6:40 PM, Noah Misch <noah@leadboat.com> wrote:

On Mon, Jun 12, 2017 at 04:04:23PM +1200, Thomas Munro wrote:

On Sun, Jun 11, 2017 at 11:11 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Sun, Jun 11, 2017 at 6:25 PM, Noah Misch <noah@leadboat.com> wrote:

This fourth point is not necessarily a defect: I wonder if RangeTblEntry is
the right place for enrtuples. It's a concept regularly seen in planner data
structures but not otherwise seen at parse tree level.

I agree that this is strange. Perhaps
set_namedtuplestore_size_estimates should instead look up the
EphemeralNamedRelation by rte->enrname to find its way to
enr->md.enrtuples, but I'm not sure off the top of my head how it
should get its hands on the QueryEnvironment required to do that. I
will look into this on Monday, but other ideas/clues welcome...

Here's some background: If you look at the interface changes
introduced by 18ce3a4 you will see that it is now possible for a
QueryEnvironment object to be injected into the parser and executor.
Currently the only use for it is to inject named tuplestores into the
system via SPI_register_relation or SPI_register_trigger_data. That's
to support SQL standard transition tables, but anyone can now use SPI
to expose data to SQL via an ephemeral named relation in this way.
(In future we could imagine other kinds of objects like table
variables, anonymous functions or streams).

The QueryEnvironment is used by the parser to resolve names and build
RangeTblEntry objects. The planner doesn't currently need it, because
all the information it needs is in the RangeTblEntry, including the
offending row estimate. Then the executor needs it to get its hands
on the tuplestores. So the question is: how can we get it into
costsize.c, so that it can look up the EphermalNamedRelationMetaData
object by name, instead of trafficking statistics through parser data
structures?

Here are a couple of ways forward that I can see:

1. Figure out how to get the QueryEnvironment through more of these
stack frames (possibly inside other objects), so that
set_namedtuplestore_size_estimates can look up enrtuples by enrname:

set_namedtuplestore_size_estimates <-- would need QueryEnvironment
set_namedtuplestore_pathlist
set_rel_size
set_base_rel_sizes
make_one_rel
query_planner
grouping_planner
subquery_planner
standard_planner
planner
pg_plan_query
pg_plan_queries <-- doesn't receive QueryEnvironment
BuildCachedPlan <-- receives QueryEnvironment
GetCachedPlan
_SPI_execute_plan
SPI_execute_plan_with_paramlist

2. Rip the row estimation out for now, use a bogus hard coded
estimate like we do in some other cases, and revisit later. See
attached (including changes from my previous message).
Unsurprisingly, a query plan changes.

Thoughts?

I'm not sufficiently familiar with the relevant code to judge this one. Let's
see if planner experts voice an opinion. Absent more opinions, the current
design stands.

Here's another thought I had about this during review[1]/messages/by-id/CAEepm=02A5LTfGtHo8xWpEmW4AYY+ES-uPr02bWb0OzGF4Ej-Q@mail.gmail.com and didn't
follow up: Query plans are cached by plpgsql and reused. If you're
unlucky, on the first firing of a given trigger a transition
tuplestore could have 0 or 1 rows and on the second firing it could
have a zillion rows, I guess potentially resulting in pathologically
bad performance. I do think it's a good idea for SPI client code to
be able to provide an estimate via SPI_register_relation (if the
coding passes muster or can be suitably refactored), but maybe "oh,
about 1,000" is actually better for transition tables anyway in the
absence of some kind of adaptive replanning or user declarations.

[1]: /messages/by-id/CAEepm=02A5LTfGtHo8xWpEmW4AYY+ES-uPr02bWb0OzGF4Ej-Q@mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Thomas Munro (#3)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Mon, Jun 12, 2017 at 12:04 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Here are a couple of ways forward that I can see:

1. Figure out how to get the QueryEnvironment through more of these
stack frames (possibly inside other objects), so that
set_namedtuplestore_size_estimates can look up enrtuples by enrname:

If you were going to do this, the thing to do would be to get it
hooked up to the PlannerInfo, possibly via an intermediate hop in the
Query.

2. Rip the row estimation out for now, use a bogus hard coded
estimate like we do in some other cases, and revisit later. See
attached (including changes from my previous message).
Unsurprisingly, a query plan changes.

Perhaps this is a silly question, but I don't particularly see what's
wrong with:

3. Do nothing.

If I understand correctly, for the current use of named tuplestores,
the existing code produces correct estimates. If we chose option #1,
that would still be true, but we'd have to change a bunch of code to
get there. If we chose option #2, it would cease to be true. Why not
just leave it alone?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: Robert Haas (#6)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

Robert Haas <robertmhaas@gmail.com> writes:

Perhaps this is a silly question, but I don't particularly see what's
wrong with:

3. Do nothing.

Well, the fundamental problem is that the RTE is a lousy place to keep
rowcount estimates. That breaks assorted desirable properties like
querytrees being readonly to planning/execution (not that we don't
end up copying them anyway, but up to now that's been because of bad
implementation not because the representation was broken by design).

I agree that it's probably not so badly broken that we have to fix it
in time for v10, but this is not where we want to be in the long run.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tom Lane (#7)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Tue, Jun 13, 2017 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Perhaps this is a silly question, but I don't particularly see what's
wrong with:

3. Do nothing.

Well, the fundamental problem is that the RTE is a lousy place to keep
rowcount estimates. That breaks assorted desirable properties like
querytrees being readonly to planning/execution (not that we don't
end up copying them anyway, but up to now that's been because of bad
implementation not because the representation was broken by design).

How does it break those properties? I don't think enrtuples is being
modified by planning or execution as things stand.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: Robert Haas (#8)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jun 13, 2017 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, the fundamental problem is that the RTE is a lousy place to keep
rowcount estimates. That breaks assorted desirable properties like
querytrees being readonly to planning/execution (not that we don't
end up copying them anyway, but up to now that's been because of bad
implementation not because the representation was broken by design).

How does it break those properties? I don't think enrtuples is being
modified by planning or execution as things stand.

But it needs to be changeable, unless you like the proposition that we
can never replan a query inside a trigger on the basis of new information
about how big the transition table is. Even if you're okay with that
particular restriction, the NamedTupleStore stuff is supposed to be
flexible enough to accommodate other use-cases, and some of them will
surely not be okay with an immutable estimate for the store's size.

Thomas noted upthread that there wasn't any API for injecting a rowcount
estimate from an SPI caller, which is wrong actually: there's already an
enrtuples field in EphemeralNamedRelationMetadata. So another way to
explain why this is broken is that having a field in the RTE is redundant
with that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tom Lane (#9)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

How does it break those properties? I don't think enrtuples is being
modified by planning or execution as things stand.

But it needs to be changeable, unless you like the proposition that we
can never replan a query inside a trigger on the basis of new information
about how big the transition table is. Even if you're okay with that
particular restriction, the NamedTupleStore stuff is supposed to be
flexible enough to accommodate other use-cases, and some of them will
surely not be okay with an immutable estimate for the store's size.

Hmm, true. But even if we extracted enrtuples from the
RangeTbleEntry, there wouldn't be any mechanism to actually trigger
such a replan, would there?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Tom Lane

tgl@sss.pgh.pa.us

over 8 years ago

In reply to: Robert Haas (#10)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

But it needs to be changeable, unless you like the proposition that we
can never replan a query inside a trigger on the basis of new information
about how big the transition table is. Even if you're okay with that
particular restriction, the NamedTupleStore stuff is supposed to be
flexible enough to accommodate other use-cases, and some of them will
surely not be okay with an immutable estimate for the store's size.

Hmm, true. But even if we extracted enrtuples from the
RangeTbleEntry, there wouldn't be any mechanism to actually trigger
such a replan, would there?

You're just pointing out that there's a lot of unfinished work around
this mechanism. I don't think anybody has claimed otherwise.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Tom Lane (#11)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Tue, Jun 13, 2017 at 12:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jun 13, 2017 at 11:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

But it needs to be changeable, unless you like the proposition that we
can never replan a query inside a trigger on the basis of new information
about how big the transition table is. Even if you're okay with that
particular restriction, the NamedTupleStore stuff is supposed to be
flexible enough to accommodate other use-cases, and some of them will
surely not be okay with an immutable estimate for the store's size.

Hmm, true. But even if we extracted enrtuples from the
RangeTbleEntry, there wouldn't be any mechanism to actually trigger
such a replan, would there?

You're just pointing out that there's a lot of unfinished work around
this mechanism. I don't think anybody has claimed otherwise.

I'm just trying to understand your comments so that I can have an
intelligent opinion about what we should do from here. Given that the
replan wouldn't happen anyway, there seems to be no reason to tinker
with the location of enrtuples for v10 -- which is exactly what both
of us already said, but I was confused about your comments about
enrtuples getting changed. I think that I am no longer confused.

We still need to review and commit the minimal cleanup patch Thomas
posted upthread (v1, not v2). I think I might go do that unless
somebody else is feeling the urge to whack it around before commit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Thomas Munro

thomas.munro@enterprisedb.com

over 8 years ago

In reply to: Robert Haas (#12)

1 attachment(s)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Wed, Jun 14, 2017 at 4:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I'm just trying to understand your comments so that I can have an
intelligent opinion about what we should do from here. Given that the
replan wouldn't happen anyway, there seems to be no reason to tinker
with the location of enrtuples for v10 -- which is exactly what both
of us already said, but I was confused about your comments about
enrtuples getting changed. I think that I am no longer confused.

We still need to review and commit the minimal cleanup patch Thomas
posted upthread (v1, not v2). I think I might go do that unless
somebody else is feeling the urge to whack it around before commit.

OK, if we're keeping enrtuples in RangeTblEntry for v10 then we'd
better address Noah's original complaint that it's missing from
_{copy,equal,out,read}RangeTblEntry() . Here's a patch for that, to
apply on top of the -v1 patch above.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

add-enrtuples-to-node-funcs.patchapplication/octet-stream; name=add-enrtuples-to-node-funcs.patchDownload

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 3b54e3d80c..02451c2a71 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2307,6 +2307,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_NODE_FIELD(coltypmods);
 	COPY_NODE_FIELD(colcollations);
 	COPY_STRING_FIELD(enrname);
+	COPY_SCALAR_FIELD(enrtuples);
 	COPY_NODE_FIELD(alias);
 	COPY_NODE_FIELD(eref);
 	COPY_SCALAR_FIELD(lateral);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 696cd6e7e1..c88d09f7c6 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2632,6 +2632,7 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
 	COMPARE_NODE_FIELD(coltypmods);
 	COMPARE_NODE_FIELD(colcollations);
 	COMPARE_STRING_FIELD(enrname);
+	COMPARE_SCALAR_FIELD(enrtuples);
 	COMPARE_NODE_FIELD(alias);
 	COMPARE_NODE_FIELD(eref);
 	COMPARE_SCALAR_FIELD(lateral);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c348bdcde3..4fd2ca50c0 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -3047,6 +3047,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 			break;
 		case RTE_NAMEDTUPLESTORE:
 			WRITE_STRING_FIELD(enrname);
+			WRITE_FLOAT_FIELD(enrtuples, "%.0f");
 			WRITE_OID_FIELD(relid);
 			WRITE_NODE_FIELD(coltypes);
 			WRITE_NODE_FIELD(coltypmods);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 81ddfc3271..039910e1f7 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1358,6 +1358,7 @@ _readRangeTblEntry(void)
 			break;
 		case RTE_NAMEDTUPLESTORE:
 			READ_STRING_FIELD(enrname);
+			READ_FLOAT_FIELD(enrtuples);
 			READ_OID_FIELD(relid);
 			READ_NODE_FIELD(coltypes);
 			READ_NODE_FIELD(coltypmods);

#14

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Thomas Munro (#13)

Re: RTE_NAMEDTUPLESTORE, enrtuples and comments

On Tue, Jun 13, 2017 at 4:56 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Wed, Jun 14, 2017 at 4:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I'm just trying to understand your comments so that I can have an
intelligent opinion about what we should do from here. Given that the
replan wouldn't happen anyway, there seems to be no reason to tinker
with the location of enrtuples for v10 -- which is exactly what both
of us already said, but I was confused about your comments about
enrtuples getting changed. I think that I am no longer confused.

We still need to review and commit the minimal cleanup patch Thomas
posted upthread (v1, not v2). I think I might go do that unless
somebody else is feeling the urge to whack it around before commit.

OK, if we're keeping enrtuples in RangeTblEntry for v10 then we'd
better address Noah's original complaint that it's missing from
_{copy,equal,out,read}RangeTblEntry() . Here's a patch for that, to
apply on top of the -v1 patch above.

Committed both of those together.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers