RangeTblEntry.inh vs. RTE_SUBQUERY

Started by Peter Eisentrautabout 2 years ago9 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

Various code comments say that the RangeTblEntry field inh may only be
set for entries of kind RTE_RELATION.

For example

* inh is true for relation references that should be expanded to
include
* inheritance children, if the rel has any. This *must* be false for
* RTEs other than RTE_RELATION entries.

and various comments in other files.

(Confusingly, it is also listed under "Fields valid in all RTEs:", but
that definitely seems wrong.)

I have been deploying some assertions to see if the claims in the
RangeTblEntry comments are all correct, and I tripped over something.

The function pull_up_simple_union_all() in prepjointree.c sets ->inh to
true for RTE_SUBQUERY entries:

/*
* Mark the parent as an append relation.
*/
rte->inh = true;

Whatever this is doing appears to be some undocumented magic. If I
remove the line, then regression tests fail with plan differences, so it
definitely seems to do something.

Is this something we should explain the RangeTblEntry comments?

#2Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Peter Eisentraut (#1)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:

Various code comments say that the RangeTblEntry field inh may only be
set for entries of kind RTE_RELATION.

The function pull_up_simple_union_all() in prepjointree.c sets ->inh to
true for RTE_SUBQUERY entries:

/*
* Mark the parent as an append relation.
*/
rte->inh = true;

Whatever this is doing appears to be some undocumented magic.

Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():

/*
* expand_inherited_rtentry
* Expand a rangetable entry that has the "inh" bit set.
*
* "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.
*
* "inh" on a plain RELATION RTE means that it is a partitioned table or the
* parent of a traditional-inheritance set. In this case we must add entries
* for all the interesting child tables to the query's rangetable, and build
* additional planner data structures for them, including RelOptInfos,
* AppendRelInfos, and possibly PlanRowMarks.
*
* Note that the original RTE is considered to represent the whole inheritance
* set. In the case of traditional inheritance, the first of the generated
* RTEs is an RTE for the same table, but with inh = false, to represent the
* parent table in its role as a simple member of the inheritance set. For
* partitioning, we don't need a second RTE because the partitioned table
* itself has no data and need not be scanned.
*
* "inh" on a SUBQUERY RTE means that it's the parent of a UNION ALL group,
* which is treated as an appendrel similarly to inheritance cases; however,
* we already made RTEs and AppendRelInfos for the subqueries. We only need
* to build RelOptInfos for them, which is done by expand_appendrel_subquery.
*/

Is this something we should explain the RangeTblEntry comments?

+1

Regards,
Dean

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#2)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:

Various code comments say that the RangeTblEntry field inh may only be
set for entries of kind RTE_RELATION.

Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():

* "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.

Yes. The latter has been accurate for a very long time, so I'm
surprised that there are any places that think otherwise. We need
to fix them --- where did you see this exactly?

(Note that RELATION-only is accurate within the parser and rewriter,
so maybe clarifications about context are in order.)

regards, tom lane

#4Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#3)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

On 23.02.24 16:19, Tom Lane wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:

Various code comments say that the RangeTblEntry field inh may only be
set for entries of kind RTE_RELATION.

Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():

* "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.

Yes. The latter has been accurate for a very long time, so I'm
surprised that there are any places that think otherwise. We need
to fix them --- where did you see this exactly?

In nodes/parsenodes.h, it says both

This *must* be false for RTEs other than RTE_RELATION entries.

and also puts it under

Fields valid in all RTEs:

which are both wrong on opposite ends of the spectrum.

I think it would make more sense to group inh under "Fields valid for a
plain relation RTE" and then explain the exception for subqueries, like
it is done for several other fields.

See attached patch for a proposal. (I also shuffled a few fields around
to make the order a bit more logical.)

Attachments:

0001-Fix-description-and-grouping-of-RangeTblEntry.inh.patchtext/plain; charset=UTF-8; name=0001-Fix-description-and-grouping-of-RangeTblEntry.inh.patchDownload+18-21
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

Peter Eisentraut <peter@eisentraut.org> writes:

In nodes/parsenodes.h, it says both
This *must* be false for RTEs other than RTE_RELATION entries.

Well, that's true in the parser ...

and also puts it under
Fields valid in all RTEs:
which are both wrong on opposite ends of the spectrum.
I think it would make more sense to group inh under "Fields valid for a
plain relation RTE" and then explain the exception for subqueries, like
it is done for several other fields.

Dunno. The adjacent "lateral" field is also used for only selected
RTE kinds.

I'd be inclined to leave it where it is and just improve the
commentary. That could read like

* inh is true for relation references that should be expanded to include
* inheritance children, if the rel has any. In the parser this
* will only be true for RTE_RELATION entries. The planner also uses
* this field to mark RTE_SUBQUERY entries that contain UNION ALL
* queries that it has flattened into pulled-up subqueries
* (creating a structure much like the effects of inheritance).

If you do insist on moving it, please put it next to relkind so it
packs better.

I agree that perminfoindex seems to have suffered from add-at-the-end
syndrome, and if we do touch the field order you made an improvement
there. (BTW, who thought they needn't bother with a comment for
perminfoindex?)

regards, tom lane

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

On 2024-Feb-29, Tom Lane wrote:

I agree that perminfoindex seems to have suffered from add-at-the-end
syndrome, and if we do touch the field order you made an improvement
there. (BTW, who thought they needn't bother with a comment for
perminfoindex?)

There is a comment for it, or at least a61b1f74823c added one, though
not immediately adjacent. I do see that it's now further away than it
was. Perhaps we could add /* index of RTEPermissionInfo entry, or 0 */
to the line.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"The ability of users to misuse tools is, of course, legendary" (David Steele)
/messages/by-id/11b38a96-6ded-4668-b772-40f992132797@pgmasters.net

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#6)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2024-Feb-29, Tom Lane wrote:

I agree that perminfoindex seems to have suffered from add-at-the-end
syndrome, and if we do touch the field order you made an improvement
there. (BTW, who thought they needn't bother with a comment for
perminfoindex?)

There is a comment for it, or at least a61b1f74823c added one, though
not immediately adjacent. I do see that it's now further away than it
was. Perhaps we could add /* index of RTEPermissionInfo entry, or 0 */
to the line.

That'd be enough for me.

regards, tom lane

#8Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#5)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

On 29.02.24 19:14, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

In nodes/parsenodes.h, it says both
This *must* be false for RTEs other than RTE_RELATION entries.

Well, that's true in the parser ...

and also puts it under
Fields valid in all RTEs:
which are both wrong on opposite ends of the spectrum.
I think it would make more sense to group inh under "Fields valid for a
plain relation RTE" and then explain the exception for subqueries, like
it is done for several other fields.

Dunno. The adjacent "lateral" field is also used for only selected
RTE kinds.

The section is

/*
* Fields valid in all RTEs:
*/
Alias *alias; /* user-written alias clause, if any */
Alias *eref; /* expanded reference names */
bool lateral; /* subquery, function, or values is
LATERAL? */
bool inh; /* inheritance requested? */
bool inFromCl; /* present in FROM clause? */
List *securityQuals; /* security barrier quals to apply, if
any */

According to my testing, lateral is used for RTE_RELATION, RTE_SUBQUERY,
RTE_FUNCTION, RTE_TABLEFUNC, RTE_VALUES, which is 5 out of 9 possible.
So I think it might be okay to relabel that section (in actuality or
mentally) as "valid in several/many/most RTEs".

But I'm not sure what reason there would be for having inh there, which
is better described as "valid for RTE_RELATION, but also borrowed by
RTE_SUBQUERY", which is pretty much exactly what is the case for relid,
relkind, etc.

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#8)
Re: RangeTblEntry.inh vs. RTE_SUBQUERY

On 03.03.24 11:02, Peter Eisentraut wrote:

On 29.02.24 19:14, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

In nodes/parsenodes.h, it says both
      This *must* be false for RTEs other than RTE_RELATION entries.

Well, that's true in the parser ...

and also puts it under
      Fields valid in all RTEs:
which are both wrong on opposite ends of the spectrum.
I think it would make more sense to group inh under "Fields valid for a
plain relation RTE" and then explain the exception for subqueries, like
it is done for several other fields.

Dunno.  The adjacent "lateral" field is also used for only selected
RTE kinds.

The section is

    /*
     * Fields valid in all RTEs:
     */
    Alias      *alias;          /* user-written alias clause, if any */
    Alias      *eref;           /* expanded reference names */
    bool        lateral;        /* subquery, function, or values is
LATERAL? */
    bool        inh;            /* inheritance requested? */
    bool        inFromCl;       /* present in FROM clause? */
    List       *securityQuals;  /* security barrier quals to apply, if
any */

According to my testing, lateral is used for RTE_RELATION, RTE_SUBQUERY,
RTE_FUNCTION, RTE_TABLEFUNC, RTE_VALUES, which is 5 out of 9 possible.
So I think it might be okay to relabel that section (in actuality or
mentally) as "valid in several/many/most RTEs".

But I'm not sure what reason there would be for having inh there, which
is better described as "valid for RTE_RELATION, but also borrowed by
RTE_SUBQUERY", which is pretty much exactly what is the case for relid,
relkind, etc.

I have committed the patches for this discussion.

Related discussion will continue at
/messages/by-id/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org
/ https://commitfest.postgresql.org/47/4697/ .