New design for FK-based join selectivity estimation

Started by Tom Laneabout 10 years ago25 messageshackers

tgl@sss.pgh.pa.us

about 10 years ago

This is a branch of the discussion in
/messages/by-id/20160429102531.GA13701@huehner.biz
but I'm starting a new thread as the original title is getting
increasingly off-topic.

I complained in that thread that the FK join selectivity patch had a
very brute-force approach to matching join qual clauses to FK
constraints, requiring a total of seven nested levels of looping to
get anything done, and expensively rediscovering the same facts over
and over. Here is a sketch of what I think is a better way:

* During the planner's catalog data collection phase, construct a
single list (per PlannerInfo, not per RTE) of all relevant FKs.
In this data structure, each FK's referenced and referencing relations
are identified by relid (that is, RTE index) not just OID. In the case
of a query containing a self-join, that would mean that the same FK
constraint could give rise to multiple list entries, one for each RTE
occurrence of its referenced or referencing target relation. FKs not
relating two tables of the query are necessarily not in this list,
and we could also choose not to include single-column FKs.

* After finalizing equivalence classes, make a single pass through
the FK list and check each column-pair to see if it can be matched
to any EC (that is, both source and target columns are in the EC and
the comparison operator is in the EC's opfamily). Mark each matched
column pair in the FK list data structure with a pointer to the EC.

* When performing join selectivity estimation, run through the FK list
a single time, ignoring entries that do not link a member of the join's
LHS to a member of the join's RHS. This is a fairly cheap test given
the relid labels; it'd be approximately

if ((bms_is_member(fkinfo->src_relid, outer_rel->relids) &&
bms_is_member(fkinfo->dst_relid, inner_rel->relids)) ||
(bms_is_member(fkinfo->dst_relid, outer_rel->relids) &&
bms_is_member(fkinfo->src_relid, inner_rel->relids)))

For each potentially interesting FK entry, run through the join
qual list. A RestrictInfo that was generated from an EC matches
the FK if and only if that EC appears in the per-column markings;
other RestrictInfos are matched to one of the FK columns normally
(I think this path can ignore FK columns that have been matched to ECs).
At the end of that, we can determine whether all the FK columns have
been matched to some qual item, and we have a count and/or bitmapset
of the qual list entries that matched the FK. Remember the FK entry
with the largest such count.

* After scanning the list, we have our best FK match and can proceed
with making the actual selectivity estimate as in the current code.

With this approach, we have an iteration structure like

* once per join relation created
* for each foreign key constraint relevant to the query
(but skipping the loops below if it's not relevant to this join)
* for each join qual for the joinrel pair
* for each key column in that FK

which gets us down from seven nested loops to four, and also makes the
work done in the innermost loops significantly cheaper for the EC case,
which will be the more common one. It's also much easier to make this
structure do zero extra work when there are no relevant FKs, which is
a pleasant property for extra planner work to have.

Now, we'll also add some per-FK-per-EC setup work, but my guess is
that that's negligible compared to the per-join-relation work.

It's possible that we could reduce the cost of matching non-EC join
quals as well, with some trick along the line of pre-matching non-EC
RestrictInfos to FK items. I'm unsure that that is worth the trouble
though.

Thoughts?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tom Lane (#1)

Re: New design for FK-based join selectivity estimation

I wrote:

... Here is a sketch of what I think is a better way:
...
It's possible that we could reduce the cost of matching non-EC join
quals as well, with some trick along the line of pre-matching non-EC
RestrictInfos to FK items. I'm unsure that that is worth the trouble
though.

After further thought, I believe that may well be worth doing. That
is, someplace after deconstruct_jointree(), examine all the FKs and
match their columns not only to ECs but to non-EC joinclauses, which
we could find by trawling the joininfo list for either subject relation.
We'd end up with a EC pointer and/or a list of non-EC RestrictInfos
for each FK column.

The thing that makes this attractive is that at the end of this matching,
we would know exactly whether each FK is matched to the query as a whole:
either all its columns have matches, or they don't. It's not necessary to
re-determine that for each joinrel pair that includes the FK's two subject
relations. So the per-join-relation work would reduce to scanning the FK
list once to find the matched FK(s) that connect relations on opposite
sides of the join. Once we've found such an FK, identifying which join
qual list items should be dropped in favor of applying the FK's
selectivity is also really easy: we just check the column markings.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#1)

Re: New design for FK-based join selectivity estimation

Hi,

On 06/04/2016 09:44 PM, Tom Lane wrote:

This is a branch of the discussion in
/messages/by-id/20160429102531.GA13701@huehner.biz
but I'm starting a new thread as the original title is getting
increasingly off-topic.

I complained in that thread that the FK join selectivity patch had a
very brute-force approach to matching join qual clauses to FK
constraints, requiring a total of seven nested levels of looping to
get anything done, and expensively rediscovering the same facts over
and over. Here is a sketch of what I think is a better way:

* During the planner's catalog data collection phase, construct a
single list (per PlannerInfo, not per RTE) of all relevant FKs.
In this data structure, each FK's referenced and referencing relations
are identified by relid (that is, RTE index) not just OID. In the case
of a query containing a self-join, that would mean that the same FK
constraint could give rise to multiple list entries, one for each RTE
occurrence of its referenced or referencing target relation. FKs not
relating two tables of the query are necessarily not in this list,
and we could also choose not to include single-column FKs.

* After finalizing equivalence classes, make a single pass through
the FK list and check each column-pair to see if it can be matched
to any EC (that is, both source and target columns are in the EC and
the comparison operator is in the EC's opfamily). Mark each matched
column pair in the FK list data structure with a pointer to the EC.

* When performing join selectivity estimation, run through the FK list
a single time, ignoring entries that do not link a member of the join's
LHS to a member of the join's RHS. This is a fairly cheap test given
the relid labels; it'd be approximately

if ((bms_is_member(fkinfo->src_relid, outer_rel->relids) &&
bms_is_member(fkinfo->dst_relid, inner_rel->relids)) ||
(bms_is_member(fkinfo->dst_relid, outer_rel->relids) &&
bms_is_member(fkinfo->src_relid, inner_rel->relids)))

For each potentially interesting FK entry, run through the join
qual list. A RestrictInfo that was generated from an EC matches
the FK if and only if that EC appears in the per-column markings;
other RestrictInfos are matched to one of the FK columns normally
(I think this path can ignore FK columns that have been matched to ECs).
At the end of that, we can determine whether all the FK columns have
been matched to some qual item, and we have a count and/or bitmapset
of the qual list entries that matched the FK. Remember the FK entry
with the largest such count.

* After scanning the list, we have our best FK match and can proceed
with making the actual selectivity estimate as in the current code.

With this approach, we have an iteration structure like

* once per join relation created
* for each foreign key constraint relevant to the query
(but skipping the loops below if it's not relevant to this join)
* for each join qual for the joinrel pair
* for each key column in that FK

which gets us down from seven nested loops to four, and also makes the
work done in the innermost loops significantly cheaper for the EC case,
which will be the more common one. It's also much easier to make this
structure do zero extra work when there are no relevant FKs, which is
a pleasant property for extra planner work to have.

Now, we'll also add some per-FK-per-EC setup work, but my guess is
that that's negligible compared to the per-join-relation work.

It's possible that we could reduce the cost of matching non-EC join
quals as well, with some trick along the line of pre-matching non-EC
RestrictInfos to FK items. I'm unsure that that is worth the trouble
though.

Thoughts?

Firstly thanks for looking into this, and also for coming up with a very
detailed design proposal.

I do agree this new design seems superior to the current one and it
seems worth a try. I'd like to see how far we can get over the next few
days (say, until the end of the week).

One of the recent issues with the current design was handling of
inheritance / appendrels. ISTM the proposed design has the same issue,
no? What happens if the relations are partitioned?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#3)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

One of the recent issues with the current design was handling of
inheritance / appendrels. ISTM the proposed design has the same issue,
no? What happens if the relations are partitioned?

I haven't thought about inheritance in this proposal. My initial feeling
is that considering the parent table's outgoing FKs (if any) as valid is
not unreasonable. If it has any, probably all the children do too.
Not sure about incoming FKs, but there probably are none anyhow, since
our implementation doesn't really permit reasonable FK definitions that
reference a partitioned table. In any case, whatever we might choose
to do differently for inheritance would be no harder in this scheme than
what's there now; plus, whatever it is, we'd do it once not once per join
relation.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Simon Riggs

simon@2ndQuadrant.com

about 10 years ago

In reply to: Tom Lane (#1)

Re: New design for FK-based join selectivity estimation

On 4 June 2016 at 20:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is a branch of the discussion in

/messages/by-id/20160429102531.GA13701@huehner.biz
but I'm starting a new thread as the original title is getting
increasingly off-topic.

I complained in that thread that the FK join selectivity patch had a
very brute-force approach to matching join qual clauses to FK
constraints, requiring a total of seven nested levels of looping to
get anything done, and expensively rediscovering the same facts over
and over. Here is a sketch of what I think is a better way:

Thanks for your review and design notes here, which look like good
improvements.

Tomas has been discussing that with myself and others, but I just realised
that might not be apparent on list, so just to mention there is activity on
this and new code will be published very soon.

On the above mentioned thread, Tomas' analysis was this...
/messages/by-id/8344835e-18af-9d40-aed7-bd2261be9162@2ndquadrant.com

There are probably a few reasonably simple things we could do - e.g.

ignore foreign keys

with just a single column, as the primary goal of the patch is improving

estimates with

multi-column foreign keys. I believe that covers a vast majority of

foreign keys in the wild.

I agree with that comment. The relcache code retrieves all FKs, even ones
that have a single column. Yet the planner code never uses them unless
nKeys>1. That was masked somewhat by my two commits, treating the info as
generic and then using only a very specific subset of it.

So a simple change is to make RelationGetFKeyList() only retrieve FKs with
nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
the scope for increased planning time.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Simon Riggs (#5)

Re: New design for FK-based join selectivity estimation

Simon Riggs <simon@2ndquadrant.com> writes:

So a simple change is to make RelationGetFKeyList() only retrieve FKs with
nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
the scope for increased planning time.

FWIW, I don't particularly agree with that. It makes the relcache's fkey
storage extremely specific to this one use-case, a decision I expect we'd
regret later. And the planner needs to filter the fkey list anyway,
because it only wants fkeys that link to tables that are also in the
current query. Thus, my recommendation was that we should allow
RelationGetFKeyList to return a pointer directly to the cached info list
and require the planner to immediately copy (only) the entries that it
needs for the current query.

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want. It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation. That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Simon Riggs

simon@2ndQuadrant.com

about 10 years ago

In reply to: Tom Lane (#6)

Re: New design for FK-based join selectivity estimation

On 13 June 2016 at 19:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

So a simple change is to make RelationGetFKeyList() only retrieve FKs

with

nKeys>1. Rename to RelationGetMultiColumnFKeyList(). That greatly reduces
the scope for increased planning time.

FWIW, I don't particularly agree with that. It makes the relcache's fkey
storage extremely specific to this one use-case, a decision I expect we'd
regret later.

Hmm, clearly I thought that earlier also; that earlier thinking may be
influencing you. My commits had the concept of generic FK info and then a
specific optimization. So the main part of the planning problem was caused
by stored info that would never be used, in 9.6.

What changes my mind here is 1) point in dev cycle, 2) the point that the
list of FKs doesn't take into account whether the constraints are
deferrable, deferred or immediate and whether they are valid/invalid. ISTM
likely that we would care about those things in the future if we believe
that info is generic.

But then each new usage of the info will have the same planning time
problem to consider if they choose to extend the amount of info they hold.

Rejecting an optimization in 9.6 because it might be undone by later
changes is surely a problem for those later changes to resolve.

And the planner needs to filter the fkey list anyway,
because it only wants fkeys that link to tables that are also in the
current query. Thus, my recommendation was that we should allow
RelationGetFKeyList to return a pointer directly to the cached info list
and require the planner to immediately copy (only) the entries that it
needs for the current query.

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want. It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation. That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.

Is it realistic that we consider that at this point? Certainly not for
myself, at least.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Simon Riggs (#7)

Re: New design for FK-based join selectivity estimation

Simon Riggs <simon@2ndquadrant.com> writes:

On 13 June 2016 at 19:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Another point here is that I'm now unconvinced that restricting the logic
to consider only multi-column fkeys is really what we want. It looks to
me like the code can also improve estimates in the case where there are
multiple single-column FKs linking to the same target relation. That
might not be too common for two-table queries, but I bet it happens a
lot in three-or-more-table queries.

Is it realistic that we consider that at this point? Certainly not for
myself, at least.

It's pretty much built into the redesign I proposed, or so I thought.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#2)

Re: New design for FK-based join selectivity estimation

Hi,

Attached is a reworked patch, mostly following the new design proposal
from this thread.

I'm not entirely happy with the code, but it's the best I've been able
to come up with by now and I won't be able to significantly improve that
due to travel over. Inevitably there will be issues in the code, and if
there's a chance of fixing them I'll do my best to do that over the
evenings at a hotel.

The filtering and matching to eclasses / join quals is triggered from
planmain.c - I believe this is the right place and that those pieces are
reasonably solid.

The estimation still happens in costsize.c, of course, but was modified
to use the pre-matched info. I have my doubts about this part, and I'm
sure Tom had something less complex / more efficient in mind (using the
pre-matched info).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#9)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Attached is a reworked patch, mostly following the new design proposal
from this thread.

I've whacked this around quite a bit and am now reasonably happy with it.
The issue of planning speed hit should be pretty much gone, although I've
not done testing to prove that. I've rearranged the actual selectivity
calculation some too, because tests I did here did not look very good for
anything but the plain-inner-join case. It may be that more work is
needed there, but at least there's reasonable commentary about what we're
doing and why.

I have not adopted the plan of ignoring single-column FKs. While it would
only take a couple more lines to do so, I think the argument that there
will be a material planning speed hit no longer has much force. And I see
a couple of arguments in favor of allowing this code to trigger on
single-column FKs. First, it should work well even when pg_statistic data
is missing or out of date, which would be an improvement; and second, we
are likely not going to get much beta testing of the behavior if we
restrict it to multi-col FKs. So I think we should ship it like this for
beta even if we end up adding a filter against single-column FKs for
release.

Comments and testing appreciated.

regards, tom lane

#11

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#10)

Re: New design for FK-based join selectivity estimation

Hi,

On 06/16/2016 01:00 AM, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Attached is a reworked patch, mostly following the new design proposal
from this thread.

I've whacked this around quite a bit and am now reasonably happy with it.
The issue of planning speed hit should be pretty much gone, although I've
not done testing to prove that. I've rearranged the actual selectivity
calculation some too, because tests I did here did not look very good for
anything but the plain-inner-join case. It may be that more work is
needed there, but at least there's reasonable commentary about what we're
doing and why.

Thanks for getting the patch into a much better shape. I've quickly
reviewed the patch this morning before leaving to the airport - I do
plan to do additional review/testing once in the air or perhaps over the
weekend. So at the moment I only have a few minor comments:

1) Shouldn't we define a new macro in copyfuncs.c, to do the memcpy for
us? Seems a bit strange we have macros for everything else.

2) I'm wondering whether removing the restrict infos when

if (fkinfo->eclass[i] == rinfo->parent_ec)

is actually correct. Can't the EC include conditions that do not match
the FK? I mean something like this:

CREATE TABLE a (id1 INT PRIMARY KEY, id2 INT);
CREATE TABLE b (id1 INT REFERENCES a (id1), id2 INT);

and then something like

SELECT * FROM a JOIN b ON (a.id1 = b.id1 AND a.id1 = b.id2)

I've been unable to trigger this issue with your patch, but I do
remember I've ran into that with my version, which is why I explicitly
checked the rinfo again before removing it. Maybe that was incorrect, or
perhaps your patch does something smart. Or maybe it's just masked by
the fact that we 'push' the EC conditions to the base relations (which
however means we're stuck with default equality estimate).

3) I think this comment in get_foreign_key_join_selectivity is wrong and
should instead say 'to FK':

/* Otherwise, see if rinfo was previously matched to EC */

I have not adopted the plan of ignoring single-column FKs. While it would
only take a couple more lines to do so, I think the argument that there
will be a material planning speed hit no longer has much force. And I see
a couple of arguments in favor of allowing this code to trigger on
single-column FKs. First, it should work well even when pg_statistic data
is missing or out of date, which would be an improvement; and second, we
are likely not going to get much beta testing of the behavior if we
restrict it to multi-col FKs. So I think we should ship it like this for
beta even if we end up adding a filter against single-column FKs for
release.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#11)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Thanks for getting the patch into a much better shape. I've quickly
reviewed the patch this morning before leaving to the airport - I do
plan to do additional review/testing once in the air or perhaps over the
weekend. So at the moment I only have a few minor comments:

1) Shouldn't we define a new macro in copyfuncs.c, to do the memcpy for
us? Seems a bit strange we have macros for everything else.

Perhaps, but it seemed not that compelling since we need bespoke code for
those fields in outfuncs.c etc. Maybe it would be worth thinking about
macro infrastructure for array-type fields in all of those modules.

2) I'm wondering whether removing the restrict infos when
if (fkinfo->eclass[i] == rinfo->parent_ec)
is actually correct. Can't the EC include conditions that do not match
the FK?

Doesn't matter. The point is that it *does* include a condition that
does match the FK, whether it chose to generate exactly that condition
for this join or some related one.

I mean something like this:
CREATE TABLE a (id1 INT PRIMARY KEY, id2 INT);
CREATE TABLE b (id1 INT REFERENCES a (id1), id2 INT);
and then something like
SELECT * FROM a JOIN b ON (a.id1 = b.id1 AND a.id1 = b.id2)

Right. In this case we'll have an EC containing {a.id1, b.id1, b.id2}
which means that equivclass.c will generate a restriction condition
b.id1 = b.id2 to be applied at the scan of b. At the join level, it
has a choice whether to generate a.id1 = b.id1 or a.id1 = b.id2.
It could generate both, but that would be pointlessly inefficient (and
would likely confuse the selectivity estimators, too). But even if it
chooses to generate a.id1 = b.id2, we should recognize that the FK is
matched. What we're effectively doing by dropping that clause in
favor of treating the FK as matched is overridding equivclass.c's
arbitrary choice of which join clause to generate with an equally valid
choice that is easier to estimate for.

3) I think this comment in get_foreign_key_join_selectivity is wrong and
should instead say 'to FK':
/* Otherwise, see if rinfo was previously matched to EC */

Duh, yeah.

I rewrote the comments in this section to clarify a bit:

/* Drop this clause if it matches any column of the FK */
for (i = 0; i < fkinfo->nkeys; i++)
{
if (rinfo->parent_ec)
{
/*
* EC-derived clauses can only match by EC. It is okay to
* consider any clause derived from the same EC as
* matching the FK: even if equivclass.c chose to generate
* a clause equating some other pair of Vars, it could
* have generated one equating the FK's Vars. So for
* purposes of estimation, we can act as though it did so.
*
* Note: checking parent_ec is a bit of a cheat because
* there are EC-derived clauses that don't have parent_ec
* set; but such clauses must compare expressions that
* aren't just Vars, so they cannot match the FK anyway.
*/
if (fkinfo->eclass[i] == rinfo->parent_ec)
{
remove_it = true;
break;
}
}
else
{
/*
* Otherwise, see if rinfo was previously matched to FK as
* a "loose" clause.
*/
if (list_member_ptr(fkinfo->rinfos[i], rinfo))
{
remove_it = true;
break;
}
}
}

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#12)

Re: New design for FK-based join selectivity estimation

Hi,

A few more comments, about re-reading the patch more thoroughly. I
wouldn't say any of those qualify as bugs, but rather as discussion
about some of the design choices:

1) NULL handling

I'd argue that we should do something about this, although I agree it's
non-trivial to estimate - at least until we get some sort of correlation
stats (e.g. my patch already provides most of the pieces, I believe).
But I'd argue that in the case of multi-column foreign keys we can do
better even without it - my experience is that in such cases either all
values are NULL or none of them, and a single NULL value breaks the FK
of course. So I think max(null_frac) would work.

2) get_foreign_key_join_selectivity vs. incomplete matches

The comment right before checking (removedlist == NIL) says:

* For a multi-column FK, it's possible that we found matches to some
* columns but not all, implying that one of the above effects applied
* to just some of the columns. For the moment, we go ahead and
* remove those clauses and apply the FK's selectivity anyway. It
* might be better to put back the removed clauses and ignore the FK;
* but that amounts to betting on independence of the clauses, which
* doesn't seem like a good bet in such messy cases.

Is this a good idea? I'd probably vote to do what's proposed (and
rejected) in the second half of the comment, i.e. put back the clauses
and skip the FK as that pretty much says "improve estimate or keep the
current one, but don't risk making it worse."

3) ForeignKeyOptInfo->rinfos as a List

Can we actually get a list of matching RestrictInfos for a single
foreign key? I've been unable to construct such query.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#13)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

A few more comments, about re-reading the patch more thoroughly. I
wouldn't say any of those qualify as bugs, but rather as discussion
about some of the design choices:

1) NULL handling

I'd argue that we should do something about this, although I agree it's
non-trivial to estimate - at least until we get some sort of correlation
stats (e.g. my patch already provides most of the pieces, I believe).

I concur, actually, but I feel that it's out of scope for this particular
patch, which is only trying to replace the functionality that was
committed previously. If you want to come up with a patch on top of this
that adds some accounting for NULLs, I'd be willing to consider it as
a post-beta2 improvement.

But I'd argue that in the case of multi-column foreign keys we can do
better even without it - my experience is that in such cases either all
values are NULL or none of them, and a single NULL value breaks the FK
of course. So I think max(null_frac) would work.

Yeah, I was thinking along the same lines: max of the per-column null
fractions is probably an OK estimate.

The comment right before checking (removedlist == NIL) says:
* For a multi-column FK, it's possible that we found matches to some
* columns but not all, implying that one of the above effects applied
* to just some of the columns. For the moment, we go ahead and
* remove those clauses and apply the FK's selectivity anyway. It
* might be better to put back the removed clauses and ignore the FK;
* but that amounts to betting on independence of the clauses, which
* doesn't seem like a good bet in such messy cases.

Is this a good idea? I'd probably vote to do what's proposed (and
rejected) in the second half of the comment, i.e. put back the clauses
and skip the FK as that pretty much says "improve estimate or keep the
current one, but don't risk making it worse."

I would buy that approach when it comes to "loose" quals, but I think
it's not right for EC-derived matches, because of the behavior we
discussed previously that an EC should be considered to have implicitly
generated all the matching quals even though it actually made only one
qual that might not be any of them exactly.

Now on the other side of the coin, if several FKs match the same EC,
we might be overshooting the mark to assume that we can just multiply
their selectivity estimates together. But I think that's not really
a matter for the matching logic, but rather a question of how we want
to combine the estimates for multiple FKs.

Possibly what we could do here is assume that EC matches succeed,
whether there's a matching qual physically in the list or not, but
require a qual match for each column with non-EC matches. That
would complicate the logic slightly but not terribly (he says without
having actually tried to code it).

If you have a proposal about adjusting the net selectivity estimate when
multiple FKs match the join, let's hear it. But again that seems like
something we could address as a post-beta2 refinement.

3) ForeignKeyOptInfo->rinfos as a List
Can we actually get a list of matching RestrictInfos for a single
foreign key? I've been unable to construct such query.

I think you'd actually have to write redundant outer join quals,
along the lines of
select ... a left join b on (a.x = b.y and a.x = b.y)
I don't believe we take the trouble to eliminate such duplicates
unless they get absorbed by an EC, which outer-join quals would
not be. (Haven't tried this, though, as I don't have the patch
installed right now.)

The beta2 deadline is just about upon us; I feel that if we're going
to get this into this release at all, we need to push it today so
that we get a full buildfarm cycle on it before the wrap.

I plan to spend an hour or two adjusting the qual match logic as
discussed above, and re-reading the whole patch another time for
sanity. If I've not heard objections by the time I'm done,
I will push it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tom Lane (#14)

Re: New design for FK-based join selectivity estimation

I wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Is this a good idea? I'd probably vote to do what's proposed (and
rejected) in the second half of the comment, i.e. put back the clauses
and skip the FK as that pretty much says "improve estimate or keep the
current one, but don't risk making it worse."

I would buy that approach when it comes to "loose" quals, but I think
it's not right for EC-derived matches, because of the behavior we
discussed previously that an EC should be considered to have implicitly
generated all the matching quals even though it actually made only one
qual that might not be any of them exactly.

After further thought I decided you're right, because one of the main
original goals of ECs was to prevent double-counting the selectivity
of redundant equality quals. Acting as though the EC had generated
multiple redundant quals is likely to make things worse not better.

I still feel that we're leaving something on the table here, but it
would need to take the form of intelligently combining estimates for
overlapping FKs, not just blindly multiplying them together. Seems
like a task for later, especially considering that cases of this sort
are likely to be rare.

Pushed with adjustments for that.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#15)

Re: New design for FK-based join selectivity estimation

On 06/18/2016 09:27 PM, Tom Lane wrote:

I wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Is this a good idea? I'd probably vote to do what's proposed (and
rejected) in the second half of the comment, i.e. put back the clauses
and skip the FK as that pretty much says "improve estimate or keep the
current one, but don't risk making it worse."

I would buy that approach when it comes to "loose" quals, but I think
it's not right for EC-derived matches, because of the behavior we
discussed previously that an EC should be considered to have implicitly
generated all the matching quals even though it actually made only one
qual that might not be any of them exactly.

After further thought I decided you're right, because one of the main
original goals of ECs was to prevent double-counting the selectivity
of redundant equality quals. Acting as though the EC had generated
multiple redundant quals is likely to make things worse not better.

I still feel that we're leaving something on the table here, but it
would need to take the form of intelligently combining estimates for
overlapping FKs, not just blindly multiplying them together. Seems
like a task for later, especially considering that cases of this sort
are likely to be rare.

Pushed with adjustments for that.

OK, thanks. The one thing we haven't done is testing the performance, to
see how this fares. So I've repeated the tests I've done on the original
version of the patch here

/messages/by-id/8344835e-18af-9d40-aed7-bd2261be9162@2ndquadrant.com

Sadly I don't have access to the machine used for the previous tests (on
a vacation and the machine sits under my desk at home), so I had to use
my laptop. That means a fair amount of variability due to power saving
built into the CPU, and also VM variability (using Xen VM). So the
numbers are not directly comparable to the old results, and I believe
the differences between the patched and unpatched version seem to be
quite clear despite the variability.

It's true the results are not as bad as with the originally committed
patch, but there are multiple cases where the planning time gets up to
~2x compared to master.

See the old-results.ods, and also old-scripts.tgz (the old scripts used
the GUC we removed, so I had to tweak it a bit, and you'll have to whack
it a bit to get it working).

Sure, those cases use many foreign keys (generally >=100), but the
conclusion with the old patch was that it matters and we spent a lot of
time to get it within 10% of master. There are also two or three cases
where the planning got quite a bit faster, for some reason.

I've also constructed another script, joining just 2 tables and doing
some funky things (e.g. adding a lot of overlapping foreign keys), and
in these cases the slowdown is even more significant - up to ~13x, and
the stddev also increased. See new-results.ods and new-scripts.tgz.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 10 years ago

In reply to: Tom Lane (#14)

Re: New design for FK-based join selectivity estimation

On 06/18/2016 06:52 PM, Tom Lane wrote:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

A few more comments, about re-reading the patch more thoroughly. I
wouldn't say any of those qualify as bugs, but rather as discussion
about some of the design choices:

1) NULL handling

I'd argue that we should do something about this, although I agree it's
non-trivial to estimate - at least until we get some sort of correlation
stats (e.g. my patch already provides most of the pieces, I believe).

I concur, actually, but I feel that it's out of scope for this
particular patch, which is only trying to replace the functionality
that was committed previously. If you want to come up with a patch on
top of this that adds some accounting for NULLs, I'd be willing to
consider it as a post-beta2 improvement.

Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?

But I'd argue that in the case of multi-column foreign keys we can do
better even without it - my experience is that in such cases either all
values are NULL or none of them, and a single NULL value breaks the FK
of course. So I think max(null_frac) would work.

Yeah, I was thinking along the same lines: max of the per-column null
fractions is probably an OK estimate.

3) ForeignKeyOptInfo->rinfos as a List
Can we actually get a list of matching RestrictInfos for a single
foreign key? I've been unable to construct such query.

I think you'd actually have to write redundant outer join quals,
along the lines of
select ... a left join b on (a.x = b.y and a.x = b.y)
I don't believe we take the trouble to eliminate such duplicates
unless they get absorbed by an EC, which outer-join quals would
not be. (Haven't tried this, though, as I don't have the patch
installed right now.)

OK. Let's look into this post-beta2 then.

The beta2 deadline is just about upon us; I feel that if we're going
to get this into this release at all, we need to push it today so
that we get a full buildfarm cycle on it before the wrap.

I plan to spend an hour or two adjusting the qual match logic as
discussed above, and re-reading the whole patch another time for
sanity. If I've not heard objections by the time I'm done,
I will push it.

Thanks!

If I could wish one more thing - could you briefly explain why you
rewrote the patch the way you did? I'd like to learn from this and while
I think I kinda understand most of the changes, I'm not really sure I
got it right.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#16)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

OK, thanks. The one thing we haven't done is testing the performance, to
see how this fares. So I've repeated the tests I've done on the original
version of the patch here

Hmm. I'm not that excited about these results, for a couple of reasons:

* AFAICS, all the numbers are collected from the first execution of a
query within a session, meaning caches aren't populated and everything
has to be loaded from disk (or at least shared buffers).

* I do not credit hundreds of completely redundant FKs between the same
two tables as being representative of plausible real-world cases.

I modified your new script as attached to get rid of the first problem.
Comparing HEAD with HEAD minus commit 100340e2d, in non-assert builds,
I get results like this for the 100-foreign-key case (with repeat
count 1000 for the data collection script):

select code, test, avg(time),stddev(time) from data group by 1,2 order by 1,2;
code | test | avg | stddev
--------+-------+--------------------+---------------------
head | t1/t2 | 0.065045045045045 | 0.00312962651081508
head | t3/t4 | 0.168561561561562 | 0.00379087132124092
head | t5/t6 | 0.127671671671672 | 0.00326275949269809
head | t7/t8 | 0.391057057057056 | 0.00590249325300915
revert | t1/t2 | 0.0613933933933937 | 0.0032082678131875
revert | t3/t4 | 0.0737507507507501 | 0.00221692725859567
revert | t5/t6 | 0.123759759759759 | 0.00431225386651805
revert | t7/t8 | 0.154082082082081 | 0.00405118420422266
(8 rows)

So for the somewhat-credible cases, ie 100 unrelated foreign keys,
I get about 3% - 6% slowdown. The 100-duplicate-foreign-keys case
does indeed look like about a 2X slowdown, but as I said, I do not
think that has anything to do with interesting usage.

In any case, the situation I was worried about making better was
queries joining many tables, which none of this exercises at all.

regards, tom lane

#19

Tom Lane

tgl@sss.pgh.pa.us

about 10 years ago

In reply to: Tomas Vondra (#17)

Re: New design for FK-based join selectivity estimation

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

On 06/18/2016 06:52 PM, Tom Lane wrote:

I concur, actually, but I feel that it's out of scope for this
particular patch, which is only trying to replace the functionality
that was committed previously. If you want to come up with a patch on
top of this that adds some accounting for NULLs, I'd be willing to
consider it as a post-beta2 improvement.

Sure, fair enough. By post-beta2 you mean for 9.7, or still for 9.6?

I think it'd be legitimate to address the NULLs question for 9.6, as long
as the patch is not very large. If it does turn out to be invasive or
otherwise hard to review, waiting for 9.7 might be more prudent. But
I argued upthread that failing to consider nulls was a bug in the original
patch, so dealing with them could be considered a bug fix.

If I could wish one more thing - could you briefly explain why you
rewrote the patch the way you did? I'd like to learn from this and while
I think I kinda understand most of the changes, I'm not really sure I
got it right.

I don't at the moment recall everything I changed, but I'm happy to
answer questions that are more specific than that one ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Andrew Gierth

andrew@tao11.riddles.org.uk

almost 10 years ago

In reply to: Tom Lane (#10)

Re: New design for FK-based join selectivity estimation

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

Attached is a reworked patch, mostly following the new design proposal
from this thread.

Tom> Comments and testing appreciated.

This blows up (see bug 14219 for testcase) in
match_foreign_keys_to_quals on the find_base_rel call(s) in the
following case:

If the query was produced by rule expansion then the code that populates
fkinfo includes FK references to the OLD and NEW RTEs, but those might not
appear in the jointree (the testcase for the bug is a DELETE rule where
NEW clearly doesn't apply) and hence build_simple_rel was not called
(causing find_base_rel to fail). Not sure what the right fix is.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Tom Lane

tgl@sss.pgh.pa.us

almost 10 years ago

In reply to: Andrew Gierth (#20)

#22