Making clausesel.c Smarter
I've stumbled over an interesting query out in the wild where the
query was testing whether a nullable timestamptz column was earlier
than some point in time, and also checking that the column IS NOT NULL.
The usage pattern here was that records which required processing had
this timestamp set to something other than NULL; a worker would come
along, process those, and UPDATE the record to NULL to mark the
fact that it was now processed. So what we were left with was a table
with a small number of rows with a value in this timestamp column, and
an ever-growing number of rows with a NULL value.
A highly simplified version of the query was checking for records that
required processing before some date, say:
SELECT * FROM a WHERE ts < '2017-02-24' AND ts IS NOT NULL;
Of course, the ts IS NOT NULL is not required here, but I can
understand how someone might make the mistake of adding it. The
simple solution to the problem was to have that null test removed, but
that seemingly innocent little mistake caused some pain due to very
slow-running queries which held on to a nested loop plan for 33 times
longer than they should have. Basically, what was happening here is
that clauselist_selectivity() was applying the selectivity with the
~0.97 null_frac from pg_statistic over the top of the already-applied
estimate of the number of rows below the constant timestamp.
Since the diagnosis of this problem was not instant, and some amount
of pain was suffered before the solution was found, I wondered if it
might be worth smartening up the planner a bit in this area.
We're already pretty good at handling conditions like: SELECT * FROM a
WHERE x < 10 AND x < 1; where we'll effectively ignore the x < 10
estimate since x < 1 is more restrictive, so if we just build on that
ability a bit we can cover the above case with little extra code.
I've attached a draft patch which improves the situation here.
Given the test case:
create table ts (ts timestamptz);
insert into ts select case when x%1000=0 then '2017-01-01'::date +
(x::text || ' sec')::interval else null end from
generate_series(1,1000000) x(x);
analyze ts;
explain analyze select count(*) from ts where ts <=
'2017-02-01'::timestamptz and ts is not null;
With the patch we get:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Aggregate (cost=15938.83..15938.84 rows=1 width=8) (actual
time=101.003..101.003 rows=1 loops=1)
-> Seq Scan on ts (cost=0.00..15937.00 rows=733 width=0) (actual
time=0.184..100.868 rows=1000 loops=1)
Filter: ((ts IS NOT NULL) AND (ts <= '2017-02-01
00:00:00+13'::timestamp with time zone))
Rows Removed by Filter: 999000
Planning time: 0.153 ms
Execution time: 101.063 ms
Whereas master gives us:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Aggregate (cost=15937.00..15937.01 rows=1 width=8) (actual
time=119.256..119.256 rows=1 loops=1)
-> Seq Scan on ts (cost=0.00..15937.00 rows=1 width=0) (actual
time=0.172..119.062 rows=1000 loops=1)
Filter: ((ts IS NOT NULL) AND (ts <= '2017-02-01
00:00:00+13'::timestamp with time zone))
Rows Removed by Filter: 999000
Planning time: 0.851 ms
Execution time: 119.401 ms
A side effect of this is that we're now able to better detect
impossible cases such as:
postgres=# explain analyze select count(*) from ts where ts <=
'2017-02-01'::timestamptz and ts is null;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Aggregate (cost=15937.00..15937.01 rows=1 width=8) (actual
time=135.012..135.012 rows=1 loops=1)
-> Seq Scan on ts (cost=0.00..15937.00 rows=1 width=0) (actual
time=135.007..135.007 rows=0 loops=1)
Filter: ((ts IS NULL) AND (ts <= '2017-02-01
00:00:00+13'::timestamp with time zone))
Rows Removed by Filter: 1000000
Planning time: 0.067 ms
Execution time: 135.050 ms
(6 rows)
Master is not able to see the impossibility of this case:
postgres=# explain analyze select count(*) from ts where ts <=
'2017-02-01'::timestamptz and ts is null;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Aggregate (cost=15938.83..15938.84 rows=1 width=8) (actual
time=131.681..131.681 rows=1 loops=1)
-> Seq Scan on ts (cost=0.00..15937.00 rows=733 width=0) (actual
time=131.676..131.676 rows=0 loops=1)
Filter: ((ts IS NULL) AND (ts <= '2017-02-01
00:00:00+13'::timestamp with time zone))
Rows Removed by Filter: 1000000
Planning time: 0.090 ms
Execution time: 131.719 ms
(6 rows)
Now one thing I was unsure of in the patch was this "Other clauses"
concept that I've come up with. In the patch we have:
default:
addCachedSelectivityOtherClause(&cslist, var, s2);
break;
I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My
imagination here won't stretch far enough to imagine an OpExpr which
passes with a NULL operand. If it did, my logic is broken, and if
that's the case then I think limiting "Others" to mean F_EQSEL and
F_NEQSEL would be the workaround.
Before I do any more work on this, I'd like to know, is this something
that we might want to fix?
It would be good to improve the situation here in the back branches
too, but I'm thinking that the attached might be a little invasive for
that?
Thoughts?
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
smarter_clausesel.patch (application/octet-stream, +281 -159)
On Fri, Feb 24, 2017 at 3:30 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
It would be good to improve the situation here in the back branches
too, but I'm thinking that the attached might be a little invasive for
that?
My experience has been that customers - at least EnterpriseDB
customers - do not appreciate plan changes when they upgrade to a new
minor release. Most people have production installations that are
basically working; if not, they wouldn't be in production. Even if
they're getting a good plan only by some happy accident, they're still
getting it, and a change can cause a good plan to flop over into a bad
plan, which can easily turn into a service outage. The people who had
a bad plan and get flipped to a good plan will be happy once they
realize what has changed, of course, but that's not really enough to
make up for the panicked calls from customers whose stuff falls over
when they try to apply the critical security update.
I think the basic thing you are trying to accomplish is sensible,
though. I haven't reviewed the patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2/26/17 1:41 AM, Robert Haas wrote:
On Fri, Feb 24, 2017 at 3:30 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
It would be good to improve the situation here in the back branches
too, but I'm thinking that the attached might be a little invasive for
that?

My experience has been that customers - at least EnterpriseDB
customers - do not appreciate plan changes when they upgrade to a new
minor release. Most people have production installations that are
basically working; if not, they wouldn't be in production. Even if
they're getting a good plan only by some happy accident, they're still
getting it, and a change can cause a good plan to flop over into a bad
plan, which can easily turn into a service outage. The people who had
a bad plan and get flipped to a good plan will be happy once they
realize what has changed, of course, but that's not really enough to
make up for the panicked calls from customers whose stuff falls over
when they try to apply the critical security update.

I think the basic thing you are trying to accomplish is sensible,
though. I haven't reviewed the patch.
This patch applies cleanly and compiles at cccbdde. I agree with Robert
that back patching would likely not be a good idea.
Anyone familiar with the planner available to review this patch?
It also seems like this would be a (relatively) simple patch to start
with for anyone that is interested in learning more about the planner.
--
-David
david@pgmasters.net
David Steele <david@pgmasters.net> writes:
Anyone familiar with the planner available to review this patch?
FWIW, it's on my radar but I don't expect to get to it real soon,
as there's other stuff I deem higher priority. In the meantime,
I don't want to stand in the way of someone else looking at it.
regards, tom lane
On Fri, Feb 24, 2017 at 7:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
I've stumbled over an interesting query out in the wild where the
query was testing a nullable timestamptz column was earlier than some
point in time, and also checking the column IS NOT NULL.

The use pattern here was that records which required processing had
this timestamp set to something other than NULL, a worker would come
along and process those, and UPDATE the record to NULL to mark the
fact that it was now processed. So what we are left with was a table
with a small number of rows with a value in this timestamp column, and
an ever-growing number of rows with a NULL value.

A highly simplified version of the query was checking for records that
required processing before some date, say:

SELECT * FROM a WHERE ts < '2017-02-24' AND ts IS NOT NULL;
Of course, the ts IS NOT NULL is not required here, but I can
understand how someone might make the mistake of adding this. The
simple solution to the problem was to have that null test removed, but
that seemingly innocent little mistake caused some pain due to very
slow running queries which held on to a nested loop plan 33 times
longer than it should have been doing. Basically what was happening
here is that clauselist_selectivity() was applying the selectivity
with the ~0.97 null_frac from pg_statistic over the top of the already
applied estimate on the number of rows below the constant timestamp.

Since the diagnosis of this problem was not instant, and some amount
of pain was suffered before the solution was found, I wondered if it
might be worth smartening up the planner a bit in this area.

We're already pretty good at handling conditions like: SELECT * FROM a
WHERE x < 10 and x < 1; where we'll effectively ignore the x < 10
estimate since x < 1 is more restrictive, so if we just build on that
ability a bit we could get enough code to cover the above case.

I've attached a draft patch which improves the situation here.
I thought to take a quick look at this patch. I'll probably take a
deeper look later, but some initial comments:
-typedef struct RangeQueryClause
+typedef struct CachedSelectivityClause
{
- struct RangeQueryClause *next; /* next in linked list */
+ struct CachedSelectivityClause *next; /* next in linked list */
Node *var; /* The common variable of the clauses */
- bool have_lobound; /* found a low-bound clause yet? */
- bool have_hibound; /* found a high-bound clause yet? */
+ int selmask; /* Bitmask of which sel types are stored */
Selectivity lobound; /* Selectivity of a var > something clause */
Selectivity hibound; /* Selectivity of a var < something clause */
-} RangeQueryClause;
As seems customary in other code, perhaps you should define some
HAS_X() macros for dealing with the selmask.
Something like
#define SELMASK_HAS_LOBOUND(selmask) (((selmask) & CACHESEL_LOBOUND) != 0)
etc..
+static void addCachedSelectivityRangeVar(CachedSelectivityClause
**cslist, Node *var,
bool varonleft, bool isLTsel, Selectivity s2);
You're changing clause -> var throughout the code when dealing with
nodes, but looking at their use, they hold neither. All those
addCachedSelectivity functions are usually passed expression nodes. If
you're renaming, perhaps you should rename to "expr".
On Fri, Feb 24, 2017 at 7:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Now one thing I was unsure of in the patch was this "Other clauses"
concept that I've come up with. In the patch we have:

default:
addCachedSelectivityOtherClause(&cslist, var, s2);
break;

I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My
imagination here won't stretch far enough to imagine an OpExpr which
passes with a NULL operand. If it did my logic is broken, and if
that's the case then I think limiting "Others" to mean F_EQSEL and
F_NEQSEL would be the workaround.
While not specifically applicable only to "Others", something needs
consideration here:
User-defined functions can be nonstrict. An OpExpr involving such a
user-defined function could possibly pass on null.
I would suggest avoiding making the assumption that it doesn't unless
the operator is strict.
One could argue that such an operator should provide its own
selectivity estimator, but the strictness check shouldn't be too
difficult to add, and I think it's a better choice.
So you'd have to check that:
default:
if (op_strict(expr->opno) && func_strict(expr->opfuncid))
addCachedSelectivityOtherClause(&cslist, var, s2);
else
s1 = s1 * s2;
break;
So, I went ahead and did that.
Considering this setup:
createdb pgtest
cat <<SQL | psql pgtest
CREATE TABLE testdata AS select null::integer as value from
generate_series(1, 100000);
VACUUM FREEZE testdata;
CREATE FUNCTION maskcheck(integer, integer) RETURNS boolean AS \$\$
BEGIN RETURN (coalesce(\$1,0) & coalesce(\$2,0)) = 0 ; END \$\$
LANGUAGE PLPGSQL;
CREATE OPERATOR && ( PROCEDURE = maskcheck, LEFTARG = integer,
RIGHTARG = integer, COMMUTATOR = && );
SQL
The following query:
EXPLAIN ANALYZE SELECT * FROM testdata WHERE value && 1 AND value is null;
The original patch yields:
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on testdata (cost=0.00..26344.00 rows=1 width=4) (actual
time=0.367..84.861 rows=100000 loops=1)
Filter: ((value IS NULL) AND (value && 1))
Planning time: 0.371 ms
Execution time: 88.156 ms
(4 rows)
After the change, however, it yields:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on testdata (cost=0.00..26344.00 rows=50000 width=4)
(actual time=0.207..89.668 rows=100000 loops=1)
Filter: ((value IS NULL) AND (value && 1))
Planning time: 0.186 ms
Execution time: 92.978 ms
(4 rows)
Which seems much better.
On 1 April 2017 at 16:42, Claudio Freire <klaussfreire@gmail.com> wrote:
I thought to take a quick look at this patch. I'll probably take a
deeper look later, but some initial comments:

-typedef struct RangeQueryClause
+typedef struct CachedSelectivityClause
 {
-    struct RangeQueryClause *next;    /* next in linked list */
+    struct CachedSelectivityClause *next;    /* next in linked list */
     Node       *var;        /* The common variable of the clauses */
-    bool        have_lobound;    /* found a low-bound clause yet? */
-    bool        have_hibound;    /* found a high-bound clause yet? */
+    int         selmask;    /* Bitmask of which sel types are stored */
     Selectivity lobound;    /* Selectivity of a var > something clause */
     Selectivity hibound;    /* Selectivity of a var < something clause */
-} RangeQueryClause;

As seems customary in other code, perhaps you should define some
HAS_X() macros for dealing with the selmask.

Something like
#define SELMASK_HAS_LOBOUND(selmask) (((selmask) & CACHESEL_LOBOUND) != 0)
etc..
Thanks for looking at the patch.
The problem with that is that some of the tests are doing more than
just checking a single flag is enabled.
Consider:
if ((cslist->selmask & (CACHESEL_LOBOUND | CACHESEL_HIBOUND)) ==
(CACHESEL_LOBOUND | CACHESEL_HIBOUND))
Of course, those could be Macro'd too, but it seems I might need to be
inventive with the names, or the meaning might not be all that clear.
It does not seem overly complex and it is isolated to a single file,
so perhaps it might be OK as is?
+static void addCachedSelectivityRangeVar(CachedSelectivityClause
**cslist, Node *var,
bool varonleft, bool isLTsel, Selectivity s2);

You're changing clause -> var throughout the code when dealing with
nodes, but looking at their use, they hold neither. All those
addCachedSelectivity functions are usually passed expression nodes. If
you're renaming, perhaps you should rename to "expr".
hmm, this is true. I kind of followed the lead of the name of the
variable in the old RangeQueryClause struct. I have changed the names
of these to be expr in the attached, but I didn't change the name of
the Var in the new CachedSelectivityClause struct. It seems like a
change not related to this patch.
Do you think I should change that too?
On Fri, Feb 24, 2017 at 7:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Now one thing I was unsure of in the patch was this "Other clauses"
concept that I've come up with. In the patch we have:

default:
addCachedSelectivityOtherClause(&cslist, var, s2);
break;

I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My
imagination here won't stretch far enough to imagine an OpExpr which
passes with a NULL operand. If it did my logic is broken, and if
that's the case then I think limiting "Others" to mean F_EQSEL and
F_NEQSEL would be the workaround.

While not specifically applicable only to "Others", something needs
consideration here:

User-defined functions can be nonstrict. An OpExpr involving such a
user-defined function could possibly pass on null.
Good point. I overlooked this.
I would suggest avoiding making the assumption that it doesn't unless
the operator is strict.

One could argue that such an operator should provide its own
selectivity estimator, but the strictness check shouldn't be too
difficult to add, and I think it's a better choice.

So you'd have to check that:
default:
if (op_strict(expr->opno) && func_strict(expr->opfuncid))
addCachedSelectivityOtherClause(&cslist, var, s2);
else
s1 = s1 * s2;
break;
I've changed it to something along those lines. I don't think the
func_strict here is required, though, so I've gone with just
op_strict().
Updated patch attached.
Thanks for reviewing it.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments:
smarter_clausesel_2017-04-03.patch (application/octet-stream, +302 -160)
On 2017-04-03 20:59:42 +1200, David Rowley wrote:
Updated patch attached.
Thanks for reviewing it.
Given the time in the release cycle I'm afraid that it's too late
to get this into v10. Does anybody disagree? If not, it should be
moved to the next CF.
Greetings,
Andres Freund
On Mon, Apr 3, 2017 at 5:24 PM, Andres Freund <andres@anarazel.de> wrote:
On 2017-04-03 20:59:42 +1200, David Rowley wrote:
Updated patch attached.
Thanks for reviewing it.
Given the time in the release cycle I'm afraid that it's too late
to get this into v10. Does anybody disagree? If not, it should be
moved to the next CF.

Greetings,
Andres Freund
While personally I'd like to see this patch make it (I got bitten by
this more than once), I can't argue with that logic with the feature
freeze almost upon us, especially given the stage the review is at
(ie: just beginning).
I'm still going to review it though.
On 4 April 2017 at 08:24, Andres Freund <andres@anarazel.de> wrote:
On 2017-04-03 20:59:42 +1200, David Rowley wrote:
Updated patch attached.
Thanks for reviewing it.
Given the time in the release cycle I'm afraid that it's too late
to get this into v10. Does anybody disagree? If not, it should be
moved to the next CF.
I strongly disagree. The time in the release cycle is the final
commitfest. This is when patches are committed to the repository. The
exception to this is that no large invasive patches should arrive new
in the final commitfest. This is not one of those.
Tom has already mentioned he'd like to look at this. If you're not
willing, then please just don't look at it, but please also don't
remove other people's opportunity for doing so.
Please explain your logic for thinking otherwise. Have I somehow
misunderstood what the 1 week extension means?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2017-04-04 09:22:07 +1200, David Rowley wrote:
On 4 April 2017 at 08:24, Andres Freund <andres@anarazel.de> wrote:
On 2017-04-03 20:59:42 +1200, David Rowley wrote:
Updated patch attached.
Thanks for reviewing it.
Given the time in the release cycle I'm afraid that it's too late
to get this into v10. Does anybody disagree? If not, it should be
moved to the next CF.

I strongly disagree. The time in the release cycle is the final
commitfest. This is when patches are committed to the repository. The
exception to this is that no large invasive patches should arrive new
in the final commitfest. This is not one of those.
Right.
Tom has already mentioned he'd like to look at this.
He also said that it'll be a while, and he hadn't yet by the time of the
original freeze date.
Please explain your logic for thinking otherwise. Have I somehow
misunderstood what the 1 week extension means?
Well, we have a lot of pending work, some of that has been pending for
years essentially. And Tom hadn't looked at it yet, and he wasn't at
pgconf.us, so he wasn't actually delayed due to that...
I'm fine with not punting right now, but we need to reduce the number of
patches and focus on stuff that's realistic to get in. There's 4 days
and ~60 open CF entries. Please help whip other patches into shape...
- Andres
On Mon, Apr 3, 2017 at 6:22 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 4 April 2017 at 08:24, Andres Freund <andres@anarazel.de> wrote:
On 2017-04-03 20:59:42 +1200, David Rowley wrote:
Updated patch attached.
Thanks for reviewing it.
Given the time in the release cycle I'm afraid that it's too late
to get this into v10. Does anybody disagree? If not, it should be
moved to the next CF.

I strongly disagree. The time in the release cycle is the final
commitfest. This is when patches are committed to the repository. The
exception to this is that no large invasive patches should arrive new
in the final commitfest. This is not one of those.
Selectivity misestimations can have a huge impact on query
performance, and that could be a big deal if there's a big regression
on some queries.
But, the beta period could be enough to detect any issues there. I'll
leave that decision to committers or commitfest managers I guess.
Tom has already mentioned he'd like to look at this. If you're not
willing, then please just don't look at it, but please also don't
remove other people's opportunity for doing so.

Please explain your logic for thinking otherwise. Have I somehow
misunderstood what the 1 week extension means?
But Tom hasn't yet, and even if he does, we'd have to be done with the
review within this week.
In any case, I intend to go on with the review later today, and at
first look the patch seems in good shape (though it'll take a deeper
look before I make that my official review). Most other patches in
need of review are a bit complex for my knowledge of the code base,
and I already started reviewing this one. We could perhaps be done by
the end of the week.
We could defer that decision to within 4 days. If we're done, we're
done. If we're not, move to next CF.
On Mon, Apr 3, 2017 at 5:59 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
+static void addCachedSelectivityRangeVar(CachedSelectivityClause
**cslist, Node *var,
bool varonleft, bool isLTsel, Selectivity s2);

You're changing clause -> var throughout the code when dealing with
nodes, but looking at their use, they hold neither. All those
addCachedSelectivity functions are usually passed expression nodes. If
you're renaming, perhaps you should rename to "expr".

hmm, this is true. I kind of followed the lead of the name of the
variable in the old RangeQueryClause struct. I have changed the names
of these to be expr in the attached, but I didn't change the name of
the Var in the new CachedSelectivityClause struct. It seems like a
change not related to this patch.

Do you think I should change that too?
I'd prefer it if all occurrences of the concept were changed, to
maintain consistency.
That would include variables used to hold expressions that refer to
these as well, as in the case of:
+ Node *var;
+
+ if (varonleft)
+ var = leftexpr;
+ else
+ var = rightexpr;
+
I would suggest avoiding making the assumption that it doesn't unless
the operator is strict.

One could argue that such an operator should provide its own
selectivity estimator, but the strictness check shouldn't be too
difficult to add, and I think it's a better choice.

So you'd have to check that:
default:
if (op_strict(expr->opno) && func_strict(expr->opfuncid))
addCachedSelectivityOtherClause(&cslist, var, s2);
else
s1 = s1 * s2;
break;

I've changed it to something along those lines. I don't think the
func_strict here is required, though, so I've gone with just
op_strict().
Indeed, I realized right after typing it, I thought I had corrected it
before I hit send. Seems I hadn't. Good thing you noticed.
One last observation:
+ /*
+ * An IS NOT NULL test is a no-op if there's any other strict quals,
+ * so if that's the case, then we'll only apply this, otherwise we'll
+ * ignore it.
+ */
+ else if (cslist->selmask == CACHESEL_NOTNULLTEST)
+ s1 *= cslist->notnullsel;
In some paths, the range selectivity estimation code punts and returns
DEFAULT_RANGE_INEQ_SEL. In those cases, if a not null test was
present, it should still be applied. It could make a big difference if
the selectivity for the nulltest is high.
On 4 April 2017 at 11:35, Claudio Freire <klaussfreire@gmail.com> wrote:
I'd prefer it if all occurrences of the concept were changed, to
maintain consistency.
That would include variables used to hold expressions that refer to
these as well, as in the case of:

+    Node       *var;
+
+    if (varonleft)
+        var = leftexpr;
+    else
+        var = rightexpr;
+
Thanks for looking again.
I didn't change that one as there's already a variable named "expr" in
the scope. I thought changing that would mean code churn that I don't
really want to add to the patch. Someone else might come along and ask
me why I'm renaming this unrelated variable. I kind of think that if
it was var before when it meant expr, then it's not the business of
this patch to clean that up. I didn't rename the struct member, for
example, as the meaning is no different than before.
One last observation:
+    /*
+     * An IS NOT NULL test is a no-op if there's any other strict quals,
+     * so if that's the case, then we'll only apply this, otherwise we'll
+     * ignore it.
+     */
+    else if (cslist->selmask == CACHESEL_NOTNULLTEST)
+        s1 *= cslist->notnullsel;

In some paths, the range selectivity estimation code punts and returns
DEFAULT_RANGE_INEQ_SEL. In those cases, if a not null test was
present, it should still be applied. It could make a big difference if
the selectivity for the nulltest is high.
I'd say that the location that applies the DEFAULT_RANGE_INEQ_SEL
should apply the nullfrac. It seems wrong to rely on an IS NOT NULL test
existing to estimate that properly. I don't think that needs to be done as
part of this patch. I'd rather limit the scope since "returned with
feedback" is already knocking at the door of this patch.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Apr 3, 2017 at 8:52 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
One last observation:
+    /*
+     * An IS NOT NULL test is a no-op if there's any other strict quals,
+     * so if that's the case, then we'll only apply this, otherwise we'll
+     * ignore it.
+     */
+    else if (cslist->selmask == CACHESEL_NOTNULLTEST)
+        s1 *= cslist->notnullsel;

In some paths, the range selectivity estimation code punts and returns
DEFAULT_RANGE_INEQ_SEL. In those cases, if a not null test was
present, it should still be applied. It could make a big difference if
the selectivity for the nulltest is high.

I'd say that the location that applies the DEFAULT_RANGE_INEQ_SEL
should apply the nullfrac. It seems wrong to rely on an IS NOT NULL test
existing to estimate that properly. I don't think that needs to be done
as part of this patch. I'd rather limit the scope since "returned with
feedback" is already knocking at the door of this patch.
I agree, but this patch regresses for those cases if the user
compensated with a not null test, leaving little recourse, so I'd fix
it in this patch to avoid the regression.
Maybe you're right that the null fraction should be applied regardless
of whether there's an explicit null check, though.
On Mon, Apr 3, 2017 at 9:19 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
On Mon, Apr 3, 2017 at 8:52 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

One last observation:
+    /*
+     * An IS NOT NULL test is a no-op if there's any other strict quals,
+     * so if that's the case, then we'll only apply this, otherwise we'll
+     * ignore it.
+     */
+    else if (cslist->selmask == CACHESEL_NOTNULLTEST)
+        s1 *= cslist->notnullsel;

In some paths, the range selectivity estimation code punts and returns
DEFAULT_RANGE_INEQ_SEL. In those cases, if a not null test was
present, it should still be applied. It could make a big difference if
the selectivity for the nulltest is high.

I'd say that the location that applies the DEFAULT_RANGE_INEQ_SEL
should apply the nullfrac. It seems wrong to rely on an IS NOT NULL test
existing to estimate that properly. I don't think that needs to be done
as part of this patch. I'd rather limit the scope since "returned with
feedback" is already knocking at the door of this patch.

I agree, but this patch regresses for those cases if the user
compensated with a not null test, leaving little recourse, so I'd fix
it in this patch to avoid the regression.

Maybe you're right that the null fraction should be applied regardless
of whether there's an explicit null check, though.
The more I read that code, the more I'm convinced there's no way to
get a DEFAULT_RANGE_INEQ_SEL if (rqlist->hibound != DEFAULT_INEQ_SEL &&
rqlist->lobound != DEFAULT_INEQ_SEL) other than from contradictory
clauses, a case the current code handles very poorly, returning
DEFAULT_RANGE_INEQ_SEL and ignoring nullfrac, something that could
give way-off estimates if the table had lots of nulls.
Say, consider if "value" was mostly null and you write:
SELECT * FROM testdata WHERE value BETWEEN 10 AND 20 AND value BETWEEN
50 AND 60;
All other cases I can think of would end with either hibound or
lobound being equal to DEFAULT_INEQ_SEL.
It seems to me, doing a git blame, that the checks in the else branch
were left only as a precaution, one that seems unnecessary.
If you ask me, I'd just leave:
+ if (rqlist->hibound == DEFAULT_INEQ_SEL ||
+     rqlist->lobound == DEFAULT_INEQ_SEL)
+ {
+     /*
+      * No point in checking null selectivity: DEFAULT_INEQ_SEL
+      * implies we have no stats.
+      */
+     s2 = DEFAULT_RANGE_INEQ_SEL;
+ }
+ else
+ {
+     ...
+     s2 = rqlist->hibound + rqlist->lobound - 1.0;
+     nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid);
+     notnullsel = 1.0 - nullsel;
+
+     /* Adjust for double-exclusion of NULLs */
+     s2 += nullsel;
+     if (s2 <= 0.0)
+     {
+         if (s2 <= (-1.0e-4 * notnullsel - 1.0e-6))
+         {
+             /* Most likely contradictory clauses found */
+             s2 = (1.0e-10 < notnullsel) ? 1.0e-10 : notnullsel;
+         }
+         else
+         {
+             /* Could be a rounding error */
+             s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
+         }
+     }
+ }
Here (-1.0e-4 * notnullsel - 1.0e-6) is just a very rough (and overly
cautious) estimate of the amount of rounding error that could be
present, even with 32-bit floats.

The above assumes that a non-DEFAULT_INEQ_SEL value in lobound/hibound
is an estimate based on stats, maybe approximate, but one that makes
sense (i.e. preserves the monotonicity of the CDF), and as such a
negative value is either a sign of a contradiction or of rounding
error.
The previous code left open the possibility of negatives coming out of
default selectivities creeping into non-DEFAULT_INEQ_SEL estimates,
but that doesn't seem possible coming out of scalarineqsel.
But I'd like it if Tom could comment on that, since he's the one that
did that in commit 547bb4a7f2bdccad9253a99211ce84b3f9de485a (way back
in 2004).
Barring that, I'd just change
s2 = DEFAULT_RANGE_INEQ_SEL;
to
s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
which makes a lot more sense.
On 3 April 2017 at 20:59, David Rowley <david.rowley@2ndquadrant.com> wrote:
Updated patch attached.
I did a little benchmarking on this to learn how planning time is affected.
Master = 9fa6e08d4a16f9b0461743cff35781e16308c106
Patched = 9fa6e08d4a16f9b0461743cff35781e16308c106 +
smarter_clausesel_2017-04-03.patch
Config:
All standard
Setup:
create table abcd (a int, b int, c int, d int, arr int[]);
Tests:
Test1: explain select * from abcd where a = 1 and b = 1 and c = 1 and d = 1;
Test2: explain select * from abcd where a >= 1 and a <= 10 and b = 1 and c = 1;
Test3: explain select * from abcd a1 inner join abcd a2 on a1.a = a2.a
where a1.b = 1 and a2.b = 1 and a1.c = 1 and a2.c = 1;
Test4: (output of) select 'explain select * from abcd where ' ||
string_agg('arr[' || x::Text || '] = 1',' AND ') from
generate_series(0,999) x;
Test5: (output of) select 'explain select * from abcd where ' ||
string_agg('arr[' || x::Text || '] > 1',' AND ') from
generate_series(0,999) x;
Test6: explain select * from abcd;
Tests were performed running pgbench -T 60 -n
Raw Results:
Master
Test 1
tps = 6993.677785 (excluding connections establishing)
tps = 7072.796544 (excluding connections establishing)
tps = 6877.428676 (excluding connections establishing)
Test2
tps = 6698.456888 (excluding connections establishing)
tps = 7117.481643 (excluding connections establishing)
tps = 7053.070973 (excluding connections establishing)
Test 3
tps = 5137.229119 (excluding connections establishing)
tps = 5091.548602 (excluding connections establishing)
tps = 5175.683112 (excluding connections establishing)
Test 4
tps = 23.816356 (excluding connections establishing)
tps = 27.069840 (excluding connections establishing)
tps = 27.398995 (excluding connections establishing)
Test 5
tps = 54.209078 (excluding connections establishing)
tps = 54.022850 (excluding connections establishing)
tps = 54.076590 (excluding connections establishing)
Test 6
tps = 9328.134781 (excluding connections establishing)
tps = 9181.898271 (excluding connections establishing)
tps = 9402.993637 (excluding connections establishing)
Patched
Test 1
tps = 6560.160644 (excluding connections establishing)
tps = 6714.294344 (excluding connections establishing)
tps = 6901.381552 (excluding connections establishing)
Test 2
tps = 6971.106747 (excluding connections establishing)
tps = 6591.363565 (excluding connections establishing)
tps = 6921.572060 (excluding connections establishing)
Test 3
tps = 4784.606871 (excluding connections establishing)
tps = 4980.609259 (excluding connections establishing)
tps = 4954.237649 (excluding connections establishing)
Test 4
tps = 19.374184 (excluding connections establishing)
tps = 19.836406 (excluding connections establishing)
tps = 19.047484 (excluding connections establishing)
(perf of test 4)
+ 40.20% 40.20% postgres postgres [.] equal.part.18
+ 29.57% 0.00% postgres [unknown] [.] 0xffffffff00000017
+ 25.15% 0.00% postgres [unknown] [.] 0x00000000ffffffff
+ 20.29% 0.00% postgres [unknown] [k] 0000000000000000
+ 12.36% 12.36% postgres postgres [.] _equalList
+ 11.25% 0.00% postgres [unknown] [.] 0x0000558ed98fb148
+ 8.53% 8.53% postgres postgres [.] _equalArrayRef
+ 6.22% 6.22% postgres postgres [.] process_equivalence
5.93% 5.93% postgres postgres [.] get_eclass_for_sort_expr
+ 2.75% 0.00% postgres [kernel.kallsyms] [k] 0xffffffffaf06b772
Test 5
tps = 51.691502 (excluding connections establishing)
tps = 51.296724 (excluding connections establishing)
tps = 51.366643 (excluding connections establishing)
Test 6
tps = 9528.363239 (excluding connections establishing)
tps = 9478.202725 (excluding connections establishing)
tps = 9346.753395 (excluding connections establishing)
Result Comparison
        Master median tps  Patched median tps  Master/Patched
Test 1  6993.7             6714.3              104.16%
Test 2  7053.1             6921.6              101.90%
Test 3  5137.2             4954.2              103.69%
Test 4  27.1               19.4                139.72%
Test 5  54.1               51.4                105.28%
Test 6  9328.1             9478.2              98.42%
Results Analyzed:
Test 1 has caused planning to slow down by 4.16%. There's quite a bit
of noise in the results, but I think this indicates there is some
overhead to adding items to the cslist and searching the cslist when
new quals are seen.

Test 2 has a lesser slowdown than test 1, as this test exercises the
existing rqlist caching in master too. Patched does a little more
work, adding the equality condition to the list as well.

Test 3 is similar to test 1.

Test 4 adds quite an overhead and causes 0.5 million comparisons to
find the expressions in the cslist.

Test 5 shows less overhead than test 4 since the master code also has
to do the expression caching and searching.

Test 6 is a control test.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 4 April 2017 at 13:35, Claudio Freire <klaussfreire@gmail.com> wrote:
Barring that, I'd just change the
s2 = DEFAULT_RANGE_INEQ_SEL;
To
s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
Which makes a lot more sense.
I think to fix this properly, the selfuncs would have to return the
estimate and nullfrac separately, so the nullfrac could be applied
just once per expression. There are already hacks in there to get rid
of the double null filtering. I'm not proposing that as something for
this patch; it would be pretty invasive to change this.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Apr 4, 2017 at 8:12 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Test 4 adds quite an overhead and causes 0.5 million comparisons to
find the expressions in the cslist.

Test 5 shows less overhead than test 4 since the master code also has
to do the expression caching and searching.
That's consistent with the operation being quadratic.

I was about to point that out, but the old code was quadratic too, so
it sucks in both versions. This version only adds other opexprs to
the list, making the quadratic cost easier to run into, but it's
nothing new.
I don't think there's an easy fix for that. You'd have to build a hash
table of expression nodes or something of the sort to avoid the
quadratic cost. If you can do that, by all means, but that's a much
bigger patch.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Apr 4, 2017 at 8:21 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
I think to fix this properly the selfuncs would have to return the
estimate, and nullfrac separately, so the nullfrac could just be
applied once per expression. There's already hacks in there to get rid
of the double null filtering. I'm not proposing that as something for
this patch. It would be pretty invasive to change this.
There's no need, you can compute the nullfrac with nulltestsel. While
there's some rework involved, it's not a big deal (it just reads the
stats tuple), and it's a clean solution.
On Tue, Apr 4, 2017 at 1:00 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
There's no need, you can compute the nullfrac with nulltestsel. While
there's some rework involved, it's not a big deal (it just reads the
stats tuple), and it's a clean solution.
I'm marking this Waiting on Author until we figure out what to do with
those issues (the null issue quoted above, and the quadratic behavior).