selectivity calculation for or_clause is wrong ?

Started by Hiroshi Inoueover 26 years ago3 messages
#1Hiroshi Inoue
Inoue@tpf.co.jp

Hi all,

I have a question about the calculation of selectivity.

I see the following code in set_rest_selec() in clausesel.c.

cost_clause = clausenode->selectivity;

/*
* Check to see if the selectivity of this clause or any
'or'
* subclauses (if any) haven't been set yet.
*/
if (cost_clause <= 0 || valid_or_clause(clausenode))
{

Why is valid_or_clause(clausenode) necessary ?

This means that even if selectivity is set,set_rest_selec()
calls compute_clause_selec() if the target clause is a
valid_or_clause.
compute_clause_selec() would add the selectivity of
elements of or_clause to the current selectivity.

AFAIC,compute_clause_selec() is called twice at least
( from add_restrict_and_join_to_rel() in initsplan.c
and set_rest_selec() in clausesel.c)
and seems to accumulate the result by repetition if
the target clause is a valid_or_clause.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hiroshi Inoue (#1)
Re: [HACKERS] selectivity calculation for or_clause is wrong ?

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

Why is valid_or_clause(clausenode) necessary ?

Looks like a waste of cycles to me too.

If the subclauses of an OR could get rearranged during optimization
then this might be a necessary check, but AFAIK they don't.

regards, tom lane

#3Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Tom Lane (#2)
RE: [HACKERS] selectivity calculation for or_clause is wrong ?

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, July 21, 1999 12:32 AM
To: Hiroshi Inoue
Cc: pgsql-hackers
Subject: Re: [HACKERS] selectivity calculation for or_clause is wrong ?

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

Why is valid_or_clause(clausenode) necessary ?

Looks like a waste of cycles to me too.

It's not only a waste of cycles.
For exmaple,

1.explain select key1 from b where someitem in (1);

NOTICE: QUERY PLAN:

Seq Scan on b on b (cost=1638.49 rows=261 width=4)

2.explain select key1 from b where someitem in (1,2);

NOTICE: QUERY PLAN:

Seq Scan on b on b (cost=1638.49 rows=773 width=4)

3.explain select key1 from b where someitem in (1,2,3);

NOTICE: QUERY PLAN:

Seq Scan on b on b (cost=1638.49 rows=1274 width=4)

rows of each plan 261 : 773 : 1274 not = 1 : 2 : 3.
It's nearly = 1 :3 :5.

elements of or_clause except its first element are evaluated
twice and the results are accumlated.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp