Misestimate when applying condition like id = id
Simplified example query-
explain
select * from organizations o where id = id;
On a PG10 instance I get-
Seq Scan on organizations o (cost=0.00..4.21 ***rows=1*** width=37)
Filter: (id = id)
On a PG11 instance with the same or very very similar data, I get-
Seq Scan on organizations o (cost=0.00..5.70 ***rows=270*** width=37)
Filter: (id IS NOT NULL)
Given this is a not null field, it translates to "is true" basically. The
real query that this example was extracted from apparently had a typo in
the join resulting in this bogus id = id condition, but I still curious
about where this estimate change comes from.
*Michael Lewis | Database Engineer*
*Entrata*
Michael Lewis <mlewis@entrata.com> writes:
Simplified example query-
explain
select * from organizations o where id = id;
On a PG10 instance I get-
Seq Scan on organizations o (cost=0.00..4.21 ***rows=1*** width=37)
Filter: (id = id)
On a PG11 instance with the same or very very similar data, I get-
Seq Scan on organizations o (cost=0.00..5.70 ***rows=270*** width=37)
Filter: (id IS NOT NULL)
Given this is a not null field, it translates to "is true" basically. The
real query that this example was extracted from apparently had a typo in
the join resulting in this bogus id = id condition, but I still curious
about where this estimate change comes from.
It's an intentional v11 change, cf
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=8ec5429e2
regards, tom lane
Thanks very much. I tested with a much bigger table and it seems that it is
using the default .005 selectivity for this integer column. Cool. Nice to
understand even if it was just a typo that had things going off the rails.
And thanks for your work 3 years ago for that commit.