General Bug Report: Bug in optimizer

Started by Unprivileged user <>almost 27 years ago19 messages
#1Unprivileged user <>
unprivileged_user___@unknown.user

============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Andriy I Pilipenko
Your email address : bamby@marka.net.ua

Category : runtime: back-end: SQL
Severity : serious

Summary: Bug in optimizer

System Configuration
--------------------
Operating System : FreeBSD 2.2.6, FreeBSD 3.1, Linux 2.0.36

PostgreSQL version : 6.4.2

Compiler used : gcc 2.7.2.1

Hardware:
---------
Pentium, AMD K6, 256M RAM, 64M RAM

Versions of other tools:
------------------------
gmake 3.76.1, flex 2.5.4

--------------------------------------------------------------------------

Problem Description:
--------------------
Backend forgets about indexes if WHERE clause includes
negative number. This causes great slowdown in queries
on large tables.

--------------------------------------------------------------------------

Test Case:
----------
Here is an example session. Note that in first SELECT
backend uses index scan, and in second one it uses
sequental scan.

== cut ==
bamby=> create table table1 (field1 int);
CREATE
bamby=> create index i_table1__field1 on table1 (field1);
CREATE
bamby=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

EXPLAIN
bamby=> explain select * from table1 where field1 = -1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=0.00 size=0 width=4)

EXPLAIN
== cut ==

--------------------------------------------------------------------------

Solution:
---------

--------------------------------------------------------------------------

#2Vadim Mikheev
vadim@krs.ru
In reply to: Unprivileged user <> (#1)
Re: [BUGS] General Bug Report: Bug in optimizer

Unprivileged user wrote:

PostgreSQL version : 6.4.2

...

Here is an example session. Note that in first SELECT
backend uses index scan, and in second one it uses
sequental scan.

== cut ==
bamby=> create table table1 (field1 int);
CREATE
bamby=> create index i_table1__field1 on table1 (field1);
CREATE
bamby=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

^^^^^^
Hmmm... Seems that vacuum wasn't run for table1.
Why is index used ?!!!
It's bug!

EXPLAIN
bamby=> explain select * from table1 where field1 = -1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=0.00 size=0 width=4)

Run

vacuum table1

Vadim

#3Andriy I Pilipenko
bamby@marka.net.ua
In reply to: Vadim Mikheev (#2)
Re: [BUGS] General Bug Report: Bug in optimizer

On Thu, 18 Mar 1999, Vadim Mikheev wrote:

Unprivileged user wrote:

PostgreSQL version : 6.4.2

...

Here is an example session. Note that in first SELECT
backend uses index scan, and in second one it uses
sequental scan.

== cut ==
bamby=> create table table1 (field1 int);
CREATE
bamby=> create index i_table1__field1 on table1 (field1);
CREATE
bamby=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

^^^^^^
Hmmm... Seems that vacuum wasn't run for table1.
Why is index used ?!!!
It's bug!

Why I need to vacuum immediately after creating table?

Here is another example from live system:

== cut ==

statserv=> select count(*) from ctime;
count
-----
94256
(1 row)

statserv=> explain select * from ctime where ctg=-1;
NOTICE: QUERY PLAN:

Seq Scan on ctime (cost=3646.86 size=8412 width=54)

EXPLAIN
statserv=> explain select * from ctime where ctg=1;
NOTICE: QUERY PLAN:

Index Scan using i_ctime__ctg on ctime (cost=2.05 size=2 width=54)

EXPLAIN

== cut ==

EXPLAIN
bamby=> explain select * from table1 where field1 = -1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=0.00 size=0 width=4)

Run

vacuum table1

Did it. Doesn't help.

Andriy I Pilipenko
PAI1-RIPE

#4Vadim Mikheev
vadim@krs.ru
In reply to: Andriy I Pilipenko (#3)
Re: [BUGS] General Bug Report: Bug in optimizer

Andriy I Pilipenko wrote:

Why I need to vacuum immediately after creating table?

Oh, sorry, I missed this -:)
Nevertheless, using index for

select * from table1 where field1 = 1;

is bug!

Here is another example from live system:

statserv=> explain select * from ctime where ctg=-1;
NOTICE: QUERY PLAN:

Seq Scan on ctime (cost=3646.86 size=8412 width=54)

As well as this one.
Should be fixed easy... Could someone address this? -:)

Vadim

#5Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#4)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Andriy I Pilipenko wrote:

Why I need to vacuum immediately after creating table?

Oh, sorry, I missed this -:)
Nevertheless, using index for

select * from table1 where field1 = 1;

is bug!

It is possible the new optimizer fixes this. He needs to try the new
snapshot to see.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#6Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#5)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

Andriy I Pilipenko wrote:

Why I need to vacuum immediately after creating table?

Oh, sorry, I missed this -:)
Nevertheless, using index for

select * from table1 where field1 = 1;

is bug!

It is possible the new optimizer fixes this. He needs to try the new
snapshot to see.

vac=> create table table1 (field1 int);
CREATE
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

Unfixed...

Vadim

#7Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#6)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

It is possible the new optimizer fixes this. He needs to try the new
snapshot to see.

vac=> create table table1 (field1 int);
CREATE
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

Unfixed...

Let me tell you why I don't think this is a bug. The optimizer will
choose ordered results over unordered results if the costs are the same.
In this case, the cost of the query is zero, so it chose to use the
index because the index produces an ordered result.

This works well for un-vacuumed tables, because it thinks everything is
zero cost, and chooses the index.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#8Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#7)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

Let me tell you why I don't think this is a bug. The optimizer will
choose ordered results over unordered results if the costs are the same.
In this case, the cost of the query is zero, so it chose to use the
index because the index produces an ordered result.

This works well for un-vacuumed tables, because it thinks everything is
zero cost, and chooses the index.

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

Vadim

#9Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

Let me tell you why I don't think this is a bug. The optimizer will
choose ordered results over unordered results if the costs are the same.
In this case, the cost of the query is zero, so it chose to use the
index because the index produces an ordered result.

This works well for un-vacuumed tables, because it thinks everything is
zero cost, and chooses the index.

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

Can you get on IRC now? Why are you up so late?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

So, yes, I suppose there is an inconsistency there. Zero-sized
tables(according to vacuum), use index, while tables with some data
don't use index.

How does the system know there is a row in there if you didn't run
vacuum? That confuses me.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#11Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

My guess is that the creation of the index updates the table size
statistics.

However, when I see zero size, I don't know if it is accurate, or if
someone has added rows since the last vacuum/index creation, so I think
it is correct to use an index on a zero-length table if it is
appropriate. If the size is 1, I will assume that number is accurate,
and do a sequential scan.

Does that make sense?

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#10)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

So, yes, I suppose there is an inconsistency there. Zero-sized
tables(according to vacuum), use index, while tables with some data
don't use index.

How does the system know there is a row in there if you didn't run
vacuum? That confuses me.

Create index updates ntuples & npages in pg_class...

Vadim

#13Vadim Mikheev
vadim@krs.ru
In reply to: Bruce Momjian (#11)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

My guess is that the creation of the index updates the table size
statistics.

Yes.

However, when I see zero size, I don't know if it is accurate, or if
someone has added rows since the last vacuum/index creation, so I think
it is correct to use an index on a zero-length table if it is
appropriate. If the size is 1, I will assume that number is accurate,
and do a sequential scan.

Does that make sense?

Yes. But we have to fix SeqScan for field1 = -1...

Vadim

#14Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#13)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

My guess is that the creation of the index updates the table size
statistics.

Yes.

However, when I see zero size, I don't know if it is accurate, or if
someone has added rows since the last vacuum/index creation, so I think
it is correct to use an index on a zero-length table if it is
appropriate. If the size is 1, I will assume that number is accurate,
and do a sequential scan.

Does that make sense?

Yes. But we have to fix SeqScan for field1 = -1...

Woh, I just tried it myself, and was able to reproduce it. I will check
into it now. Gee, that is very strange.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#13)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

My guess is that the creation of the index updates the table size
statistics.

Yes.

However, when I see zero size, I don't know if it is accurate, or if
someone has added rows since the last vacuum/index creation, so I think
it is correct to use an index on a zero-length table if it is
appropriate. If the size is 1, I will assume that number is accurate,
and do a sequential scan.

Does that make sense?

Yes. But we have to fix SeqScan for field1 = -1...

The basic problem is that the -1 is stored as:

{ EXPR
:typeOid 0
:opType op
:oper
{ OPER
:opno 558
:opid 0
:opresulttype 23
}

:args (
{ CONST
:consttype 23
:constlen 4
:constisnull false
:constvalue 4 [ 4 0 0 0 ]
:constbyval true
}
)

This is clearly undesirable, and causes the optimizer to think it can't
use the index.

Is this bug report for 6.4.*, or did are you running the current
development tree? I assume you are running 6.4.*, and am surprised this
did not show as a larger problem.

I will look in the grammer for a fix. This should come across as a
single -4 constant.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#16Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

I see it now. The -4 is coming over as a unary minus, and a 4. That is
OK, because the executor knows how to deal with a unary minus, but the
optimizer thinks it is a operator and a constant, which it is, but it
does not know how to index an operator with a constant.

Unary minus is probably the not only operator that can be auto-folded
into the constant. In fact, it may be valuable to auto-fold all
operator-constant pairs into just constants.

In fact, that may not be necessary. If we code so that we check that
the right-hand side is totally constants, and make the change in the
executor(if needed), we can just pass it all through. However, we need
the constant for optimizer min/max comparisons when using >, but we
could do without that if needed, so we don't have to evaluate operators
and functions outside the executor.

The quick fix may be to just make sure -4 does not use unary minus in
the parser.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#17Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#8)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Bruce Momjian wrote:

Let me tell you why I don't think this is a bug. The optimizer will
choose ordered results over unordered results if the costs are the same.
In this case, the cost of the query is zero, so it chose to use the
index because the index produces an ordered result.

This works well for un-vacuumed tables, because it thinks everything is
zero cost, and chooses the index.

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

Vadim

Fixed:

test=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

EXPLAIN
test=> explain select * from table1 where field1 = -1;
NOTICE: QUERY PLAN:

Index Scan using i_table1__field1 on table1 (cost=0.00 size=0 width=4)

The function fixing it is in gram.y:

static Node *doNegate(Node *n)
{
if (IsA(n, A_Const))
{
A_Const *con = (A_Const *)n;

if (con->val.type == T_Integer)
{
con->val.val.ival = -con->val.val.ival;
return n;
}
if (con->val.type == T_Float)
{
con->val.val.dval = -con->val.val.dval;
return n;
}
}

return makeA_Expr(OP, "-", NULL, n);
}

It tries to merge the negative into the constant. We already had
special '-' handling in the grammer, so I just call this function,
rather than doing makeA_Expr in all cases.

Committed.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#18The Hermit Hacker
scrappy@hub.org
In reply to: Bruce Momjian (#9)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

On Thu, 18 Mar 1999, Bruce Momjian wrote:

Bruce Momjian wrote:

Let me tell you why I don't think this is a bug. The optimizer will
choose ordered results over unordered results if the costs are the same.
In this case, the cost of the query is zero, so it chose to use the
index because the index produces an ordered result.

This works well for un-vacuumed tables, because it thinks everything is
zero cost, and chooses the index.

Agreed, this is ok as long as

vac=> create table table1 (field1 int);
CREATE
vac=> insert into table1 values (1);
INSERT 1583349 1
vac=> create index i_table1__field1 on table1 (field1);
CREATE
vac=> explain select * from table1 where field1 = 1;
NOTICE: QUERY PLAN:

Seq Scan on table1 (cost=1.03 size=1 width=4)

- SeqScan is used for small tables.

So, only bug reported is left.

Can you get on IRC now? Why are you up so late?

That's something we need on our globe...timezones :)

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#19Bruce Momjian
maillist@candle.pha.pa.us
In reply to: The Hermit Hacker (#18)
Re: [HACKERS] Re: [BUGS] General Bug Report: Bug in optimizer

Can you get on IRC now? Why are you up so late?

That's something we need on our globe...timezones :)

He is always 12 hours ahead of me. Here is my Vadim command:

TZ=Asia/Krasnoyarsk
export TZ
date

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026