Bug with index-usage?

Started by Sebastian Böckover 20 years ago14 messagesgeneral
Jump to latest
#1Sebastian Böck
sebastianboeck@freenet.de

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

Please see the attached script for details.

Is it a bug or some "weird feature"?

Any help appreciated to get predictibale results

Sebastian

Attachments:

test.sqltext/plain; name=test.sqlDownload
#2Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Sebastian Böck (#1)
Re: Bug with index-usage?

On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

PostgreSQL uses a cost based planner. So, it tends to not use the plan
you might expect, especially in "toy" test cases with small data sets.
I.e. why use an index to look up 10 values, when they all fit on the
same page. Just seq scan the data from the table.

Fill up your table with REAL data (or a close substitute) and test
again. Also, read up on the admin section, specifically the part on the
postgresql.conf file and what the settings in there mean, then read
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

#3Sebastian Böck
sebastianboeck@freenet.de
In reply to: Scott Marlowe (#2)
Re: Bug with index-usage?

Scott Marlowe wrote:

On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

PostgreSQL uses a cost based planner. So, it tends to not use the plan
you might expect, especially in "toy" test cases with small data sets.
I.e. why use an index to look up 10 values, when they all fit on the
same page. Just seq scan the data from the table.

Fill up your table with REAL data (or a close substitute) and test
again. Also, read up on the admin section, specifically the part on the
postgresql.conf file and what the settings in there mean, then read
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

I think you didn't test my small script or don't see the same results.

I don't speak about index-usage per se, I'm talkung about the results.

Without indices I get:

SELECT * from test WHERE typ = 'a';
id | typ | test1_id
----+-----+----------
1 | a | 1
2 | a | 2
3 | a | 3
(3 rows)

But with defined indices I get:

SELECT * from test WHERE typ = 'a';
id | typ | test1_id
----+-----+----------
(0 rows)

By the way, this is 8.1 (forgot to mention in my first mail).

Sebastian

#4Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Sebastian Böck (#1)
Re: Bug with index-usage?

On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

Also read up on vacuum, analyze, and explain analyze.

#5Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Sebastian Böck (#3)
Re: Bug with index-usage?

On Mon, 2005-11-14 at 11:25, Sebastian Böck wrote:

Scott Marlowe wrote:

On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

PostgreSQL uses a cost based planner. So, it tends to not use the plan
you might expect, especially in "toy" test cases with small data sets.
I.e. why use an index to look up 10 values, when they all fit on the
same page. Just seq scan the data from the table.

Fill up your table with REAL data (or a close substitute) and test
again. Also, read up on the admin section, specifically the part on the
postgresql.conf file and what the settings in there mean, then read
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

I think you didn't test my small script or don't see the same results.

I don't speak about index-usage per se, I'm talkung about the results.

Without indices I get:

SELECT * from test WHERE typ = 'a';
id | typ | test1_id
----+-----+----------
1 | a | 1
2 | a | 2
3 | a | 3
(3 rows)

But with defined indices I get:

SELECT * from test WHERE typ = 'a';
id | typ | test1_id
----+-----+----------
(0 rows)

By the way, this is 8.1 (forgot to mention in my first mail).

I don't get this problem in 7.4. I'll try 8.1 and get back to you.

#6Csaba Nagy
nagy@ecircle-ag.com
In reply to: Scott Marlowe (#2)
Re: Bug with index-usage?

The OP was complaining about the results of the above script, which I
could readily reproduce on a 8.1.0 installation on debian (see below).
The same select which returned 3 rows will return nothing after creating
the partial indexes, which looks as a bug to me...
I can't tell anything about why it happens, just confirm that I can
reproduce too...

Cheers,
Csaba.

cnagy=> CREATE TABLE test1 (
cnagy(> id SERIAL PRIMARY KEY,
cnagy(> name TEXT NOT NULL
cnagy(> );
NOTICE: CREATE TABLE will create implicit sequence "test1_id_seq" for
serial column "test1.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"test1_pkey" for table "test1"
CREATE TABLE
cnagy=>
cnagy=> INSERT INTO test1 (name) VALUES ('test1_1');
INSERT 0 1
cnagy=> INSERT INTO test1 (name) VALUES ('test1_2');
INSERT 0 1
cnagy=> INSERT INTO test1 (name) VALUES ('test1_3');
INSERT 0 1
cnagy=>
cnagy=> CREATE TABLE test2 (
cnagy(> id SERIAL PRIMARY KEY,
cnagy(> type TEXT NOT NULL CHECK (type IN ('a','b','c')),
cnagy(> test1_id INTEGER REFERENCES test1
cnagy(> );
NOTICE: CREATE TABLE will create implicit sequence "test2_id_seq" for
serial column "test2.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"test2_pkey" for table "test2"
CREATE TABLE
cnagy=>
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',3);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',3);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',3);
INSERT 0 1
cnagy=>
cnagy=> CREATE OR REPLACE VIEW test AS
cnagy-> SELECT test2.*
cnagy-> FROM test2
cnagy-> LEFT JOIN test2 AS t2 ON
cnagy-> test2.type IN ('c','b') AND
cnagy-> t2.type = 'a';
CREATE VIEW
cnagy=>
cnagy=> SELECT * from test WHERE type = 'a';
id | type | test1_id
----+------+----------
1 | a | 1
2 | a | 2
3 | a | 3
(3 rows)

cnagy=>
cnagy=> CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
CREATE INDEX
cnagy=> CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
CREATE INDEX
cnagy=> CREATE INDEX index_c ON test2 (id) WHERE type = 'c';
CREATE INDEX
cnagy=>
cnagy=> SET enable_seqscan TO OFF;
SET
cnagy=>
cnagy=> SELECT * from test WHERE type = 'a';
id | type | test1_id
----+------+----------
(0 rows)

Show quoted text

On Mon, 2005-11-14 at 18:17, Scott Marlowe wrote:

On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

PostgreSQL uses a cost based planner. So, it tends to not use the plan
you might expect, especially in "toy" test cases with small data sets.
I.e. why use an index to look up 10 values, when they all fit on the
same page. Just seq scan the data from the table.

Fill up your table with REAL data (or a close substitute) and test
again. Also, read up on the admin section, specifically the part on the
postgresql.conf file and what the settings in there mean, then read
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#7Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Sebastian Böck (#1)
Re: Bug with index-usage?

On 11/14/05, Sebastian Böck <sebastianboeck@freenet.de> wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

Please see the attached script for details.

Is it a bug or some "weird feature"?

Any help appreciated to get predictibale results

Sebastian

CREATE TABLE test1 (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);

INSERT INTO test1 (name) VALUES ('test1_1');
INSERT INTO test1 (name) VALUES ('test1_2');
INSERT INTO test1 (name) VALUES ('test1_3');

CREATE TABLE test2 (
id SERIAL PRIMARY KEY,
type TEXT NOT NULL CHECK (type IN ('a','b','c')),
test1_id INTEGER REFERENCES test1
);

INSERT INTO test2 (type,test1_id) VALUES ('a',1);
INSERT INTO test2 (type,test1_id) VALUES ('a',2);
INSERT INTO test2 (type,test1_id) VALUES ('a',3);
INSERT INTO test2 (type,test1_id) VALUES ('b',1);
INSERT INTO test2 (type,test1_id) VALUES ('b',2);
INSERT INTO test2 (type,test1_id) VALUES ('b',3);
INSERT INTO test2 (type,test1_id) VALUES ('c',1);
INSERT INTO test2 (type,test1_id) VALUES ('c',2);
INSERT INTO test2 (type,test1_id) VALUES ('c',3);

CREATE OR REPLACE VIEW test AS
SELECT test2.*
FROM test2
LEFT JOIN test2 AS t2 ON
test2.type IN ('c','b') AND
t2.type = 'a';

SELECT * from test WHERE type = 'a';

CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
CREATE INDEX index_c ON test2 (id) WHERE type = 'c';

SET enable_seqscan TO OFF;

SELECT * from test WHERE type = 'a';

i don't have my machine at hand but i don't think that even the select
is right, you have a join but without joining clauses you will get a
cartesian product...

what do you believe is the right answer... just for my probe later...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

#8Andreas Kretschmer
akretschmer@spamfence.net
In reply to: Sebastian Böck (#1)
Re: Bug with index-usage?

Sebastian B�ck <sebastianboeck@freenet.de> schrieb:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.
[ snipp ]

SELECT * from test WHERE type = 'a';

unfortunately, no result. What Du you expect?

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
Kaufbach, Saxony, Germany, Europe. N 51.05082�, E 13.56889�

#9Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Csaba Nagy (#6)
Re: Bug with index-usage?

On Mon, 2005-11-14 at 11:30, Csaba Nagy wrote:

The OP was complaining about the results of the above script, which I
could readily reproduce on a 8.1.0 installation on debian (see below).
The same select which returned 3 rows will return nothing after creating
the partial indexes, which looks as a bug to me...
I can't tell anything about why it happens, just confirm that I can
reproduce too...

Yep, I just reproduced it too.

In the future, I'd recommend they include the bad output, as I simply
thought "unpredictable output" was referring to performance, not the
actual data.

#10Sebastian Böck
sebastianboeck@freenet.de
In reply to: Jaime Casanova (#7)
Re: Bug with index-usage?

Jaime Casanova wrote:

On 11/14/05, Sebastian B�ck <sebastianboeck@freenet.de> wrote:

Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

Please see the attached script for details.

Is it a bug or some "weird feature"?

Any help appreciated to get predictibale results

Sebastian

CREATE TABLE test1 (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);

INSERT INTO test1 (name) VALUES ('test1_1');
INSERT INTO test1 (name) VALUES ('test1_2');
INSERT INTO test1 (name) VALUES ('test1_3');

CREATE TABLE test2 (
id SERIAL PRIMARY KEY,
type TEXT NOT NULL CHECK (type IN ('a','b','c')),
test1_id INTEGER REFERENCES test1
);

INSERT INTO test2 (type,test1_id) VALUES ('a',1);
INSERT INTO test2 (type,test1_id) VALUES ('a',2);
INSERT INTO test2 (type,test1_id) VALUES ('a',3);
INSERT INTO test2 (type,test1_id) VALUES ('b',1);
INSERT INTO test2 (type,test1_id) VALUES ('b',2);
INSERT INTO test2 (type,test1_id) VALUES ('b',3);
INSERT INTO test2 (type,test1_id) VALUES ('c',1);
INSERT INTO test2 (type,test1_id) VALUES ('c',2);
INSERT INTO test2 (type,test1_id) VALUES ('c',3);

CREATE OR REPLACE VIEW test AS
SELECT test2.*
FROM test2
LEFT JOIN test2 AS t2 ON
test2.type IN ('c','b') AND
t2.type = 'a';

SELECT * from test WHERE type = 'a';

CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
CREATE INDEX index_c ON test2 (id) WHERE type = 'c';

SET enable_seqscan TO OFF;

SELECT * from test WHERE type = 'a';

i don't have my machine at hand but i don't think that even the select
is right, you have a join but without joining clauses you will get a
cartesian product...

what do you believe is the right answer... just for my probe later...

I think it should be:

id | type | test1_id
----+------+----------
1 | a | 1
2 | a | 2
3 | a | 3

because a

EXPLAIN SELECT * from test WHERE type = 'a';

shows some weird assumptions

Index Scan using index_a on test2 (cost=0.00..4.69 rows=1 width=40)
Filter: (("type" = 'c'::text) OR ("type" = 'b'::text))

note that index_a is defined as:

CREATE INDEX index_a ON test2 (id) WHERE type = 'a';

Sebastian

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sebastian Böck (#1)
Re: Bug with index-usage?

=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:

I get unpredictibale results selecting from a view depending on
index-usage.

It's not actually *using* the indexes, although presence of the indexes
does seem to be needed to trigger the bug:

regression=# explain SELECT * from test WHERE type = 'a';
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..2.29 rows=1 width=40)
Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
-> Seq Scan on test2 (cost=0.00..1.16 rows=1 width=40)
Filter: (("type" = 'a'::text) AND (("type" = 'c'::text) OR ("type" = 'b'::text)))
-> Seq Scan on test2 t2 (cost=0.00..1.11 rows=1 width=0)
Filter: ("type" = 'a'::text)
(6 rows)

regression=# drop index index_b;
DROP INDEX
regression=# explain SELECT * from test WHERE type = 'a';
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..2.24 rows=1 width=40)
Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
-> Seq Scan on test2 (cost=0.00..1.11 rows=1 width=40)
Filter: ("type" = 'a'::text)
-> Seq Scan on test2 t2 (cost=0.00..1.11 rows=1 width=0)
Filter: ("type" = 'a'::text)
(6 rows)

It looks like the problem is that the new 8.1 OR-index-qual code is
confused about when it can apply outer-join conditions. It shouldn't be
propagating the outer-join condition into the scan condition on test2,
but it is. Will fix.

regards, tom lane

#12Sebastian Böck
sebastianboeck@freenet.de
In reply to: Tom Lane (#11)
Re: Bug with index-usage?

Tom Lane wrote:

=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:

I get unpredictibale results selecting from a view depending on
index-usage.

It's not actually *using* the indexes, although presence of the indexes
does seem to be needed to trigger the bug:

regression=# explain SELECT * from test WHERE type = 'a';
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..2.29 rows=1 width=40)
Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
-> Seq Scan on test2 (cost=0.00..1.16 rows=1 width=40)
Filter: (("type" = 'a'::text) AND (("type" = 'c'::text) OR ("type" = 'b'::text)))
-> Seq Scan on test2 t2 (cost=0.00..1.11 rows=1 width=0)
Filter: ("type" = 'a'::text)
(6 rows)

regression=# drop index index_b;
DROP INDEX
regression=# explain SELECT * from test WHERE type = 'a';
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..2.24 rows=1 width=40)
Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
-> Seq Scan on test2 (cost=0.00..1.11 rows=1 width=40)
Filter: ("type" = 'a'::text)
-> Seq Scan on test2 t2 (cost=0.00..1.11 rows=1 width=0)
Filter: ("type" = 'a'::text)
(6 rows)

It looks like the problem is that the new 8.1 OR-index-qual code is
confused about when it can apply outer-join conditions. It shouldn't be
propagating the outer-join condition into the scan condition on test2,
but it is. Will fix.

regards, tom lane

Hi,

thanks for lookin into it.

I patched my 8.1 installation with the following changes:

http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461

The error described above doesn't exist any more, but it's still
buggy. Just create a view which is left-joining to an other table. The
joined columns don't show up in the view.

CREATE OR REPLACE VIEW test_ AS
SELECT test2.*, test1.name
FROM test2
LEFT JOIN test1 ON test1.id = test2.test1_id
LEFT JOIN test2 AS t2 ON
test2.type IN ('c','b') AND
t2.type = 'a';

In 8.0 I get:

SELECT * from test WHERE type = 'a';
id | type | test1_id | name
----+------+----------+---------
1 | a | 1 | test1_1
2 | a | 2 | test1_2
3 | a | 3 | test1_3
(3 rows)

In 8.1 (with or without your patches) I get:

SELECT * from test_ WHERE type = 'a';
id | type | test1_id | name
----+------+----------+------
1 | a | 1 |
2 | a | 2 |
3 | a | 3 |
(3 rows)

Hope you could repeat the problem. Otherwise, please contact me.

Sebastian

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sebastian Böck (#12)
Re: Bug with index-usage?

=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:

I patched my 8.1 installation with the following changes:
http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461

The error described above doesn't exist any more, but it's still
buggy.

Yup, you're right :-(. Looks like we haven't been doing adequate
testing with complex OUTER JOIN clauses ...

Fix committed. Thanks for the report!

regards, tom lane

#14Sebastian Böck
sebastianboeck@freenet.de
In reply to: Tom Lane (#13)
Re: Bug with index-usage?

Tom Lane wrote:

=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:

I patched my 8.1 installation with the following changes:
http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461

The error described above doesn't exist any more, but it's still
buggy.

Yup, you're right :-(. Looks like we haven't been doing adequate
testing with complex OUTER JOIN clauses ...

Fix committed. Thanks for the report!

Thanks for the quick fix, everything looks good now!

Sebastian