Fw: [PHP] Fooling the query optimizer

Started by Adam Langover 25 years ago3 messagesgeneral

aalang@rutgersinsurance.com

over 25 years ago

On another list, someone posted this question. Are they correct, old
problem, etc.? I'll pass whatever info there is back to the originating
list.

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com
----- Original Message -----
From: "Brent R.Matzelle" <bmatzelle@yahoo.com>
To: "PostgreSQL PHP" <pgsql-php@postgresql.org>
Sent: Thursday, February 08, 2001 10:41 AM
Subject: [PHP] Fooling the query optimizer

Have any of you discovered a way to get around the current query optimizer
limitation in Postgres? For example, I have a table that has three

columns

that I want to index for frequent search duties. In Postgres I am forced

create three indicies: one including all three columns, one for col2 and
col3, and one for just col3. Databases like MySQL can use the first index
for these types of queries "SELECT * WHERE col2 = x AND col3 = y" and

"SELECT

* WHERE col3 = y". Postgres could only perform queries on indicies where

looks like "SELECT * WHERE col1 = x AND col2 = y AND col3 = z" and "SELECT

WHERE col1 = x AND col2 = y" etc. However adding extra indexes as above
would decrease the write speed on that table because a simple insert would
require an update on all three indicies.

Is there a way to fool Postgres to use the first index by creating a query
like "SELECT * WHERE col1 = * AND col3 = x"? I know I'm grasping for

straws

Show quoted text

here, but these issues can kill my database query performance.

Brent

Michael Ansley

Michael.Ansley@intec-telecom-systems.com

over 25 years ago

In reply to: Adam Lang (#1)

RE: Fw: [PHP] Fooling the query optimizer

Running this query:

"SELECT * WHERE col3 = x"

with a btree index on (col1, col2, col3) cannot be performed in an efficient
manner, in any database, because you have specified the column order to be
col1, col2, col3. If somebody claims that MySQL can do this, they're
misunderstanding the problem, and/or solution, or there's some fudging going
on by somebody.

Imagine the index to look like this:

col1 col2 col3
1 1 1
1 1 2
1 1 3
1 1 4
1 2 1
1 2 2
1 2 3
1 2 4

and a query which says "SELECT * WHERE col3 = 4". Now what order are you
going to traverse the index in? Remember that you can only use col3, and
have to binary search (btree index). If you binary split the index, then
you have one 4 in one half, and one four in another, i.e.: it's not going to
work.

If MySQL claims it can do this, then the only way that I can think that they
are doing this is by creating extra or separate indices behind the scenes,
which is inefficient, and not particularly user friendly. Of course, they
may have used GiST to create a special index for this ;-) and gotten it
working, but I doubt it. Btree indices are by far the most common with
simple data.

For any btree index, the index can be used to the point where the index
columns and the filter columns diverge, IN ORDER, e.g.: if your index is
over columns a, b, c, d, and you filter on a, b, d, e, then the index can be
used, only over columns a and b, though (not d!!). In the example cited,
the first column in the index is not used in the filter, and so the complete
index has to be ignored, i.e.: seq scan.

Cheers...

MikeA

-----Original Message-----
From: Adam Lang [mailto:aalang@rutgersinsurance.com]
Sent: 08 February 2001 16:27
To: PGSQL General
Subject: [GENERAL] Fw: [PHP] Fooling the query optimizer

On another list, someone posted this question. Are they correct, old
problem, etc.? I'll pass whatever info there is back to the originating
list.

Have any of you discovered a way to get around the current query optimizer
limitation in Postgres? For example, I have a table that has three

columns

that I want to index for frequent search duties. In Postgres I am forced

create three indicies: one including all three columns, one for col2 and
col3, and one for just col3. Databases like MySQL can use the first index
for these types of queries "SELECT * WHERE col2 = x AND col3 = y" and

"SELECT

* WHERE col3 = y". Postgres could only perform queries on indicies where

looks like "SELECT * WHERE col1 = x AND col2 = y AND col3 = z" and "SELECT

WHERE col1 = x AND col2 = y" etc. However adding extra indexes as above
would decrease the write speed on that table because a simple insert would
require an update on all three indicies.

Is there a way to fool Postgres to use the first index by creating a query
like "SELECT * WHERE col1 = * AND col3 = x"? I know I'm grasping for

straws

here, but these issues can kill my database query performance.

Brent

**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
Nick West - Global Infrastructure Manager.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

www.mimesweeper.com
**********************************************************************

Import Notes

Resolved by subject fallback

Tom Lane

tgl@sss.pgh.pa.us

over 25 years ago

In reply to: Adam Lang (#1)

Re: Fw: [PHP] Fooling the query optimizer

"Adam Lang" <aalang@rutgersinsurance.com> forwards:

In Postgres I am forced to create three indicies: one including all
three columns, one for col2 and col3, and one for just col3.

Depending on what his queries actually are, perhaps it's sufficient
to create one index on (col3,col2,col1), rather than on (col1,col2,col3)
as I presume his first index currently is. As Mike Ansley points out,
Postgres can use the first N columns of an index if all N are
constrained by a query's WHERE clause; but there is no point in looking
at index columns beyond an unconstrained column, because if you did
you'd be fighting the index order instead of being helped by it.

I think that the planner used to have some bugs that might interfere
with recognition of these partial-index-match cases, but it's been okay
with them since 7.0 for sure. To say more, we'd need to know exactly
which PG version he's running and exactly what his queries look like.

regards, tom lane