A fairly obvious optimization?

Started by Dann Corbitabout 24 years ago5 messageshackers

DCorbit@connx.com

about 24 years ago

Considering this schema:

-- Table: cnx_ds_sis_bill_detl_tb
CREATE TABLE "cnx_ds_sis_bill_detl_tb" (
"extr_stu_id" char(10),
"term_cyt" char(5),
"subcode" char(5),
"tran_seq" int2,
"crc" int8,
CONSTRAINT "pk_cnx_ds_sis_bill_detl_tb" UNIQUE ("extr_stu_id",
"term_cyt", "subcode", "tran_seq")
);

-- Index: pk_cnx_ds_sis_bill_detl_tb
CREATE UNIQUE INDEX pk_cnx_ds_sis_bill_detl_tb ON
cnx_ds_sis_bill_detl_tb USING btree (extr_stu_id bpchar_ops, term_cyt
bpchar_ops, subcode bpchar_ops, tran_seq int2_ops);

Here is a PSQL session, where I did some simple queries:

connxdatasync=# select count(*) from cnx_ds_sis_bill_detl_tb;
count
---------
1607823
(1 row)

connxdatasync=# select min(extr_stu_id) from cnx_ds_sis_bill_detl_tb;
min
------------
000251681
(1 row)

connxdatasync=# select max(extr_stu_id) from cnx_ds_sis_bill_detl_tb;
max
------------
999999999
(1 row)

The select(min) and select(max) took as long as the table scan to find
the count. It seems logical if a btree type index is available (such
as pk_cnx_ds_sis_bill_detl_tb) where the most significant bit of the
index is the column requested, it should be little more than a seek
first or seek last in the btree. Obviously, it won't work with a hashed
index (which is neither here nor there).

hannu@tm.ee

about 24 years ago

In reply to: Dann Corbit (#1)

Re: A fairly obvious optimization?

On Wed, 2002-05-15 at 23:23, Dann Corbit wrote:

The select(min) and select(max) took as long as the table scan to find
the count. It seems logical if a btree type index is available (such
as pk_cnx_ds_sis_bill_detl_tb) where the most significant bit of the
index is the column requested, it should be little more than a seek
first or seek last in the btree. Obviously, it won't work with a hashed
index (which is neither here nor there).

The problem is postgres' extensibility -there is no hard-wired
connection between max() and b-tree indexes - you can define an
aggregate max() that returns something completely diffrent, say the
longest string length or the "best" optimisation techniqe which may or
may not be able to use an index.

------------
Hannu

Zeugswetter Andreas SB SD

ZeugswetterA@spardat.at

about 24 years ago

In reply to: Hannu Krosing (#2)

Re: A fairly obvious optimization?

The select(min) and select(max) took as long as the table scan to find
the count. It seems logical if a btree type index is available (such
as pk_cnx_ds_sis_bill_detl_tb) where the most significant bit of the
index is the column requested, it should be little more than a seek
first or seek last in the btree. Obviously, it won't work with a hashed
index (which is neither here nor there).

In the meantime you can use:
select extr_stu_id from cnx_ds_sis_bill_detl_tb order by 1 desc limit 1; -- max
select extr_stu_id from cnx_ds_sis_bill_detl_tb order by 1 asc limit 1; -- min

I guess that is the reason why nobody felt really motivated to implement
this optimization. Besides these statements are more powerful, since they can fetch
other columns from this min/max row. The down side is, that this syntax varies across
db vendors, but most (all?) have a corresponding feature nowadays.

select first 1
select top 1 ...

This is actually becoming a FAQ :-)

Andreas

Import Notes

Resolved by subject fallback

bruce@momjian.us

about 24 years ago

In reply to: Zeugswetter Andreas SB SD (#3)

Re: A fairly obvious optimization?

FAQ updated in section 4.8: My queries are slow or don't make use of the
indexes. Why?

is returned. In fact, though MAX() and MIN() don't use indexes,
it is possible to retrieve such values using an index with ORDER BY
and LIMIT:
<PRE>
SELECT col
FROM tab
ORDER BY col
LIMIT 1
</PRE>

---------------------------------------------------------------------------

Zeugswetter Andreas SB SD wrote:

The select(min) and select(max) took as long as the table scan to find
the count. It seems logical if a btree type index is available (such
as pk_cnx_ds_sis_bill_detl_tb) where the most significant bit of the
index is the column requested, it should be little more than a seek
first or seek last in the btree. Obviously, it won't work with a hashed
index (which is neither here nor there).

In the meantime you can use:
select extr_stu_id from cnx_ds_sis_bill_detl_tb order by 1 desc limit 1; -- max
select extr_stu_id from cnx_ds_sis_bill_detl_tb order by 1 asc limit 1; -- min

I guess that is the reason why nobody felt really motivated to implement
this optimization. Besides these statements are more powerful, since they can fetch
other columns from this min/max row. The down side is, that this syntax varies across
db vendors, but most (all?) have a corresponding feature nowadays.

select first 1
select top 1 ...

This is actually becoming a FAQ :-)

Andreas

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

cbbrowne@acm.org

about 24 years ago

In reply to: Bruce Momjian (#4)

Re: A fairly obvious optimization?

On Sun, 23 Jun 2002 17:16:09 EDT, the world broke into rejoicing as
Bruce Momjian <pgman@candle.pha.pa.us> said:

FAQ updated in section 4.8: My queries are slow or don't make use of the
indexes. Why?

is returned. In fact, though MAX() and MIN() don't use indexes,
it is possible to retrieve such values using an index with ORDER BY
and LIMIT:
<PRE>
SELECT col
FROM tab
ORDER BY col
LIMIT 1
</PRE>

This sounds like the sort of thing that would be really nice to be able
to automate into the query optimizer...
--
(reverse (concatenate 'string "moc.enworbbc@" "sirhc"))
http://www3.sympatico.ca/cbbrowne/spreadsheets.html
"I decry the current tendency to seek patents on algorithms. There
are better ways to earn a living than to prevent other people from
making use of one's contributions to computer science."
-- D. E. Knuth