pgsql: Remove item, not sure what it refers to: < * Allow ORDER BY ...

Started by Bruce Momjianabout 21 years ago6 messagescomitters
Jump to latest
#1Bruce Momjian
bruce@momjian.us

Log Message:
-----------
Remove item, not sure what it refers to:

< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
< index using a sequential scan for highest/lowest values
<
< If only one value is needed, there is no need to sort the entire
< table. Instead a sequential scan could get the matching value.
<

Modified Files:
--------------
pgsql/doc:
TODO (r1.1510 -> r1.1511)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/doc/TODO.diff?r1=1.1510&amp;r2=1.1511)
pgsql/doc/src/FAQ:
TODO.html (r1.17 -> r1.18)
(http://developer.postgresql.org/cvsweb.cgi/pgsql/doc/src/FAQ/TODO.html.diff?r1=1.17&amp;r2=1.18)

#2Kris Jurka
books@ejurka.com
In reply to: Bruce Momjian (#1)
Re: pgsql: Remove item, not sure what it refers to:

On Sat, 23 Apr 2005, Bruce Momjian wrote:

Log Message:
-----------
Remove item, not sure what it refers to:

< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
< index using a sequential scan for highest/lowest values
<
< If only one value is needed, there is no need to sort the entire
< table. Instead a sequential scan could get the matching value.
<

This is actually a suggestion from Oleg here:

http://archives.postgresql.org/pgsql-general/2002-04/msg00464.php

double min = DBL_MAX;
for (i=0; i<N; i++) {
if (data[i] < min) {
min = data[i];
}
}

Kris Jurka

#3Bruce Momjian
bruce@momjian.us
In reply to: Kris Jurka (#2)
Re: pgsql: Remove item, not sure what it refers to:

Kris Jurka wrote:

On Sat, 23 Apr 2005, Bruce Momjian wrote:

Log Message:
-----------
Remove item, not sure what it refers to:

< * Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
< index using a sequential scan for highest/lowest values
<
< If only one value is needed, there is no need to sort the entire
< table. Instead a sequential scan could get the matching value.
<

This is actually a suggestion from Oleg here:

http://archives.postgresql.org/pgsql-general/2002-04/msg00464.php

double min = DBL_MAX;
for (i=0; i<N; i++) {
if (data[i] < min) {
min = data[i];
}
}

OK, so you are saying that right now if we want ORDER BY ... LIMIT 1,
and there is no index, we sort the result then pick the high value,
rather than just doing a sequential scan and grabbing the high/low
value. Makes sense now.

Thanks, TODO item readded with a clearer description:

* Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
index using a sequential scan for highest/lowest values

Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
all values to return the high/low value. Instead The idea is to do a
sequential scan to find the high/low value, thus avoiding the sort.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#4Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#3)
Re: pgsql: Remove item, not sure what it refers to:

* Bruce Momjian (pgman@candle.pha.pa.us) wrote:

Thanks, TODO item readded with a clearer description:

* Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
index using a sequential scan for highest/lowest values

Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
all values to return the high/low value. Instead The idea is to do a
sequential scan to find the high/low value, thus avoiding the sort.

Could we take this perhaps a step further and consider things like
'LIMIT 10' and come up with an approximate point where the trade-off
exists? Actually, thinking about this a minute more perhaps there isn't
even a trade-off to be made... What you're suggesting is basically a
size-of-1 temporary memory structure for the 'sort'. Isn't there
already a memory structure used to perform the sorting though? Could it
be adjusted such that it's of a fixed size when 'LIMIT' is given, as
above?

Just some thoughts, while I think the specific 'LIMIT 1' case is
probably pretty common I think the 'LIMIT 10' or 'LIMIT 50' (or however
many you want to display on the webpage...) is a pretty common use case
too and it sounds like we could improve those too with this mechanism.

Thoughts?

Thanks,

Stephen

#5Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#4)
Re: pgsql: Remove item, not sure what it refers to:

Stephen Frost wrote:
-- Start of PGP signed section.

* Bruce Momjian (pgman@candle.pha.pa.us) wrote:

Thanks, TODO item readded with a clearer description:

* Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
index using a sequential scan for highest/lowest values

Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
all values to return the high/low value. Instead The idea is to do a
sequential scan to find the high/low value, thus avoiding the sort.

Could we take this perhaps a step further and consider things like
'LIMIT 10' and come up with an approximate point where the trade-off
exists? Actually, thinking about this a minute more perhaps there isn't
even a trade-off to be made... What you're suggesting is basically a
size-of-1 temporary memory structure for the 'sort'. Isn't there
already a memory structure used to perform the sorting though? Could it
be adjusted such that it's of a fixed size when 'LIMIT' is given, as
above?

Just some thoughts, while I think the specific 'LIMIT 1' case is
probably pretty common I think the 'LIMIT 10' or 'LIMIT 50' (or however
many you want to display on the webpage...) is a pretty common use case
too and it sounds like we could improve those too with this mechanism.

Yes, I think the final optimization will allow >1 values for LIMIT.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#6Oleg Bartunov
oleg@sai.msu.su
In reply to: Stephen Frost (#4)
Re: pgsql: Remove item, not sure what it refers to:

On Mon, 25 Apr 2005, Stephen Frost wrote:

* Bruce Momjian (pgman@candle.pha.pa.us) wrote:

Thanks, TODO item readded with a clearer description:

* Allow ORDER BY ... LIMIT 1 to select high/low value without sort or
index using a sequential scan for highest/lowest values

Right now, if no index exists, ORDER BY ... LIMIT 1 requires we sort
all values to return the high/low value. Instead The idea is to do a
sequential scan to find the high/low value, thus avoiding the sort.

Could we take this perhaps a step further and consider things like
'LIMIT 10' and come up with an approximate point where the trade-off
exists? Actually, thinking about this a minute more perhaps there isn't
even a trade-off to be made... What you're suggesting is basically a
size-of-1 temporary memory structure for the 'sort'. Isn't there
already a memory structure used to perform the sorting though? Could it
be adjusted such that it's of a fixed size when 'LIMIT' is given, as
above?

Just some thoughts, while I think the specific 'LIMIT 1' case is
probably pretty common I think the 'LIMIT 10' or 'LIMIT 50' (or however
many you want to display on the webpage...) is a pretty common use case
too and it sounds like we could improve those too with this mechanism.

It's a question of when to stop sorting, so, yes, it should be doable.

Thoughts?

Thanks,

Stephen

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83