Pagination - 1 or 2 queries?

Started by CSNover 22 years ago11 messagesgeneral
Jump to latest
#1CSN
cool_screen_name90001@yahoo.com

Since you usually need to know the total number of
rows a query would return, do you think it's better
to:

a) Do one query with a LIMIT and OFFSET to get the
results, and another COUNT query to get the total
number of rows?

b) Do a single query without a LIMIT and OFFSET, then
do a seek or similiar to get at the rows you want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

#2scott.marlowe
scott.marlowe@ihs.com
In reply to: CSN (#1)
Re: Pagination - 1 or 2 queries?

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total number of
rows a query would return, do you think it's better
to:

a) Do one query with a LIMIT and OFFSET to get the
results, and another COUNT query to get the total
number of rows?

b) Do a single query without a LIMIT and OFFSET, then
do a seek or similiar to get at the rows you want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

Either way works. Does the eclipse library use a cursor, or grab the
whole dataset and then seek on the client side? If it uses a cursor, I'd
expect it to be the fastest and simplest implementation. Since a lot of
libs are designed to work with MySQL, they often are written in the first
method, where select count(*) is quite quick on MySQL, and MySQL doesn't
have cursor support.

With Postgresql, the cursor is likely to be the faster method.

#3CSN
cool_screen_name90001@yahoo.com
In reply to: scott.marlowe (#2)
Re: Pagination - 1 or 2 queries?
--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total number of
rows a query would return, do you think it's

better

to:

a) Do one query with a LIMIT and OFFSET to get the
results, and another COUNT query to get the total
number of rows?

b) Do a single query without a LIMIT and OFFSET,

then

do a seek or similiar to get at the rows you want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

Either way works. Does the eclipse library use a
cursor, or grab the
whole dataset and then seek on the client side? If
it uses a cursor, I'd
expect it to be the fastest and simplest
implementation. Since a lot of
libs are designed to work with MySQL, they often are
written in the first
method, where select count(*) is quite quick on
MySQL, and MySQL doesn't
have cursor support.

With Postgresql, the cursor is likely to be the
faster method.

Eclipse appears to just use pg_fetch_array($result,
$index). That'd be pretty similiar to a cursor
wouldn't it? i.e. only the specified rows would be
sent to the client (but all rows would be in the
server's memory).

Eclipse's docs make the argument that "b" is better
because "a" still needs to select/examine all rows
before doing the LIMIT and OFFSET.

http://www.students.cs.uu.nl/people/voostind/eclipse/api/index.html
(PagedQuery)

CSN

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

#4scott.marlowe
scott.marlowe@ihs.com
In reply to: CSN (#3)
Re: Pagination - 1 or 2 queries?

On Fri, 5 Sep 2003, CSN wrote:

--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total number of
rows a query would return, do you think it's

better

to:

a) Do one query with a LIMIT and OFFSET to get the
results, and another COUNT query to get the total
number of rows?

b) Do a single query without a LIMIT and OFFSET,

then

do a seek or similiar to get at the rows you want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

Either way works. Does the eclipse library use a
cursor, or grab the
whole dataset and then seek on the client side? If
it uses a cursor, I'd
expect it to be the fastest and simplest
implementation. Since a lot of
libs are designed to work with MySQL, they often are
written in the first
method, where select count(*) is quite quick on
MySQL, and MySQL doesn't
have cursor support.

With Postgresql, the cursor is likely to be the
faster method.

Eclipse appears to just use pg_fetch_array($result,
$index). That'd be pretty similiar to a cursor
wouldn't it? i.e. only the specified rows would be
sent to the client (but all rows would be in the
server's memory).

Eclipse's docs make the argument that "b" is better
because "a" still needs to select/examine all rows
before doing the LIMIT and OFFSET.

If they aren't explicitly declaring a cursor, then b isn't exactly the
same. If you do:

select * from table order by fieldname

then

$row = pg_fetch_array()

then the whole data set is returned to the client (i.e. php) before we can
get the row. Now, if they do:

begin;
declare bubba as cursor for select * from table order by fieldname;
move forward 100 in bubba;
fetch 5 from bubba;
rollback;

Then you get the same kind of effect, but only 5 rows have to be retrieved
from the database to the client, and pg_fetch_array will now iterate over
those 5 rows only, and then run dry, so to speak.

#5CSN
cool_screen_name90001@yahoo.com
In reply to: scott.marlowe (#4)
Re: Pagination - 1 or 2 queries?
--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

On Fri, 5 Sep 2003, CSN wrote:

--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total

number of

rows a query would return, do you think it's

better

to:

a) Do one query with a LIMIT and OFFSET to get

the

results, and another COUNT query to get the

total

number of rows?

b) Do a single query without a LIMIT and

OFFSET,

then

do a seek or similiar to get at the rows you

want?

Most tutorials, code, etc. I've seen do "a".

The

eclipse library does "b".

Either way works. Does the eclipse library use

a

cursor, or grab the
whole dataset and then seek on the client side?

If

it uses a cursor, I'd
expect it to be the fastest and simplest
implementation. Since a lot of
libs are designed to work with MySQL, they often

are

written in the first
method, where select count(*) is quite quick on
MySQL, and MySQL doesn't
have cursor support.

With Postgresql, the cursor is likely to be the
faster method.

Eclipse appears to just use

pg_fetch_array($result,

$index). That'd be pretty similiar to a cursor
wouldn't it? i.e. only the specified rows would be
sent to the client (but all rows would be in the
server's memory).

Eclipse's docs make the argument that "b" is

better

because "a" still needs to select/examine all rows
before doing the LIMIT and OFFSET.

If they aren't explicitly declaring a cursor, then b
isn't exactly the
same. If you do:

select * from table order by fieldname

then

$row = pg_fetch_array()

then the whole data set is returned to the client
(i.e. php) before we can
get the row. Now, if they do:

begin;
declare bubba as cursor for select * from table
order by fieldname;
move forward 100 in bubba;
fetch 5 from bubba;
rollback;

Then you get the same kind of effect, but only 5
rows have to be retrieved
from the database to the client, and pg_fetch_array
will now iterate over
those 5 rows only, and then run dry, so to speak.

Ah, I think you're right ;). PG would get all the
rows, then pass them all on to PHP, then scripts
access whichever ones they want.

With Mysql it looks like mysql_unbuffered_query would
avoid sending all rows to PHP.

PHP also has asynchronous methods for PG:
pg_send_query, pg_get_result, etc. Not sure if they'd
have a similiar effect, but probably better off using
a cursor at any rate :).

CSN

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

#6CSN
cool_screen_name90001@yahoo.com
In reply to: scott.marlowe (#4)
Re: Pagination - 1 or 2 queries?
--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

begin;
declare bubba as cursor for select * from table
order by fieldname;
move forward 100 in bubba;
fetch 5 from bubba;
rollback;

Then you get the same kind of effect, but only 5
rows have to be retrieved
from the database to the client, and pg_fetch_array
will now iterate over
those 5 rows only, and then run dry, so to speak.

Actually, with this method would you be able to get
the count of all rows that could be returned (not just
the 5)?

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

#7scott.marlowe
scott.marlowe@ihs.com
In reply to: CSN (#6)
Re: Pagination - 1 or 2 queries?

On Fri, 5 Sep 2003, CSN wrote:

--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

begin;
declare bubba as cursor for select * from table
order by fieldname;
move forward 100 in bubba;
fetch 5 from bubba;
rollback;

Then you get the same kind of effect, but only 5
rows have to be retrieved
from the database to the client, and pg_fetch_array
will now iterate over
those 5 rows only, and then run dry, so to speak.

Actually, with this method would you be able to get
the count of all rows that could be returned (not just
the 5)?

Yes, you can use "fetch absolute count from cursorname" should work.

#8Bruce Momjian
bruce@momjian.us
In reply to: scott.marlowe (#2)
Re: Pagination - 1 or 2 queries?

scott.marlowe wrote:

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total number of
rows a query would return, do you think it's better
to:

a) Do one query with a LIMIT and OFFSET to get the
results, and another COUNT query to get the total
number of rows?

b) Do a single query without a LIMIT and OFFSET, then
do a seek or similiar to get at the rows you want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

Either way works. Does the eclipse library use a cursor, or grab the
whole dataset and then seek on the client side? If it uses a cursor, I'd
expect it to be the fastest and simplest implementation. Since a lot of
libs are designed to work with MySQL, they often are written in the first
method, where select count(*) is quite quick on MySQL, and MySQL doesn't
have cursor support.

With Postgresql, the cursor is likely to be the faster method.

I agree --- with a LIMIT and COUNT(*), you run the query twice. With a
cursor, you run it once, and only pull the rows to the client you want.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9CSN
cool_screen_name90001@yahoo.com
In reply to: Bruce Momjian (#8)
Re: Pagination - 1 or 2 queries?

Behind the scenes, is there much performance
difference between:

SELECT *
FROM table_with_millions_of_rows
ORDER BY col1;

and:

SELECT *
FROM table_with_millions_of_rows
ORDER BY col1
LIMIT 100
OFFSET 100000;

?

Wouldn't the second query would use far less memory?

CSN

--- Bruce Momjian <pgman@candle.pha.pa.us> wrote:

scott.marlowe wrote:

On Fri, 5 Sep 2003, CSN wrote:

Since you usually need to know the total number

of

rows a query would return, do you think it's

better

to:

a) Do one query with a LIMIT and OFFSET to get

the

results, and another COUNT query to get the

total

number of rows?

b) Do a single query without a LIMIT and OFFSET,

then

do a seek or similiar to get at the rows you

want?

Most tutorials, code, etc. I've seen do "a". The
eclipse library does "b".

Either way works. Does the eclipse library use a

cursor, or grab the

whole dataset and then seek on the client side?

If it uses a cursor, I'd

expect it to be the fastest and simplest

implementation. Since a lot of

libs are designed to work with MySQL, they often

are written in the first

method, where select count(*) is quite quick on

MySQL, and MySQL doesn't

have cursor support.

With Postgresql, the cursor is likely to be the

faster method.

I agree --- with a LIMIT and COUNT(*), you run the
query twice. With a
cursor, you run it once, and only pull the rows to
the client you want.

-- 
Bruce Momjian                        | 
http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610)
359-1001
+  If your life is a hard drive,     |  13 Roberts
Road
+  Christ can be your backup.        |  Newtown
Square, Pennsylvania 19073

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

#10Bruce Momjian
bruce@momjian.us
In reply to: CSN (#9)
Re: Pagination - 1 or 2 queries?

CSN wrote:

Behind the scenes, is there much performance
difference between:

SELECT *
FROM table_with_millions_of_rows
ORDER BY col1;

and:

SELECT *
FROM table_with_millions_of_rows
ORDER BY col1
LIMIT 100
OFFSET 100000;

Yes, the optimizer knows it doesn't have to execute the entire query, so
the later can be faster, but that doesn't return the COUNT(*), as
requested.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#11scott.marlowe
scott.marlowe@ihs.com
In reply to: CSN (#6)
Re: Pagination - 1 or 2 queries?

On Fri, 5 Sep 2003, CSN wrote:

--- "scott.marlowe" <scott.marlowe@ihs.com> wrote:

begin;
declare bubba as cursor for select * from table
order by fieldname;
move forward 100 in bubba;
fetch 5 from bubba;
rollback;

Then you get the same kind of effect, but only 5
rows have to be retrieved
from the database to the client, and pg_fetch_array
will now iterate over
those 5 rows only, and then run dry, so to speak.

Actually, with this method would you be able to get
the count of all rows that could be returned (not just
the 5)?

My previous one about using absolute count was wrong, btw, so you can
either fetch forward all and get the count that returns or run select
count(*). note that if you fetch forward all on a complex query, you may
NOT be able to fetch backward all since cursors have a hard time going
backwards on complex queries.