Out of memory error on huge resultset

Started by Nick Fankhauser over 23 years ago, 61 messages
#1 Nick Fankhauser
nickf@ontko.com

I'm selecting a huge ResultSet from our database- about one million rows,
with one of the fields being varchar(500). I get an out of memory error from
java.

If the whole ResultSet gets stashed in memory, this isn't really surprising,
but I'm wondering why this happens (if it does), rather than a subset around
the current record being cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be in memory, then
my question is whether there is a better approach that people typically use
in this situation. For now, I'm simply breaking up the select into smaller
chunks, but that approach won't be satisfactory in the long run.

Thanks

-Nick

--------------------------------------------------------------------------
Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/

#2 Dave Cramer
Dave@micro-automation.net
In reply to: Nick Fankhauser (#1)
Re: Out of memory error on huge resultset

Nick,

Use a cursor. The current driver doesn't support caching; the backend
gives you everything you ask for, and you can't just say you want a limited
set.

So if you use cursors you can fetch a subset.

Dave


#3 Barry Lind
barry@xythos.com
In reply to: Nick Fankhauser (#1)
Re: Out of memory error on huge resultset

Nick,

This has been discussed before on this list many times. But the short
answer is that that is how the postgres server handles queries. If you
issue a query the server will return the entire result. (try the same
query in psql and you will have the same problem). To work around this
you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
sql commands for postgres).

thanks,
--Barry
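
A minimal sketch of the explicit-cursor approach Barry describes, driven from JDBC rather than psql. The table name, cursor name and batch size are placeholders, and it assumes an open Connection with autocommit turned off so the cursor lives inside a transaction:

    import java.sql.*;

    // Sketch only: stream a large result through a server-side cursor so that
    // at most one batch of rows is buffered in the driver at any time.
    static void scanWithCursor(Connection conn) throws SQLException {
        Statement st = conn.createStatement();
        st.execute("DECLARE big_cur CURSOR FOR SELECT * FROM a");
        int fetched;
        do {
            ResultSet rs = st.executeQuery("FETCH FORWARD 1000 FROM big_cur");
            fetched = 0;
            while (rs.next()) {
                fetched++;
                // process rs.getString(1) etc.; only ~1000 rows are held in memory at once
            }
            rs.close();
        } while (fetched > 0);
        st.execute("CLOSE big_cur");
    }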


#4 snpe
snpe@snpe.co.yu
In reply to: Barry Lind (#3)
Re: Out of memory error on huge resultset

Barry,
Is that true?
I created a table with one varchar(500) column and inserted 1 million rows of
10-20 characters each. The JDBC query 'select * from a' gets an 'out of
memory' error, but psql does not.
I have since inserted 8 million rows and psql still works (slowly, but it works).

The C library lets you execute a query without fetching; in JDBC, execute fetches all
rows, and that is the problem. I think executeQuery should only prepare the query, and
the fetch (ResultSet.next or ...) should fetch only fetchSize rows at a time.
I am not sure, but I think this is a problem with the JDBC driver, not postgresql.
Hackers?
Does psql fetch all rows, and if not, how many?
Can I change the fetch size in psql?
CURSOR, FETCH and MOVE aren't a solution:
if I use JDBC from a third-party IDE, I can't force that approach.

regards


#5 Antti Haapala
antti.haapala@iki.fi
In reply to: snpe (#4)
MySQL vs PostgreSQL.

Check out:

http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

MySQL AB compares MySQL with PostgreSQL.

Quoted from one page

Because we couldn't get vacuum() to work reliable with PostgreSQL 7.1.1,
we haven't been able to generate a --fast version of the benchmarks yet
(where we would have done a vacuum() at critical places in the benchmark
to get better performance for PostgreSQL). We will do a new run of the
benchmarks as soon as the PostgreSQL developers can point out what we
have done wrong or have fixed vacuum() so that it works again.

and from another.

Drawbacks with PostgreSQL compared to MySQL Server:

VACUUM makes PostgreSQL hard to use in a 24/7 environment.

They also state that they have more sophisticated ALTER TABLE...

The only usable feature in their ALTER TABLE that doesn't (yet) exist in
PostgreSQL was changing column order (ok, the ORDER BY in table creation
could be nice), and that's still almost purely cosmetic. Anyway, I could
have used that command yesterday. Could this be added to pgsql?

MySQL supports data compression between front and back ends. This could be
easily implemented, or is it already supported?

I think all the other statements were misleading in the sense that they
compared their newest product with PostgreSQL 7.1.1.

There's also the following line:

PostgreSQL currently offers the following advantages over MySQL Server:

After which there's only one empty line.

Note that because we know the MySQL road map, we have included in the
following table the version when MySQL Server should support this
feature. Unfortunately we couldn't do this for
previous comparisons, because we don't know the PostgreSQL roadmap.

They could be provided one... ;-)

Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
you don't need to dump/restore your data, as you have to do with most
PostgreSQL upgrades.

Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
linux box.

Of course PostgreSQL isn't yet as fast as it could be. ;)

--
Antti Haapala

#6 Rod Taylor
rbt@rbt.ca
In reply to: Antti Haapala (#5)
Re: MySQL vs PostgreSQL.

On Fri, 2002-10-11 at 09:20, Antti Haapala wrote:

Check out:

http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

MySQL AB compares MySQL with PostgreSQL.

I wouldn't look too far into these at all. I've tried to get
' " as identifier quote (ANSI SQL) ' corrected on the crash-me pages for
us a couple of times (they say we don't support it for some reason).

I've not looked, but I thought 7.1 supported rename table as well.

Anyway, the max table row length was wrong for 7.1 too, unless I'm
confused as to what a blob is (are text and varchar blobs -- what about
your own 10Mb fixed-length datatype -- how about a huge array of
integers, if the previous are considered blobs?)

--
Rod Taylor

#7 Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Antti Haapala (#5)
Re: MySQL vs PostgreSQL.

On 11 Oct 2002 at 16:20, Antti Haapala wrote:

Check out:
http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

Well, I guess there are many threads on this. You can dig around the archives.

Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
you don't need to dump/restore your data, as you have to do with most
PostgreSQL upgrades.

Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
linux box.

Well, that remains a fair point. Imagine a 100GB database on a 150GB disk array.
How do you dump and reload? In-place conversion of data is an absolutely
necessary feature, and it's already on the TODO list.

Of course PostgreSQL isn't yet as fast as it could be. ;)

Check a few posts I have made in the last three weeks. You will find that postgresql
is fast enough to surpass mysql in what are considered mysql strongholds. Of
course it's not a handy win, but for sure postgresql is not slow.

As for the vacuum thing, I have written an autovacuum daemon that can automatically
vacuum databases depending upon their activity. Check it out at
gborg.postgresql.org. (I didn't intend this as an advertisement for myself, but it
looks like one.)

Let the thread rest. Postgresql certainly needs some marketing help, but refuting the
claims in that article is not the best way to start. I guess most hackers
would agree with this.

Bye
Shridhar

--
Cat, n.: Lapwarmer with built-in buzzer.

#8 Dave Cramer
Dave@micro-automation.net
In reply to: snpe (#4)
Re: Out of memory error on huge resultset

No disadvantage, in fact that is what we would like to do.

setFetchSize(size) turns on cursor support, otherwise fetch normally

Dave
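
A purely hypothetical sketch of how application code might use the setFetchSize hint Dave suggests here; as Dave clarifies later in the thread, the driver does not implement this yet, so the batching described in the comments is the proposal, not current behaviour:

    import java.sql.*;

    // Hypothetical usage of the suggested behaviour (not implemented at this point):
    // setFetchSize(n) would make the driver pull rows from the backend n at a time
    // instead of buffering the whole result set.
    static void streamedQuery(Connection conn) throws SQLException {
        conn.setAutoCommit(false);               // a cursor could only live inside a transaction
        Statement st = conn.createStatement();
        st.setFetchSize(1000);                   // hint: keep ~1000 rows in memory at once
        ResultSet rs = st.executeQuery("SELECT * FROM a");
        while (rs.next()) {
            // rows would arrive from the backend in batches of 1000 behind the scenes
        }
        rs.close();
    }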


On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:

What would be the disadvantage of making the JDBC driver use a
cursor under the covers (always)? Is it significantly slower or more
resource-intensive than fetching all the data at once? Certainly it seems
like it would save memory in some cases.

Aaron


#9 Aaron Mulder
ammulder@alumni.princeton.edu
In reply to: Dave Cramer (#2)
Re: Out of memory error on huge resultset

What would be the disadvantage of making the JDBC driver use a
cursor under the covers (always)? Is it significantly slower or more
resource-intensive than fetching all the data at once? Certainly it seems
like it would save memory in some cases.

Aaron


#10 Greg Copeland
greg@CopelandConsulting.Net
In reply to: Antti Haapala (#5)
Re: MySQL vs PostgreSQL.

On Fri, 2002-10-11 at 08:20, Antti Haapala wrote:

Quoted from one page

Because we couldn't get vacuum() to work reliable with PostgreSQL 7.1.1,

I have little respect for the MySQL advocacy guys. They purposely
spread misinformation. They always compare their leading edge alpha
software against Postgres' year+ old stable versions. In some cases,
I've seen them compare their alpha (4.x) software against 7.0. Very sad
that these people can't even attempt to be honest.

In the case above, since they are comparing 4.x, they should be
comparing it to 7.x at least. It's also very sad that their testers
don't even seem to understand something as simple as cron. If they
can't understand something that simple, I fear any conclusions
they arrive at in their testing are destined to be incorrect/invalid.

MySQL supports data compression between front and back ends. This could be
easily implemented, or is it already supported?

Mammoth has such a feature...or at least it's been in development for a
while. If I understood them correctly, it will be donated back to core
sometime in the 7.5 or 7.7 series. Last I heard, their results were
absolutely wonderful.

I think all the other statements were misleading in the sense, that they
compared their newest product with PostgreSQL 7.1.1.

Ya, historically, they go out of their way to ensure unfair
comparisons. I have no respect for them.

They could be provided one... ;-)

In other words, they need a list of features that they can one day hope
to add to MySQL.

Upgrading MySQL Server is painless. When you are upgrading MySQL Server,
you don't need to dump/restore your data, as you have to do with most
PostgreSQL upgrades.

Ok... this is true, but not so hard - yesterday I installed 7.3b2 onto my
linux box.

Of course PostgreSQL isn't yet as fast as it could be. ;)

I consider this par for the course. This is something I've had to do
with Sybase, Oracle and MSSQL.

Greg

#11 Michael Paesold
mpaesold@gmx.at
In reply to: Aaron Mulder (#9)
Re: Out of memory error on huge resultset

Dave Cramer <Dave@micro-automation.net> wrote:

No disadvantage, in fact that is what we would like to do.

setFetchSize(size) turns on cursor support, otherwise fetch normally

Dave

On Fri, 2002-10-11 at 10:30, Aaron Mulder wrote:

What would be the disadvantage of making the JDBC driver use a
cursor under the covers (always)? Is it significantly slower or more
resource-intensive than fetching all the data at once? Certainly it seems
like it would save memory in some cases.

Aaron

Well, using a cursor-based result set *always* is not going to work. Cursors
will not be held over a commit, whereas a buffered result set will. Hence the
setFetchSize suggestion...

Regards,
Michael Paesold
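
A small sketch of the problem Michael points out, with placeholder names: once the transaction commits, the server-side cursor is gone, so a cursor-backed ResultSet could not outlive a commit the way today's fully buffered ResultSet does.

    import java.sql.*;

    // Sketch only: demonstrates that a cursor does not survive the end of its transaction.
    static void cursorClosedByCommit(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.execute("DECLARE c CURSOR FOR SELECT * FROM a");
        st.executeQuery("FETCH FORWARD 10 FROM c");   // fine inside the transaction
        conn.commit();                                // ends the transaction, closing the cursor
        st.executeQuery("FETCH FORWARD 10 FROM c");   // fails: the cursor no longer exists
    }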

#12 Dave Cramer
Dave@micro-automation.net
In reply to: Michael Paesold (#11)
Re: Out of memory error on huge resultset

Michael,

You are correct, commit will effectively close the cursor.

This is the only way to deal with large result sets however.

Dave


#13 Noname
nferrier@tapsellferrier.co.uk
In reply to: Dave Cramer (#8)
Re: Out of memory error on huge resultset

Dave Cramer <Dave@micro-automation.net> writes:

No disadvantage, in fact that is what we would like to do.

setFetchSize(size) turns on cursor support, otherwise fetch
normally

I love PostgreSQL's default behaviour: it's great to know
that a server-side cursor resource is probably not hanging around for
simple queries.

Nic

#14 Dave Cramer
Dave@micro-automation.net
In reply to: Noname (#13)
Re: Out of memory error on huge resultset

Just so you know, this isn't implemented yet. My reference to
setFetchSize earlier was just a suggestion as to how to implement it.

Dave


#15 Dave Cramer
Dave@micro-automation.net
In reply to: Aaron Mulder (#9)
Re: Out of memory error on huge resultset

Nic,

That would be great!

Dave


On Fri, 2002-10-11 at 10:57, nferrier@tapsellferrier.co.uk wrote:

Dave Cramer <Dave@micro-automation.net> writes:

Just so you know this isn't implemented yet. My reference to
setFetchSize below was just a suggestion as to how to implement it

I'll do it if you like.

Nic


#16 Jan Wieck
JanWieck@Yahoo.com
In reply to: Antti Haapala (#5)
Re: MySQL vs PostgreSQL.

Rod Taylor wrote:

On Fri, 2002-10-11 at 09:20, Antti Haapala wrote:

Check out:

http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

MySQL AB compares MySQL with PostgreSQL.

I wouldn't look too far into these at all. I've tried to get
' " as identifier quote (ANSI SQL) ' corrected on the crash-me pages for
us a couple of times (they say we don't support it for some reason).

It's once again the typical MySQL propaganda. As usual they compare a
future version of MySQL against an old release of PostgreSQL. And they
just compare at the buzzword level.
Do their foreign keys have referential actions and deferrability? Is
log-based master-slave replication all there can be?

And surely we do have something that compares to *their* roadmap. That
they cannot find it is because it's named HISTORY.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

#17 Noname
nferrier@tapsellferrier.co.uk
In reply to: Dave Cramer (#14)
Re: Out of memory error on huge resultset

Dave Cramer <Dave@micro-automation.net> writes:

Just so you know this isn't implemented yet. My reference to
setFetchSize below was just a suggestion as to how to implement it

I'll do it if you like.

Nic

#18 Doug Fields
dfields-postgres@pexicom.com
In reply to: snpe (#4)
Re: Out of memory error on huge resultset

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion rows with
length 10-20 character.JDBC query 'select * from a' get error 'out of
memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches it in a
horrible Vector (which should really be a List and which should at least
make an attempt to get the # of rows ahead of time to avoid all the
resizing problems).

Then, it doles it out from memory as you go through the ResultSet with the
next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
WHOLE THING - through the result set as each row is returned from the
backend, thus ensuring that you never use much more memory than one line.
EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does. So, it
necessitates turning EVERY SELECT query which you think has more than
a few rows (or which you think COULD have more than a few rows, "few" being
defined by our VM memory limits) into a cursor-based query. Really kludgy. I
intend to write a class to do that for every SELECT query for me automatically.

Cheers,

Doug
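
A rough sketch of the kind of helper class Doug says he intends to write (all names here are invented): it wraps an arbitrary SELECT in a server-side cursor and hands rows back one batch at a time, so the whole result never sits in the driver's buffer. It assumes the connection already has autocommit turned off.

    import java.sql.*;
    import java.util.*;

    // Hypothetical application-side wrapper: not part of the driver.
    class CursoredSelect {
        private final Statement st;
        private final String cursorName;

        CursoredSelect(Connection conn, String select, String cursorName) throws SQLException {
            this.st = conn.createStatement();
            this.cursorName = cursorName;
            st.execute("DECLARE " + cursorName + " CURSOR FOR " + select);
        }

        /** Returns the next batch of rows as string arrays; empty when the cursor is done. */
        List<String[]> nextBatch(int batchSize, int columnCount) throws SQLException {
            ResultSet rs = st.executeQuery("FETCH FORWARD " + batchSize + " FROM " + cursorName);
            List<String[]> rows = new ArrayList<String[]>();
            while (rs.next()) {
                String[] row = new String[columnCount];
                for (int i = 0; i < columnCount; i++) row[i] = rs.getString(i + 1);
                rows.add(row);
            }
            rs.close();
            return rows;
        }

        void close() throws SQLException {
            st.execute("CLOSE " + cursorName);
            st.close();
        }
    }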


#19 Dave Cramer
Dave@micro-automation.net
In reply to: Doug Fields (#18)
Re: Out of memory error on huge resultset

This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell you
how many rows you are getting back, but at least you won't have the
memory problems.

This approach is far from trivial, which is why it hasn't been
implemented as of yet. Keep in mind that result sets support things like
move(n), first(), last(), the last of which will be the trickiest - not
to mention updateable result sets.

As it turns out there is a mechanism to get to the end (MOVE 0 in a
cursor), which is currently being considered a bug.

Dave


#20 Dave Cramer
dave@fastcrypt.com
In reply to: Jan Wieck (#16)
move 0 behaviour

Currently there is a TODO list item to have MOVE 0 not position to the
end of the cursor.

Moving to the end of the cursor is useful; can we keep the behaviour and
change it to 'move end', or just leave it the way it is?

Dave
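
For illustration, a sketch with placeholder names of the behaviour being discussed: per Dave's description, MOVE 0 currently positions the cursor at the end of the result, which is what a last() implementation would want, and the TODO item would replace that with an explicit 'move end' form.

    import java.sql.*;

    // Sketch only; relies on the current (disputed) behaviour of MOVE 0 described above.
    static void jumpToEnd(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.execute("DECLARE c CURSOR FOR SELECT * FROM a");
        st.execute("MOVE 0 IN c");     // today: positions the cursor after the last row
        // a FETCH BACKWARD 1 FROM c at this point would return the final row
    }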

#21 snpe
snpe@snpe.co.yu
In reply to: Dave Cramer (#19)
Re: Out of memory error on huge resultset

Hello,
Does it mean that psql uses cursors?

regards
Haris Peco


#22 Barry Lind
barry@xythos.com
In reply to: Barry Lind (#3)
Re: Out of memory error on huge resultset

Doug Fields wrote:

It reads the entire result set from the database backend and caches it
in a horrible Vector (which should really be a List and which should at
least make an attempt to get the # of rows ahead of time to avoid all
the resizing problems).

The problem here is that we would then need two completely different
implementations for jdbc1 and jdbc2/3 since List is not part of jdk1.1.
We could build our own List implementation that works on jdk1.1, but I
am not sure the gain in performance is worth it. If you could do some
testing and come back with some numbers on the differences in
performance between ResultSets implemented with Vectors and Lists, that
would probably give us enough information to gauge how to proceed on
this suggested improvement.

Then, it doles it out from memory as you go through the ResultSet with
the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
THE WHOLE THING - through the result set as each row is returned from
the backend, thus ensuring that you never use much more memory than one
line. EVEN IF you have to keep the connection locked.

This had actually been tried in the past (just getting the records from
the server connection as requested), but this behavior violates the spec
and broke many people's applications. The problem is that if you don't
use cursors, you end up tying up the connection until you finish
fetching all rows. So code like the following no longer works:

get result set
while (rs.next()) {
get some values from the result
use them to update/insert some other table using a preparedstatement
}

Since the connection is locked until all the results are fetched, you
can't use the connection to perform the update/insert you want to do for
each iteration of the loop.

The latter is what I expected it to do. The former is what it does. So,
it necessitates you creating EVERY SELECT query which you think has more
than a few rows (or which you think COULD have more than a few rows,
"few" being defined by our VM memory limits) into a cursor based query.
Really klugy. I intend to write a class to do that for every SELECT
query for me automatically.

Cheers,

Doug

--Barry
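
A quick, throwaway micro-benchmark of the kind Barry asks for (not driver code, and the row contents are arbitrary): it times appending a million small rows to a Vector versus an ArrayList, which is roughly the buffering work the driver does per result set.

    import java.util.*;

    public class VectorVsList {
        public static void main(String[] args) {
            int n = 1000000;
            long t0 = System.currentTimeMillis();
            Vector<String[]> v = new Vector<String[]>();
            for (int i = 0; i < n; i++) v.add(new String[] { "row " + i });
            long t1 = System.currentTimeMillis();
            List<String[]> l = new ArrayList<String[]>();
            for (int i = 0; i < n; i++) l.add(new String[] { "row " + i });
            long t2 = System.currentTimeMillis();
            System.out.println("Vector:    " + (t1 - t0) + " ms");
            System.out.println("ArrayList: " + (t2 - t1) + " ms");
        }
    }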

#23 Aaron Mulder
ammulder@alumni.princeton.edu
In reply to: Dave Cramer (#19)
Re: Out of memory error on huge resultset

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron


#24 Dror Matalon
dror@zapatec.com
In reply to: Aaron Mulder (#23)
Re: Out of memory error on huge resultset

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
perspective, if I want to do last, I do a count(*) and then I do a fetch
with limit; not quite the same, but all these methods of fetching the
whole data locally and manipulating it to a large extent defeat the
purpose. Let the backend do the work, instead of trying to replicate the
functionality in JDBC.

That said I do understand that some of these are required by the JDBC 2.0
spec.

Dror
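
A sketch of the pattern Dror describes, with a placeholder table and ordering column (LIMIT/OFFSET only gives stable results when the query has an ORDER BY):

    import java.sql.*;

    // Sketch only: emulate last() by asking the backend for the row count, then fetching
    // just the final row with LIMIT/OFFSET instead of pulling the whole result locally.
    static void fetchLastRow(Connection conn) throws SQLException {
        Statement st = conn.createStatement();
        ResultSet c = st.executeQuery("SELECT count(*) FROM a");
        c.next();
        long rows = c.getLong(1);
        c.close();
        if (rows == 0) return;                       // nothing to fetch
        ResultSet last = st.executeQuery(
                "SELECT * FROM a ORDER BY id LIMIT 1 OFFSET " + (rows - 1));
        if (last.next()) {
            // this is the last row; the other (rows - 1) rows never reach the client
        }
        last.close();
    }

The backend still walks past the skipped rows to honour the OFFSET, but only the final row crosses the wire, so client memory stays small.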

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

#25 Dave Cramer
Dave@micro-automation.net
In reply to: Dror Matalon (#24)
Re: Out of memory error on huge resultset

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of caching where,
when the user sets the fetch size to 10 for instance, we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of ten ahead of where we
are. Recalling that ResultSets have absolute positioning, as well as
first() and last() positioning, we need the ability to move with the
minimum number of trips to the backend.

As it turns out the MOVE command in postgres does support moving to the
end (move 0); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end (move end).

Dave
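
A rough sketch of the window-refill step Dave envisions (cursor and table names are placeholders): the driver would keep only ~fetchSize rows cached and go back to a server-side cursor, using MOVE to skip and FETCH to pull the next window, rather than buffering everything.

    import java.sql.*;

    // Sketch only: skip over rows we don't need, then pull one window from the cursor.
    static ResultSet refillWindow(Statement st, String cursor, int rowsToSkip, int windowSize)
            throws SQLException {
        if (rowsToSkip > 0) {
            st.execute("MOVE FORWARD " + rowsToSkip + " IN " + cursor);
        }
        return st.executeQuery("FETCH FORWARD " + windowSize + " FROM " + cursor);
    }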

Show quoted text

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
prespective, if I want to do last, I do a count(*) and then I do a fetch
with limit; Not quite the same, but all these methods of fetching the
whole data locally and manipulating it to a large exten defeat the
purpose. Let the backend do the work, instead of trying to replicate the
functionality in JDBC.

That said I do understand that some of these are required by the JDBC 2.0
spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell you
how many rows you are getting back, but at least you won't have the
memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things like
move(n), first(), last(), the last of which will be the trickiest. Not
to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion rows with
length 10-20 character.JDBC query 'select * from a' get error 'out of
memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches it in a
horrible Vector (which should really be a List and which should at least
make an attempt to get the # of rows ahead of time to avoid all the
resizing problems).

Then, it doles it out from memory as you go through the ResultSet with the
next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
WHOLE THING - through the result set as each row is returned from the
backend, thus ensuring that you never use much more memory than one line.
EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does. So, it
necessitates you creating EVERY SELECT query which you think has more than
a few rows (or which you think COULD have more than a few rows, "few" being
defined by our VM memory limits) into a cursor based query. Really klugy. I
intend to write a class to do that for every SELECT query for me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute fetch all
rows
and this is problem - I think that executequery must prepare query and fetch
(ResultSet.next or ...) must fetch only fetchSize rows.
I am not sure, but I think that is problem with jdbc, not postgresql
Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But the short
answer is that that is how the postgres server handles queries. If you
issue a query the server will return the entire result. (try the same
query in psql and you will have the same problem). To work around this
you can use explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE
sql commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one million rows,
with one of the fields being varchar(500). I get an out of memory error
from java.

If the whole ResultSet gets stashed in memory, this isn't really
surprising, but I'm wondering why this happens (if it does), rather than
a subset around the current record being cached and other rows being
retrieved as needed.

If it turns out that there are good reasons for it to all be in memory,
then my question is whether there is a better approach that people
typically use in this situation. For now, I'm simply breaking up the
select into smaller chunks, but that approach won't be satisfactory in
the long run.

Thanks

-Nick

-------------------------------------------------------------------------
- Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax
1.765.962.9788 Ray Ontko & Co. Software Consulting Services
http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

#26Dave Cramer
Dave@micro-automation.net
In reply to: snpe (#21)
Re: Out of memory error on huge resultset

No,

It doesn't have to store them, only display them

Dave

Show quoted text

On Fri, 2002-10-11 at 12:48, snpe wrote:

Hello,
Does it mean that psql uses cursors ?

regards
Haris Peco
On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell you
how many rows you are getting back, but at least you won't have the
memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things like
move(n), first(), last(), the last of which will be the trickiest. Not
to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion rows with
length 10-20 character.JDBC query 'select * from a' get error 'out of
memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches it in
a horrible Vector (which should really be a List and which should at
least make an attempt to get the # of rows ahead of time to avoid all the
resizing problems).

Then, it doles it out from memory as you go through the ResultSet with
the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING THE
WHOLE THING - through the result set as each row is returned from the
backend, thus ensuring that you never use much more memory than one line.
EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does. So,
it necessitates you creating EVERY SELECT query which you think has more
than a few rows (or which you think COULD have more than a few rows,
"few" being defined by our VM memory limits) into a cursor based query.
Really klugy. I intend to write a class to do that for every SELECT query
for me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute fetch
all rows
and this is problem - I think that executequery must prepare query and
fetch (ResultSet.next or ...) must fetch only fetchSize rows.
I am not sure, but I think that is problem with jdbc, not postgresql
Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But the
short answer is that that is how the postgres server handles queries.
If you issue a query the server will return the entire result. (try
the same query in psql and you will have the same problem). To work
around this you can use explicit cursors (see the DECLARE CURSOR,
FETCH, and MOVE sql commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one million
rows, with one of the fields being varchar(500). I get an out of
memory error from java.

If the whole ResultSet gets stashed in memory, this isn't really
surprising, but I'm wondering why this happens (if it does), rather
than a subset around the current record being cached and other rows
being retrieved as needed.

If it turns out that there are good reasons for it to all be in
memory, then my question is whether there is a better approach that
people typically use in this situation. For now, I'm simply
breaking up the select into smaller chunks, but that approach won't
be satisfactory in the long run.

Thanks

-Nick

-------------------------------------------------------------------
------ - Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax
1.765.962.9788 Ray Ontko & Co. Software Consulting Services
http://www.ontko.com/


#27Doug Fields
dfields-postgres@pexicom.com
In reply to: Barry Lind (#22)
Re: Out of memory error on huge resultset

The problem here is that we would then need two completely different
implementations for jdbc1 and jdbc2/3 since List is not part of
jdk1.1. We could build our own List implementation that works on jdk1.1,
but I am not sure the gain in performance is worth it. If you could do
some testing and come back with some numbers of the differences in
performance between ResultSets implemented with Vectors and Lists that
would probably give us enough information to gauge how to proceed on this
suggested improvement.

In the past, I have done this sort of thing.

The "synchronized" overhead of a "synchronized method" is about 7 times the
overhead of a regular method call. I did many empirical tests of this on
JDK 1.3 and 1.4 on Linux (2.2 and 2.4) due to the high performance demands
of the software my firm uses. Now, that all depends on how many times you
invoke those methods and how fast they are otherwise. I'm unwilling to do
that for PostgreSQL, but I have to imagine that scrapping JDK 1.1 support
would not be a bad thing and may even be a good thing. Anyone still using
JDK 1.1 is also probably using it in conjunction with other products from
that era, so having a modern product compatible with a very out of date
product makes no sense in my estimation.
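
A minimal sketch of the kind of comparison being discussed (a hypothetical
stand-alone class, not Doug's benchmark or anything in the driver; JIT warm-up
and real row objects would change the absolute numbers):

    import java.util.*;

    // Hypothetical micro-benchmark: append n placeholder "rows" to a
    // synchronized Vector and to an unsynchronized, pre-sized ArrayList.
    public class VectorVsList {
        public static void main(String[] args) {
            int n = 1000000;
            Object row = new Object();

            long t0 = System.currentTimeMillis();
            Vector v = new Vector();
            for (int i = 0; i < n; i++) v.add(row);
            long vectorMs = System.currentTimeMillis() - t0;

            t0 = System.currentTimeMillis();
            List l = new ArrayList(n);   // pre-sizing also avoids the resizing mentioned earlier
            for (int i = 0; i < n; i++) l.add(row);
            long listMs = System.currentTimeMillis() - t0;

            System.out.println("Vector: " + vectorMs + " ms, ArrayList: " + listMs + " ms");
        }
    }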

I don't make policy, though - that seems to be your job generally, Barry. :)

This had actually been tried in the past (just getting the records from
the server connection as requested), but this behavior violates the spec
and broke many peoples applications. The problem is that if you don't use
cursors, you end up tying up the connection until you finish fetching all
rows. So code like the following no longer works:

ResultSet rs = stmt.executeQuery(query);
while (rs.next()) {
    // get some values from the result
    // use them to update/insert some other table using a PreparedStatement
}

Since the connection is locked until all the results are fetched, you
can't use the connection to perform the update/insert you want to do for
each itteration of the loop.

Agreed on this point. Nonetheless, even though this may break the spec and
should not be the default behavior, it should be an option, because the way
the driver currently works is a horror for anyone who has to deal with large
result sets (as I do on a regular basis).

I don't mind keeping a connection locked for ages. I do mind running out of
VM space for a large result set which is streamed FROM THE DATABASE SERVER.
Result sets should have the ability to be streamed end to end, IMO - even
if it's a non-standard extension or an option to the connection when
created or the statement when created.

Again, I don't make policy, and I'm insufficiently motivated to do it
myself. Don't think it invalidates my opinion, but I won't kvetch about it
either. I just designed a class which does the same thing by taking a query
and turning it into a cursor-based query.

Cheers,

Doug
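
The cursor-wrapping approach Doug describes might look roughly like the
following hypothetical sketch (the class and cursor names are invented; this is
not the class he wrote): the SELECT is declared as a server-side cursor inside
a transaction and then fetched in batches.

    import java.sql.*;

    // Hypothetical helper: wraps a SELECT in a server-side cursor so that
    // at most batchSize rows are held in memory at a time.
    public class CursorSelect {
        public static void stream(Connection con, String select, int batchSize)
                throws SQLException {
            con.setAutoCommit(false);          // DECLARE CURSOR must run inside a transaction
            Statement st = con.createStatement();
            st.execute("DECLARE big_cursor CURSOR FOR " + select);
            while (true) {
                ResultSet rs = st.executeQuery(
                    "FETCH FORWARD " + batchSize + " FROM big_cursor");
                int rows = 0;
                while (rs.next()) {
                    rows++;
                    // process rs.getString(1) etc.
                }
                rs.close();
                if (rows == 0) break;          // cursor exhausted
            }
            st.execute("CLOSE big_cursor");
            con.commit();
            st.close();
        }
    }

The obvious cost is the one noted earlier in the thread: the connection is tied
up, inside a transaction, until the cursor is closed.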

#28Dave Cramer
Dave@micro-automation.net
In reply to: Dave Cramer (#19)
Re: Out of memory error on huge resultset

AFAIK, it doesn't work unless the driver is running natively on Linux. It
used to use a native library, but I did see something that looks like a net
protocol?

dave

Show quoted text

On Fri, 2002-10-11 at 16:33, snpe wrote:

There is jxdbcon Postgresql jdbc driver with setFetchSize method.
Last version don't wokr with pgsql 7.3 and I don't test more.
I will try next day, when I download pgsql 7.2

regards
Haris Peco
On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of cacheing where
when the user set's the fetchsize to 10 for instance we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of 10 ahead of where we
are. So recalling that resultSets have absolute positioning, as well as
first(), and last() positioning we need the ability to move with the
minimum number of trips to the backend.

As it turns out the move command in postgres does support moving to the
end (move 0 ); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end ( move end )

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
prespective, if I want to do last, I do a count(*) and then I do a fetch
with limit; Not quite the same, but all these methods of fetching the
whole data locally and manipulating it to a large exten defeat the
purpose. Let the backend do the work, instead of trying to replicate the
functionality in JDBC.

That said I do understand that some of these are required by the JDBC 2.0
spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the
data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get.
In simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell
you how many rows you are getting back, but at least you won't have
the memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things
like move(n), first(), last(), the last of which will be the
trickiest. Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion
rows with length 10-20 character.JDBC query 'select * from a' get
error 'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches
it in a horrible Vector (which should really be a List and which
should at least make an attempt to get the # of rows ahead of time
to avoid all the resizing problems).

Then, it doles it out from memory as you go through the ResultSet
with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT
LOADING THE WHOLE THING - through the result set as each row is
returned from the backend, thus ensuring that you never use much
more memory than one line. EVEN IF you have to keep the connection
locked.

The latter is what I expected it to do. The former is what it does.
So, it necessitates you creating EVERY SELECT query which you think
has more than a few rows (or which you think COULD have more than a
few rows, "few" being defined by our VM memory limits) into a
cursor based query. Really klugy. I intend to write a class to do
that for every SELECT query for me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute
fetch all rows
and this is problem - I think that executequery must prepare query
and fetch (ResultSet.next or ...) must fetch only fetchSize rows.
I am not sure, but I think that is problem with jdbc, not
postgresql Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But
the short answer is that that is how the postgres server
handles queries. If you issue a query the server will return
the entire result. (try the same query in psql and you will
have the same problem). To work around this you can use
explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one
million rows, with one of the fields being varchar(500). I
get an out of memory error from java.

If the whole ResultSet gets stashed in memory, this isn't
really surprising, but I'm wondering why this happens (if it
does), rather than a subset around the current record being
cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be
in memory, then my question is whether there is a better
approach that people typically use in this situation. For
now, I'm simply breaking up the select into smaller chunks,
but that approach won't be satisfactory in the long run.

Thanks

-Nick

-------------------------------------------------------------
------------ - Nick Fankhauser nickf@ontko.com Phone
1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co.
Software Consulting Services http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


#29snpe
snpe@snpe.co.yu
In reply to: Dave Cramer (#25)
Re: Out of memory error on huge resultset

There is the jxdbcon PostgreSQL JDBC driver, which has a setFetchSize method.
The last version doesn't work with pgsql 7.3 and I haven't tested it further.
I will try again in the next day or so, when I download pgsql 7.2.

regards
Haris Peco
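
For reference, this is roughly how the setFetchSize hint is used at the JDBC
API level (a sketch only; the hint assumes driver support, and the current
PostgreSQL driver ignores it and buffers everything, which is the problem
being discussed):

    import java.sql.*;

    // Sketch of the standard JDBC fetch-size hint; table "a" as in the
    // example given earlier in the thread.
    public class FetchSizeExample {
        public static void dump(Connection con) throws SQLException {
            Statement st = con.createStatement();
            st.setFetchSize(1000);             // hint: fetch about 1000 rows per round trip
            ResultSet rs = st.executeQuery("SELECT * FROM a");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
            rs.close();
            st.close();
        }
    }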

Show quoted text

On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of cacheing where
when the user set's the fetchsize to 10 for instance we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of 10 ahead of where we
are. So recalling that resultSets have absolute positioning, as well as
first(), and last() positioning we need the ability to move with the
minimum number of trips to the backend.

As it turns out the move command in postgres does support moving to the
end (move 0 ); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end ( move end )

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
prespective, if I want to do last, I do a count(*) and then I do a fetch
with limit; Not quite the same, but all these methods of fetching the
whole data locally and manipulating it to a large exten defeat the
purpose. Let the backend do the work, instead of trying to replicate the
functionality in JDBC.

That said I do understand that some of these are required by the JDBC 2.0
spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous features
with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the
data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get.
In simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell
you how many rows you are getting back, but at least you won't have
the memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things
like move(n), first(), last(), the last of which will be the
trickiest. Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion
rows with length 10-20 character.JDBC query 'select * from a' get
error 'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches
it in a horrible Vector (which should really be a List and which
should at least make an attempt to get the # of rows ahead of time
to avoid all the resizing problems).

Then, it doles it out from memory as you go through the ResultSet
with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT
LOADING THE WHOLE THING - through the result set as each row is
returned from the backend, thus ensuring that you never use much
more memory than one line. EVEN IF you have to keep the connection
locked.

The latter is what I expected it to do. The former is what it does.
So, it necessitates you creating EVERY SELECT query which you think
has more than a few rows (or which you think COULD have more than a
few rows, "few" being defined by our VM memory limits) into a
cursor based query. Really klugy. I intend to write a class to do
that for every SELECT query for me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute
fetch all rows
and this is problem - I think that executequery must prepare query
and fetch (ResultSet.next or ...) must fetch only fetchSize rows.
I am not sure, but I think that is problem with jdbc, not
postgresql Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But
the short answer is that that is how the postgres server
handles queries. If you issue a query the server will return
the entire result. (try the same query in psql and you will
have the same problem). To work around this you can use
explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one
million rows, with one of the fields being varchar(500). I
get an out of memory error from java.

If the whole ResultSet gets stashed in memory, this isn't
really surprising, but I'm wondering why this happens (if it
does), rather than a subset around the current record being
cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be
in memory, then my question is whether there is a better
approach that people typically use in this situation. For
now, I'm simply breaking up the select into smaller chunks,
but that approach won't be satisfactory in the long run.

Thanks

-Nick

-------------------------------------------------------------
------------ - Nick Fankhauser nickf@ontko.com Phone
1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co.
Software Consulting Services http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


#30Dave Cramer
Dave@micro-automation.net
In reply to: Dave Cramer (#19)
Re: Out of memory error on huge resultset

Looking at their code, the default fetch size is 1000?

Anyway, I think there is sufficient interest in this that we should
have something running here soon.

Dave

Show quoted text

On Fri, 2002-10-11 at 17:02, snpe wrote:

I am tried with jxdbcon - it don't work with large table, too.
'out of memory' is when executeQuery()

regards
Haris Peco
On Friday 11 October 2002 10:33 pm, snpe wrote:

There is jxdbcon Postgresql jdbc driver with setFetchSize method.
Last version don't wokr with pgsql 7.3 and I don't test more.
I will try next day, when I download pgsql 7.2

regards
Haris Peco

On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of cacheing where
when the user set's the fetchsize to 10 for instance we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of 10 ahead of where we
are. So recalling that resultSets have absolute positioning, as well as
first(), and last() positioning we need the ability to move with the
minimum number of trips to the backend.

As it turns out the move command in postgres does support moving to the
end (move 0 ); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end ( move end )

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
prespective, if I want to do last, I do a count(*) and then I do a
fetch with limit; Not quite the same, but all these methods of fetching
the whole data locally and manipulating it to a large exten defeat the
purpose. Let the backend do the work, instead of trying to replicate
the functionality in JDBC.

That said I do understand that some of these are required by the JDBC
2.0 spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous
features with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the
data.

When you query the backend you get *all* of the results in the
query, and there is no indication of how many results you are going
to get. In simple selects it would be possible to get some idea by
using count(field), but this wouldn't work nearly enough times to
make it useful. So that leaves us with using cursors, which still
won't tell you how many rows you are getting back, but at least you
won't have the memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things
like move(n), first(), last(), the last of which will be the
trickiest. Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion
rows with length 10-20 character.JDBC query 'select * from a'
get error 'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor
but possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and
caches it in a horrible Vector (which should really be a List and
which should at least make an attempt to get the # of rows ahead
of time to avoid all the resizing problems).

Then, it doles it out from memory as you go through the ResultSet
with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT
LOADING THE WHOLE THING - through the result set as each row is
returned from the backend, thus ensuring that you never use much
more memory than one line. EVEN IF you have to keep the
connection locked.

The latter is what I expected it to do. The former is what it
does. So, it necessitates you creating EVERY SELECT query which
you think has more than a few rows (or which you think COULD have
more than a few rows, "few" being defined by our VM memory
limits) into a cursor based query. Really klugy. I intend to
write a class to do that for every SELECT query for me
automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute
fetch all rows
and this is problem - I think that executequery must prepare
query and fetch (ResultSet.next or ...) must fetch only
fetchSize rows. I am not sure, but I think that is problem with
jdbc, not postgresql Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But
the short answer is that that is how the postgres server
handles queries. If you issue a query the server will return
the entire result. (try the same query in psql and you will
have the same problem). To work around this you can use
explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one
million rows, with one of the fields being varchar(500). I
get an out of memory error from java.

If the whole ResultSet gets stashed in memory, this isn't
really surprising, but I'm wondering why this happens (if
it does), rather than a subset around the current record
being cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all
be in memory, then my question is whether there is a better
approach that people typically use in this situation. For
now, I'm simply breaking up the select into smaller chunks,
but that approach won't be satisfactory in the long run.

Thanks

-Nick

-----------------------------------------------------------
-- ------------ - Nick Fankhauser nickf@ontko.com Phone
1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software
Consulting Services http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

#31snpe
snpe@snpe.co.yu
In reply to: snpe (#29)
Re: Out of memory error on huge resultset

I tried with jxdbcon - it doesn't work with a large table either.
The 'out of memory' error occurs in executeQuery().

regards
Haris Peco

Show quoted text

On Friday 11 October 2002 10:33 pm, snpe wrote:

There is jxdbcon Postgresql jdbc driver with setFetchSize method.
Last version don't wokr with pgsql 7.3 and I don't test more.
I will try next day, when I download pgsql 7.2

regards
Haris Peco

On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:

Agreed, but there are selects where count(*) won't work. Even so, what
we are talking about here is hiding the implementation of cursors behind
the result set. What I would envision is some sort of cacheing where
when the user set's the fetchsize to 10 for instance we do the select,
and when they ask for next() we check to see if we have these rows in
the cache, and go get them if necessary 10 at a time, possibly keeping
one set of ten behind where we are and one set of 10 ahead of where we
are. So recalling that resultSets have absolute positioning, as well as
first(), and last() positioning we need the ability to move with the
minimum number of trips to the backend.

As it turns out the move command in postgres does support moving to the
end (move 0 ); at the moment this is considered a bug, and is on the
todo list to be removed. I expect we can get some sort of implementation
which allows us to move to the end ( move end )

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an application
prespective, if I want to do last, I do a count(*) and then I do a
fetch with limit; Not quite the same, but all these methods of fetching
the whole data locally and manipulating it to a large exten defeat the
purpose. Let the backend do the work, instead of trying to replicate
the functionality in JDBC.

That said I do understand that some of these are required by the JDBC
2.0 spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't perform
well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets would
be the most challenging; does the server provide any analogous
features with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the
data.

When you query the backend you get *all* of the results in the
query, and there is no indication of how many results you are going
to get. In simple selects it would be possible to get some idea by
using count(field), but this wouldn't work nearly enough times to
make it useful. So that leaves us with using cursors, which still
won't tell you how many rows you are getting back, but at least you
won't have the memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things
like move(n), first(), last(), the last of which will be the
trickiest. Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0 in
'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion
rows with length 10-20 character.JDBC query 'select * from a'
get error 'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor
but possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and
caches it in a horrible Vector (which should really be a List and
which should at least make an attempt to get the # of rows ahead
of time to avoid all the resizing problems).

Then, it doles it out from memory as you go through the ResultSet
with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT
LOADING THE WHOLE THING - through the result set as each row is
returned from the backend, thus ensuring that you never use much
more memory than one line. EVEN IF you have to keep the
connection locked.

The latter is what I expected it to do. The former is what it
does. So, it necessitates you creating EVERY SELECT query which
you think has more than a few rows (or which you think COULD have
more than a few rows, "few" being defined by our VM memory
limits) into a cursor based query. Really klugy. I intend to
write a class to do that for every SELECT query for me
automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute
fetch all rows
and this is problem - I think that executequery must prepare
query and fetch (ResultSet.next or ...) must fetch only
fetchSize rows. I am not sure, but I think that is problem with
jdbc, not postgresql Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But
the short answer is that that is how the postgres server
handles queries. If you issue a query the server will return
the entire result. (try the same query in psql and you will
have the same problem). To work around this you can use
explicit cursors (see the DECLARE CURSOR, FETCH, and MOVE sql
commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one
million rows, with one of the fields being varchar(500). I
get an out of memory error from java.

If the whole ResultSet gets stashed in memory, this isn't
really surprising, but I'm wondering why this happens (if
it does), rather than a subset around the current record
being cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all
be in memory, then my question is whether there is a better
approach that people typically use in this situation. For
now, I'm simply breaking up the select into smaller chunks,
but that approach won't be satisfactory in the long run.

Thanks

-Nick

-----------------------------------------------------------
-- ------------ - Nick Fankhauser nickf@ontko.com Phone
1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software
Consulting Services http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


#32Jeff Davis
list-pgsql-hackers@empires.org
In reply to: Antti Haapala (#5)
Re: MySQL vs PostgreSQL.

They also state that they have more sophisticated ALTER TABLE...

Only usable feature in their ALTER TABLE that doesn't (yet) exist in
PostgreSQL was changing column order (ok, the order by in table creation
could be nice), and that's still almost purely cosmetic. Anyway, I could
have used that command yesterday. Could this be added to pgsql.

I agree with your message except for that statement. MySQL's ALTER TABLE
provides the ability to change column types and cast the existing records
automatically. I remember that feature as really the only thing from MySQL
that I've ever missed.

Of course, it's not that wonderful in theory. During development you can
easily drop/recreate the tables and reload the test data; during production
you don't change the data types of your attributes.

But in practice, during development it's handy sometimes.

Regards,
Jeff

#33scott.marlowe
scott.marlowe@ihs.com
In reply to: Jeff Davis (#32)
Re: MySQL vs PostgreSQL.

On Fri, 11 Oct 2002, Jeff Davis wrote:

They also state that they have more sophisticated ALTER TABLE...

Only usable feature in their ALTER TABLE that doesn't (yet) exist in
PostgreSQL was changing column order (ok, the order by in table creation
could be nice), and that's still almost purely cosmetic. Anyway, I could
have used that command yesterday. Could this be added to pgsql.

I agree with your message except for that statement. MySQL alter table
provides the ability to change column types and cast the records
automatically. I remember that feature as really the only thing from MySQL
that I've ever missed.

Of course, it's not that wonderful in theory. During development you can
easily drop/recreate the tables and reload the test data; during production
you don't change the data types of your attributes.

But in practice, during development it's handy sometimes.

I still remember a post from somebody on the phpbuilder site who had
changed a field from varchar to date, and all the dates he had got changed
to 0000-00-00.

He was most unimpressed, especially since he (being typical of a lot of MySQL
users) didn't have a backup.

#34snpe
snpe@snpe.co.yu
In reply to: Dave Cramer (#30)
Re: Out of memory error on huge resultset

I tested Oracle JDeveloper, and the JDBC driver for PostgreSQL works fine now.
Meanwhile, for production systems I have to have setFetchSize for large tables;
I think it is the same with any Java IDE.

The best solution would be to get only n rows from the backend at a time, but I
don't know whether that is possible.
regards
Haris Peco
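
One way to get only n rows at a time from the backend without cursors is to
page with LIMIT/OFFSET, sketched below (a hypothetical helper; note that OFFSET
still makes the server scan and discard the skipped rows, so later pages get
progressively slower):

    import java.sql.*;

    // Hypothetical pager: fetches pageSize rows per query from table "a".
    public class LimitOffsetPager {
        public static void page(Connection con, int pageSize) throws SQLException {
            Statement st = con.createStatement();
            int offset = 0;
            while (true) {
                ResultSet rs = st.executeQuery(
                    "SELECT * FROM a ORDER BY 1 LIMIT " + pageSize + " OFFSET " + offset);
                int rows = 0;
                while (rs.next()) {
                    rows++;
                    // process rs.getString(1) ...
                }
                rs.close();
                if (rows < pageSize) break;    // last page reached
                offset += pageSize;
            }
            st.close();
        }
    }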

Show quoted text

On Friday 11 October 2002 10:38 pm, Dave Cramer wrote:

Looking at their code, default fetch size is 1000?

Anyways, I think there is sufficient interest in this that we should
have something running soon here

Dave

On Fri, 2002-10-11 at 17:02, snpe wrote:

I am tried with jxdbcon - it don't work with large table, too.
'out of memory' is when executeQuery()

regards
Haris Peco

On Friday 11 October 2002 10:33 pm, snpe wrote:

There is jxdbcon Postgresql jdbc driver with setFetchSize method.
Last version don't wokr with pgsql 7.3 and I don't test more.
I will try next day, when I download pgsql 7.2

regards
Haris Peco

On Friday 11 October 2002 07:59 pm, Dave Cramer wrote:

Agreed, but there are selects where count(*) won't work. Even so,
what we are talking about here is hiding the implementation of
cursors behind the result set. What I would envision is some sort of
cacheing where when the user set's the fetchsize to 10 for instance
we do the select, and when they ask for next() we check to see if we
have these rows in the cache, and go get them if necessary 10 at a
time, possibly keeping one set of ten behind where we are and one set
of 10 ahead of where we are. So recalling that resultSets have
absolute positioning, as well as first(), and last() positioning we
need the ability to move with the minimum number of trips to the
backend.

As it turns out the move command in postgres does support moving to
the end (move 0 ); at the moment this is considered a bug, and is on
the todo list to be removed. I expect we can get some sort of
implementation which allows us to move to the end ( move end )

Dave

On Fri, 2002-10-11 at 13:12, Dror Matalon wrote:

Hi,

I'm jumping in late into this discussion but ...

In my mind a lot of these features break the model. From an
application prespective, if I want to do last, I do a count(*) and
then I do a fetch with limit; Not quite the same, but all these
methods of fetching the whole data locally and manipulating it to a
large exten defeat the purpose. Let the backend do the work,
instead of trying to replicate the functionality in JDBC.

That said I do understand that some of these are required by the
JDBC 2.0 spec.

Dror

On Fri, Oct 11, 2002 at 01:05:37PM -0400, Aaron Mulder wrote:

It wouldn't be bad to start with a naive implementation of
last()... If the only problem we have is that last() doesn't
perform well, we're probably making good progress. :)
On the other hand, I would think the updateable result sets
would be the most challenging; does the server provide any
analogous features with its cursors?

Aaron

On 11 Oct 2002, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us
the data.

When you query the backend you get *all* of the results in the
query, and there is no indication of how many results you are
going to get. In simple selects it would be possible to get
some idea by using count(field), but this wouldn't work nearly
enough times to make it useful. So that leaves us with using
cursors, which still won't tell you how many rows you are
getting back, but at least you won't have the memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support
things like move(n), first(), last(), the last of which will be
the trickiest. Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end move 0
in 'cursor', which currently is being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1
milion rows with length 10-20 character.JDBC query 'select
* from a' get error 'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but
work)

The way the code works in JDBC is, in my opinion, a little
poor but possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and
caches it in a horrible Vector (which should really be a List
and which should at least make an attempt to get the # of
rows ahead of time to avoid all the resizing problems).

Then, it doles it out from memory as you go through the
ResultSet with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT
LOADING THE WHOLE THING - through the result set as each row
is returned from the backend, thus ensuring that you never
use much more memory than one line. EVEN IF you have to keep
the connection locked.

The latter is what I expected it to do. The former is what it
does. So, it necessitates you creating EVERY SELECT query
which you think has more than a few rows (or which you think
COULD have more than a few rows, "few" being defined by our
VM memory limits) into a cursor based query. Really klugy. I
intend to write a class to do that for every SELECT query for
me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc
execute fetch all rows
and this is problem - I think that executequery must prepare
query and fetch (ResultSet.next or ...) must fetch only
fetchSize rows. I am not sure, but I think that is problem
with jdbc, not postgresql Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this
solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times.
But the short answer is that that is how the postgres
server handles queries. If you issue a query the server
will return the entire result. (try the same query in
psql and you will have the same problem). To work around
this you can use explicit cursors (see the DECLARE
CURSOR, FETCH, and MOVE sql commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about
one million rows, with one of the fields being
varchar(500). I get an out of memory error from java.

If the whole ResultSet gets stashed in memory, this
isn't really surprising, but I'm wondering why this
happens (if it does), rather than a subset around the
current record being cached and other rows being
retrieved as needed.

If it turns out that there are good reasons for it to
all be in memory, then my question is whether there is
a better approach that people typically use in this
situation. For now, I'm simply breaking up the select
into smaller chunks, but that approach won't be
satisfactory in the long run.

Thanks

-Nick

-------------------------------------------------------
---- -- ------------ - Nick Fankhauser nickf@ontko.com
Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko &
Co. Software Consulting Services http://www.ontko.com/


--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

#35snpe
snpe@snpe.co.yu
In reply to: Dave Cramer (#26)
Re: Out of memory error on huge resultset

Can you do this:
We save the first 1000 (or fetchSize) rows from the beginning.
If the table has fewer than 1000 rows we save them all, but if the table has more
rows and the user requests row 1001, we fetch the next 1000 (again from the
beginning, skipping the first 1000 rows, or maybe just continue fetching, if that
is possible).
When the user requests last() we fetch all the rows, but save only the last 1000, etc.

We keep only fetchSize rows and seek from the beginning when the user moves
backward (or maybe seek whenever the user requests a row outside our 'fetchSize' window).

This is slow for large tables, but it is a solution until the developers give us a
better solution from the backend. If the table has fewer than fetchSize rows this is
the same as the current solution, and we can set a minimum fetchSize for better
performance with small tables.

regards
Haris Peco
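
A rough sketch of the windowing idea described above (a hypothetical helper
with invented names, not driver code; it is meant to illustrate the logic as it
would have to live inside the driver, since calling executeQuery from
application code still makes the current driver buffer everything):

    import java.sql.*;
    import java.util.*;

    // Hypothetical window cache: keeps at most fetchSize rows in memory and
    // re-runs the query, skipping rows, when a request falls outside the window.
    public class WindowedRows {
        private final Connection con;
        private final String query;
        private final int fetchSize;
        private int windowStart = -1;        // absolute index of the first cached row
        private final List window = new ArrayList();

        public WindowedRows(Connection con, String query, int fetchSize) {
            this.con = con;
            this.query = query;
            this.fetchSize = fetchSize;
        }

        public Object get(int row) throws SQLException {
            if (windowStart < 0 || row < windowStart || row >= windowStart + window.size()) {
                loadWindow(row);             // slow path: seek from the beginning
            }
            int idx = row - windowStart;
            return idx < window.size() ? window.get(idx) : null;   // null if past the end
        }

        private void loadWindow(int firstRow) throws SQLException {
            window.clear();
            windowStart = firstRow;
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery(query);
            int i = 0;
            while (rs.next() && window.size() < fetchSize) {
                if (i++ >= firstRow) {
                    window.add(rs.getObject(1));   // cache the first column only, for brevity
                }
            }
            rs.close();
            st.close();
        }
    }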

Show quoted text

On Friday 11 October 2002 08:13 pm, Dave Cramer wrote:

No,

It doesn't have to store them, only display them

Dave

On Fri, 2002-10-11 at 12:48, snpe wrote:

Hello,
Does it mean that psql uses cursors ?

regards
Haris Peco

On Friday 11 October 2002 05:58 pm, Dave Cramer wrote:

This really is an artifact of the way that postgres gives us the data.

When you query the backend you get *all* of the results in the query,
and there is no indication of how many results you are going to get. In
simple selects it would be possible to get some idea by using
count(field), but this wouldn't work nearly enough times to make it
useful. So that leaves us with using cursors, which still won't tell
you how many rows you are getting back, but at least you won't have the
memory problems.

This approach is far from trivial which is why it hasn't been
implemented as of yet, keep in mind that result sets support things
like move(n), first(), last(), the last of which will be the trickiest.
Not to mention updateable result sets.

As it turns out there is a mechanism to get to the end: MOVE 0 in
'cursor', which is currently being considered a bug.

Dave

On Fri, 2002-10-11 at 11:44, Doug Fields wrote:

At 08:27 AM 10/11/2002, snpe wrote:

Barry,
Is it true ?
I create table with one column varchar(500) and enter 1 milion rows
with length 10-20 character.JDBC query 'select * from a' get error
'out of memory', but psql not.
I insert 8 milion rows and psql work fine yet (slow, but work)

The way the code works in JDBC is, in my opinion, a little poor but
possibly mandated by JDBC design specs.

It reads the entire result set from the database backend and caches
it in a horrible Vector (which should really be a List and which
should at least make an attempt to get the # of rows ahead of time to
avoid all the resizing problems).

Then, it doles it out from memory as you go through the ResultSet
with the next() method.

I would have hoped (but was wrong) that it streamed - WITHOUT LOADING
THE WHOLE THING - through the result set as each row is returned from
the backend, thus ensuring that you never use much more memory than
one line. EVEN IF you have to keep the connection locked.

The latter is what I expected it to do. The former is what it does.
So, it necessitates you creating EVERY SELECT query which you think
has more than a few rows (or which you think COULD have more than a
few rows, "few" being defined by our VM memory limits) into a cursor
based query. Really klugy. I intend to write a class to do that for
every SELECT query for me automatically.

Cheers,

Doug

In C library is 'execute query' without fetch - in jdbc execute
fetch all rows
and this is problem - I think that executequery must prepare query
and fetch (ResultSet.next or ...) must fetch only fetchSize rows. I
am not sure, but I think that is problem with jdbc, not postgresql
Hackers ?
Does psql fetch all rows and if not how many ?
Can I change fetch size in psql ?
CURSOR , FETCH and MOVE isn't solution.
If I use jdbc in third-party IDE, I can't force this solution

regards

On Thursday 10 October 2002 06:40 pm, Barry Lind wrote:

Nick,

This has been discussed before on this list many times. But the
short answer is that that is how the postgres server handles
queries. If you issue a query the server will return the entire
result. (try the same query in psql and you will have the same
problem). To work around this you can use explicit cursors (see
the DECLARE CURSOR, FETCH, and MOVE sql commands for postgres).

thanks,
--Barry

Nick Fankhauser wrote:

I'm selecting a huge ResultSet from our database- about one
million rows, with one of the fields being varchar(500). I get
an out of memory error from java.

If the whole ResultSet gets stashed in memory, this isn't
really surprising, but I'm wondering why this happens (if it
does), rather than a subset around the current record being
cached and other rows being retrieved as needed.

If it turns out that there are good reasons for it to all be in
memory, then my question is whether there is a better approach
that people typically use in this situation. For now, I'm
simply breaking up the select into smaller chunks, but that
approach won't be satisfactory in the long run.

Thanks

-Nick

--------------------------------------------------------------------------
Nick Fankhauser nickf@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/


#36Mike Mascari
mascarm@mascari.com
In reply to: scott.marlowe (#33)
Re: MySQL vs PostgreSQL.

scott.marlowe wrote:

On Fri, 11 Oct 2002, Jeff Davis wrote:

I agree with your message except for that statement. MySQL alter table
provides the ability to change column types and cast the records
automatically. I remember that feature as really the only thing from MySQL
that I've ever missed.

Of course, it's not that wonderful in theory. During development you can
easily drop/recreate the tables and reload the test data; during production
you don't change the data types of your attributes.

But in practice, during development it's handy sometimes.

I still remember a post from somebody on the phpbuilder site that had
changed a field from varchar to date and all the dates he had got changed
to 0000-00-00.

He was most unimpressed, especially since he (being typical of a lot of MySQL
users) didn't have a backup.

Couldn't he just do ROLLBACK? ;-)

(for the humor impaired, that's a joke...)

Mike Mascari
mascarm@mascari.com

#37Jeff Davis
list-pgsql-hackers@empires.org
In reply to: scott.marlowe (#33)
Re: MySQL vs PostgreSQL.

I still remember a post from somebody on the phpbuilder site that had
changed a field from varchar to date and all the dates he had got changed
to 0000-00-00.

He was most unimpressed, especially since he (being typical of a lot of MySQL
users) didn't have a backup.

Ah, yes. Classic.

I was talking about a development scenario. Anyone who changes a huge amount
of important data to a new form without a clearly defined algorithm is not
making a wise choice. That's kind of like if you have a perl script operating
on an important file: you don't want it to just kill all your data, so you do
a few tests first.

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which is
mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
that a feature does something safely, and it kills all your data.

So, you're right. It's probably better that it's never implemented.

Regards,
Jeff

#38Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Jeff Davis (#37)
Re: MySQL vs PostgreSQL.

On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which is
mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
that a feature does something safely, and it kills all your data.

Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
any more the need to do such frequent drop/create of tables.

And things just keep getting better and better. This is really amazing.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"We are who we choose to be", sang the goldfinch
when the sun is high (Sandman)

#39Jeff Davis
list-pgsql-hackers@empires.org
In reply to: Alvaro Herrera (#38)
Re: MySQL vs PostgreSQL.

Oh yes, I agree. ALTER TABLE ... DROP COLUMN helps out a lot. I actually don't
use it for much yet because 7.3 is still in beta. However, I certainly can't
complain to the developers about it, since it's already developed :)

I am consistently amazed by every minor version release. If postgres had a
marketing team it would be at version 37.3 by now. In my last email I agreed
with Scott Marlowe that postgres is better off without casting an entire
column in place, since that's kind of a dangerous procedure and can be done
in a round-about (read: explicit) way in postgres anyway, one that doesn't
lose your data until after you've had a chance to look at the new values.

Regards,
Jeff


On Friday 11 October 2002 07:16 pm, you wrote:

On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which
is mildly annoying. Certainly not as bad, I suppose, as if you're led to
believe that a feature does something safely, and it kills all your data.

Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
any more the need to do such frequent drop/create of tables.

And things just keep getting better and better. This is really amazing.

#40Hannu Krosing
hannu@tm.ee
In reply to: Alvaro Herrera (#38)
Re: MySQL vs PostgreSQL.

Alvaro Herrera kirjutas L, 12.10.2002 kell 04:16:

On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which is
mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
that a feature does something safely, and it kills all your data.

Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
any more the need to do such frequent drop/create of tables.

Did attlognum's (for changing column order) get implemented for 7.2 ?

------------
Hannu

#41Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Hannu Krosing (#40)
Re: MySQL vs PostgreSQL.

Hannu Krosing wrote:

Alvaro Herrera kirjutas L, 12.10.2002 kell 04:16:

On Fri, Oct 11, 2002 at 07:08:18PM -0700, Jeff Davis wrote:

And it really is a minor matter of convenience. I end up dropping and
recreating all my tables a lot in the early stages of development, which is
mildly annoying. Certainly not as bad, I suppose, as if you're led to believe
that a feature does something safely, and it kills all your data.

Now that ALTER TABLE DROP COLUMN is implemented, there probably isn't
any more the need to do such frequent drop/create of tables.

Did attlognum's (for changing column order) get implemented for 7.2 ?

No, changing column order isn't even on the TODO list.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#42Darko Prenosil
darko.prenosil@finteh.hr
In reply to: Shridhar Daithankar (#7)
Re: [HACKERS] MySQL vs PostgreSQL.

On Saturday 12 October 2002 09:02, Shridhar Daithankar wrote:

On 12 Oct 2002 at 11:36, Darko Prenosil wrote:

On Friday 11 October 2002 12:38, Shridhar Daithankar wrote:

On 11 Oct 2002 at 16:20, Antti Haapala wrote:

Check out:
http://www.mysql.com/doc/en/MySQL-PostgreSQL_features.html

Well, I guess there are many threads on this. You can dig around
archives..

Upgrading MySQL Server is painless. When you are upgrading MySQL
Server, you don't need to dump/restore your data, as you have to do
with most PostgreSQL upgrades.

Ok... this is true, but not so hard - yesterday I installed 7.3b2
onto my linux box.

Well, that remains as a point. Imagine a 100GB database on a 150GB disk
array. How do you dump and reload? In-place conversion of data is an
absolutely necessary feature, and it's already on TODO.

From PostgreSQL 7.3 Documentation :

Use compressed dumps. Use your favorite compression program, for example
gzip. pg_dump dbname | gzip > filename.gz

Yes, but that may not be enough. Stretch the situation: a 300GB database, 350GB
space. GZip can't compress better than 3:1. And don't think it's
imagination. I am preparing a database of 600GB in near future. Don't want
to provide 1TB of space to include redump.

Where do you store your regular backups (the ones you keep for safety, not
for version changes)? Or are you not doing backups at all???

#43Shridhar Daithankar
shridhar_daithankar@persistent.co.in
In reply to: Darko Prenosil (#42)
Re: [HACKERS] MySQL vs PostgreSQL.

On 12 Oct 2002 at 17:58, Darko Prenosil wrote:

On Saturday 12 October 2002 09:02, Shridhar Daithankar wrote:

Yes, but that may not be enough. Stretch the situation: a 300GB database, 350GB
space. GZip can't compress better than 3:1. And don't think it's
imagination. I am preparing a database of 600GB in near future. Don't want
to provide 1TB of space to include redump.

Where do you store your regular backups (the ones you keep for safety, not
for version changes)? Or are you not doing backups at all???

No regular backups. Data gets recycled at fixed intervals; it's not stored
permanently anyway. And this is not a single-machine database. It's a cluster
with redundant components like RAID etc., so the risk goes further down..

Lucky me.. I didn't have to devise a regular backup scheme for such a database..

Bye
Shridhar

--
Real Time, adj.: Here and now, as opposed to fake time, which only occurs there
and then.

#44Dave Tenny
tenny@attbi.com
In reply to: snpe (#35)
Re: [JDBC] Out of memory error on huge resultset

To work
around this you can use explicit cursors (see the DECLARE CURSOR,
FETCH, and MOVE sql commands for postgres).

I'm unable to get this to work using the default distribution JDBC driver.
(7.2). Here's a code snippet

conn.setAutoCommit(false) ;
stmt.execute("BEGIN") ;
stmt.execute("DECLARE mysursor CURSOR FOR SELECT icol FROM mtable") ;
ResultSet rs = null ;
if (stmt.execute("FETCH 10000 IN mycursor"))
rs = stmt.getResultSet() ;

The FETCH statement returns an update count of 1, but no ResultSet.
If I try executeQuery, a "no rows found" exception is thrown.

Equivalent code in the C library interface works just fine.

I need a workaround, because default ResultSet processing in the JDBC
driver (and also the jxDBCon driver) pretty much blows out the memory
of the JVM.

#45snpe
snpe@snpe.co.yu
In reply to: Dave Tenny (#44)
Re: [JDBC] Out of memory error on huge resultset

This code works for me:

Connection db = DriverManager.getConnection(url, user, passwd);
PreparedStatement st = db.prepareStatement(
    "begin; declare c1 cursor for select * from a");
st.execute();
st = db.prepareStatement("fetch 100 in c1");
ResultSet rs = st.executeQuery();
// rs.setFetchSize(100);
while (rs.next()) {
    String s = rs.getString(1);
    System.out.println(s);
}
st = db.prepareStatement("commit");
st.execute();
st.close();
db.close();

regards
Haris Peco


On Wednesday 16 October 2002 01:55 pm, Dave Tenny wrote:

To work
around this you can use explicit cursors (see the DECLARE CURSOR,
FETCH, and MOVE sql commands for postgres).

I'm unable to get this to work using the default distribution JDBC driver.
(7.2). Here's a code snippet

conn.setAutoCommit(false) ;
stmt.execute("BEGIN") ;
stmt.execute("DECLARE mysursor CURSOR FOR SELECT icol FROM mtable") ;
ResultSet rs = null ;
if (stmt.execute("FETCH 10000 IN mycursor"))
rs = stmt.getResultSet() ;

The FETCH statement returns an update count of 1, but no ResultSet.
If I try executeQuery, a "no rows found" exception is thrown.

Equivalent code in the C library interface works just fine.

I need a workaround, because default ResultSet processing in the JDBC
driver (and also the jxDBCon driver) pretty much blows out the memory
of the JVM.

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#46Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Dave Cramer (#20)
1 attachment(s)
Re: move 0 behaviour

Dave Cramer wrote:

Currently there is a TODO list item to have move 0 not position to the
end of the cursor.

Moving to the end of the cursor is useful, can we keep the behaviour and
change it to move end, or just leave it the way it is?

I did some research on this. It turns out the parser uses 0 for ALL, so
when you do a FETCH ALL it is passing zero. Now, when you do MOVE 0,
you are really asking for FETCH ALL and all the tuples are thrown away
because of the MOVE.

So, that is why MOVE 0 goes to the end of the cursor. One idea would be
for MOVE 0 to actually move nothing, but jdbc and others need the
ability to move to the end of the cursor, perhaps to then back up a certain
amount and read from there. Seems MOVE 0 is the logical way to do that.
(I can't think of another reasonable value).
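
For illustration, a client that only wants the trailing rows of a cursor
could lean on the current behaviour roughly like this (a hedged JDBC sketch,
not driver code; it assumes a Connection conn with autocommit off and a
cursor c1 already declared, and the backward count follows the off-by-one
behaviour reported later in this thread):

Statement st = conn.createStatement();
st.execute("MOVE 0 IN c1");       // today's behaviour: positions at the end of the cursor
st.execute("MOVE -101 IN c1");    // back up far enough to re-read the last 100 rows
ResultSet rs = st.executeQuery("FETCH 100 IN c1");
while (rs.next()) {
    // process the trailing 100 rows
}
st.close();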

I have the following patch which just documents the fact that MOVE 0
goes to the end of the cursor. It does not change any behavior, just
document it.

If/when I apply the patch, I will remove the TODO item. Another idea
would be to require MOVE END to move to the end.

Comments?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/pgpatches/move (text/plain)
Index: doc/src/sgml/ref/move.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/move.sgml,v
retrieving revision 1.13
diff -c -c -r1.13 move.sgml
*** doc/src/sgml/ref/move.sgml	21 Apr 2002 19:02:39 -0000	1.13
--- doc/src/sgml/ref/move.sgml	26 Oct 2002 20:01:15 -0000
***************
*** 37,44 ****
     <command>MOVE</command> allows a user to move cursor position a specified
     number of rows.
     <command>MOVE</command> works like the <command>FETCH</command> command,
!    but only positions the cursor and does
!    not return rows.
    </para>
    <para>
     Refer to 
--- 37,44 ----
     <command>MOVE</command> allows a user to move cursor position a specified
     number of rows.
     <command>MOVE</command> works like the <command>FETCH</command> command,
!    but only positions the cursor and does not return rows. The special
!    direction <literal>0</> moves to the end of the cursor.
    </para>
    <para>
     Refer to 
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/executor/execMain.c,v
retrieving revision 1.180
diff -c -c -r1.180 execMain.c
*** src/backend/executor/execMain.c	14 Oct 2002 16:51:30 -0000	1.180
--- src/backend/executor/execMain.c	26 Oct 2002 20:01:20 -0000
***************
*** 1119,1125 ****
  
  		/*
  		 * check our tuple count.. if we've processed the proper number
! 		 * then quit, else loop again and process more tuples..
  		 */
  		current_tuple_count++;
  		if (numberTuples == current_tuple_count)
--- 1119,1127 ----
  
  		/*
  		 * check our tuple count.. if we've processed the proper number
! 		 * then quit, else loop again and process more tuples.
! 		 * If numberTuples is zero, it means we have done MOVE 0
! 		 * or FETCH ALL and we want to go to the end of the portal.
  		 */
  		current_tuple_count++;
  		if (numberTuples == current_tuple_count)
Index: src/backend/tcop/utility.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/tcop/utility.c,v
retrieving revision 1.180
diff -c -c -r1.180 utility.c
*** src/backend/tcop/utility.c	21 Oct 2002 20:31:52 -0000	1.180
--- src/backend/tcop/utility.c	26 Oct 2002 20:01:29 -0000
***************
*** 263,270 ****
  
  				/*
  				 * parser ensures that count is >= 0 and 'fetch ALL' -> 0
  				 */
- 
  				count = stmt->howMany;
  				PerformPortalFetch(portalName, forward, count,
  								   (stmt->ismove) ? None : dest,
--- 263,270 ----
  
  				/*
  				 * parser ensures that count is >= 0 and 'fetch ALL' -> 0
+ 				 * MOVE 0 is equivalent to fetch ALL with no returned tuples.
  				 */
  				count = stmt->howMany;
  				PerformPortalFetch(portalName, forward, count,
  								   (stmt->ismove) ? None : dest,
#47Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#46)
Re: move 0 behaviour

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I did some research on this. It turns out the parser uses 0 for ALL, so
when you do a FETCH ALL it is passing zero. Now, when you do MOVE 0,
you are really asking for FETCH ALL and all the tuples are thrown away
because of the MOVE.

Yeah. I think this is a bug and "MOVE 0" ought to be a no-op ... but
changing it requires a different parsetree representation for MOVE ALL,
which is tedious enough that it hasn't gotten done yet.

I have the following patch which just documents the fact that MOVE 0
goes to the end of the cursor. It does not change any behavior, just
document it.

It should be documented as behavior that is likely to change. Also,
I believe FETCH 0 has the same issue.

regards, tom lane

#48Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#46)
Re: move 0 behaviour

Bruce Momjian writes:

So, that is why MOVE 0 goes to the end of the cursor. One idea would be
for MOVE 0 to actually move nothing, but jdbc and others need the
ability to move to the end of the cursor, perhaps to then back up a certain
amount and read from there. Seems MOVE 0 is the logical way to do that.
(I can't think of another reasonable value).

It would seem more logical and reasonable for MOVE 0 to do nothing and
have some special syntax such as MOVE LAST to move to the end. (MOVE LAST
would actually be consistent with the standard syntax FETCH LAST.)

--
Peter Eisentraut peter_e@gmx.net

#49Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Peter Eisentraut (#48)
1 attachment(s)
Re: move 0 behaviour

Peter Eisentraut wrote:

Bruce Momjian writes:

So, that is why MOVE 0 goes to the end of the cursor. One idea would be
for MOVE 0 to actually move nothing, but jdbc and others need the
ability to move to the end of the cursor, perhaps to then back up a certain
amount and read from there. Seems MOVE 0 is the logical way to do that.
(I can't think of another reasonable value).

It would seem more logical and reasonable for MOVE 0 to do nothing and
have some special syntax such as MOVE LAST to move to the end. (MOVE LAST
would actually be consistent with the standard syntax FETCH LAST.)

Yea, I started thinking and we need to get MOVE/FETCH to make sense.
The following patch makes FETCH/MOVE 0 do nothing, and FETCH LAST move
to the end. I was going to use the word END, but if LAST is more
standard, we will use that. It uses INT_MAX in the grammar for FETCH
ALL/MOVE LAST, but maps that to zero so it is consistent in the
executor code.

I will keep this patch for 7.4.

JDBC folks, I realize you need this. Seems you will have to use MOVE 0
for 7.3 and MOVE LAST for 7.4.
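
A driver or application that wants to cope with both behaviours could branch
on the server version, along these lines (an illustrative sketch with
invented names, not the driver's actual code; how the version flag is
obtained is left to the caller):

import java.sql.SQLException;
import java.sql.Statement;

class CursorEnd {
    // Position cursor "c1" at its end, allowing for the planned 7.4 change.
    static void moveToEnd(Statement st, boolean serverIs74OrLater) throws SQLException {
        if (serverIs74OrLater) {
            st.execute("MOVE LAST IN c1");   // syntax proposed in this patch for 7.4
        } else {
            st.execute("MOVE 0 IN c1");      // 7.3 behaviour: MOVE 0 positions at the end
        }
    }
}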

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/pgpatches/movetext/plainDownload
Index: doc/src/sgml/ref/move.sgml
===================================================================
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/move.sgml,v
retrieving revision 1.13
diff -c -c -r1.13 move.sgml
*** doc/src/sgml/ref/move.sgml	21 Apr 2002 19:02:39 -0000	1.13
--- doc/src/sgml/ref/move.sgml	31 Oct 2002 01:15:42 -0000
***************
*** 21,27 ****
     <date>1999-07-20</date>
    </refsynopsisdivinfo>
    <synopsis>
! MOVE [ <replaceable class="PARAMETER">direction</replaceable> ] [ <replaceable class="PARAMETER">count</replaceable> ] 
      { IN | FROM } <replaceable class="PARAMETER">cursor</replaceable>
    </synopsis>
   </refsynopsisdiv>
--- 21,28 ----
     <date>1999-07-20</date>
    </refsynopsisdivinfo>
    <synopsis>
! MOVE [ <replaceable class="PARAMETER">direction</replaceable> ] 
!     {<replaceable class="PARAMETER">count</replaceable> | LAST }
      { IN | FROM } <replaceable class="PARAMETER">cursor</replaceable>
    </synopsis>
   </refsynopsisdiv>
Index: src/backend/commands/portalcmds.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/commands/portalcmds.c,v
retrieving revision 1.3
diff -c -c -r1.3 portalcmds.c
*** src/backend/commands/portalcmds.c	4 Sep 2002 20:31:15 -0000	1.3
--- src/backend/commands/portalcmds.c	31 Oct 2002 01:15:44 -0000
***************
*** 15,20 ****
--- 15,22 ----
  
  #include "postgres.h"
  
+ #include <limits.h>
+ 
  #include "commands/portalcmds.h"
  #include "executor/executor.h"
  
***************
*** 55,61 ****
   *
   *	name: name of portal
   *	forward: forward or backward fetch?
!  *	count: # of tuples to fetch (0 implies all)
   *	dest: where to send results
   *	completionTag: points to a buffer of size COMPLETION_TAG_BUFSIZE
   *		in which to store a command completion status string.
--- 57,63 ----
   *
   *	name: name of portal
   *	forward: forward or backward fetch?
!  *	count: # of tuples to fetch
   *	dest: where to send results
   *	completionTag: points to a buffer of size COMPLETION_TAG_BUFSIZE
   *		in which to store a command completion status string.
***************
*** 100,105 ****
--- 102,115 ----
  		return;
  	}
  
+ 	/* If zero count, we are done */
+ 	if (count == 0)
+ 		return;
+ 
+ 	/* Internally, zero count processes all portal rows */
+ 	if (count == INT_MAX)
+ 		count = 0;
+ 		
  	/*
  	 * switch into the portal context
  	 */
Index: src/backend/executor/execMain.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/executor/execMain.c,v
retrieving revision 1.180
diff -c -c -r1.180 execMain.c
*** src/backend/executor/execMain.c	14 Oct 2002 16:51:30 -0000	1.180
--- src/backend/executor/execMain.c	31 Oct 2002 01:15:50 -0000
***************
*** 1119,1125 ****
  
  		/*
  		 * check our tuple count.. if we've processed the proper number
! 		 * then quit, else loop again and process more tuples..
  		 */
  		current_tuple_count++;
  		if (numberTuples == current_tuple_count)
--- 1119,1126 ----
  
  		/*
  		 * check our tuple count.. if we've processed the proper number
! 		 * then quit, else loop again and process more tuples.  Zero
! 		 * number_tuples means no limit.
  		 */
  		current_tuple_count++;
  		if (numberTuples == current_tuple_count)
Index: src/backend/parser/gram.y
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/parser/gram.y,v
retrieving revision 2.370
diff -c -c -r2.370 gram.y
*** src/backend/parser/gram.y	22 Sep 2002 21:44:43 -0000	2.370
--- src/backend/parser/gram.y	31 Oct 2002 01:16:14 -0000
***************
*** 49,54 ****
--- 49,55 ----
  #include "postgres.h"
  
  #include <ctype.h>
+ #include <limits.h>
  
  #include "access/htup.h"
  #include "catalog/index.h"
***************
*** 357,363 ****
  	JOIN
  	KEY
  
! 	LANCOMPILER LANGUAGE LEADING LEFT LEVEL LIKE LIMIT
  	LISTEN LOAD LOCAL LOCALTIME LOCALTIMESTAMP LOCATION
  	LOCK_P
  
--- 358,364 ----
  	JOIN
  	KEY
  
! 	LANCOMPILER LANGUAGE LAST LEADING LEFT LEVEL LIKE LIMIT
  	LISTEN LOAD LOCAL LOCALTIME LOCALTIMESTAMP LOCATION
  	LOCK_P
  
***************
*** 2644,2650 ****
  					if ($3 < 0)
  					{
  						$3 = -$3;
! 						$2 = (($2 == FORWARD)? BACKWARD: FORWARD);
  					}
  					n->direction = $2;
  					n->howMany = $3;
--- 2645,2651 ----
  					if ($3 < 0)
  					{
  						$3 = -$3;
! 						$2 = (($2 == FORWARD) ? BACKWARD: FORWARD);
  					}
  					n->direction = $2;
  					n->howMany = $3;
***************
*** 2712,2719 ****
  fetch_how_many:
  			Iconst									{ $$ = $1; }
  			| '-' Iconst							{ $$ = - $2; }
! 											/* 0 means fetch all tuples*/
! 			| ALL									{ $$ = 0; }
  			| NEXT									{ $$ = 1; }
  			| PRIOR									{ $$ = -1; }
  		;
--- 2713,2720 ----
  fetch_how_many:
  			Iconst									{ $$ = $1; }
  			| '-' Iconst							{ $$ = - $2; }
! 			| ALL									{ $$ = INT_MAX; }
! 			| LAST									{ $$ = INT_MAX; }
  			| NEXT									{ $$ = 1; }
  			| PRIOR									{ $$ = -1; }
  		;
***************
*** 7098,7103 ****
--- 7099,7105 ----
  			| KEY
  			| LANGUAGE
  			| LANCOMPILER
+ 			| LAST
  			| LEVEL
  			| LISTEN
  			| LOAD
Index: src/backend/parser/keywords.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/parser/keywords.c,v
retrieving revision 1.127
diff -c -c -r1.127 keywords.c
*** src/backend/parser/keywords.c	18 Sep 2002 21:35:22 -0000	1.127
--- src/backend/parser/keywords.c	31 Oct 2002 01:16:15 -0000
***************
*** 172,177 ****
--- 172,178 ----
  	{"key", KEY},
  	{"lancompiler", LANCOMPILER},
  	{"language", LANGUAGE},
+ 	{"last", LAST},
  	{"leading", LEADING},
  	{"left", LEFT},
  	{"level", LEVEL},
Index: src/backend/tcop/utility.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/tcop/utility.c,v
retrieving revision 1.180
diff -c -c -r1.180 utility.c
*** src/backend/tcop/utility.c	21 Oct 2002 20:31:52 -0000	1.180
--- src/backend/tcop/utility.c	31 Oct 2002 01:16:18 -0000
***************
*** 262,270 ****
  				forward = (bool) (stmt->direction == FORWARD);
  
  				/*
! 				 * parser ensures that count is >= 0 and 'fetch ALL' -> 0
  				 */
- 
  				count = stmt->howMany;
  				PerformPortalFetch(portalName, forward, count,
  								   (stmt->ismove) ? None : dest,
--- 262,269 ----
  				forward = (bool) (stmt->direction == FORWARD);
  
  				/*
! 				 * parser ensures that count is >= 0
  				 */
  				count = stmt->howMany;
  				PerformPortalFetch(portalName, forward, count,
  								   (stmt->ismove) ? None : dest,
#50Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#49)
Re: move 0 behaviour

Bruce Momjian <pgman@candle.pha.pa.us> writes:

The following patch makes FETCH/MOVE 0 do nothing, and FETCH LAST move
to the end.

Do not hack up PerformPortalFetch; put the special case for INT_MAX in
utility.c's FetchStmt code, instead. As-is, you probably broke other
callers of PerformPortalFetch.

BTW, there's a comment in parsenodes.h that needs to be fixed too:

int howMany; /* amount to fetch ("ALL" --> 0) */

regards, tom lane

#51Martijn van Oosterhout
kleptog@svana.org
In reply to: Peter Eisentraut (#48)
Re: Cursors: getting the number of tuples; moving backwards

On Fri, Nov 01, 2002 at 12:43:48PM +0200, am@fx.ro wrote:

Hello everyone!

I have 2 questions:

--1-- Some days ago, I've been trying to get the number of tuples
that FETCH ALL would return, *before* fetching anything.
(the program is written in C++, using libpq ; PostgreSQL 7.2.3).

Well, to get an answer, the server needs to execute the entire query. It
won't do that unless you explicitly ask for it.

The solution i've found was something like:

int nr_tuples;

res = PQexec(conn, "MOVE ALL in CURS");
sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
PQclear(res);

That would work. But why do you need to know the total beforehand? You could
just do a FETCH ALL and then use PQntuples to get the number. If you're
using it to decide whether to provide a Next link, just FETCH one more item
than you intend to display and if you get it you display the link.
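
In JDBC terms that probe-row trick might look roughly like this (a sketch
with invented names; it assumes a cursor c1 is already declared in an open
transaction and only column 1 is displayed):

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

class PageFetcher {
    // Fetch one page from cursor "c1" and report whether a "Next" link is
    // needed, by probing for one extra row.
    static List<String> fetchPage(Statement st, int pageSize, boolean[] hasNext)
            throws SQLException {
        ResultSet rs = st.executeQuery("FETCH " + (pageSize + 1) + " IN c1");
        List<String> page = new ArrayList<String>();
        while (rs.next()) {
            page.add(rs.getString(1));       // cache only column 1, for brevity
        }
        hasNext[0] = page.size() > pageSize; // the probe row came back, so more rows remain
        if (hasNext[0]) {
            page.remove(page.size() - 1);    // don't display the probe row
            st.execute("MOVE -1 IN c1");     // let the next FETCH start with it again
        }
        return page;
    }
}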

I'm wondering: is there any better way to get that number?

( just an idea: maybe it would be useful to make PQcmdTuples
work for MOVE commands ... ? )

Interesting idea. I'm not sure whether MOVE actually executes the query or
not.

--2-- I found out that if i reach the end of the cursor, and want
to move backwards, i have to increase the MOVE command's argument by 1:

No idea, the cursor has probably moved off the end to indicate the query is
done. So you need the extra one to move it back. That's just a guess though.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


There are 10 kinds of people in the world, those that can do binary
arithmetic and those that can't.

#52Noname
am@fx.ro
In reply to: Bruce Momjian (#49)
Cursors: getting the number of tuples; moving backwards

Hello everyone!

I have 2 questions:

--1-- Some days ago, I've been trying to get the number of tuples
that FETCH ALL would return, *before* fetching anything.
(the program is written in C++, using libpq ; PostgreSQL 7.2.3).

The solution i've found was something like:

int nr_tuples;

res = PQexec(conn, "MOVE ALL in CURS");
sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
PQclear(res);

I'm wondering: is there any better way to get that number?

( just an idea: maybe it would be useful to make PQcmdTuples
work for MOVE commands ... ? )

--2-- I found out that if i reach the end of the cursor, and want
to move backwards, i have to increase the MOVE command's argument by 1:

MOVE ALL in CURS --> i get the number of tuples: 590

MOVE -590 in CURS
FETCH ALL --> i get all tuples except the first one

MOVE -591 in CURS
FETCH ALL --> i get all the tuples

MOVE -1 in CURS
FETCH ALL --> i get nothing !

MOVE -2 in CURS
FETCH ALL --> i get the last tuple

This happens only if the current position is at the end of the cursor.

Is this the normal behaviour?

Best regards,
Adrian Maier
(am@fx.ro)

#53Noname
am@fx.ro
In reply to: Martijn van Oosterhout (#51)
Re: Cursors: getting the number of tuples; moving backwards

On Fri, Nov 01, 2002 at 07:23:29PM +1100, Martijn van Oosterhout wrote:

The solution i've found was something like:

int nr_tuples;

res = PQexec(conn, "MOVE ALL in CURS");
sscanf(PQcmdStatus(res),"MOVE %i",&nr_tuples);
PQclear(res);

That would work. But why do you need to know the total beforehand? You could
just do a FETCH ALL and then use PQntuples to get the number.

If the table has, let's say, 10000 rows, it's unlikely that the user
will ever browse all of them (my program lets the user set some
filters; the interface is ncurses-based). Fetching everything
would be unnecessary.

So, for speed reasons, I prefer to fetch at most 500 rows.
But I want to display the total number of rows in the corner
of the screen.
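
In JDBC terms the same idea might look roughly like this (a sketch with the
shaky parts flagged in comments; "curs" is a cursor assumed to be already
declared in an open transaction on Connection conn):

Statement st = conn.createStatement();
st.execute("MOVE ALL IN curs");
// Assumption: the driver exposes MOVE's command-tag count here; if it does
// not, the total has to be obtained some other way (cf. the libpq tag
// parsing shown above).
int total = st.getUpdateCount();
st.execute("MOVE -" + (total + 1) + " IN curs"); // the +1 matches the behaviour described in the previous message
ResultSet rs = st.executeQuery("FETCH 500 IN curs");
while (rs.next()) {
    // display at most 500 rows; show `total` in the corner of the screen
}
st.close();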

I'm wondering: is there any better way to get that number?

( just an idea: maybe it would be useful to make PQcmdTuples
work for MOVE commands ... ? )

Interesting idea. I'm not sure whether MOVE actually executes the query or
not.

I guess it doesn't execute the whole query. MOVE ALL is *much*
faster than FETCH ALL + PQcmdTuples

--2-- I found out that if i reach the end of the cursor, and want
to move backwards, i have to increase the MOVE command's argument by 1:

No idea, the cursor has probably moved off the end to indicate the query is
done. So you need the extra one to move it back. That's just a guess though.

Yeah, this could be the explanation.

Thanks for your answer

Adrian Maier

#54Martijn van Oosterhout
kleptog@svana.org
In reply to: Noname (#53)
Re: Cursors: getting the number of tuples; moving backwards

On Fri, Nov 01, 2002 at 08:14:33PM +0200, am@fx.ro wrote:

On Fri, Nov 01, 2002 at 07:23:29PM +1100, Martijn van Oosterhout wrote:

That would work. But why do you need to know the total beforehand? You could
just do a FETCH ALL and then use PQntuples to get the number.

If the table has, let's say, 10000 rows, it's unlikely that the user
will ever browse all of them ( my program permits the user to set some
filters ; the interface is ncurses-based). Fetching everything
would be unnecessary.

So, for speed reasons, i prefer to fetch maximum 500 rows.
But i want to display in the screen's corner the total number
of rows .

Maybe do what google does. If there's lots of rows, give an estimate. I
don't know how they do it but if there are more than 1000 rows then the user
probably won't care if you wrote 1000, 2000 or a million.

Maybe some whacky curve fitting. If there's still a 98% match after 100
matches, there must be around 5000 matches.

Interesting idea. I'm not sure whether MOVE actually executes the query or
not.

I guess it doesn't execute the whole query. MOVE ALL is *much*
faster than FETCH ALL + PQcmdTuples

Curious. I wonder how it does it then.

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


There are 10 kinds of people in the world, those that can do binary
arithmetic and those that can't.

#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#54)
Re: Cursors: getting the number of tuples; moving backwards

Martijn van Oosterhout <kleptog@svana.org> writes:

I guess it doesn't execute the whole query. MOVE ALL is *much*
faster than FETCH ALL + PQcmdTuples

Curious. I wonder how it does it then.

MOVE does execute the query, it just doesn't ship the tuples to the
client. This would save some formatting overhead (no need to run
the datatype I/O conversion procedures), but unless you have a slow
network link between client and server I would not expect it to be
"much" faster ...

regards, tom lane

#56Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#50)
Re: move 0 behaviour

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

The following patch makes FETCH/MOVE 0 do nothing, and FETCH LAST move
to the end.

Do not hack up PerformPortalFetch; put the special case for INT_MAX in
utility.c's FetchStmt code, instead. As-is, you probably broke other
callers of PerformPortalFetch.

I thought about that, but I need to fail if the cursor name is invalid.
Those tests are done in PerformPortalFetch(). The good news is that no
one else call it. Other ideas?

BTW, there's a comment in parsenodes.h that needs to be fixed too:

int howMany; /* amount to fetch ("ALL" --> 0) */

Done.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#56)
Re: move 0 behaviour

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

Do not hack up PerformPortalFetch; put the special case for INT_MAX in
utility.c's FetchStmt code, instead. As-is, you probably broke other
callers of PerformPortalFetch.

I thought about that, but I need to fail if the cursor name is invalid.

What has that got to do with it?

regards, tom lane

#58Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#57)
Re: move 0 behaviour

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

Do not hack up PerformPortalFetch; put the special case for INT_MAX in
utility.c's FetchStmt code, instead. As-is, you probably broke other
callers of PerformPortalFetch.

I thought about that, but I need to fail if the cursor name is invalid.

What has that got to do with it?

If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
will not get the checks for invalid cursor names, and I will not get the
proper return tag. I don't see how to do anything in utility.c. I
assume this is the code you want to move to utility.c:

	+       /* If zero count, we are done */
	+       if (count == 0)
	+               return;
	+ 
	+       /* Internally, zero count processes all portal rows */
	+       if (count == INT_MAX)
	+               count = 0;
	+               
-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#58)
Re: move 0 behaviour

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I thought about that, but I need to fail if the cursor name is invalid.

What has that got to do with it?

If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
will not get the checks for invalid cursor names, and I will not get the
proper return tag.

Oh, I see. Yeah, you're probably right, we have to change the calling
convention for PerformPortalFetch.

BTW, portalcmds.h also contains a comment that would need to be fixed.

regards, tom lane

#60Noname
am@fx.ro
In reply to: Tom Lane (#55)
Re: Cursors: getting the number of tuples; moving backwards

On Fri, Nov 01, 2002 at 10:03:17PM -0500, Tom Lane wrote:

MOVE does execute the query, it just doesn't ship the tuples to the
client. This would save some formatting overhead (no need to run
the datatype I/O conversion procedures), but unless you have a slow
network link between client and server I would not expect it to be
"much" faster ...

It must be because the computer is quite old: a Cyrix 6x86 at 166MHz
(this is not the deployment machine).

Using MOVE is about 5 times faster in my case:
for 150784 tuples in the table, FETCH-ing took about 1m30s,
while MOVE-ing took only about 17sec.

                            |   Real    |   User    |   Sys
----------------------------+-----------+-----------+----------
 select * from PRODTEST     | 1m30.843s | 0m42.960s | 0m1.720s
 declare cursor... + FETCH  | 1m32.835s | 0m42.680s | 0m1.780s
 declare cursor... + MOVE   | 0m17.215s |  0m0.030s | 0m0.030s

(I used commands like "time psql -f test.sql db_rex" to get those timings.)

The difference must be smaller on fast machines.

So I guess my computer is pretty good when it comes to finding
performance problems in applications ;-)

Bye,
Adrian Maier
(am@fx.ro)

#61Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#59)
Re: move 0 behaviour

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I thought about that, but I need to fail if the cursor name is invalid.

What has that got to do with it?

If I put the 'return' for 0 MOVE/FETCH in utility.c's FetchStmt code, I
will not get the checks for invalid cursor names, and I will not get the
proper return tag.

Oh, I see. Yeah, you're probably right, we have to change the calling
convention for PerformPortalFetch.

BTW, portalcmds.h also contains a comment that would need to be fixed.

Updated.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073