libpq - lack of support to set the fetch size
Hello,
I've found an issue when I tried to implement fetching rows from a big table
(2 million rows) in my app.
Basically I can't find an elegant and easy way (other than always using
cursors) to limit the number of rows returned.
This causes my application to break due to excessive memory consumption.
I'm using Perl and the DBD::Pg library, but I contacted the maintainer, who
pointed out that this is an issue that goes much deeper (libpq):
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet. Your best bet would be to ask
about this on the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
Re the details of the issue, I believe this has been well described at:
http://stackoverflow.com/questions/21960121/perl-dbdpg-script-fails-when-selecting-data-from-big-table
Kind Regards
~Msciwoj
On 03/09/2014 06:43 AM, matshyeq wrote:
Hello,
I've found an issue when I tried to implement fetching rows from a big table
(2 million rows) in my app.
Basically I can't find an elegant and easy way (other than always using
cursors) to limit the number of rows returned.
This causes my application to break due to excessive memory consumption.
LIMIT does not work?
I'm using Perl and the DBD::Pg library, but I contacted the maintainer, who
pointed out that this is an issue that goes much deeper (libpq):
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet. Your best bet would be to ask
about this on the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
Re the details of the issue, I believe this has been well described at:
http://stackoverflow.com/questions/21960121/perl-dbdpg-script-fails-when-selecting-data-from-big-table
Kind Regards
~Msciwoj
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
matshyeq wrote:
"Unfortunately, this is a limitation in the underlying driver (libpq) rather
than DBD::Pg itself. There have been talks over the years of supporting
this, but nothing concrete yet. Your best bet would be to ask about this on
the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
This improvement seems to have actually been made since 9.2 with
the PQsetSingleRowMode() function:
http://postgresql.org/docs/current/static/libpq-single-row-mode.html
Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
Daniel Verite wrote:
matshyeq wrote:
[ runs out of memory on the client because all results from a large query are retrieved at once ]
"Unfortunately, this is a limitation in the underlying driver (libpq) rather
than DBD::Pg itself. There have been talks over the years of supporting
this, but nothing concrete yet. Your best bet would be to ask about this on
the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
This improvement seems to have actually been made since 9.2 with
the PQsetSingleRowMode() function:
http://postgresql.org/docs/current/static/libpq-single-row-mode.html
Yes, DBD::Pg could be improved to make use of that; the problem is probably
that the code would have to differentiate between PostgreSQL versions.
Your solution with using
SELECT ... OFFSET ? LIMIT 1
in a loop is bound to suck.
First of all, there is no guarantee that the rows will be returned in
the same order each time, see for example
http://www.postgresql.org/docs/current/static/runtime-config-compatible.html#GUC-SYNCHRONIZE-SEQSCANS
Also, unless you operate with an isolation level higher than READ COMMITTED,
the various SELECTs could operate on different data sets.
So you are likely to end up with incorrect results sooner or later
if you use OFFSET and LIMIT without an ORDER BY clause.
Then you will have really bad performance, especially with a large table,
because each SELECT statement will have to start scanning the table again.
The complexity will rise from O(n) to O(n^2).
You can improve on this by using ORDER BY with an index and remembering
the last returned row (get and read http://sql-performance-explained.com/).
Finally, you will have a client-server round trip for each row returned.
This is a problem you would also have when using PQsetSingleRowMode().
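To illustrate the keyset approach (ORDER BY over an indexed key, remembering the last returned row), here is a minimal sketch. It uses SQLite from the Python standard library purely so the example is self-contained; the table name `bigtable` and the batch size are made up, and the same pattern applies unchanged to a PostgreSQL connection:

```python
import sqlite3

# Stand-in for a large PostgreSQL table, using SQLite so the sketch is
# self-contained. The pattern is identical with DBD::Pg or psycopg2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigtable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO bigtable VALUES (?, ?)",
                 [(i, "row %d" % i) for i in range(1, 26)])

def keyset_batches(conn, batch_size=10):
    """Yield batches ordered by an indexed key, remembering the last seen id.

    Unlike OFFSET, each query starts with an index lookup on last_id, so
    the table is not rescanned from the beginning on every batch."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM bigtable WHERE id > ? "
            "ORDER BY id LIMIT ?", (last_id, batch_size)).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]   # remember the last returned row's key

total = 0
for batch in keyset_batches(conn):
    total += len(batch)
print(total)  # → 25
```

Note that this only works for queries with a stable, unique sort key, which is exactly the ORDER BY caveat above.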
Yours,
Laurenz Albe
Fully agree with Laurenz.
LIMIT in some (limited!) cases could be seen as a workaround but it's far
from being elegant (what if your end user types the query?)
If 'SingleRowMode' goes row-by-row then again it's not a solution,
especially given that this particular issue applies to rather large row
sets.
The only solution is CURSOR based, which I find an awkward low-level hack
compared to an elegant option supported by the native library.
PostgreSQL has for a good while been perceived as one of the best (or
simply the best!?) available open source DB solutions, so I'm really
surprised this functionality is not yet supported...
On Mon, Mar 10, 2014 at 6:58 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Daniel Verite wrote:
matshyeq wrote:
[ runs out of memory on the client because all results from a large query
are retrieved at once ]
"Unfortunately, this is a limitation in the underlying driver (libpq) rather
than DBD::Pg itself. There have been talks over the years of supporting
this, but nothing concrete yet. Your best bet would be to ask about this on
the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
This improvement seems to have actually been made since 9.2 with
the PQsetSingleRowMode() function:
http://postgresql.org/docs/current/static/libpq-single-row-mode.html
Yes, DBD::Pg could be improved to make use of that; the problem is probably
that the code would have to differentiate between PostgreSQL versions.
Your solution with using
SELECT ... OFFSET ? LIMIT 1
in a loop is bound to suck.
First of all, there is no guarantee that the rows will be returned in
the same order each time, see for example
http://www.postgresql.org/docs/current/static/runtime-config-compatible.html#GUC-SYNCHRONIZE-SEQSCANS
Also, unless you operate with an isolation level higher than READ COMMITTED,
the various SELECTs could operate on different data sets.
So you are likely to end up with incorrect results sooner or later
if you use OFFSET and LIMIT without an ORDER BY clause.
Then you will have really bad performance, especially with a large table,
because each SELECT statement will have to start scanning the table again.
The complexity will rise from O(n) to O(n^2).
You can improve on this by using ORDER BY with an index and remembering
the last returned row (get and read http://sql-performance-explained.com/).
Finally, you will have a client-server round trip for each row returned.
This is a problem you would also have when using PQsetSingleRowMode().
Yours,
Laurenz Albe
--
Thank you,
Kind Regards
~Maciek
matshyeq wrote:
PostgreSQL has for a good while been perceived as one of the best (or just simply the best!?)
available open source DB solutions, so I'm really surprised this functionality is not yet supported...
You can retrieve the full result set,
you can retrieve it row by row,
you can use a LIMIT clause to retrieve it in batches.
Can you explain how exactly the functionality would look that
you are missing?
Yours,
Laurenz Albe
On 3/9/2014 6:43 AM, matshyeq wrote:
Hello,
I've found an issue when I tried to implement fetching rows from a big
table (2 million rows) in my app.
Basically I can't find an elegant and easy way (other than always using
cursors) to limit the number of rows returned.
This causes my application to break due to excessive memory consumption.
I'm using Perl and the DBD::Pg library, but I contacted the maintainer, who
pointed out that this is an issue that goes much deeper (libpq):
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet. Your best bet would be to
ask about this on the Postgres lists"
in addition to what the others suggested, you can use a CURSOR to read
through results in arbitrarily sized blocks.
--
john r pierce 37N 122W
somewhere on the middle of the left coast
Albe Laurenz wrote:
I would believe the stackoverflow (
http://stackoverflow.com/questions/21960121/perl-script-fails-when-selecting-data-from-big-postgresql-table)
question referred to explains the issue well.
You can retrieve the full result set,
not an option because of client memory limitations (in this case it's a poor
client spec, but there are always some limits, especially when you want to
pull 1e7 rows)
you can retrieve it row by row,
not an option because of performance (db calls/network roundtrips)
you can use a LIMIT clause to retrieve it in batches.
you pointed out best why it's not a feasible option (complexity, isolation
levels, not always possible, i.e. with a custom query, and last but not
least: far from elegant)
CURSOR option
As already explained at stackoverflow - I'm using it as a workaround. My
general point is that it forces developers to use lower-level communication
with the DB (cursors), therefore not as elegant as just setting the
RowCacheSize parameter as specified by DBI. According to the DBD::Pg
maintainer this hasn't been and can't be implemented for PostgreSQL due to
the lack of support in its own libpq library.
So again.., I'm really surprised this functionality is not yet supported in
PostgreSQL. Does that mean everybody has been implementing this through
cursors?
To recap what's on stackoverflow - the functionality I'm talking about
would be an equivalent of the JDBC setFetchSize()
<http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize(int)>
function
to optimize the load from (any) database in batches, like in the example
below:
Statement st = conn.createStatement(
        java.sql.ResultSet.TYPE_FORWARD_ONLY,
        java.sql.ResultSet.CONCUR_READ_ONLY);
// Set the fetch size to 1000.
st.setFetchSize(1000);
// Execute the given SQL query.
String sql = "select * from bigtable";
ResultSet rs = st.executeQuery(sql);
while (rs.next()) {
⋮
}
where underneath ResultSet.next() doesn't actually fetch one row at a time
from the RESULT-SET. It returns rows from the (local) ROW-SET and fetches a
new ROW-SET (transparently) whenever the local one becomes exhausted.
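That transparent ROW-SET refill can be sketched over any DB-API style driver. This is a hypothetical illustration (the function name is invented) using Python's built-in sqlite3 module, not actual libpq or DBD::Pg behaviour:

```python
import sqlite3

def rows_with_fetch_size(cursor, sql, fetch_size=1000):
    """Iterate over a query result one row at a time, while actually
    pulling rows from the driver in fetch_size chunks (the local ROW-SET).
    The caller never sees the buffering."""
    cursor.execute(sql)
    while True:
        rowset = cursor.fetchmany(fetch_size)  # refill the local buffer
        if not rowset:
            return
        for row in rowset:                     # hand out buffered rows
            yield row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])
count = sum(1 for _ in rows_with_fetch_size(conn.cursor(),
                                            "SELECT n FROM t",
                                            fetch_size=1000))
print(count)  # → 5000
```

With a server-side database, each refill would be one round trip carrying fetch_size rows, which is the trade-off setFetchSize() tunes.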
Actually, curious now if this functionality has been implemented in
PostgreSQL JDBC drivers...?
Anyway, according to one of the DBD::Pg developers it's impossible to bring
this functionality as the problem lies deeper, within libpq library:
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet."
So probably it is best to ask Greg to speak to the details if still unclear.
Kind Regards,
Maciek
On Mon, Mar 10, 2014 at 9:42 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
matshyeq wrote:
PostgreSQL has for a good while been perceived as one of the best (or
just simply the best!?)
available open source DB solutions, so I'm really surprised this
functionality is not yet supported...
You can retrieve the full result set,
you can retrieve it row by row,
you can use a LIMIT clause to retrieve it in batches.
Can you explain how exactly the functionality would look that
you are missing?
Yours,
Laurenz Albe
--
Thank you,
Kind Regards
~Maciek
On Mon, Mar 10, 2014 at 06:58:26AM +0000, Albe Laurenz wrote:
Daniel Verite wrote:
matshyeq wrote:
[ runs out of memory on the client because all results from a large query are retrieved at once ]
"Unfortunately, this is a limitation in the underlying driver (libpq) rather
than DBD::Pg itself. There have been talks over the years of supporting
this, but nothing concrete yet. Your best bet would be to ask about this on
the Postgres lists"
Would you consider putting this on the roadmap, so one day it gets improved?
This improvement seems to have actually been made since 9.2 with
the PQsetSingleRowMode() function:
http://postgresql.org/docs/current/static/libpq-single-row-mode.html
Finally, you will have a client-server round trip for each row returned.
This is a problem you would also have when using PQsetSingleRowMode().
PQsetSingleRowMode() does not do additional roundtrips; it loads rows
from libpq's internal buffer.
--
marko
matshyeq <matshyeq@gmail.com> writes:
If 'SingleRowMode' goes row-by-row then again it's not a solution,
especially given that this particular issue applies to rather large row
sets.
Perhaps you should actually experiment with that solution instead of
rejecting it out of hand. Or at least RTFM about it.
It does have limitations: you can't interleave fetching of different
large query results. But I don't have a problem telling people they
ought to use cursors for such cases.
regards, tom lane
On 03/10/2014 04:51 AM, matshyeq wrote:
Albe Laurenz wrote:
I would believe the stackoverflow
(http://stackoverflow.com/questions/21960121/perl-script-fails-when-selecting-data-from-big-postgresql-table)
question referred to explains the issue well.
You can retrieve the full result set,
not an option because of client memory limitations (in this case it's a
poor client spec, but there are always some limits, especially when you
want to pull 1e7 rows)
you can retrieve it row by row,
not an option because of performance (db calls/network roundtrips)
you can use a LIMIT clause to retrieve it in batches.
you pointed out best why it's not a feasible option (complexity,
isolation levels, not always possible, i.e. with a custom query, and last
but not least: far from elegant)
CURSOR option
As already explained at stackoverflow - I'm using it as a workaround. My
general point is that it forces developers to use lower-level communication
with the DB (cursors), therefore not as elegant as just setting the
RowCacheSize parameter as specified by DBI. According to the DBD::Pg
maintainer this hasn't been and can't be implemented for PostgreSQL due to
the lack of support in its own libpq library.
So again.., I'm really surprised this functionality is not yet supported
in PostgreSQL. Does that mean everybody has been implementing this
through cursors?
To recap what's on stackoverflow - the functionality I'm talking about
would be an equivalent of the JDBC setFetchSize()
<http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#setFetchSize(int)> function
to optimize the load from (any) database in batches, like in the example
below:
Statement st = conn.createStatement(
        java.sql.ResultSet.TYPE_FORWARD_ONLY,
        java.sql.ResultSet.CONCUR_READ_ONLY);
// Set the fetch size to 1000.
st.setFetchSize(1000);
// Execute the given SQL query.
String sql = "select * from bigtable";
ResultSet rs = st.executeQuery(sql);
while (rs.next()) {
⋮
}
where underneath ResultSet.next() doesn't actually fetch one row at a
time from the RESULT-SET. It returns rows from the (local) ROW-SET and
fetches a new ROW-SET (transparently) whenever the local one becomes
exhausted.
Actually, curious now if this functionality has been implemented in
PostgreSQL JDBC drivers...?
Yes, using a cursor.
http://jdbc.postgresql.org/documentation/92/query.html
By default the driver collects all the results for the query at once.
This can be inconvenient for large data sets so the JDBC driver provides
a means of basing a ResultSet on a database cursor and only fetching a
small number of rows.
.....
Anyway, according to one of the DBD::Pg developers it's impossible to
bring this functionality as the problem lies deeper, within the libpq
library:
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet."
So probably it is best to ask Greg to speak to the details if still unclear.
Kind Regards,
Maciek
On Mon, Mar 10, 2014 at 9:42 AM, Albe Laurenz <laurenz.albe@wien.gv.at
<mailto:laurenz.albe@wien.gv.at>> wrote:
matshyeq wrote:
PostgreSQL has for a good while been perceived as one of the best
(or just simply the best!?)
available open source DB solutions, so I'm really surprised this
functionality is not yet supported...
You can retrieve the full result set,
you can retrieve it row by row,
you can use a LIMIT clause to retrieve it in batches.
Can you explain how exactly the functionality would look that
you are missing?
Yours,
Laurenz Albe
--
Thank you,
Kind Regards
~Maciek
--
Adrian Klaver
adrian.klaver@aklaver.com
On Sun, Mar 9, 2014 at 6:43 AM, matshyeq <matshyeq@gmail.com> wrote:
Hello,
I've found an issue when I tried to implement fetching rows from a big table
(2 million rows) in my app.
Basically I can't find an elegant and easy way (other than always using
cursors) to limit the number of rows returned.
This causes my application to break due to excessive memory consumption.
I'm using Perl and the DBD::Pg library, but I contacted the maintainer, who
pointed out that this is an issue that goes much deeper (libpq):
"Unfortunately, this is a limitation in the underlying driver (libpq)
rather than DBD::Pg itself. There have been talks over the years of
supporting this, but nothing concrete yet. Your best bet would be to ask
about this on the Postgres lists"
I don't think this is correct. First, DBD::Pg could get tricky and
automatically wrap your query in a cursor and then fetch from the cursor
behind the scenes. I believe that this is what Python's module does for
you in some modes. Second, the feature needed to do this without even
using a cursor was added 1.5 years ago (PQsetSingleRowMode). DBD::Pg was
just not taught how to use it yet.
The first strategy could probably be done purely in Perl, the second would
require changes to the C parts of DBD::Pg.
Of course just because it can be implemented in DBD::Pg doesn't mean anyone
has an obligation to do it. You could speed that along by contributing the
code yourself. But I would say the ball is firmly in DBD::Pg's court.
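A rough sketch of the first strategy - the driver silently wrapping the query in a cursor. The DECLARE/FETCH/CLOSE texts are ordinary PostgreSQL SQL, but the wrapper function and cursor name below are invented for the illustration; a real driver would also interleave the FETCHes with row delivery:

```python
def cursor_wrapped_statements(sql, batch_size=1000, cursor_name="dbdpg_csr"):
    """Return the statement sequence a driver could issue behind the scenes
    instead of running `sql` directly: declare a cursor, fetch in batches
    until FETCH returns no rows, then close it."""
    return {
        "declare": "DECLARE %s NO SCROLL CURSOR FOR %s" % (cursor_name, sql),
        "fetch":   "FETCH %d FROM %s" % (batch_size, cursor_name),
        "close":   "CLOSE %s" % cursor_name,
    }

stmts = cursor_wrapped_statements("SELECT * FROM bigtable", batch_size=500)
print(stmts["declare"])
# → DECLARE dbdpg_csr NO SCROLL CURSOR FOR SELECT * FROM bigtable
```

One complication such a wrapper must handle is that DECLARE CURSOR requires an open transaction, so the driver would have to manage that too.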
Cheers,
Jeff
matshyeq wrote:
The only solution is CURSOR based which I find an awkward low level hack
comparing to elegant option supported by native library.
That's not the only solution, even with the current DBD::Pg
you could do:
$dbh->do("COPY (sql-query) TO STDOUT");
my $data;
while ($dbh->pg_getcopydata($data) >= 0) {
# process $data
}
The results would be streamed as opposed to being accumulated in
memory. Also COPY is optimized for high performance.
The drawback is you'd have to parse $data according to the
specific rules of the COPY format, which may be easy or
not-so-easy depending on the actual data, numeric or text
or other, whether it has NULLs, backslashes and so on.
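As a reference point for that parsing, here is a minimal sketch of a COPY text-format decoder: tab-separated columns, \N for NULL, and the common backslash escapes (\t, \n, \r, \\). The full format has a few more cases, so treat this as an illustration only:

```python
def split_copy_fields(line):
    """Split a COPY text-format line on tabs that are not backslash-escaped,
    keeping the escape sequences intact for the decoding step."""
    fields, current, escaped = [], "", False
    for ch in line.rstrip("\n"):
        if escaped:
            current += "\\" + ch
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == "\t":
            fields.append(current)
            current = ""
        else:
            current += ch
    fields.append(current)
    return fields

def decode_copy_field(raw):
    """Decode one raw field: \\N means NULL, otherwise resolve escapes."""
    if raw == "\\N":
        return None
    escapes = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}
    out, escaped = "", False
    for ch in raw:
        if escaped:
            out += escapes.get(ch, ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        else:
            out += ch
    return out

def parse_copy_line(line):
    """Turn one line of COPY ... TO STDOUT output into a list of values."""
    return [decode_copy_field(f) for f in split_copy_fields(line)]

print(parse_copy_line("42\ta\\tb\t\\N\n"))  # → ['42', 'a\tb', None]
```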
Best regards,
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org
Marko, Tom, Adrian, Jeff, Daniel - thank you all for valuable feedback!
Two general questions:
- when using the PQsetSingleRowMode() function - does it give an option to
define how many rows to cache on the client's side (like JDBC setFetchSize()
does) or leave it at libpq's discretion?
- is it/would it be possible to add a corresponding option to pgAdmin to
limit the initially (and each subsequently) returned rows in the Query Tool
by a custom defined max value?
@Tom
Perhaps you should actually experiment with that solution instead of
rejecting it out of hand. Or at least RTFM about it.
As I'm not using libpq directly, I'm unable to leverage the
PQsetSingleRowMode() call (or at least I'm not aware how to do this from
DBD::Pg). I simply passed on the feedback given by them.
@Adrian
The example in the documentation you refer to actually demonstrates this
has been properly implemented in JDBC.
By properly I mean call to:
setFetchSize()
works; whatever it actually does behind the scenes (cursors?), it doesn't
require a developer to declare and utilize cursors explicitly, like
st.execute("DECLARE csr CURSOR FOR SELECT * FROM myBigTable;");
conn.prepareStatement("fetch 1000 from csr");
⋮
@Jeff
I'll make a suggestion to DBD::Pg development
@Daniel
Very interesting alternative. You're welcome to contribute to this
stackoverflow question!
Does it mean $data is a reference to a single returned row of data (`COPY
formatted`)?
Kind Regards
Msciwoj
On Tue, Mar 11, 2014 at 12:39:12PM +0000, matshyeq wrote:
- when using the PQsetSingleRowMode() function - does it give an option to
define how many rows to cache on the client's side (like JDBC setFetchSize()
does) or leave it at libpq's discretion?
This option would not make sense, as you are not "fetching" anything;
the full resultset is being streamed from the server over the TCP connection.
- is it/would it be possible to add corresponding option to pgAdmin to
limit initially (and each subsequently) returned rows in Query Tool by
custom defined max value?
It could close the connection in the middle of the resultset, but that seems
like a bad idea. LIMIT N or FETCH N are better for such a task.
--
marko
On Wed, Mar 12, 2014 at 9:30 AM, Marko Kreen <markokr@gmail.com> wrote:
On Tue, Mar 11, 2014 at 12:39:12PM +0000, matshyeq wrote:
- when using the PQsetSingleRowMode() function - does it give an option to
define how many rows to cache on the client's side (like JDBC setFetchSize()
does) or leave it at libpq's discretion?
This option would not make sense, as you are not "fetching" anything;
the full resultset is being streamed from the server over the TCP connection.
Well, I don't know what "streamed" exactly means here.
If the server pushes all the data sequentially, without asking whether the
client is ready to receive, then that's what the issue is about.
If the client asks the server for another chunk each time it has received
the previous one, then to me it's an implicit 'fetching' scenario where the
user/developer doesn't have an option to define the fetch size.
- is it/would it be possible to add a corresponding option to pgAdmin to
limit the initially (and each subsequently) returned rows in the Query Tool
by a custom defined max value?
It could close the connection in the middle of the resultset, but that seems
like a bad idea. LIMIT N or FETCH N are better for such a task.
I don't see why? I can't think of any single SQL tool I've been working
with that didn't have this functionality, really.
The principle I find very simple and useful.
There is a defined "fetch row size" parameter (each tool gives it its own
name);
after submitting ANY query, the client fetches result set rows, but not
more than that.
Some programs even automatically define this value based on result grid
size displayed on the screen.
The user then usually has two buttons, fetch another batch/screen or fetch
all - he decides.
If he decides way too late (coffee break) then he simply resubmits the
query (and potentially changes the parameter first)...
I don't find value in auto-fetching millions of rows for user to present on
the screen.
Also I don't think it's particularly useful when you need to know and apply
database-specific SQL syntax to limit the rows.
If you join multiple tables that may be even more tricky (which table to
apply the limit to? or use subqueries instead?).
Last but not least, I can even see that this option was once available in
pgAdmin, but disappeared over time:
http://www.pgadmin.org/docs/1.10/query.html
"In the options dialog <http://www.pgadmin.org/docs/1.10/options-tab3.html>,
you can specify a default limit for the rowset size to retrieve. By
default, this value will be 100. If the number of rows to retrieve from the
server exceeds this value, a message box will appear asking what to do to
prevent retrieval of an unexpected high amount of data. You may decide to
retrieve just the first rows, as configured with the max rows setting, or
retrieving the complete rowset regardless of the setting, or abort the
query, effectively retrieving zero rows."
This is pretty much exactly what I'm talking about here (pgAdmin
functionality)
and its equivalent for developers (libpq)
Regards
Msciwoj
On Wed, Mar 12, 2014 at 10:57:03AM +0000, matshyeq wrote:
On Wed, Mar 12, 2014 at 9:30 AM, Marko Kreen <markokr@gmail.com> wrote:
This option would not make sense as you are not "fetching" anything,
full resultset is being streamed from server over TCP connection.
Well, I don't know what "streamed" exactly means here.
If server pushes sequentially all the data not asking client if ready to
receive then that's what the issue is about.
This problem is handled in the kernel's TCP stack - it will slow down
the connection if userspace does not read fast enough.
IOW, don't worry about it.
If client asks server for another chunk each time it has received previous
one then to me it's implicit 'fetching' scenario where user/developer
doesn't have an option to define fetch size.
This is how it is usually implemented, and a configurable fetch size is
indeed useful in such a situation. But it requires a separate round-trip
for each chunk, so single-row mode is the superior method for processing
large queries without huge buffers.
- is it/would it be possible to add corresponding option to pgAdmin to
limit initially (and each subsequently) returned rows in Query Tool by
custom defined max value?
It could close the connection in the middle of the resultset, but that seems
like a bad idea. LIMIT N or FETCH N are better for such a task.
I don't see why? I can't think of any single SQL tool I've been working
with that didn't have this functionality, really.
Yeah, I see no reason pgAdmin cannot implement it. Implementing
such a feature by transparently adding LIMIT or FETCH is better.
PQsetSingleRowMode() is not meant for partial resultsets, but it can
be used if you accept the downsides.
--
marko
Well, I don't know what "streamed" exactly means here.
If the server pushes all the data sequentially, without asking whether the
client is ready to receive, then that's what the issue is about.
This problem is handled in the kernel's TCP stack - it will slow down
the connection if userspace does not read fast enough.
IOW, don't worry about it.
Again, this is something 'at the discretion' of the libpq library, therefore
I'm not sure how 'slowing down' really works in different scenarios, e.g.
server and app client on the same machine?
Still, I see quite some value in specifying this explicitly (like JDBC
does), which BTW I did by using a CURSOR as a workaround.
If the client asks the server for another chunk each time it has received
the previous one, then to me it's an implicit 'fetching' scenario where the
user/developer doesn't have an option to define the fetch size.
This is how it is usually implemented, and a configurable fetch size is
indeed useful in such a situation. But it requires a separate round-trip
for each chunk, so single-row mode is the superior method for processing
large queries without huge buffers.
I wouldn't worry about this (round trips) myself - at the end of the day
the client decides when and how much to pull.
Also, in the data warehouse area, we consciously tune this parameter for
each DB source individually, which is considered very common (and good!)
practice.
- is it/would it be possible to add a corresponding option to pgAdmin to
limit the initially (and each subsequently) returned rows in the Query Tool
by a custom defined max value?
It could close the connection in the middle of the resultset, but that seems
like a bad idea. LIMIT N or FETCH N are better for such a task.
I don't see why? I can't think of any single SQL tool I've been working
with that didn't have this functionality, really.
Yeah, I see no reason pgAdmin cannot implement it. Implementing
such a feature by transparently adding LIMIT or FETCH is better.
I don't fully agree with the last statement. FETCH is WAY better and more
functional than LIMIT.
PQsetSingleRowMode() is not meant for partial resultsets, but it can
be used if you accept the downsides.
Given the 'discretion mode' I'm still not sure if that's a proper and
elegant way to meet the requirement.
Maybe it is? Anyway, this is just me, an end user, giving you, the
PostgreSQL maintaining developers, a chance to hear 'the voice of the
customer'.
Regards,
~Msciwoj
On 03/12/2014 06:05 AM, matshyeq wrote:
I don't fully agree with the last statement. FETCH is WAY better and more
functional than LIMIT.
PQsetSingleRowMode() is not meant for partial resultsets, but it can
be used if you accept the downsides.
Given the 'discretion mode' I'm still not sure if that's a proper and
elegant way to meet the requirement.
Following this discussion it would seem that Postgres meets the
requirement, but what you want is for the interfaces you use to make
use of the capabilities. As mentioned previously, the JDBC driver and the
Python driver (psycopg2) already do what you want. The next step, to me
at least, is to take what you have learned and contact the projects
(DBD::Pg, pgAdmin) in question.
Maybe it is? Anyway, this is just me, an end user, giving you, the
PostgreSQL maintaining developers, a chance to hear 'the voice of the
customer'.
Regards,
~Msciwoj
--
Adrian Klaver
adrian.klaver@aklaver.com
On Mar 12, 2014, at 5:57 AM, matshyeq <matshyeq@gmail.com> wrote:
I don't see why? I can't think of any single SQL tool I've been working with that didn't have this functionality, really.
The principle I find very simple and useful.
There is a defined "fetch row size" parameter (each tool gives it its own name);
after submitting ANY query, the client fetches result set rows, but not more than that.
Some programs even automatically define this value based on the result grid size displayed on the screen.
The user then usually has two buttons, fetch another batch/screen or fetch all - he decides.
If he decides way too late (coffee break) then he simply resubmits the query (and potentially changes the parameter first)...
I don't find value in auto-fetching millions of rows for the user to present on the screen.
Also I don't think it's particularly useful when you need to know and apply database-specific SQL syntax to limit the rows.
If you join multiple tables that may be even more tricky (which table to apply the limit to? or use subqueries instead?).
Using the extended query protocol, Postgres has a built-in way to limit the
number of rows returned from any select without any textual manipulation of
the query.
I'm not sure if libpq exposes this capability in the API, but it should not
be too difficult to implement.
See:
http://www.postgresql.org/docs/current/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
Once a portal exists, it can be executed using an Execute message. The Execute message specifies the portal name (empty string denotes the unnamed portal) and a maximum result-row count (zero meaning "fetch all rows"). The result-row count is only meaningful for portals containing commands that return row sets; in other cases the command is always executed to completion, and the row count is ignored. The possible responses to Execute are the same as those described above for queries issued via simple query protocol, except that Execute doesn't cause ReadyForQuery or RowDescription to be issued.
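Based on the protocol description quoted above, the Execute message has a simple wire layout: the message type byte 'E', an Int32 length that includes itself, the NUL-terminated portal name, and an Int32 maximum result-row count. A minimal sketch of building it (the function name is made up for the illustration):

```python
import struct

def execute_message(portal=b"", max_rows=0):
    """Build a v3 wire-protocol Execute message: byte 'E', Int32 length
    (including itself), the portal name as a NUL-terminated string, and
    an Int32 maximum result-row count (0 = fetch all rows)."""
    body = portal + b"\x00" + struct.pack("!i", max_rows)
    return b"E" + struct.pack("!i", len(body) + 4) + body

# Execute the unnamed portal, returning at most 1000 rows; the server
# answers with PortalSuspended if more rows remain to be fetched.
msg = execute_message(portal=b"", max_rows=1000)
print(msg)  # → b'E\x00\x00\x00\t\x00\x00\x00\x03\xe8'
```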
John DeSoi, Ph.D.