plpython does not honour max-rows
Hi
I came across this when developing a sampling function using plpy.execute
that needs to be able to sample zero rows. What actually happens is that
zero is ignored for max-rows and all rows are returned. Test case below:-
Many thanks
Kieran
drop table if exists _test_max_rows;
create table _test_max_rows (i integer);
insert into _test_max_rows
select *
from generate_series(1, 100);
create or replace function plpython3u_execute_max_row(max_rows integer)
returns setof integer
language plpython3u
as $$
for row in plpy.execute('select * from _test_max_rows', max_rows):
yield row['i']
$$;
-- Correctly returns 10 rows
select * from plpython3u_execute_max_row(10);
-- Incorrectly returns all 100 rows
select * from plpython3u_execute_max_row(0)
On 2 May 2023, at 12:30, Kieran McCusker <kieran.mccusker@gmail.com> wrote:
I came across this when developing a sampling function using plpy.execute that needs to be able to sample zero rows. What actually happens is that zero is ignored for max-rows and all rows are returned.
A max_rows of less than or equal to zero is IIRC interpreted as "fetch all
rows". I think this works as intended, is it documented anywhere to work in
another way?
--
Daniel Gustafsson
Thanks for the quick response. Chapter 46.6.1 says that max-rows is an
optional row limit. Unless I missed it there is nothing in the
documentation about zero meaning all rows. Wouldn't it rather be like SQL
LIMIT 0 meaning all rows?
Anyway it was surprising gotcha, but of course easy to code around.
Kieran
On Tue, 2 May 2023, 12:01 Daniel Gustafsson, <daniel@yesql.se> wrote:
Show quoted text
On 2 May 2023, at 12:30, Kieran McCusker <kieran.mccusker@gmail.com>
wrote:
I came across this when developing a sampling function using
plpy.execute that needs to be able to sample zero rows. What actually
happens is that zero is ignored for max-rows and all rows are returned.A max_rows of less than or equal to zero is IIRC interpreted as "fetch all
rows". I think this works as intended, is it documented anywhere to work
in
another way?--
Daniel Gustafsson
On 2 May 2023, at 13:37, Kieran McCusker <kieran.mccusker@gmail.com> wrote:
Thanks for the quick response. Chapter 46.6.1 says that max-rows is an optional row limit. Unless I missed it there is nothing in the documentation about zero meaning all rows. Wouldn't it rather be like SQL LIMIT 0 meaning all rows?
That does sound like something which we should document, the confusion is easy
to see. Thanks for the report.
FTR I think I misremembered in my earlier email, it's == 0 and not <= 0 which
implies to limit.
--
Daniel Gustafsson
Without making too much of a fuss, wouldn't it be simpler to honour a
row-limit of zero rather than document that it doesn't work?
On Tue, 2 May 2023 at 12:48, Daniel Gustafsson <daniel@yesql.se> wrote:
Show quoted text
On 2 May 2023, at 13:37, Kieran McCusker <kieran.mccusker@gmail.com>
wrote:
Thanks for the quick response. Chapter 46.6.1 says that max-rows is an
optional row limit. Unless I missed it there is nothing in the
documentation about zero meaning all rows. Wouldn't it rather be like SQL
LIMIT 0 meaning all rows?That does sound like something which we should document, the confusion is
easy
to see. Thanks for the report.FTR I think I misremembered in my earlier email, it's == 0 and not <= 0
which
implies to limit.--
Daniel Gustafsson
Kieran McCusker <kieran.mccusker@gmail.com> writes:
Without making too much of a fuss, wouldn't it be simpler to honour a
row-limit of zero rather than document that it doesn't work?
plpy.execute is a thin wrapper around SPI_execute, which does document
this point:
If <parameter>count</parameter> is zero then the command is executed
for all rows that it applies to. If <parameter>count</parameter>
is greater than zero, then no more than <parameter>count</parameter> rows
will be retrieved; execution stops when the count is reached, much like
adding a <literal>LIMIT</literal> clause to the query.
Since that's stood for a few decades now, changing it seems impossible
from the backwards-compatibility standpoint. However, it does seem
appropriate to repeat that material in the wrapper's documentation.
I wonder whether the similar plperl and pltcl wrappers are also
documentation-shy here.
regards, tom lane
On 2 May 2023, at 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Since that's stood for a few decades now, changing it seems impossible
from the backwards-compatibility standpoint.
Agreed, that's a non-starter.
However, it does seem
appropriate to repeat that material in the wrapper's documentation.I wonder whether the similar plperl and pltcl wrappers are also
documentation-shy here.
It seems like they are all a bit thin on explaining this. The attached diff
copies the wording (which unsurprisingly is pretty good IMO) into the
plperl/python/tcl documentation.
--
Daniel Gustafsson
Attachments:
max_rows.diffapplication/octet-stream; name=max_rows.diff; x-unix-mode=0644Download+17-2
I wrote:
Since that's stood for a few decades now, changing it seems impossible
from the backwards-compatibility standpoint. However, it does seem
appropriate to repeat that material in the wrapper's documentation.
I wonder whether the similar plperl and pltcl wrappers are also
documentation-shy here.
Indeed so. The underlying SPI documentation is solid enough on this
point, but the PLs are all misleading, in that they suggest the limit
arguments work like "LIMIT n" or "FETCH n", which isn't quite so.
I suggest the attached docs patch.
regards, tom lane
Attachments:
fix-PL-limit-documentation.patchtext/x-diff; charset=us-ascii; name=fix-PL-limit-documentation.patchDownload+29-10
Daniel Gustafsson <daniel@yesql.se> writes:
On 2 May 2023, at 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder whether the similar plperl and pltcl wrappers are also
documentation-shy here.
It seems like they are all a bit thin on explaining this. The attached diff
copies the wording (which unsurprisingly is pretty good IMO) into the
plperl/python/tcl documentation.
Ah, seems like we set to work on this at the same time :-(
I thought that s/max-rows/limit/ would be a good idea, mainly because
plperl's spi_exec_prepared uses that name as a caller-exposed hash key.
I'm not especially concerned about the wording otherwise.
regards, tom lane
On 2 May 2023, at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 2 May 2023, at 16:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder whether the similar plperl and pltcl wrappers are also
documentation-shy here.It seems like they are all a bit thin on explaining this. The attached diff
copies the wording (which unsurprisingly is pretty good IMO) into the
plperl/python/tcl documentation.Ah, seems like we set to work on this at the same time :-(
Pretty impressive timing across timezones =)
I thought that s/max-rows/limit/ would be a good idea,
I was actually thinking about that but backed off to not confuse things with
LIMIT.
mainly because
plperl's spi_exec_prepared uses that name as a caller-exposed hash key.
But I didn't realize that, and in light of that I agree that limit is better.
I'm not especially concerned about the wording otherwise.
Neither am I, both are fine I think.
--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
On 2 May 2023, at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
mainly because
plperl's spi_exec_prepared uses that name as a caller-exposed hash key.
But I didn't realize that, and in light of that I agree that limit is better.
OK, let's do it like that then.
I'm not especially concerned about the wording otherwise.
Neither am I, both are fine I think.
I'll compare and merge the two patches and push. Thanks for
looking at it!
regards, tom lane
On 2 May 2023, at 23:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 2 May 2023, at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
mainly because
plperl's spi_exec_prepared uses that name as a caller-exposed hash key.But I didn't realize that, and in light of that I agree that limit is better.
OK, let's do it like that then.
I'm not especially concerned about the wording otherwise.
Neither am I, both are fine I think.
I'll compare and merge the two patches and push. Thanks for
looking at it!
+1, sounds like a good plan.
--
Daniel Gustafsson