polymorphic table functions light

Started by Peter Eisentrautover 6 years ago9 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

I want to address the issue that calling a record-returning function
always requires specifying a result column list, even though there are
cases where the function could be self-aware enough to know the result
column list of a particular call. For example, most of the functions in
contrib/tablefunc are like that.

SQL:2016 has a feature called polymorphic table functions (PTF) that
addresses this. The full PTF feature is much larger, so I just carved
out this particular piece of functionality. Here is a link to some
related information:
https://modern-sql.com/blog/2018-11/whats-new-in-oracle-database-18c#ptf

The idea is that you attach a helper function to the main function. The
helper function is called at parse time with the constant arguments of
the main function call and can compute a result row description (a
TupleDesc in our case).

Example from the patch:

CREATE FUNCTION connectby_describe(internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'connectby_describe'
LANGUAGE C;

CREATE FUNCTION connectby(text,text,text,text,int,text)
RETURNS setof record
DESCRIBE WITH connectby_describe
AS 'MODULE_PATHNAME','connectby_text'
LANGUAGE C STABLE STRICT;

(The general idea is very similar to Pavel's patch "parse time support
function"[0]/messages/by-id/CAFj8pRARh+r4=HNwQ+hws-D6msus01Dw_6zjNYur6tPk1+W0rA@mail.gmail.com but addressing a disjoint problem.)

The original SQL:2016 syntax is a bit different: There, you'd first
create two separate functions: a "describe" and a "fulfill" and then
create the callable PTF referencing those two (similar to how an
aggregate is composed of several component functions). I think
deviating from this makes some sense because we can then more easily
"upgrade" existing record-returning functions with this functionality.

Another difference is that AFAICT, the standard specifies that if the
describe function cannot resolve the call, the call fails. Again, in
order to be able to upgrade existing functions (instead of having to
create a second set of functions with a different name), I have made it
so that you can still specify an explicit column list if the describe
function does not succeed.

In this prototype patch, I have written the C interface and several
examples using existing functions in the source tree. Eventually, I'd
like to also add PL-level support for this.

Thoughts so far?

[0]: /messages/by-id/CAFj8pRARh+r4=HNwQ+hws-D6msus01Dw_6zjNYur6tPk1+W0rA@mail.gmail.com
/messages/by-id/CAFj8pRARh+r4=HNwQ+hws-D6msus01Dw_6zjNYur6tPk1+W0rA@mail.gmail.com

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v1-0001-Polymorphic-table-functions.patchtext/plain; charset=UTF-8; name=v1-0001-Polymorphic-table-functions.patch; x-mac-creator=0; x-mac-type=0Download+503-14
#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#1)
Re: polymorphic table functions light

Hi

po 16. 12. 2019 v 19:53 odesílatel Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> napsal:

I want to address the issue that calling a record-returning function
always requires specifying a result column list, even though there are
cases where the function could be self-aware enough to know the result
column list of a particular call. For example, most of the functions in
contrib/tablefunc are like that.

SQL:2016 has a feature called polymorphic table functions (PTF) that
addresses this. The full PTF feature is much larger, so I just carved
out this particular piece of functionality. Here is a link to some
related information:
https://modern-sql.com/blog/2018-11/whats-new-in-oracle-database-18c#ptf

The idea is that you attach a helper function to the main function. The
helper function is called at parse time with the constant arguments of
the main function call and can compute a result row description (a
TupleDesc in our case).

Example from the patch:

CREATE FUNCTION connectby_describe(internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'connectby_describe'
LANGUAGE C;

CREATE FUNCTION connectby(text,text,text,text,int,text)
RETURNS setof record
DESCRIBE WITH connectby_describe
AS 'MODULE_PATHNAME','connectby_text'
LANGUAGE C STABLE STRICT;

(The general idea is very similar to Pavel's patch "parse time support
function"[0] but addressing a disjoint problem.)

The original SQL:2016 syntax is a bit different: There, you'd first
create two separate functions: a "describe" and a "fulfill" and then
create the callable PTF referencing those two (similar to how an
aggregate is composed of several component functions). I think
deviating from this makes some sense because we can then more easily
"upgrade" existing record-returning functions with this functionality.

Another difference is that AFAICT, the standard specifies that if the
describe function cannot resolve the call, the call fails. Again, in
order to be able to upgrade existing functions (instead of having to
create a second set of functions with a different name), I have made it
so that you can still specify an explicit column list if the describe
function does not succeed.

In this prototype patch, I have written the C interface and several
examples using existing functions in the source tree. Eventually, I'd
like to also add PL-level support for this.

Thoughts so far?

What I read about it - it can be very interesting feature. It add lot of
dynamic to top queries - it can be used very easy for cross tables on
server side.

Sure - it can be used very badly - but it is nothing new for stored
procedures.

Personally I like this feature. The difference from standard syntax
probably is not problem a) there are little bit syntax already, b) I cannot
to imagine wide using of this feature. But it can be interesting for
extensions.

Better to use some special pseudotype for describe function instead
"internal" - later it can interesting for PL support

Regards

Pavel

Show quoted text

[0]:

/messages/by-id/CAFj8pRARh+r4=HNwQ+hws-D6msus01Dw_6zjNYur6tPk1+W0rA@mail.gmail.com

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: polymorphic table functions light

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

I want to address the issue that calling a record-returning function
always requires specifying a result column list, even though there are
cases where the function could be self-aware enough to know the result
column list of a particular call. For example, most of the functions in
contrib/tablefunc are like that.

Seems like a reasonable goal.

SQL:2016 has a feature called polymorphic table functions (PTF) that
addresses this. The full PTF feature is much larger, so I just carved
out this particular piece of functionality. Here is a link to some
related information:
https://modern-sql.com/blog/2018-11/whats-new-in-oracle-database-18c#ptf

The idea is that you attach a helper function to the main function. The
helper function is called at parse time with the constant arguments of
the main function call and can compute a result row description (a
TupleDesc in our case).

Example from the patch:

CREATE FUNCTION connectby_describe(internal)
RETURNS internal
AS 'MODULE_PATHNAME', 'connectby_describe'
LANGUAGE C;

CREATE FUNCTION connectby(text,text,text,text,int,text)
RETURNS setof record
DESCRIBE WITH connectby_describe
AS 'MODULE_PATHNAME','connectby_text'
LANGUAGE C STABLE STRICT;

(The general idea is very similar to Pavel's patch "parse time support
function"[0] but addressing a disjoint problem.)

Hm. Given that this involves a function-taking-and-returning-internal,
I think it's fairly silly to claim that it is implementing a SQL-standard
feature, or even a subset or related feature. Nor do I see a pathway
whereby this might end in a feature you could use without writing C code.

That being the case, I'm not in favor of using up SQL syntax space for it
if we don't have to. Moreover, this approach requires a whole lot of
duplicative-seeming new infrastructure, such as a new pg_proc column.
And you're not even done yet --- where's the pg_dump support?

I think we'd be better off to address this by extending the existing
"support function" infrastructure by inventing a new support request type,
much as Pavel's patch did. I've not gotten around to reviewing the latest
version of his patch, so I'm not sure if it provides enough flexibility to
solve this particular problem, or if we'd need a different request type
than he proposes. But I'd rather go down that path than this one.
It should provide the same amount of functionality with a whole lot less
overhead code.

regards, tom lane

#4Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#3)
Re: polymorphic table functions light

On 16/12/2019 22:13, Tom Lane wrote:

That being the case, I'm not in favor of using up SQL syntax space for it
if we don't have to.

Do I understand correctly that you are advocating *against* using
standard SQL syntax for a feature that is defined by the SQL Standard
and that we have no similar implementation for?

If so, I would like to stand up to it.  We are known as (at least one
of) the most conforming implementations and I hope we will continue to
be so.  I would rather we remove from rather than add to this page:
https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL_Standard

--

Vik Fearing

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vik Fearing (#4)
Re: polymorphic table functions light

Vik Fearing <vik.fearing@2ndquadrant.com> writes:

On 16/12/2019 22:13, Tom Lane wrote:

That being the case, I'm not in favor of using up SQL syntax space for it
if we don't have to.

Do I understand correctly that you are advocating *against* using
standard SQL syntax for a feature that is defined by the SQL Standard
and that we have no similar implementation for?

My point is that what Peter is proposing is exactly *not* the standard's
feature. We generally avoid using up standard syntax for not-standard
semantics, especially if there's any chance that somebody might come along
and build a more-conformant version later. (Having said that, I had the
impression that what he was proposing wasn't the standard's syntax either,
but just a homegrown CREATE FUNCTION addition. I don't really see the
point of doing it like that when we can do it below the level of SQL.)

regards, tom lane

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: polymorphic table functions light

On 2019-12-16 19:53, Peter Eisentraut wrote:

SQL:2016 has a feature called polymorphic table functions (PTF) that
addresses this. The full PTF feature is much larger, so I just carved
out this particular piece of functionality. Here is a link to some
related information:
https://modern-sql.com/blog/2018-11/whats-new-in-oracle-database-18c#ptf

The idea is that you attach a helper function to the main function. The
helper function is called at parse time with the constant arguments of
the main function call and can compute a result row description (a
TupleDesc in our case).

Here is an updated patch for the record, since the previous patch had
accumulated some significant merge conflicts.

I will reply to the discussions elsewhere in the thread.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-Polymorphic-table-functions.patchtext/plain; charset=UTF-8; name=v2-0001-Polymorphic-table-functions.patch; x-mac-creator=0; x-mac-type=0Download+504-15
#7Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#3)
Re: polymorphic table functions light

On 2019-12-16 22:13, Tom Lane wrote:

Hm. Given that this involves a function-taking-and-returning-internal,
I think it's fairly silly to claim that it is implementing a SQL-standard
feature, or even a subset or related feature. Nor do I see a pathway
whereby this might end in a feature you could use without writing C code.

I think we'd be better off to address this by extending the existing
"support function" infrastructure by inventing a new support request type,

I definitely want to make it work in a way that does not require writing
C code. My idea was to create a new type, perhaps called "descriptor",
that represents essentially a tuple descriptor. (It could be exactly a
TupleDesc, as this patch does, or something similar.) For the sake of
discussion, we could use JSON as the text representation of this. Then
a PL/pgSQL function or something else high level could easily be written
to assemble this. Interesting use cases are for example in the area of
using PL/Perl or PL/Python for unpacking some serialization format using
existing modules in those languages.

The SQL standard has the option of leaving the call signatures of the
PTF support functions implementation defined, so this approach would
appear to be within the spirit of the specification.

Obviously, there is a lot of leg work to be done between here and there,
but it seems doable. The purpose of this initial patch submission was
to get some opinions on the basic idea of "determine result tuple
structure by calling helper function at parse time", and so far no one
has fallen off their chair from that, so I'm encouraged. ;-)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#8Peter Eisentraut
peter_e@gmx.net
In reply to: Vik Fearing (#4)
Re: polymorphic table functions light

On 2019-12-20 01:30, Vik Fearing wrote:

On 16/12/2019 22:13, Tom Lane wrote:

That being the case, I'm not in favor of using up SQL syntax space for it
if we don't have to.

Do I understand correctly that you are advocating *against* using
standard SQL syntax for a feature that is defined by the SQL Standard
and that we have no similar implementation for?

On the question of using SQL syntax or not for this, there are a couple
of arguments I'm considering.

First, the SQL standard explicitly permits not implementing the exact
signatures of the PTF component procedures; see feature code B208.
While this does not literally permit diverging on the CREATE FUNCTION
syntax, it's clear that they expect that the creation side of this will
have some incompatibilities. The existing practices of other vendors
support this observation. What's more interesting in practice is making
the invocation side compatible.

Second, set-returning functions in PostgreSQL already exist and in my
mind it would make sense to make this feature work with existing
functions or allow easy "upgrades" rather than introducing another
completely new syntax to do something very similar to what already
exists. This wouldn't be a good user experience. And the full standard
syntax is also complicated and different enough that it wouldn't be
trivial to add.

But I'm open to other ideas.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Dent John
denty@QQdd.eu
In reply to: Peter Eisentraut (#7)
Re: polymorphic table functions light

On 24 Jan 2020, at 08:27, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

I definitely want to make it work in a way that does not require writing C code. My idea was to create a new type, perhaps called "descriptor", that represents essentially a tuple descriptor. (It could be exactly a TupleDesc, as this patch does, or something similar.) For the sake of discussion, we could use JSON as the text representation of this. Then a PL/pgSQL function or something else high level could easily be written to assemble this. Interesting use cases are for example in the area of using PL/Perl or PL/Python for unpacking some serialization format using existing modules in those languages.

I do think it’s very desirable to make it usable outside of C code.

Obviously, there is a lot of leg work to be done between here and there, but it seems doable. The purpose of this initial patch submission was to get some opinions on the basic idea of "determine result tuple structure by calling helper function at parse time", and so far no one has fallen off their chair from that, so I'm encouraged. ;-)

I’m interested in this development, as it makes RECORD-returning SRFs in the SELECT list a viable proposition, and that in turn allows a ValuePerCall SRF to get meaningful benefit from pipelining. (They could always pipeline, but there is no way to extract information from the RECORD that’s returned, with the sole exception of row_to_json.)

I couldn’t check out that it would work though because I couldn’t apply the v2 (or v1) patch against either 12.0 or 530609a (which I think was sometime around 25th Jan). Against 12.0, I got a few rejections (prepjointree.c and clauses.c). I figured they might be inconsequential, but no: initdb then fails at CREATE VIEW pg_policies. Different rejections against 530609a, but still initdb fails.

But I’m definitely very much encouraged.

denty.