proposal: plpgsql - iteration over fields of rec or row variable

Started by Pavel Stehuleabout 15 years ago30 messages
#1Pavel Stehule
pavel.stehule@gmail.com

I try to take problem of iteration over ROW or RECORD variable from a
different perspective. I would to design a solution where isn't
necessary a conversion from binary value to text (this is a
disadvantage of hstore based solution). This mean so we have to have a
special instance of loop's body for every field of record (for every
field with different type then other field). Can we do it? Yes, we can
- we can use a similar access like polymorphic parameters - just used
not on function level, but on block level. We can iterate of record's
fields and for any distinct type we can do new instance of block
(loop's body) with new instances of included plans. I am thinking
about following syntax:

FOR varname OVER [row | rec variable]
LOOP
{{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations. But it
isn't problem, because for any unique data type we will have a
separate instance of {{body}}. control variable with name 'varname' is
redeclared for every iteration of cycle. This variable should be
writeable - so we are able to change any field of record. We can
define a automatic variable FIELDNAME that holds a name of processed
field.

so: sum over some row or rec var can be done with code:

CREATE rectype AS (x int, y int, f float);

DECLARE revar rectype;
BEGIN
FOR f OVER recvar
LOOP
sum := sum + f;
END LOOP;
...

or
FOR f OVER recvar
LOOP
IF fieldname LIKE 'value%' THEN
sum := sum + f;
END IF;
END LOOP;

some other examples:

FOR f OVER recvar
LOOP
RAISE NOTICE '% => %', fieldname, f;
END LOOP;

FOR f OVER recvar
LOOP
IF fieldname LIKE 'value%' THEN
f := 0;
END IF;
END LOOP;

What are you thinking of this proposal?

Regards

Pavel Stehule

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
{{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid. The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

The idea of multiple instances of the loop body code seems like a mess
anyway. I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

regards, tom lane

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/8 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
   {{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

you don't know type or name in validation time. But you don't need to
know it. Name is useless because you access to field via control
variable and type is known in runtime - outer loop's body is
unspecified, but inside loop's body is known. It's analogy to
functions with polymorphic parameters. Outside and in validation time
is parameter type unknown. Inside function in runtime parameter type
is known. I though about it some minutes more, and probably we can do
it without this baroque statement

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid.  The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

this variable can be like we need - this is automatic declared
variable - we can define a new DTYPE flag, so we are able to control a
assign to this variable - we can block a writing or we can to forward
changes to outer variable. If we can do rowvar.field = some or
recvar.field = some, then we are able to do dynamically too.

The idea of multiple instances of the loop body code seems like a mess
anyway.  I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

I think about it long time. There are a two basic issues: a) one plan
for one assign statement b) dynamic selection of some record's field.
Both points cannot be solved now, because every field can have a
different type. So this syntax changing it. For every field we have a
special limited space, so we can work locally with "mutable" plans,
because the plans are fixed in one iteration. I am sure, so @a can be
solved relative simply without FOR OVER or similar construct. But @b
is more difficult - you can do it just on SQL level, but it need a
generating path in plan for every field in record.

I know so LOOP OVER is relative heavy, but it decrease a necessary
changes in SQL planner to zero

One note - the idea of multiple instances of stored plans inside
PLpgSQL expr isn't far to your proposal of solution for bad plans?

I am open to any ideas. Now I am a searching a possible way. With last
change in plperl it is relative simple to iterate over row or record -
and with possible a access to type descriptor, the iteration can be
relative simple. But I see a main disadvantage: any value must be one
or more times serialized or deserialized to text - and plperl must be
enabled.

Regards

Pavel Stehule

Show quoted text

                       regards, tom lane

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#3)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Mon, Nov 8, 2010 at 2:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2010/11/8 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
   {{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

you don't know type or name in validation time. But you don't need to
know it. Name is useless because you access to field via control
variable and type is known in runtime - outer loop's body is
unspecified, but inside loop's body is known. It's analogy to
functions with polymorphic parameters. Outside and in validation time
is parameter type unknown. Inside function in runtime parameter type
is known. I though about it some minutes more, and probably we can do
it without this baroque statement

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid.  The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

this variable can be like we need - this is automatic declared
variable - we can define a new DTYPE flag, so we are able to control a
assign to this variable - we can block a writing or we can to forward
changes to outer variable. If we can do rowvar.field = some or
recvar.field = some, then we are able to do dynamically too.

The idea of multiple instances of the loop body code seems like a mess
anyway.  I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

I think about it long time. There are a two basic issues: a) one plan
for one assign statement b) dynamic selection of some record's field.
Both points cannot be solved now, because every field can have  a
different type. So this syntax changing it. For every field we have a
special limited space, so we can work locally with "mutable" plans,
because the plans are fixed in one iteration. I am sure, so @a can be
solved relative simply without FOR OVER or similar construct. But @b
is more difficult - you can do it just on SQL level, but it need a
generating path in plan for every field in record.

I know so LOOP OVER is relative heavy, but it decrease a necessary
changes in SQL planner to zero

One note - the idea of multiple instances of stored plans inside
PLpgSQL expr isn't far to your proposal of solution for bad plans?

I am open to any ideas. Now I am a searching a possible way. With last
change in plperl it is relative simple to iterate over row or record -
and with possible a access to type descriptor, the iteration can be
relative simple. But I see a main disadvantage: any value must be one
or more times serialized or deserialized to text - and plperl must be
enabled.

Most cases of this feature are for dealing with new/old from trigger
function right? Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo. Not
sure how you get direct type assignment to variable but it could
probably be worked out.

merlin

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#4)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/8 Merlin Moncure <mmoncure@gmail.com>:

On Mon, Nov 8, 2010 at 2:29 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2010/11/8 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
   {{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

you don't know type or name in validation time. But you don't need to
know it. Name is useless because you access to field via control
variable and type is known in runtime - outer loop's body is
unspecified, but inside loop's body is known. It's analogy to
functions with polymorphic parameters. Outside and in validation time
is parameter type unknown. Inside function in runtime parameter type
is known. I though about it some minutes more, and probably we can do
it without this baroque statement

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid.  The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

this variable can be like we need - this is automatic declared
variable - we can define a new DTYPE flag, so we are able to control a
assign to this variable - we can block a writing or we can to forward
changes to outer variable. If we can do rowvar.field = some or
recvar.field = some, then we are able to do dynamically too.

The idea of multiple instances of the loop body code seems like a mess
anyway.  I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

I think about it long time. There are a two basic issues: a) one plan
for one assign statement b) dynamic selection of some record's field.
Both points cannot be solved now, because every field can have  a
different type. So this syntax changing it. For every field we have a
special limited space, so we can work locally with "mutable" plans,
because the plans are fixed in one iteration. I am sure, so @a can be
solved relative simply without FOR OVER or similar construct. But @b
is more difficult - you can do it just on SQL level, but it need a
generating path in plan for every field in record.

I know so LOOP OVER is relative heavy, but it decrease a necessary
changes in SQL planner to zero

One note - the idea of multiple instances of stored plans inside
PLpgSQL expr isn't far to your proposal of solution for bad plans?

I am open to any ideas. Now I am a searching a possible way. With last
change in plperl it is relative simple to iterate over row or record -
and with possible a access to type descriptor, the iteration can be
relative simple. But I see a main disadvantage: any value must be one
or more times serialized or deserialized to text - and plperl must be
enabled.

Most cases of this feature are for dealing with new/old from trigger
function right?  Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo.  Not
sure how you get direct type assignment to variable but it could
probably be worked out.

if I understand well - it's not too far to my idea - just you create
instance on function level? It is possible too. As disadvantages I
see:
a) you need some special syntax too
b) there is overhead with multiple function call
c) you have to manage some space for temporary values

Regards

Pavel Stehule

Show quoted text

merlin

#6Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Merlin Moncure (#4)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Hey Pavel, Tom, Merlin,

As a user, I would like to work with records by using simple API:

-- Returns a number of key/values pairs of record.
nKeys_ integer := nRecordKeys(NEW);

-- Returns an i-th key.
key_i text := recordKey(NEW, i);

-- Returns an i-th value.
value1_ text := recordValueByIndex(NEW, i);

-- Returns an value by named key.
value2_ text := recordValueByName(NEW, "id");

and so on...

The syntax with FOR .. LOOP for iteration across record keys
seems to me not so elegant.

2010/11/8 Merlin Moncure <mmoncure@gmail.com>

On Mon, Nov 8, 2010 at 2:29 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

2010/11/8 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
{{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

you don't know type or name in validation time. But you don't need to
know it. Name is useless because you access to field via control
variable and type is known in runtime - outer loop's body is
unspecified, but inside loop's body is known. It's analogy to
functions with polymorphic parameters. Outside and in validation time
is parameter type unknown. Inside function in runtime parameter type
is known. I though about it some minutes more, and probably we can do
it without this baroque statement

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid. The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

this variable can be like we need - this is automatic declared
variable - we can define a new DTYPE flag, so we are able to control a
assign to this variable - we can block a writing or we can to forward
changes to outer variable. If we can do rowvar.field = some or
recvar.field = some, then we are able to do dynamically too.

The idea of multiple instances of the loop body code seems like a mess
anyway. I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

I think about it long time. There are a two basic issues: a) one plan
for one assign statement b) dynamic selection of some record's field.
Both points cannot be solved now, because every field can have a
different type. So this syntax changing it. For every field we have a
special limited space, so we can work locally with "mutable" plans,
because the plans are fixed in one iteration. I am sure, so @a can be
solved relative simply without FOR OVER or similar construct. But @b
is more difficult - you can do it just on SQL level, but it need a
generating path in plan for every field in record.

I know so LOOP OVER is relative heavy, but it decrease a necessary
changes in SQL planner to zero

One note - the idea of multiple instances of stored plans inside
PLpgSQL expr isn't far to your proposal of solution for bad plans?

I am open to any ideas. Now I am a searching a possible way. With last
change in plperl it is relative simple to iterate over row or record -
and with possible a access to type descriptor, the iteration can be
relative simple. But I see a main disadvantage: any value must be one
or more times serialized or deserialized to text - and plperl must be
enabled.

Most cases of this feature are for dealing with new/old from trigger
function right? Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo. Not
sure how you get direct type assignment to variable but it could
probably be worked out.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
// Dmitriy.

#7Merlin Moncure
mmoncure@gmail.com
In reply to: Dmitriy Igrishin (#6)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Mon, Nov 8, 2010 at 3:02 PM, Dmitriy Igrishin <dmitigr@gmail.com> wrote:

Hey Pavel, Tom, Merlin,

As a user, I would like to work with records by using simple API:

  -- Returns a number of key/values pairs of record.
  nKeys_ integer := nRecordKeys(NEW);

  -- Returns an i-th key.
  key_i text := recordKey(NEW, i);

  -- Returns an i-th value.
  value1_ text := recordValueByIndex(NEW, i);

  -- Returns an value by named key.
  value2_ text := recordValueByName(NEW, "id");

and so on...

This doesn't really solve the problem -- you need some participation
from plpgsql because function behavior post-plan time can not be
ambiguous.

merlin

#8Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#5)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Mon, Nov 8, 2010 at 3:00 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Most cases of this feature are for dealing with new/old from trigger
function right?  Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo.  Not
sure how you get direct type assignment to variable but it could
probably be worked out.

if I understand well - it's not too far to my idea - just you create
instance on function level? It is possible too. As disadvantages I
see:
a) you need some special syntax too
b) there is overhead with multiple function call
c) you have to manage some space for temporary values

yes. If you need to deal with plan instance it should be at function
level IMO. There are other cases for this, search_path for example.
What overhead?

merlin

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dmitriy Igrishin (#6)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Hello

2010/11/8 Dmitriy Igrishin <dmitigr@gmail.com>:

Hey Pavel, Tom, Merlin,

As a user, I would like to work with records by using simple API:

  -- Returns a number of key/values pairs of record.
  nKeys_ integer := nRecordKeys(NEW);

  -- Returns an i-th key.
  key_i text := recordKey(NEW, i);

  -- Returns an i-th value.
  value1_ text := recordValueByIndex(NEW, i);

  -- Returns an value by named key.
  value2_ text := recordValueByName(NEW, "id");

some from your lines isn't problem now. You can convert all fields in
record to text and work with it. These functions is very simple. But
it isn't effective, because you convert fields to text and you can
lost a some information or you can get some different

like

intvar := 10.0 / 2.0; doesn't work in plpgsql, because there are casting via IO

Pavel

Show quoted text

The syntax with FOR .. LOOP for iteration across record keys
seems to me not so elegant.

2010/11/8 Merlin Moncure <mmoncure@gmail.com>

On Mon, Nov 8, 2010 at 2:29 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

2010/11/8 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

FOR varname OVER [row | rec variable]
LOOP
   {{body}}
END LOOP

this syntax isn't final. The real type of control variable is
specified in runtime and can be changed between iterations.

If you don't know the type or even the name of the field that varname
currently represents, how is the loop body going to do anything useful?

you don't know type or name in validation time. But you don't need to
know it. Name is useless because you access to field via control
variable and type is known in runtime - outer loop's body is
unspecified, but inside loop's body is known. It's analogy to
functions with polymorphic parameters. Outside and in validation time
is parameter type unknown. Inside function in runtime parameter type
is known. I though about it some minutes more, and probably we can do
it without this baroque statement

This variable should be
writeable - so we are able to change any field of record.

And that is just plain horrid.  The loop variable is a separate
variable, not a modifiable alias for a field, in every existing form
of plpgsql loop.

this variable can be like we need - this is automatic declared
variable - we can define a new DTYPE flag, so we are able to control a
assign to this variable - we can block a writing or we can to forward
changes to outer variable. If we can do rowvar.field = some or
recvar.field = some, then we are able to do dynamically too.

The idea of multiple instances of the loop body code seems like a mess
anyway.  I think this is basically hacking plpgsql beyond recognition
to solve problems that are better solved in plperl or pltcl.

I think about it long time. There are a two basic issues: a) one plan
for one assign statement b) dynamic selection of some record's field.
Both points cannot be solved now, because every field can have  a
different type. So this syntax changing it. For every field we have a
special limited space, so we can work locally with "mutable" plans,
because the plans are fixed in one iteration. I am sure, so @a can be
solved relative simply without FOR OVER or similar construct. But @b
is more difficult - you can do it just on SQL level, but it need a
generating path in plan for every field in record.

I know so LOOP OVER is relative heavy, but it decrease a necessary
changes in SQL planner to zero

One note - the idea of multiple instances of stored plans inside
PLpgSQL expr isn't far to your proposal of solution for bad plans?

I am open to any ideas. Now I am a searching a possible way. With last
change in plperl it is relative simple to iterate over row or record -
and with possible a access to type descriptor, the iteration can be
relative simple. But I see a main disadvantage: any value must be one
or more times serialized or deserialized to text - and plperl must be
enabled.

Most cases of this feature are for dealing with new/old from trigger
function right?  Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo.  Not
sure how you get direct type assignment to variable but it could
probably be worked out.

merlin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
// Dmitriy.

#10Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Merlin Moncure (#7)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Merlin,

Sorry, I don't clearly understand what the problem here, but I just want to
avoid
conversion from record to hstore with suggested API. I am currently happy
with
hstore API to work with record. From the user's point of view :-).

2010/11/8 Merlin Moncure <mmoncure@gmail.com>

On Mon, Nov 8, 2010 at 3:02 PM, Dmitriy Igrishin <dmitigr@gmail.com>
wrote:

Hey Pavel, Tom, Merlin,

As a user, I would like to work with records by using simple API:

-- Returns a number of key/values pairs of record.
nKeys_ integer := nRecordKeys(NEW);

-- Returns an i-th key.
key_i text := recordKey(NEW, i);

-- Returns an i-th value.
value1_ text := recordValueByIndex(NEW, i);

-- Returns an value by named key.
value2_ text := recordValueByName(NEW, "id");

and so on...

This doesn't really solve the problem -- you need some participation
from plpgsql because function behavior post-plan time can not be
ambiguous.

merlin

--
// Dmitriy.

#11Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#8)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/8 Merlin Moncure <mmoncure@gmail.com>:

On Mon, Nov 8, 2010 at 3:00 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Most cases of this feature are for dealing with new/old from trigger
function right?  Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo.  Not
sure how you get direct type assignment to variable but it could
probably be worked out.

if I understand well - it's not too far to my idea - just you create
instance on function level? It is possible too. As disadvantages I
see:
a) you need some special syntax too
b) there is overhead with multiple function call
c) you have to manage some space for temporary values

yes.  If you need to deal with plan instance it should be at function
level IMO.  There are other cases for this, search_path for example.
What overhead?

you call a trigger body more times then once. The call of plpgsql
isn't cheep. Main problem is missing a some working memory. Task:
ensure so sum of fields must be less than some constant?

What is solution in your design?

Pavel

Show quoted text

merlin

#12Merlin Moncure
mmoncure@gmail.com
In reply to: Pavel Stehule (#11)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Mon, Nov 8, 2010 at 3:21 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2010/11/8 Merlin Moncure <mmoncure@gmail.com>:

On Mon, Nov 8, 2010 at 3:00 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Most cases of this feature are for dealing with new/old from trigger
function right?  Why not build a complete new plan for each specific
trigger that invokes the function, along with some magic values like
(TG_FIELDNAMES -> text[]) that could be iterated for the mojo.  Not
sure how you get direct type assignment to variable but it could
probably be worked out.

if I understand well - it's not too far to my idea - just you create
instance on function level? It is possible too. As disadvantages I
see:
a) you need some special syntax too
b) there is overhead with multiple function call
c) you have to manage some space for temporary values

yes.  If you need to deal with plan instance it should be at function
level IMO.  There are other cases for this, search_path for example.
What overhead?

you call a trigger body more times then once. The call of plpgsql
isn't cheep. Main problem is missing a some working memory. Task:
ensure so sum of fields must be less than some constant?

What is solution in your design?

nothing specific. My main interest is in the plan management portion
(only because of side interest in search_path which is complete mess
at present). if plpgsql_compile knows that it is trigger function and
which table fired the trigger, you can generate specific hash key, and
take advantage of new/old being well defined inside function
execution. This seems less fiddly than trying to deal with plan
mechanics inside the function. My knowledge stops there -- I don't
have a real good understanding of how plpgsql works internally. If
this idea passes smell test maybe it merits more research.

merlin

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Merlin Moncure (#12)
Re: proposal: plpgsql - iteration over fields of rec or row variable

What is solution in your design?

nothing specific.   My main interest is in the plan management portion
(only because of side interest in search_path which is complete mess
at present).  if plpgsql_compile knows that it is trigger function and
which table fired the trigger, you can generate specific hash key, and
take advantage of new/old being well defined inside function
execution.  This seems less fiddly than trying to deal with plan
mechanics inside the function.  My knowledge stops there -- I don't
have a real good understanding of how plpgsql works internally.  If
this idea passes smell test maybe it merits more research.

I still don't understand how it can be used for iteration over record?

The basic question is - what tasks we have to solve?

a) general constraints over fields
b) general initialization over fields
c) custom record serialization/deserialization - audits, logs, delta
compression, custom formatting (xml, json)

Next question - what breaks we have to across?
a) using a different types for some fields - invalid plans
b) lost of type info
c) toast / detoast overhead
d) text / binary transformation

Possible way:

a) transformation to common type

+ simple - it is one day job - function record_to_array,
array_to_record, and fieldnames_to_array
 - lost of type info, hidden problems with IO cast - int a := 10.0/2.0
is a problem

using a plperl, pltcl is same like @a

b) FOR OVER or similar loop
 + there are not text / binary cast
  - necessary to introduce a new concept - a multiple instances of loop's body
  - longer cache of plans, but it must not be terrible - instance is
per distinct field type not per type
c) LAMBDA calcul? - maybe Merlin's idea
 + there are not text / binary cast
 + probably not necessary changes inside plpgsql
 - it's far to ADA - or do you know any lambda in ADA or PL/SQL?
 - probably higher overhead with detoast
 - probably higher overhead with function call

is this review complete? any other ideas?

Regards

Pavel

Show quoted text

merlin

#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#13)
Re: proposal: plpgsql - iteration over fields of rec or row variable

a) transformation to common type

+ simple - it is one day job - function record_to_array,
array_to_record, and fieldnames_to_array
 - lost of type info, hidden problems with IO cast - int a := 10.0/2.0
is a problem

using a plperl, pltcl is same like @a

I thinking about some simple API, that can be based on transformation
to text. It can be enough for almost all.

* text[] = record_to_array(record)
* table(id, key, datatype, value) = record_to_table(record)
* text = record_get_field(record, text)
* record = record_set_field(record, text, anyelement)

??

Pavel

#15Dmitriy Igrishin
dmitigr@gmail.com
In reply to: Pavel Stehule (#14)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Hey Pavel,

2010/11/9 Pavel Stehule <pavel.stehule@gmail.com>

a) transformation to common type

+ simple - it is one day job - function record_to_array,
array_to_record, and fieldnames_to_array
- lost of type info, hidden problems with IO cast - int a := 10.0/2.0
is a problem

using a plperl, pltcl is same like @a

I thinking about some simple API, that can be based on transformation
to text. It can be enough for almost all.

* text[] = record_to_array(record)
* table(id, key, datatype, value) = record_to_table(record)
* text = record_get_field(record, text)
* record = record_set_field(record, text, anyelement)

??

I personally like it. But I propose to add as well:
integer := record_nkeys();
text := record_get_field(record, integer);

Pavel

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
// Dmitriy.

#16David E. Wheeler
david@kineticode.com
In reply to: Dmitriy Igrishin (#15)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Nov 9, 2010, at 1:38 AM, Dmitriy Igrishin wrote:

* text[] = record_to_array(record)
* table(id, key, datatype, value) = record_to_table(record)
* text = record_get_field(record, text)
* record = record_set_field(record, text, anyelement)

??
I personally like it. But I propose to add as well:
integer := record_nkeys();
text := record_get_field(record, integer);

You realize you can pretty much do all this with hstore, right?

hstore hash := hstore(record);
ary text[] := hstore_to_matrix(hash);
select * from each(hash); -- TABLE
text foo := hash -> somekey;
hash := hash || '"key","value"'::hstore;
record := populate_record(record, hash);

Best,

David

#17Dmitriy Igrishin
dmitigr@gmail.com
In reply to: David E. Wheeler (#16)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Hey David,

2010/11/9 David E. Wheeler <david@kineticode.com>

On Nov 9, 2010, at 1:38 AM, Dmitriy Igrishin wrote:

* text[] = record_to_array(record)
* table(id, key, datatype, value) = record_to_table(record)
* text = record_get_field(record, text)
* record = record_set_field(record, text, anyelement)

??
I personally like it. But I propose to add as well:
integer := record_nkeys();
text := record_get_field(record, integer);

You realize you can pretty much do all this with hstore, right?

hstore hash := hstore(record);
ary text[] := hstore_to_matrix(hash);
select * from each(hash); -- TABLE
text foo := hash -> somekey;
hash := hash || '"key","value"'::hstore;
record := populate_record(record, hash);

Yep, but hstore is an additional module. Although, its not a problem.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
// Dmitriy.

#18David E. Wheeler
david@kineticode.com
In reply to: Dmitriy Igrishin (#17)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Nov 9, 2010, at 9:18 AM, Dmitriy Igrishin wrote:

Yep, but hstore is an additional module. Although, its not a problem.

Yeah, but JSON will be in core, and with luck, before long, it will have the same (or similar) capabilities.

Best,

David

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#16)
Re: proposal: plpgsql - iteration over fields of rec or row variable

"David E. Wheeler" <david@kineticode.com> writes:

You realize you can pretty much do all this with hstore, right?

Yeah. Anything that involves smashing all the fields to text is not
really an advance over (a) hstore or (b) using plperl or one of the
other weakly-typed PLs.

I think there's a fairly fundamental contradiction involved here.
One of the basic design attributes of plpgsql is that it's strongly
typed. Sometimes that's a blessing, and sometimes it's not, but
it's a fact. There really isn't a good way to deal with run-time
field selection while still maintaining strong typing. I do not
believe that the answer to that problem is "so let's break strong
typing". Rather, the answer is that if that's what you need, you
need to use a different tool. There's a reason we support multiple
PLs.

regards, tom lane

#20Pavel Stehule
pavel.stehule@gmail.com
In reply to: David E. Wheeler (#16)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/9 David E. Wheeler <david@kineticode.com>:

On Nov 9, 2010, at 1:38 AM, Dmitriy Igrishin wrote:

* text[] = record_to_array(record)
* table(id, key, datatype, value) = record_to_table(record)
* text = record_get_field(record, text)
* record = record_set_field(record, text, anyelement)

??
I personally like it. But I propose to add as well:
  integer := record_nkeys();
  text := record_get_field(record, integer);

You realize you can pretty much do all this with hstore, right?

hstore has similar functionality, but missing a some details and add
lot of other functionality - it doesn't identify type of field.
Personally - it is nothing what I like - but can be better than
nothing.

Pavel

Show quoted text

 hstore hash := hstore(record);
 ary    text[] := hstore_to_matrix(hash);
 select * from each(hash); -- TABLE
 text foo := hash -> somekey;
 hash := hash || '"key","value"'::hstore;
 record := populate_record(record, hash);

Best,

David

#21David E. Wheeler
david@kineticode.com
In reply to: Pavel Stehule (#20)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:

You realize you can pretty much do all this with hstore, right?

hstore has similar functionality, but missing a some details and add
lot of other functionality - it doesn't identify type of field.
Personally - it is nothing what I like - but can be better than
nothing.

The JSON data type will give you some basic types (text, number, boolean).

Best,

David

#22Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#19)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/9 Tom Lane <tgl@sss.pgh.pa.us>:

"David E. Wheeler" <david@kineticode.com> writes:

You realize you can pretty much do all this with hstore, right?

Yeah.  Anything that involves smashing all the fields to text is not
really an advance over (a) hstore or (b) using plperl or one of the
other weakly-typed PLs.

I think there's a fairly fundamental contradiction involved here.
One of the basic design attributes of plpgsql is that it's strongly
typed.  Sometimes that's a blessing, and sometimes it's not, but
it's a fact.  There really isn't a good way to deal with run-time
field selection while still maintaining strong typing.  I do not
believe that the answer to that problem is "so let's break strong
typing".  Rather, the answer is that if that's what you need, you
need to use a different tool.  There's a reason we support multiple
PLs.

yes - I know these arguments well. But you have to know so any
combination of PL increase a project complexity and increase a price
for maintaining, installation, Now It's relative safe to say to
somebody - you need a plpgsql. But it's more difficult to say same
about plperl, pltcl, plpython - I like plperl too much, but I would to
use it for untrusted operation and not for some very simple and
general task.

Pavel

Show quoted text

                       regards, tom lane

#23David E. Wheeler
david@kineticode.com
In reply to: Tom Lane (#19)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Nov 9, 2010, at 9:34 AM, Tom Lane wrote:

I think there's a fairly fundamental contradiction involved here.
One of the basic design attributes of plpgsql is that it's strongly
typed. Sometimes that's a blessing, and sometimes it's not, but
it's a fact. There really isn't a good way to deal with run-time
field selection while still maintaining strong typing. I do not
believe that the answer to that problem is "so let's break strong
typing". Rather, the answer is that if that's what you need, you
need to use a different tool. There's a reason we support multiple
PLs.

Yeah, I think that Pavel wants to get at the record values with their types in tact. Not sure how that'd work though, really. If you know the type of the record columns already, you can just get them. But I'm not sure how you could introspect the column names and their types, and then get those types out without casting, except perhaps via EXECUTE…

Best,

David

#24David E. Wheeler
david@kineticode.com
In reply to: Pavel Stehule (#20)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:

hstore has similar functionality, but missing a some details and add
lot of other functionality - it doesn't identify type of field.
Personally - it is nothing what I like - but can be better than
nothing.

What are you going to do with the type once you have it?

David

#25Pavel Stehule
pavel.stehule@gmail.com
In reply to: David E. Wheeler (#21)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/9 David E. Wheeler <david@kineticode.com>:

On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:

You realize you can pretty much do all this with hstore, right?

hstore has similar functionality, but missing a some details and add
lot of other functionality - it doesn't identify type of field.
Personally - it is nothing what I like - but can be better than
nothing.

The JSON data type will give you some basic types (text, number, boolean).

yes, but it's same - transformation via text, and still it's more than
less hack - when you like to iterate over record, then you need to
transform record (generic and basic type) to JSON and then to text.
It's nothing nice - and I don't see a difference between a use a
Plperl or JSON.

Pavel

Show quoted text

Best,

David

#26Pavel Stehule
pavel.stehule@gmail.com
In reply to: David E. Wheeler (#23)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/9 David E. Wheeler <david@kineticode.com>:

On Nov 9, 2010, at 9:34 AM, Tom Lane wrote:

I think there's a fairly fundamental contradiction involved here.
One of the basic design attributes of plpgsql is that it's strongly
typed.  Sometimes that's a blessing, and sometimes it's not, but
it's a fact.  There really isn't a good way to deal with run-time
field selection while still maintaining strong typing.  I do not
believe that the answer to that problem is "so let's break strong
typing".  Rather, the answer is that if that's what you need, you
need to use a different tool.  There's a reason we support multiple
PLs.

Yeah, I think that Pavel wants to get at the record values with their types in tact. Not sure how that'd work though, really. If you know the type of the record columns already, you can just get them. But I'm not sure how you could introspect the column names and their types, and then get those types out without casting, except perhaps via EXECUTE…

every variable in plpgsql has known type descriptor. So it's not a
problem. The problem is a access to fields in cycle - and then you
need a simple trick like polymorphic parameters.

Pavel

Show quoted text

Best,

David

#27Pavel Stehule
pavel.stehule@gmail.com
In reply to: David E. Wheeler (#24)
Re: proposal: plpgsql - iteration over fields of rec or row variable

2010/11/9 David E. Wheeler <david@kineticode.com>:

On Nov 9, 2010, at 9:35 AM, Pavel Stehule wrote:

hstore has similar functionality, but missing a some details and add
lot of other functionality - it doesn't identify type of field.
Personally - it is nothing what I like - but can be better than
nothing.

What are you going to do with the type once you have it?

for example, you can use it for formatting, for explicit cast, for
different serialization type - like JSON - without knowledge of type,
you can't to build correct JSON value. So you can write a application
with knowledge of type and you don't need to detect type from value,
that isn't robust.

Pavel

Show quoted text

David

#28Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#27)
Re: proposal: plpgsql - iteration over fields of rec or row variable

What are you going to do with the type once you have it?

for example, you can use it for formatting, for explicit cast, for
different serialization type - like JSON - without knowledge of type,
you can't to build correct JSON value. So you can write a application
with knowledge of type and you don't need to detect type from value,
that isn't robust.

there is other disadvantage of access to fields via PL function (or
native function) based on transformation to text. It's a lost of
typmod.

Pavel

Show quoted text

Pavel

David

#29Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#19)
Re: proposal: plpgsql - iteration over fields of rec or row variable

On Tue, Nov 9, 2010 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"David E. Wheeler" <david@kineticode.com> writes:

You realize you can pretty much do all this with hstore, right?

Yeah.  Anything that involves smashing all the fields to text is not
really an advance over (a) hstore or (b) using plperl or one of the
other weakly-typed PLs.

I think there's a fairly fundamental contradiction involved here.
One of the basic design attributes of plpgsql is that it's strongly
typed.  Sometimes that's a blessing, and sometimes it's not, but
it's a fact.  There really isn't a good way to deal with run-time
field selection while still maintaining strong typing.  I do not
believe that the answer to that problem is "so let's break strong
typing".  Rather, the answer is that if that's what you need, you
need to use a different tool.  There's a reason we support multiple
PLs.

In cases where both the field layout and the field of interest are
known at plan time this not violating the static principles of
plpgsql. Suppose we decided to access field by name via
recordvar{name} or recordvar{field pos}:

DECLARE
r record;
f foo;
t text default 'id';
BEGIN
<some code>
r{'id'} = 5; -- no good, r is dynamic record
f{t} 5; -- no good, t is not immutable
f{'id'} = 5; -- ok;

Iterating over fields of type foo is not interesting because fields
are already known to whoever is writing the function, and flatten to
text cases are already covered. IOW, the above syntax is not really
useful because you can just do:
f.id = 5;

The only exception I see is in trigger functions. If the trigger
function plan is specific to the firing trigger, new and old are
defined at plan time, so something like:

new{TG_FIELDNAMES[1]} = 5; -- is ok (at least IMO), since
TG_FIELDNAMES is immutable (at least to the plan).

I don't honestly know if it's worth it -- the main case is performance
(plus Pavel's secondary argument of loss of type information).
Something like this would address an awful lot of gripes about trigger
functions though.

merlin

#30Alvaro Herrera
alvherre@commandprompt.com
In reply to: Merlin Moncure (#29)
Re: proposal: plpgsql - iteration over fields of rec or row variable

Excerpts from Merlin Moncure's message of mar nov 09 16:41:32 -0300 2010:

The only exception I see is in trigger functions. If the trigger
function plan is specific to the firing trigger, new and old are
defined at plan time, so something like:

new{TG_FIELDNAMES[1]} = 5; -- is ok (at least IMO), since
TG_FIELDNAMES is immutable (at least to the plan).

I don't honestly know if it's worth it -- the main case is performance
(plus Pavel's secondary argument of loss of type information).
Something like this would address an awful lot of gripes about trigger
functions though.

I think the interesting bit (wrt the examples I've seen, that is) would
be to be able to use the TG_ARGS array as the element specifier. Not
sure if this is any different from your example. It's been some time
since I've been near this though, so maybe what I wanted is now possible
with USING tricks.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support