
Started by Pavel Stehule over 9 years ago, 3 messages
#1 Pavel Stehule
pavel.stehule@gmail.com

2016-09-12 8:14 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 12 September 2016 at 14:02, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Hi

There are some open questions - the standard (and some other databases) requires entering the XPath expression as a string literal.

I think this is an unnecessarily strong limit (it forces a dynamic query in more cases), so I allowed expressions there.

I agree. There's no reason not to permit expressions there, and there
are many other places where we have similar extensions.

Another question is when these expressions should be evaluated. There are two possibilities - once per query, or once per input row. I selected "once per input row" mode - it is simpler to implement, and it is consistent with other "similar" generators - see the behavior and the related discussion about "array_to_string" and evaluation of its separator argument. The switch to "once per query" should not be hard - but it can be strange for users, because some of their volatile expressions would behave as stable.

I would've expected once per query. There's no way the expressions can
reference the row data, so there's no reason to evaluate them each
time.

I disagree - it is a hypothetical situation, but it is possible

if somebody stores documents like

id, xml
=====
id = 1, xml = <doc id = 1> ....<>
id = 2, xml = <doc id = 2> ....

Then evaluating once per query doesn't allow any reference to other columns, and doesn't allow building expressions like PATH (...[@id= ' || id || ']

My opinion is not too strong - as far as I know, there is currently no expression in Postgres that is evaluated only once per query - so this implementation would be a precedent.
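The per-row scenario Pavel describes can be sketched in SQL. This is a hypothetical illustration, not committed syntax: the table `docs(id, body)` stands in for the "id, xml" example above, and the concatenated PATH argument assumes the patch's support for expressions in that position.

```sql
-- Hypothetical table: each row carries an id and an XML document.
CREATE TABLE docs (id int, body xml);

-- With "once per input row" evaluation, the row expression can be
-- built from another column of the same row; with "once per query"
-- evaluation this reference to docs.id would be impossible:
SELECT x.*
FROM docs,
     XMLTABLE(('/doc[@id="' || docs.id || '"]')
              PASSING docs.body
              COLUMNS content text PATH '.') AS x;
```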

The only use case I see for evaluating them each time is - maybe -
DEFAULT. Where maybe there's a use for nextval() or other volatile
functions. But honestly, I think that's better done explicitly in a
post-pass, i.e.

select uuid_generate_v4(), x.*
from xmltable(.....) x;

in cases where that's what the user actually wants.

DEFAULT should be evaluated per output row - anybody can use a volatile function there - for example: when I have no data, use some random value there

Regards

Pavel


There's no other case I can think of where expressions as arguments to
set-returning functions are evaluated once per output row.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2 Craig Ringer
craig@2ndquadrant.com
In reply to: Pavel Stehule (#1)
Re:

On 12 September 2016 at 14:29, Pavel Stehule <pavel.stehule@gmail.com> wrote:

I would've expected once per query. There's no way the expressions can
reference the row data, so there's no reason to evaluate them each
time.

I disagree - it is a hypothetical situation, but it is possible

if somebody store documents like

id, xml
=====
id = 1, xml = <doc id = 1> ....<>
id = 2, xml = <doc id = 2> ....

Then evaluating once per query doesn't allow any reference to other columns, and doesn't allow building expressions like PATH (...[@id= ' || id || ']

Referencing columns on the same evaluation level? I dunno about that.
You're relying on strict order of evaluation which is pretty unusual
for SQL.

I guess this is why full XQuery would be desirable, but that's a whole
different business.

I would personally expect this sort of thing to be handled by a second
pass; isn't that part of why it's so easy to return xml fields from
xmltable?

Evaluating expressions each time seems likely to be bad for
performance, but I guess it's not going to make a big difference
compared to all the XML crud, so I don't have a super strong opinion
here.

Either way, it's crucial that the behaviour be documented.

DEFAULT should be evaluated per output row - anybody can use a volatile function there - for example: when I have no data, use some random value there

That would be consistent with how we handle DEFAULT on a table, so I
agree. It's a departure from what we do normally, but we didn't have
table functions before either.
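The table-DEFAULT behaviour Craig refers to can be illustrated in plain SQL, independent of the patch:

```sql
-- DEFAULT on a table is evaluated once per inserted row, so a
-- volatile function such as random() produces a distinct value
-- for each row:
CREATE TABLE t (id serial, val float8 DEFAULT random());
INSERT INTO t (id) VALUES (DEFAULT), (DEFAULT);
-- The two rows get different val values, because the volatile
-- DEFAULT expression is re-evaluated per row.
```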

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Craig Ringer (#2)
Re:

2016-09-12 9:07 GMT+02:00 Craig Ringer <craig@2ndquadrant.com>:

On 12 September 2016 at 14:29, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

I would've expected once per query. There's no way the expressions can
reference the row data, so there's no reason to evaluate them each
time.

I disagree - it is a hypothetical situation, but it is possible

if somebody store documents like

id, xml
=====
id = 1, xml = <doc id = 1> ....<>
id = 2, xml = <doc id = 2> ....

Then evaluating once per query doesn't allow any reference to other columns, and doesn't allow building expressions like PATH (...[@id= ' || id || ']

Referencing columns on the same evaluation level? I dunno about that.
You're relying on strict order of evaluation which is pretty unusual
for SQL.

I guess this is why full XQuery would be desirable, but that's a whole
different business.

I would personally expect this sort of thing to be handled by a second
pass; isn't that part of why it's so easy to return xml fields from
xmltable?

Evaluating expressions each time seems likely to be bad for
performance, but I guess it's not going to make a big difference
compared to all the XML crud, so I don't have a super strong opinion
here.

When the expression is a constant, the cost will be minimal - moreover, we can pre-evaluate it at parse/transform time, and if the expression folds to a constant, we don't need to evaluate it again later.

We can wait to see whether other people have an opinion on this topic. It is an important topic, but it is not hard to implement both variants, and moreover, this is a corner case - it is not important for any example that I found on the net.

Regards

Pavel


Either way, it's crucial that the behaviour be documented.

DEFAULT should be evaluated per output row - anybody can use a volatile function there - for example: when I have no data, use some random value there

That would be consistent with how we handle DEFAULT on a table, so I
agree. It's a departure from what we do normally, but we didn't have
table functions before either.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services