BUG #10972: string_agg function incorrectly concatenating varying delimiter
The following bug has been logged on the website:
Bug reference: 10972
Logged by: Jeff Fischer
Email address: jeff@goaldriven.com
PostgreSQL version: 9.3.1
Operating system: CentOS
Description:
Running the query below will show how the delimiter for the current row is
actually the subsequent rows delimiter. The sequence goes like this:
TestBlah 1
ITTest 2
testfail 3
Yet, the concatened output is this:
"testblah 2 ITtest 3 testfail"
You can see that the delimiter for the current row is actually getting the
subsequent rows delimiter. It's as though the string concat moves to the
next row prior to concatenting the delimiter, which is likely the cause of
this bug.
SELECT
string_agg(
Field1,
' ' || RowIndex::text || ' '
ORDER BY RowIndex) as VeryLongConcatenatedResultsFieldName
FROM (
SELECT
'testblah' as Field1,
1 as RowIndex
UNION
SELECT
'ITtest' as Field1,
2 as RowIndex
UNION
SELECT
'testfail' as Field1,
3 as RowIndex
) t;
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
jeff@goaldriven.com writes:
Running the query below will show how the delimiter for the current row is
actually the subsequent rows delimiter.
Hmm, well, the documentation for string_agg doesn't say what happens when
the "delimiter" argument varies across rows; but a quick look at the code
finds that the first-call delimiter isn't actually used at all, and on
subsequent calls the delimiter is appended to the running result before
the associated value is. Which seems to me to be at least as reasonable,
and certainly a great deal easier to implement, as what you seem to have
in mind. Can you offer a principled argument why it should be the other
way around?
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Hi Tom,
I just read the Wikipedia article on you, fun. Glad to make your acquaintance. My business partner, Dave Yarnall, went to Carnegie Mellon (CCed).
I suppose principles can be relative, but I'll assume you mean good principles and give it a shot.
Primarily, I'd consider whether another function uses non-deterministic rows for its evaluation. I could be wrong, but I don't think any other function uses two different rows results within a single function evaluation. Even aggregates, such as string_agg, evaluate one row at a time which is a well-known behavior.
A similar paradigm might be in general programming if a compiled program randomly chose values off of the stack to place as a parameter into a method call (function 3's parameters are passed into function2). An odd and unexpected behavior for the SQL language and really any language, I think. Although, it is quite creative.
It sounds like you've quickly isolated the line within the source. In an interest in learning more about the code, would you mind pointing my partner and I to the line for this bug?
Thanks,
Jeff
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, July 16, 2014 7:20 AM
To: Jeff Fischer
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly concatenating varying delimiter
jeff@goaldriven.com writes:
Running the query below will show how the delimiter for the current
row is actually the subsequent rows delimiter.
Hmm, well, the documentation for string_agg doesn't say what happens when the "delimiter" argument varies across rows; but a quick look at the code finds that the first-call delimiter isn't actually used at all, and on subsequent calls the delimiter is appended to the running result before the associated value is. Which seems to me to be at least as reasonable, and certainly a great deal easier to implement, as what you seem to have in mind. Can you offer a principled argument why it should be the other way around?
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Hello
2014-07-21 22:34 GMT+02:00 Jeff Fischer <jeff@goaldriven.com>:
Hi Tom,
I just read the Wikipedia article on you, fun. Glad to make your
acquaintance. My business partner, Dave Yarnall, went to Carnegie Mellon
(CCed).I suppose principles can be relative, but I'll assume you mean good
principles and give it a shot.Primarily, I'd consider whether another function uses non-deterministic
rows for its evaluation. I could be wrong, but I don't think any other
function uses two different rows results within a single function
evaluation. Even aggregates, such as string_agg, evaluate one row at a
time which is a well-known behavior.A similar paradigm might be in general programming if a compiled program
randomly chose values off of the stack to place as a parameter into a
method call (function 3's parameters are passed into function2). An odd
and unexpected behavior for the SQL language and really any language, I
think. Although, it is quite creative.It sounds like you've quickly isolated the line within the source. In an
interest in learning more about the code, would you mind pointing my
partner and I to the line for this bug?
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/varlena.c
search string_agg
Regards
Pavel
Show quoted text
Thanks,
Jeff-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, July 16, 2014 7:20 AM
To: Jeff Fischer
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly
concatenating varying delimiterjeff@goaldriven.com writes:
Running the query below will show how the delimiter for the current
row is actually the subsequent rows delimiter.Hmm, well, the documentation for string_agg doesn't say what happens when
the "delimiter" argument varies across rows; but a quick look at the code
finds that the first-call delimiter isn't actually used at all, and on
subsequent calls the delimiter is appended to the running result before the
associated value is. Which seems to me to be at least as reasonable, and
certainly a great deal easier to implement, as what you seem to have in
mind. Can you offer a principled argument why it should be the other way
around?regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Thank you very much Pavel.
Thanks,
Jeff
From: Pavel Stehule [mailto:pavel.stehule@gmail.com]
Sent: Monday, July 21, 2014 10:42 PM
To: Jeff Fischer
Cc: Tom Lane; pgsql-bugs@postgresql.org; Dave Yarnall
Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly concatenating varying delimiter
Hello
2014-07-21 22:34 GMT+02:00 Jeff Fischer <jeff@goaldriven.com<mailto:jeff@goaldriven.com>>:
Hi Tom,
I just read the Wikipedia article on you, fun. Glad to make your acquaintance. My business partner, Dave Yarnall, went to Carnegie Mellon (CCed).
I suppose principles can be relative, but I'll assume you mean good principles and give it a shot.
Primarily, I'd consider whether another function uses non-deterministic rows for its evaluation. I could be wrong, but I don't think any other function uses two different rows results within a single function evaluation. Even aggregates, such as string_agg, evaluate one row at a time which is a well-known behavior.
A similar paradigm might be in general programming if a compiled program randomly chose values off of the stack to place as a parameter into a method call (function 3's parameters are passed into function2). An odd and unexpected behavior for the SQL language and really any language, I think. Although, it is quite creative.
It sounds like you've quickly isolated the line within the source. In an interest in learning more about the code, would you mind pointing my partner and I to the line for this bug?
https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/varlena.c
search string_agg
Regards
Pavel
Thanks,
Jeff
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us<mailto:tgl@sss.pgh.pa.us>]
Sent: Wednesday, July 16, 2014 7:20 AM
To: Jeff Fischer
Cc: pgsql-bugs@postgresql.org<mailto:pgsql-bugs@postgresql.org>
Subject: Re: [BUGS] BUG #10972: string_agg function incorrectly concatenating varying delimiter
jeff@goaldriven.com<mailto:jeff@goaldriven.com> writes:
Running the query below will show how the delimiter for the current
row is actually the subsequent rows delimiter.
Hmm, well, the documentation for string_agg doesn't say what happens when the "delimiter" argument varies across rows; but a quick look at the code finds that the first-call delimiter isn't actually used at all, and on subsequent calls the delimiter is appended to the running result before the associated value is. Which seems to me to be at least as reasonable, and certainly a great deal easier to implement, as what you seem to have in mind. Can you offer a principled argument why it should be the other way around?
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org<mailto:pgsql-bugs@postgresql.org>)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, Jul 16, 2014 at 3:16 AM, <jeff@goaldriven.com> wrote:
Running the query below will show how the delimiter for the current row is
actually the subsequent rows delimiter. The sequence goes like this:TestBlah 1
ITTest 2
testfail 3Yet, the concatened output is this:
"testblah 2 ITtest 3 testfail"
I think you're misunderstanding the point of the 2nd argument. It's
not there for just concatenating two arguments together, it's there to
be a delimiter between the strings being concatenated. Almost always
it should be a constant. For example if you want a result like
"TestBlah,ITTest,testfail", that would be much uglier to do without
the delimiter argument.
The behavior you want is available as simply:
string_agg(Field1 || ' ' || RowIndex::text, ' ')
Regards,
Marti
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs