DISTINCT in STRING_AGG
Hello
Can I make a distinct STRING_AGG?
This is my query :
SELECT atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY
aqjs1.to_left) AS children
FROM administration.ad_query_join_select atjs
JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query)
LEFT JOIN administration.ad_query_join_select aqjs1 ON
(aqjs1.id_ad_query = atjs.id_ad_query AND aqjs1.to_left>atjs.to_left AND
aqjs1.to_right<atjs.to_right)
LEFT JOIN administration.ad_query_join_select aqjs2 ON
(aqjs2.id_ad_query = atjs.id_ad_query AND aqjs2.to_left>atjs.to_left AND
aqjs2.to_right<atjs.to_right AND aqjs2.to_left<aqjs1.to_left AND
aqjs2.to_right>aqjs1.to_right)
LEFT JOIN administration.ad_query_join_select aqjs3 ON
(aqjs3.id_ad_query = atjs.id_ad_query AND aqjs3.to_left<atjs.to_left AND
aqjs3.to_right>atjs.to_right)
WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
GROUP BY aq.id, atjs.id
ORDER BY aq.id ASC, atjs.to_left ASC;
And "childen" contain doubles. The result is:
id ; children
1399029;"1399031,1399031"
1399031;"1399032,1399032,1399032,1399033,1399033,1399033"
There are doubles because of the join aqjs3 witch is producing this
problem.
Can I make it so the children ID's are unique?
Thank you.
On Nov 28, 2015, at 1:35 PM, Sterpu Victor <victor@caido.ro> wrote:
Hello
Can I make a distinct STRING_AGG?
This is my query :
SELECT atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY aqjs1.to_left) AS children
FROM administration.ad_query_join_select atjs
JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query)
LEFT JOIN administration.ad_query_join_select aqjs1 ON (aqjs1.id_ad_query = atjs.id_ad_query AND aqjs1.to_left>atjs.to_left AND aqjs1.to_right<atjs.to_right)
LEFT JOIN administration.ad_query_join_select aqjs2 ON (aqjs2.id_ad_query = atjs.id_ad_query AND aqjs2.to_left>atjs.to_left AND aqjs2.to_right<atjs.to_right AND aqjs2.to_left<aqjs1.to_left AND aqjs2.to_right>aqjs1.to_right)
LEFT JOIN administration.ad_query_join_select aqjs3 ON (aqjs3.id_ad_query = atjs.id_ad_query AND aqjs3.to_left<atjs.to_left AND aqjs3.to_right>atjs.to_right)
WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
GROUP BY aq.id, atjs.id
ORDER BY aq.id ASC, atjs.to_left ASC;And "childen" contain doubles. The result is:
id ; children
1399029;"1399031,1399031"
1399031;"1399032,1399032,1399032,1399033,1399033,1399033"There are doubles because of the join aqjs3 witch is producing this problem.Can I make it so the children ID's are unique?
Just to mention, this looks like a good candidate for range types and CTE’s.
The > / < comparisons appear to be mutually exclusive in each LEFT JOIN clause, so it’s not apparent why aqjs3 is causing duplication, as you’ve stated.
As far as I can see, without providing us with your table constraints/keys, there’s no way to determine what makes your ID values unique…
However, if you defer your STRING_AGG until after you derive a distinct “staging” result set from the joins, then you can effect uniqueness -
e.g. (air code):
WITH q AS
(SELECT aq.id aq_parent_id, atjs.id atjs_parent_id, CAST(aqjs1.id AS VARCHAR) child FROM ... GROUP BY aq.id, atjs.id, aqjs1.id)
SELECT atjs_parent_id, STRING_AGG(child,’,’ ORDER BY aqjs.to_left) children
FROM q LEFT JOIN (SELECT DISTINCT id_ad_query, to_left FROM administration.ad_query_join_select) aqjs ON …
GROUP BY aq_parent_id, parent
ORDER BY aq_parent_id, atjs.to_left;
Something along these lines ‘may’ produce a unique set of child values for each id by which to perform a STRING_AGG on, but again, I can only guess based on the lack of definition provided for your table constraints.
John
On 28 November 2015 at 18:35, Sterpu Victor <victor@caido.ro> wrote:
Can I make a distinct STRING_AGG?
This is my query :
SELECT atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY
aqjs1.to_left) AS children
FROM administration.ad_query_join_select atjs
JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query)
LEFT JOIN administration.ad_query_join_select aqjs1 ON (aqjs1.id_ad_query
= atjs.id_ad_query AND aqjs1.to_left>atjs.to_left AND
aqjs1.to_right<atjs.to_right)
LEFT JOIN administration.ad_query_join_select aqjs2 ON (aqjs2.id_ad_query
= atjs.id_ad_query AND aqjs2.to_left>atjs.to_left AND
aqjs2.to_right<atjs.to_right AND aqjs2.to_left<aqjs1.to_left AND
aqjs2.to_right>aqjs1.to_right)
LEFT JOIN administration.ad_query_join_select aqjs3 ON (aqjs3.id_ad_query
= atjs.id_ad_query AND aqjs3.to_left<atjs.to_left AND
aqjs3.to_right>atjs.to_right)
WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
GROUP BY aq.id, atjs.id
ORDER BY aq.id ASC, atjs.to_left ASC;And "childen" contain doubles. The result is:
*id ; children*
1399029;"1399031,1399031"
1399031;"1399032,1399032,1399032,1399033,1399033,1399033"There are doubles because of the join aqjs3 witch is producing this
problem.
Can I make it so the children ID's are unique?
Well if you can live with losing the to_left ordering, then you could just
do
SELECT STRING_AGG(DISTINCT CAST(aqjs1.id AS VARCHAR), '') AS children
...
no?
Geoff
I can't skip the ordering.
I'm sure aqjs3 is the one that produces the duplication.
I guess subqueries are the only option, like this:
SELECT atjs.id, tmp.children AS children
FROM administration.ad_query_join_select atjsJOIN (SELECT
atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY
aqjs1.to_left) AS children
FROM administration.ad_query_join_select atjs
LEFT JOIN administration.ad_query_join_select aqjs1 ON
(aqjs1.id_ad_query = atjs.id_ad_query AND aqjs1.to_left>atjs.to_left
AND aqjs1.to_right<atjs.to_right)
LEFT JOIN administration.ad_query_join_select aqjs2 ON
(aqjs2.id_ad_query = atjs.id_ad_query AND aqjs2.to_left>atjs.to_left
AND aqjs2.to_right<atjs.to_right AND aqjs2.to_left<aqjs1.to_left AND
aqjs2.to_right>aqjs1.to_right)
WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
GROUP BY atjs.id) tmp ON (tmp.id = atjs.id)JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query)
LEFT JOIN administration.ad_query_join_select aqjs3 ON
(aqjs3.id_ad_query = atjs.id_ad_query AND aqjs3.to_left<atjs.to_left
AND aqjs3.to_right>atjs.to_right)
WHERE atjs.id_ad_query = 475543
GROUP BY aq.id, atjs.id, tmp.children
ORDER BY aq.id ASC, atjs.to_left ASC;
Result is:
id ; children
1399029;"1399031"
1399031;"1399032,1399033"
Is there a better way? I usualy try to avoid subqueries.
------ Original Message ------
From: "Geoff Winkless" <pgsqladmin@geoff.dj>
To: "Sterpu Victor" <victor@caido.ro>
Cc: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: 11/29/2015 6:42:18 PM
Subject: Re: [GENERAL] DISTINCT in STRING_AGG
Show quoted text
On 28 November 2015 at 18:35, Sterpu Victor <victor@caido.ro> wrote:
Can I make a distinct STRING_AGG?
This is my query :
SELECT atjs.id, STRING_AGG(CAST(aqjs1.id AS VARCHAR), ',' ORDER BY
aqjs1.to_left) AS children
FROM administration.ad_query_join_select atjs
JOIN administration.ad_query aq ON (aq.id=atjs.id_ad_query)
LEFT JOIN administration.ad_query_join_select aqjs1 ON
(aqjs1.id_ad_query = atjs.id_ad_query AND aqjs1.to_left>atjs.to_left
AND aqjs1.to_right<atjs.to_right)
LEFT JOIN administration.ad_query_join_select aqjs2 ON
(aqjs2.id_ad_query = atjs.id_ad_query AND aqjs2.to_left>atjs.to_left
AND aqjs2.to_right<atjs.to_right AND aqjs2.to_left<aqjs1.to_left AND
aqjs2.to_right>aqjs1.to_right)
LEFT JOIN administration.ad_query_join_select aqjs3 ON
(aqjs3.id_ad_query = atjs.id_ad_query AND aqjs3.to_left<atjs.to_left
AND aqjs3.to_right>atjs.to_right)
WHERE aqjs2.id IS NULL AND atjs.id_ad_query = 475543
GROUP BY aq.id, atjs.id
ORDER BY aq.id ASC, atjs.to_left ASC;And "childen" contain doubles. The result is:
id ; children
1399029;"1399031,1399031"
1399031;"1399032,1399032,1399032,1399033,1399033,1399033"There are doubles because of the join aqjs3 witch is producing this
problem.
Can I make it so the children ID's are unique?Well if you can live with losing the to_left ordering, then you could
just doSELECT STRING_AGG(DISTINCT CAST(aqjs1.id AS VARCHAR), '') AS children
...no?
Geoff
On 29 November 2015 at 18:59, Sterpu Victor <victor@caido.ro> wrote:
I can't skip the ordering.
I'm sure aqjs3 is the one that produces the duplication.
I guess subqueries are the only option, like this:
Well you could look at the intarray extension and a combination of
array_agg, uniq() and string_to_array:
http://www.postgresql.org/docs/current/static/intarray.html
but that's probably sledgehammer:nut time.
Not sure why you need to order the values you're getting back by something
other than the values themselves - is there a reason you wouldn't want the
"children" set to be ordered numerically? You can still order the outer
query by whatever you like, or you can order the aggregate by the values
themselves, it's just the DISTINCT inside the aggregate query requires that
an internal ORDER includes the ordering term in the result.
Geoff
On 29 November 2015 at 20:51,
I
wrote:
Well you could look at the intarray extension and a combination of
array_agg, uniq() and
*string_to_array*:
Mind blip, apologies, obviously I meant array_to_string :)
Geoff
Thank you.
I think there is no native function that will solve this problem,
intarray extension can't order as in the example query.
I could write a new SQL function but I solved the problem with
subqueries and works fine.
I need order because I use nested trees.
To_left and to_right are the coordinates for each node and the result
must be ordered by this to obtain the correct result.
------ Original Message ------
From: "Geoff Winkless" <pgsqladmin@geoff.dj>
To: "Sterpu Victor" <victor@caido.ro>
Cc: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: 11/29/2015 10:51:32 PM
Subject: Re: Re[2]: [GENERAL] DISTINCT in STRING_AGG
Show quoted text
On 29 November 2015 at 18:59, Sterpu Victor <victor@caido.ro> wrote:
I can't skip the ordering.
I'm sure aqjs3 is the one that produces the duplication.
I guess subqueries are the only option, like this:Well you could look at the intarray extension and a combination of
array_agg, uniq() and string_to_array:http://www.postgresql.org/docs/current/static/intarray.html
but that's probably sledgehammer:nut time.
Not sure why you need to order the values you're getting back by
something other than the values themselves - is there a reason you
wouldn't want the "children" set to be ordered numerically? You can
still order the outer query by whatever you like, or you can order the
aggregate by the values themselves, it's just the DISTINCT inside the
aggregate query requires that an internal ORDER includes the ordering
term in the result.Geoff