BUG #8144: Problem with rank window function and CTEs
The following bug has been logged on the website:
Bug reference: 8144
Logged by: Marc Munro
Email address: marc@bloodnok.com
PostgreSQL version: 9.2.4
Operating system: Linux 3.6.3 (debian wheezy)
Description:
I have a query in which I want to use the result of a window function to
isolate the most relevant results. While I was trying to develop and test
the query, I discovered what looks like a bug in the results of the rank()
function. This has been tried with the same results on 9.1.9 and 9.2.4
I have two versions of the query, the first works as expected while the
second produces incorrect rank values. I have tried various work-arounds
but they all seem to trigger the same behaviour once a certain degree of
complexity is reached.
This version of the query seems to work, producing the expected rank
values:
with recursive parent_orgs(parent_org_id, org_id) as
(
select pr.first_party_id, pr.second_party_id
from party_rels pr
where pr.relationship_type_id in (1009, 1010, 1011, 1012, 1013)
),
ancestor_orgs(
ancestor_org_id, org_id, depth) as
(
select org_id, org_id, 0
from parent_orgs
union all
select p.parent_org_id, a.org_id, a.depth + 1
from ancestor_orgs a
join parent_orgs p
on p.org_id = a.ancestor_org_id
)
select ao.org_id, oi.item_id,
oi.seq_id, oi.complete,
ao.ancestor_org_id, ao.depth,
rank() over (partition by oi.item_id order by ao.depth)
from ancestor_orgs ao
join oitems oi
on oi.org_id = ao.ancestor_org_id
where ao.org_id = 20150;
org_id | item_id | seq_id | complete | ancestor_org_id | depth |
rank
--------+---------+--------+----------+-----------------+-------+------
20150 | 1 | 1 | t | 20139 | 4 | 1
20150 | 2 | 1 | t | 20139 | 4 | 1
20150 | 200146 | 1 | t | 20146 | 3 | 1
20150 | 200147 | 1 | t | 20146 | 3 | 1
20150 | 200148 | 1 | t | 20146 | 3 | 1
20150 | 200149 | 2 | t | 20146 | 3 | 1
20150 | 200150 | 1 | t | 20146 | 3 | 1
20150 | 200151 | 1 | t | 20146 | 3 | 1
20150 | 200152 | 1 | t | 20146 | 3 | 1
20150 | 200153 | 7 | t | 20150 | 0 | 1
20150 | 200153 | 1 | t | 20146 | 3 | 2
20150 | 200154 | 1 | t | 20146 | 3 | 1
[ rows removed for brevity ]
(38 rows)
This version, which should be equivalent, yields crazy rank values:
with recursive parent_orgs(parent_org_id, org_id) as
(
select pr.first_party_id, pr.second_party_id
from party_rels pr
where pr.relationship_type_id in (1009, 1010, 1011, 1012, 1013)
),
ancestor_orgs(
ancestor_org_id, org_id, depth) as
(
select org_id, org_id, 0
from parent_orgs
union all
select p.parent_org_id, a.org_id, a.depth + 1
from ancestor_orgs a
join parent_orgs p
on p.org_id = a.ancestor_org_id
),
visible_org_items(org_id, item_id,
seq_id, complete, ancestor_org_id,
depth, rank) as
(
select ao.org_id, oi.item_id,
oi.seq_id, oi.complete,
ao.ancestor_org_id, ao.depth,
rank() over (partition by oi.item_id order by ao.depth)
from ancestor_orgs ao
join oitems oi
on oi.org_id = ao.ancestor_org_id
)
select *
from visible_org_items
where org_id = 20150;
org_id | item_id | seq_id | complete | ancestor_org_id | depth |
rank
--------+---------+--------+----------+-----------------+-------+------
20150 | 1 | 1 | t | 20139 | 4 | 21
20150 | 2 | 1 | t | 20139 | 4 | 21
20150 | 200146 | 1 | t | 20146 | 3 | 9
20150 | 200147 | 1 | t | 20146 | 3 | 9
20150 | 200148 | 1 | t | 20146 | 3 | 9
20150 | 200149 | 2 | t | 20146 | 3 | 9
20150 | 200150 | 1 | t | 20146 | 3 | 9
20150 | 200151 | 1 | t | 20146 | 3 | 9
20150 | 200152 | 1 | t | 20146 | 3 | 9
20150 | 200153 | 7 | t | 20150 | 0 | 1
20150 | 200153 | 1 | t | 20146 | 3 | 10
20150 | 200154 | 1 | t | 20146 | 3 | 9
[ rows removed for brevity ]
(38 rows)
I have a pg_dump (< 5K in size) with which the problem can be reproduced.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
marc@bloodnok.com writes:
I have a query in which I want to use the result of a window function to
isolate the most relevant results. While I was trying to develop and test
the query, I discovered what looks like a bug in the results of the rank()
function. This has been tried with the same results on 9.1.9 and 9.2.4
Well, you didn't provide enough information for somebody else to
reproduce the problem, but just looking at this I see no particular
reason to think there's a bug. In the first case, the
"where ao.org_id = 20150" clause filters rows before the rank() is
taken, whereas in the second case it filters after the rank() function.
I think the larger rank values indicate that there are other rows with
different org_id but the same item_id, which you won't see in the final
output in either case --- but in the second query, the rank() function
does see them.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs