Typo on tutorial window page
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/17/tutorial-window.html
Description:
EXPECTED:
As shown here, the rank function produces a numerical rank for each distinct
PARTITION BY value in the current row's partition, using the order defined
by the ORDER BY clause. rank needs no explicit parameter, because its
behavior is entirely determined by the OVER clause.
ACTUAL:
As shown here, the rank function produces a numerical rank for each distinct
ORDER BY value in the current row's partition, using the order defined by
the ORDER BY clause. rank needs no explicit parameter, because its behavior
is entirely determined by the OVER clause.
PG Doc comments form <noreply@postgresql.org> writes:
EXPECTED:
As shown here, the rank function produces a numerical rank for each distinct
PARTITION BY value in the current row's partition, using the order defined
by the ORDER BY clause. rank needs no explicit parameter, because its
behavior is entirely determined by the OVER clause.
ACTUAL:
As shown here, the rank function produces a numerical rank for each distinct
ORDER BY value in the current row's partition, using the order defined by
the ORDER BY clause. rank needs no explicit parameter, because its behavior
is entirely determined by the OVER clause.
Hmm, I think the current text is correct, or at least more nearly
correct than what you suggest. Look at the example:
<programlisting>
SELECT depname, empno, salary,
rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
</programlisting>
<screen>
depname | empno | salary | rank
-----------+-------+--------+------
develop | 8 | 6000 | 1
develop | 10 | 5200 | 2
develop | 11 | 5200 | 2
develop | 9 | 4500 | 4
develop | 7 | 4200 | 5
personnel | 2 | 3900 | 1
personnel | 5 | 3500 | 2
sales | 1 | 5000 | 1
sales | 4 | 4800 | 2
sales | 3 | 4800 | 2
(10 rows)
</screen>
The ranks are separate within each partition (depname), and
within a partition rows having the same salary get the
same rank.
regards, tom lane
On Tue, Jan 21, 2025 at 10:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
PG Doc comments form <noreply@postgresql.org> writes:
EXPECTED:
As shown here, the rank function produces a numerical rank for eachdistinct
PARTITION BY value in the current row's partition, using the order
defined
by the ORDER BY clause. rank needs no explicit parameter, because its
behavior is entirely determined by the OVER clause.ACTUAL:
As shown here, the rank function produces a numerical rank for eachdistinct
ORDER BY value in the current row's partition, using the order defined by
the ORDER BY clause. rank needs no explicit parameter, because itsbehavior
is entirely determined by the OVER clause.
Hmm, I think the current text is correct, or at least more nearly
correct than what you suggest.
Look at the example:
<programlisting>
SELECT depname, empno, salary,
rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
</programlisting><screen>
depname | empno | salary | rank
-----------+-------+--------+------
develop | 8 | 6000 | 1
develop | 10 | 5200 | 2
develop | 11 | 5200 | 2
develop | 9 | 4500 | 4
develop | 7 | 4200 | 5
personnel | 2 | 3900 | 1
personnel | 5 | 3500 | 2
sales | 1 | 5000 | 1
sales | 4 | 4800 | 2
sales | 3 | 4800 | 2
(10 rows)
</screen>The ranks are separate within each partition (depname), and
within a partition rows having the same salary get the
same rank.regards, tom lane
sorry, my bad, thanks for the clarification!
вт, 21 січ. 2025 р., 18:40 користувач David G. Johnston <
david.g.johnston@gmail.com> пише:
Show quoted text
On Tue, Jan 21, 2025 at 10:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Doc comments form <noreply@postgresql.org> writes:
EXPECTED:
As shown here, the rank function produces a numerical rank for eachdistinct
PARTITION BY value in the current row's partition, using the order
defined
by the ORDER BY clause. rank needs no explicit parameter, because its
behavior is entirely determined by the OVER clause.ACTUAL:
As shown here, the rank function produces a numerical rank for eachdistinct
ORDER BY value in the current row's partition, using the order defined
by
the ORDER BY clause. rank needs no explicit parameter, because its
behavior
is entirely determined by the OVER clause.
Hmm, I think the current text is correct, or at least more nearly
correct than what you suggest.Look at the example:
<programlisting>
SELECT depname, empno, salary,
rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
</programlisting><screen>
depname | empno | salary | rank
-----------+-------+--------+------
develop | 8 | 6000 | 1
develop | 10 | 5200 | 2
develop | 11 | 5200 | 2
develop | 9 | 4500 | 4
develop | 7 | 4200 | 5
personnel | 2 | 3900 | 1
personnel | 5 | 3500 | 2
sales | 1 | 5000 | 1
sales | 4 | 4800 | 2
sales | 3 | 4800 | 2
(10 rows)
</screen>The ranks are separate within each partition (depname), and
within a partition rows having the same salary get the
same rank.regards, tom lane
On Tue, Jan 21, 2025 at 10:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Doc comments form <noreply@postgresql.org> writes:
EXPECTED:
As shown here, the rank function produces a numerical rank for eachdistinct
PARTITION BY value in the current row's partition, using the order
defined
by the ORDER BY clause. rank needs no explicit parameter, because its
behavior is entirely determined by the OVER clause.Hmm, I think the current text is correct, or at least more nearly
correct than what you suggest.
The ranks are separate within each partition (depname), and
within a partition rows having the same salary get the
same rank.
Sorry for the empty email - finger spasm hit send.
I was going to write basically that but something feels off to me. Maybe
something like this:
"As shown here, the rank function produces a numerical ranking within each
partition, using the order defined by the ORDER BY clause. Ranking assigns
the same rank to all rows that tie according to the order by criteria,
while still incrementing the rank counter. Thus there are gaps in the
serial numbering. These can be removed by instead using the dense_rank
window function. Ties can instead be given their own unique value by using
the row_number window function. In all these cases, as the window function
is effectively just counting rows, the function itself has no input
parameter."
If we don't want to get into that level of nuance in the tutorial I suggest
we use the row_number() window function instead of rank, and just say
because we count rows no parameter is needed.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
I was going to write basically that but something feels off to me. Maybe
something like this:
"As shown here, the rank function produces a numerical ranking within each
partition, using the order defined by the ORDER BY clause. Ranking assigns
the same rank to all rows that tie according to the order by criteria,
while still incrementing the rank counter. Thus there are gaps in the
serial numbering. These can be removed by instead using the dense_rank
window function. Ties can instead be given their own unique value by using
the row_number window function. In all these cases, as the window function
is effectively just counting rows, the function itself has no input
parameter."
If we don't want to get into that level of nuance in the tutorial I suggest
we use the row_number() window function instead of rank, and just say
because we count rows no parameter is needed.
Yeah, I was wondering if it'd be worth bringing up dense_rank,
but decided "probably not". I like your idea of switching the
example to use row_number to simplify things. What would the
text be then? Perhaps
As shown here, the <function>row_number</function> function
assigns sequential numbers to the rows within each partition,
in the order defined by the <literal>ORDER BY</literal> clause
(with tied rows numbered in an unspecified order).
<function>row_number</function> needs no explicit parameter,
because its behavior
is entirely determined by the <literal>OVER</literal> clause.
regards, tom lane