BUG #17853: COLLATE does not work with numeric column references in ORDER BY

Started by PG Bug reporting formabout 3 years ago6 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17853
Logged by: Branko Radovanovic
Email address: branko.radovanovic.zg@gmail.com
PostgreSQL version: 13.4
Operating system: Debian
Description:

The following SQL:

values ('a')
order by 1 collate "C";

...returns an error: SQL Error [42804]: ERROR: collations are not supported
by type integer

In the above query, "1" is not an integer but a column reference, so it
should be treated as well-formed and work the same as with the actual column
label:

values ('a')
order by column1 collate "C";

Best regards,
Branko

#2Vik Fearing
vik@postgresfriends.org
In reply to: PG Bug reporting form (#1)
Re: BUG #17853: COLLATE does not work with numeric column references in ORDER BY

On 3/18/23 23:44, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17853
Logged by: Branko Radovanovic
Email address: branko.radovanovic.zg@gmail.com
PostgreSQL version: 13.4
Operating system: Debian
Description:

The following SQL:

values ('a')
order by 1 collate "C";

...returns an error: SQL Error [42804]: ERROR: collations are not supported
by type integer

In the above query, "1" is not an integer but a column reference, so it
should be treated as well-formed and work the same as with the actual column
label:

There is an argument that this should work. There is also an argument
that using numerical column references is not (or rather, is no longer)
Standard SQL and should not be used.
--
Vik Fearing

#3David G. Johnston
david.g.johnston@gmail.com
In reply to: Vik Fearing (#2)
Re: BUG #17853: COLLATE does not work with numeric column references in ORDER BY

On Sat, Mar 18, 2023 at 5:56 PM Vik Fearing <vik@postgresfriends.org> wrote:

On 3/18/23 23:44, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17853
Logged by: Branko Radovanovic
Email address: branko.radovanovic.zg@gmail.com
PostgreSQL version: 13.4
Operating system: Debian
Description:

The following SQL:

values ('a')
order by 1 collate "C";

...returns an error: SQL Error [42804]: ERROR: collations are not

supported

by type integer

In the above query, "1" is not an integer but a column reference, so it
should be treated as well-formed and work the same as with the actual

column

label:

There is an argument that this should work. There is also an argument
that using numerical column references is not (or rather, is no longer)
Standard SQL and should not be used.

So we also have a documentation bug for failing to accurately indicate that
our treatment here is non-standard. Given the existing notes that refer to
both SQL-92 and SQL:1999 explicitly and make no mention of this I'd have to
assume such a material difference falls into the scope of things we
document.

David J.

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: David G. Johnston (#3)
Re: BUG #17853: COLLATE does not work with numeric column references in ORDER BY

On Sat, Mar 18, 2023 at 6:11 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Sat, Mar 18, 2023 at 5:56 PM Vik Fearing <vik@postgresfriends.org>
wrote:

On 3/18/23 23:44, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17853
Logged by: Branko Radovanovic
Email address: branko.radovanovic.zg@gmail.com
PostgreSQL version: 13.4
Operating system: Debian
Description:

The following SQL:

values ('a')
order by 1 collate "C";

...returns an error: SQL Error [42804]: ERROR: collations are not

supported

by type integer

In the above query, "1" is not an integer but a column reference, so it
should be treated as well-formed and work the same as with the actual

column

label:

There is an argument that this should work. There is also an argument
that using numerical column references is not (or rather, is no longer)
Standard SQL and should not be used.

So we also have a documentation bug for failing to accurately indicate
that our treatment here is non-standard. Given the existing notes that
refer to both SQL-92 and SQL:1999 explicitly and make no mention of this
I'd have to assume such a material difference falls into the scope of
things we document.

I'll correct myself - the original bug report is indeed not actually a
bug. We do not document nor promise the expected behavior. Anything that
is more than a simple integer number or a column name is by its nature an
"...arbitrary expression formed from input-column values." In particular,
the syntax doesn't allow for any place to attach COLLATE to the number like
it does for ASC etc... Though here the number of input columns referenced
is zero, you just have a constant, thus pointless, order by expression.
Which is documented as being allowed.

I wouldn't completely discount the possibility of adding a note about this
in the documentation, but it hasn't ever come up, the section is long
enough as-is, and the error is consistently reported for the case.

David J.

#5Branko Radovanovic
branko.radovanovic.zg@gmail.com
In reply to: David G. Johnston (#4)
Re: BUG #17853: COLLATE does not work with numeric column references in ORDER BY

Indeed, my initial instinct was to report this as a documentation bug.
COLLATE is a good place to de-support numeric references because ORDER BY 1
COLLATE "C" is then unambiguously an error, rather than a silent fail.

However, this (fairly dumb) SQL will still work:

values ('a', 'b'), ('c', 'd')
order by 1 desc, column2 collate "C"

...so in the grand scheme of things it doesn't really help. Sometimes it's
actually easier to implement something than explain (in the doc) how or why
it doesn't work (as expected or at all). On the other hand, the benefit of
doing either may - granted - be close to zero in this particular case.

B.

On Sun, Mar 19, 2023 at 2:28 AM David G. Johnston <
david.g.johnston@gmail.com> wrote:

Show quoted text

On Sat, Mar 18, 2023 at 6:11 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Sat, Mar 18, 2023 at 5:56 PM Vik Fearing <vik@postgresfriends.org>
wrote:

On 3/18/23 23:44, PG Bug reporting form wrote:

The following bug has been logged on the website:

Bug reference: 17853
Logged by: Branko Radovanovic
Email address: branko.radovanovic.zg@gmail.com
PostgreSQL version: 13.4
Operating system: Debian
Description:

The following SQL:

values ('a')
order by 1 collate "C";

...returns an error: SQL Error [42804]: ERROR: collations are not

supported

by type integer

In the above query, "1" is not an integer but a column reference, so it
should be treated as well-formed and work the same as with the actual

column

label:

There is an argument that this should work. There is also an argument
that using numerical column references is not (or rather, is no longer)
Standard SQL and should not be used.

So we also have a documentation bug for failing to accurately indicate
that our treatment here is non-standard. Given the existing notes that
refer to both SQL-92 and SQL:1999 explicitly and make no mention of this
I'd have to assume such a material difference falls into the scope of
things we document.

I'll correct myself - the original bug report is indeed not actually a
bug. We do not document nor promise the expected behavior. Anything that
is more than a simple integer number or a column name is by its nature an
"...arbitrary expression formed from input-column values." In particular,
the syntax doesn't allow for any place to attach COLLATE to the number like
it does for ASC etc... Though here the number of input columns referenced
is zero, you just have a constant, thus pointless, order by expression.
Which is documented as being allowed.

I wouldn't completely discount the possibility of adding a note about this
in the documentation, but it hasn't ever come up, the section is long
enough as-is, and the error is consistently reported for the case.

David J.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Branko Radovanovic (#5)
Re: BUG #17853: COLLATE does not work with numeric column references in ORDER BY

Branko Radovanovic <branko.radovanovic.zg@gmail.com> writes:

COLLATE is a good place to de-support numeric references because ORDER BY 1
COLLATE "C" is then unambiguously an error, rather than a silent fail.

Yup.

...so in the grand scheme of things it doesn't really help. Sometimes it's
actually easier to implement something than explain (in the doc) how or why
it doesn't work (as expected or at all).

It's not that any specific choice of semantics would be hard to
implement. It's that the syntax is fundamentally ambiguous, so if
you get too aggressive about saying "we'll resolve ambiguous cases
like this" then you risk queries silently doing something other than
what the user expected, giving rise to a different set of bug reports.

The back story here is that SQL92 said "the argument of ORDER BY is
an output column name or number", but SQL99 reversed course and said
"the argument of ORDER BY is an expression over the input columns".
So "ORDER BY 1" means two completely different things depending on
which spec version you read. We didn't (and still don't) want to give
up compatibility with SQL92's way, for backwards-compatibility reasons
and because it's such a handy shortcut in many cases. But we use the
SQL92 interpretation only when the clause satisfies SQL92 exactly.
"COLLATE" wasn't in SQL92; so if we did what you claim we should do
we would be applying SQL92 semantics to a query that isn't even legal
SQL92 syntax, while failing to comply with SQL99 for a query that *is*
legal SQL99.

I like throwing an error better than either of those choices, so
I'm perfectly satisfied with the code's behavior as-is. What seems
open for debate is whether the documentation needs to address this
case specifically. I think it's already sufficiently implied by [1]https://www.postgresql.org/docs/current/queries-order.html,
but perhaps it isn't.

regards, tom lane

[1]: https://www.postgresql.org/docs/current/queries-order.html