BUG #17676: Text comparison appears to be wrong

Started by PG Bug reporting form over 3 years ago · 2 messages · bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17676
Logged by: Rob Johnson
Email address: robj@hightouchinc.com
PostgreSQL version: 14.5
Operating system: Ubuntu
Description:

No tables are needed. Just ran this, comparing strings with lower-case 'x'
and period '.' characters. The first two columns are false as expected, the
last column is true, which appears to be wrong.

=> select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;

 first | second | third
-------+--------+-------
 f     | f      | t
(1 row)

My Postgres version:
=> select version();
 version
---------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.5 (Ubuntu 14.5-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0, 64-bit
(1 row)

I am located in the United States and haven't done anything to change
character sets, collations, or anything like that. The \l+ psql command
shows this for my database, which is called nigeldb:

=> \l+ nigeldb
                                             List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
 nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default |
(1 row)

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17676: Text comparison appears to be wrong

On Thu, Nov 3, 2022 at 12:56 PM PG Bug reporting form <noreply@postgresql.org> wrote:

> The following bug has been logged on the website:
>
> Bug reference: 17676
> Logged by: Rob Johnson
> Email address: robj@hightouchinc.com
> PostgreSQL version: 14.5
> Operating system: Ubuntu
> Description:
>
> No tables are needed. Just ran this, comparing strings with lower-case 'x'
> and period '.' characters. The first two columns are false as expected, the
> last column is true, which appears to be wrong.
>
> => select '.' > 'x' as first, '.x' > 'x.' as second, '.xx' > 'x..' as third;
>
>                                              List of databases
>   Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
> ---------+----------+----------+-------------+-------------+-------------------+-------+------------+-------------
>  nigeldb | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 48 MB | pg_default |

Not a bug.

This just seems to be how the en_US.UTF-8 collation works; punctuation
produces non-obvious sorting outcomes.

https://superuser.com/questions/227925/in-utf-8-collation-why-11-is-less-then-1

I confirmed that explicitly adding COLLATE "C" produces the expected
outcome for third.
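
For anyone reproducing this, here is a minimal sketch of what such a check might look like. The exact query that was run is not shown in this thread, so the statement below is an assumption based on the reporter's original select. It attaches COLLATE "C" to one operand of each comparison, which should make PostgreSQL compare the strings by byte value instead of using the database's en_US.UTF-8 rules.

```sql
-- Assumed reproduction of the reporter's query with an explicit collation.
-- Under "C", '.' (0x2E) sorts before 'x' (0x78), so every comparison
-- is decided by the first character.
select ('.'   > 'x'   collate "C") as first,
       ('.x'  > 'x.'  collate "C") as second,
       ('.xx' > 'x..' collate "C") as third;
-- Expected: f | f | f
```

If byte-order comparison is what an application needs everywhere, a column or index can also be declared with COLLATE "C" rather than adding the clause to each query.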

David J.