order by question
Hello,
I am confused by how postgres 8,4..13 is sorting my data.
\d test
Table "public.test"
Column | Type | Modifiers
--------+------+-----------
data | text |
select * from test order by data;
data
----------
-
--
1
11
11F
1F
a
b
C
F
-F
Feneric
Generic
(14 rows)
The first row is a single space, the next row a single -, the next two -- .
What I don't understand is why the '-F', the ' Feneric' and the ' Generic' sort where they do.
I would expect the output to be like this:
data
----------
Feneric
Generic
-
--
-F
1
11
11F
1F
a
b
C
F
(14 rows)
client_encoding
-----------------
SQL_ASCII
lc_collate
-------------
en_US.UTF-8
foxboxconfig=# show lc_ctype;
lc_ctype
-------------
en_US.UTF-8
Thanks for any clarification.
--
Stephen Clark
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Steve Clark <sclark@netwolves.com> wrote:
I am confused by how postgres 8,4..13 is sorting my data.
select * from test order by data;
data
-----------
--
1
11
11F
1F
a
b
C
F
-F
Feneric
Generic
(14 rows)The first row is a single space, the next row a single -, the next two -- .
What I don't understand is why the '-F', the ' Feneric' and
the ' Generic' sort where they do.I would expect the output to be like this:
data
----------Feneric
Generic
-
--
-F
1
11
11F
1F
a
b
C
F
(14 rows)
lc_collate
-------------
en_US.UTF-8
PostgreSQL uses the OS collations. What you are getting matches my
Ubuntu 14.04 machine:
kgrittn@Kevin-Desktop:~/pg/master$ echo $LANG
en_US.UTF-8
kgrittn@Kevin-Desktop:~/pg/master$ sort <<XXX
Feneric
Generic
-
--
-F
1
11
11F
1F
a
b
C
F
XXX
-
--
1
11
11F
1F
a
b
C
F
-F
Feneric
Generic
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 08/07/2014 08:53 AM, Steve Clark wrote:
On 08/07/2014 11:36 AM, Kevin Grittner wrote:
Steve Clark <sclark@netwolves.com> wrote:
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyHi Kevin,
Thanks for the response. I get the same thing as postgres if I sort from
the command line too. But I don't understand why.
Because that is how en_US.UTF-8 sorts. You are probably looking for the
'C' sort order. For more explanations see:
http://rhaas.blogspot.com/2012/03/perils-of-collation-aware-comparisons.html
http://superuser.com/questions/227925/in-utf-8-collation-why-11-is-less-then-1
I would expect '-F' to sort immediately after '-' and '--' not after 'F' as
well as ' Feneric' (spaceFeneric) and ' Generic' (spaceGeneric) to sort
immediately after ' ' (space).It is like the space character and the - in -Letter is ignored.
See the second link above for why that is.
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Import Notes
Reply to msg id not found: 1407425788.77494.YahooMailNeo@web122301.mail.ne1.yahoo.com1407426790353-003-00164565.sclark.netwolves.com@sclark66.netwolves.comReference msg id not found: 53E3A0E5.7010303@netwolves.com
Steve Clark <sclark@netwolves.com> wrote:
It is like the space character and the - in -Letter is ignored.
Yes, that is how the en_US collation is defined. I think the goal
is to make it something like "phone book" ordering. If you still
have a "white pages" book around, look at how a business name with
a hyphen (or a hyphenated last name) is sorted there.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Import Notes
Reply to msg id not found: 53E3A0E5.7010303@netwolves.com
On 08/07/2014 12:10 PM, Adrian Klaver wrote:
On 08/07/2014 08:53 AM, Steve Clark wrote:
On 08/07/2014 11:36 AM, Kevin Grittner wrote:
Steve Clark <sclark@netwolves.com> wrote:
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyHi Kevin,
Thanks for the response. I get the same thing as postgres if I sort from
the command line too. But I don't understand why.Because that is how en_US.UTF-8 sorts. You are probably looking for the
'C' sort order. For more explanations see:http://rhaas.blogspot.com/2012/03/perils-of-collation-aware-comparisons.html
http://superuser.com/questions/227925/in-utf-8-collation-why-11-is-less-then-1
I would expect '-F' to sort immediately after '-' and '--' not after 'F' as
well as ' Feneric' (spaceFeneric) and ' Generic' (spaceGeneric) to sort
immediately after ' ' (space).It is like the space character and the - in -Letter is ignored.
See the second link above for why that is.
Thanks Adrian and Kevin,
for the explanations and links. They were very elucidating. At least now I understand what is happening.
Thanks again.
--
Stephen Clark
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general