Q: How ORDER BY is being done inetrnally?
Greetings,
We are trying to fix another problem we have with Turkish locale.
Order in which tuples are returned by SELECT .. ORDER BY .
statement differs from the one returned by UNIX sort utility or
what strcoll() function will suggest. It is specific to a locale so
you will not likely be able to reproduce it.
I understand that with ORDER BY clause, values are compared
in case-insensitive manner as well as it should be the case with
the values stored in indices. And I suspect that problem is hidden
somewhere there.
I tried to find where this conversion and comparison is being done
in source code but could not, despite all my efforts. I would very
much appreciate if someone would point out what functions are
being called while sorting data for return for ORDER BY clause.
Thanks in advance,
Nicolai Tufar
"Nicolai Tufar" <ntufar@pisem.net> writes:
We are trying to fix another problem we have with Turkish locale.
Order in which tuples are returned by SELECT .. ORDER BY .
statement differs from the one returned by UNIX sort utility or
what strcoll() function will suggest.
Text sorting depends on strcoll() and nothing but. See varstr_cmp().
regards, tom lane
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]Text sorting depends on strcoll() and nothing but. See varstr_cmp().
I see, apparently sort done for "ORDER BY" clause is case-sensitive.
But problem is still there. It is about "I"-with-dot and "I"-without-dot
in Turkish again. While all UNIX programs put "I"-without-dot before
"I"-with-dot, as it should be, PostgreSQL puts it in reverse order.
I examine the code for any possible gotchas, but I am confused about
what function is being called by what. Especially that all those sort
methods and functions are not hard-coded but stored in pg_am* catalogue
tables. Could someone please explain -very briefly- what exactly is
happening when a sort is performed. A kind of stack trace: which
function
calls which one would be very appreciated.
Best regards,
Nicolai Tufar
Okay, I figured out what was the problem.
glibc's LC_COLLATE file under /usr/lib/locale/tr_TR
is wrong! And it has been wrong for many years now. And
nobody noticed it. PostgreSQL is innocent here.
I checked it many times over and over again with
test programs and different environment settings and
it is wrong! I will contact glibc team now.
Thanks a lot for help.
Regards,
Nicolai Tufar