Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'
Hi there,
I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'
I am running Postgres 8.02 with a database whose character encoding is
UNICODE.
The SQL Query
SELECT *
FROM member
WHERE name LIKE 'O%'
OR
name like '�%'
ORDER BY name
returns this:
�hlmann
�hmann
Obenaus
Ochoa
O'Donovan
Oehme
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer
According to german sorting rules the result is fine except the both
first entries "�hlmann" and "�hmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
Obenaus
Ochoa
O'Donovan
Oehme
�hlmann
�hmann
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer
The same problem accours when using "E" where my result is this:
�lie de Beaumont
Eberer
Ec�
Edding
Emmer
The proper result should be:
Eberer
Ec�
Edding
�lie de Beaumont
Emmer
Any idea how I can solve this problem?
Thank you very much in advance,
Nico
Nico Grubert <nicogrubert@arcor.de> writes:
I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'
I am running Postgres 8.02 with a database whose character encoding is
UNICODE.
... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)
regards, tom lane
... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)
Can I find out out these settings in "phpPgAdmin"?
Or can I use LC_COLLATE and LC_CTYPE in the SQL Query?
Ah, I found it:
lc_collate: de_DE@euro
lc_ctype: de_DE@euro
I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'I am running Postgres 8.02 with a database whose character encoding is
UNICODE.The SQL Query
SELECT *
FROM member
WHERE name LIKE 'O%'
OR
name like '�%'
ORDER BY namereturns this:
�hlmann
�hmann
Obenaus
Ochoa
O'Donovan
Oehme
Oklant
Oltub
Olt�ch
Oltutz
Olt�werAccording to german sorting rules the result is fine except the both
first entries "�hlmann" and "�hmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
Obenaus
Ochoa
O'Donovan
Oehme
�hlmann
�hmann
Oklant
Oltub
Olt�ch
Oltutz
Olt�werThe same problem accours when using "E" where my result is this:
�lie de Beaumont
Eberer
Ec�
Edding
EmmerThe proper result should be:
Eberer
Ec�
Edding
�lie de Beaumont
EmmerAny idea how I can solve this problem?
Thank you very much in advance,
Nico
To complete the missing information, here are the variables set for the
databases:
add_missing_from on
archive_command unset
australian_timezones off
authentication_timeout 60
bgwriter_delay 200
bgwriter_maxpages 100
bgwriter_percent 1
block_size 8192
check_function_bodies on
checkpoint_segments 3
checkpoint_timeout 300
checkpoint_warning 30
client_encoding UNICODE
client_min_messages notice
commit_delay 0
commit_siblings 5
cpu_index_tuple_cost 0.001
cpu_operator_cost 0.0025
cpu_tuple_cost 0.01
custom_variable_classes unset
DateStyle ISO, MDY
db_user_namespace off
deadlock_timeout 1000
debug_pretty_print off
debug_print_parse off
debug_print_plan off
debug_print_rewritten off
debug_shared_buffers 0
default_statistics_target 10
default_tablespace unset
default_transaction_isolation read committed
default_transaction_read_only off
default_with_oids on
effective_cache_size 1000
enable_hashagg on
enable_hashjoin on
enable_indexscan on
enable_mergejoin on
enable_nestloop on
enable_seqscan on
enable_sort on
enable_tidscan on
explain_pretty_print on
extra_float_digits 0
from_collapse_limit 8
fsync on
geqo on
geqo_effort 5
geqo_generations 0
geqo_pool_size 0
geqo_selection_bias 2
geqo_threshold 12
integer_datetimes on
join_collapse_limit 8
lc_collate de_DE@euro
lc_ctype de_DE@euro
lc_messages de_DE@euro
lc_monetary de_DE@euro
lc_numeric de_DE@euro
lc_time de_DE@euro
listen_addresses localhost
log_connections off
log_destination stderr
log_disconnections off
log_duration off
log_error_verbosity default
log_executor_stats off
log_hostname off
log_line_prefix unset
log_min_duration_statement -1
log_min_error_statement panic
log_min_messages notice
log_parser_stats off
log_planner_stats off
log_rotation_age 1440
log_rotation_size 10240
log_statement none
log_statement_stats off
log_truncate_on_rotation off
maintenance_work_mem 16384
max_connections 100
max_files_per_process 1000
max_fsm_pages 20000
max_fsm_relations 1000
max_function_args 32
max_identifier_length 63
max_index_keys 32
max_locks_per_transaction 64
max_stack_depth 2048
password_encryption on
port 5432
pre_auth_delay 0
random_page_cost 4
redirect_stderr off
regex_flavor advanced
rendezvous_name unset
search_path $user,public
server_encoding UNICODE
server_version 8.0.2
shared_buffers 1000
silent_mode off
sql_inheritance on
ssl off
statement_timeout 0
stats_block_level off
stats_command_string off
stats_reset_on_server_start on
stats_row_level off
stats_start_collector on
superuser_reserved_connections 2
syslog_facility LOCAL0
syslog_ident postgres
TimeZone Europe/Berlin
trace_notify off
transaction_isolation read committed
transaction_read_only off
transform_null_equals off
unix_socket_group unset
unix_socket_permissions 511
vacuum_cost_delay 0
vacuum_cost_limit 200
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
wal_buffers 8
wal_sync_method fdatasync
work_mem 1024
zero_damaged_pages off
Is there any explaination why the result is not sorted properly?
Nico Grubert wrote:
Ah, I found it:
lc_collate: de_DE@euro
lc_ctype: de_DE@euro
This is an iso-8859-15 locale, isn't it?
If your database encoding is UNICODE, I believe you'd have more success
using an UTF8 locale, such as de_DE.UTF-8 in your case.
--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org