Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

Started by Nico Grubertover 20 years ago6 messagesgeneral
Jump to latest
#1Nico Grubert
nicogrubert@arcor.de

Hi there,

I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'

I am running Postgres 8.02 with a database whose character encoding is
UNICODE.

The SQL Query

SELECT *
FROM member
WHERE name LIKE 'O%'
OR
name like '�%'
ORDER BY name

returns this:
�hlmann
�hmann
Obenaus
Ochoa
O'Donovan
Oehme
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer

According to german sorting rules the result is fine except the both
first entries "�hlmann" and "�hmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
Obenaus
Ochoa
O'Donovan
Oehme
�hlmann
�hmann
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer

The same problem accours when using "E" where my result is this:
�lie de Beaumont
Eberer
Ec�
Edding
Emmer

The proper result should be:
Eberer
Ec�
Edding
�lie de Beaumont
Emmer

Any idea how I can solve this problem?

Thank you very much in advance,
Nico

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nico Grubert (#1)
Re: Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

Nico Grubert <nicogrubert@arcor.de> writes:

I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'

I am running Postgres 8.02 with a database whose character encoding is
UNICODE.

... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)

regards, tom lane

#3Nico Grubert
nicogrubert@arcor.de
In reply to: Tom Lane (#2)
Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

... but what locale is it using? (See LC_COLLATE and LC_CTYPE.)

Can I find out out these settings in "phpPgAdmin"?
Or can I use LC_COLLATE and LC_CTYPE in the SQL Query?

#4Nico Grubert
nicogrubert@arcor.de
In reply to: Tom Lane (#2)
Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

Ah, I found it:

lc_collate: de_DE@euro
lc_ctype: de_DE@euro

#5Nico Grubert
nicogrubert@arcor.de
In reply to: Nico Grubert (#1)
Re: [GENERAL] Sorting problems with SELECT * FROM table WHERE name LIKE 'Ö%'

I have a problem when sorting records with:
SELECT * FROM table WHERE name LIKE '�%'

I am running Postgres 8.02 with a database whose character encoding is
UNICODE.

The SQL Query

SELECT *
FROM member
WHERE name LIKE 'O%'
OR
name like '�%'
ORDER BY name

returns this:
�hlmann
�hmann
Obenaus
Ochoa
O'Donovan
Oehme
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer

According to german sorting rules the result is fine except the both
first entries "�hlmann" and "�hmann".
Why do appear these records at the beginning of the list?
The proper result should read like this:
Obenaus
Ochoa
O'Donovan
Oehme
�hlmann
�hmann
Oklant
Oltub
Olt�ch
Oltutz
Olt�wer

The same problem accours when using "E" where my result is this:
�lie de Beaumont
Eberer
Ec�
Edding
Emmer

The proper result should be:
Eberer
Ec�
Edding
�lie de Beaumont
Emmer

Any idea how I can solve this problem?

Thank you very much in advance,
Nico

To complete the missing information, here are the variables set for the
databases:
add_missing_from on
archive_command unset
australian_timezones off
authentication_timeout 60
bgwriter_delay 200
bgwriter_maxpages 100
bgwriter_percent 1
block_size 8192
check_function_bodies on
checkpoint_segments 3
checkpoint_timeout 300
checkpoint_warning 30
client_encoding UNICODE
client_min_messages notice
commit_delay 0
commit_siblings 5
cpu_index_tuple_cost 0.001
cpu_operator_cost 0.0025
cpu_tuple_cost 0.01
custom_variable_classes unset
DateStyle ISO, MDY
db_user_namespace off
deadlock_timeout 1000
debug_pretty_print off
debug_print_parse off
debug_print_plan off
debug_print_rewritten off
debug_shared_buffers 0
default_statistics_target 10
default_tablespace unset
default_transaction_isolation read committed
default_transaction_read_only off
default_with_oids on
effective_cache_size 1000
enable_hashagg on
enable_hashjoin on
enable_indexscan on
enable_mergejoin on
enable_nestloop on
enable_seqscan on
enable_sort on
enable_tidscan on
explain_pretty_print on
extra_float_digits 0
from_collapse_limit 8
fsync on
geqo on
geqo_effort 5
geqo_generations 0
geqo_pool_size 0
geqo_selection_bias 2
geqo_threshold 12
integer_datetimes on
join_collapse_limit 8
lc_collate de_DE@euro
lc_ctype de_DE@euro
lc_messages de_DE@euro
lc_monetary de_DE@euro
lc_numeric de_DE@euro
lc_time de_DE@euro
listen_addresses localhost
log_connections off
log_destination stderr
log_disconnections off
log_duration off
log_error_verbosity default
log_executor_stats off
log_hostname off
log_line_prefix unset
log_min_duration_statement -1
log_min_error_statement panic
log_min_messages notice
log_parser_stats off
log_planner_stats off
log_rotation_age 1440
log_rotation_size 10240
log_statement none
log_statement_stats off
log_truncate_on_rotation off
maintenance_work_mem 16384
max_connections 100
max_files_per_process 1000
max_fsm_pages 20000
max_fsm_relations 1000
max_function_args 32
max_identifier_length 63
max_index_keys 32
max_locks_per_transaction 64
max_stack_depth 2048
password_encryption on
port 5432
pre_auth_delay 0
random_page_cost 4
redirect_stderr off
regex_flavor advanced
rendezvous_name unset
search_path $user,public
server_encoding UNICODE
server_version 8.0.2
shared_buffers 1000
silent_mode off
sql_inheritance on
ssl off
statement_timeout 0
stats_block_level off
stats_command_string off
stats_reset_on_server_start on
stats_row_level off
stats_start_collector on
superuser_reserved_connections 2
syslog_facility LOCAL0
syslog_ident postgres
TimeZone Europe/Berlin
trace_notify off
transaction_isolation read committed
transaction_read_only off
transform_null_equals off
unix_socket_group unset
unix_socket_permissions 511
vacuum_cost_delay 0
vacuum_cost_limit 200
vacuum_cost_page_dirty 20
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
wal_buffers 8
wal_sync_method fdatasync
work_mem 1024
zero_damaged_pages off

Is there any explaination why the result is not sorted properly?

#6Daniel Verite
daniel@manitou-mail.org
In reply to: Nico Grubert (#4)
Re: Re: [GENERAL] Sorting problems with SELECT * FROM

Nico Grubert wrote:

Ah, I found it:

lc_collate: de_DE@euro
lc_ctype: de_DE@euro

This is an iso-8859-15 locale, isn't it?
If your database encoding is UNICODE, I believe you'd have more success
using an UTF8 locale, such as de_DE.UTF-8 in your case.

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org