group by

Started by yoursoft@freemail.hualmost 20 years ago3 messagesbugs
Jump to latest
#1yoursoft@freemail.hu
yoursoft@freemail.hu

Dear Developers,

There is a possible bug in 'select ... group by' SQL.
I reported it on the bugreport form on the web. (I think it number is
2416?). But no any reaction to it.
It is not problem for me, I make my results in other way. But it is
possible problem for other pepoples.

I reproduced it in smaller database table in other way. (with 180000
records) (postgresql 8.03 on linux)
e.g.:
there is a table:
stat-# \d summary
Table "public.summary"
colum | Type | MĂłdosĂ­tĂł
-----------+------------------------+----------
kifejezes | character varying(300) | not null
cnt | integer | not null
talalat | integer |
Indexes:
"idx_summary_cnt" btree (cnt) CLUSTER
"idx_summary_kifejezes" btree (kifejezes text_pattern_ops)

1)
select kifejezes, count(kifejezes) from summary group by kifejezes
having count(kifejezes)>1;
the result is:
kifejezes | count
-------------------------+-------
csĂşcscsajok | 2
jĂĄszszentandrĂĄs | 3
kullancscsĂ­pĂŠs | 2
magannyugdijpenztar | 2
magĂĄnnyugdijpĂŠnztĂĄr | 2
magĂĄnnyugdĂ­jpĂŠnztĂĄr | 3
magĂĄnnyugdĂ­jpĂŠnztĂĄrak | 2
mŹvel�dÊsszociológia | 2
otp magĂĄnnyugdĂ­jpĂŠnztĂĄr | 2
(9 rows)

2)
select * from summary where kifejezes like 'jegygy%';
kifejezes | cnt | talalat
------------+-----+---------
jegygyĂźrĹą | 4 | 0
jegygyĹąrĹą | 5 | 0
jegygyĹąrĹą | 7 | 0
jegygyĹąrĹą | 12 | 0
jegygyĹąrĹąk | 3 | 0
(5 rows)

Why not is in the first query results the "jegygyĹąrĹą" (second query
rows )?

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: yoursoft@freemail.hu (#1)
Re: group by

YourSoft <yoursoft@freemail.hu> writes:

1)
select kifejezes, count(kifejezes) from summary group by kifejezes
having count(kifejezes)>1;
the result is:
kifejezes | count
-------------------------+-------
csúcscsajok | 2
jászszentandrás | 3
kullancscsípés | 2
magannyugdijpenztar | 2
magánnyugdijpénztár | 2
magánnyugdíjpénztár | 3
magánnyugdíjpénztárak | 2
művel�désszociológia | 2
otp magánnyugdíjpénztár | 2
(9 rows)

2)
select * from summary where kifejezes like 'jegygy%';
kifejezes | cnt | talalat
------------+-----+---------
jegygyürű | 4 | 0
jegygyűrű | 5 | 0
jegygyűrű | 7 | 0
jegygyűrű | 12 | 0
jegygyűrűk | 3 | 0
(5 rows)

Why not is in the first query results the "jegygyűrű" (second query
rows )?

We've seen problems like this occur when you have mismatched locale and
encoding specifications --- that can confuse strcoll() to the point that
it gives inconsistent results, and since all PG character comparisons
depend on strcoll(), you get all sorts of bizarre behavior. Check the
LC_COLLATE and LC_CTYPE settings of the database, and make sure that you
have selected a database encoding that matches them.

Also, if you're using Hungarian locale, you probably need to update to
PG 8.0.6 or later. See bug fix list at
http://developer.postgresql.org/docs/postgres/release-8-0-6.html

regards, tom lane

#3yoursoft@freemail.hu
yoursoft@freemail.hu
In reply to: Tom Lane (#2)
Re: group by

Dear Tom,

Thanks for suggestion. I upgrade my database to 8.1.3 and it is solve
the problem :-)

Regards,
Ferenc