is this a bug or I am blind?
online=# select * from common_logins where username = 'potyty';
uid | username | password | lastlogin | status | usertype | loginnum
-----+----------+----------+-----------+--------+----------+----------
(0 rows)
online=# select * from common_logins where username like 'potyty';
uid | username | password | lastlogin | status |
usertype | loginnum
--------+----------+----------+----------------------------+--------+----------+----------
155505 | potyty | board | 2004-08-16 17:45:55.723829 | A |
S | 1
60067 | potyty | board | 2004-07-07 20:22:17.68699 | A |
S | 3
174041 | potyty | board | 2005-02-17 00:00:13.706144 | A |
S | 3
(3 rows)
online=# select username, username = 'potyty' from common_logins where
username like 'potyty';
username | ?column?
----------+----------
potyty | t
potyty | t
potyty | t
(3 rows)
psql 8.0.3, Debian.
Mage
am 15.12.2005, um 14:12:23 +0100 mailte Mage folgendes:
online=# select * from common_logins where username = 'potyty';
uid | username | password | lastlogin | status | usertype | loginnum
-----+----------+----------+-----------+--------+----------+----------
(0 rows)online=# select * from common_logins where username like 'potyty';
uid | username | password | lastlogin | status |
usertype | loginnum
--------+----------+----------+----------------------------+--------+----------+----------
155505 | potyty | board | 2004-08-16 17:45:55.723829 | A | S
| 1
60067 | potyty | board | 2004-07-07 20:22:17.68699 | A | S
| 3
174041 | potyty | board | 2005-02-17 00:00:13.706144 | A | S
| 3
(3 rows)
Try:
select *, length(username), length('potyty') from common_logins where username like 'potyty';
My guess:
select length(username) from common_logins where username like 'potyty';
is _NOT_ 6, there is a SPACE like 'potyty '.
HTH, Andreas
--
Andreas Kretschmer (Kontakt: siehe Header)
Heynitz: 035242/47212, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
=== Schollglas Unternehmensgruppe ===
On 12/15/05, A. Kretschmer <andreas.kretschmer@schollglas.com> wrote:
am 15.12.2005, um 14:12:23 +0100 mailte Mage folgendes:
online=# select * from common_logins where username = 'potyty';
uid | username | password | lastlogin | status | usertype | loginnum
-----+----------+----------+-----------+--------+----------+----------
(0 rows)online=# select * from common_logins where username like 'potyty';
uid | username | password | lastlogin | status |
usertype | loginnum
--------+----------+----------+----------------------------+--------+----------+----------
155505 | potyty | board | 2004-08-16 17:45:55.723829 | A | S
| 1
60067 | potyty | board | 2004-07-07 20:22:17.68699 | A | S
| 3
174041 | potyty | board | 2005-02-17 00:00:13.706144 | A | S
| 3
(3 rows)Try:
select *, length(username), length('potyty') from common_logins where username like 'potyty';
My guess:
select length(username) from common_logins where username like 'potyty';
is _NOT_ 6, there is a SPACE like 'potyty '.
even is that is true i think you need the "comodin characters" (ie: %.
_) to make "like" behave different from simple comparisons
HTH, Andreas
--
Andreas Kretschmer (Kontakt: siehe Header)
Heynitz: 035242/47212, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
=== Schollglas Unternehmensgruppe ===
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
[snip]
even is that is true i think you need the "comodin characters" (ie: %.
_) to make "like" behave different from simple comparisons
Not entirely true, if the database was initialized in a different locale
than C, then the direct comparison will probably go for an index on
username, while "like" will not. Which points to a possible index
corruption... which might be interesting for the developers to
investigate, but I would guess a reindex will solve the problem for the
OP if he has it urgent...
Cheers,
Csaba.
Mage wrote:
online=# select * from common_logins where username = 'potyty';
online=# select * from common_logins where username like 'potyty';
It's probably worth seeing whether these have different plans (EXPLAIN
ANALYSE...) and if the = is using an index but like isn't.
If so, try issuing "set enable_indexscan=false" first and see what
happens then.
If that makes a difference then I'd guess you have one of two things:
1. A corrupt index (check the REINDEX command)
2. (perhaps more likely) Some localisation issues.
What encoding/locale settings are you using?
--
Richard Huxton
Archonet Ltd
On 12/15/05, Csaba Nagy <nagy@ecircle-ag.com> wrote:
[snip]
even is that is true i think you need the "comodin characters" (ie: %.
_) to make "like" behave different from simple comparisonsNot entirely true, if the database was initialized in a different locale
than C, then the direct comparison will probably go for an index on
username, while "like" will not. Which points to a possible index
corruption... which might be interesting for the developers to
investigate, but I would guess a reindex will solve the problem for the
OP if he has it urgent...Cheers,
Csaba.
Mage if it's not urgent maybe you can make a post in -hackers and
follow instructions about how to get more info to see what happened
here...
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
[snip]
If that makes a difference then I'd guess you have one of two things:
1. A corrupt index (check the REINDEX command)
2. (perhaps more likely) Some localisation issues.
What encoding/locale settings are you using?
Based on the 3rd query of the OP, where the direct comparison results in
"true" for all the rows which matched the "like", I would exclude the
localisation issues variant... unless = is not equals in all cases ;-)
Cheers,
Csaba.
A. Kretschmer wrote:
select *, length(username), length('potyty') from common_logins where username like 'potyty';
My guess:
select length(username) from common_logins where username like 'potyty';
is _NOT_ 6, there is a SPACE like 'potyty '.
If you look my 3rd query, you will see that there are no spaces, however:
select *, length(username), length('potyty') from common_logins where
username like 'potyty';
uid | username | password | lastlogin | status |
usertype | loginnum | length | length
--------+----------+----------+----------------------------+--------+----------+----------+--------+--------
155505 | potyty | board | 2004-08-16 17:45:55.723829 | A |
S | 1 | 6 | 6
60067 | potyty | board | 2004-07-07 20:22:17.68699 | A |
S | 3 | 6 | 6
174041 | potyty | board | 2005-02-17 00:00:13.706144 | A |
S | 3 | 6 | 6
(3 rows)
Mage
Csaba Nagy wrote:
[snip]
even is that is true i think you need the "comodin characters" (ie: %.
_) to make "like" behave different from simple comparisonsNot entirely true, if the database was initialized in a different locale
than C, then the direct comparison will probably go for an index on
username, while "like" will not. Which points to a possible index
corruption... which might be interesting for the developers to
investigate, but I would guess a reindex will solve the problem for the
OP if he has it urgent...
I thought that it may be a locale problem, but:
- look at my 3rd query
- potyty doesn't contain special chars
# EXPLAIN ANALYZE select * from common_logins where username = 'potyty';
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using common_logins_username_idx on common_logins
(cost=0.00..4.30 rows=1 width=47) (actual time=0.056..0.056 rows=0 loops=1)
Index Cond: ((username)::text = 'potyty'::text)
Total runtime: 0.109 ms
(3 rows)
online=# EXPLAIN ANALYZE select * from common_logins where username like
'potyty';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on common_logins (cost=0.00..63833.88 rows=1 width=47)
(actual time=180.333..262.492 rows=3 loops=1)
Filter: ((username)::text ~~ 'potyty'::text)
Total runtime: 262.551 ms
(3 rows)
I tried it in two databases (dump and load to another one), so I don't
think that we have corrupted indexes.
I can try on a newer version of postgresql on another server.
By the way, if this is a bug then it's a serious one. We have it in
production environment.
Mage
If you look my 3rd query, you will see that there are no spaces, however:
select *, length(username), length('potyty') from common_logins where
username like 'potyty';
uid | username | password | lastlogin | status |
usertype | loginnum | length | length
--------+----------+----------+----------------------------+--------+----------+----------+--------+--------
155505 | potyty | board | 2004-08-16 17:45:55.723829 | A |
S | 1 | 6 | 6
60067 | potyty | board | 2004-07-07 20:22:17.68699 | A |
S | 3 | 6 | 6
174041 | potyty | board | 2005-02-17 00:00:13.706144 | A |
S | 3 | 6 | 6
(3 rows)
Mage
I am puzzled by the lack of a "%" in the LIKE query. When I try this on postgres 7.4 and 8.0 I get no rows when I am missing it; including it works as expected.
The names have been changed to protect the guilty ;-} but the core of it is true -- no "%" means wierdnesses, I think.
gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home';
gex_clientname
----------------
(0 rows)
gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home%';
gex_clientname
--------------------------------------------------------------
HomeHappinesses
HomeMorgageValues, Inc.
(2 rows)
Could you try your query again with the wild card ?
HTH
Greg Williamson
DBA
GlobeXplorer LLC
Import Notes
Resolved by subject fallback
I tried it in two databases (dump and load to another one), so I don't
think that we have corrupted indexes.
the problem persist after a dump a reload? then there is not because
an index corruption... can you send part of the data that reproduces
the bug?
I can try on a newer version of postgresql on another server.
By the way, if this is a bug then it's a serious one. We have it in
production environment.Mage
what locales do you have? encoding?
--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
Csaba Nagy wrote:
[snip]
If that makes a difference then I'd guess you have one of two things:
1. A corrupt index (check the REINDEX command)
2. (perhaps more likely) Some localisation issues.
What encoding/locale settings are you using?Based on the 3rd query of the OP, where the direct comparison results in
"true" for all the rows which matched the "like", I would exclude the
localisation issues variant... unless = is not equals in all cases ;-)
Well spotted Csaba - that _would_ seem to point to the index.
--
Richard Huxton
Archonet Ltd
Gregory S. Williamson wrote:
I am puzzled by the lack of a "%" in the LIKE query. When I try this on postgres 7.4 and 8.0 I get no rows when I am missing it; including it works as expected.
The names have been changed to protect the guilty ;-} but the core of it is true -- no "%" means wierdnesses, I think.
gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home';
gex_clientname
----------------
(0 rows)gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home%';
gex_clientname
--------------------------------------------------------------
HomeHappinesses
HomeMorgageValues, Inc.
(2 rows)Could you try your query again with the wild card ?
But "like" without any wildcards should be the same as "=", but it isn't
in the original post.
--
Richard Huxton
Archonet Ltd
Jaime Casanova wrote:
I tried it in two databases (dump and load to another one), so I don't
think that we have corrupted indexes.the problem persist after a dump a reload? then there is not because
an index corruption... can you send part of the data that reproduces
the bug?I can try on a newer version of postgresql on another server.
By the way, if this is a bug then it's a serious one. We have it in
production environment.Mage
what locales do you have? encoding?
The problem can be reproduced on pgsql 8.0.3, compiled from source.
This is a third machine where the bug persists. I dumped the table and
loaded in.
I cannot send the table to you because it contains user data and
passwords. I will try to create a fake one with the same problem.
./configure --prefix=/usr/local/pgsql --with-python
client_encoding | LATIN2
lc_collate | hu_HU
lc_ctype | hu_HU
lc_messages | en_US
lc_monetary | en_US
lc_numeric | en_US
lc_time | en_US
server_encoding | LATIN2
server_version | 8.0.3
Mage
Well, then I have the disease. The database is UNICODE:
gex_runtime=# select gex_clientname from gex_clients where gex_clientname = 'HomeHappinesses';
gex_clientname
--------------------------------------------------------------
HomeGain
(1 row)
gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'HomeHappinesses';
gex_clientname
----------------
(0 rows)
Gregory S. Williamson wrote:
I am puzzled by the lack of a "%" in the LIKE query. When I try this on
postgres 7.4 and 8.0 I get no rows when I am missing it; including it
works as expected.The names have been changed to protect the guilty ;-} but the core of it
is true -- no "%" means wierdnesses, I think.gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home';
gex_clientname
----------------
(0 rows)gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'Home%';
gex_clientname
--------------------------------------------------------------
HomeHappinesses
HomeMorgageValues, Inc.
(2 rows)Could you try your query again with the wild card ?
But "like" without any wildcards should be the same as "=", but it isn't
in the original post.
Well then I have the same behavior -- the database is UNICODE:
gex_runtime=# select gex_clientname from gex_clients where gex_clientname = 'HomeHappinesses';
gex_clientname
--------------------------------------------------------------
HomeHappinesses
(1 row)
gex_runtime=# select gex_clientname from gex_clients where gex_clientname like 'HomeHappinesses';
gex_clientname
----------------
(0 rows)
An "=" is not equivalent to LIKE with no wildcard.
I never really thought of this as a bug, but if it is ... ring one up for Mage as a good catch.
Import Notes
Resolved by subject fallback
Mage wrote:
./configure --prefix=/usr/local/pgsql --with-python
client_encoding | LATIN2
lc_collate | hu_HU
lc_ctype | hu_HU
lc_messages | en_US
lc_monetary | en_US
lc_numeric | en_US
lc_time | en_US
server_encoding | LATIN2
server_version | 8.0.3
I have created a table that can be sent to you to examine the bug.
I am actually vacuuming it for further testing.
Where can I upload it? Sorry, I cannot host it.
Mage
Richard Huxton <dev@archonet.com> writes:
Csaba Nagy wrote:
Based on the 3rd query of the OP, where the direct comparison results in
"true" for all the rows which matched the "like", I would exclude the
localisation issues variant... unless = is not equals in all cases ;-)
Well spotted Csaba - that _would_ seem to point to the index.
No, localization issues should be real high on your list. In particular
I wonder whether this is the old bugaboo of using a database encoding
that's incompatible with the postmaster's locale setting. We've seen
that on some platforms strcoll() gets completely confused by this and
returns comparison results that are not even self-consistent.
Non-self-consistent comparison results can lead to an index that is
either actually or effectively corrupt (because index searches proceed
down the wrong tree path and thus fail to find items that should be
found). So the observation that only index searches fail is consistent
with this idea.
regards, tom lane
Richard Huxton <dev@archonet.com> writes:
But "like" without any wildcards should be the same as "=", but it isn't
in the original post.
I'm too lazy to go look at the code right now, but I think that the
reduction of "x LIKE constant-pattern" to "x = constant-pattern" is
part of the LIKE index optimization code, which means it'd only get
done in C locale. We're missing a bet there perhaps.
regards, tom lane
Ok, that explains then the problem... but the index is arguably corrupt
in this case, with the important difference that it's not even fixable
by reindex...
I guess what the OP really wants is a solution to his problem. If the
table is not too big, a quick fix would be to just drop the index.
Then figure out an encoding+locale combination which can be used to
properly host the same sample data which leads to failure here and still
satisfies the OP's string sorting and other needs, and then dump
reload... or is there other better way to fix things ?
I guess a note in the docs about not using the same
encoding/locale/(postgres version?)/(OS?) combination as the OP would
make sense too ?
Cheers,
Csaba.
Show quoted text
On Thu, 2005-12-15 at 16:20, Tom Lane wrote:
Richard Huxton <dev@archonet.com> writes:
Csaba Nagy wrote:
Based on the 3rd query of the OP, where the direct comparison results in
"true" for all the rows which matched the "like", I would exclude the
localisation issues variant... unless = is not equals in all cases ;-)Well spotted Csaba - that _would_ seem to point to the index.
No, localization issues should be real high on your list. In particular
I wonder whether this is the old bugaboo of using a database encoding
that's incompatible with the postmaster's locale setting. We've seen
that on some platforms strcoll() gets completely confused by this and
returns comparison results that are not even self-consistent.Non-self-consistent comparison results can lead to an index that is
either actually or effectively corrupt (because index searches proceed
down the wrong tree path and thus fail to find items that should be
found). So the observation that only index searches fail is consistent
with this idea.regards, tom lane