PG SQL and LIKE clause

Started by Matthias Apitzover 6 years ago11 messagesgeneral
Jump to latest
#1Matthias Apitz
guru@unixarea.de

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

Thanks

matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

Mientras haya voluntad de lucha habrá esperanza de vencer.

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Matthias Apitz (#1)
Re: PG SQL and LIKE clause

Hi

pá 13. 9. 2019 v 7:29 odesílatel Matthias Apitz <guru@unixarea.de> napsal:

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

There is not simple solution - you should to write own patch and used
patched postgres.

Pavel

Show quoted text

Thanks

matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/
+49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

Mientras haya voluntad de lucha habrá esperanza de vencer.

#3Ron
ronljohnsonjr@gmail.com
In reply to: Matthias Apitz (#1)
Re: PG SQL and LIKE clause

On 9/13/19 12:28 AM, Matthias Apitz wrote:

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

Did you forget the "%"?  Because the SQL standard which PostgreSQL follows is:

SELECT ... FROM ... WHERE name LIKE 'Ali%'

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

SQL Server derives from Sybase, and it would also fail on this statement.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

Thanks

matthias

--
Angular momentum makes the world go 'round.

#4Matthias Apitz
guru@unixarea.de
In reply to: Pavel Stehule (#2)
Re: PG SQL and LIKE clause

El día Friday, September 13, 2019 a las 07:33:10AM +0200, Pavel Stehule escribió:

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

There is not simple solution - you should to write own patch and used
patched postgres.

Hello,

As we compiled our own PG 11.4 and as we're alone on the server with
our software plus PG, this could be an option to consider.

Do you have an idea where to look for this in the PG source. Maybe it's
just deleting the trailing blanks from the column content before applying
the string compare with regular expressions.

Thanks

matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
May, 9: Спаси́бо освободители! Thank you very much, Russian liberators!

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Matthias Apitz (#4)
Re: PG SQL and LIKE clause

pá 13. 9. 2019 v 8:49 odesílatel Matthias Apitz <guru@unixarea.de> napsal:

El día Friday, September 13, 2019 a las 07:33:10AM +0200, Pavel Stehule
escribió:

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

There is not simple solution - you should to write own patch and used
patched postgres.

Hello,

As we compiled our own PG 11.4 and as we're alone on the server with
our software plus PG, this could be an option to consider.

Do you have an idea where to look for this in the PG source. Maybe it's
just deleting the trailing blanks from the column content before applying
the string compare with regular expressions.

I am not a expert on this area, and I don't know all dependency. Some code
is in src/backend/utils/adt/like.c

Pavel

Show quoted text

Thanks

matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/
+49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
May, 9: Спаси́бо освободители! Thank you very much, Russian liberators!

#6Thomas Kellerer
spam_eater@gmx.net
In reply to: Matthias Apitz (#1)
Re: PG SQL and LIKE clause

Matthias Apitz schrieb am 13.09.2019 um 07:28:

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

In a nutshell: do not use the CHAR data type (neither in Oracle nor in Postgres)

Unless your application actively checks the data type name of columns in a table you can simply change the type of those columns to varchar (or text) and get rid of the annoying (but required) behaviour of the CHAR type.

To be honest, I am surprised this didn't show up in Oracle, as I think the CHAR behaviour there is the same as in Postgres.

Thomas

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ron (#3)
Re: PG SQL and LIKE clause

On 9/12/19 11:11 PM, Ron wrote:

On 9/13/19 12:28 AM, Matthias Apitz wrote:

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

Did you forget the "%"?  Because the SQL standard which PostgreSQL
follows is:

SELECT ... FROM ... WHERE name LIKE 'Ali%'

The above is if you want to find a string starting with 'Ali'. If you
are looking for the complete string 'Ali' then it is appropriate. The OP
is looking for a way to automatically match a complete string against a
right stripped string from a CHAR field.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

SQL Server derives from Sybase, and it would also fail on this statement.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

Thanks

    matthias

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias Apitz (#1)
Re: PG SQL and LIKE clause

Matthias Apitz <guru@unixarea.de> writes:

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem?

Don't use char(n) for anything more complicated than cases like
it-must-be-a-2-character-state-abbreviation. You can use varchar(n)
if you must have an upper limit on the field length. Or text.

regards, tom lane

#9Ron
ronljohnsonjr@gmail.com
In reply to: Adrian Klaver (#7)
Re: PG SQL and LIKE clause

On 9/13/19 9:14 AM, Adrian Klaver wrote:

On 9/12/19 11:11 PM, Ron wrote:

On 9/13/19 12:28 AM, Matthias Apitz wrote:

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

Did you forget the "%"?  Because the SQL standard which PostgreSQL
follows is:

SELECT ... FROM ... WHERE name LIKE 'Ali%'

The above is if you want to find a string starting with 'Ali'. If you are
looking for the complete string 'Ali' then it is appropriate. The OP is
looking for a way to automatically match a complete string against a right
stripped string from a CHAR field.

This is highly dependent on implementation.  On the RDBMS that I used to
work on, trailing whitespace was automatically stripped from CHAR(xx) fields.

--
Angular momentum makes the world go 'round.

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Ron (#9)
Re: PG SQL and LIKE clause

On 9/13/19 7:35 AM, Ron wrote:

On 9/13/19 9:14 AM, Adrian Klaver wrote:

On 9/12/19 11:11 PM, Ron wrote:

On 9/13/19 12:28 AM, Matthias Apitz wrote:

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on
Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

Did you forget the "%"?  Because the SQL standard which PostgreSQL
follows is:

SELECT ... FROM ... WHERE name LIKE 'Ali%'

The above is if you want to find a string starting with 'Ali'. If you
are looking for the complete string 'Ali' then it is appropriate. The
OP is looking for a way to automatically match a complete string
against a right stripped string from a CHAR field.

This is highly dependent on implementation.  On the RDBMS that I used to
work on, trailing whitespace was automatically stripped from CHAR(xx)
fields.

That is what is driving the OP's question. On Sybase the trailing
whitespace is stripped apparently, on Postgres it is not, at least for
the purposes of LIKE. Though for other comparisons whitespaces are
ignored. See below for more information:

https://www.postgresql.org/docs/11/datatype-character.html

--
Adrian Klaver
adrian.klaver@aklaver.com

#11John W Higgins
wishdev@gmail.com
In reply to: Matthias Apitz (#1)
Re: PG SQL and LIKE clause

Is this a possibility?

From
https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP
-

~ 'ali[ ]*$' matches strings ending in ali that have zero of more spaces
after ali

this would match

'bali'
'ali'
'ali '
'bali '

If you need full string then it would be like this

~ '^ali[ ]*$' matches string only containing ali plus zero or more spaces
after ali

this would match

'ali'
'ali '

but not match

'bali'
'bali '

Also switching ~* for ~ makes if case insensitive if necessary.

John

On Thu, Sep 12, 2019 at 10:29 PM Matthias Apitz <guru@unixarea.de> wrote:

Show quoted text

Hello,

We're porting a huge Library Management System, written using all kind
of languages one can think of (C, C++, ESQL/C, Perl, Java, ...) on Linux
from the DBS Sybase to PG, millions of lines of code, which works also
with DBS Oracle and in the past with INFORMIX-SE and -ONLINE.

We got to know that in CHAR columns with trailing blanks a

SELECT ... FROM ... WHERE name LIKE 'Ali'

does not match in 'name' having 'Ali '.

I glanced through our code with grep pipelines and found some hundred
places which would be affected by this problem. I'm not interested in a
religious discussion if or if not this behaviour of PG is correcter or
better than in Sybase. It's just different to Sybase.

Any hints to address this problem? Or is there any compile time option
for the PG server to address this?

Thanks

matthias
--
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/
+49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

Mientras haya voluntad de lucha habrá esperanza de vencer.