behavior of ' = NULL' vs. MySQL vs. Standards

Started by Mark Stosbergover 24 years ago19 messages
#1Mark Stosberg
mark@summersault.com

Hello,

I'm a long time Postgres user who uses MySQL when I have to. I recently
ran into an issue with MySQL where this construct didn't do what I expect:

WHERE date_column = NULL

I expected it to work like "date_column IS NULL" like it does it
Postgres 7.0.2, but instead it returned an empty result set.

After conversing with some folks on the MySQL list, it was mentioned that:

* "NULL is *NOT* a value. It's an absence of a value, and doing *any*
comparisons with NULL is invalid (the result must always be NULL, even
if you say "foo = NULL")."

* Postgres handling is non-standard (even if it's intuitive.)

My questions then are: 1.) What IS the standard for handling NULLs? and
then 2.) If Postgres handling is different than the standard, what's the
reason?

To me, having " = NULL" be the same as " IS NULL" is intuitive and thus
useful, but I also like appeal of using standards when possible. :)

Thanks!

-mark

http://mark.stosberg.com/

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Mark Stosberg (#1)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

On Wed, 6 Jun 2001, Mark Stosberg wrote:

Hello,

I'm a long time Postgres user who uses MySQL when I have to. I recently
ran into an issue with MySQL where this construct didn't do what I expect:

WHERE date_column = NULL

I expected it to work like "date_column IS NULL" like it does it
Postgres 7.0.2, but instead it returned an empty result set.

After conversing with some folks on the MySQL list, it was mentioned that:

* "NULL is *NOT* a value. It's an absence of a value, and doing *any*
comparisons with NULL is invalid (the result must always be NULL, even
if you say "foo = NULL")."

* Postgres handling is non-standard (even if it's intuitive.)

My questions then are: 1.) What IS the standard for handling NULLs? and
then 2.) If Postgres handling is different than the standard, what's the
reason?

To me, having " = NULL" be the same as " IS NULL" is intuitive and thus
useful, but I also like appeal of using standards when possible. :)

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

In general, =NULL should be avoided in favor of IS NULL by users that are
generating their own queries.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#2)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

Microsoft Access is the guilty party, IIRC. I recently tried to stir up
some interest in changing this behavior back to the standard, but
apparently there are still too many people using broken versions of
Access.

A compromise answer might be to offer a SET variable that selects the
Microsoft-compatible misimplementation. Would that fly?

regards, tom lane

#4Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#3)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

On Wed, 6 Jun 2001, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

Microsoft Access is the guilty party, IIRC. I recently tried to stir up
some interest in changing this behavior back to the standard, but
apparently there are still too many people using broken versions of
Access.

A compromise answer might be to offer a SET variable that selects the
Microsoft-compatible misimplementation. Would that fly?

It would for me. I'd rather have the default be the spec correct behavior
and let people configure their server to follow the misinterpretation.
Is the conversion just the hack in the grammar rules for
a_expr '=' a_expr?

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#4)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

A compromise answer might be to offer a SET variable that selects the
Microsoft-compatible misimplementation. Would that fly?

It would for me. I'd rather have the default be the spec correct behavior
and let people configure their server to follow the misinterpretation.

Yes, I agree.

Is the conversion just the hack in the grammar rules for
a_expr '=' a_expr?

AFAIK that's it. It might be a little tricky to do cleanly,
though, since gram.y shouldn't look at SET vars. Really, the
conversion would have to be done in analyze.c instead.

regards, tom lane

#6Thomas Lockhart
lockhart@fourpalms.org
In reply to: Stephan Szabo (#2)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

Microsoft Access is the guilty party, IIRC. I recently tried to stir up
some interest in changing this behavior back to the standard, but
apparently there are still too many people using broken versions of
Access.

Since according to the standard "column = NULL" is a near-useless
construct (equivalent to "FALSE") it does not seem to pollute the
grammar much to allow an M$ compatible interpretation. I was not happy
having it added (much better to ask that responsive, customer-focused
company to fix their language compliance) but now that it is there it
seems to be an isolated and manageable feature.

- Thomas

#7Mark Stosberg
mark@summersault.com
In reply to: Stephan Szabo (#4)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Stephan Szabo wrote:

On Wed, 6 Jun 2001, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

I'd rather have the default be the spec correct behavior
and let people configure their server to follow the misinterpretation.

I like that idea as well. Someone like me who didn't know that this
feature was in there for M$ could have assumed it _was_ standard
behavior, and started using it as a habit. Then when I started porting
my code to another database, I'd have an extra surprise in for me. :)

Rather than being an option targeted at just this piece of grammer,
perhaps it could a piece of a potentially larger option of "stricter
standards compliance." I realize there are a number of useful extensions
to the SQL standard in Postgres (which I like and use.), but it seems
like there would be uses for minimizing non-standard behavior, as well.

Thank you all for your contributions to Postgres-- I use it everyday. :)

-mark

http://mark.stosberg.com/

#8ANDREW PERRIN
aperrin@email.unc.edu
In reply to: Mark Stosberg (#7)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Interesting - my experience is that Access, at least, generally treats
NULL's correctly:

(This was done under Access 2000):
create table foo (name text(20))
insert into foo values ("bar");
insert into foo values ("bar");
insert into foo values ("bar");
insert into foo values ("bar");
insert into foo values (NULL);
insert into foo values (NULL);
insert into foo values (NULL);
insert into foo values (NULL);

select count(*) from foo where name=NULL;
returns 0

select count(*) from foo where name is null;
returns 4

select count(*) from foo where name <> "bar";
returns 0

Cheers,
Andy

---------------------------------------------------------
Andrew J. Perrin - Assistant Professor of Sociology
University of North Carolina, Chapel Hill
269 Hamilton Hall CB#3210, Chapel Hill, NC 27599-3210 USA
andrew_perrin@unc.edu - http://www.unc.edu/~aperrin

On Thu, 7 Jun 2001, Mark Stosberg wrote:

Show quoted text

Stephan Szabo wrote:

On Wed, 6 Jun 2001, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Yes, column = NULL should *never* return true according to the spec (it
should always return NULL in fact as stated). The reason for breaking
with the spec is AFAIK to work with broken microsoft clients that seem to
think that =NULL is a meaningful test and generate queries using that.

I'd rather have the default be the spec correct behavior
and let people configure their server to follow the misinterpretation.

I like that idea as well. Someone like me who didn't know that this
feature was in there for M$ could have assumed it _was_ standard
behavior, and started using it as a habit. Then when I started porting
my code to another database, I'd have an extra surprise in for me. :)

Rather than being an option targeted at just this piece of grammer,
perhaps it could a piece of a potentially larger option of "stricter
standards compliance." I realize there are a number of useful extensions
to the SQL standard in Postgres (which I like and use.), but it seems
like there would be uses for minimizing non-standard behavior, as well.

Thank you all for your contributions to Postgres-- I use it everyday. :)

-mark

http://mark.stosberg.com/

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#9Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#3)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Tom,

Microsoft Access is the guilty party, IIRC. I recently tried to stir
up
some interest in changing this behavior back to the standard, but
apparently there are still too many people using broken versions of
Access.

Which verison? I work with MSAccess 97 a lot, and the supported syntax
is "IS NULL".

A compromise answer might be to offer a SET variable that selects the
Microsoft-compatible misimplementation. Would that fly?

Sounds good to me. By default, I'd like to see "=NULL" turned OFF, as
I've been writing my SQL on the idea that "=NULL" will return NULL, and
if "=NULL" starts returning a value it's going to screw several things
up ...

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#10Tom Ivar Helbekkmo
tih@kpnQwest.no
In reply to: Tom Lane (#3)
Re: [HACKERS] Re: behavior of ' = NULL' vs. MySQL vs. Standards

Tom Lane <tgl@sss.pgh.pa.us> writes:

A compromise answer might be to offer a SET variable that selects the
Microsoft-compatible misimplementation. Would that fly?

I'd say that's the best way to handle stuff like this. If you
implement something that breaks the standard, to be compatible with
arrogant programmers at Microsoft who feel that they are above such
things, you're encouraging users to develop bad habits. Better to
make them go "SET MICROSOFT BUG COMPATIBILITY ON" first.

To see the effect of leniency in re standards, just try to browse
around the web with a browser that rejects incorrect HTML! :-)

-tih
--
The basic difference is this: hackers build things, crackers break them.

#11Mike Mascari
mascarm@mascari.com
In reply to: Tom Ivar Helbekkmo (#10)
RE: [HACKERS] Re: behavior of ' = NULL' vs. MySQL vs. Standards

The best solution would be to have the ODBC translate instances of '=
NULL' into IS NULL before submitting the query to PostgreSQL. I'm
sure this is how other vendors, like Oracle handle the issue. Well,
probably sure... :-)

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From: Tom Ivar Helbekkmo [SMTP:tih@kpnQwest.no]
Sent: Thursday, June 07, 2001 3:18 PM
To: Tom Lane
Cc: Stephan Szabo; Mark Stosberg; pgsql-sql@postgresql.org;
pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Re: [SQL] behavior of ' = NULL' vs. MySQL vs.
Standards

Tom Lane <tgl@sss.pgh.pa.us> writes:

A compromise answer might be to offer a SET variable that selects

the

Microsoft-compatible misimplementation. Would that fly?

I'd say that's the best way to handle stuff like this. If you
implement something that breaks the standard, to be compatible with
arrogant programmers at Microsoft who feel that they are above such
things, you're encouraging users to develop bad habits. Better to
make them go "SET MICROSOFT BUG COMPATIBILITY ON" first.

To see the effect of leniency in re standards, just try to browse
around the web with a browser that rejects incorrect HTML! :-)

#12Tom Ivar Helbekkmo
tih@kpnQwest.no
In reply to: Mike Mascari (#11)
Re: [HACKERS] Re: behavior of ' = NULL' vs. MySQL vs. Standards

Mike Mascari <mascarm@mascari.com> writes:

The best solution would be to have the ODBC translate instances of '=
NULL' into IS NULL before submitting the query to PostgreSQL. I'm
sure this is how other vendors, like Oracle handle the issue. Well,
probably sure... :-)

<applause>

That's the intelligent solution, of course.

-tih
--
The basic difference is this: hackers build things, crackers break them.

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#9)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

"Josh Berkus" <josh@agliodbs.com> writes:

Microsoft Access is the guilty party, IIRC.

Which verison? I work with MSAccess 97 a lot, and the supported syntax
is "IS NULL".

In our last go-round on this (pghackers around 14 Jan 2001),
Mike Mascari wrote:

: I don't have Office 2000, but I can confirm Access 97 generates such
: queries. The query-builder doesn't generate the 'key = NULL' query, but the
: use of the Forms interface does.

regards, tom lane

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mike Mascari (#11)
Re: [HACKERS] Re: behavior of ' = NULL' vs. MySQL vs. Standards

Mike Mascari <mascarm@mascari.com> writes:

The best solution would be to have the ODBC translate instances of '=
NULL' into IS NULL before submitting the query to PostgreSQL.

Does Access only talk to Postgres via ODBC? (Seems plausible, but I
don't know.) What about people copying-and-pasting queries generated
by Access; is it even possible?

I doubt that this'd really help much, in any case; it merely moves the
necessary switch from the backend to ODBC.

regards, tom lane

#15Mike Mascari
mascarm@mascari.com
In reply to: Tom Lane (#14)
RE: Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards

The best solution would be to have the ODBC translate instances

of '=

NULL' into IS NULL before submitting the query to PostgreSQL.

Tom Lane wrote:

Does Access only talk to Postgres via ODBC? (Seems plausible, but

I

don't know.) What about people copying-and-pasting queries

generated

by Access; is it even possible?

1. Yes, Access only talkes to Postgres via ODBC.

2. The queries generate by Access which use the '= NULL' vs. 'IS
NULL' language are only generated through the use of Access Forms,
when some attempts to fetch a record through the forms interface
where the key of the linked table is an empty edit control. When
access generates queries, it correctly generates 'IS NULL' language.
It is a *very specific problem* WRT the user of Access Forms.

I doubt that this'd really help much, in any case; it merely moves

the

necessary switch from the backend to ODBC.

It seems to me that is where it belongs. The ODBC driver is already
doing translation of some Access functions not found in PostgreSQL
such as LCASE->lower, etc. The kludge isn't found in any other RDBMS
of which I am aware (certainly not Oracle), and yet Access works fine
with them,

Just MHO of course,

Mike Mascari
mascarm@mascari.com

#16Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#13)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

Tom, Mike:

In our last go-round on this (pghackers around 14 Jan 2001),
Mike Mascari wrote:

: I don't have Office 2000, but I can confirm Access 97 generates
such
: queries. The query-builder doesn't generate the 'key = NULL' query,
but the
: use of the Forms interface does.

As someone who develops professionally for MSAccess<->ODBC Servers
(PostgreSQL and MS SQL Server) I'd say that we can ignore this. There
are a *lot* of circumstances where the "Filter by Form" interface breaks
down; I don't think that it works properly with MS SQL Server 7.0,
either.

At most, rather than modifying the query parser and building extra
compile-time options, how about talking to the Great bridge folks about
modifying PgODBC? That seems like the appropriate place to tell the
system that "= NULL" is equivalent to "IS NULL" -- if anywhere.

-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#17Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#14)
Re: [HACKERS] Re: behavior of ' = NULL' vs. MySQL vs. Standards

Tom,

Does Access only talk to Postgres via ODBC? (Seems plausible, but I
don't know.)

Yes.

What about people copying-and-pasting queries generated

by Access; is it even possible?

Yes, but MS Access uses some much non-standard SQL that "= Null" would
hardly be the only problem (try to build an UPDATE ... FROM in Access)

I doubt that this'd really help much, in any case; it merely moves
the
necessary switch from the backend to ODBC.

Yes, but it does avoid cluttering up the PgSQL source with switches that
are only useful to a handful of users.

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh@agliodbs.com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#16)
Re: behavior of ' = NULL' vs. MySQL vs. Standards

"Josh Berkus" <josh@agliodbs.com> writes:

: I don't have Office 2000, but I can confirm Access 97 generates such
: queries. The query-builder doesn't generate the 'key = NULL' query,
: but the use of the Forms interface does.

As someone who develops professionally for MSAccess<->ODBC Servers
(PostgreSQL and MS SQL Server) I'd say that we can ignore this. There
are a *lot* of circumstances where the "Filter by Form" interface breaks
down; I don't think that it works properly with MS SQL Server 7.0,
either.

This is an interesting comment. Does anyone else want to confirm or
refute it? If Access' forms interface is so badly broken that few
people use it anyway, then I'd say that we should not break standards
compatibility just to support it.

regards, tom lane

#19Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Mike Mascari (#15)
Re: Re: [SQL] behavior of ' = NULL' vs. MySQL vs. Standards

Mike Mascari wrote:

The best solution would be to have the ODBC translate instances

of '=

NULL' into IS NULL before submitting the query to PostgreSQL.

Tom Lane wrote:

Does Access only talk to Postgres via ODBC? (Seems plausible, but

I

don't know.) What about people copying-and-pasting queries

generated

by Access; is it even possible?

1. Yes, Access only talkes to Postgres via ODBC.

2. The queries generate by Access which use the '= NULL' vs. 'IS
NULL' language are only generated through the use of Access Forms,
when some attempts to fetch a record through the forms interface
where the key of the linked table is an empty edit control. When
access generates queries, it correctly generates 'IS NULL' language.
It is a *very specific problem* WRT the user of Access Forms.

Could you send me an example using Access Forms ?
DAO seems to translate '= NULL' into 'IS NULL' properly.
Hmm parameter bindings( expr = ? ) could be the cause.

I doubt that this'd really help much, in any case; it merely moves

the

necessary switch from the backend to ODBC.

It seems to me that is where it belongs. The ODBC driver is already
doing translation of some Access functions not found in PostgreSQL
such as LCASE->lower, etc.

It doesn't seem that easy.
Detecting '= NULL' isn't sufficient.
'SET .. expr = NULL' mustn't be tranlated into
'SET .. expr IS NULL'. The '= NULL' must be inside
a where clause.
Unfortunately the current psqlodbc driver parses
little.

regards,
Hiroshi Inoue