BUG #6535: LEFT JOIN on large table is altering data

Started by Aren Cambreabout 14 years ago11 messagesbugs
Jump to latest
#1Aren Cambre
aren@arencambre.com

The following bug has been logged on the website:

Bug reference: 6535
Logged by: Aren Cambre
Email address: aren@arencambre.com
PostgreSQL version: 9.1.3
Operating system: Windows 7 x64
Description:

(Note that this issue is described fully at
http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data.
A few knowledgeable people have weighed in, and there is no solution
identified.)

In a nutshell, I have a query LEFT JOINing table A with table B and
inserting the data into table A'. A' is a functional equivalent of table A.
Table B is actually built from a using a WITH at the top. I'm using it to
help add a rank() to the table.

Table A has just over 2 million rows.

There is a column named citation_id that has no null values. Not only does
citation_id have a NOT NULL constraint, I have verified with a query that it
has no null values.

The LEFT JOIN is pulling table A's citation_id. However, when the joined
table is inserted into table A', Postgres throws "null value in column
"citation_id" violates not-null constraint" errors.

Note that I am running Postgres 32 bit for PostGIS compatibility.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aren Cambre (#1)
Re: BUG #6535: LEFT JOIN on large table is altering data

aren@arencambre.com writes:

(Note that this issue is described fully at
http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data.
A few knowledgeable people have weighed in, and there is no solution
identified.)

This might be a bug, but you've not provided sufficient information for
someone else to reproduce the problem. What would be good is a SQL
script that reproduces the error from a standing start (empty database).

regards, tom lane

#3Aren Cambre
aren@arencambre.com
In reply to: Tom Lane (#2)
Re: BUG #6535: LEFT JOIN on large table is altering data

Thanks. While creating that very file, I discovered that 1 row had blanks
in every field despite a column having a *NOT NULL* constraint and another
column being a* *serial. Removing that column appears to fixed the problem.

Something about that column made Postgres unhappy, though. If I ran these
queries:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect is to
reduce the number of rows returned, could cause *more* rows to be returned.
That seems buggy to me.

Aren

On Fri, Mar 16, 2012 at 12:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

aren@arencambre.com writes:

(Note that this issue is described fully at

http://stackoverflow.com/questions/9713537/postgres-left-join-is-creating-new-data
.

A few knowledgeable people have weighed in, and there is no solution
identified.)

This might be a bug, but you've not provided sufficient information for
someone else to reproduce the problem. What would be good is a SQL
script that reproduces the error from a standing start (empty database).

regards, tom lane

#4Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Aren Cambre (#3)
Re: BUG #6535: LEFT JOIN on large table is altering data

Aren Cambre <aren@arencambre.com> wrote:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect
is to reduce the number of rows returned, could cause *more* rows
to be returned. That seems buggy to me.

Never assume that the rows will be returned in any particular order
from a query unless you specify ORDER BY. Assuming, as you seem to
be doing, that rows from the left side of a UNION will be output
before rows from the right side is not safe. You have no way of
knowing which row in a result set like that came from which of the
UNIONed SELECTs. In this case your assumption is almost certainly
wrong.

-Kevin

#5Aren Cambre
aren@arencambre.com
In reply to: Kevin Grittner (#4)
Re: BUG #6535: LEFT JOIN on large table is altering data

Kevin,

You're absolutely correct; there's no telling what order the planner will
use.

I did in fact run the queries separately and got the same result that I
intimated in the UNIONed queries. I can no longer reproduce, however,
because I have since altered the table. So if there is a bug, I may not be
much help in nailing it down.

The only thing I can help with is this appears to have happened at row
# 583847 of just over 2 million rows, per a straight CSV dump of the table.

Aren

On Sun, Mar 18, 2012 at 3:22 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov

Show quoted text

wrote:

Aren Cambre <aren@arencambre.com> wrote:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect
is to reduce the number of rows returned, could cause *more* rows
to be returned. That seems buggy to me.

Never assume that the rows will be returned in any particular order
from a query unless you specify ORDER BY. Assuming, as you seem to
be doing, that rows from the left side of a UNION will be output
before rows from the right side is not safe. You have no way of
knowing which row in a result set like that came from which of the
UNIONed SELECTs. In this case your assumption is almost certainly
wrong.

-Kevin

#6Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#4)
Re: BUG #6535: LEFT JOIN on large table is altering data

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

Aren Cambre <aren@arencambre.com> wrote:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect
is to reduce the number of rows returned, could cause *more* rows
to be returned.

Never assume that the rows will be returned in any particular
order from a query unless you specify ORDER BY.

Hmm. That doesn't explain why the numbers don't add up, though. Is
that a copy/paste from an actual query run, or was there some
hand-editing there? In particular, you might easily get that result
if that last line was really:

WHERE citation_id = ''

instead of the IS NULL test. In the ANSI standard and in PostgreSQL
there is a big difference between an empty string and NULL, although
there is at least one product I know of which breaks from standard
compliance by treating them as equivalent.

-Kevin

#7Aren Cambre
aren@arencambre.com
In reply to: Kevin Grittner (#6)
Re: BUG #6535: LEFT JOIN on large table is altering data

Kevin,

Thanks. That was a direct copy/paste. It happened that the *UNION*ed
queries spat out those results in the same order that the queries appeared.

Just want to again emphasize that my database state has changed, so I am
not sure this is remains a good case for finding a bug.

Aren

On Sun, Mar 18, 2012 at 3:33 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov

Show quoted text

wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

Aren Cambre <aren@arencambre.com> wrote:

SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NOT NULL
UNION
SELECT COUNT(*)
FROM consistent.master
UNION
SELECT COUNT(*)
FROM consistent.master
WHERE citation_id IS NULL

I got this result:

2085344
2085343
0

Not clear how adding a WHERE clause, whose only practical effect
is to reduce the number of rows returned, could cause *more* rows
to be returned.

Never assume that the rows will be returned in any particular
order from a query unless you specify ORDER BY.

Hmm. That doesn't explain why the numbers don't add up, though. Is
that a copy/paste from an actual query run, or was there some
hand-editing there? In particular, you might easily get that result
if that last line was really:

WHERE citation_id = ''

instead of the IS NULL test. In the ANSI standard and in PostgreSQL
there is a big difference between an empty string and NULL, although
there is at least one product I know of which breaks from standard
compliance by treating them as equivalent.

-Kevin

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aren Cambre (#7)
Re: BUG #6535: LEFT JOIN on large table is altering data

Aren Cambre <aren@arencambre.com> writes:

Just want to again emphasize that my database state has changed, so I am
not sure this is remains a good case for finding a bug.

Your description of an apparently-all-nulls row sounds a great deal like
data corruption. I suspect there may be more wrong than just the one
row, and that that's what's leading to the inconsistent results. It'd
be worthwhile to run hardware diagnostics (memtest, disk tests), make
sure you are on the latest minor release of Postgres, and perhaps do a
database dump and reload. (If the latter seems impractical, at least
consider reindexing everything.)

regards, tom lane

#9Aren Cambre
aren@arencambre.com
In reply to: Tom Lane (#8)
Re: BUG #6535: LEFT JOIN on large table is altering data

Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
drive, encrypted with TrueCrypt. I suspect that I would have been alerted
already with that or SMART if I had hardware issues?

I am on 9.1.3, although I think the data was inserted back when I was
running 9.0.x. I upgraded to 9.1 almost as soon as it came out, doing the
requisite dumps and reloads. I have also reindexed a couple of times as
diagnostic steps.

So far, looks like only that one row has data corruption.

Aren

On Sun, Mar 18, 2012 at 9:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Aren Cambre <aren@arencambre.com> writes:

Just want to again emphasize that my database state has changed, so I am
not sure this is remains a good case for finding a bug.

Your description of an apparently-all-nulls row sounds a great deal like
data corruption. I suspect there may be more wrong than just the one
row, and that that's what's leading to the inconsistent results. It'd
be worthwhile to run hardware diagnostics (memtest, disk tests), make
sure you are on the latest minor release of Postgres, and perhaps do a
database dump and reload. (If the latter seems impractical, at least
consider reindexing everything.)

regards, tom lane

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Aren Cambre (#9)
Re: BUG #6535: LEFT JOIN on large table is altering data

Excerpts from Aren Cambre's message of dom mar 18 23:39:16 -0300 2012:

Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
drive, encrypted with TrueCrypt. I suspect that I would have been alerted
already with that or SMART if I had hardware issues?

That sounds excessively optimistic -- not necessarily a good
characteristic on a database administrator.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#11Aren Cambre
aren@arencambre.com
In reply to: Alvaro Herrera (#10)
Re: BUG #6535: LEFT JOIN on large table is altering data

Thanks. I am running this on 1 year old hardware on a wholly-encrypted C
drive, encrypted with TrueCrypt. I suspect that I would have been alerted
already with that or SMART if I had hardware issues?

That sounds excessively optimistic -- not necessarily a good
characteristic on a database administrator.

Which is why I could never be a DBA. :-)

I am the only user. This is for my doctoral research project.

Aren