index corruption on composite primary key indexes
I've noticed what appears to be index corruption on composite primary
key indexes during my testing. Data deletes, updates, and inserts are
applied from delta data that is loaded into temporary tables. The
duplicate key error occurs at different points in time and isn't
isolated to any single table, although all affected tables have a
composite primary key index. For example, if machine1 and machine2 both
start off from the same starting state, then machine1 might fail on
delta 100 while machine2 won't fail until delta 125. The error in the
log is "duplicate key value violates unique constraint" with a reference
to the composite primary key index. The issue occurs on different
machines, so that would seemingly rule out hardware failure. The fact
that it only happens on tables with composite primary keys is highly
suspicious. It occurs reproducibly enough that it seems to be a pgsql
bug, or maybe a pgsql bug specific to a certain hardware + software combination.
Some info on the platform I'm using:
PostgreSQL 8.4.4
Red Hat 4.1.2-44
Linux 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64
x86_64 GNU/Linux
Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
Does anyone know if this is a known issue? Any help would be much
appreciated.
This could quite possibly be correct behaviour. "duplicate key value
violates unique constraint" usually happens when you try to insert a
row into a table that already contains a row with the same value(s) for
the key column(s). If you have two connections both trying to insert a
new record with the same key, one of them will succeed and the other
will fail with "duplicate key value violates unique constraint". The
first transaction to commit wins here, so timing variations may cause
the unpredictable behaviour you're seeing.
Probably you should catch this error and decide what the proper action
to take is; it might be ignoring the second insert, it might be
updating the already existing row with that key or something else
entirely, depending on your application.
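To illustrate the "update the already existing row" option: PostgreSQL 8.4 has no built-in upsert, so a common pattern is a PL/pgSQL function that traps unique_violation. A minimal sketch against a hypothetical table `item` with a composite key `(id_a, id_b)` (all names here are invented for the example):

```sql
-- Hypothetical upsert for a table with composite key (id_a, id_b).
-- PostgreSQL 8.4 predates INSERT ... ON CONFLICT, so we loop and trap
-- the unique_violation raised when a concurrent insert wins the race.
CREATE FUNCTION upsert_item(p_a integer, p_b integer, p_val text)
RETURNS void AS $$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE item SET val = p_val WHERE id_a = p_a AND id_b = p_b;
        IF found THEN
            RETURN;
        END IF;
        -- No row there: try to insert; another transaction may beat us.
        BEGIN
            INSERT INTO item (id_a, id_b, val) VALUES (p_a, p_b, p_val);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Lost the race; loop back and update the winner's row.
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The loop matters: between the failed UPDATE and our INSERT, another session may commit a row with the same key, and between a failed INSERT and our retry, that row may be deleted again.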
Best Regards
Mikael Krantz
On Tue, Dec 14, 2010 at 3:18 AM, Ng, Stan <sng@automotive.com> wrote:
On 12/14/2010 10:18 AM, Ng, Stan wrote:
Some info on the platform I'm using:
Please also show your schema and query / queries. If you think it's
corruption, provide your postgresql logs too, including any mention of
the names of the suspect indexes.
Personally I agree with Mikael Krantz; you haven't shown any evidence of
index corruption, and you're most likely hitting concurrency problems in
your code.
--
Craig Ringer
Ah, I forgot to mention why I suspected corruption. The delta
application is a single threaded, sequential process. Each delta is done
within a single transaction. If it fails, there is retry logic that will
continue to attempt the delta application forever, with about a 5 to 10
minute wait. This will fail every time until a manual REINDEX command is
run on the composite primary key index in-between retry attempts. I've
also backed up the data to another table w/ the same indices and applied
the delta data manually without any problems while this error was
happening.
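For reference, the manual workaround is a one-liner, using the index name that appears in the stack trace further down. Note that in 8.4 REINDEX locks out writes to the parent table (and reads that would use the index) for the duration, which is why it has to be slotted between retry attempts:

```sql
-- Rebuild the suspect unique index from the heap. Blocks writes to
-- vehicleusediccgradiuscache while it runs, so run it between the
-- delta-application retry attempts.
REINDEX INDEX vehicleusediccgradiuscache_i00;
```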
Oddly enough, even with full logging on, the pgsql logs don't contain
anything about the duplicate key error or suspect index. Perhaps another
important piece of information is that we're doing all this in Java
using the JDBC drivers, so we're seeing this as part of our stacktrace.
i.e.
org.postgresql.util.PSQLException: ERROR: duplicate key value violates
unique constraint "vehicleusediccgradiuscache_i00"
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2102)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1835)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:257)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:500)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:388)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:381)
    ...
The index looks like this:
CREATE UNIQUE INDEX vehicleusediccgradiuscache_i00
ON vehicleusediccgradiuscache
USING btree
(datasetid, lower(postalcode::text), productgroupid,
vehicleclassgroupidic, vehicleusedidic, vehicleuseddisplaypriority,
vehicleusedsearchradius);
A single delta is calculated on the master server, which is pulled by
each client server. Each client puts the delta data into temp tables
(one each for delete, update, and insert, applied in that order), and
modifies the data via SQL somewhat like the following:
insert into VehicleUsedICCGRadiusCache (
    DatasetId, PostalCode, ProductGroupId, VehicleClassGroupIdIC,
    VehicleUsedIdIC, VehicleUsedDisplayPriority, VehicleUsedSearchRadius,
    "VehicleUsedPrice.average", "VehicleUsedPrice.min",
    "VehicleUsedPrice.max", "VehicleUsedIntId.distinct_count")
select
    t1.DatasetId, t1.PostalCode, t1.ProductGroupId, t1.VehicleClassGroupIdIC,
    t1.VehicleUsedIdIC, t1.VehicleUsedDisplayPriority, t1.VehicleUsedSearchRadius,
    t1."VehicleUsedPrice.average", t1."VehicleUsedPrice.min",
    t1."VehicleUsedPrice.max", t1."VehicleUsedIntId.distinct_count"
from t1
left outer join VehicleUsedICCGRadiusCache on
    VehicleUsedICCGRadiusCache.VehicleUsedSearchRadius = t1.VehicleUsedSearchRadius
    and VehicleUsedICCGRadiusCache.ProductGroupId = t1.ProductGroupId
    and VehicleUsedICCGRadiusCache.VehicleUsedDisplayPriority = t1.VehicleUsedDisplayPriority
    and VehicleUsedICCGRadiusCache.VehicleUsedIdIC = t1.VehicleUsedIdIC
    and VehicleUsedICCGRadiusCache.DatasetId = t1.DatasetId
    and VehicleUsedICCGRadiusCache.PostalCode = t1.PostalCode
    and VehicleUsedICCGRadiusCache.VehicleClassGroupIdIC = t1.VehicleClassGroupIdIC
where VehicleUsedICCGRadiusCache.VehicleUsedSearchRadius is null
-----Original Message-----
From: Craig Ringer [mailto:craig@postnewspapers.com.au]
Sent: Tuesday, December 14, 2010 4:47 AM
To: Ng, Stan
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] index corruption on composite primary key indexes
Just to follow up -- I did find the error in the pgsql logs after all,
although it does not contain any additional information, e.g.: PST ERROR:
duplicate key value violates unique constraint
-----Original Message-----
From: Ng, Stan
Sent: Tuesday, December 14, 2010 10:42 AM
To: 'Craig Ringer'
Cc: pgsql-bugs@postgresql.org
Subject: RE: [BUGS] index corruption on composite primary key indexes
I'm afraid that I'm a bit out of my depth when it comes to index corruption.
I did note a possibly unrelated problem: the unique index includes
lower(postalcode::text), but your left join is made on
VehicleUsedICCGRadiusCache.PostalCode = t1.PostalCode. It seems likely
that you do this left join to exclude rows already present in
VehicleUsedICCGRadiusCache. If PostalCode can contain mixed case, this
is one possible source of duplicate key violations: differently
cased PostalCodes will not be excluded by your left join but will be
considered identical by the unique index.
This might not be connected to your current problem at all, since it
seems like something that would fail or succeed consistently
regardless of reindexing, but it could be useful anyway. Changing the
join condition to use lower() would also let the join use
more of the index and could improve performance.
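The mechanism can be shown on a hypothetical minimal table (names invented for the illustration; not the poster's schema):

```sql
-- A unique index on an expression, as in the original schema:
CREATE TABLE cache (code text);
CREATE UNIQUE INDEX cache_u ON cache (lower(code));

INSERT INTO cache VALUES ('AB1');

-- A case-sensitive anti-join does not see 'ab1' as already present:
SELECT 1 FROM cache WHERE code = 'ab1';   -- returns no rows

-- ...so the row passes the duplicate filter, then hits the index:
INSERT INTO cache VALUES ('ab1');
-- ERROR: duplicate key value violates unique constraint "cache_u"
```

Rewriting the join condition as lower(PostalCode) = lower(t1.PostalCode) makes the filter agree with the index's notion of equality.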
Best Regards
Mikael Krantz
On Tue, Dec 14, 2010 at 7:42 PM, Ng, Stan <sng@automotive.com> wrote:
Ah, thanks for pointing that out! It is an inconsistency that I should clean up. A lot of the SQL is generated programmatically, so it gets a bit hairy sometimes. In this particular case, I have confirmed that all the postalcodes are actually US zips, so the lower() is pretty unnecessary.
The more I dig into this, the more I suspect it's an issue w/ MVCC. This bug is somewhat similar, although in our case we have a pre-existing index: http://archives.postgresql.org/pgsql-bugs/2010-12/msg00033.php
I also found an instance of this duplicate key error in the logs where it recovered w/o manual intervention after 2 retries. My guess is that auto-vacuum cleared out some old duplicate row. And just to clarify for anyone who might have missed my other update, I did find the duplicate key error logged in the PostgreSQL logs.
Now I just need to get my head wrapped around the corner cases of MVCC + indexes. From a layman's perspective, it seems the index should only operate on the latest version. Perhaps there's a pgsql configuration option or hint on the index that I'm missing... The hunt continues!
Cheers,
-- stan
-----Original Message-----
From: Mikael Krantz [mailto:mk@zigamorph.se]
Sent: Tuesday, December 14, 2010 4:36 PM
To: Ng, Stan
Cc: Craig Ringer; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] index corruption on composite primary key indexes
Excerpts from Ng, Stan's message of Wed Dec 15 21:07:57 -0300 2010:
Now I just need to get my head wrapped around the corner cases of MVCC
+ indexes. From a layman's perspective, it seems the index should only
operate on the latest version. Perhaps there's a pgsql configuration
option or hint on the index that I'm missing... The hunt continues!
All tuples need to have index pointers, even the dead ones. They (the
index pointers to dead tuples) are also cleaned up by vacuum. There's
an optimization called HOT that allows them to be skipped in some cases;
you may want to read
http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/backend/access/heap/README.HOT
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On 16.12.2010 16:50, Alvaro Herrera wrote:
Excerpts from Ng, Stan's message of mié dic 15 21:07:57 -0300 2010:
Now I just need to get my head wrapped around the corner cases of MVCC
+ indexes. From a layman's perspective, it seems the index should only
operate on the latest version. Perhaps there's a pgsql configuration
option or hint on the index that I'm missing... The hunt continues!
All tuples need to have index pointers, even the dead ones.
To be precise, tuples that are truly dead, i.e. not visible to anyone
anymore, don't need index pointers. They usually have index pointers
anyway, because they were needed to find the tuples before they became
dead. But if you abort a vacuum after it has scanned the indexes, you
end up with dead heap tuples with no index pointers. Also, there's an
optimization in b-tree that removes index pointers to dead tuples, when
they are encountered during index scans.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
On 16.12.2010 16:50, Alvaro Herrera wrote:
All tuples need to have index pointers, even the dead ones.
To be precise, tuples that are truly dead, ie. not visible to anyone
anymore, don't need index pointers. They usually have index pointers
anyway, because they were needed to find the tuples before they became
dead. But if you abort a vacuum after it has scanned the indexes, you
end up with dead heap tuples with no index pointers. Also, there's an
optimization in b-tree that removes index pointers to dead tuples, when
they are encountered during index scans.
Another way you can get to that state is failure while inserting the
index entries in the first place, eg one of the indexes reports a
uniqueness failure, whereupon we abort the inserting transaction.
Now the heap tuple is present but dead, and some of the table's indexes
may have entries for it while others don't.
The correct invariant is that (a) if an index entry exists, there must
be a heap tuple for it to point at, and (b) a live tuple must have all
the index entries it should have.
regards, tom lane