Concurrency problem building indexes

Started by Wesalmost 20 years ago39 messageshackersgeneral
Jump to latest
#1Wes
wespvp@syntegra.com
hackersgeneral

Postgres: 8.1.3
OS: Mac OS X 10.4.6

I've run into another concurrency issue - parallel building of indexes.
When I try to build multiple indexes at a time, I randomly get:

ERROR: tuple concurrently updated

The following thread talks about this, but there is no answer.

<http://archives.postgresql.org/pgsql-hackers/2002-07/msg00969.php&gt;

How can I safely build indexes in parallel?

At this point, I'm only trying to build two at a time. I will be building
indexes for tables with any where from a few rows to 100 million rows on a
daily basis - I need to maximize performance.

Wes

#2Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Wes (#1)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

moving to -hackers

On Sun, Apr 23, 2006 at 09:06:59AM -0500, Wes wrote:

Postgres: 8.1.3
OS: Mac OS X 10.4.6

I've run into another concurrency issue - parallel building of indexes.
When I try to build multiple indexes at a time, I randomly get:

ERROR: tuple concurrently updated

The following thread talks about this, but there is no answer.

<http://archives.postgresql.org/pgsql-hackers/2002-07/msg00969.php&gt;

How can I safely build indexes in parallel?

At this point, I'm only trying to build two at a time. I will be building
indexes for tables with any where from a few rows to 100 million rows on a
daily basis - I need to maximize performance.

Since this seems to only be an issue due to trying to update pg_class
for the table, perhaps CREATE INDEX can just ignore errors there?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#2)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

"Jim C. Nasby" <jnasby@pervasive.com> writes:

Since this seems to only be an issue due to trying to update pg_class
for the table, perhaps CREATE INDEX can just ignore errors there?

Lessee, where would ignoring an error potentially cause the greatest
damage? I can hardly think of a less critical catalog than pg_class :-(

regards, tom lane

#4Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#3)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

On Mon, Apr 24, 2006 at 08:14:33PM -0400, Tom Lane wrote:

"Jim C. Nasby" <jnasby@pervasive.com> writes:

Since this seems to only be an issue due to trying to update pg_class
for the table, perhaps CREATE INDEX can just ignore errors there?

Lessee, where would ignoring an error potentially cause the greatest
damage? I can hardly think of a less critical catalog than pg_class :-(

Sorry, should have been more specific... as I understand it, the update
is just to set pg_class.relpages for the heap, which shouldn't be
critical.

Was the code ever changed so that it won't update relpages if the number
is the same?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#4)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

"Jim C. Nasby" <jnasby@pervasive.com> writes:

Was the code ever changed so that it won't update relpages if the number
is the same?

Long ago. I just tested it again, and AFAICS you can create indexes
concurrently so long as the underlying table isn't changing (ie,
neither reltuples nor relpages changes).

regards, tom lane

#6Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#5)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

On Mon, Apr 24, 2006 at 08:42:41PM -0400, Tom Lane wrote:

"Jim C. Nasby" <jnasby@pervasive.com> writes:

Was the code ever changed so that it won't update relpages if the number
is the same?

Long ago. I just tested it again, and AFAICS you can create indexes
concurrently so long as the underlying table isn't changing (ie,
neither reltuples nor relpages changes).

Hrm, the OP seemed to find a case that was having problems:
http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php

Of course it's possible that he's getting that error from an entirely
different section of code, or that this is now only an issue if you're
doing a lot of indexing at once...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#7Wes
wespvp@syntegra.com
In reply to: Jim Nasby (#6)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

On 4/24/06 7:54 PM, "Jim C. Nasby" <jnasby@pervasive.com> wrote:

Long ago. I just tested it again, and AFAICS you can create indexes
concurrently so long as the underlying table isn't changing (ie,
neither reltuples nor relpages changes).

Hrm, the OP seemed to find a case that was having problems:
http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php

Of course it's possible that he's getting that error from an entirely
different section of code, or that this is now only an issue if you're
doing a lot of indexing at once...

I don't think there's much chance it's other code. The index build is a
standalone operation in an external script that uses psql (so it is easy to
tailor). This script is called as the last statement before the main
program exit. It does:

fork
create index1
create index2
exit

fork
create index3
create index4
exit

wait for termination
exit

As my test tables are small and indexing happens almost instantly, I put a
'sleep' after one of the forks so that the two don't complete at the same
time. When I do that, I don't get the errors.

Wes

#8Wes
wespvp@syntegra.com
In reply to: Jim Nasby (#6)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

Long ago. I just tested it again, and AFAICS you can create indexes
concurrently so long as the underlying table isn't changing (ie,
neither reltuples nor relpages changes).

Hrm, the OP seemed to find a case that was having problems:
http://archives.postgresql.org/pgsql-general/2006-04/msg01009.php

Of course it's possible that he's getting that error from an entirely
different section of code, or that this is now only an issue if you're
doing a lot of indexing at once...

I just verified using 'ps' that there are no other open connections when the
index builds are running. I tried somewhat bigger test tables (a few rows
to a little over a hundred thousand). I can duplicate the error at will.

Wes

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wes (#8)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

Wes <wespvp@syntegra.com> writes:

I can duplicate the error at will.

[ shrug... ] Worksforme.

There is a short interval at the end of the first CREATE INDEX on the
table where the problem would happen if another CREATE INDEX tries to
modify the pg_class row before the first one's committed. That would be
hard to hit with any regularity though. Subsequent CREATE INDEXes after
that should be completely reliable. If you are modifying the table
while it's being indexed, then the same small window would apply for
each CREATE INDEX not just the first ... but you didn't admit to that.

Care to put together a self-contained test case?

regards, tom lane

#10Wes
wespvp@syntegra.com
In reply to: Tom Lane (#9)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

On 4/24/06 11:02 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

[ shrug... ] Worksforme.

There is a short interval at the end of the first CREATE INDEX on the
table where the problem would happen if another CREATE INDEX tries to
modify the pg_class row before the first one's committed. That would be
hard to hit with any regularity though. Subsequent CREATE INDEXes after
that should be completely reliable. If you are modifying the table
while it's being indexed, then the same small window would apply for
each CREATE INDEX not just the first ... but you didn't admit to that.

Care to put together a self-contained test case?

I think I've got a reasonably small test case I can send you in the morning.
I did a pg_dumpall and removed the index creation commands. The first time
I run the index build, I usually get at least one occurrence.

Where do you want me to send it to?

Of course, since your hardware is different, it may not show up since it
appears to be a timing thing.. I'm on a PB G4 1Ghz.

Wes

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wes (#10)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

Wes <wespvp@syntegra.com> writes:

Where do you want me to send it to?

pgsql-bugs would be appropriate.

Of course, since your hardware is different, it may not show up since it
appears to be a timing thing.. I'm on a PB G4 1Ghz.

My G4 is in the shop at the moment, but Apple promised it back by Friday.
I kinda doubt it's *that* platform specific though.

regards, tom lane

#12Wes
wespvp@syntegra.com
In reply to: Tom Lane (#11)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

On 4/25/06 1:01 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Where do you want me to send it to?

pgsql-bugs would be appropriate.

It's not that small that I would want to post it to a list. Also, I think
I've sanitized the data, but I wouldn't want to post it on a public list.

Or are you just looking for the index build script?

Of course, since your hardware is different, it may not show up since it
appears to be a timing thing.. I'm on a PB G4 1Ghz.

My G4 is in the shop at the moment, but Apple promised it back by Friday.
I kinda doubt it's *that* platform specific though.

You never know... CPU speed, hard drive speed (slow), etc.

Wes

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wes (#12)
hackersgeneral
Re: [GENERAL] Concurrency problem building indexes

Wes <wespvp@syntegra.com> writes:

On 4/25/06 1:01 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Where do you want me to send it to?

pgsql-bugs would be appropriate.

It's not that small that I would want to post it to a list. Also, I think
I've sanitized the data, but I wouldn't want to post it on a public list.

So invent some made-up data. I'd be seriously surprised if this
behavior has anything to do with the precise data being indexed.
Experiment around till you've got something you don't mind posting
that exhibits the behavior you see.

regards, tom lane

#14Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#13)
hackers
Re: [GENERAL] Concurrency problem building indexes

[ shrug... ] Worksforme.

There is a short interval at the end of the first CREATE INDEX on

the

table where the problem would happen if another CREATE INDEX tries

to

modify the pg_class row before the first one's committed.

I did a pg_dumpall and removed the index creation commands. The first

time

I run the index build, I usually get at least one occurrence.

I think that narrows it down nicely. You create the table, load rows,
then without
analyze create the indexes, thus pg_class is not up to date, and the
update
needs to be done.

My answer to this would be to (have an option to) ommit this relpages
and reltuples update. It is imho not the task of create index to update
statistics
in the first place. I have been burnt by that behavior when creating
indexes on empty
tables in Informix and never liked it (iirc pg has a workaround for
empty tables though).

Wes, you could most likely solve your immediate problem if you did an
analyze before
creating the indexes.

Andreas

#15Wes
wespvp@syntegra.com
In reply to: Zeugswetter Andreas SB SD (#14)
hackers
Re: [GENERAL] Concurrency problem building indexes

On 4/25/06 2:18 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

So invent some made-up data. I'd be seriously surprised if this
behavior has anything to do with the precise data being indexed.
Experiment around till you've got something you don't mind posting
that exhibits the behavior you see.

My initial attempts last night at duplicating it with a small result set
were not successful. I'll see what I can do.

On 4/25/06 3:25 AM, "Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at>
wrote:

Wes, you could most likely solve your immediate problem if you did an
analyze before
creating the indexes.

I can try that. Is that going to be a reasonable thing to do when there's
100 million rows per table? I obviously want to minimize the number of
sequential passes through the database.

Wes

#16Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Wes (#15)
hackers
Re: [GENERAL] Concurrency problem building indexes

Wes, you could most likely solve your immediate problem if you did

an

analyze before creating the indexes.

I can try that. Is that going to be a reasonable thing to do when

there's

100 million rows per table? I obviously want to minimize the number

of

sequential passes through the database.

No, I think it would only help if it gets the exact tuple count.
For large tables it only gets an exact count with a full scan
(use vacuum instead of analyze).

Then again, when the table is large, the different "create index"es
should finish at sufficiently different times, so an analyze might
be sufficient to fix the problem for small tables.

(analyze is fast for large tables since it only does a sample)

Andreas

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB SD (#14)
hackers
Re: [GENERAL] Concurrency problem building indexes

"Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes:

My answer to this would be to (have an option to) ommit this relpages
and reltuples update.

Wouldn't help, unless you want to get rid of relhasindex too.

Wes, you could most likely solve your immediate problem if you did an
analyze before creating the indexes.

No, because that would install an approximate tuple count that the first
CREATE INDEX would (most likely) still want to replace.

The best I can gather from Wes' mail is that he's somehow getting a
higher-than-chance probability that the first two CREATE INDEX commands
finish at almost exactly the same time, and thus neither one of them
sees the pg_class row as already updated with the correct values.
I can't think of any effect in the PG code that would make CREATE INDEX
commands tend to converge rather than diverge, so maybe there is some
external effect here. Wes, is your machine a dual processor by any
chance? Which OS X version exactly?

regards, tom lane

#18Wes
wespvp@syntegra.com
In reply to: Tom Lane (#17)
hackers
Re: [GENERAL] Concurrency problem building indexes

On 4/25/06 9:58 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

I can't think of any effect in the PG code that would make CREATE INDEX
commands tend to converge rather than diverge, so maybe there is some
external effect here. Wes, is your machine a dual processor by any
chance? Which OS X version exactly?

No, I'm running on a single processor powerbook. I currently am running OS
X 10.4.5 on this system, 1GB memory.

A while back, I saw a posting (I think from Tom) to the effect of index
creation converging due to disk caching. It was along the lines of the
slower index would be reading from disk data cached by the first index
creation's read. When the faster creation went out to read from disk, the
one reading from memory could catch up. Possible?

Below is the (mostly psql) output from the creation script. Timing is
enabled. It's a bit difficult to follow because of the multiprocessing.
The index build script is perl, using popen to talk to psql. $|=1 is set on
the command pipe to flush each command to the process. I can provide that
if you want. If needed I could rewrite it using DBI to get easier to read
logging.

This is 6 tables, and I get three errors. The header table has 3 indexes,
and the detail table has 4.

Wes

In buildIndex type=header date=20050904 which=1
939: header date <<<<<----- The number is the PID of the child process
939: header msgid
\timing
Timing is on.
create index header_i_date_20050904
on header_20050904 (mdate) TABLESPACE indexes;
header 1 forked 939
In buildIndex type=header date=20050904 which=2
942: header originator
\timing
Timing is on.
create index header_i_originator_20050904
on header_20050904 (originator) TABLESPACE indexes;
header 2 forked 942
Waiting for index builds to complete
CREATE INDEX
Time: 79.463 ms
\q
ERROR: tuple concurrently updated
create index header_i_msgid_20050904
on header_20050904 (messageid) TABLESPACE indexes;
942 completed.
CREATE INDEX
Time: 7.125 ms
\q
939 completed.
header 1 forked 946
header 2 forked 947
Waiting for index builds to complete
In buildIndex type=header date=20050905 which=1
In buildIndex type=header date=20050905 which=2
946: header date
946: header msgid
947: header originator
\timing
Timing is on.
create index header_i_originator_20050905
on header_20050905 (originator) TABLESPACE indexes;
\timing
Timing is on.
create index header_i_date_20050905
on header_20050905 (mdate) TABLESPACE indexes;
CREATE INDEX
Time: 444.957 ms
create index header_i_msgid_20050905
on header_20050905 (messageid) TABLESPACE indexes;
CREATE INDEX
Time: 569.063 ms
\q
947 completed.
CREATE INDEX
Time: 293.467 ms
\q
946 completed.
header 1 forked 953
In buildIndex type=header date=20050906 which=1
In buildIndex type=header date=20050906 which=2
header 2 forked 954
Waiting for index builds to complete
953: header date
953: header msgid
954: header originator
\timing
Timing is on.
create index header_i_originator_20050906
on header_20050906 (originator) TABLESPACE indexes;
\timing
Timing is on.
create index header_i_date_20050906
on header_20050906 (mdate) TABLESPACE indexes;
CREATE INDEX
Time: 1048.094 ms
create index header_i_msgid_20050906
on header_20050906 (messageid) TABLESPACE indexes;
ERROR: tuple concurrently updated
\q
954 completed.
CREATE INDEX
Time: 566.794 ms
\q
953 completed.
In buildIndex type=detail date=20050904 which=1
960: detail originator
960: detail date
\timing
Timing is on.
create index detail_i_originator_20050904
on detail_20050904 (originator) TABLESPACE indexes;
detail 1 forked 960
In buildIndex type=detail date=20050904 which=2
963: detail recipient
963: detail msgid
\timing
Timing is on.
create index detail_i_recipient_20050904
on detail_20050904 (recipient) TABLESPACE indexes;
CREATE INDEX
Time: 11.679 ms
create index detail_i_msgid_20050904
on detail_20050904 (messageid) TABLESPACE indexes;
CREATE INDEX
Time: 4.952 ms
\q
detail 2 forked 963
Waiting for index builds to complete
963 completed.
CREATE INDEX
Time: 123.106 ms
create index detail_i_date_20050904
on detail_20050904 (mdate) TABLESPACE indexes;
CREATE INDEX
Time: 5.724 ms
\q
960 completed.
In buildIndex type=detail date=20050905 which=1
967: detail originator
967: detail date
\timing
Timing is on.
create index detail_i_originator_20050905
on detail_20050905 (originator) TABLESPACE indexes;
detail 1 forked 967
In buildIndex type=detail date=20050905 which=2
971: detail recipient
971: detail msgid
\timing
Timing is on.
create index detail_i_recipient_20050905
on detail_20050905 (recipient) TABLESPACE indexes;
detail 2 forked 971
Waiting for index builds to complete
CREATE INDEX
Time: 1223.597 ms
create index detail_i_date_20050905
on detail_20050905 (mdate) TABLESPACE indexes;
ERROR: tuple concurrently updated
create index detail_i_msgid_20050905
on detail_20050905 (messageid) TABLESPACE indexes;
CREATE INDEX
Time: 629.056 ms
\q
967 completed.
CREATE INDEX
Time: 568.197 ms
\q
971 completed.
In buildIndex type=detail date=20050906 which=1
975: detail originator
975: detail date
\timing
Timing is on.
create index detail_i_originator_20050906
on detail_20050906 (originator) TABLESPACE indexes;
detail 1 forked 975
In buildIndex type=detail date=20050906 which=2
978: detail recipient
978: detail msgid
detail 2 forked 978
Waiting for index builds to complete
\timing
Timing is on.
create index detail_i_recipient_20050906
on detail_20050906 (recipient) TABLESPACE indexes;
CREATE INDEX
Time: 2680.001 ms
create index detail_i_date_20050906
on detail_20050906 (mdate) TABLESPACE indexes;
CREATE INDEX
Time: 2675.191 ms
create index detail_i_msgid_20050906
on detail_20050906 (messageid) TABLESPACE indexes;
CREATE INDEX
Time: 2022.232 ms
\q
CREATE INDEX
Time: 1910.771 ms
\q
978 completed.
975 completed.

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wes (#18)
hackers
Re: [GENERAL] Concurrency problem building indexes

Wes <wespvp@syntegra.com> writes:

A while back, I saw a posting (I think from Tom) to the effect of index
creation converging due to disk caching. It was along the lines of the
slower index would be reading from disk data cached by the first index
creation's read. When the faster creation went out to read from disk, the
one reading from memory could catch up. Possible?

There would be some convergence effect while reading the table contents,
but the subsequent sorting and index-writing would be competitive and
ought to diverge again.

regards, tom lane

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#19)
hackers
Re: [GENERAL] Concurrency problem building indexes

I'm late to this thread, but maybe we can make the process of storing
the new data in pg_class take a lock using LockObject() or something
like that to serialize the access to the pg_class row. The idea would
be that this lock doesn't conflict with a LockRelation(), but it would
of course conflict with itself so no two CREATE INDEXES can enter that
code section concurrently.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#20)
hackers
#22Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Alvaro Herrera (#20)
hackers
#23Martijn van Oosterhout
kleptog@svana.org
In reply to: Jim Nasby (#22)
hackers
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#23)
hackers
#25Joshua D. Drake
jd@commandprompt.com
In reply to: Martijn van Oosterhout (#23)
hackers
#26Martijn van Oosterhout
kleptog@svana.org
In reply to: Joshua D. Drake (#25)
hackers
#27Joshua D. Drake
jd@commandprompt.com
In reply to: Martijn van Oosterhout (#26)
hackers
#28Hannu Krosing
hannu@tm.ee
In reply to: Tom Lane (#24)
hackers
#29Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Hannu Krosing (#28)
hackers
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#29)
hackers
#31Wes
wespvp@syntegra.com
In reply to: Tom Lane (#30)
hackers
#32Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Wes (#31)
hackers
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#32)
hackers
#34Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#33)
hackers
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#34)
hackers
#36Wes
wespvp@syntegra.com
In reply to: Jim Nasby (#32)
hackers
#37Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#35)
hackers
#38Wes
wespvp@syntegra.com
In reply to: Tom Lane (#33)
hackers
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Wes (#18)
hackers