Using an SMP machine to make multiple indices on the same table

Started by Martin Weinbergabout 24 years ago12 messages

weinberg@osprey.astro.umass.edu

about 24 years ago

Folks,

We have some big tables (1.2 billion records) and indexing is quite
time consuming. Since we have this running on dual Athlon box, it
would be great to make indices in parallel.

On Postgresql 7.1.3, it seems that the table is locked after the
first "create index" is started up. Is this right? Is there any
way to do this in parallel?

--Martin

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Martin Weinberg (#1)

Re: Using an SMP machine to make multiple indices on the same table

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

On Postgresql 7.1.3, it seems that the table is locked after the
first "create index" is started up. Is this right?

AFAIK it's a share lock, which only prohibits modifications to the
table, not reads (nor concurrent index builds). Not sure how you
expect the system to do better than that.

regards, tom lane

Martijn van Oosterhout

kleptog@svana.org

about 24 years ago

In reply to: Martin Weinberg (#1)

Re: Using an SMP machine to make multiple indices on the same table

On Mon, Oct 22, 2001 at 05:38:23PM -0400, Martin Weinberg wrote:

Folks,

We have some big tables (1.2 billion records) and indexing is quite
time consuming. Since we have this running on dual Athlon box, it
would be great to make indices in parallel.

On Postgresql 7.1.3, it seems that the table is locked after the
first "create index" is started up. Is this right? Is there any
way to do this in parallel?

My question in, would it help. The creation of the index should only be
limited by the bandwidth of the drives. I would think that creating two
indexes at the same time would simply trash the disk a lot and end up being
slower.

The answer to your questions however, are yes and no respectivly.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/

Show quoted text

Magnetism, electricity and motion are like a three-for-two special offer:
if you have two of them, the third one comes free.

Martin Weinberg

weinberg@osprey.astro.umass.edu

about 24 years ago

In reply to: Tom Lane (#2)

Re: Using an SMP machine to make multiple indices on the

Tom,

Yes, I understand locking the table, but empirically, two index
creations will not run simultaneously on the same table. So if
I start (and background) two

psql -c "create index one on mytable . . ." database
psql -c "create index two on mytable . . ." database

commands. The first one starts and the second one waits until the
first is finished (as tracked by "ps avx" or "top").

--Martin

Tom Lane wrote on Mon, 22 Oct 2001 23:09:26 EDT

Show quoted text

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

On Postgresql 7.1.3, it seems that the table is locked after the
first "create index" is started up. Is this right?

AFAIK it's a share lock, which only prohibits modifications to the
table, not reads (nor concurrent index builds). Not sure how you
expect the system to do better than that.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Martin Weinberg

weinberg@osprey.astro.umass.edu

about 24 years ago

In reply to: Tom Lane (#2)

Re: Using an SMP machine to make multiple indices on the

Tom,

I should have forwarded you the ps output; here are the relevant lines:

*******************************************************************************
294 ttyp0 S 0:00 203 108 1991 836 0.0 psql -e -c create
index v3_pscat_k_m_idx on v3_pscat(k_m) wsdb
295 ? R 0:27 2170 1425 17122 13252 1.4 postgres: postgres
wsdb [local] CREATE
296 ttyp0 S 0:00 203 108 1991 836 0.0 psql -e -c create
index v3_pscat_h_m_idx on v3_pscat(h_m) wsdb
297 ? S 0:00 190 1425 11858 2436 0.2 postgres: postgres
wsdb [local] CREATE waiting
300 ttyp0 R 0:00 273 55 3016 1384 0.1 ps avx
*******************************************************************************

Note the "CREATE waiting" process . . .

--Martin

Tom Lane wrote on Mon, 22 Oct 2001 23:09:26 EDT

Show quoted text

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

On Postgresql 7.1.3, it seems that the table is locked after the
first "create index" is started up. Is this right?

AFAIK it's a share lock, which only prohibits modifications to the
table, not reads (nor concurrent index builds). Not sure how you
expect the system to do better than that.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Martin Weinberg (#4)

Re: [GENERAL] Using an SMP machine to make multiple indices on the same table

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

Yes, I understand locking the table, but empirically, two index
creations will not run simultaneously on the same table.

Hmm, on trying it you are right. The second index creation blocks here:

#6 0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
#7 0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,
buffer=0x7b03b828) at heapam.c:1686
#8 0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,
buffer=0x7b03b828, confirmCommitted=0 '\000') at index.c:1131
#9 0xcb534 in IndexesAreActive (relid=387785, confirmCommitted=1 '\001')
at index.c:1176
#10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",
indexRelationName=0x400aab00 "anotherj", accessMethodName=0x59f48 "btree",
attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,
rangetable=0x0) at indexcmds.c:133
#11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)
at utility.c:905

Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
of the relation before it starts building the index.

I have opined before that LockClassinfoForUpdate is a mistake that
shouldn't exist at all, since acquiring the proper lock on the relation
ought to be sufficient. I see no need for locking the pg_class tuple,
and certainly none for doing so at the beginning of the operation rather
than the end.

Hiroshi, I think you defended it last time; any comments?

regards, tom lane

Hiroshi Inoue

Inoue@tpf.co.jp

about 24 years ago

In reply to: Martin Weinberg (#1)

Re: [GENERAL] Using an SMP machine to make multiple indices on the same

Tom Lane wrote:

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

Yes, I understand locking the table, but empirically, two index
creations will not run simultaneously on the same table.

Hmm, on trying it you are right. The second index creation blocks here:

#6 0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
#7 0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,
buffer=0x7b03b828) at heapam.c:1686
#8 0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,
buffer=0x7b03b828, confirmCommitted=0 '\000') at index.c:1131
#9 0xcb534 in IndexesAreActive (relid=387785, confirmCommitted=1 '\001')
at index.c:1176
#10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",
indexRelationName=0x400aab00 "anotherj", accessMethodName=0x59f48 "btree",
attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,
rangetable=0x0) at indexcmds.c:133
#11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)
at utility.c:905

Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
of the relation before it starts building the index.

I have opined before that LockClassinfoForUpdate is a mistake that
shouldn't exist at all, since acquiring the proper lock on the relation
ought to be sufficient.

As I've already mentioned many times I never agree with you.

I see no need for locking the pg_class tuple,
and certainly none for doing so at the beginning of the operation rather
than the end.

Hiroshi, I think you defended it last time; any comments?

Hmm the excluive row level lock by FOR UPDATE is too strong
in this case. OK I would change IndexesAreActive() to not
acquire a lock on the pg_class tuple for user tables because
reindex doesn't need to handle relhasindex for user tables
since 7.1.

regards,
Hiroshi Inoue

Hiroshi Inoue

Inoue@tpf.co.jp

about 24 years ago

In reply to: Hiroshi Inoue (#7)

Re: [GENERAL] Using an SMP machine to make multiple indices on the same

-----Original Message-----
From: Hiroshi Inoue

Tom Lane wrote:

Martin Weinberg <weinberg@osprey.astro.umass.edu> writes:

Yes, I understand locking the table, but empirically, two index
creations will not run simultaneously on the same table.

Hmm, on trying it you are right. The second index creation blocks here:

#6 0x1718e0 in XactLockTableWait (xid=17334) at lmgr.c:344
#7 0x9e530 in heap_mark4update (relation=0xc1be62f8, tuple=0x7b03b7f0,
buffer=0x7b03b828) at heapam.c:1686
#8 0xcb410 in LockClassinfoForUpdate (relid=387785, rtup=0x7b03b7f0,
buffer=0x7b03b828, confirmCommitted=0 '\000') at index.c:1131
#9 0xcb534 in IndexesAreActive (relid=387785,

confirmCommitted=1 '\001')

at index.c:1176
#10 0xf0f04 in DefineIndex (heapRelationName=0x400aab20 "tenk1",
indexRelationName=0x400aab00 "anotherj",

accessMethodName=0x59f48 "btree",

attributeList=0x400aab80, unique=0, primary=0, predicate=0x0,
rangetable=0x0) at indexcmds.c:133
#11 0x17e118 in ProcessUtility (parsetree=0x400aaba0, dest=Remote)
at utility.c:905

Essentially it's trying to do a SELECT FOR UPDATE on the pg_class tuple
of the relation before it starts building the index.

I have opined before that LockClassinfoForUpdate is a mistake that
shouldn't exist at all, since acquiring the proper lock on the relation
ought to be sufficient.

As I've already mentioned many times I never agree with you.

I see no need for locking the pg_class tuple,
and certainly none for doing so at the beginning of the operation rather
than the end.

Hiroshi, I think you defended it last time; any comments?

Hmm the excluive row level lock by FOR UPDATE is too strong
in this case. OK I would change IndexesAreActive() to not
acquire a lock on the pg_class tuple for user tables because
reindex doesn't need to handle relhasindex for user tables
since 7.1.

In the end, I changed DefineIndex() to not call IndexesAreActive().

regards,
Hiroshi Inoue

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Hiroshi Inoue (#8)

Re: [GENERAL] Using an SMP machine to make multiple indices on the same

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

In the end, I changed DefineIndex() to not call IndexesAreActive().

I saw that. But is it a good solution? If someone has deactivated
indexes on a user table (ie turned off relhasindex), then creating a
new index would activate them again, which would probably be bad.

I have realized that this code is wrong anyway, because it doesn't
acquire ShareLock on the relation until far too late; all the setup
processing is done with no lock at all :-(. LockClassinfoForUpdate
provided a little bit of security against concurrent schema changes,
though not enough.

Also, I'm now a little worried about whether concurrent index creations
will actually work. Both CREATE INDEX operations will try to update
the pg_class tuple to set relhasindex true. Since they use
simple_heap_update for that, the second one is likely to fail
because simple_heap_update doesn't handle concurrent updates.

I think what we probably want is

1. Acquire ShareLock at the very start.

2. Check for indexes present but relhasindex = false,
if so complain.

3. Build the index.

4. Update pg_class tuple, being prepared for concurrent
updates (ie, do NOT use simple_heap_update here).

I still don't see any value in LockClassinfoForUpdate, however.

regards, tom lane

#10

Hiroshi Inoue

Inoue@tpf.co.jp

about 24 years ago

In reply to: Hiroshi Inoue (#8)

Re: [GENERAL] Using an SMP machine to make multiple indices on

Tom Lane wrote:

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:

In the end, I changed DefineIndex() to not call IndexesAreActive().

I saw that. But is it a good solution? If someone has deactivated
indexes on a user table (ie turned off relhasindex), then creating a
new index would activate them again, which would probably be bad.

I apolgize my neglect of reconsidering the activte/deactivate
stuff for indexes. Probably it is no longer needed now(since 7.1).
Reindex under postmaster for user tables has been available
from the first. I didn't write a documentation about it inten-
tionally in 7.0 though it was my neglect also in 7.1 sorry.
In 7.0 REINDEX set relhasindex to false first to tell all
backends that the indexes are unavailable because we wasn't
able to recreate indexes safely in case of abort. Note
that relhasindex was set immediately(out of transactional
control) in 7.0 and acruiring a lock for the pg_class tuple
was very critical.
Since 7.1 we are able to recreate indexes safely under
postmaster and REINDEX doesn't set relhasindex to false
for user tables. Though REINDEX deactivates the indexes of
system tables the deactivation is done under transactional
control and other backends never see the deactivated
relhasindex.

I have realized that this code is wrong anyway, because it doesn't
acquire ShareLock on the relation until far too late; all the setup
processing is done with no lock at all :-(. LockClassinfoForUpdate
provided a little bit of security against concurrent schema changes,
though not enough.

Also, I'm now a little worried about whether concurrent index creations
will actually work. Both CREATE INDEX operations will try to update
the pg_class tuple to set relhasindex true.

Yes but there's a big difference. It's at the end of the creation
not at the beginning. Also note that UpdateStats() updates pg_class
tuple in case of B-trees etc before updating relhasindex. I'm
suspicios if we should update Stats under the transactional control.

Since they use

simple_heap_update for that, the second one is likely to fail
because simple_heap_update doesn't handle concurrent updates.

I think what we probably want is

1. Acquire ShareLock at the very start.

2. Check for indexes present but relhasindex = false,
if so complain.

3. Build the index.

4. Update pg_class tuple, being prepared for concurrent
updates (ie, do NOT use simple_heap_update here).

I still don't see any value in LockClassinfoForUpdate, however.

ISTM to rely on completely the lock for the corresponding
relation is a little misplaced. For example ALTER TABLE OWNER
doesn't acquire any lock on the table but it seems natural to me.
UPDATE pg_class set .. doesn't acquire any lock on the correspoding
relations of the target pg_class tuples but it seems natural to me,

regards,
Hiroshi Inoue

#11

Tom Lane

tgl@sss.pgh.pa.us

about 24 years ago

In reply to: Hiroshi Inoue (#10)

Re: [GENERAL] Using an SMP machine to make multiple indices on the same

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Tom Lane wrote:

Also, I'm now a little worried about whether concurrent index creations
will actually work. Both CREATE INDEX operations will try to update
the pg_class tuple to set relhasindex true.

Yes but there's a big difference. It's at the end of the creation
not at the beginning. Also note that UpdateStats() updates pg_class
tuple in case of B-trees etc before updating relhasindex. I'm
suspicios if we should update Stats under the transactional control.

It would probably be good to fix things so that there's only one update
done for both stats and relhasindex, instead of two. But we *will* get
failures in simple_heap_update if we continue to use that routine.
The window for failure may be relatively short but it's real. It's not
necessarily short, either; consider multiple CREATE INDEX commands
executed in a transaction block.

I still don't see any value in LockClassinfoForUpdate, however.

ISTM to rely on completely the lock for the corresponding
relation is a little misplaced.

Surely we *must* be able to rely on the relation lock. For example:
how does SELECT FOR UPDATE of the relation's pg_class tuple prevent
writers from adding tuples to the relation? It does not and cannot.
Only getting the appropriate relation lock provides a semantically
correct guarantee that the relation isn't changing underneath us.
Locking the pg_class tuple only locks the tuple itself, it has no wider
scope of meaning.

For example ALTER TABLE OWNER
doesn't acquire any lock on the table but it seems natural to me.

Seems like a bug to me. Consider this scenario:

Backend 1 Backend 2

begin;

lock table1;

select from table1; -- works

alter table1 set owner ...

select from table1; -- fails, no permissions

That should not happen. It wouldn't happen if ALTER TABLE OWNER
were acquiring an appropriate lock on the relation.

UPDATE pg_class set .. doesn't acquire any lock on the correspoding
relations of the target pg_class tuples but it seems natural to me,

While we allow knowledgeable users to poke at the system catalogs
directly, I feel that that is very much a "let the user beware"
facility. I have no urge to try to guarantee cross-backend
transactional safety for changes executed that way. But CREATE INDEX,
ALTER TABLE, and so forth should have safe concurrent behavior.

regards, tom lane

#12

Hiroshi Inoue

Inoue@tpf.co.jp

about 24 years ago

In reply to: Hiroshi Inoue (#8)

Re: [GENERAL] Using an SMP machine to make multiple indices on

Tom Lane wrote:

Hiroshi Inoue <Inoue@tpf.co.jp> writes:

Tom Lane wrote:

Also, I'm now a little worried about whether concurrent index creations
will actually work. Both CREATE INDEX operations will try to update
the pg_class tuple to set relhasindex true.

Yes but there's a big difference. It's at the end of the creation
not at the beginning. Also note that UpdateStats() updates pg_class
tuple in case of B-trees etc before updating relhasindex. I'm
suspicios if we should update Stats under the transactional control.

It would probably be good to fix things so that there's only one update
done for both stats and relhasindex, instead of two.

I don't fully agree with you at this point. It's pretty painful
to update relatively irrevalent items at a time in some cases.
UpdateStats() had updated both reltuples and relhasindex before 7.0.
It's me who changed UpdateStats() to not update relhasindex when
I implemented REINDEX command. Reindex has to set relhasindex to
true after all the indexes of a table were recreated.

But we *will* get
failures in simple_heap_update if we continue to use that routine.
The window for failure may be relatively short but it's real. It's not
necessarily short, either; consider multiple CREATE INDEX commands
executed in a transaction block.

I still don't see any value in LockClassinfoForUpdate, however.

ISTM to rely on completely the lock for the corresponding
relation is a little misplaced.

Surely we *must* be able to rely on the relation lock. For example:
how does SELECT FOR UPDATE of the relation's pg_class tuple prevent
writers from adding tuples to the relation? It does not and cannot.

I've never said that the relation lock is unnecessary.
The stuff around relhasindex is(was) an exception that keeps
a (possibly) long term lock for the pg_class tuple apart from
the relevant relation lock.
What I've mainly intended is to guard our(at least my) code.
If our(my) code acquires an AccessExclusiveLock on a relation
and would update the correspoing pg_class tuple, I'd like to
get the locked tuple not the unlocked one because I couldn't
change unlocked tuples without anxiety. That's almost all.
In most cases the AccessExclusiveLock on the relation would
already block other backends which must be blocked as you
say and so the lock on the pg_class tuple would cause few
additional lock conflicts. Where are disadvantages to get
locked pg_class tuples ?

Only getting the appropriate relation lock provides a semantically
correct guarantee that the relation isn't changing underneath us.
Locking the pg_class tuple only locks the tuple itself, it has no wider
scope of meaning.

For example ALTER TABLE OWNER
doesn't acquire any lock on the table but it seems natural to me.

Seems like a bug to me. Consider this scenario:

Backend 1 Backend 2

begin;

lock table1;

select from table1; -- works

alter table1 set owner ...

select from table1; -- fails, no permissions

That should not happen. It wouldn't happen if ALTER TABLE OWNER
were acquiring an appropriate lock on the relation.

Hmm ok agreed. One of my intentions is to guard our(my) code
from such careless(?) applications.

UPDATE pg_class set .. doesn't acquire any lock on the correspoding
relations of the target pg_class tuples but it seems natural to me,

While we allow knowledgeable users to poke at the system catalogs
directly, I feel that that is very much a "let the user beware"
facility.

Me too. Again what I intend is to guard our(my) code from
such knowledgeable users not guarantee them an expected(?)
result.

regards,
Hiroshi Inoue