locks in CREATE TRIGGER, ADD FK

neilc@samurai.com

over 21 years ago

In reply to: Neil Conway (#1)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway wrote:

AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and
CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are
adding triggers to (the PK table, in the case of ALTER TABLE). Is this
necessary? I don't see why we can't allow SELECT queries on the table to
proceed while the trigger is being added.

Sorry, I forgot to mention: I think RowExclusiveLock or ExclusiveLock
would be sufficient instead.

-Neil

neilc@samurai.com

over 21 years ago

In reply to: Neil Conway (#1)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway wrote:

AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and
CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are
adding triggers to (the PK table, in the case of ALTER TABLE). Is this
necessary? I don't see why we can't allow SELECT queries on the table to
proceed while the trigger is being added.

Attached is a patch that changes both to use ShareRowExclusiveLock, and
updates the documentation accordingly. I'll apply this later today,
barring any objections.

-Neil

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 21 years ago

In reply to: Neil Conway (#3)

Re: locks in CREATE TRIGGER, ADD FK

If you want to be my friend forever, then fix CLUSTER so that it uses
sharerowexclusive as well :D

Chris

Neil Conway wrote:

Show quoted text

Neil Conway wrote:

AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and
CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are
adding triggers to (the PK table, in the case of ALTER TABLE). Is this
necessary? I don't see why we can't allow SELECT queries on the table
to proceed while the trigger is being added.

Attached is a patch that changes both to use ShareRowExclusiveLock, and
updates the documentation accordingly. I'll apply this later today,
barring any objections.

Russell Smith

mr-russ@pws.com.au

over 21 years ago

In reply to: Christopher Kings-Lynne (#4)

Re: locks in CREATE TRIGGER, ADD FK

On Wed, 23 Mar 2005 12:40 pm, Christopher Kings-Lynne wrote:

If you want to be my friend forever, then fix CLUSTER so that it uses
sharerowexclusive as well :D

I don't think it's as easy as that, because you have to move tuples
around in the cluster operation. Same sort of issue as vacuum full I would suggest.

Russell Smith

Show quoted text

Chris

Neil Conway wrote:

Neil Conway wrote:

AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and
CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are
adding triggers to (the PK table, in the case of ALTER TABLE). Is this
necessary? I don't see why we can't allow SELECT queries on the table
to proceed while the trigger is being added.

Attached is a patch that changes both to use ShareRowExclusiveLock, and
updates the documentation accordingly. I'll apply this later today,
barring any objections.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 21 years ago

In reply to: Russell Smith (#5)

Re: locks in CREATE TRIGGER, ADD FK

If you want to be my friend forever, then fix CLUSTER so that it uses
sharerowexclusive as well :D

I don't think it's as easy as that, because you have to move tuples
around in the cluster operation. Same sort of issue as vacuum full I would suggest.

Cluster doesn't move rows...

I didn't say it was easy. It would involve changing how cluster works.
It would keep the old table around while building the new, then grab
an exclusive lock to swap the two.

Chris

Alvaro Herrera

alvherre@dcc.uchile.cl

over 21 years ago

In reply to: Christopher Kings-Lynne (#6)

Re: locks in CREATE TRIGGER, ADD FK

On Wed, Mar 23, 2005 at 10:42:01AM +0800, Christopher Kings-Lynne wrote:

If you want to be my friend forever, then fix CLUSTER so that it uses
sharerowexclusive as well :D

I don't think it's as easy as that, because you have to move tuples
around in the cluster operation. Same sort of issue as vacuum full I
would suggest.

Cluster doesn't move rows...

I didn't say it was easy. It would involve changing how cluster works.
It would keep the old table around while building the new, then grab
an exclusive lock to swap the two.

Huh, cluster already does that.

I don't remember what the rationale was for locking the table, leaving
even simple SELECTs out. (In fact, IIRC the decision wasn't made by me,
and it wasn't discussed at all.)

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"I would rather have GNU than GNOT." (ccchips, lwn.net/Articles/37595/)

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 21 years ago

In reply to: Alvaro Herrera (#7)

Re: locks in CREATE TRIGGER, ADD FK

Huh, cluster already does that.

It does and it doesn't. Something like the first thing it does is muck
with the old table's filenode IIRC, meaning that immediately the old
table will no longer work.

Chris

neilc@samurai.com

over 21 years ago

In reply to: Christopher Kings-Lynne (#4)

Re: locks in CREATE TRIGGER, ADD FK

Christopher Kings-Lynne wrote:

If you want to be my friend forever, then fix CLUSTER so that it uses
sharerowexclusive as well :D

Hmm, this might be possible as well. During a CLUSTER, we currently

- lock the heap relation with AccessExclusiveLock
- lock the index we're clustering on with AccessExclusiveLock
- create a temporary heap relation
- fill with data from the old heap relation, via an index scan
- swap the relfilenodes of the old and temporary heap relations
- rebuild indexes

We certainly can't allow concurrent modifications to either the table or
the clustered index while this is happening. Allowing index scans
*should* be safe -- an index scan could result in modifications to the
index (e.g. updating "tuple is killed" bits), but those shouldn't be
essential. We might also want to disallow SELECT FOR UPDATE, since we
would end up invoking heap_mark4update() on the old heap relation. Not
sure offhand how serious that would be.

So I think it should be possible to lock both the heap relation and the
index with ExclusiveLock, which would allow SELECTs on them. This would
apply to both the single relation and multiple relation variants of
CLUSTER (since we do each individual clustering in its own transaction).

... except that when we rebuild the relation's indexes, we acquire an
AccessExclusiveLock on the index. This would introduce the risk of
deadlock. It seems necessary to acquire an AccessExclusiveLock when
rebuilding shared indexes, since we do the index build in-place, but I
think we can get by with an ExclusiveLock in the non-shared case, for
similar reasons as above: we build the new index and then swap relfilenodes.

-Neil

#10

bruce@momjian.us

over 21 years ago

In reply to: Neil Conway (#9)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway wrote:

So I think it should be possible to lock both the heap relation and the
index with ExclusiveLock, which would allow SELECTs on them. This would
apply to both the single relation and multiple relation variants of
CLUSTER (since we do each individual clustering in its own transaction).

... except that when we rebuild the relation's indexes, we acquire an
AccessExclusiveLock on the index. This would introduce the risk of
deadlock. It seems necessary to acquire an AccessExclusiveLock when
rebuilding shared indexes, since we do the index build in-place, but I
think we can get by with an ExclusiveLock in the non-shared case, for
similar reasons as above: we build the new index and then swap relfilenodes.

Certainly we need to upgrade to an exclusive table lock to replace the
heap table. Do we want to get a shared lock and possibly starve waiting
for an exclusive lock on the table to swap the new one in? Do we do
such escallation anywhere else?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#11

neilc@samurai.com

over 21 years ago

In reply to: Bruce Momjian (#10)

Re: locks in CREATE TRIGGER, ADD FK

Bruce Momjian wrote:

Certainly we need to upgrade to an exclusive table lock to replace the
heap table.

Well, we will be holding an ExclusiveLock on the heap relation
regardless. We "replace" the heap table by swapping its relfilenode, so
ISTM we needn't hold an AccessExclusiveLock.

Do we want to get a shared lock and possibly starve waiting
for an exclusive lock on the table to swap the new one in?

What I'm saying is that REINDEX on non-shared indexes need only acquire
an ExclusiveLock, and hence not need to escalate its lock.

-Neil

#12

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Neil Conway (#3)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:

AndrewSN pointed out on IRC that ALTER TABLE ... ADD FOREIGN KEY and
CREATE TRIGGER both acquire AccessExclusiveLocks on the table they are
adding triggers to (the PK table, in the case of ALTER TABLE). Is this
necessary? I don't see why we can't allow SELECT queries on the table to
proceed while the trigger is being added.

Attached is a patch that changes both to use ShareRowExclusiveLock, and
updates the documentation accordingly. I'll apply this later today,
barring any objections.

I don't think this has been adequately thought through at all ... but
at least make it ExclusiveLock. What is the use-case for allowing
SELECT FOR UPDATE in parallel with this? One may suppose that someone
doing SELECT FOR UPDATE intends an UPDATE. (No, don't tell me about
foreign keys. Alvaro is going to fix that.)

As Chris suggests nearby, this is really only the tip of the iceberg.
I would prefer to see someone do a survey of all our DDL commands and
put forward a coherent proposal for minimum required locks for all of
them.

regards, tom lane

#13

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Neil Conway (#3)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:

/*
! * Grab an exclusive lock on the pk table, so that someone doesn't
! * delete rows out from under us. (Although a lesser lock would do for
! * that purpose, we'll need exclusive lock anyway to add triggers to
! * the pk table; trying to start with a lesser lock will just create a
! * risk of deadlock.)
*/
! pkrel = heap_openrv(fkconstraint->pktable, AccessExclusiveLock);

/*
* Validity and permissions checks
--- 3829,3839 ----
Oid			constrOid;

/*
! * Grab a lock on the pk table, so that someone doesn't delete
! * rows out from under us; ShareRowExclusive should be good
! * enough.
*/

BTW, the above comment change is seriously inadequate, because it
removes the explanation of *why* that is the minimum required lock.

regards, tom lane

#14

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Christopher Kings-Lynne (#6)

Re: locks in CREATE TRIGGER, ADD FK

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:

It would keep the old table around while building the new, then grab
an exclusive lock to swap the two.

Lock upgrading is right out.

regards, tom lane

#15

neilc@samurai.com

over 21 years ago

In reply to: Neil Conway (#9)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway wrote:

... except that when we rebuild the relation's indexes, we acquire an
AccessExclusiveLock on the index. This would introduce the risk of
deadlock. It seems necessary to acquire an AccessExclusiveLock when
rebuilding shared indexes, since we do the index build in-place, but I
think we can get by with an ExclusiveLock in the non-shared case, for
similar reasons as above: we build the new index and then swap
relfilenodes.

From looking at the code, it should be quite possible to do this.

Further points from discussion on IRC:

- TRUNCATE suffers from the same behavior (it acquires an
AccessExclusiveLock where really an ExclusiveLock or similar should be
good enough)

- if we make these changes, we will need some way to delete a
no-longer-visible relfilenode. It should be sufficient to delete a
relfilenode when the expired pg_class row that refers to it is no longer
visible to any transactions -- but this isn't necessarily going to be
true when the transaction that executed the REINDEX/CLUSTER/TRUNCATE
commits. We could perform this check in some kind of periodic process,
perhaps -- like the bgwriter, at checkpoint time.

-Neil

#16

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Neil Conway (#11)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway <neilc@samurai.com> writes:

Well, we will be holding an ExclusiveLock on the heap relation
regardless. We "replace" the heap table by swapping its relfilenode, so
ISTM we needn't hold an AccessExclusiveLock.

Utterly wrong. When you commit you will physically drop the old table.
If there is a SELECT running against the old table it will be quite
unhappy after that.

regards, tom lane

#17

neilc@samurai.com

over 21 years ago

In reply to: Tom Lane (#16)

Re: locks in CREATE TRIGGER, ADD FK

Tom Lane wrote:

Utterly wrong. When you commit you will physically drop the old table.
If there is a SELECT running against the old table it will be quite
unhappy after that.

How can we drop the file at commit, given that a serializable
transaction's snapshot should still be able to see old relfilenode's
content?

(If the serializable transaction has already acquired a read lock before
the TRUNCATE begins, it will block the TRUNCATE -- but there is no
guarantee that the operations will be ordered like that.)

-Neil

#18

Andrew - Supernews

andrew+nonews@supernews.com

over 21 years ago

In reply to: Neil Conway (#1)

swapping relfilenodes (was: Re: locks in CREATE TRIGGER, ADD FK)

On 2005-03-23, Neil Conway <neilc@samurai.com> wrote:

- swap the relfilenodes of the old and temporary heap relations

While discussing this one further on IRC, I noticed the following:

Everywhere I could find that currently replaces the relfilenode of a
relation does so while holding an AccessExclusive lock, and assumes that
this is sufficient to ensure that the old relfilenode can be killed when
the transaction commits. This is not correct.

Example:

- backend A begins a serializable transaction
- backend B truncates a table (and commits)
- backend A, still in the same transaction, accesses the truncated table

Currently backend A sees the truncated table as empty, which is obviously
not right. This is obviously related to any attempt to weaken the locking
on other operations that modify relfilenodes, because doing it right implies
a mechanism to defer the removals past the commit of the modifying
transaction and up to the point where the old data can no longer be seen by
a live transaction.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services

#19

bruce@momjian.us

over 21 years ago

In reply to: Neil Conway (#17)

Re: locks in CREATE TRIGGER, ADD FK

Neil Conway wrote:

Tom Lane wrote:

Utterly wrong. When you commit you will physically drop the old table.
If there is a SELECT running against the old table it will be quite
unhappy after that.

How can we drop the file at commit, given that a serializable
transaction's snapshot should still be able to see old relfilenode's
content?

Vacuum will not remove any old rows because of the transaction xid so
why does it care if the table is clustered/reindexed? It doesn't have
the table open yet.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#20

bruce@momjian.us

over 21 years ago

In reply to: Andrew - Supernews (#18)

Re: swapping relfilenodes (was: Re: locks in CREATE TRIGGER,

Andrew - Supernews wrote:

On 2005-03-23, Neil Conway <neilc@samurai.com> wrote:

- swap the relfilenodes of the old and temporary heap relations

While discussing this one further on IRC, I noticed the following:

Everywhere I could find that currently replaces the relfilenode of a
relation does so while holding an AccessExclusive lock, and assumes that
this is sufficient to ensure that the old relfilenode can be killed when
the transaction commits. This is not correct.

Example:

- backend A begins a serializable transaction
- backend B truncates a table (and commits)
- backend A, still in the same transaction, accesses the truncated table

Currently backend A sees the truncated table as empty, which is obviously
not right. This is obviously related to any attempt to weaken the locking
on other operations that modify relfilenodes, because doing it right implies
a mechanism to defer the removals past the commit of the modifying
transaction and up to the point where the old data can no longer be seen by
a live transaction.

This is a good point. While DELETE keeps the old rows around and VACUUM
perserves them until the serialized transaction commits, truncate does
not keep the old rows around.

In fact, would a truncate during a backup cause the backup to be
inconsistent because it wouldn't be a true snapshot of the database at
backup start time? Seems so.

The docs mention:

TRUNCATE cannot be used if there are foreign-key refer-
ences to the table from other tables. Checking validity in
such cases would require table scans, and the whole point
is not to do one.

so it doesn't make the referential integrity inconsistent.

Perhaps we should document this.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#21

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Neil Conway (#17)

#22

tgl@sss.pgh.pa.us

over 21 years ago

In reply to: Bruce Momjian (#20)

#23