Lock conflict behavior?

Started by Tatsuo Ishiiabout 17 years ago13 messages
#1Tatsuo Ishii
ishii@postgresql.org

Hi,

I'm wondering if following behavior of PostgreSQL regarding lock
conflict is an expected one. Here's a scenario:

Session A:
BEGIN;
SELECT * FROM pg_class limit 1; -- acquires access share lock

Session B:
BEGIN;
ALTER TABLE pg_class ....; -- waits for acquiring access
exclusive lock(wil fail anyway though)
Session C:
SELECT * FROM pg_class...; -- whatever query which needs
to acces pg_class will be
blocked, too bad...

I understand that B should wait for aquiring lock, but Should C wait
for?

Also, it seems that an attacker could do a denial service attack if he
could open session A and B, since other users on session C or
following sessions will be blocked.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#1)
Re: Lock conflict behavior?

Tatsuo Ishii <ishii@postgresql.org> writes:

I'm wondering if following behavior of PostgreSQL regarding lock
conflict is an expected one. Here's a scenario:

Session A:
BEGIN;
SELECT * FROM pg_class limit 1; -- acquires access share lock

Session B:
BEGIN;
ALTER TABLE pg_class ....; -- waits for acquiring access
exclusive lock(wil fail anyway though)
Session C:
SELECT * FROM pg_class...; -- whatever query which needs
to acces pg_class will be
blocked, too bad...

I understand that B should wait for aquiring lock, but Should C wait
for?

If we didn't do this, then a would-be acquirer of exclusive lock would
have a very serious problem with lock starvation: it might wait forever
in the face of a continuous stream of access-share lock requests.

regards, tom lane

#3Jan Urbański
j.urbanski@students.mimuw.edu.pl
In reply to: Tom Lane (#2)
Re: Lock conflict behavior?

Tom Lane wrote:

Tatsuo Ishii <ishii@postgresql.org> writes:

I'm wondering if following behavior of PostgreSQL regarding lock
conflict is an expected one. Here's a scenario:

Session A:
BEGIN;
SELECT * FROM pg_class limit 1; -- acquires access share lock

Session B:
BEGIN;
ALTER TABLE pg_class ....; -- waits for acquiring access
exclusive lock(wil fail anyway though)
Session C:
SELECT * FROM pg_class...; -- whatever query which needs
to acces pg_class will be
blocked, too bad...

I understand that B should wait for aquiring lock, but Should C wait
for?

If we didn't do this, then a would-be acquirer of exclusive lock would
have a very serious problem with lock starvation: it might wait forever
in the face of a continuous stream of access-share lock requests.

See http://en.wikipedia.org/wiki/Readers-writers_problem

Jan

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin

#4Jeff Davis
pgsql@j-davis.com
In reply to: Tatsuo Ishii (#1)
Re: Lock conflict behavior?

On Mon, 2008-12-22 at 17:14 +0900, Tatsuo Ishii wrote:

Also, it seems that an attacker could do a denial service attack if he
could open session A and B, since other users on session C or
following sessions will be blocked.

LOCK TABLE checks the permissions before attempting to acquire the lock,
is there a reason that ALTER TABLE doesn't?

Even if they don't have any rights to the table at all (not even
SELECT), there are still other problems. For instance, the user could
just wait for a long running query (or VACUUM) and issue the ALTER TABLE
at that time.

I know we don't make any guarantees about preventing denial-of-service
attacks from users that can connect, but if possible we should be
consistent about checking the permissions.

Regards,
Jeff Davis

#5Tatsuo Ishii
ishii@postgresql.org
In reply to: Jeff Davis (#4)
Re: Lock conflict behavior?

Also, it seems that an attacker could do a denial service attack if he
could open session A and B, since other users on session C or
following sessions will be blocked.

LOCK TABLE checks the permissions before attempting to acquire the lock,
is there a reason that ALTER TABLE doesn't?

Right. I think we should check the permissions first too.

Even if they don't have any rights to the table at all (not even
SELECT), there are still other problems. For instance, the user could
just wait for a long running query (or VACUUM) and issue the ALTER TABLE
at that time.

In the scenario I mentioned, even a new connection cannot be made to
the database since the backend need to initialize relcache by reading
system catlogs with access share lock at the very eary stage in
strating up.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

#6Tatsuo Ishii
ishii@postgresql.org
In reply to: Tatsuo Ishii (#5)
Re: Lock conflict behavior?

Also, it seems that an attacker could do a denial service attack if he
could open session A and B, since other users on session C or
following sessions will be blocked.

LOCK TABLE checks the permissions before attempting to acquire the lock,
is there a reason that ALTER TABLE doesn't?

Right. I think we should check the permissions first too.

Even if they don't have any rights to the table at all (not even
SELECT), there are still other problems. For instance, the user could
just wait for a long running query (or VACUUM) and issue the ALTER TABLE
at that time.

In the scenario I mentioned, even a new connection cannot be made to
the database since the backend need to initialize relcache by reading
system catlogs with access share lock at the very eary stage in
strating up.

Another concern is, two phase commit. If a 2PC transaction includes
DDL, access share lock for pg_class is left. Someone comes with alter
table pg_class and tries to hold an exclusive lock. After this all
SELECT and autovacuum will stop because access share lock for pg_class
cannot be aquired.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#5)
Re: Lock conflict behavior?

Tatsuo Ishii <ishii@postgresql.org> writes:

LOCK TABLE checks the permissions before attempting to acquire the lock,
is there a reason that ALTER TABLE doesn't?

Right. I think we should check the permissions first too.

I've always thought that it was extremely shaky for LOCK to try to work
that way. With no lock, you have no confidence that the table isn't
changing or disappearing under you. In the worst case, the permissions
check might fail outright (likely with a "cache lookup failed" message
about a catalog row that disappeared as we attempted to fetch it); or it
might give an answer that's obsolete by the time we do acquire the lock.

The present coding of LOCK dates from before we had fixed things to
guarantee that relation_open() acquires table lock before attempting to
load the relcache entry. Back then, there were enough race conditions
involved in first access to a relation that one more didn't seem to
matter. But now, I'd sure insist on someone finding a more bulletproof
answer before I'd hold still for extending that bogus coding pattern
all over the system.

regards, tom lane

#8Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#7)
Re: Lock conflict behavior?

On Tue, 2008-12-23 at 08:48 -0500, Tom Lane wrote:

I've always thought that it was extremely shaky for LOCK to try to work
that way. With no lock, you have no confidence that the table isn't
changing or disappearing under you. In the worst case, the permissions
check might fail outright (likely with a "cache lookup failed" message
about a catalog row that disappeared as we attempted to fetch it); or it
might give an answer that's obsolete by the time we do acquire the lock.

It looks like it would be easy enough to throw a better error message
than that, e.g. with a try/catch. The information could be obsolete, but
if it succeeds, it would at least mean they had permissions at some time
in the past.

Or, we could just remove the ACL checks from LOCK TABLE, so that it's at
least consistent. Mostly it's the inconsistency that bothers me.

Regards,
Jeff Davis

#9Bruce Momjian
bruce@momjian.us
In reply to: Jeff Davis (#8)
Re: Lock conflict behavior?

Jeff Davis wrote:

On Tue, 2008-12-23 at 08:48 -0500, Tom Lane wrote:

I've always thought that it was extremely shaky for LOCK to try to work
that way. With no lock, you have no confidence that the table isn't
changing or disappearing under you. In the worst case, the permissions
check might fail outright (likely with a "cache lookup failed" message
about a catalog row that disappeared as we attempted to fetch it); or it
might give an answer that's obsolete by the time we do acquire the lock.

It looks like it would be easy enough to throw a better error message
than that, e.g. with a try/catch. The information could be obsolete, but
if it succeeds, it would at least mean they had permissions at some time
in the past.

Or, we could just remove the ACL checks from LOCK TABLE, so that it's at
least consistent. Mostly it's the inconsistency that bothers me.

Is this a TODO?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#10Jeff Davis
pgsql@j-davis.com
In reply to: Bruce Momjian (#9)
Re: Lock conflict behavior?

On Wed, 2009-01-21 at 17:39 -0500, Bruce Momjian wrote:

It looks like it would be easy enough to throw a better error message
than that, e.g. with a try/catch. The information could be obsolete, but
if it succeeds, it would at least mean they had permissions at some time
in the past.

Or, we could just remove the ACL checks from LOCK TABLE, so that it's at
least consistent. Mostly it's the inconsistency that bothers me.

Is this a TODO?

I don't feel too strongly about it. I would feel better if we were
consistent about the permissions checks, because there's less of a
chance for confusion or a false sense of security.

If we keep the permission check in LockTableCommand(), I can make a
patch that produces a more useful error message when the table is
removed right before the pg_class_aclcheck().

Right now it does:
ERROR: relation with OID 16542 does not exist

which is undesirable.

Regards,
Jeff Davis

#11Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#10)
1 attachment(s)
Re: Lock conflict behavior?

On Wed, 2009-01-21 at 15:08 -0800, Jeff Davis wrote:

If we keep the permission check in LockTableCommand(), I can make a
patch that produces a more useful error message when the table is
removed right before the pg_class_aclcheck().

Attached.

Regards,
Jeff Davis

Attachments:

lock_error.patchtext/x-patch; charset=utf-8; name=lock_error.patchDownload
diff --git a/src/backend/commands/lockcmds.c b/src/backend/commands/lockcmds.c
index e32b184..a96bdab 100644
--- a/src/backend/commands/lockcmds.c
+++ b/src/backend/commands/lockcmds.c
@@ -59,22 +59,58 @@ LockTableCommand(LockStmt *lockstmt)
 			Relation	rel;
 			AclResult	aclresult;
 
-			/* We don't want to open the relation until we've checked privilege. */
-			if (lockstmt->mode == AccessShareLock)
-				aclresult = pg_class_aclcheck(childreloid, GetUserId(),
-											  ACL_SELECT);
-			else
-				aclresult = pg_class_aclcheck(childreloid, GetUserId(),
-											  ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
+			PG_TRY();
+			{
+				/* We don't want to open the relation until we've checked privilege. */
+				if (lockstmt->mode == AccessShareLock)
+					aclresult = pg_class_aclcheck(childreloid, GetUserId(),
+												  ACL_SELECT);
+				else
+					aclresult = pg_class_aclcheck(childreloid, GetUserId(),
+												  ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE);
+			}
+			PG_CATCH();
+			{
+				FlushErrorState();
+				if (relation->schemaname)
+					ereport(ERROR,
+							(errcode(ERRCODE_UNDEFINED_TABLE),
+							 errmsg("relation \"%s.%s\" does not exist",
+									relation->schemaname, relation->relname)));
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_UNDEFINED_TABLE),
+							 errmsg("relation \"%s\" does not exist",
+									relation->relname)));
+			}
+			PG_END_TRY();
 
 			if (aclresult != ACLCHECK_OK)
 				aclcheck_error(aclresult, ACL_KIND_CLASS,
 							   get_rel_name(childreloid));
 
-			if (lockstmt->nowait)
-				rel = relation_open_nowait(childreloid, lockstmt->mode);
-			else
-				rel = relation_open(childreloid, lockstmt->mode);
+			PG_TRY();
+			{
+				if (lockstmt->nowait)
+					rel = relation_open_nowait(childreloid, lockstmt->mode);
+				else
+					rel = relation_open(childreloid, lockstmt->mode);
+			}
+			PG_CATCH();
+			{
+				FlushErrorState();
+				if (relation->schemaname)
+					ereport(ERROR,
+							(errcode(ERRCODE_UNDEFINED_TABLE),
+							 errmsg("relation \"%s.%s\" does not exist",
+									relation->schemaname, relation->relname)));
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_UNDEFINED_TABLE),
+							 errmsg("relation \"%s\" does not exist",
+									relation->relname)));
+			}
+			PG_END_TRY();
 
 			/* Currently, we only allow plain tables to be locked */
 			if (rel->rd_rel->relkind != RELKIND_RELATION)
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#11)
Re: Lock conflict behavior?

Jeff Davis <pgsql@j-davis.com> writes:

On Wed, 2009-01-21 at 15:08 -0800, Jeff Davis wrote:

If we keep the permission check in LockTableCommand(), I can make a
patch that produces a more useful error message when the table is
removed right before the pg_class_aclcheck().

Attached.

This is pretty horrid, because it converts any error whatsoever into
"relation does not exist". For counterexamples consider "statement
timeout reached", "query cancelled by user", "pg_class is corrupted",
etc etc.

regards, tom lane

#13Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#12)
Re: Lock conflict behavior?

On Thu, 2009-01-22 at 18:20 -0500, Tom Lane wrote:

Jeff Davis <pgsql@j-davis.com> writes:

On Wed, 2009-01-21 at 15:08 -0800, Jeff Davis wrote:

If we keep the permission check in LockTableCommand(), I can make a
patch that produces a more useful error message when the table is
removed right before the pg_class_aclcheck().

Attached.

This is pretty horrid, because it converts any error whatsoever into
"relation does not exist". For counterexamples consider "statement
timeout reached", "query cancelled by user", "pg_class is corrupted",
etc etc.

Ah, I see. Well, I guess there's not a better way to handle that error
after all. There's no way to tell what exception you're catching
specifically, right?

Regards,
Jeff Davis