questions about concurrency control in Postgresql

Started by 黄晓骋, about 16 years ago, 6 messages
#1黄晓骋
huangxclife@gmail.com

Hello,

I think concurrency control in PostgreSQL works like this:

A tuple's infomask shows whether it is being updated. If it is being updated
now, the later transaction has to reread the tuple and fetch the newer
version. While fetching the newer version, it must use the transaction
lock, I mean XactLockTableWait(...).

From the above, I think the tuple lock is unnecessary, because the
transaction lock is already used.

Besides, the tuple lock is released after the tuple is updated, not after the
transaction commits. I mean it's not 2PL.

So, could you tell me why there is a tuple lock in PostgreSQL? Is the tuple
lock necessary?

Thanks,

--Huang Xiaocheng

--Database & Information System Lab, Nankai University

#2Daniel Farina
drfarina@gmail.com
In reply to: 黄晓骋 (#1)
Re: questions about concurrency control in Postgresql

2009/12/7 黄晓骋 <huangxclife@gmail.com>:

Hello,

I think concurrency control in PostgreSQL works like this:

A tuple's infomask shows whether it is being updated. If it is being updated
now, the later transaction has to reread the tuple and fetch the newer
version. While fetching the newer version, it must use the transaction
lock, I mean XactLockTableWait(...).

That is a table lock...depending on the lock, other backends may be
allowed to update tuples in the relation still. Fine-grained tuple
locks are used to prevent unnecessary contention for a table-wide
lock.

See the documentation at the manual page:

http://www.postgresql.org/docs/8.4/static/explicit-locking.html

It gives a thorough treatment of table and row locking levels and
conflicts, as well as what gets what locks.

fdr

#3Greg Stark
gsstark@mit.edu
In reply to: 黄晓骋 (#1)
Re: questions about concurrency control in Postgresql

2009/12/8 黄晓骋 <huangxclife@gmail.com>:

From the above, I think the tuple lock is unnecessary, because the
transaction lock is already used.

Besides, the tuple lock is released after the tuple is updated, not after the
transaction commits. I mean it's not 2PL.

It's a two step process. An update marks the tuple locked. Another
transaction which comes along and wants to lock the tuple waits on the
transaction marked on the tuple. When the first transaction commits or
aborts then the second transaction can proceed and lock the tuple
itself. The reason we need both locks is that the first transaction
cannot go around the whole database finding every tuple it ever locked
in order to unlock it. Firstly, that could be a very large list, and
secondly, there would be no way to do that atomically.
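
The two-step process described above can be sketched as a toy model. This is only an illustration in Python, not PostgreSQL's code: the Transaction and Tuple classes, the lock_tuple() helper, and the use of threading.Event as a stand-in for the lock on a transaction ID are all assumptions made for the example. It also ignores tuple versioning, so a waiter simply takes over the same toy tuple.

```python
import threading


class Transaction:
    """Toy transaction: one waitable object per transaction ID."""

    def __init__(self, xid):
        self.xid = xid
        self.done = threading.Event()  # stand-in for the lock on the XID
        self.committed = False

    def finish(self, commit=True):
        # One operation wakes every waiter; no tuple is revisited.
        self.committed = commit
        self.done.set()


class Tuple:
    """Toy tuple: xmax names the transaction that marked it."""

    def __init__(self, value):
        self.value = value
        self.xmax = None
        self.mutex = threading.Lock()  # short-lived guard for the header


def lock_tuple(tup, me):
    """Mark tup as locked by me, waiting out any earlier locker."""
    while True:
        with tup.mutex:
            holder = tup.xmax
            if holder is None or holder is me or holder.done.is_set():
                tup.xmax = me  # step 1: mark the tuple
                return
        holder.done.wait()  # step 2: wait on the transaction, not the tuple


def demo():
    t1, t2 = Transaction(1), Transaction(2)
    rows = [Tuple(i) for i in range(3)]
    for row in rows:
        lock_tuple(row, t1)

    acquired = threading.Event()

    def second():
        lock_tuple(rows[0], t2)
        acquired.set()

    worker = threading.Thread(target=second)
    worker.start()
    blocked = not acquired.wait(timeout=0.2)  # t2 is still waiting on t1
    t1.finish(commit=True)  # t1 never walks over rows to unlock them
    worker.join()
    return blocked, rows[0].xmax.xid, rows[1].xmax.xid


if __name__ == "__main__":
    print(demo())
```

Running demo() returns (True, 2, 1): the second transaction blocks until the first one ends, then marks the first row itself, while the second row still carries the stale mark of transaction 1 that nobody had to clean up.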

Tuple locks and all user-visible locks are indeed held until the end
of the transaction.

--
greg

#4黄晓骋
huangxclife@gmail.com
In reply to: Greg Stark (#3)
Re: questions about concurrency control in Postgresql

It's a two step process. An update marks the tuple locked. Another
transaction which comes along and wants to lock the tuple waits on the
transaction marked on the tuple. When the first transaction commits or
aborts then the second transaction can proceed and lock the tuple
itself.

I agree with it.

The reason we need both locks is that the first transaction
cannot go around the whole database finding every tuple it ever locked
in order to unlock it. Firstly, that could be a very large list, and
secondly, there would be no way to do that atomically.

You mean that 2PL is hard to implement in practice; I agree with that too.
But that doesn't mean the tuple lock is necessary.

Tuple locks and all user-visible locks are indeed held until the end
of the transaction.

I don't agree with that, because I see UnlockTuple(...) in heap_update(...).

--Huang Xiaocheng
--Database & Information System Lab, Nankai University

__________ Information from ESET NOD32 Antivirus, version of virus signature database 4671 (20091208) __________

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com

#5黄晓骋
huangxclife@gmail.com
In reply to: Greg Stark (#3)
Re: questions about concurrency control in Postgresql

I think I know why we need the tuple lock.
Although the tuple's infomask shows whether the tuple is being updated, two transactions may arrive and try to update the tuple before the infomask is set. Both would think the tuple is OK to update, which is wrong.
In PostgreSQL, we could use the buffer lock to solve this problem, but its granularity is not appropriate, so we must use the tuple lock instead.
Thank you, Greg. You prompted me to think clearly about it.
It was a pleasure communicating with you, and thanks again.

--Huang Xiaocheng
--Database & Information System Lab, Nankai University

#6黄晓骋
huangxclife@gmail.com
In reply to: 黄晓骋 (#1)
Re: [HACKERS] Re: questions about concurrency control in Postgresql

You are right. I never considered SELECT FOR UPDATE/SHARE type queries, so I reached the wrong conclusion.
I have now read the comment on heap_lock_tuple().

Thank you,
Best Regards,

--Huang Xiaocheng
--Database & Information System Lab, Nankai University

-----Original Message-----
From: Alvaro Herrera [mailto:alvherre@commandprompt.com]
Sent: December 10, 2009 22:54
To: 黄晓骋
Cc: 'Greg Stark'; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Re: questions about concurrency control in Postgresql

黄晓骋 wrote:

I think I know why we need the tuple lock.
Although the tuple's infomask shows whether the tuple is being updated, two transactions may arrive and try to update the tuple before the infomask is set. Both would think the tuple is OK to update, which is wrong.
In PostgreSQL, we could use the buffer lock to solve this problem, but its granularity is not appropriate, so we must use the tuple lock instead.
Thank you, Greg. You prompted me to think clearly about it.

Actually it's the buffer lock that's used to protect most of infomask.
Tuple locks are only used while XMAX and some infomask bits are set for
SELECT FOR UPDATE/SHARE type queries. That can take a while because it
may need I/O in pg_multixact, so the buffer lock is not appropriate to
hold for so long.
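
The division of labour described above can be sketched roughly as follows. This is a simplified Python illustration, not the real heap_lock_tuple() flow: the Page class, the LOCKED_BIT value, the lock_row() and read_header() helpers, and the slow_step callback standing in for pg_multixact I/O are all hypothetical names chosen for the example.

```python
import threading

LOCKED_BIT = 0x1  # hypothetical flag value, for illustration only


class Page:
    """Toy page holding one tuple header."""

    def __init__(self):
        self.buffer_lock = threading.Lock()  # short: guards header fields
        self.tuple_lock = threading.Lock()   # long: held across slow work
        self.xmax = None
        self.infomask = 0


def lock_row(page, xid, slow_step):
    """Set xmax and a flag bit, keeping the slow part off the buffer lock."""
    with page.tuple_lock:
        slow_step()  # stand-in for I/O; the buffer lock is NOT held here
        with page.buffer_lock:
            page.xmax = xid
            page.infomask |= LOCKED_BIT


def read_header(page):
    """Read the header under the short buffer lock."""
    with page.buffer_lock:
        return page.xmax, page.infomask


def demo():
    page = Page()
    in_slow_step = threading.Event()
    release = threading.Event()

    def slow_step():
        in_slow_step.set()
        release.wait()

    locker = threading.Thread(target=lock_row, args=(page, 42, slow_step))
    locker.start()
    in_slow_step.wait()
    before = read_header(page)  # not blocked: the buffer lock is free
    tuple_lock_busy = page.tuple_lock.locked()
    release.set()
    locker.join()
    after = read_header(page)
    return before, tuple_lock_busy, after


if __name__ == "__main__":
    print(demo())
```

While the locker is stuck in the slow step, other backends in this toy model can still read the header, because only the tuple lock is held. demo() returns ((None, 0), True, (42, 1)).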

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
