Improve checking for pg_index.xmin

Started by Alexander Korotkovabout 6 years ago6 messages
#1Alexander Korotkov
a.korotkov@postgrespro.ru
1 attachment(s)

Hi!

Our customer faced with issue, when index is invisible after creation.
The reproducible case is following.

$ psql db2
# begin;
# select txid_current();
$ psql db1
# select i as id, 0 as v into t from generate_series(1, 100000) i;
# create unique index idx on t (id);
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# drop index idx;
# create unique index idx on t (id);
# explain analyze select v from t where id = 10000;

There is no issue if there is no parallel session in database db2.
The fact that index visibility depends on open transaction in
different database is ridiculous for users.

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().

With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Any thoughts?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-improve-check-for-pg_index-xmin.patchapplication/octet-stream; name=0001-improve-check-for-pg_index-xmin.patchDownload
commit f8a872414d7644f84166ab0467e489e6dd558369
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date:   Fri Nov 1 02:05:51 2019 +0300

    Improve check for pg_index.xmin
    
    Use more precise check according to transaction snapshot, not just xmin.

diff --git a/src/backend/access/heap/README.HOT b/src/backend/access/heap/README.HOT
index 68c6709aa88..02fd0ceb12b 100644
--- a/src/backend/access/heap/README.HOT
+++ b/src/backend/access/heap/README.HOT
@@ -319,8 +319,8 @@ too, but since the case is unexpected we prefer to throw an error.)
 
 Practically, we prevent certain transactions from using the new index by
 setting pg_index.indcheckxmin to TRUE.  Transactions are allowed to use
-such an index only after pg_index.xmin is below their TransactionXmin
-horizon, thereby ensuring that any incompatible rows in HOT chains are
+such an index only after pg_index.xmin is below their active snapshot,
+thereby ensuring that any incompatible rows in HOT chains are
 dead to them. (pg_index.xmin will be the XID of the CREATE INDEX
 transaction.  The reason for using xmin rather than a normal column is
 that the regular vacuum freezing mechanism will take care of converting
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index e5f9e04d659..ab207c4481d 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -228,8 +228,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			 * src/backend/access/heap/README.HOT for discussion.
 			 */
 			if (index->indcheckxmin &&
-				!TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data),
-									   TransactionXmin))
+				XidInMVCCSnapshot(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data),
+								  GetActiveSnapshot()))
 			{
 				root->glob->transientPlan = true;
 				index_close(indexRelation, NoLock);
#2Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Alexander Korotkov (#1)
Re: Improve checking for pg_index.xmin

On 01/11/2019 01:50, Alexander Korotkov wrote:

Hi!

Our customer faced with issue, when index is invisible after creation.
The reproducible case is following.

$ psql db2
# begin;
# select txid_current();
$ psql db1
# select i as id, 0 as v into t from generate_series(1, 100000) i;
# create unique index idx on t (id);
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# update t set v = v + 1 where id = 10000;
# drop index idx;
# create unique index idx on t (id);
# explain analyze select v from t where id = 10000;

There is no issue if there is no parallel session in database db2.
The fact that index visibility depends on open transaction in
different database is ridiculous for users.

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().

With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Hmm. Maybe you could construct a case like that with a creative mix of
stable and volatile functions? Using GetOldestSnapshot() would be safer.

- Heikki

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#2)
Re: Improve checking for pg_index.xmin

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 01/11/2019 01:50, Alexander Korotkov wrote:

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().
With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Hmm. Maybe you could construct a case like that with a creative mix of
stable and volatile functions? Using GetOldestSnapshot() would be safer.

I really wonder if this is safe at all.

(1) Can we assume that the query will be executed with same-or-newer
snapshot as what was used by the planner? There's no such constraint
in the plancache, I'm pretty sure.

(2) Is committed-ness of the index-creating transaction actually
sufficient to ensure that none of the broken HOT chains it saw are
a problem for the onlooker transaction? This is, at best, really
un-obvious. Some of those HOT chains could involve xacts that were
still not committed when the index build finished, I believe.

(3) What if the index was made with CREATE INDEX CONCURRENTLY ---
which xid is actually on the pg_index row, and how does that factor
into (1) and (2)?

On the whole I don't find the risk/reward tradeoff of this looking
promising. Even if it works reliably, I think the situations where
it'll help a lot are a bit artificial.

regards, tom lane

#4Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Tom Lane (#3)
Re: Improve checking for pg_index.xmin

On Wed, Jan 8, 2020 at 4:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 01/11/2019 01:50, Alexander Korotkov wrote:

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().
With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Hmm. Maybe you could construct a case like that with a creative mix of
stable and volatile functions? Using GetOldestSnapshot() would be safer.

I really wonder if this is safe at all.

(1) Can we assume that the query will be executed with same-or-newer
snapshot as what was used by the planner? There's no such constraint
in the plancache, I'm pretty sure.

(2) Is committed-ness of the index-creating transaction actually
sufficient to ensure that none of the broken HOT chains it saw are
a problem for the onlooker transaction? This is, at best, really
un-obvious. Some of those HOT chains could involve xacts that were
still not committed when the index build finished, I believe.

(3) What if the index was made with CREATE INDEX CONCURRENTLY ---
which xid is actually on the pg_index row, and how does that factor
into (1) and (2)?

Thank you for pointing. I'll investigate these issues in detail.

On the whole I don't find the risk/reward tradeoff of this looking
promising. Even if it works reliably, I think the situations where
it'll help a lot are a bit artificial.

I can't agree that these situations are artificial. For me, it seems
natural that user expects index to be visible once it's created.
Also, we always teach users that long-running transactions are evil,
but nevertheless they are frequent in real life. So, it doesn't seem
unlikely that one expects index to become visible, while long
transaction is running in parallel. This particular case was reported
by our customer. After investigation I was surprised how rare such
cases are reported...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Alexander Korotkov (#4)
Re: Improve checking for pg_index.xmin

On Sun, Jan 12, 2020 at 3:33 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

On Wed, Jan 8, 2020 at 4:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 01/11/2019 01:50, Alexander Korotkov wrote:

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().
With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Hmm. Maybe you could construct a case like that with a creative mix of
stable and volatile functions? Using GetOldestSnapshot() would be safer.

I really wonder if this is safe at all.

(1) Can we assume that the query will be executed with same-or-newer
snapshot as what was used by the planner? There's no such constraint
in the plancache, I'm pretty sure.

(2) Is committed-ness of the index-creating transaction actually
sufficient to ensure that none of the broken HOT chains it saw are
a problem for the onlooker transaction? This is, at best, really
un-obvious. Some of those HOT chains could involve xacts that were
still not committed when the index build finished, I believe.

(3) What if the index was made with CREATE INDEX CONCURRENTLY ---
which xid is actually on the pg_index row, and how does that factor
into (1) and (2)?

Thank you for pointing. I'll investigate these issues in detail.

Are you planning to work on this patch [1]https://commitfest.postgresql.org/27/2337/ for current CF? If not,
then I think it is better to either move this to the next CF or mark
it as RWF.

[1]: https://commitfest.postgresql.org/27/2337/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#6Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Amit Kapila (#5)
Re: Improve checking for pg_index.xmin

On Tue, Mar 24, 2020 at 3:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sun, Jan 12, 2020 at 3:33 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

On Wed, Jan 8, 2020 at 4:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 01/11/2019 01:50, Alexander Korotkov wrote:

This happens so, because we're checking that there is no broken HOT
chains after index creation by comparison pg_index.xmin and
TransactionXmin. So, we check that pg_index.xmin is in the past for
current transaction in lossy way by comparison just xmins. Attached
patch changes this check to XidInMVCCSnapshot().
With patch the issue is gone. My doubt about this patch is that it
changes check with TransactionXmin to check with GetActiveSnapshot(),
which might be more recent. However, query shouldn't be executer with
older snapshot than one it was planned with.

Hmm. Maybe you could construct a case like that with a creative mix of
stable and volatile functions? Using GetOldestSnapshot() would be safer.

I really wonder if this is safe at all.

(1) Can we assume that the query will be executed with same-or-newer
snapshot as what was used by the planner? There's no such constraint
in the plancache, I'm pretty sure.

(2) Is committed-ness of the index-creating transaction actually
sufficient to ensure that none of the broken HOT chains it saw are
a problem for the onlooker transaction? This is, at best, really
un-obvious. Some of those HOT chains could involve xacts that were
still not committed when the index build finished, I believe.

(3) What if the index was made with CREATE INDEX CONCURRENTLY ---
which xid is actually on the pg_index row, and how does that factor
into (1) and (2)?

Thank you for pointing. I'll investigate these issues in detail.

Are you planning to work on this patch [1] for current CF? If not,
then I think it is better to either move this to the next CF or mark
it as RWF.

I didn't manage to investigate this subject and provide new patch
version. I'm marking this RWF.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company