REINDEX blocks virtually any queries but some prepared queries.

Started by Frédéric Yhuelalmost 4 years ago10 messages
#1Frédéric Yhuel
frederic.yhuel@dalibo.com

Hello,

From the documentation
(https://www.postgresql.org/docs/current/sql-reindex.html#id-1.9.3.162.7),
it sounds like REINDEX won't block read queries that don't need the
index. But it seems like the planner wants to take an ACCESS SHARE lock
on every indexes, regardless of the query, and so REINDEX actually
blocks any queries but some prepared queries whose plan have been cached.

I wonder if it is a bug, or if the documentation should be updated. What
do you think?

Here is a simple demo (tested with postgres 10 and master):

Session #1
===========================================================

srcpg@postgres=# CREATE TABLE flights (id INT generated always as
identity, takeoff DATE);
CREATE TABLE

srcpg@postgres=# INSERT INTO flights (takeoff) SELECT date '2022-03-01'
+ interval '1 day' * i FROM generate_series(1,1000) i;
INSERT 0 1000

srcpg@postgres=# CREATE INDEX ON flights(takeoff);
CREATE INDEX

srcpg@postgres=# BEGIN;
BEGIN
srcpg@postgres=# REINDEX INDEX flights_takeoff_idx ;
REINDEX

Session #2
===========================================================

srcpg@postgres=# SELECT pg_backend_pid();
pg_backend_pid
----------------
4114695

srcpg@postgres=# EXPLAIN SELECT id FROM flights;
--> it blocks

Session #3
===========================================================

srcpg@postgres=# SELECT locktype, relname, mode, granted FROM pg_locks
LEFT JOIN pg_class ON (oid = relation) WHERE pid = 4114695;
locktype | relname | mode | granted
------------+---------------------+-----------------+---------
virtualxid | ∅ | ExclusiveLock | t
relation | flights_takeoff_idx | AccessShareLock | f
relation | flights | AccessShareLock | t

In reply to: Frédéric Yhuel (#1)
Re: REINDEX blocks virtually any queries but some prepared queries.

On Wed, Apr 6, 2022 at 7:49 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:

From the documentation
(https://www.postgresql.org/docs/current/sql-reindex.html#id-1.9.3.162.7),
it sounds like REINDEX won't block read queries that don't need the
index. But it seems like the planner wants to take an ACCESS SHARE lock
on every indexes, regardless of the query, and so REINDEX actually
blocks any queries but some prepared queries whose plan have been cached.

I wonder if it is a bug, or if the documentation should be updated. What
do you think?

I've always thought that the docs for REINDEX, while technically
accurate, are very misleading in practice.

--
Peter Geoghegan

#3Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Peter Geoghegan (#2)
1 attachment(s)
Re: REINDEX blocks virtually any queries but some prepared queries.

On 4/6/22 17:03, Peter Geoghegan wrote:

On Wed, Apr 6, 2022 at 7:49 AM Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:

From the documentation
(https://www.postgresql.org/docs/current/sql-reindex.html#id-1.9.3.162.7),
it sounds like REINDEX won't block read queries that don't need the
index. But it seems like the planner wants to take an ACCESS SHARE lock
on every indexes, regardless of the query, and so REINDEX actually
blocks any queries but some prepared queries whose plan have been cached.

I wonder if it is a bug, or if the documentation should be updated. What
do you think?

I've always thought that the docs for REINDEX, while technically
accurate, are very misleading in practice.

Maybe something along this line? (patch attached)

Attachments:

0001-Doc-Elaborate-locking-considerations-for-REINDEX.patchtext/x-patch; charset=UTF-8; name=0001-Doc-Elaborate-locking-considerations-for-REINDEX.patchDownload
From 4930bb8de182b78228d215bce1ab65263baabde7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Thu, 7 Apr 2022 13:30:59 +0200
Subject: [PATCH] Doc: Elaborate locking considerations for REINDEX

---
 doc/src/sgml/ref/reindex.sgml | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index e6b25ee670..06c223d4a3 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -275,7 +275,11 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { IN
    considerations are rather different.  <command>REINDEX</command> locks out writes
    but not reads of the index's parent table.  It also takes an
    <literal>ACCESS EXCLUSIVE</literal> lock on the specific index being processed,
-   which will block reads that attempt to use that index.  In contrast,
+   which will block reads that attempt to use that index. In particular,
+   the PostgreSQL query planner wants to take an <literal>ACCESS SHARE</literal>
+   lock on every indexes of the table, regardless of the query, and so
+   <command>REINDEX</command> blocks virtually any queries but some prepared queries
+   whose plan have been cached and which don't use this very index. In contrast,
    <command>DROP INDEX</command> momentarily takes an
    <literal>ACCESS EXCLUSIVE</literal> lock on the parent table, blocking both
    writes and reads.  The subsequent <command>CREATE INDEX</command> locks out
-- 
2.30.2

#4Justin Pryzby
pryzby@telsasoft.com
In reply to: Frédéric Yhuel (#3)
Re: REINDEX blocks virtually any queries but some prepared queries.

On Thu, Apr 07, 2022 at 01:37:57PM +0200, Fr�d�ric Yhuel wrote:

Maybe something along this line? (patch attached)

Some language fixes.
I didn't verify the behavior, but +1 to document the practical consequences.
I guess this is why someone invented REINDEX CONCURRENTLY.

From 4930bb8de182b78228d215bce1ab65263baabde7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Thu, 7 Apr 2022 13:30:59 +0200
Subject: [PATCH] Doc: Elaborate locking considerations for REINDEX

---
doc/src/sgml/ref/reindex.sgml | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index e6b25ee670..06c223d4a3 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -275,7 +275,11 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { IN
considerations are rather different.  <command>REINDEX</command> locks out writes
but not reads of the index's parent table.  It also takes an
<literal>ACCESS EXCLUSIVE</literal> lock on the specific index being processed,
-   which will block reads that attempt to use that index.  In contrast,
+   which will block reads that attempt to use that index. In particular,
+   the PostgreSQL query planner wants to take an <literal>ACCESS SHARE</literal>

s/wants/tries/

+ lock on every indexes of the table, regardless of the query, and so

every index

+ <command>REINDEX</command> blocks virtually any queries but some prepared queries

any query except for

+ whose plan have been cached and which don't use this very index. In contrast,

plan has

Show quoted text

<command>DROP INDEX</command> momentarily takes an
<literal>ACCESS EXCLUSIVE</literal> lock on the parent table, blocking both
writes and reads. The subsequent <command>CREATE INDEX</command> locks out
--
2.30.2

#5Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Justin Pryzby (#4)
1 attachment(s)
Re: REINDEX blocks virtually any queries but some prepared queries.

On 4/7/22 14:40, Justin Pryzby wrote:

On Thu, Apr 07, 2022 at 01:37:57PM +0200, Frédéric Yhuel wrote:

Maybe something along this line? (patch attached)

Some language fixes.

Thank you Justin! I applied your fixes in the v2 patch (attached).

I didn't verify the behavior, but +1 to document the practical consequences.
I guess this is why someone invented REINDEX CONCURRENTLY.

Indeed ;) That being said, REINDEX CONCURRENTLY could give you an
invalid index, so sometimes you may be tempted to go for a simpler
REINDEX, especially if you believe that the SELECTs won't be blocked.

Attachments:

0001-Doc-Elaborate-locking-considerations-for-REINDEX_v2.patchtext/x-patch; charset=UTF-8; name=0001-Doc-Elaborate-locking-considerations-for-REINDEX_v2.patchDownload
From 0b6c7d6e466fabc0233b6960b8a33141d512652f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Thu, 7 Apr 2022 13:30:59 +0200
Subject: [PATCH] Doc: Elaborate locking considerations for REINDEX

---
 doc/src/sgml/ref/reindex.sgml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/reindex.sgml b/doc/src/sgml/ref/reindex.sgml
index e6b25ee670..d3c63c4deb 100644
--- a/doc/src/sgml/ref/reindex.sgml
+++ b/doc/src/sgml/ref/reindex.sgml
@@ -275,7 +275,12 @@ REINDEX [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] { IN
    considerations are rather different.  <command>REINDEX</command> locks out writes
    but not reads of the index's parent table.  It also takes an
    <literal>ACCESS EXCLUSIVE</literal> lock on the specific index being processed,
-   which will block reads that attempt to use that index.  In contrast,
+   which will block reads that attempt to use that index. In particular,
+   the PostgreSQL query planner tries to take an <literal>ACCESS SHARE</literal>
+   lock on every index of the table, regardless of the query, and so
+   <command>REINDEX</command> blocks virtually any queries except for some prepared
+   queries whose plan has been cached and which don't use this very index.
+   In contrast,
    <command>DROP INDEX</command> momentarily takes an
    <literal>ACCESS EXCLUSIVE</literal> lock on the parent table, blocking both
    writes and reads.  The subsequent <command>CREATE INDEX</command> locks out
-- 
2.30.2

#6Guillaume Lelarge
guillaume@lelarge.info
In reply to: Frédéric Yhuel (#5)
Re: REINDEX blocks virtually any queries but some prepared queries.

Le jeu. 7 avr. 2022 à 15:44, Frédéric Yhuel <frederic.yhuel@dalibo.com> a
écrit :

On 4/7/22 14:40, Justin Pryzby wrote:

On Thu, Apr 07, 2022 at 01:37:57PM +0200, Frédéric Yhuel wrote:

Maybe something along this line? (patch attached)

Some language fixes.

Thank you Justin! I applied your fixes in the v2 patch (attached).

v2 patch sounds good.

I didn't verify the behavior, but +1 to document the practical

consequences.

I guess this is why someone invented REINDEX CONCURRENTLY.

Indeed ;) That being said, REINDEX CONCURRENTLY could give you an
invalid index, so sometimes you may be tempted to go for a simpler
REINDEX, especially if you believe that the SELECTs won't be blocked.

Agreed.

--
Guillaume.

#7Michael Paquier
michael@paquier.xyz
In reply to: Guillaume Lelarge (#6)
Re: REINDEX blocks virtually any queries but some prepared queries.

On Thu, Apr 07, 2022 at 05:29:36PM +0200, Guillaume Lelarge a écrit :

Le jeu. 7 avr. 2022 à 15:44, Frédéric Yhuel <frederic.yhuel@dalibo.com> a écrit :

On 4/7/22 14:40, Justin Pryzby wrote:
Thank you Justin! I applied your fixes in the v2 patch (attached).

v2 patch sounds good.

The location of the new sentence and its wording seem fine to me. So
no objections from me to add what's suggested, as suggested. I'll
wait for a couple of days first.

Indeed ;) That being said, REINDEX CONCURRENTLY could give you an
invalid index, so sometimes you may be tempted to go for a simpler
REINDEX, especially if you believe that the SELECTs won't be blocked.

Agreed.

There are many factors to take into account, one is more expensive
than the other in terms of resources and has downsides, downsides
compensated by the reduction in the lock requirements. There are
cases where REINDEX is a must-have, as CONCURRENTLY does not support
catalog indexes, and these tend to be easily noticed when corruption
spreads around.
--
Michael

#8Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Michael Paquier (#7)
Re: REINDEX blocks virtually any queries but some prepared queries.

On 4/8/22 02:22, Michael Paquier wrote:

On Thu, Apr 07, 2022 at 05:29:36PM +0200, Guillaume Lelarge a écrit :

Le jeu. 7 avr. 2022 à 15:44, Frédéric Yhuel <frederic.yhuel@dalibo.com> a écrit :

On 4/7/22 14:40, Justin Pryzby wrote:
Thank you Justin! I applied your fixes in the v2 patch (attached).

v2 patch sounds good.

The location of the new sentence and its wording seem fine to me. So
no objections from me to add what's suggested, as suggested. I'll
wait for a couple of days first.

Thank you Michael.

Indeed ;) That being said, REINDEX CONCURRENTLY could give you an
invalid index, so sometimes you may be tempted to go for a simpler
REINDEX, especially if you believe that the SELECTs won't be blocked.

Agreed.

There are many factors to take into account, one is more expensive
than the other in terms of resources and has downsides, downsides
compensated by the reduction in the lock requirements. There are
cases where REINDEX is a must-have, as CONCURRENTLY does not support
catalog indexes, and these tend to be easily noticed when corruption
spreads around.

Indeed!

Best regards,
Frédéric

#9Michael Paquier
michael@paquier.xyz
In reply to: Frédéric Yhuel (#8)
Re: REINDEX blocks virtually any queries but some prepared queries.

On Fri, Apr 08, 2022 at 04:23:48PM +0200, Frédéric Yhuel wrote:

Thank you Michael.

And done as of 8ac700a.
--
Michael

#10Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Michael Paquier (#9)
Re: REINDEX blocks virtually any queries but some prepared queries.

On 4/11/22 02:57, Michael Paquier wrote:

On Fri, Apr 08, 2022 at 04:23:48PM +0200, Frédéric Yhuel wrote:

Thank you Michael.

And done as of 8ac700a.
--

Thank you Micheal!

For reference purposes, we can see in the code of get_relation_info(),
in plancat.c, that indeed every index of the table are opened, and
therefore locked, regardless of the query.

Best regards,
Frédéric