BUG #18016: REINDEX TABLE failure
The following bug has been logged on the website:
Bug reference: 18016
Logged by: Richard Vesely
Email address: richard.vesely@softea.sk
PostgreSQL version: 15.3
Operating system: Windows 10 Enterprise 22H2
Description:
Hi,
Given a table with a TOASTed variable length attribute, REINDEX TABLE fails
to rebuild indexes when you truncate (or otherwise corrupt) relation files
for both TOAST table index and a custom index on the varlena.
Here's an error from server log with log_error_verbosity set to verbose:
ERROR: XX001: could not read block 0 in file "base/[datoid]/[relfilenode]":
read only 0 of 8192 bytes
LOCATION: mdread, md.c:724
STATEMENT: reindex table t1
However, when you perform a manual reindex in the correct order - REINDEX
INDEX pg_toast.pg_toast_oid_index and then REINDEX INDEX t1_column1_idx it
works as expected. REINDEX TABLE should ensure that the TOAST index is
rebuilt first before rebuilding an index on (potentially) TOASTed values. In
this particular example when you REINDEX TOAST index first and then run the
full REINDEX TABLE you can see that it always rebuilds the custom index
first based on relation file nodes.
Best regards,
Richard Veselý
Here's a minimal repro dump:
--
-- PostgreSQL database dump
--
-- Dumped from database version 15.3
-- Dumped by pg_dump version 15.3
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: bug_report; Type: DATABASE; Schema: -; Owner: postgres
--
CREATE DATABASE bug_report WITH TEMPLATE = template0 ENCODING = 'UTF8'
LOCALE_PROVIDER = libc LOCALE = 'en_US.UTF-8';
ALTER DATABASE bug_report OWNER TO postgres;
\connect bug_report
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;
--
-- Name: public; Type: SCHEMA; Schema: -; Owner: postgres
--
-- *not* creating schema, since initdb creates it
ALTER SCHEMA public OWNER TO postgres;
SET default_tablespace = '';
SET default_table_access_method = heap;
--
-- Name: t1; Type: TABLE; Schema: public; Owner: postgres
--
CREATE TABLE public.t1 (
column1 text
);
ALTER TABLE public.t1 OWNER TO postgres;
--
-- Data for Name: t1; Type: TABLE DATA; Schema: public; Owner: postgres
--
COPY public.t1 (column1) FROM stdin;
vkifpbzxdplzkizpaugzhlejhqmvgwmlhqlgofbvoaiowqohnmxaldkyoawdrpttppkxfratkgeyxogzdvihkssbpyvgbnbhgaezhhgyehqcduakvrahnauymfuqznthijohfbbuzitrpifmqkezjbujngzsijsquskztqypdkienyhytyergfbibasksgntabxgzgrmhtzrukjuykaqfrksqcswwbsmlmdfrpovbdlvcaofztwasbfzwyoeklbnacgtdrwjfvdpdccnyetkohmtgwdkzlnofyccxgrbojcjnruvwlbwbpxyzubwqjmfnzvzkjsdgozewauqlbmckpxztuidtdfpvbhizlbrezvkndjcodbjabxggywtqpsofdtsfyspjscrmghbbpxhuvqvxpgwfdvhhcvekncudhzbtotqxxzixoqnybzpnhvgnhdlcbctyitiqdilwuensusfcfelojvzhgrefyrqohdqiaewddpharcwipjyyijudozpkomgsstqbarykbuoxgnmjwcvkufidiozxccwtfzatxyztjmeihlzyafdafqbkkqqekasgfllfcdaelwsecayspnspvofkelkxfytrwfccuynwjlafelgnuggvejoiketoeqpxtofivpxeqahxnhdkhfwdbytqlfulogxdpjbbtioelkuxywcdvknjbllmyvuckduywllkljfpoxiwgunwjwoiokenfygsduokepxjetyjjzbnxqbvsdbrpefdlghluynoqsxkfrttsibjkdtforzhmhazyzoaanvstmqafsuynrvmknivmcvcqlwxmdgjnhuivxzwjefszyrkzmvleskghrknohfyntnsovqiquojnrzsusyvjfcogtdgrlbyemggllpyvqxclqqcmwcvrvtejmiinlmqfcznszledlavaqwnugijgevehlrydlrlluqmepaqyqlhpyxeuryqwauyfaoifsxsxxxemgidmzxzjpoecapyubvprnzlgvrlidotzluaodlwrrphgxfpcsskkaxguwajcytusnpbudvuvdjqzujgdlqnoksainpdwcfdwizvpgnhysunadzaizywtzgydpgumfedoqbhdlqynufivmqyihkfqnvavofgojzjrzpfhmqqgxqmmhkyvsloegljgjglkywqfjqcwawigxhlbmztzytlqlheghhhykttjvbqkdnuuiajqvpihyrwjnlihglgxebhalthpizkrccgnxkwfxjsjrpcsitmdounnbxoeoomstbykypoflitwvirpwdrdvrtwkqwbqlsqxkvogdsdkwffvvzalibtgtkbcmqjcpvlwpubdhykqsrqwzmaqbwndmvribafoyizgbpbavvvtivkcofijaubtpmzfgauvrgfqjlsksdtfaaimfnurstbfikildbcdfzbwzqicjwewrxzppneyrlhsrdaprgmaofulgcffstvikvwvkmprddflkudytkrlccrkivvzwvmsyeigowqoqkidzcetlnfaxlpyalzennzgexiaqduzffijgsbhshyaiephqviluzzjdfgjjgkphdkamlwzppqpvpjbgnjnmvmgyrqubvsgpivstqbydtbpakripvsvnuqwwgngwdoeeichpljrnqstcdeobubjcudjizrgxjfmcvghrlhvjseinrfkmeqhrcullxildvkcjcbozpsowddwdqusclysmaasmcgruosqqjcjurtqhnnigvpviuhwroydcxhasvqwcgeauiawnqyreaoikhbaymizkanzjyrbtftiddryylqxfhmzomlqkcqkgrapqgiiylahganeibkzahxitcwswgpqmvnlgyuxywoaqqlbqdpfexlpzpzlpucwgqxfraqwqmvwhuojbmpngdhenplmkomgwmnplwnfnlgmejgyoapkjmyvsolpiqlebfumcywfxvbgshaakujitbbgrvtqxvsfvapuejebqoknhaefyeebmlqvoifjvlnosxkvk
\.
--
-- Name: t1_column1_idx; Type: INDEX; Schema: public; Owner: postgres
--
CREATE INDEX t1_column1_idx ON public.t1 USING btree (column1);
--
-- Name: SCHEMA public; Type: ACL; Schema: -; Owner: postgres
--
REVOKE USAGE ON SCHEMA public FROM PUBLIC;
GRANT ALL ON SCHEMA public TO PUBLIC;
--
-- PostgreSQL database dump complete
--
On Thu, Jul 06, 2023 at 08:29:19PM +0000, PG Bug reporting form wrote:
Given a table with a TOASTed variable length attribute, REINDEX TABLE fails
to rebuild indexes when you truncate (or otherwise corrupt) relation files
for both TOAST table index and a custom index on the varlena.
Could you clarify what you have done here? Did you manipulate the
physical files in the data folder before running the REINDEX commands
you expected should work? There are many things that can go wrong if
you do anything like that.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Thu, Jul 06, 2023 at 08:29:19PM +0000, PG Bug reporting form wrote:
Given a table with a TOASTed variable length attribute, REINDEX TABLE fails
to rebuild indexes when you truncate (or otherwise corrupt) relation files
for both TOAST table index and a custom index on the varlena.
Could you clarify what you have done here? Did you manipulate the
physical files in the data folder before running the REINDEX commands
you expected should work? There are many things that can go wrong if
you do anything like that.
I think the point of that was just to have a way to reproduce the problem
on-demand. I follow the argument, which is that if there's actual
corruption in the TOAST index (for whatever reason) that might interfere
with rebuilding the table's other indexes.
regards, tom lane
On Fri, Jul 7, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael@paquier.xyz> writes:
On Thu, Jul 06, 2023 at 08:29:19PM +0000, PG Bug reporting form wrote:
Given a table with a TOASTed variable length attribute, REINDEX TABLE fails
to rebuild indexes when you truncate (or otherwise corrupt) relation files
for both TOAST table index and a custom index on the varlena.Could you clarify what you have done here? Did you manipulate the
physical files in the data folder before running the REINDEX commands
you expected should work? There are many things that can go wrong if
you do anything like that.I think the point of that was just to have a way to reproduce the problem
on-demand. I follow the argument, which is that if there's actual
corruption in the TOAST index (for whatever reason) that might interfere
with rebuilding the table's other indexes.
That's my understanding, as well.
This shouldn't be treated as a bug, but as a desirable improvement in
REINDEX TABLE's behaviour. Stated another way, we want REINDEX TABLE
to reindex toast tables' indexes before attempting to reindex the
table's index.
Below [1]initdb ./db/data pg_ctl -D ./db/data -l db/server.log start psql -d postgres are the commands to create the test case and reproduce the error.
I am taking a look at this; I'd like to avoid duplicate work if
someone else is looking at it, too.
Preliminary reading of the code indicates that a simple rearrangement
of the code in reindex_relation() would be sufficient to get the
desired behaviour. The code towards the bottom in that function,
protected by `if ((flags & REINDEX_REL_PROCESS_TOAST ...)` needs to be
moved to just before the `foreach(indexId, indexIds)` loop.
The only downside I see so far with the proposed change is that the
toast tables are currently reindexed after table_close() call, but
after the proposed change they'll be reindexed before that call to
close_table(). But since close_table() does not release the ShareLock
on the table that is taken at the beginning of reindex_relation(), I
don't think we'll losing anything by the proposed rearrangement of
code.
[1]: initdb ./db/data pg_ctl -D ./db/data -l db/server.log start psql -d postgres
initdb ./db/data
pg_ctl -D ./db/data -l db/server.log start
psql -d postgres
create table t1(column1 text);
create index on t1 (column1);
insert into t1 select repeat('fsdfaf', 30000);
select oid, relfilenode, relname from pg_class
where oid >= (select oid from pg_class where relname = 't1');
// Generate command to corrupt toast table's index
select 'echo > db/data/base/'
|| (select oid from pg_database where datname = current_database())
|| '/'
|| (select relfilenode from pg_class
where relname = ('pg_toast_'
|| (select oid from pg_class where relname = 't1'))
|| '_index');
# Stop the database before inducing corruption; else the reindex command may
# use cached copies of toast index blocks and succeed
pg_ctl -D ./db/data stop
echo > db/data/base/5/16388
pg_ctl -D ./db/data -l db/server.log start
psql -d postgres
reindex table t1;
ERROR: could not read block 0 in file "base/5/16388": read only 1 of 8192 bytes
reindex index pg_toast.pg_toast_16384_index ;
//REINDEX
reindex table t1;
//REINDEX
Best regards,
Gurjeet
http://Gurje.et
On Sun, Jul 09, 2023 at 12:01:03AM -0700, Gurjeet Singh wrote:
Preliminary reading of the code indicates that a simple rearrangement
of the code in reindex_relation() would be sufficient to get the
desired behaviour. The code towards the bottom in that function,
protected by `if ((flags & REINDEX_REL_PROCESS_TOAST ...)` needs to be
moved to just before the `foreach(indexId, indexIds)` loop.
I guess that it should be OK to do that from the point where
reltoastrelid is known, when extracted the parent relation locked with
this ShareLock.
The only downside I see so far with the proposed change is that the
toast tables are currently reindexed after table_close() call, but
after the proposed change they'll be reindexed before that call to
close_table(). But since close_table() does not release the ShareLock
on the table that is taken at the beginning of reindex_relation(), I
don't think we'll losing anything by the proposed rearrangement of
code.
That should be OK, I assume. However, if this is improved and
something we want to support in the long-run I guess that a TAP test
may be appropriate.
--
Michael
Hi everyone,
Sorry for the delay, I was away from computer for a couple of days.
Tom is exactly right. I was just giving you a minimum number of steps to reproduce the issue. That being said, it is also a good idea to give you a bit of a background context and maybe start a broader discussion. However, I don't want to pollute a bug report with an unrelated topic so someone might suggest a more appropriate venue.
In my particular case, I didn't encounter some hardware failure that caused corruption of both TOAST table index and other dependent indexes, but instead I didn't have either of them in the first place (hence my suggestion to truncate them to accurately reproduce my exact setup). So in that sense Michael is also asking a legitimate question of how we got to where we are.
I was dissatisfied with storage layer performance, especially during the initial database population, so I rewrote it for my use case. I'm done with the heap, but for the moment I still rely on PostgreSQL to build indexes, specifically by using the REINDEX TABLE command for its convenience and that's how I discovered this problem with a couple of tables that had the required combination of indexes and data to trigger the original issue.
I won't derail this discussion any further, because some people downthread are already working on fixing/improving this scenario, but there's no shortage of people that suffer from sluggish pg_dump/pg_restore cycle and I imagine there are any number of people that would be interested in improving bulk ingestion which is often a bottleneck for analytical workloads as you are well aware. What's the best place to discuss this topic further - pgsql-performance or someplace else?
Best,
Richard
-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, July 8, 2023 2:20 AM
To: Michael Paquier <michael@paquier.xyz>
Cc: Richard Veselý <richard.vesely@softea.sk>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #18016: REINDEX TABLE failure
Michael Paquier <michael@paquier.xyz> writes:
On Thu, Jul 06, 2023 at 08:29:19PM +0000, PG Bug reporting form wrote:
Given a table with a TOASTed variable length attribute, REINDEX TABLE
fails to rebuild indexes when you truncate (or otherwise corrupt)
relation files for both TOAST table index and a custom index on the varlena.
Could you clarify what you have done here? Did you manipulate the
physical files in the data folder before running the REINDEX commands
you expected should work? There are many things that can go wrong if
you do anything like that.
I think the point of that was just to have a way to reproduce the problem on-demand. I follow the argument, which is that if there's actual corruption in the TOAST index (for whatever reason) that might interfere with rebuilding the table's other indexes.
regards, tom lane
Michael Paquier <michael@paquier.xyz> writes:
That should be OK, I assume. However, if this is improved and
something we want to support in the long-run I guess that a TAP test
may be appropriate.
I do not see the point of a TAP test. It's not like the code isn't
covered perfectly well.
regards, tom lane
On Sun, Jul 9, 2023 at 7:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael@paquier.xyz> writes:
That should be OK, I assume. However, if this is improved and
something we want to support in the long-run I guess that a TAP test
may be appropriate.I do not see the point of a TAP test. It's not like the code isn't
covered perfectly well.
Please find attached the patch that makes REINDEX TABLE perform
reindex on toast table before reindexing the main table's indexes.
The code block movement involved slightly more thought and care than I
had previously imagined. As explained in comments in the patch, the
enumeration and suppression of indexes on the main table must happen
before any CommandCounterIncrement() call, hence the
reindex-the-toast-table-if-any code had to be placed after that
enumeration.
In support of the argument above, the patch does not include any TAP
tests. Reliably reproducing the original error message involves
restarting the database, and since that can't be done via SQL
commands, no sql tests are included, either.
The patch also includes minor wordsmithing, and benign whitespace
changes in neighboring code.
Best regards,
Gurjeet
http://Gurje.et
Attachments:
v1-0001-Reindex-toast-table-s-index-before-main-table-s-i.patchapplication/octet-stream; name=v1-0001-Reindex-toast-table-s-index-before-main-table-s-i.patchDownload
From 4bfca93755e77c91ed9f309acdc6ffecae32ab4d Mon Sep 17 00:00:00 2001
From: Gurjeet Singh <gurjeet@singh.im>
Date: Mon, 10 Jul 2023 09:10:43 -0700
Subject: [PATCH v1] Reindex toast table's index before main table's indexes
Doing so helps in cases where a corruption in toast table's index would
otherwise prevent reindexing of main table's indexes.
While at it, perform minor wordsmithing, and benign whitespace changes
in the neighboring code.
---
src/backend/catalog/index.c | 54 ++++++++++++++++++++-----------------
1 file changed, 30 insertions(+), 24 deletions(-)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 67b743e251..b6230254a5 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3878,7 +3878,7 @@ reindex_relation(Oid relid, int flags, ReindexParams *params)
Oid toast_relid;
List *indexIds;
char persistence;
- bool result;
+ bool result = false;
ListCell *indexId;
int i;
@@ -3908,7 +3908,7 @@ reindex_relation(Oid relid, int flags, ReindexParams *params)
toast_relid = rel->rd_rel->reltoastrelid;
/*
- * Get the list of index OIDs for this relation. (We trust to the
+ * Get the list of index OIDs for this relation. (We trust the
* relcache to get this with a sequential scan if ignoring system
* indexes.)
*/
@@ -3926,6 +3926,31 @@ reindex_relation(Oid relid, int flags, ReindexParams *params)
CommandCounterIncrement();
}
+ /*
+ * Reindex the secondary toast table, if any, before the indexes on main
+ * table. This helps in cases where a corruption in toast table's index
+ * would otherwise prevent reindexing of main table's indexes.
+ *
+ * This should be done after the suppression of the use of indexes (above),
+ * because the recursive call to reindex_relation() below will invoke
+ * CommandCounterIncrement(), which may prevent enumeration of the indexes
+ * on the table.
+ */
+ if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
+ {
+ /*
+ * Note that this should fail if the toast relation is missing, so
+ * reset REINDEXOPT_MISSING_OK. Even if a new tablespace is set for
+ * the parent relation, the indexes on its toast table are not moved.
+ * This rule is enforced by setting tablespaceOid to InvalidOid.
+ */
+ ReindexParams newparams = *params;
+
+ newparams.options &= ~(REINDEXOPT_MISSING_OK);
+ newparams.tablespaceOid = InvalidOid;
+ result |= reindex_relation(toast_relid, flags, &newparams);
+ }
+
/*
* Compute persistence of indexes: same as that of owning rel, unless
* caller specified otherwise.
@@ -3941,8 +3966,8 @@ reindex_relation(Oid relid, int flags, ReindexParams *params)
i = 1;
foreach(indexId, indexIds)
{
- Oid indexOid = lfirst_oid(indexId);
- Oid indexNamespaceId = get_rel_namespace(indexOid);
+ Oid indexOid = lfirst_oid(indexId);
+ Oid indexNamespaceId = get_rel_namespace(indexOid);
/*
* Skip any invalid indexes on a TOAST table. These can only be
@@ -3979,26 +4004,7 @@ reindex_relation(Oid relid, int flags, ReindexParams *params)
*/
table_close(rel, NoLock);
- result = (indexIds != NIL);
-
- /*
- * If the relation has a secondary toast rel, reindex that too while we
- * still hold the lock on the main table.
- */
- if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
- {
- /*
- * Note that this should fail if the toast relation is missing, so
- * reset REINDEXOPT_MISSING_OK. Even if a new tablespace is set for
- * the parent relation, the indexes on its toast table are not moved.
- * This rule is enforced by setting tablespaceOid to InvalidOid.
- */
- ReindexParams newparams = *params;
-
- newparams.options &= ~(REINDEXOPT_MISSING_OK);
- newparams.tablespaceOid = InvalidOid;
- result |= reindex_relation(toast_relid, flags, &newparams);
- }
+ result |= (indexIds != NIL);
return result;
}
--
2.35.1
On Sun, Jul 9, 2023 at 7:21 AM Richard Veselý <richard.vesely@softea.sk> wrote:
... there's no shortage of people that suffer from sluggish pg_dump/pg_restore cycle and I imagine there are any number of people that would be interested in improving bulk ingestion which is often a bottleneck for analytical workloads as you are well aware. What's the best place to discuss this topic further - pgsql-performance or someplace else?
(moved conversation to -hackers, and moved -bugs to BCC)
I was dissatisfied with storage layer performance, especially during the initial database population, so I rewrote it for my use case. I'm done with the heap, but for the moment I still rely on PostgreSQL to build indexes,
It sounds like you've developed a method to speed up loading of
tables, and might have ideas/suggestions for speeding up CREATE
INDEX/REINDEX. The -hackers list feels like a place to discuss such
changes.
Best regards,
Gurjeet
http://Gurje.et
Hi Gurjeet,
Thank you for the follow-up. I was worried my message got buried in the middle of the thread. I also appreciate your work on the patch to fix/improve the REINDEX TABLE behavior even though most people would never encounter it in the wild.
As a preface I would first like to say that I can appreciate the emphasis on general maintainability of the codebase, trying to avoid having some overly clever hacks that might impede understanding, having ideally one way of doing things like having a common database page structure, etc. The more one keeps to this "happy" path the better the general state of the project end up by keeping it accessible to the rest of the community and attracting more contributions in turn.
That being said, PostgreSQL can be extremely conservative in scenarios where it might not be warranted while giving a limited opportunity to influence said behavior. This often leads to a very low hardware resource utilization. You can easily find many instances across StackOverflow, dba.stackexchange.com, /r/postgres and pgsql-performance where people run into ingress/egress bottlenecks even though their hardware can trivially support much larger workload.
In my experience, you can be very hard-pressed in many cases to saturate even a modest enterprise HDD while observing the official guidelines (https://www.postgresql.org/docs/current/populate.html), e.g. minimal WAL and host of other configuration optimizations, having no indexes and constraints and creating table and filling it with binary COPY within the same transaction. And the situation with pg_dump/pg_restore is often much worse.
Is there an interest in improving the current state of affairs? I will be rewriting the indexing first to get the whole picture, but I can already tell you that there is a -lot- of performance left on the table even considering the effort to improve COPY performance in PostgreSQL 16. Given sufficient hardware, you should always be heavily IO-bound without exception and saturate any reasonable number of NVMe SSDs.
Best regards,
Richard
-----Original Message-----
From: Gurjeet Singh <gurjeet@singh.im>
Sent: Monday, July 10, 2023 6:44 PM
To: Richard Veselý <richard.vesely@softea.sk>; Postgres Hackers <pgsql-hackers@postgresql.org>
Cc: Tom Lane <tgl@sss.pgh.pa.us>; Michael Paquier <michael@paquier.xyz>
Subject: Re: BUG #18016: REINDEX TABLE failure
On Sun, Jul 9, 2023 at 7:21 AM Richard Veselý <richard.vesely@softea.sk> wrote:
... there's no shortage of people that suffer from sluggish pg_dump/pg_restore cycle and I imagine there are any number of people that would be interested in improving bulk ingestion which is often a bottleneck for analytical workloads as you are well aware. What's the best place to discuss this topic further - pgsql-performance or someplace else?
(moved conversation to -hackers, and moved -bugs to BCC)
I was dissatisfied with storage layer performance, especially during
the initial database population, so I rewrote it for my use case. I'm
done with the heap, but for the moment I still rely on PostgreSQL to
build indexes,
It sounds like you've developed a method to speed up loading of tables, and might have ideas/suggestions for speeding up CREATE INDEX/REINDEX. The -hackers list feels like a place to discuss such changes.
Best regards,
Gurjeet
http://Gurje.et
On Mon, Jul 10, 2023 at 09:35:05AM -0700, Gurjeet Singh wrote:
The code block movement involved slightly more thought and care than I
had previously imagined. As explained in comments in the patch, the
enumeration and suppression of indexes on the main table must happen
before any CommandCounterIncrement() call, hence the
reindex-the-toast-table-if-any code had to be placed after that
enumeration.
Do we need to add another CCI after reindexing the TOAST table? It looks
like we presently do so between reindexing each relation, including the
TOAST table.
+ * This should be done after the suppression of the use of indexes (above),
+ * because the recursive call to reindex_relation() below will invoke
+ * CommandCounterIncrement(), which may prevent enumeration of the indexes
+ * on the table.
I'm not following this. We've already obtained the list of index OIDs
before this point. Does this create problems when we try to open and lock
the relations? And if so, how?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Import Notes
Reply to msg id not found: 20230726175000.GA3259285@nathanxps13
(Re-sending with -hackers list removed, to avoid message being held
for moderation)
---------- Forwarded message ---------
From: Gurjeet Singh <gurjeet@singh.im>
Date: Wed, Jul 26, 2023 at 2:53 PM
On Wed, Jul 26, 2023 at 10:50 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
On Mon, Jul 10, 2023 at 09:35:05AM -0700, Gurjeet Singh wrote:
The code block movement involved slightly more thought and care than I
had previously imagined. As explained in comments in the patch, the
enumeration and suppression of indexes on the main table must happen
before any CommandCounterIncrement() call, hence the
reindex-the-toast-table-if-any code had to be placed after that
enumeration.Do we need to add another CCI after reindexing the TOAST table? It looks
like we presently do so between reindexing each relation, including the
TOAST table.
Yes, we do need to do a CCI after reindex the relation's toast table.
But note that it is done by the recursive call to reindex_relation(),
right after it calls reindex_index().
+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.I'm not following this. We've already obtained the list of index OIDs
before this point. Does this create problems when we try to open and lock
the relations? And if so, how?
This comment is calling out the fact that there's a recursive call to
reindex_relation() inside the 'if' code block, and that that
reindex_relation() calls CCI. Hence this 'if' code block should _not_
be placed before the the calls to RelationGetIndexList() and
SetReindexPending(). Because if we do, then the CCI done by
reindex_relation() will impact what RelationGetIndexList() sees.
This is to match the expectation set for the
REINDEX_REL_SUPPRESS_INDEX_USE flag.
* REINDEX_REL_SUPPRESS_INDEX_USE: if true, the relation was just completely
...
* ... The caller is required to call us *without*
* having made the rebuilt table visible by doing CommandCounterIncrement;
* we'll do CCI after having collected the index list. (This way we can still
* use catalog indexes while collecting the list.)
I hope that makes sense.
Best regards,
Gurjeet
http://Gurje.et
Import Notes
Reply to msg id not found: CABwTF4U+t-8sD2y2xhgHyjovPMbrsNJO5F49nvZZtpwFkwWDA@mail.gmail.com
(Re-sending with -hackers list removed, to avoid message getting held
for moderation)
---------- Forwarded message ---------
From: Gurjeet Singh <gurjeet@singh.im>
Date: Wed, Jul 26, 2023 at 4:01 PM
On Wed, Jul 26, 2023 at 2:53 PM Gurjeet Singh <gurjeet@singh.im> wrote:
On Wed, Jul 26, 2023 at 10:50 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:On Mon, Jul 10, 2023 at 09:35:05AM -0700, Gurjeet Singh wrote:
+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.I'm not following this. We've already obtained the list of index OIDs
before this point. Does this create problems when we try to open and lock
the relations? And if so, how?This comment is calling out the fact that there's a recursive call to
reindex_relation() inside the 'if' code block, and that that
reindex_relation() calls CCI. Hence this 'if' code block should _not_
be placed before the the calls to RelationGetIndexList() and
SetReindexPending(). Because if we do, then the CCI done by
reindex_relation() will impact what RelationGetIndexList() sees.This is to match the expectation set for the
REINDEX_REL_SUPPRESS_INDEX_USE flag.
Given that the issue is already explained by the flag's comments above
the function, this comment paragraph in the patch may be considered
extraneous. Feel free to remove it, if you think so.
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().
Best regards,
Gurjeet
http://Gurje.et
Import Notes
Reply to msg id not found: CABwTF4XyGzeEetn5AhsVTfY7zzBBcfjuMCfTLinOBWdbHbCWPw@mail.gmail.com
On Wed, Jul 26, 2023 at 06:42:14PM -0700, Gurjeet Singh wrote:
On Wed, Jul 26, 2023 at 10:50 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:On Mon, Jul 10, 2023 at 09:35:05AM -0700, Gurjeet Singh wrote:
The code block movement involved slightly more thought and care than I
had previously imagined. As explained in comments in the patch, the
enumeration and suppression of indexes on the main table must happen
before any CommandCounterIncrement() call, hence the
reindex-the-toast-table-if-any code had to be placed after that
enumeration.Do we need to add another CCI after reindexing the TOAST table? It looks
like we presently do so between reindexing each relation, including the
TOAST table.Yes, we do need to do a CCI after reindex the relation's toast table.
But note that it is done by the recursive call to reindex_relation(),
right after it calls reindex_index().
*facepalm*
Ah, I see it now.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
On Wed, Jul 26, 2023 at 2:53 PM Gurjeet Singh <gurjeet@singh.im> wrote:
On Wed, Jul 26, 2023 at 10:50 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:On Mon, Jul 10, 2023 at 09:35:05AM -0700, Gurjeet Singh wrote: + * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.I'm not following this. We've already obtained the list of index OIDs
before this point. Does this create problems when we try to open and lock
the relations? And if so, how?This comment is calling out the fact that there's a recursive call to
reindex_relation() inside the 'if' code block, and that that
reindex_relation() calls CCI. Hence this 'if' code block should _not_
be placed before the the calls to RelationGetIndexList() and
SetReindexPending(). Because if we do, then the CCI done by
reindex_relation() will impact what RelationGetIndexList() sees.This is to match the expectation set for the
REINDEX_REL_SUPPRESS_INDEX_USE flag.Given that the issue is already explained by the flag's comments above
the function, this comment paragraph in the patch may be considered
extraneous. Feel free to remove it, if you think so.I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().
I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Thu, Jul 27, 2023 at 04:14:41PM -0700, Nathan Bossart wrote:
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.
+ * This should be done after the suppression of the use of indexes (above),
+ * because the recursive call to reindex_relation() below will invoke
+ * CommandCounterIncrement(), which may prevent enumeration of the indexes
+ * on the table.
This does not explain the reason why this would prevent the creation
of a consistent index list fetched from the parent table, does it?
Would some indexes be missing from what should be reindexed? Or some
added unnecessarily? Would that be that an incorrect list?
--
Michael
On Fri, Jul 28, 2023 at 10:50:50AM +0900, Michael Paquier wrote:
On Thu, Jul 27, 2023 at 04:14:41PM -0700, Nathan Bossart wrote:
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.This does not explain the reason why this would prevent the creation
of a consistent index list fetched from the parent table, does it?
Would some indexes be missing from what should be reindexed? Or some
added unnecessarily? Would that be that an incorrect list?
IIUC the issue is that something (e.g., VACUUM FULL, CLUSTER) might've just
rebuilt the heap, so if we CCI'd before gathering the list of indexes, the
new heap contents would become visible, and the indexes would be
inconsistent with the heap. This is a problem when the relation in
question is a system catalog that needs to be consulted to gather the list
of indexes. To handle this, we avoid the CCI until after gathering the
indexes so that the old heap contents appear valid and can be used as
needed. Once that is done, we mark the indexes as pending-rebuild and do a
CCI, at which point the indexes become inconsistent with the heap. This
behavior appears to have been added by commit b9b8831.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Jul 28, 2023 at 11:00:56AM -0700, Nathan Bossart wrote:
On Fri, Jul 28, 2023 at 10:50:50AM +0900, Michael Paquier wrote:
On Thu, Jul 27, 2023 at 04:14:41PM -0700, Nathan Bossart wrote:
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.This does not explain the reason why this would prevent the creation
of a consistent index list fetched from the parent table, does it?
Would some indexes be missing from what should be reindexed? Or some
added unnecessarily? Would that be that an incorrect list?IIUC the issue is that something (e.g., VACUUM FULL, CLUSTER) might've just
rebuilt the heap, so if we CCI'd before gathering the list of indexes, the
new heap contents would become visible, and the indexes would be
inconsistent with the heap. This is a problem when the relation in
question is a system catalog that needs to be consulted to gather the list
of indexes. To handle this, we avoid the CCI until after gathering the
indexes so that the old heap contents appear valid and can be used as
needed. Once that is done, we mark the indexes as pending-rebuild and do a
CCI, at which point the indexes become inconsistent with the heap. This
behavior appears to have been added by commit b9b8831.
How do we move this one forward? Michael and I provided some feedback
about the comment, but AFAICT this patch is in good shape otherwise.
Gurjeet, would you mind putting together a new version of the patch so that
we can close on this one?
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
On Fri, Sep 1, 2023 at 9:55 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Fri, Jul 28, 2023 at 11:00:56AM -0700, Nathan Bossart wrote:
On Fri, Jul 28, 2023 at 10:50:50AM +0900, Michael Paquier wrote:
On Thu, Jul 27, 2023 at 04:14:41PM -0700, Nathan Bossart wrote:
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.This does not explain the reason why this would prevent the creation
of a consistent index list fetched from the parent table, does it?
Would some indexes be missing from what should be reindexed? Or some
added unnecessarily? Would that be that an incorrect list?IIUC the issue is that something (e.g., VACUUM FULL, CLUSTER) might've just
rebuilt the heap, so if we CCI'd before gathering the list of indexes, the
new heap contents would become visible, and the indexes would be
inconsistent with the heap. This is a problem when the relation in
question is a system catalog that needs to be consulted to gather the list
of indexes. To handle this, we avoid the CCI until after gathering the
indexes so that the old heap contents appear valid and can be used as
needed. Once that is done, we mark the indexes as pending-rebuild and do a
CCI, at which point the indexes become inconsistent with the heap. This
behavior appears to have been added by commit b9b8831.How do we move this one forward? Michael and I provided some feedback
about the comment, but AFAICT this patch is in good shape otherwise.
Gurjeet, would you mind putting together a new version of the patch so that
we can close on this one?
Please see attached v2 of the patch; no code changes since v1, just
comments are changed to describe the reason for behaviour and the
placement of code.
I have tried to make the comment describe in more detail the condition
it's trying to avoid. I've also referenced the function comments, as
you suggested, so that the reader can get more context about why the
code is placed _after_ certain other code.
Hopefully the comments are sufficiently descriptive. If you or another
committer feels the need to change the comments any further, please
feel free to edit them as necessary.
Best regards,
Gurjeet
http://Gurje.et
Attachments:
v2-0001-Reindex-the-toast-table-if-any-before-the-main-ta.patchapplication/x-patch; name=v2-0001-Reindex-the-toast-table-if-any-before-the-main-ta.patchDownload
From 64d646105383e8f47394f4ee7e169ffde727da3c Mon Sep 17 00:00:00 2001
From: Gurjeet Singh <gurjeet@singh.im>
Date: Wed, 29 Nov 2023 13:17:39 -0800
Subject: [PATCH v2] Reindex the toast table before the main table
This helps in cases where a corruption in toast table's index would
otherwise error and stop REINDEX TABLE command when it tries to fetch a
toasted datum. This way toast table's index is rebuilt and fixed before
it is used for reindexing the main table.
---
src/backend/catalog/index.c | 58 ++++++++++++++++++++++---------------
1 file changed, 34 insertions(+), 24 deletions(-)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 143fae01eb..ea751299c2 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3871,7 +3871,7 @@ reindex_relation(Oid relid, int flags, const ReindexParams *params)
Oid toast_relid;
List *indexIds;
char persistence;
- bool result;
+ bool result = false;
ListCell *indexId;
int i;
@@ -3901,7 +3901,7 @@ reindex_relation(Oid relid, int flags, const ReindexParams *params)
toast_relid = rel->rd_rel->reltoastrelid;
/*
- * Get the list of index OIDs for this relation. (We trust to the
+ * Get the list of index OIDs for this relation. (We trust the
* relcache to get this with a sequential scan if ignoring system
* indexes.)
*/
@@ -3919,6 +3919,35 @@ reindex_relation(Oid relid, int flags, const ReindexParams *params)
CommandCounterIncrement();
}
+ /*
+ * Reindex the toast table, if any, before the main table.
+ *
+ * This helps in cases where a corruption in toast table's index would
+ * otherwise error and stop REINDEX TABLE command when it tries to fetch a
+ * toasted datum. This way toast table's index is rebuilt and fixed before
+ * it is used for reindexing the main table.
+ *
+ * We must call reindex_relation() _after_ the RelationGetIndexList(),
+ * because the reindex_relation() will call CommandCounterIncrement() after
+ * every reindex_index(). Performing CCI before the RelationGetIndexList()
+ * call may produce wrong results. Refer to the description of
+ * REINDEX_REL_SUPPRESS_INDEX_USE flag.
+ */
+ if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
+ {
+ /*
+ * Note that this should fail if the toast relation is missing, so
+ * reset REINDEXOPT_MISSING_OK. Even if a new tablespace is set for
+ * the parent relation, the indexes on its toast table are not moved.
+ * This rule is enforced by setting tablespaceOid to InvalidOid.
+ */
+ ReindexParams newparams = *params;
+
+ newparams.options &= ~(REINDEXOPT_MISSING_OK);
+ newparams.tablespaceOid = InvalidOid;
+ result |= reindex_relation(toast_relid, flags, &newparams);
+ }
+
/*
* Compute persistence of indexes: same as that of owning rel, unless
* caller specified otherwise.
@@ -3934,8 +3963,8 @@ reindex_relation(Oid relid, int flags, const ReindexParams *params)
i = 1;
foreach(indexId, indexIds)
{
- Oid indexOid = lfirst_oid(indexId);
- Oid indexNamespaceId = get_rel_namespace(indexOid);
+ Oid indexOid = lfirst_oid(indexId);
+ Oid indexNamespaceId = get_rel_namespace(indexOid);
/*
* Skip any invalid indexes on a TOAST table. These can only be
@@ -3972,26 +4001,7 @@ reindex_relation(Oid relid, int flags, const ReindexParams *params)
*/
table_close(rel, NoLock);
- result = (indexIds != NIL);
-
- /*
- * If the relation has a secondary toast rel, reindex that too while we
- * still hold the lock on the main table.
- */
- if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
- {
- /*
- * Note that this should fail if the toast relation is missing, so
- * reset REINDEXOPT_MISSING_OK. Even if a new tablespace is set for
- * the parent relation, the indexes on its toast table are not moved.
- * This rule is enforced by setting tablespaceOid to InvalidOid.
- */
- ReindexParams newparams = *params;
-
- newparams.options &= ~(REINDEXOPT_MISSING_OK);
- newparams.tablespaceOid = InvalidOid;
- result |= reindex_relation(toast_relid, flags, &newparams);
- }
+ result |= (indexIds != NIL);
return result;
}
--
2.34.1
On Thu, 30 Nov 2023 at 03:14, Gurjeet Singh <gurjeet@singh.im> wrote:
On Fri, Sep 1, 2023 at 9:55 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Fri, Jul 28, 2023 at 11:00:56AM -0700, Nathan Bossart wrote:
On Fri, Jul 28, 2023 at 10:50:50AM +0900, Michael Paquier wrote:
On Thu, Jul 27, 2023 at 04:14:41PM -0700, Nathan Bossart wrote:
On Wed, Jul 26, 2023 at 06:43:18PM -0700, Gurjeet Singh wrote:
I felt the need for that paragraph, because it doesn't feel obvious to
me as to why we can't simply reindex the toast table as the first
thing in this function; the toast table reindex will trigger CCI, and
that'd be bad if done before RelationGetIndexList().I see. I'd suggest referencing the comment above the function, but in
general I do think having a comment about this is appropriate.+ * This should be done after the suppression of the use of indexes (above), + * because the recursive call to reindex_relation() below will invoke + * CommandCounterIncrement(), which may prevent enumeration of the indexes + * on the table.This does not explain the reason why this would prevent the creation
of a consistent index list fetched from the parent table, does it?
Would some indexes be missing from what should be reindexed? Or some
added unnecessarily? Would that be that an incorrect list?IIUC the issue is that something (e.g., VACUUM FULL, CLUSTER) might've just
rebuilt the heap, so if we CCI'd before gathering the list of indexes, the
new heap contents would become visible, and the indexes would be
inconsistent with the heap. This is a problem when the relation in
question is a system catalog that needs to be consulted to gather the list
of indexes. To handle this, we avoid the CCI until after gathering the
indexes so that the old heap contents appear valid and can be used as
needed. Once that is done, we mark the indexes as pending-rebuild and do a
CCI, at which point the indexes become inconsistent with the heap. This
behavior appears to have been added by commit b9b8831.How do we move this one forward? Michael and I provided some feedback
about the comment, but AFAICT this patch is in good shape otherwise.
Gurjeet, would you mind putting together a new version of the patch so that
we can close on this one?Please see attached v2 of the patch; no code changes since v1, just
comments are changed to describe the reason for behaviour and the
placement of code.I have tried to make the comment describe in more detail the condition
it's trying to avoid. I've also referenced the function comments, as
you suggested, so that the reader can get more context about why the
code is placed _after_ certain other code.Hopefully the comments are sufficiently descriptive. If you or another
committer feels the need to change the comments any further, please
feel free to edit them as necessary.
CFBot shows that the patch does not apply anymore as in [1]http://cfbot.cputube.org/patch_46_4443.log:
=== Applying patches on top of PostgreSQL commit ID
06a66d87dbc7e06581af6765131ea250063fb4ac ===
=== applying patch
./v2-0001-Reindex-the-toast-table-if-any-before-the-main-ta.patch
patching file src/backend/catalog/index.c
...
Hunk #5 FAILED at 4001.
1 out of 5 hunks FAILED -- saving rejects to file
src/backend/catalog/index.c.rej
Please have a look and post an updated version.
[1]: http://cfbot.cputube.org/patch_46_4443.log
Regards,
Vignesh
On Fri, Jan 26, 2024 at 08:22:49AM +0530, vignesh C wrote:
Please have a look and post an updated version.
It happens that I am at the origin of both the conflicts when applying
the patch and the delay in handling it in the CF app as I was
registered as a committer for it while the entry was marked as RfC, so
thanks for the reminder. I have looked at it today with a fresh pair
of eyes, and reworked a bit the comments before applying it on HEAD.
--
Michael