BUG #17586: Look like there are an another bug in REINDEX INDEX CONCURRENTLY of pg_toast indexes
The following bug has been logged on the website:
Bug reference: 17586
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system: Ubuntu Linux
Description:
Details about the data structure
pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
partition of exports_new table (Partition key: RANGE (epoch))
During the reindex job there were quite a lot new inserts into
exports_new_2022_32w table via partition head.
2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
GROUP BY 1,2,3)
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/1.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/0.0", size 8192
2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/2.0", size 252370944
2022-08-11 02:47:28.965 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/1.0", size 253599744
2022-08-11 02:47:29.003 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.0.sharedfileset/0.0", size 251535360
2022-08-11 02:47:29.041 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] STATEMENT: REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index
2022-08-11 02:47:30.023 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545272 txid:354389107] [INSERT] LOG: duration: 1991.666 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
GROUP BY 1,2,3)
2022-08-11 02:47:55.962 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545274 txid:354389164] [INSERT] LOG: duration: 4027.248 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
GROUP BY 1,2,3)
2022-08-11 02:48:02.883 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545275 txid:354389277] [INSERT] LOG: duration: 1055.236 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
GROUP BY 1,2,3)
2022-08-11 02:48:24.914 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545278 txid:354389390] [INSERT] LOG: duration: 2020.759 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
GROUP BY 1,2,3)
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for toast
value 2546878653 in pg_toast_2502234687
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
$3
2022-08-11 02:48:49.651 UTC 1187482 postgres@***_exports from [local]
[vxid:14/0 txid:0] [REINDEX] LOG: duration: 187669.153 ms statement:
REINDEX INDEX CONCURRENTLY pg_toast.pg_toast_2502234687_index
2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478398 txid:0] [SELECT] ERROR: missing chunk number 0 for toast
value 2546878502 in pg_toast_2502234687
2022-08-11 02:48:50.048 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478398 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
$3
and so on...
A follow-up secondary manual REINDEX INDEX CONCURRENTLY
pg_toast.pg_toast_2502234687_index resolved the problem with damaged
toast.
Problems do not always happen (rarely) and I cannot build a quick reproducer
so far.
Kind Regards,
Maxim
Hi,
On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
Bug reference: 17586
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system: Ubuntu Linux
Description:Details about the data structure
pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
partition of exports_new table (Partition key: RANGE (epoch))During the reindex job there were quite a lot new inserts into
exports_new_2022_32w table via partition head.
2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545267 txid:354388805] [INSERT] LOG: �duration: 2711.995 ms
�statement: INSERT INTO exports_new(field_board_id, field_id_mod_256, epoch,
field_ids)
� � � � � � � � (SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROM temp_export_ids)
t
� � � � � � � � GROUP BY 1,2,3)2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: �temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192
[vxid:10/63478386 txid:0] [SELECT] ERROR: �missing chunk number 0 for toast
value 2546878653 in pg_toast_2502234687
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from [local]
[vxid:10/63478386 txid:0] [SELECT] STATEMENT: �SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 AND epoch =
$3
Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the time?
This issue could be due to the xmin horizon going backward, rather than a bug
in reindex concurrently itself.
Too bad we don't have an errcontext set up for this, it'd be quite helpful to
know what row the missing toast value is in, so we could check its visibility.
Greetings,
Andres Freund
On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
Bug reference: 17586
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system: Ubuntu Linux
Description:Details about the data structure
pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
partition of exports_new table (Partition key: RANGE (epoch))During the reindex job there were quite a lot new inserts into
exports_new_2022_32w table via partition head.2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256,epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROMtemp_export_ids)
t
GROUP BY 1,2,3)2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192[vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for
toast
value 2546878653 in pg_toast_2502234687
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from[local]
[vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 ANDepoch =
$3
Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the
time?This issue could be due to the xmin horizon going backward, rather than a
bug
in reindex concurrently itself.Too bad we don't have an errcontext set up for this, it'd be quite helpful
to
know what row the missing toast value is in, so we could check its
visibility.
Hi,
hot_standby_feedback enabled replica - yes (but with minimal activity)
use replication slots - no and no other apps with replication slots
Did it reconnect at the time - most definitely no, pg_stat_replication
shows
backend_start | 2022-06-30 07:36:51.095073+00
for replication connection.
Also I have log_replication_commands enabled and nothing about replication
logged during the last week.
--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно
когда я так делаю ещё раз?"
Hi,
On 2022-08-16 10:34:14 +0200, Maxim Boguk wrote:
On Tue, Aug 16, 2022 at 1:01 AM Andres Freund <andres@anarazel.de> wrote:
On 2022-08-15 11:36:52 +0000, PG Bug reporting form wrote:
Bug reference: 17586
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system: Ubuntu Linux
Description:Details about the data structure
pg_toast.pg_toast_2502234687_index is index toast of exports_new_2022_32w
partition of exports_new table (Partition key: RANGE (epoch))During the reindex job there were quite a lot new inserts into
exports_new_2022_32w table via partition head.2022-08-11 02:46:16.972 UTC 1175180 ***_push@***_exports from [local]
[vxid:12/63545267 txid:354388805] [INSERT] LOG: duration: 2711.995 ms
statement: INSERT INTO exports_new(field_board_id, field_id_mod_256,epoch,
field_ids)
(SELECT field_board_id, (field_id % 256)::int
field_id_mod_256, epoch, ARRAY_AGG(field_id) AS field_ids FROM (SELECT
field_board_id, epoch, unnest(field_ids) AS field_id FROMtemp_export_ids)
t
GROUP BY 1,2,3)2022-08-11 02:47:28.927 UTC 1187482 postgres@***_exports from [local]
[vxid:14/66875173 txid:0] [REINDEX] LOG: temporary file: path
"base/pgsql_tmp/pgsql_tmp1187482.1.sharedfileset/2.0", size 8192[vxid:10/63478386 txid:0] [SELECT] ERROR: missing chunk number 0 for
toast
value 2546878653 in pg_toast_2502234687
2022-08-11 02:48:49.512 UTC 1184504 ***_field_sync@***_exports from[local]
[vxid:10/63478386 txid:0] [SELECT] STATEMENT: SELECT field_ids FROM
exports_new WHERE field_board_id = $1 AND field_id_mod_256 = $2 ANDepoch =
$3
Hm. What is the setup of this system? Do you have a hot_standby_feedback
enabled replica? Does it use replication slots? Did it reconnect at the
time?This issue could be due to the xmin horizon going backward, rather than a
bug
in reindex concurrently itself.Too bad we don't have an errcontext set up for this, it'd be quite helpful
to
know what row the missing toast value is in, so we could check its
visibility.Hi,
hot_standby_feedback enabled replica - yes (but with minimal activity)
use replication slots - no and no other apps with replication slots
Did it reconnect at the time - most definitely no, pg_stat_replication
shows
backend_start | 2022-06-30 07:36:51.095073+00
for replication connection.
Also I have log_replication_commands enabled and nothing about replication
logged during the last week.
Hm. How often does this occur? And could you run a server with a bit of added
debugging? Otherwise I don't see how we can narrow this down, unfortunately.
Greetings,
Andres Freund