BUG #16833: postgresql 13.1 process crash every hour
The following bug has been logged on the website:
Bug reference: 16833
Logged by: Alex F
Email address: phoedos16@gmail.com
PostgreSQL version: 13.1
Operating system: 4.14.209-160.339.amzn2.x86_64 #1 SMP Wed Dec 16 22
Description:
PostgreSQL runs inside the official Docker container
https://hub.docker.com/_/postgres
psql (PostgreSQL) 13.1 (Debian 13.1-1.pgdg100+1)
on Amazon Linux 2
The process crashes inside the Docker container 2-3 times per hour without any
additional information.
./postgresql-Thu-00.log:2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG: server process (PID 20071) was terminated by signal 11: Segmentation fault
./postgresql-Thu-01.log:2021-01-21 01:11:50 UTC [1]: user=,db=,app=,client= LOG: server process (PID 23827) was terminated by signal 11: Segmentation fault
./postgresql-Thu-02.log:2021-01-21 02:11:31 UTC [1]: user=,db=,app=,client= LOG: server process (PID 27974) was terminated by signal 11: Segmentation fault
./postgresql-Thu-03.log:2021-01-21 03:11:56 UTC [1]: user=,db=,app=,client= LOG: server process (PID 31389) was terminated by signal 11: Segmentation fault
./postgresql-Thu-04.log:2021-01-21 04:11:23 UTC [1]: user=,db=,app=,client= LOG: server process (PID 2544) was terminated by signal 11: Segmentation fault
./postgresql-Thu-05.log:2021-01-21 05:11:52 UTC [1]: user=,db=,app=,client= LOG: server process (PID 5962) was terminated by signal 11: Segmentation fault
./postgresql-Thu-06.log:2021-01-21 06:12:41 UTC [1]: user=,db=,app=,client= LOG: server process (PID 60) was terminated by signal 11: Segmentation fault
./postgresql-Thu-07.log:2021-01-21 07:12:59 UTC [1]: user=,db=,app=,client= LOG: server process (PID 3810) was terminated by signal 11: Segmentation fault
./postgresql-Thu-08.log:2021-01-21 08:12:37 UTC [1]: user=,db=,app=,client= LOG: server process (PID 7730) was terminated by signal 11: Segmentation fault
./postgresql-Thu-09.log:2021-01-21 09:13:26 UTC [1]: user=,db=,app=,client= LOG: server process (PID 11257) was terminated by signal 11: Segmentation fault
./postgresql-Thu-10.log:2021-01-21 10:13:05 UTC [1]: user=,db=,app=,client= LOG: server process (PID 14982) was terminated by signal 11: Segmentation fault
./postgresql-Thu-11.log:2021-01-21 11:13:18 UTC [1]: user=,db=,app=,client= LOG: server process (PID 18503) was terminated by signal 11: Segmentation fault
./postgresql-Thu-12.log:2021-01-21 12:13:04 UTC [1]: user=,db=,app=,client= LOG: server process (PID 22037) was terminated by signal 11: Segmentation fault
./postgresql-Thu-13.log:2021-01-21 13:13:57 UTC [1]: user=,db=,app=,client= LOG: server process (PID 25484) was terminated by signal 11: Segmentation fault
./postgresql-Thu-14.log:2021-01-21 14:13:15 UTC [1]: user=,db=,app=,client= LOG: server process (PID 29072) was terminated by signal 11: Segmentation fault
./postgresql-Thu-15.log:2021-01-21 15:13:22 UTC [1]: user=,db=,app=,client= LOG: server process (PID 32657) was terminated by signal 11: Segmentation fault
./postgresql-Thu-16.log:2021-01-21 16:13:15 UTC [1]: user=,db=,app=,client= LOG: server process (PID 3788) was terminated by signal 11: Segmentation fault
./postgresql-Thu-17.log:2021-01-21 17:13:26 UTC [1]: user=,db=,app=,client= LOG: server process (PID 7306) was terminated by signal 11: Segmentation fault
./postgresql-Thu-18.log:2021-01-21 18:12:58 UTC [1]: user=,db=,app=,client= LOG: server process (PID 11885) was terminated by signal 11: Segmentation fault
./postgresql-Thu-19.log:2021-01-21 19:13:21 UTC [1]: user=,db=,app=,client= LOG: server process (PID 14442) was terminated by signal 11: Segmentation fault
./postgresql-Thu-20.log:2021-01-21 20:13:36 UTC [1]: user=,db=,app=,client= LOG: server process (PID 17975) was terminated by signal 11: Segmentation fault
./postgresql-Thu-21.log:2021-01-21 21:13:55 UTC [1]: user=,db=,app=,client= LOG: server process (PID 21631) was terminated by signal 11: Segmentation fault
./postgresql-Thu-22.log:2021-01-21 22:14:45 UTC [1]: user=,db=,app=,client= LOG: server process (PID 25251) was terminated by signal 11: Segmentation fault
./postgresql-Thu-23.log:2021-01-21 23:15:01 UTC [1]: user=,db=,app=,client= LOG: server process (PID 28851) was terminated by signal 11: Segmentation fault
I can share postgresql.conf and process crash core dumps for analysis.
PG Bug reporting form <noreply@postgresql.org> writes:
> The process crashes inside the Docker container 2-3 times per hour without any additional information
> ./postgresql-Thu-00.log:2021-01-21 00:11:34 UTC [1]: user=,db=,app=,client= LOG: server process (PID 20071) was terminated by signal 11: Segmentation fault
Hm, please see if you can get a stack trace:
https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
Also try to figure out what query(s) are causing the crash.
(It's unlikely that the postmaster log doesn't provide more
information than you've shared here.)
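Following that wiki page, the basic recipe is roughly this (a sketch; the binary and core file paths are examples, not taken from this report):

```shell
# Allow server processes to write core files
ulimit -c unlimited

# After the next crash, open the core file against the matching binary
gdb -q /usr/local/pgsql/bin/postgres /usr/local/pgsql/data/core.12345

# At the (gdb) prompt, print the stack trace:
#   (gdb) bt
```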
> I can share postgresql.conf and process crash core dumps for analysis.
Core dumps are unlikely to help anyone else; they are too
machine-specific. Not to mention that they might contain
sensitive data. You'll need to examine them yourself.
regards, tom lane
Thank you, Tom, for the detailed instructions!
What I understand is that some specific query leads to database block
corruption, which causes the segfault.
I analyzed the action log before the segfault:
1. The database worked fine on pg13.0 for a few months.
2. The issue appeared right after the upgrade to the pg13.1 binaries.
3. (This actually was the wrong action) I rolled back to the pg13.0 binaries and
tried to start the database on them, and now I get an infinite segfault loop on startup:
LOG: database system was interrupted while in recovery at 2021-01-22 14:27:49 UTC
HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 49D/364DD528
LOG: startup process (PID 26) was terminated by signal 11: Segmentation fault
LOG: aborting startup due to startup process failure
LOG: database system is shut down
The most disappointing fact is that the standby node also got a corrupted WAL
block and is also unable to start.
So my only chance to recover the database is initdb + pg_dump.
Anyway, I will try to compile the pg13.1 binaries with --enable-debug and
enable logging of all queries. I hope this will help with the investigation.
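A debug-friendly build along those lines might look like this (a sketch; the install prefix is illustrative, while --enable-debug and -O0 -g are the usual knobs for symbol-rich binaries):

```shell
# Configure PostgreSQL with debug symbols and optimization disabled,
# so core dumps resolve to readable stack traces
./configure --prefix=/usr/local/pgsql --enable-debug CFLAGS='-O0 -g'
make -j"$(nproc)"
make install
```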
Thanks for your support!
On Fri, 22 Jan 2021 at 20:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hi,
On 2021-01-22 05:37:56 +0000, PG Bug reporting form wrote:
> PostgreSQL runs inside the official Docker container
> https://hub.docker.com/_/postgres
> psql (PostgreSQL) 13.1 (Debian 13.1-1.pgdg100+1)
> on Amazon Linux 2
> The process crashes inside the Docker container 2-3 times per hour without any additional information
What kind of resource limits have you set up with docker?
Are there any kernel messages?
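For example, checks along these lines (a sketch; the container name and grep filters are illustrative):

```shell
# Memory and shared-memory limits configured for the container
# (a Memory value of 0 means no limit was set)
docker inspect --format '{{.HostConfig.Memory}} {{.HostConfig.ShmSize}}' pg-container

# Recent kernel messages that often accompany segfaults or OOM kills
dmesg -T | grep -iE 'segfault|oom|postgres' | tail -n 20
```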
Greetings,
Andres Freund
Dear team, I previously wrote about a PostgreSQL crash (ref #16833).
Unfortunately the issue is still present. After reloading the databases from
backup, I see that the crash happens once every 1-2 weeks.
It's almost impossible to activate debugging, because the daily log size in
regular mode is > 100 GB/day, with 10+ million queries per day.
Together with my colleagues I think we were able to identify a crash case. This
is a test environment running pg13.2 with 10 databases inside (1 working
test server = 1 database).
Some tables inside the databases are constantly updated by a product-data
crawler. On top of these tables, for reporting purposes, we build concurrently
refreshed materialized views, which can take 9-25 minutes to build.
A log example for such a view build:
2021-05-03 06:14:55 UTC [6220]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip LOG: duration: 0.031 ms parse <unnamed>: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-03 06:14:55 UTC [6220]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip LOG: duration: 0.038 ms bind <unnamed>: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-03 06:16:21 UTC [6220]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip STATEMENT: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
...
2021-05-03 06:21:22 UTC [6220]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip STATEMENT: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-03 06:37:42 UTC [6220]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip LOG: duration: 1366910.618 ms execute <unnamed>: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
From this log excerpt you can see that process [6220] logs a STATEMENT entry
roughly every second until the job ends.
Now consider the crash case. The MATERIALIZED VIEW build started 18 min 12 s
before the database crash:
2021-05-10 06:45:58 UTC [28907]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip LOG: duration: 0.039 ms parse <unnamed>: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-10 06:45:58 UTC [28907]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip LOG: duration: 0.039 ms bind <unnamed>: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-10 06:47:06 UTC [28907]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip STATEMENT: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
...with the last logged entry for process [28907]:
2021-05-10 06:52:35 UTC [28907]: user=servlet,db=feature5042021,app=feature5-vertx,vertx-DataStore,client=vm_ip STATEMENT: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
...then process [28907] does not log anything anymore
...and the crash happened 11 min 35 s after the last [28907] log entry:
2021-05-10 07:04:10 UTC [1]: user=,db=,app=,client= LOG: server process (PID 28907) was terminated by signal 11: Segmentation fault
2021-05-10 07:04:10 UTC [1]: user=,db=,app=,client= DETAIL: Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_original_mv
2021-05-10 07:04:10 UTC [1]: user=,db=,app=,client= LOG: terminating any other active server processes
2021-05-10 07:04:10 UTC [32461]: user=servlet,db=staging,app=staging-servlet,servlet-PostgresClient,client=vm_ip1 WARNING: terminating connection because of crash of another server process
2021-05-10 07:04:10 UTC [32461]: user=servlet,db=staging,app=staging-servlet,servlet-PostgresClient,client=vm_ip1 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2021-05-10 07:04:10 UTC [32461]: user=servlet,db=staging,app=staging-servlet,servlet-PostgresClient,client=vm_ip1 HINT: In a moment you should be able to reconnect to the database and repeat your command.
One more important note: the production system runs on pg12.2 and has
absolutely the same project logic, with the same crawlers, table updates, and
views, but has worked stably for a long time without any crash.
The test system whose log examples are shown above runs on pg13.2 and crashes
every week.
Thanks for your support!
Alex F <phoedos16@gmail.com> writes:
> Dear team, I previously wrote about a PostgreSQL crash (ref #16833).
> Unfortunately the issue is still present.
You still haven't provided a stack trace ... there is not much we
can do with the amount of information here.
regards, tom lane
Dear Tom,
I think I was able to get a gdb stack trace. Logs attached.
PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1
20180712 (Red Hat 7.3.1-12), 64-bit
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= LOG: server process (PID 22273) was terminated by signal 11: Segmentation fault
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= DETAIL: Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_master_mv
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= LOG: terminating any other active server processes
sudo -u postgres gdb -q -c /usr/local/pgsql/data/core.postgres.22273.pg13_testdb.1620972642 /usr/local/pgsql/bin/postgres
Reading symbols from /usr/local/pgsql/bin/postgres...(no debugging symbols found)...done.
[New LWP 22273]
warning: Corrupted shared library list: 0x7ffa44ed0680 != 0x68000000000021f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: servlet staging 172.31.35.223(23762) REFRESH MATER'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007ffa43a2c94d in __memmove_avx_unaligned_erms () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.26-41.amzn2.x86_64 keyutils-libs-1.5.8-3.amzn2.0.2.x86_64 krb5-libs-1.15.1-37.amzn2.2.2.x86_64 libcom_err-1.42.9-19.amzn2.x86_64 libselinux-2.5-12.amzn2.0.2.x86_64 openssl-libs-1.0.2k-19.amzn2.0.6.x86_64 pcre-8.32-17.amzn2.0.2.x86_64 zlib-1.2.7-18.amzn2.x86_64
(gdb) bt
#0 0x00007ffa43a2c94d in __memmove_avx_unaligned_erms () from /lib64/libc.so.6
#1 0x00000000004d67f5 in _bt_swap_posting ()
#2 0x00000000004d792e in _bt_insertonpg ()
#3 0x00000000004d9113 in _bt_doinsert ()
#4 0x00000000004dc1a1 in btinsert ()
#5 0x000000000060fa7f in ExecInsertIndexTuples ()
#6 0x0000000000637078 in ExecInsert ()
#7 0x00000000006379c7 in ExecModifyTable ()
#8 0x00000000006107dc in standard_ExecutorRun ()
#9 0x0000000000645695 in _SPI_execute_plan ()
#10 0x0000000000645a3c in SPI_execute ()
#11 0x00000000005c39ff in ExecRefreshMatView ()
#12 0x000000000075799d in ProcessUtilitySlow.isra.4 ()
#13 0x00000000007567ce in standard_ProcessUtility ()
#14 0x0000000000754382 in PortalRunUtility ()
#15 0x0000000000754d4d in PortalRunMulti ()
#16 0x0000000000755763 in PortalRun ()
#17 0x000000000075330b in PostgresMain ()
#18 0x00000000006e2d8a in ServerLoop ()
#19 0x00000000006e3df7 in PostmasterMain ()
#20 0x00000000004883c2 in main ()
Thanks for your support!
On Wed, 12 May 2021 at 18:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alex F <phoedos16@gmail.com> writes:
> I think I was able to get a gdb stack trace. Logs attached.
Hmm, looks like it's time to rope Peter Geoghegan in on this discussion.
regards, tom lane
On Fri, May 14, 2021 at 7:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, looks like it's time to rope Peter Geoghegan in on this discussion.
I think that this is likely to be a fairly generic symptom of index
corruption. Ockham's razor does not seem to point to a software bug
because posting list splits are just not that complicated, and are
fairly common in the grand scheme of things. Docker is the kind of
thing that I wouldn't necessarily trust to not do something fishy with
LVM snapshotting -- I tend to suspect that that is a factor.
There was a very similar bug report and stack trace back in March.
That case was tied back to generic index corruption using amcheck,
with indexes corrupted that weren't implicated in the hard crash.
There is a real problem for me to fix here in any case:
_bt_swap_posting() is unnecessarily trusting of the state of the
posting list tuple (compared to _bt_split(), say). I still plan on
adding hardening to _bt_swap_posting() to avoid a hard crash.
Unfortunately I missed the opportunity to get that into 13.3, but I'll
get it into 13.4.
Alex should probably run amcheck to see what that throws up. It should
be possible to run amcheck on your database, which will detect corrupt
posting list tuples on Postgres 13. It's a contrib extension, so you
must first run "CREATE EXTENSION amcheck;". From there, you can run a
query like the following (you may want to customize this):
SELECT bt_index_parent_check(index => c.oid, heapallindexed => true),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
If this query takes too long to complete, you may find it useful to add
something to limit the indexes checked, such as: AND n.nspname = 'public'
-- that change to the SQL will make the query test only indexes from the
public schema.
Do "SET client_min_messages=DEBUG1" to get a kind of rudimentary
progress indicator, if that seems useful to you.
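Putting those setup steps together, the session might start like this (a sketch):

```sql
-- amcheck ships as a contrib extension; install it once per database
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Surface per-index DEBUG messages as a rudimentary progress indicator
SET client_min_messages = DEBUG1;
```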
The docs have further information on what this bt_index_parent_check
function does, should you need it:
https://www.postgresql.org/docs/13/amcheck.html
--
Peter Geoghegan
Dear Peter,
Honestly, I don't know if you expect a response with the amcheck results, but
I will paste them here anyway:
DEBUG: verifying that tuples from index "price_model_product_id_latest_idx" are present in "price_model"
DEBUG: finished verifying presence of 5598051 tuples from table "price_model" with bitset 48.61% set
DEBUG: verifying consistency of tree structure for index "name_original_idx_s" with cross-level checks
DEBUG: verifying level 3 (true root level)
DEBUG: verifying level 2
ERROR: down-link lower bound invariant violated for index "name_original_idx_s"
DETAIL: Parent block=64 child index tid=(868,3) parent page lsn=1D2F/14483F28.
Anyway, I will wait for v13.4 and try to re-test this crash case.
Thanks for your support!
On Fri, 14 May 2021 at 20:48, Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, May 14, 2021 at 12:12 PM Alex F <phoedos16@gmail.com> wrote:
> Dear Peter,
> Honestly, I don't know if you expect a response with the amcheck results, but I will paste them here anyway:
It is helpful -- thanks!
It should be possible to avoid this problem by reindexing. Of course
it's important to eliminate whatever the source of the corruption is,
which might be much harder.
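Rebuilding the index reported by amcheck might look like this (a sketch; the index name is taken from the amcheck output earlier in the thread):

```sql
-- Rebuild the corrupt index. CONCURRENTLY (available since Postgres 12)
-- avoids holding a write-blocking lock for the whole rebuild
REINDEX INDEX CONCURRENTLY name_original_idx_s;
```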
Could you execute exactly the same query, only this time use
"bt_index_check(index => c.oid, heapallindexed => true)" in place of
the bt_index_parent_check() call from the original query? Maybe there
is something more to be learned by just focussing on the leaf pages,
and not failing earlier on, in the parent pages. The less thorough
bt_index_check() function can sometimes show something interesting by
failing later than bt_index_parent_check() would fail with the same
index.
I note that the amcheck error message that you showed happens between
level 2 and level 1, neither of which are leaf level (that's level 0)
-- only leaf pages can have posting list tuples. To me this suggests
that the chances of corruption being a bug in deduplication
specifically are very remote (it's more likely to be a bug in some
other place, even). I'm always curious about real world corruption, so
I'd still appreciate seeing the bt_index_check() variant query's
output just to satisfy myself that that's what it is.
--
Peter Geoghegan
Dear Peter,
so the 2nd query looks like this:
SELECT bt_index_check(index => c.oid, heapallindexed => true),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
and its output is listed below:
DEBUG: verifying that tuples from index "price_model_product_id_latest_idx" are present in "price_model"
DEBUG: finished verifying presence of 5598051 tuples from table "price_model" with bitset 48.61% set
DEBUG: verifying consistency of tree structure for index "name_original_idx_s"
DEBUG: verifying level 3 (true root level)
DEBUG: verifying level 2
DEBUG: verifying level 1
ERROR: item order invariant violated for index "name_original_idx_s"
DETAIL: Lower index tid=(11900,58) (points to index tid=(858,1)) higher index tid=(11900,59) (points to index tid=(859,1)) page lsn=1CE8/D3E85550.
develop052021=#
Thanks for your support!
On Sat, 15 May 2021 at 00:25, Peter Geoghegan <pg@bowt.ie> wrote:
On Fri, May 14, 2021 at 1:13 PM Alex F <phoedos16@gmail.com> wrote:
> Thanks for your support!
I just pushed a commit that adds hardening that will be sufficient to
prevent this being a hard crash. Of course the index should not become
corrupt in the first place, but at least in Postgres 13.4 the same
scenario will result in an error rather than in a hard crash.
Thanks
--
Peter Geoghegan
Dear Peter,
First of all, thanks for your input and the upcoming fix. In any case, the
application shouldn't crash with a segfault; it should just log an error.
Another point I should mention is the amcheck extension and the "magic" query
that helped us find the broken index. Without those queries it was
absolutely unclear why the application crashed.
Is it possible to extend the error log to help understand what exactly went
wrong?
For example, if the error log looked like this:
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= LOG: server process (PID 22273) was terminated by signal 11: Segmentation fault
2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= DETAIL: Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_master_mv
***CAUSED BY: invariant violated for index "name_original_idx_s"***
then the trace marked with *** symbols could really help the user understand
the root cause and significantly decrease database recovery time.
In my case I had to create a separate VM, create a database from scratch,
and recover it from pg_dump. Unfortunately these actions required
significant downtime.
In a master-standby configuration, WAL replication does not save standby
servers from broken objects (a broken index in the described case).
Please advise: is it possible to use logical replication here? From my
understanding, logical replication shouldn't push broken objects to the standby.
Thanks for your support!
On Sat, 15 May 2021 at 03:10, Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, May 17, 2021 at 2:30 AM Alex F <phoedos16@gmail.com> wrote:
> Is it possible to extend the error log to help understand what exactly went wrong?
> For example, if the error log looked like this:
> 2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= LOG: server process (PID 22273) was terminated by signal 11: Segmentation fault
> 2021-05-14 06:10:54 UTC [22258]: user=,db=,app=,client= DETAIL: Failed process was running: REFRESH MATERIALIZED VIEW CONCURRENTLY project.product_master_mv
> ***CAUSED BY: invariant violated for index "name_original_idx_s"***
> then the trace marked with *** symbols could really help the user understand the root cause and significantly decrease database recovery time.
> In my case I had to create a separate VM, create a database from scratch, and recover it from pg_dump. Unfortunately these actions required significant downtime.
Once the database is corrupt it's more or less impossible to provide
hard guarantees about anything. We can only try our best to avoid the
worst consequences, such as a hard crash. This is guided by practical
experience and feedback from users. While this failure is clearly very
unfriendly, there is no getting around the fact that the real problem
began before there was any crash or error. Perhaps *long* before the
first crash, even.
> In a master-standby configuration, WAL replication does not save standby servers from broken objects (a broken index in the described case).
> Please advise: is it possible to use logical replication here? From my understanding, logical replication shouldn't push broken objects to the standby.
Unfortunately there are no simple answers. There is no reason to
believe that the corruption is limited to the indexes. I'd even say
that it's unlikely to be limited to indexes. What we see from amcheck
looks very much like storage level inconsistencies.
You'll need to do some kind of root cause analysis, so that you can
find the underlying issue and systematically eliminate it.
--
Peter Geoghegan