amcheck verification for GiST and GIN
Hello! My name is Grigory Kryachko. I have decided to join Andrey Borodin
in his work on amcheck.
Here is a patch that I built (with Andrey as my advisor) on top of the
last patch from this thread: https://commitfest.postgresql.org/25/1800/
It adds the ability to verify the validity of a GIN index. It is not
polished yet, but it works, and we wanted to show it to you so that you
can give us some feedback. We also wanted to let you know about this work
in case you were planning to write something similar yourselves, so that
you do not redo what is already done.
The thread mentioned above raised an issue about the right type of lock;
we have not addressed it yet. Right now I am primarily interested in
feedback on the GIN part.
Attachments:
amchek_gin_gist.patch (text/x-patch; charset=US-ASCII; +1588 -365)
On Wed, May 27, 2020 at 10:11 AM Grigory Kryachko <gskryachko@gmail.com> wrote:
> Here is the patch which I (with Andrey as my advisor) built on the top
> of the last patch from this thread:
> https://commitfest.postgresql.org/25/1800/ .
> It adds an ability to verify validity of GIN index. It is not polished
> yet, but it works and we wanted to show it to you so you can give us
> some feedback, and also let you know about this work if you have any
> plans of writing something like that yourselves, so that you do not
> redo what is already done.
Can you rebase this patch, please?
Also suggest breaking out the series into distinct patch files using
"git format-patch master".
--
Peter Geoghegan
On 07/08/2020 00:33, Peter Geoghegan wrote:
> On Wed, May 27, 2020 at 10:11 AM Grigory Kryachko <gskryachko@gmail.com> wrote:
>> Here is the patch which I (with Andrey as my advisor) built on the top
>> of the last patch from this thread:
>> https://commitfest.postgresql.org/25/1800/ .
>> It adds an ability to verify validity of GIN index. It is not polished
>> yet, but it works and we wanted to show it to you so you can give us
>> some feedback, and also let you know about this work if you have any
>> plans of writing something like that yourselves, so that you do not
>> redo what is already done.
> Can you rebase this patch, please?
> Also suggest breaking out the series into distinct patch files using
> "git format-patch master".
I rebased the GIN parts of this patch, see attached. I also ran pgindent
and made some other tiny cosmetic fixes, but I didn't review the patch,
only rebased it in the state it was.
I was hoping that this would be useful to track down the bug we're
discussing here:
/messages/by-id/CAJYBUS8aBQQL22oHsAwjHdwYfdB_NMzt7-sZxhxiOdEdn7cOkw@mail.gmail.com.
But now that I look at what checks this performs, I doubt it will catch
the kind of corruption that's happened there. I suspect it's more subtle
than an inconsistency between parent and child pages, because only a
few rows are affected. But it doesn't hurt to try.
- Heikki
Attachments:
v2-0001-Amcheck-for-GIN.patch (text/x-patch; charset=UTF-8; +1020 -4)
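To make the phrase "inconsistency between parent and child pages" concrete, here is a minimal Python sketch of that kind of check on a toy tree. It is not the code from the patch. The Page class, the check_parent_child function, and the rule that each key on an internal page is an upper bound for the keys in the matching child are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """Toy index page (hypothetical): keys, plus one child per key on internal pages."""
    keys: List[int]
    children: List["Page"] = field(default_factory=list)


def check_parent_child(page: Page,
                       upper_bound: Optional[int] = None,
                       path: str = "root") -> List[str]:
    """Report every key that exceeds the bound its parent advertises for this page."""
    problems: List[str] = []
    if upper_bound is not None:
        # Offsets are reported 1-based, as in the error messages in this thread.
        for offset, key in enumerate(page.keys, start=1):
            if key > upper_bound:
                problems.append(
                    f"{path}: key {key} at offset {offset} "
                    f"exceeds parent key {upper_bound}"
                )
    # Assumed toy rule: the i-th key of an internal page bounds the i-th child.
    for i, (downlink_key, child) in enumerate(zip(page.keys, page.children)):
        problems.extend(check_parent_child(child, downlink_key, f"{path}/{i}"))
    return problems


if __name__ == "__main__":
    healthy = Page(keys=[5, 9], children=[Page(keys=[1, 3, 5]), Page(keys=[7, 9])])
    broken = Page(keys=[5, 9], children=[Page(keys=[1, 3, 5]), Page(keys=[7, 12])])
    print(check_parent_child(healthy))
    print(check_parent_child(broken))
```

In this toy model, a leaf that is merely missing an entry still has every key within its bound, so a check of this shape reports nothing for it. That may be why corruption affecting only a few rows could go unnoticed.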
Hello,
First of all, thank you all -- Andrey, Peter, Heikki and others -- for this
work. GIN support in amcheck is *really* needed, especially for OS upgrades
such as from Ubuntu 16.04 (which is EOL now) to 18.04 or 20.04.
I was trying to check a bunch of GIN indexes on a production system after
switching from Ubuntu 16.04 to 18.04 and got many errors. So I decided to
check on 16.04 first (which is still used in production for that DB),
without any OS/glibc changes.
On 16.04, I still saw errors, which I did not expect, because it would mean
that production is corrupted too. If so, REINDEX should fix it. But it
didn't -- see the output below. I cannot share the data, and I am thinking
about how to create a synthetic demo of this. Any suggestions?
And is this a sign that the tool is wrong, rather than that we have real
corruption? (I assume that if we did, we would see no errors after
REINDEXing -- provided, of course, that GIN itself doesn't have bugs.)
Env: Ubuntu 16.04 (so, glibc 2.27), Postgres 12.7, patch from Heikki
slightly adjusted to work with PG12
(https://gitlab.com/postgres/postgres/-/merge_requests/5).
Snippet used to run amcheck:
https://gitlab.com/-/snippets/2001962 (see file #3)
Before REINDEX:
INFO: [2021-07-29 17:44:42.525+00] Processing 4/29: index: index_XXX_trigram (index relpages: 117935; heap tuples: ~379793)...
ERROR: index "index_XXX_trigram" has wrong tuple order, block 65754, offset 232
test=# reindex index index_XXX_trigram;
REINDEX
After REINDEX:
INFO: [2021-07-29 18:01:23.339+00] Processing 4/29: index: index_XXX_trigram (index relpages: 70100; heap tuples: ~379793)...
ERROR: index "index_XXX_trigram" has wrong tuple order, block 70048, offset 253
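For readers unfamiliar with the "wrong tuple order" message, the following Python sketch shows the general idea of such a check: walk the keys on one page and report the first offset whose key sorts before its predecessor. It is only an illustration. The function name first_out_of_order_offset is hypothetical, and it uses Python's default string comparison, whereas a real index would use the comparison function of its operator class.

```python
from typing import Optional, Sequence


def first_out_of_order_offset(keys: Sequence[str]) -> Optional[int]:
    """Return the 1-based offset of the first key that sorts before the
    key preceding it on the page, or None if the page is in order."""
    for i in range(1, len(keys)):
        if keys[i] < keys[i - 1]:
            # Convert the 0-based list index to a 1-based page offset.
            return i + 1
    return None


if __name__ == "__main__":
    print(first_out_of_order_offset(["abc", "abd", "abe"]))
    print(first_out_of_order_offset(["abc", "abe", "abd"]))
```

A checker that compares keys with a different rule than the one used to build the index would flag healthy pages. That is one way a tool, rather than the data, can be the source of such errors.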
On 29/07/2021 21:34, Nikolay Samokhvalov wrote:
> I was trying to check a bunch of GINs on some production after switching
> from Ubuntu 16.04 to 18.04 and got many errors. So decided to check for
> 16.04 first (that is still used on prod for that DB), without any
> OS/glibc changes.
>
> On 16.04, I still saw errors and it was not really expected because this
> should mean that production is corrupted too. So, REINDEX should fix it.
> But it didn't -- see output below. I cannot give data and thinking how
> to create a synthetic demo of this. Any suggestions?
>
> And is this a sign that the tool is wrong rather that we have a real
> corruption cases? (I assume if we did, we would see no errors after
> REINDEXing -- of course, if GIN itself doesn't have bugs).
>
> Env: Ubuntu 16.04 (so, glibc 2.27), Postgres 12.7, patch from Heikki
> slightly adjusted to work with PG12 (
> https://gitlab.com/postgres/postgres/-/merge_requests/5) snippet used
> to run amcheck:
> https://gitlab.com/-/snippets/2001962 (see file #3)
Almost certainly the tool is wrong. We went back and forth a few times
with Pawel, fixing various bugs in the amcheck patch, in this thread:
/messages/by-id/9fdbb584-1e10-6a55-ecc2-9ba8b5dca1cf@iki.fi.
Can you try again with the latest patch version from that thread,
please? That's v5-0001-Amcheck-for-GIN-13stable.patch.
- Heikki
Thank you, v5 didn't find any issues at all. One thing: for my 29 indexes,
the tool generated 3.5 GiB of output. I guess many of the INFO messages
should be downgraded to something like DEBUG1?
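One rough way to see which message level dominates such output is to count lines by their severity prefix. The Python sketch below does that. The function name count_by_severity is hypothetical, and it assumes each message starts with an upper-case level followed by a colon, as in the INFO and ERROR lines quoted earlier in this thread.

```python
from collections import Counter
from typing import Iterable


def count_by_severity(lines: Iterable[str]) -> Counter:
    """Count lines by leading severity prefix such as 'INFO:' or 'ERROR:'.

    Lines without an upper-case alphanumeric prefix before the first colon
    (wrapped continuation lines, psql prompts, etc.) are ignored.
    """
    counts: Counter = Counter()
    for line in lines:
        prefix, sep, _ = line.partition(":")
        if sep and prefix.isalnum() and prefix.isupper():
            counts[prefix] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'INFO: [2021-07-29 17:44:42.525+00] Processing 4/29: index: index_XXX_trigram',
        'ERROR: index "index_XXX_trigram" has wrong tuple order, block 65754, offset 232',
        'INFO: [2021-07-29 18:01:23.339+00] Processing 4/29: index: index_XXX_trigram',
    ]
    for level, n in count_by_severity(sample).most_common():
        print(level, n)
```

To run it over a large saved output file, pass the open file object directly, so that lines are read one at a time rather than loaded into memory.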