GIN data corruption bug(s) in 9.6devel
Hi,
while repeating some full-text benchmarks on master, I've discovered
that there's a data corruption bug somewhere. What happens is that while
loading data into a table with GIN indexes (using multiple parallel
connections), I sometimes get this:
TRAP: FailedAssertion("!(((PageHeader) (page))->pd_special >=
(__builtin_offsetof (PageHeaderData, pd_linp)))", File: "ginfast.c",
Line: 537)
LOG: server process (PID 22982) was terminated by signal 6: Aborted
DETAIL: Failed process was running: autovacuum: ANALYZE messages
The details of the assert are always exactly the same - it's always
autovacuum and it trips on exactly the same check. And the backtrace
always looks like this (full backtrace attached):
#0 0x00007f133b635045 in raise () from /lib64/libc.so.6
#1 0x00007f133b6364ea in abort () from /lib64/libc.so.6
#2 0x00000000007dc007 in ExceptionalCondition
(conditionName=conditionName@entry=0x81a088 "!(((PageHeader)
(page))->pd_special >= (__builtin_offsetof (PageHeaderData, pd_linp)))",
errorType=errorType@entry=0x81998b "FailedAssertion",
fileName=fileName@entry=0x83480a "ginfast.c",
lineNumber=lineNumber@entry=537) at assert.c:54
#3 0x00000000004894aa in shiftList (stats=0x0, fill_fsm=1 '\001',
newHead=26357, metabuffer=130744, index=0x7f133c0f7518) at ginfast.c:537
#4 ginInsertCleanup (ginstate=ginstate@entry=0x7ffd98ac9160,
vac_delay=vac_delay@entry=1 '\001', fill_fsm=fill_fsm@entry=1 '\001',
stats=stats@entry=0x0) at ginfast.c:908
#5 0x00000000004874f7 in ginvacuumcleanup (fcinfo=<optimized out>) at
ginvacuum.c:662
...
It's not perfectly deterministic - sometimes I had to repeat the whole load
multiple times (up to 5x, and each load takes ~30 minutes).
I'm pretty sure this is not an external issue, because I've reproduced it
on a different machine (different CPU / kernel / libraries / compiler).
It's however interesting that on the other machine I've also observed a
different kind of lockups, where the sessions get stuck on semop() in
gininsert (again, full backtrace attached):
#0 0x0000003f3d4eaf97 in semop () from /lib64/libc.so.6
#1 0x000000000067a41f in PGSemaphoreLock (sema=0x7f93290405d8) at
pg_sema.c:387
#2 0x00000000006df613 in LWLockAcquire (lock=0x7f92a4dce900,
mode=LW_EXCLUSIVE) at lwlock.c:1049
#3 0x00000000004878c6 in ginHeapTupleFastInsert
(ginstate=0x7ffd969c88f0, collector=0x7ffd969caeb0) at ginfast.c:250
#4 0x000000000047423a in gininsert (fcinfo=<value optimized out>) at
gininsert.c:531
...
I'm not sure whether this is a different manifestation of the same issue
or another bug. The systems are not exactly the same - one has a single
socket (i5) while the other one has 2 (Xeon), the compilers and kernels
are different and so on.
I've also seen cases when the load seemingly completed OK, but trying to
dump the table to disk using COPY resulted in
ERROR: compressed data is corrupted
which I find rather strange as there was no previous error, and also
COPY should only dump table data (while the asserts were in GIN code
handling index pages, unless I'm mistaken). Seems like a case of
insufficient locking where two backends scribble on the same page
somehow, and then autovacuum hits the assert. Or maybe not, not sure.
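For what it's worth, the affected tuples could probably be narrowed down by forcing detoasting row by row and trapping the error - a rough sketch (only the "messages" table name comes from the log above; the "body" column is a made-up placeholder):

```sql
-- Rough sketch: find tuples whose TOASTed values fail to decompress.
-- "body" is a hypothetical column name; adjust to the actual schema.
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT ctid FROM messages LOOP
        BEGIN
            -- casting to text forces detoasting/decompression of the value
            PERFORM length(body::text) FROM messages WHERE ctid = r.ctid;
        EXCEPTION WHEN data_corrupted OR internal_error THEN
            RAISE WARNING 'corrupted tuple at ctid %', r.ctid;
        END;
    END LOOP;
END $$;
```

The ctid of a failing row then points at the heap page that got scribbled on.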
I've been unable to reproduce the issue on REL9_5_STABLE (despite
running the load ~20x on each machine), so that seems safe, and the
issue was introduced by some of the newer commits.
I've already spent too much CPU time on this, so perhaps someone with
better knowledge of the GIN code can take care of this. To reproduce it
you may use the same code I did - it's available here:
https://bitbucket.org/tvondra/archie
it's a PoC database of the pgsql mailing lists with full-text search. It's
a bit messy, but rather simple:
1) clone the repo
$ git clone https://bitbucket.org/tvondra/archie.git
2) create a directory for downloading the mbox files
$ mkdir archie-data
3) download the mbox files (~4.5GB of data) using the download
script (make sure archie/bin is on PATH)
$ cd archie-data
$ export PATH=../archie/bin:$PATH
$ ../archie/download
4) use the "run" script (attached) to run the load n-times on a given
commit
$ run.sh master 10
NOTICE: The run script is the messiest one, you'll have to
edit it to fix paths etc.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 5, 2015 at 2:18 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
while repeating some full-text benchmarks on master, I've discovered
that there's a data corruption bug somewhere. ...
TRAP: FailedAssertion("!(((PageHeader) (page))->pd_special >=
(__builtin_offsetof (PageHeaderData, pd_linp)))", File: "ginfast.c",
Line: 537)
...
This looks like it is probably the same bug discussed here:
/messages/by-id/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com
And here:
/messages/by-id/56041B26.2040902@sigaev.ru
The bug theoretically exists in 9.5, but it wasn't until 9.6 (commit
e95680832854cf300e64c) that free pages were recycled aggressively
enough that it actually becomes likely to be hit.
There are some proposed patches in those threads, but discussion on
them seems to have stalled out. Can you try one and see if it fixes
the problems you are seeing?
Cheers,
Jeff
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/05/2015 11:44 PM, Jeff Janes wrote:
This looks like it is probably the same bug discussed here:
/messages/by-id/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com
And here:
/messages/by-id/56041B26.2040902@sigaev.ru
The bug theoretically exists in 9.5, but it wasn't until 9.6 (commit
e95680832854cf300e64c) that free pages were recycled aggressively
enough that it actually becomes likely to be hit.
I have only quickly skimmed the discussions, but my impression was that
it's mostly about removing stuff that shouldn't be removed and such. But
maybe there are race conditions that cause data corruption. I don't
really want to dive too deeply into this, I've already spent too much
time trying to reproduce it.
There are some proposed patches in those threads, but discussion on
them seems to have stalled out. Can you try one and see if it fixes
the problems you are seeing?
I can do that - I see there are three patches in the two threads:
1) gin_pending_lwlock.patch (Jeff Janes)
2) gin_pending_pagelock.patch (Jeff Janes)
3) gin_alone_cleanup-2.patch (Teodor Sigaev)
Should I test all of them? Or is (1) obsoleted by (2) for example?
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 5, 2015 at 3:50 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
On 11/05/2015 11:44 PM, Jeff Janes wrote:
...
I can do that - I see there are three patches in the two threads:
1) gin_pending_lwlock.patch (Jeff Janes)
2) gin_pending_pagelock.patch (Jeff Janes)
3) gin_alone_cleanup-2.patch (Teodor Sigaev)
Should I test all of them? Or is (1) obsoleted by (2) for example?
1 is obsolete. Either 2 or 3 should fix the bug, provided this is the
bug you are seeing. They have different performance side effects, but
as far as fixing the bug they should be equivalent.
Cheers,
Jeff
Hi,
On 11/06/2015 01:05 AM, Jeff Janes wrote:
On Thu, Nov 5, 2015 at 3:50 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
...
I can do that - I see there are three patches in the two threads:
1) gin_pending_lwlock.patch (Jeff Janes)
2) gin_pending_pagelock.patch (Jeff Janes)
3) gin_alone_cleanup-2.patch (Teodor Sigaev)
Should I test all of them? Or is (1) obsoleted by (2) for example?
1 is obsolete. Either 2 or 3 should fix the bug, provided this is the
bug you are seeing. They have different performance side effects, but
as far as fixing the bug they should be equivalent.
OK, I'll do testing with those two patches then, and I'll also note the
performance difference (the data load was very stable). Of course, it's
just one particular workload.
I'll post an update after the weekend.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 11/06/2015 02:09 AM, Tomas Vondra wrote:
Hi,
...
OK, I'll do testing with those two patches then, and I'll also note the
performance difference (the data load was very stable). Of course, it's
just one particular workload.
I'll post an update after the weekend.
I've finally managed to test the two patches. Sorry for the delay.
I've repeated the workload on 9.5, 9.6 and 9.6 with (1) or (2), looking
for lockups or data corruption. I've also measured duration of the
script, to see what's the impact on performance. The configuration
(shared_buffers, work_mem ...) was exactly the same in all cases.
9.5 : runtime ~1380 seconds
9.6 : runtime ~1380 seconds (but lockups and data corruption)
9.6+(1) : runtime ~1380 seconds
9.6+(2) : runtime ~1290 seconds
So both patches seem to do the trick, but (2) is faster. Not sure if
this is expected. (BTW all the results are without asserts enabled).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Dec 19, 2015 at 3:19 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
Hi,
...
I've repeated the workload on 9.5, 9.6 and 9.6 with (1) or (2), looking
for lockups or data corruption. I've also measured duration of the
script, to see what's the impact on performance.
9.5 : runtime ~1380 seconds
9.6 : runtime ~1380 seconds (but lockups and data corruption)
9.6+(1) : runtime ~1380 seconds
9.6+(2) : runtime ~1290 seconds
So both patches seem to do the trick, but (2) is faster. Not sure if
this is expected. (BTW all the results are without asserts enabled).
Do you know what the size of the pending list was at the end of each test?
I think the last one may be faster because it left a large mess behind
that someone needs to clean up later.
Also, do you have the final size of the indexes in each case?
Cheers,
Jeff
On 12/21/2015 07:41 PM, Jeff Janes wrote:
On Sat, Dec 19, 2015 at 3:19 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
...
So both patches seem to do the trick, but (2) is faster. Not sure
if this is expected. (BTW all the results are without asserts
enabled).
Do you know what the size of the pending list was at the end of each
test?
I think last one may be faster because it left a large mess behind
that someone needs to clean up later.
No. How do I measure it?
Also, do you have the final size of the indexes in each case?
No, I hadn't realized the patches affect that, so I haven't measured it.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Dec 21, 2015 at 11:51 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
On 12/21/2015 07:41 PM, Jeff Janes wrote:
...
Do you know what the size of the pending list was at the end of each
test?
No. How do I measure it?
pageinspect's gin_metapage_info, or pgstattuple's pgstatginindex
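For instance (the index name "messages_body_idx" here is just a placeholder):

```sql
-- Pending-list stats from the GIN metapage (block 0 of the index).
CREATE EXTENSION IF NOT EXISTS pageinspect;
SELECT n_pending_pages, n_pending_tuples
FROM gin_metapage_info(get_raw_page('messages_body_idx', 0));

-- Or the summary from pgstattuple:
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT pending_pages, pending_tuples
FROM pgstatginindex('messages_body_idx');
```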
Also, do you have the final size of the indexes in each case?
No, I hadn't realized the patches affect that, so I haven't measured it.
There shouldn't be a difference between the two approaches (although I
guess there could be if one left a larger pending list than the other,
as pending lists are very space-inefficient), but since you included
9.5 in your test I thought it would be interesting to see how either
patched version under 9.6 compares to 9.5.
Cheers,
Jeff
Hi,
On 12/23/2015 09:33 PM, Jeff Janes wrote:
On Mon, Dec 21, 2015 at 11:51 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
...
No. How do I measure it?
pageinspect's gin_metapage_info, or pgstattuple's pgstatginindex
Hmmm, so this turns out not to be very useful, because at the end the
data I get from gin_metapage_info is almost exactly the same for both
patches (more details below).
Also, do you have the final size of the indexes in each case?
...
since you included 9.5 in your test I thought it would be interesting
to see how either patched version under 9.6 compares to 9.5.
Well, turns out there's a quite significant difference, actually. The
index sizes I get (quite stable after multiple runs):
9.5 : 2428 MB
9.6 + alone cleanup : 730 MB
9.6 + pending lock : 488 MB
So that's quite a significant difference, I guess. The load duration for
each version looks like this:
9.5 : 1415 seconds
9.6 + alone cleanup : 1310 seconds
9.6 + pending lock : 1380 seconds
I'd say I'm happy with sacrificing ~5% of time in exchange for ~35%
reduction of index size.
The size of the index on 9.5 after VACUUM FULL (so pretty much the
smallest index possible) is 440MB, which suggests the "pending lock"
patch does a quite good job.
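(The sizes above were measured roughly like this; the index and table names are placeholders:)

```sql
-- Index size after the load; "messages_body_idx" is a placeholder name.
SELECT pg_size_pretty(pg_relation_size('messages_body_idx'));

-- Baseline: VACUUM FULL rewrites the table and rebuilds its indexes,
-- giving roughly the smallest index possible.
VACUUM FULL messages;
SELECT pg_size_pretty(pg_relation_size('messages_body_idx'));
```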
The gin_metapage_info at the end of one of the runs (pretty much all the
runs look exactly the same) looks like this:
                  pending lock   alone cleanup       9.5
--------------------------------------------------------
pending_head                 2               2    310460
pending_tail               338             345    310806
tail_free_size             812             812       812
n_pending_pages            330             339       347
n_pending_tuples          1003            1037      1059
n_total_pages                2               2         2
n_entry_pages                1               1         1
n_data_pages                 0               0         0
n_entries                    0               0         0
version                      2               2         2
So almost no difference, except for the pending_* attributes, and even
there the values only differ on the 9.5 branch. Not sure what
conclusion to draw from this - maybe it'd be necessary to collect the
function output while the load is running (but that'd be tricky to
process, I guess).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
On 01/05/2016 10:38 AM, Tomas Vondra wrote:
...
Are we going to do anything about this? While the bug is present in 9.5
(and possibly other versions), fixing it before 9.6 gets out seems
important, because reproducing it there is rather trivial (while I've
been unable to reproduce it on 9.5).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 24, 2016 at 9:17 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
...
Are we going to do anything about this? While the bug is present in 9.5
(and possibly other versions), fixing it before 9.6 gets out seems
important, because reproducing it there is rather trivial (while I've
been unable to reproduce it on 9.5).
I'm not going to promise to commit anything here, because GIN is not
usually my area, but could you provide a link to the email that
contains the patch you think should be committed?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 02/24/2016 06:56 AM, Robert Haas wrote:
On Wed, Feb 24, 2016 at 9:17 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
...
Are we going to do anything about this? While the bug is present in 9.5
(and possibly other versions), fixing it before 9.6 gets out seems
important, because reproducing it there is rather trivial (while I've
been unable to reproduce it on 9.5).
I'm not going to promise to commit anything here, because GIN is not
usually my area, but could you provide a link to the email that
contains the patch you think should be committed?
Sure. There are actually two candidate patches in two separate threads;
I'm not sure which one is better. Based on the testing, both seem to fix
the issue, and the "pending lock" patch produces much smaller indexes (at
least in my benchmark):
[1]: /messages/by-id/56041B26.2040902@sigaev.ru
[2]: /messages/by-id/CAMkU=1w7Uu1GZ8N0bxMykRLgTh-uPH+GPHfhMNeryZPCv7fqdA@mail.gmail.com
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thank you for remembering this problem, at least on my behalf.
Well, turns out there's a quite significant difference, actually. The
index sizes I get (quite stable after multiple runs):
9.5 : 2428 MB
9.6 + alone cleanup : 730 MB
9.6 + pending lock : 488 MB
Interesting - I don't see why alone_cleanup and pending_lock differ so
much. I'd like to understand that; does somebody have a good theory? The
one questionable point in the pending_lock patch is a suspicious
exception in ProcSleep, which may cause problems in the future.
So that's quite a significant difference, I guess. The load duration for
each version looks like this:
9.5 : 1415 seconds
9.6 + alone cleanup : 1310 seconds
9.6 + pending lock : 1380 seconds
I'd say I'm happy with sacrificing ~5% of time in exchange for ~35%
reduction of index size.
I think the alone_cleanup patch is faster because a regular insert can
abandon its cleanup work when autovacuum is waiting to take over the
cleanup, so the insert process can return from the pending-list cleanup
earlier.
Attached is just a rebased v2 of the alone_cleanup patch.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
gin_alone_cleanup-3.patch
Well, turns out there's a quite significant difference, actually. The
index sizes I get (quite stable after multiple runs):
9.5 : 2428 MB
9.6 + alone cleanup : 730 MB
9.6 + pending lock : 488 MB
Attached is a modified alone_cleanup patch which doesn't interrupt the
cleanup process the way the pending_lock patch does, but which doesn't
touch the lock management either.
Tomas, if you can, please repeat the test with this patch. If not, I
will try to do it myself, but later.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
gin_alone_cleanup-3.1.patch
Hi,
On 02/25/2016 05:32 PM, Teodor Sigaev wrote:
Well, turns out there's a quite significant difference, actually. The
index sizes I get (quite stable after multiple runs):
9.5 : 2428 MB
9.6 + alone cleanup : 730 MB
9.6 + pending lock : 488 MB
...
Tomas, if you can, please repeat the test with this patch. If not, I
will try to do it myself, but later.
I'll do that once the system I used for that gets available - right now
it's running other benchmarks.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Feb 24, 2016 at 8:51 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
...
Interesting, I don't see why alone_cleanup and pending_lock differ so
much. I'd like to understand that - does somebody have a good theory?
Under my patch, anyone who wanted to do a clean up and detected
someone else was doing one would wait for the concurrent one to end.
(This is more consistent with the existing behavior, I just made it so
they don't do any damage while they wait.)
Under your patch, if a backend wants to do a clean up and detects
someone else is already doing one, it would just skip the clean up and
proceed on with whatever it was doing. This allows one process
(hopefully a vacuum, but maybe a user backend) to get pinned down
indefinitely, as other processes keep putting stuff onto the end of
the pending_list with no throttle.
Since the freespace recycling only takes place once the list is
completely cleaned, allowing some processes to add to the end while
one poor process is trying to clean can lead to less effective
recycling.
That is my theory, anyway.
Cheers,
Jeff
On Thu, Feb 25, 2016 at 11:19:20AM -0800, Jeff Janes wrote:
On Wed, Feb 24, 2016 at 8:51 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
...
That is my theory, anyway.
[This is a generic notification.]
The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,
since you committed the patch believed to have created it, you own this open
item. If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this. Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution. Please
present, within 72 hours, a plan to fix the defect within seven days of this
message. Thanks.
The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,
since you committed the patch believed to have created it, you own this open
item. ...
I'm waiting on Tomas's testing. I suppose that isn't possible right now,
so I will do it myself and publish the results afterwards.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
On 04/04/2016 02:06 PM, Teodor Sigaev wrote:
The above-described topic is currently a PostgreSQL 9.6 open item.
Teodor, since you committed the patch believed to have created it, you
own this open item. ...
I'm waiting on Tomas's testing. I suppose that isn't possible right now,
so I will do it myself and publish the results afterwards.
Ah, damn. This completely slipped from my TODO list. I'll rerun the
tests either today or tomorrow, and report the results here.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services