Slow system due to ReorderBufferGetTupleBuf?
Postgres v10 on Debian stretch
My system is occasionally very slow. A few weeks ago someone suggested using perf, so I installed it and captured a profile during a slow period. It shows the following as the top CPU consumers:
9.09% postgres [.] ReorderBufferGetTupleBuf
6.14% postgres [.] ReorderBufferReturnChange
When ReorderBufferReturnChange is no longer running:
14.35% postgres [.] ReorderBufferGetTupleBuf
Can someone shed some light on this and advise how to prevent it reoccurring?
Cheers,
Martin.
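For comparing captures taken at different times, the perf lines above can be totalled by symbol prefix. This is only a sketch: it assumes the four-column layout shown in this message (percentage, binary, "[.]", symbol), and the helper name symbol_share is made up for illustration.

```python
def symbol_share(perf_lines, prefix):
    """Sum the CPU percentages of symbols whose name starts with prefix."""
    total = 0.0
    for line in perf_lines:
        parts = line.split()
        # Assumed layout: "<pct>% <binary> [.] <symbol>"
        if len(parts) == 4 and parts[0].endswith("%") and parts[3].startswith(prefix):
            total += float(parts[0].rstrip("%"))
    return total


# The two lines reported during the slow period.
sample = [
    "9.09% postgres [.] ReorderBufferGetTupleBuf",
    "6.14% postgres [.] ReorderBufferReturnChange",
]

if __name__ == "__main__":
    print(round(symbol_share(sample, "ReorderBuffer"), 2))  # prints 15.23
```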
On Mon, Jan 1, 2018 at 8:56 AM, Martin Moore <martin.moore@avbrief.com> wrote:
Can someone shed some light on this and advise how to prevent it reoccurring?
You're using v10, which has these two commits:
Unfortunately, per the commit message of the first commit, it doesn't
look like the tuple allocator uses any new strategy, at least until
this v11 commit:
My guess is that that would make a noticeable difference, once v11
becomes available. Could you test this yourself by building from the
master branch?
--
Peter Geoghegan
On 01/01/2018, 17:45, "Peter Geoghegan" <pg@bowt.ie> wrote:
Thanks Peter. I don’t really want to go down that route for various reasons. There’s a task that copies ‘old’ rows to various old_ tables, deletes them from the main tables, and then runs a VACUUM and ANALYZE. The tables only have 20-30k rows. I’m guessing this may be the trigger for the problem, so I have changed the schedule from every 20 minutes to once in the middle of the night when things are quiet.
Would this explain the problem?
Martin.
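For reference, the task described above could look roughly like the sketch below. It only builds the SQL for one pass and does not connect to anything. The table name flight_plans, the created_at column and the 7-day retention are hypothetical; only the old_ prefix comes from the description. It assumes the copy and the delete run in one transaction, so that now() gives both statements the same cutoff, and that VACUUM ANALYZE runs afterwards, since VACUUM cannot run inside a transaction block.

```python
def archive_statements(table, ts_column, keep_days):
    """Return the SQL statements for one archive pass over a single table."""
    # now() is fixed at transaction start, so INSERT and DELETE share a cutoff.
    cutoff = f"now() - interval '{int(keep_days)} days'"
    return [
        "BEGIN",
        f"INSERT INTO old_{table} SELECT * FROM {table} WHERE {ts_column} < {cutoff}",
        f"DELETE FROM {table} WHERE {ts_column} < {cutoff}",
        "COMMIT",
        # Must run outside the transaction block.
        f"VACUUM ANALYZE {table}",
    ]


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    for stmt in archive_statements("flight_plans", "created_at", 7):
        print(stmt + ";")
```

The statements are meant to be run in order through psql or any client. The real task may well differ from this.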
On 02/01/2018, 12:09, "Martin Moore" <martin.moore@avbrief.com> wrote:
======================================================================================================
Having stopped the suspect task, I’m still getting the same problem. I can’t even stop Postgres:
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down
We’ve spent 2 years and a chunk of cash on a total system redesign, and this is going to stop it from being released.
Can someone give me an idea of what may be causing this, and explain what ReorderBufferGetTupleBuf is actually doing, in case it gives me a clue?
Thanks.