Performance problem partially identified

Started by Tom Laneover 26 years ago7 messages
#1Tom Lane
tgl@sss.pgh.pa.us

I've been looking into Mergl's "update" performance problem. With
current sources, on a sequential-scan update of about 10,000 out of
1,000,000 records, I observe 33712 read() calls and 34107 write() calls.
The table occupies 33334 disk blocks, so the number of reads looks about
right -- but the number of writes is at least a factor of 3 higher than
it should be!

It looks to me like something is broken such that bufmgr.c *always*
thinks that a buffer is dirty (and needs written out) when it is
released.

Poking around for the cause, I find that heapgettup() calls
SetBufferCommitInfoNeedsSave() for every single tuple read from the
table:

7.14 42.15 1000055/1000055 heap_getnext [9]
[10]: 18.8 7.14 42.15 1000055 heapgettup [10] 1.53 30.10 1000020/1000020 HeapTupleSatisfiesSnapshot [11] 1.68 3.27 1000055/1000055 RelationGetBufferWithBuffer [50] 4.31 0.00 2066832/4129472 LockBuffer [45] 0.25 0.56 33361/33698 ReleaseAndReadBuffer [76] 0.44 0.00 1000000/1000000 SetBufferCommitInfoNeedsSave [92] 0.01 0.00 33371/33371 nextpage [240] 0.00 0.00 10/2033992 ReleaseBuffer [46] 0.00 0.00 45/201 HeapTupleSatisfiesNow [647] 0.00 0.00 5/9 nocachegetattr [730]
1.53 30.10 1000020/1000020 HeapTupleSatisfiesSnapshot [11]
1.68 3.27 1000055/1000055 RelationGetBufferWithBuffer [50]
4.31 0.00 2066832/4129472 LockBuffer [45]
0.25 0.56 33361/33698 ReleaseAndReadBuffer [76]
0.44 0.00 1000000/1000000 SetBufferCommitInfoNeedsSave [92]
0.01 0.00 33371/33371 nextpage [240]
0.00 0.00 10/2033992 ReleaseBuffer [46]
0.00 0.00 45/201 HeapTupleSatisfiesNow [647]
0.00 0.00 5/9 nocachegetattr [730]

This could only be from the call to SetBufferCommitInfoNeedsSave in
the HeapTupleSatisfies macro. If I'm reading the code correctly,
that means that HeapTupleSatisfiesSnapshot() always changes the
t_infomask field of the tuple.

I don't understand this code well enough to fix it, but I assert that
it's broken. Most of these tuples are *not* being modified, and there
is no reason to have to rewrite the buffer.

regards, tom lane

#2Hannu Krosing
hannu@trust.ee
In reply to: Tom Lane (#1)
Re: [HACKERS] Performance problem partially identified

Tom Lane wrote:

It looks to me like something is broken such that bufmgr.c *always*
thinks that a buffer is dirty (and needs written out) when it is
released.

That could also explain why the performance increases quite noticeably
even for _select_ queries when you specify "no fsync" for backend.
(I have'nt checked it lately, but it was the case about a year ago)

Poking around for the cause, I find that heapgettup() calls
SetBufferCommitInfoNeedsSave() for every single tuple read from the
table:

...

I don't understand this code well enough to fix it, but I assert that
it's broken.

More likely this is a "quick fix - will look at it later" for something
else, praobably an execution path that fails to call
SetBufferCommitInfoNeedsSave() when needed.

Or it can be just code to check if this fixes it which has been
forgotten in.

Most of these tuples are *not* being modified, and there
is no reason to have to rewrite the buffer.

It can be quite a lot of work to find all the places that can modify the
tuple, even with some special tools.

Is there any tool that can report writes to areas set read only
(similar to malloc/free debuggers ?)

If there is then we can replace SetBufferCommitInfoNeedsSave()
with a macro that does both SetBufferCommitInfoNeedsSave() and allows
writing, so that we can automate the checking.

-----------------
Hannu

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#2)
Re: Performance problem partially identified

Further info:

I think this behavior may be state-dependent. The first update after
loading the table rewrites all the blocks, but subsequent ones do not.
Is it just a matter of marking recently-written tuples as confirmed
good? If so, then there is not a problem (or at least, not as big
a problem as I thought).

regards, tom lane

#4Vadim Mikheev
vadim@krs.ru
In reply to: Tom Lane (#1)
Re: [HACKERS] Performance problem partially identified

Tom Lane wrote:

Poking around for the cause, I find that heapgettup() calls
SetBufferCommitInfoNeedsSave() for every single tuple read from the
table:

7.14 42.15 1000055/1000055 heap_getnext [9]
[10] 18.8 7.14 42.15 1000055 heapgettup [10]
1.53 30.10 1000020/1000020 HeapTupleSatisfiesSnapshot [11]
1.68 3.27 1000055/1000055 RelationGetBufferWithBuffer [50]
4.31 0.00 2066832/4129472 LockBuffer [45]
0.25 0.56 33361/33698 ReleaseAndReadBuffer [76]
0.44 0.00 1000000/1000000 SetBufferCommitInfoNeedsSave [92]
0.01 0.00 33371/33371 nextpage [240]
0.00 0.00 10/2033992 ReleaseBuffer [46]
0.00 0.00 45/201 HeapTupleSatisfiesNow [647]
0.00 0.00 5/9 nocachegetattr [730]

This could only be from the call to SetBufferCommitInfoNeedsSave in
the HeapTupleSatisfies macro. If I'm reading the code correctly,
that means that HeapTupleSatisfiesSnapshot() always changes the

^^^^^^^^^^^^^^

t_infomask field of the tuple.

Not always but only if HEAP_XMIN_COMMITTED/HEAP_XMAX_COMMITTED
are not setted. Run vacuum before update and SetBufferCommitInfoNeedSave
will not be called. This func is just to avoid pg_log lookup
without vacuum.

Vadim

#5Vadim Mikheev
vadim@krs.ru
In reply to: Tom Lane (#1)
Re: [HACKERS] Performance problem partially identified

Hannu Krosing wrote:

Tom Lane wrote:

It looks to me like something is broken such that bufmgr.c *always*
thinks that a buffer is dirty (and needs written out) when it is
released.

That could also explain why the performance increases quite noticeably
even for _select_ queries when you specify "no fsync" for backend.
(I have'nt checked it lately, but it was the case about a year ago)

Enev selects try to update t_infomask to avoid pg_log lookup
for other queries.

Vadim

#6Vadim Mikheev
vadim@krs.ru
In reply to: Tom Lane (#3)
Re: [HACKERS] Re: Performance problem partially identified

Tom Lane wrote:

Further info:

I think this behavior may be state-dependent. The first update after
loading the table rewrites all the blocks, but subsequent ones do not.
Is it just a matter of marking recently-written tuples as confirmed
good? If so, then there is not a problem (or at least, not as big
a problem as I thought).

Exactly!

One have to run vacuum after loading a big amount of data.

Vadim

#7Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Vadim Mikheev (#4)
Re: [HACKERS] Performance problem partially identified

This could only be from the call to SetBufferCommitInfoNeedsSave in
the HeapTupleSatisfies macro. If I'm reading the code correctly,
that means that HeapTupleSatisfiesSnapshot() always changes the

^^^^^^^^^^^^^^

t_infomask field of the tuple.

Not always but only if HEAP_XMIN_COMMITTED/HEAP_XMAX_COMMITTED
are not setted. Run vacuum before update and SetBufferCommitInfoNeedSave
will not be called. This func is just to avoid pg_log lookup
without vacuum.

Yes, we store the transaction status on first in the tuple so we don't
have to look it up.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026