Bulk inserts and usage_count

Started by Heikki Linnakangasover 18 years ago2 messages
#1Heikki Linnakangas
heikki@enterprisedb.com

While testing the buffer ring patch, I noticed that bulk inserts with
both INSERT and COPY pin and unpin the buffer they insert to for every
tuple. That means that the usage_count of all those buffers are bumped
up to 5. That's gotta be bad if you try to run a COPY concurrently with
other activity. It also affects tables like TPC-C order_line where a
tuples are always inserted and updated in groups.

To demonstrate:

postgres=# truncate foo; TRUNCATE TABLE
postgres=# COPY foo FROM '/tmp/foo.data'; COPY 1000
postgres=# SELECT c.relname, bufferid, relblocknumber, isdirty,
usagecount FROM pg_buffercache bc, pg_class c WHERE c.relfilenode =
bc.relfilenode and c.relname='foo';
relname | bufferid | relblocknumber | isdirty | usagecount
---------+----------+----------------+---------+------------
foo | 105078 | 4 | f | 5
foo | 105079 | 3 | f | 5
foo | 105080 | 2 | f | 5
foo | 105081 | 1 | f | 5
foo | 105082 | 0 | f | 5
(5 rows)

A fix for COPY will fall naturally out of the buffer ring patch, but not
for INSERT.

A more general fix would be to somehow keep the last insertion page
pinned across calls to heap_insert.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#2Jim C. Nasby
decibel@decibel.org
In reply to: Heikki Linnakangas (#1)
Re: Bulk inserts and usage_count

On Tue, May 15, 2007 at 04:37:28PM +0100, Heikki Linnakangas wrote:

While testing the buffer ring patch, I noticed that bulk inserts with
both INSERT and COPY pin and unpin the buffer they insert to for every
tuple. That means that the usage_count of all those buffers are bumped

<snip>

A fix for COPY will fall naturally out of the buffer ring patch, but not
for INSERT.

A more general fix would be to somehow keep the last insertion page
pinned across calls to heap_insert.

ISTR discussion in the past about having things like COPY and INSERT
INTO ... SELECT building entire pages in one shot once they exhaust the
FSM. Not only would it address this issue, but it would probably improve
performance in many ways (less locking and unlocking, ability to
pre-sort before inserting into indexes, fewer calls to FSM, probably a
bunch of other things).
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)