pgsql: In COPY, insert tuples to the heap in batches.
In COPY, insert tuples to the heap in batches.
This greatly reduces the WAL volume, especially when the table is narrow.
The overhead of locking the heap page is also reduced. Reduced WAL traffic
also makes it scale a lot better, if you run multiple COPY processes at
the same time.
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/d326d9e8ea1d690cf6d968000efaa5121206d231
Modified Files
--------------
src/backend/access/heap/heapam.c |  484 ++++++++++++++++++++++++++++++++++----
src/backend/commands/copy.c      |  166 ++++++++++++-
src/backend/postmaster/pgstat.c  |    6 +-
src/include/access/heapam.h      |    2 +
src/include/access/htup.h        |   31 +++
src/include/pgstat.h             |    2 +-
6 files changed, 629 insertions(+), 62 deletions(-)
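
To illustrate why batching reduces WAL volume and page-locking overhead, as the commit message claims, here is a minimal sketch in C. It is not the code from this commit: the `InsertStats` struct and both functions are hypothetical, and the model makes the simplifying assumption that every batch fills exactly one heap page and costs one page lock and one WAL record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for WAL records and heap page locks. */
typedef struct {
    size_t wal_records;  /* number of WAL records emitted */
    size_t page_locks;   /* number of times a heap page was locked */
} InsertStats;

/* Unbatched model: every tuple costs one page lock and one WAL record. */
InsertStats insert_one_at_a_time(size_t ntuples)
{
    InsertStats s = {0, 0};
    for (size_t i = 0; i < ntuples; i++) {
        s.page_locks++;
        s.wal_records++;
    }
    return s;
}

/*
 * Batched model (assumption): all tuples that fit on one page share a
 * single page lock and a single WAL record.
 */
InsertStats insert_batched(size_t ntuples, size_t tuples_per_page)
{
    InsertStats s = {0, 0};
    size_t done = 0;

    assert(tuples_per_page > 0);
    while (done < ntuples) {
        size_t n = ntuples - done;
        if (n > tuples_per_page)
            n = tuples_per_page;   /* this batch fills the page */
        s.page_locks++;
        s.wal_records++;
        done += n;
    }
    return s;
}
```

In this model the saving grows with the number of tuples that fit on a page, which is consistent with the remark that narrow tables benefit most: `insert_batched(1000, 100)` counts 10 records where `insert_one_at_a_time(1000)` counts 1000.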
On Wed, Nov 9, 2011 at 9:06 AM, Heikki Linnakangas
<heikki.linnakangas@iki.fi> wrote:
> In COPY, insert tuples to the heap in batches.
> This greatly reduces the WAL volume, especially when the table is narrow.
> The overhead of locking the heap page is also reduced. Reduced WAL traffic
> also makes it scale a lot better, if you run multiple COPY processes at
> the same time.
Sounds good.
I can't see where this applies backup blocks. If it does, can you
document why/where/how it differs from other WAL records?
There's no need for conflict processing on replay with this new WAL
record type. But you should document that and alter the comments that
say it is necessary. Search "conflict".
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09.11.2011 15:25, Simon Riggs wrote:
> On Wed, Nov 9, 2011 at 9:06 AM, Heikki Linnakangas
> <heikki.linnakangas@iki.fi> wrote:
>> In COPY, insert tuples to the heap in batches.
>> This greatly reduces the WAL volume, especially when the table is narrow.
>> The overhead of locking the heap page is also reduced. Reduced WAL traffic
>> also makes it scale a lot better, if you run multiple COPY processes at
>> the same time.
> Sounds good.
> I can't see where this applies backup blocks. If it does, can you
> document why/where/how it differs from other WAL records?
Good catch, I missed that. I copied the redo function from normal
insertion, but missed that heap_redo() takes care of backup blocks for
you, while heap2_redo() does not.
I'll go fix that.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com