Minor optimizations in lazy_scan_heap
I was looking at the code in lazy_scan_heap() and I realized there are a
couple of low-hanging optimizations that we can do there.
1. The for-loop walks through each block of the relation. But if scan_all
is set to false, which would be the case most often, we can jump directly to
the next not-all-visible block (after considering SKIP_PAGES_THRESHOLD etc.).
I understand that the cost of looping as a no-op may not be significant, but
it looks unnecessary, and it can matter when there are thousands or millions
of consecutive all-visible blocks in a large table.
2. We also do a visibilitymap_test() for each block. I think it would be
more prudent to have a visibilitymap API, say visibilitymap_test_range(),
which takes a range of blocks and returns the first not-all-visible block
in the range (a rough sketch follows after this list). Internally, the
function can then test several blocks at a time. We can still do this
without holding a lock on the VM buffer, because when scan_all is false we
don't care much about the correctness of the visibility check anyway. Also,
this function could later be optimized if we start storing some summary
information about the visibility map, in which case we could find the first
not-all-visible block more efficiently.
3. I also thought that the call to vacuum_delay_point() for every
visibility check is not required and a simple CHECK_FOR_INTERRUPTS() would
be good enough. Later I realized that we may need it because the visibility
map check can do I/O to read the VM page. But if we do 2, then we can at
least limit vacuum_delay_point() to once per VM page instead of once per
bit. I concede that the cost of calling vacuum_delay_point() may not be too
high, but it again looks unnecessary and can be taken care of by a slight
refactoring of the code.
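To make (1) and (2) a bit more concrete, here is a rough, untested sketch of
the API I have in mind and of how lazy_scan_heap() could call it to jump
ahead; the exact name and semantics are of course open for discussion:

/*
 * Sketch only: scan the visibility map for the range
 * [startHeapBlk, startHeapBlk + numberOfBlks) and return the first block
 * whose all-visible bit is clear, or InvalidBlockNumber if every bit in
 * the range is set. *buf is kept pinned across calls, the same way
 * visibilitymap_test() handles its buffer today.
 */
extern BlockNumber visibilitymap_test_range(Relation rel,
					BlockNumber startHeapBlk,
					BlockNumber numberOfBlks,
					Buffer *buf);

/* In lazy_scan_heap(), the bit-at-a-time skipping loop could then become: */
next_not_all_visible_block = visibilitymap_test_range(onerel, blkno + 1,
					nblocks - (blkno + 1), &vmbuffer);

With something like that, the per-bit vacuum_delay_point() calls from (3)
would also go away along with the loop.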
Comments? Anyone think any/all of the above is useful?
Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
On Mon, Dec 3, 2012 at 1:23 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
Comments? Anyone think any/all of the above is useful?
I doubt that any of these things make enough difference to be worth
bothering with, but if you have benchmark results suggesting otherwise
I'm all ears.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Dec 4, 2012 at 11:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Dec 3, 2012 at 1:23 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
Comments? Anyone think any/all of the above is useful?
I doubt that any of these things make enough difference to be worth
bothering with,
You're right. These are not big-ticket optimisations, but I still felt they
were worth doing because tiny bits add up over time, and also because the
code may become a little simpler. The benchmarks don't show anything
interesting though. The time taken to scan 100K+ bits is sub-second, so even
with the attached patch the numbers did not show any noticeable difference.
It might be worth trying with a table of 1M or 10M data blocks, but I don't
have such hardware to test on.
The patch itself can be improved further; in particular, we could optimise
the loop to test 32 bits at a time instead of the 8 bits it currently tests
(a rough sketch of that idea is below). Not sure it's worth it, though.
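Here is a rough, standalone sketch of that word-at-a-time idea, just to show
the shape of it. It is plain C, not wired into the VM code, and
find_first_clear_bit() and its bit layout are purely illustrative:

#include <stdint.h>
#include <string.h>

#define BITS_PER_WORD	32

/*
 * Return the index of the first clear bit in map[] within
 * [start_bit, start_bit + nbits), or (uint64_t) -1 if every bit in the
 * range is set. Scans bit-wise up to a word boundary, then a word at a
 * time, then bit-wise over the remainder.
 */
static uint64_t
find_first_clear_bit(const uint8_t *map, uint64_t start_bit, uint64_t nbits)
{
	uint64_t	bit = start_bit;
	uint64_t	end = start_bit + nbits;

	/* bit-wise until we reach a 32-bit boundary */
	while (bit < end && bit % BITS_PER_WORD != 0)
	{
		if (!(map[bit / 8] & (1 << (bit % 8))))
			return bit;
		bit++;
	}

	/* word-wise: skip words in which every bit is set */
	while (bit + BITS_PER_WORD <= end)
	{
		uint32_t	word;

		memcpy(&word, map + bit / 8, sizeof(word));	/* alignment-safe */
		if (word != UINT32_MAX)
			break;		/* some bit in this word is clear */
		bit += BITS_PER_WORD;
	}

	/* bit-wise over the remainder (or the word that had a clear bit) */
	while (bit < end)
	{
		if (!(map[bit / 8] & (1 << (bit % 8))))
			return bit;
		bit++;
	}

	return (uint64_t) -1;
}

The real VM version would of course also have to deal with map-page
boundaries and the pinned buffer, the way the attached patch does byte-wise.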
Thanks,
Pavan
--
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee
Attachments:
vm_test_range-v2.patch
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index d7a2916..ce33edc 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -16,6 +16,7 @@
* visibilitymap_pin_ok - check whether correct map page is already pinned
* visibilitymap_set - set a bit in a previously pinned page
* visibilitymap_test - test if a bit is set
+ * visibilitymap_test_range - test if a bit is set in the range of blocks
* visibilitymap_count - count number of bits set in visibility map
* visibilitymap_truncate - truncate the visibility map
*
@@ -344,6 +345,121 @@ visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf)
}
/*
+ * Test a range of numberOfBlks bits starting at startHeapBlk and return the
+ * first block whose bit is not set, i.e. the first not-all-visible block in
+ * the range. If such a block is found, *buf is returned holding a pin on the
+ * VM buffer containing its bit.
+ *
+ * Return InvalidBlockNumber if every bit in the given range is set.
+ */
+BlockNumber
+visibilitymap_test_range(Relation rel, BlockNumber startHeapBlk,
+ BlockNumber numberOfBlks, Buffer *buf)
+{
+ BlockNumber mapBlock;
+ uint32 mapByte;
+ uint8 mapBit;
+ char *map;
+ BlockNumber remainingBlks, currentHeapBlk;
+
+#ifdef TRACE_VISIBILITYMAP
+ elog(DEBUG1, "vm_test_range %s %d %d", RelationGetRelationName(rel),
+ startHeapBlk, numberOfBlks);
+#endif
+
+ if (numberOfBlks <= 0)
+ return InvalidBlockNumber;
+
+ remainingBlks = numberOfBlks;
+ currentHeapBlk = startHeapBlk;
+
+ mapBlock = HEAPBLK_TO_MAPBLOCK(currentHeapBlk);
+ mapByte = HEAPBLK_TO_MAPBYTE(currentHeapBlk);
+ mapBit = HEAPBLK_TO_MAPBIT(currentHeapBlk);
+
+ /* Reuse the old pinned buffer if possible */
+ if (BufferIsValid(*buf))
+ {
+ if (BufferGetBlockNumber(*buf) != mapBlock)
+ {
+ ReleaseBuffer(*buf);
+ *buf = InvalidBuffer;
+ }
+ }
+
+ if (!BufferIsValid(*buf))
+ {
+ *buf = vm_readbuf(rel, mapBlock, false);
+ if (!BufferIsValid(*buf))
+ return InvalidBlockNumber;
+ }
+
+ map = PageGetContents(BufferGetPage(*buf));
+
+ /*
+ * If we are not at a byte boundary, do a bit-wise check until we get
+ * there
+ */
+ while ((mapBit % HEAPBLOCKS_PER_BYTE != 0) && (remainingBlks > 0))
+ {
+ if (!(map[mapByte] & (1 << mapBit)))
+ return currentHeapBlk;
+
+ mapBit++;
+ remainingBlks--;
+ currentHeapBlk++;
+ }
+
+ /*
+ * Now do a byte-wise check, moving to the next block if necessary
+ */
+ while (remainingBlks > 0)
+ {
+ mapBlock = HEAPBLK_TO_MAPBLOCK(currentHeapBlk);
+ mapByte = HEAPBLK_TO_MAPBYTE(currentHeapBlk);
+ mapBit = HEAPBLK_TO_MAPBIT(currentHeapBlk);
+
+ /* Reuse the old pinned buffer if possible */
+ if (BufferIsValid(*buf))
+ {
+ if (BufferGetBlockNumber(*buf) != mapBlock)
+ {
+ ReleaseBuffer(*buf);
+ *buf = InvalidBuffer;
+ }
+ }
+
+ if (!BufferIsValid(*buf))
+ {
+ *buf = vm_readbuf(rel, mapBlock, false);
+ if (!BufferIsValid(*buf))
+ return InvalidBlockNumber;
+ }
+
+ map = PageGetContents(BufferGetPage(*buf));
+
+ /*
+ * If this byte is not 0xFF, some block in it is not all-visible; break
+ * out and let the bit-wise loop below find the first such block. The
+ * same applies when fewer than a full byte of blocks remains.
+ */
+ if ((remainingBlks < HEAPBLOCKS_PER_BYTE) || ((map[mapByte] & 0xFF) != 0xFF))
+ break;
+
+ currentHeapBlk += HEAPBLOCKS_PER_BYTE;
+ remainingBlks -= HEAPBLOCKS_PER_BYTE;
+ }
+
+ while ((mapBit < HEAPBLOCKS_PER_BYTE) && (remainingBlks > 0))
+ {
+ if (!(map[mapByte] & (1 << mapBit)))
+ return currentHeapBlk;
+
+ mapBit++;
+ remainingBlks--;
+ currentHeapBlk++;
+ }
+
+ return InvalidBlockNumber;
+}
+/*
* visibilitymap_count - count number of bits set in visibility map
*
* Note: we ignore the possibility of race conditions when the table is being
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 5036849..bcb03a9 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -362,7 +362,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
Relation *Irel, int nindexes, bool scan_all)
{
BlockNumber nblocks,
- blkno;
+ blkno,
+ nextblkno;
HeapTupleData tuple;
char *relname;
BlockNumber empty_pages,
@@ -431,20 +432,24 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* them. If we make the reverse mistake and vacuum a page unnecessarily,
* it'll just be a no-op.
*/
- for (next_not_all_visible_block = 0;
- next_not_all_visible_block < nblocks;
- next_not_all_visible_block++)
- {
- if (!visibilitymap_test(onerel, next_not_all_visible_block, &vmbuffer))
- break;
- vacuum_delay_point();
- }
- if (next_not_all_visible_block >= SKIP_PAGES_THRESHOLD)
+ next_not_all_visible_block = visibilitymap_test_range(onerel, 0, nblocks, &vmbuffer);
+
+ if ((next_not_all_visible_block == InvalidBlockNumber) ||
+ (next_not_all_visible_block >= SKIP_PAGES_THRESHOLD))
skipping_all_visible_blocks = true;
else
skipping_all_visible_blocks = false;
- for (blkno = 0; blkno < nblocks; blkno++)
+ blkno = 0;
+ if (skipping_all_visible_blocks && !scan_all)
+ {
+ if (next_not_all_visible_block != InvalidBlockNumber)
+ blkno = next_not_all_visible_block;
+ else
+ blkno = nblocks;
+ }
+
+ for (; blkno < nblocks; blkno = nextblkno)
{
Buffer buf;
Page page;
@@ -461,37 +466,37 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
bool has_dead_tuples;
TransactionId visibility_cutoff_xid = InvalidTransactionId;
+ nextblkno = blkno + 1;
if (blkno == next_not_all_visible_block)
{
- /* Time to advance next_not_all_visible_block */
- for (next_not_all_visible_block++;
- next_not_all_visible_block < nblocks;
- next_not_all_visible_block++)
- {
- if (!visibilitymap_test(onerel, next_not_all_visible_block,
- &vmbuffer))
- break;
- vacuum_delay_point();
- }
+ next_not_all_visible_block = visibilitymap_test_range(onerel,
+ next_not_all_visible_block + 1,
+ nblocks - (next_not_all_visible_block + 1),
+ &vmbuffer);
/*
* We know we can't skip the current block. But set up
* skipping_all_visible_blocks to do the right thing at the
* following blocks.
*/
- if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+ if ((next_not_all_visible_block == InvalidBlockNumber) ||
+ (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD))
skipping_all_visible_blocks = true;
else
skipping_all_visible_blocks = false;
+
+ if (!scan_all)
+ {
+ if (next_not_all_visible_block != InvalidBlockNumber)
+ nextblkno = next_not_all_visible_block;
+ else
+ nextblkno = nblocks;
+ }
+
all_visible_according_to_vm = false;
}
else
- {
- /* Current block is all-visible */
- if (skipping_all_visible_blocks && !scan_all)
- continue;
all_visible_according_to_vm = true;
- }
vacuum_delay_point();
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index 5774e92..c3bc504 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -27,6 +27,8 @@ extern bool visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf);
extern void visibilitymap_set(Relation rel, BlockNumber heapBlk,
XLogRecPtr recptr, Buffer vmbuf, TransactionId cutoff_xid);
extern bool visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
+extern BlockNumber visibilitymap_test_range(Relation rel, BlockNumber startHeapBlk,
+ BlockNumber numberOfBlks, Buffer *buf);
extern BlockNumber visibilitymap_count(Relation rel);
extern void visibilitymap_truncate(Relation rel, BlockNumber nheapblocks);