unnecessary executor overheads around seqscans

Started by Andres Freundabout 6 hours ago1 messages
#1Andres Freund
andres@anarazel.de

Hi,

In [1]/messages/by-id/rvlc7pb6zn4kydqovcqh72lf2qfcgs3qkj2seq7tcpvxyqwtqt@nrvv6lpehwwa I was looking at the profile of a seqscan with a where clause that
doesn't match any of the many rows. I was a bit saddened by where we were
spending time.

- The fetching of variables, as well as the null check of scandesc, in
SeqNext() is repeated in every loop iteration of ExecScanExtended, despite
that obviously not being required after the first iteration

We could perhaps address this by moving the check to the callers of
ExecScanExtended() or by extending ExecScanExtended to have an explicit
beginscan callback that it calls after.

Or perhaps we could just make it so that the entire if (scandesc == NULL)
branch isn't needed?

- The checkXidAlive checks that have been added to table_scan_getnextslot()
show up noticeably and in every loop iteration, despite afaict never being reachable

It's not obvious to me that this should
a) be in table_scan_getnextslot(), rather than in beginscan - how could it
change in the middle of a scan? That would require a wrapper around
rd_tableam->scan_begin(), but that seems like it might be good anyway.
b) not just be an assertion?

- The TupIsNull(slot) check in ExecScanExtended() is redundant with the return
value of table_scan_getnextslot(), but the compiler doesn't grok that.

We can use a pg_assume() in table_scan_getnextslot() to make the compiler
understand.

- We repeatedly store the table oid in the slot, table_scan_getnextslot() and
then again in ExecStoreBufferHeapTuple(). This shows up in the profile.

I wish we had made the slot a property of the scan, that way the scan could
assume the slot already has the oid set...

- heap_getnextslot() calls ExecStoreBufferHeapTuple() and then returns
true. That prevents the sibiling call optimization.

We should change ExecStoreBufferHeapTuple() to return true. Nobody uses the
current return value. Alternatively we should consider just moving it to
somewhere heapam.c/heapam_handler.c can see the implementations, they're the
only ones that should use it anyway.

There's more, but these already provide a decent improvement.

Greetings,

Andres Freund

[1]: /messages/by-id/rvlc7pb6zn4kydqovcqh72lf2qfcgs3qkj2seq7tcpvxyqwtqt@nrvv6lpehwwa