short circuit suggestion in find_hash_columns()
Hi,
I was looking at find_hash_columns() in nodeAgg.c
It seems the first loop tries to determine the max column number needed,
along with whether all columns are needed.
The loop can be re-written as shown in the patch.
In normal cases, we don't need to perform scanDesc->natts iterations.
In best case scenario, the loop would terminate after two iterations.
Please provide your comment.
Thanks
Attachments:
find-hash-col-short-circuit.patchapplication/octet-stream; name=find-hash-col-short-circuit.patchDownload
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 914b02ceee..6ef222ad96 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -1578,12 +1578,16 @@ find_hash_columns(AggState *aggstate)
aggstate->max_colno_needed = 0;
aggstate->all_cols_needed = true;
- for (int i = 0; i < scanDesc->natts; i++)
+ for (int i = scanDesc->natts-1;
+ i >= 0 && (aggstate->all_cols_needed || aggstate->max_colno_needed == 0);
+ i--)
{
- int colno = i + 1;
-
+ int colno = i + 1;
if (bms_is_member(colno, aggstate->colnos_needed))
- aggstate->max_colno_needed = colno;
+ {
+ if (aggstate->max_colno_needed == 0)
+ aggstate->max_colno_needed = colno;
+ }
else
aggstate->all_cols_needed = false;
}
On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
I was looking at find_hash_columns() in nodeAgg.c
It seems the first loop tries to determine the max column number needed, along with whether all columns are needed.
The loop can be re-written as shown in the patch.
This runs during ExecInitAgg(). Do you have a test case where you're
seeing any performance gains from this change?
David
On Fri, Jul 9, 2021 at 8:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Sat, 10 Jul 2021 at 03:15, Zhihong Yu <zyu@yugabyte.com> wrote:
I was looking at find_hash_columns() in nodeAgg.c
It seems the first loop tries to determine the max column number needed,
along with whether all columns are needed.
The loop can be re-written as shown in the patch.
This runs during ExecInitAgg(). Do you have a test case where you're
seeing any performance gains from this change?David
Hi,
I made some attempt in varying related test but haven't seen much
difference in performance.
Let me spend more time (possibly in off hours) on this.
Cheers