dropping datumSort field

Started by Zhihong Yuover 3 years ago8 messages
#1Zhihong Yu
zyu@yugabyte.com
1 attachment(s)

Hi,
I was looking at ExecSort() w.r.t. datum sort.

I wonder if the datumSort field can be dropped.
Here is a patch illustrating the potential simplification.

Please take a look.

Thanks

Attachments:

drop-datum-sort-field.patchapplication/octet-stream; name=drop-datum-sort-field.patchDownload
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 3c28d60c3e..76f9e002fc 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -102,7 +102,7 @@ ExecSort(PlanState *pstate)
 		if (node->bounded)
 			tuplesortopts |= TUPLESORT_ALLOWBOUNDED;
 
-		if (node->datumSort)
+		if (tupDesc->natts == 1)
 			tuplesortstate = tuplesort_begin_datum(TupleDescAttr(tupDesc, 0)->atttypid,
 												   plannode->sortOperators[0],
 												   plannode->collations[0],
@@ -128,7 +128,7 @@ ExecSort(PlanState *pstate)
 		 * Scan the subplan and feed all the tuples to tuplesort using the
 		 * appropriate method based on the type of sort we're doing.
 		 */
-		if (node->datumSort)
+		if (tupDesc->natts == 1)
 		{
 			for (;;)
 			{
@@ -194,7 +194,7 @@ ExecSort(PlanState *pstate)
 	 * For tuple sorts, tuplesort_gettupleslot manages the slot for us and
 	 * empties the slot when it runs out of tuples.
 	 */
-	if (node->datumSort)
+	if (tupDesc->natts == 1)
 	{
 		ExecClearTuple(slot);
 		if (tuplesort_getdatum(tuplesortstate, ScanDirectionIsForward(dir),
@@ -274,15 +274,6 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
 	ExecInitResultTupleSlotTL(&sortstate->ss.ps, &TTSOpsMinimalTuple);
 	sortstate->ss.ps.ps_ProjInfo = NULL;
 
-	/*
-	 * We perform a Datum sort when we're sorting just a single column,
-	 * otherwise we perform a tuple sort.
-	 */
-	if (ExecGetResultType(outerPlanState(sortstate))->natts == 1)
-		sortstate->datumSort = true;
-	else
-		sortstate->datumSort = false;
-
 	SO1_printf("ExecInitSort: %s\n",
 			   "sort node initialized");
 
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..ec4add978a 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2218,7 +2218,6 @@ typedef struct SortState
 	int64		bound_Done;		/* value of bound we did the sort with */
 	void	   *tuplesortstate; /* private state of tuplesort.c */
 	bool		am_worker;		/* are we a worker? */
-	bool		datumSort;		/* Datum sort instead of tuple sort? */
 	SharedSortInfo *shared_info;	/* one entry per worker */
 } SortState;
 
#2Robert Haas
robertmhaas@gmail.com
In reply to: Zhihong Yu (#1)
Re: dropping datumSort field

On Mon, Aug 8, 2022 at 5:51 PM Zhihong Yu <zyu@yugabyte.com> wrote:

Hi,
I was looking at ExecSort() w.r.t. datum sort.

I wonder if the datumSort field can be dropped.
Here is a patch illustrating the potential simplification.

Please take a look.

One problem with this patch is that, if I apply it, PostgreSQL does not compile:

nodeSort.c:197:6: error: use of undeclared identifier 'tupDesc'
if (tupDesc->natts == 1)
^
1 error generated.

Leaving that aside, I don't really see any advantage in this sort of change.

--
Robert Haas
EDB: http://www.enterprisedb.com

#3Zhihong Yu
zyu@yugabyte.com
In reply to: Robert Haas (#2)
Re: dropping datumSort field

On Tue, Aug 9, 2022 at 8:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Aug 8, 2022 at 5:51 PM Zhihong Yu <zyu@yugabyte.com> wrote:

Hi,
I was looking at ExecSort() w.r.t. datum sort.

I wonder if the datumSort field can be dropped.
Here is a patch illustrating the potential simplification.

Please take a look.

One problem with this patch is that, if I apply it, PostgreSQL does not
compile:

nodeSort.c:197:6: error: use of undeclared identifier 'tupDesc'
if (tupDesc->natts == 1)
^
1 error generated.

Leaving that aside, I don't really see any advantage in this sort of
change.

--
Robert Haas
EDB: http://www.enterprisedb.com

Thanks for trying out the patch.

TupleDesc tupDesc;

tupDesc is declared inside `if (!node->sort_Done)` block whereas the last
reference to tupDesc is outside the if block.

I take your review comment and will go back to do more homework.

Cheers

#4Robert Haas
robertmhaas@gmail.com
In reply to: Zhihong Yu (#3)
Re: dropping datumSort field

On Tue, Aug 9, 2022 at 11:16 AM Zhihong Yu <zyu@yugabyte.com> wrote:

tupDesc is declared inside `if (!node->sort_Done)` block whereas the last reference to tupDesc is outside the if block.

Yep.

I take your review comment and will go back to do more homework.

The real point for me here is you haven't offered any reason to make
this change. The structure member in question is basically free.
Because of alignment padding it uses no more memory, and it makes the
intent of the code clearer.

Let's not change things just because we could.

--
Robert Haas
EDB: http://www.enterprisedb.com

#5Zhihong Yu
zyu@yugabyte.com
In reply to: Robert Haas (#4)
Re: dropping datumSort field

On Tue, Aug 9, 2022 at 8:24 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Aug 9, 2022 at 11:16 AM Zhihong Yu <zyu@yugabyte.com> wrote:

tupDesc is declared inside `if (!node->sort_Done)` block whereas the

last reference to tupDesc is outside the if block.

Yep.

I take your review comment and will go back to do more homework.

The real point for me here is you haven't offered any reason to make
this change. The structure member in question is basically free.
Because of alignment padding it uses no more memory, and it makes the
intent of the code clearer.

Let's not change things just because we could.

--
Robert Haas
EDB: http://www.enterprisedb.com

I should have provided motivation for the patch.

I was aware of recent changes around ExprEvalStep. e.g.

Remove unused fields from ExprEvalStep

Though the datumSort field may be harmless for now, in the future, the
(auto) alignment padding may not work if more fields are added to the
struct.
I think we should always leave some room in struct's for future expansion.

As for making the intent of the code clearer, the datumSort field is only
used in one method.
My previous patch removed some comment which should have been shifted into
this method.
In my opinion, localizing the check in single method is easier to
understand than resorting to additional struct field.

Cheers

#6Robert Haas
robertmhaas@gmail.com
In reply to: Zhihong Yu (#5)
Re: dropping datumSort field

On Tue, Aug 9, 2022 at 11:42 AM Zhihong Yu <zyu@yugabyte.com> wrote:

Though the datumSort field may be harmless for now, in the future, the (auto) alignment padding may not work if more fields are added to the struct.
I think we should always leave some room in struct's for future expansion.

I doubt the size of this struct is particularly important, unlike
ExprEvalStep which needs to be small. But if it turns out in the
future that we need to try to squeeze this struct into fewer bytes, we
can always do something like this then. Right now there's no obvious
point to it.

Sure, it might be valuable *if* we add more fields to the struct and
*if* that means that the byte taken up by this flag actually makes the
struct bigger and *if* the size of the struct is demonstrated to be a
performance problem. But right now none of that has happened, and
maybe none of it will ever happen.

--
Robert Haas
EDB: http://www.enterprisedb.com

#7David Rowley
dgrowleyml@gmail.com
In reply to: Zhihong Yu (#3)
Re: dropping datumSort field

On Wed, 10 Aug 2022 at 03:16, Zhihong Yu <zyu@yugabyte.com> wrote:

On Tue, Aug 9, 2022 at 8:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

One problem with this patch is that, if I apply it, PostgreSQL does not compile:

nodeSort.c:197:6: error: use of undeclared identifier 'tupDesc'
if (tupDesc->natts == 1)
^
1 error generated.

Leaving that aside, I don't really see any advantage in this sort of change.

I'm with Robert on this one. If you look at the discussion for that
commit, you'll find quite a bit of benchmarking was done around this
[1]: /messages/by-id/CAApHDvrWV=v0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA@mail.gmail.com
ensure whatever is being referenced here causes the least amount of
CPU cache misses. Volcano executors which process a single row at a
time are not exactly an ideal workload for a modern processor due to
the bad L1i and L1d hit ratios. This becomes worse with more plan
nodes as these caches are more likely to get flushed of useful cache
lines when mode notes are visited. Various other fields in 'node' have
just been accessed in the code leading up to the "if
(node->datumSort)" check, so we're probably not going to encounter any
CPU pipeline stalls waiting for cache lines to be loaded into L1d. It
seems you're proposing to change this and have offered no evidence of
no performance regressions from doing so. Going by the compilation
error above, it seems unlikely that you've given performance any
consideration at all.

I mean this in the best way possible; for the future, I really
recommend arriving with ideas that are well researched and tested.
When you can, arrive with evidence to prove your change is good. For
this case, evidence would be benchmark results. The problem is if you
were to arrive with patches such as this too often then you'll start
to struggle to get attention from anyone, let alone a committer. You
don't want to build a reputation for bad quality work as it's likely
most committers will steer clear of your work. If you want a good
recent example of a good proposal, have a look at Yuya Watari's
write-up at [2]/messages/by-id/CAJ2pMkZNCgoUKSE+_5LthD+KbXKvq6h2hQN8Esxpxd+cxmgomg@mail.gmail.com and [3]/messages/by-id/CAJ2pMkZKFVmPHovyyueBpwb_nYYVk2+GaDqgzxZVnjkvxgtXog@mail.gmail.com. There was a clear problem statement there and
a patch that was clearly a proof of concept only, so the author was
under no illusion that what he had was ideal. Now, some other ideas
were suggested on that thread to hopefully simplify the task and
provide even better performance. Yuya went off and did that and
arrived back armed with further benchmark results. I was quite
impressed with this considering he's not posted to -hackers very
often. Now, if you were a committer, would you be looking at the
patches from the person who sends in half-thought-through ideas, or
patches from someone that has clearly put a great deal of effort into
first clearly stating the problem and then proving that the problem is
solved by the given patch?

David

[1]: /messages/by-id/CAApHDvrWV=v0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA@mail.gmail.com
[2]: /messages/by-id/CAJ2pMkZNCgoUKSE+_5LthD+KbXKvq6h2hQN8Esxpxd+cxmgomg@mail.gmail.com
[3]: /messages/by-id/CAJ2pMkZKFVmPHovyyueBpwb_nYYVk2+GaDqgzxZVnjkvxgtXog@mail.gmail.com

#8Zhihong Yu
zyu@yugabyte.com
In reply to: David Rowley (#7)
Re: dropping datumSort field

On Tue, Aug 9, 2022 at 9:04 PM David Rowley <dgrowleyml@gmail.com> wrote:

On Wed, 10 Aug 2022 at 03:16, Zhihong Yu <zyu@yugabyte.com> wrote:

On Tue, Aug 9, 2022 at 8:01 AM Robert Haas <robertmhaas@gmail.com>

wrote:

One problem with this patch is that, if I apply it, PostgreSQL does not

compile:

nodeSort.c:197:6: error: use of undeclared identifier 'tupDesc'
if (tupDesc->natts == 1)
^
1 error generated.

Leaving that aside, I don't really see any advantage in this sort of

change.

I'm with Robert on this one. If you look at the discussion for that
commit, you'll find quite a bit of benchmarking was done around this
[1]. The "if" test here is in quite a hot code path, so we want to
ensure whatever is being referenced here causes the least amount of
CPU cache misses. Volcano executors which process a single row at a
time are not exactly an ideal workload for a modern processor due to
the bad L1i and L1d hit ratios. This becomes worse with more plan
nodes as these caches are more likely to get flushed of useful cache
lines when mode notes are visited. Various other fields in 'node' have
just been accessed in the code leading up to the "if
(node->datumSort)" check, so we're probably not going to encounter any
CPU pipeline stalls waiting for cache lines to be loaded into L1d. It
seems you're proposing to change this and have offered no evidence of
no performance regressions from doing so. Going by the compilation
error above, it seems unlikely that you've given performance any
consideration at all.

I mean this in the best way possible; for the future, I really
recommend arriving with ideas that are well researched and tested.
When you can, arrive with evidence to prove your change is good. For
this case, evidence would be benchmark results. The problem is if you
were to arrive with patches such as this too often then you'll start
to struggle to get attention from anyone, let alone a committer. You
don't want to build a reputation for bad quality work as it's likely
most committers will steer clear of your work. If you want a good
recent example of a good proposal, have a look at Yuya Watari's
write-up at [2] and [3]. There was a clear problem statement there and
a patch that was clearly a proof of concept only, so the author was
under no illusion that what he had was ideal. Now, some other ideas
were suggested on that thread to hopefully simplify the task and
provide even better performance. Yuya went off and did that and
arrived back armed with further benchmark results. I was quite
impressed with this considering he's not posted to -hackers very
often. Now, if you were a committer, would you be looking at the
patches from the person who sends in half-thought-through ideas, or
patches from someone that has clearly put a great deal of effort into
first clearly stating the problem and then proving that the problem is
solved by the given patch?

David

[1]
/messages/by-id/CAApHDvrWV=v0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA@mail.gmail.com
[2]
/messages/by-id/CAJ2pMkZNCgoUKSE+_5LthD+KbXKvq6h2hQN8Esxpxd+cxmgomg@mail.gmail.com
[3]
/messages/by-id/CAJ2pMkZKFVmPHovyyueBpwb_nYYVk2+GaDqgzxZVnjkvxgtXog@mail.gmail.com

Hi, David:
Thanks for your detailed response.