Fix gin index cost estimation

Started by Ronan Dunklau, over 3 years ago. 18 messages.
#1Ronan Dunklau
ronan.dunklau@aiven.io
1 attachment(s)

Hello,

Following the bug report at [1], I sent the attached patch to the pgsql-bugs
mailing list. I'm starting a thread here to add it to the next commitfest.

The problem I'm trying to solve is that, contrary to btree, gist and sp-gist
indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

This can be a problem in cases where the io cost is very low. This can happen
with manual tuning of course, but more surprisingly when the IO cost is
amortized over a large number of iterations in a nested loop. In that case, we
basically consider it free since everything should already be in the shared
buffers. This leads to some inefficient plans, as an equivalent btree index
should be picked instead.

This has been discovered in PG14, as this release makes it possible to use a
pg_trgm gin index with the equality operator. Before that, only the btree
would have been considered and as such the discrepancy in the way we charge
cpu cost didn't have noticeable effects. However, I suspect users of btree_gin
could have the same kind of problems in prior versions.

Best regards,

[1]: /messages/by-id/flat/2187702.iZASKD2KPV%40aivenronan#0c2498c6a85e31a589b3e9a6a3616c52

--
Ronan Dunklau

Attachments:

v1-0001-Fix-gin-costing.patch (text/x-patch)
From 0aa1ff24e58234d759c07f4eeec163a82244be25 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Wed, 6 Jul 2022 17:29:01 +0200
Subject: [PATCH v1] Fix gin costing.

GIN index scans were not taking any descent CPU-based cost into account. That made
them look cheaper than other types of indexes when they shouldn't be.

We use the same heuristic as for btree indexes, but multiplying it by
the number of searched entries.

Per report of Hung Nguyen.
---
 src/backend/utils/adt/selfuncs.c | 36 +++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fa1f589fad..21407c3d38 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -7425,6 +7425,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 				qual_arg_cost,
 				spc_random_page_cost,
 				outer_scans;
+	Cost		descentCost;
 	Relation	indexRel;
 	GinStatsData ginStats;
 	ListCell   *lc;
@@ -7524,6 +7525,18 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 							  &spc_random_page_cost,
 							  NULL);
 
+
+	/*
+	 * We model index descent costs similarly to those for btree, but we also
+	 * need an idea of the tree_height.
+	 * We use numEntries / numEntryPages as the fanout factor.
+	 */
+	if (index->tree_height < 0)
+	{
+		index->tree_height = ceil(log(numEntries) / log(numEntries / numEntryPages));
+	}
+
+
 	/*
 	 * Generic assumption about index correlation: there isn't any.
 	 */
@@ -7649,6 +7662,27 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 */
 	dataPagesFetched = ceil(numDataPages * partialScale);
 
+
+	*indexStartupCost = 0;
+
+	/*
+	 * Add a CPU-cost component similar to btree to represent the costs of the
+	 * initial descent.
+	 * We charge descentCost once for every entry
+	 */
+	if (numTuples > 1)
+	{
+		descentCost = ceil(log(numTuples) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+	}
+
+	/*
+	 * Add a similar per-page charge, depending on the tree heights.
+	 */
+	descentCost = (index->tree_height + 1) * 50.0 * cpu_operator_cost;
+	*indexStartupCost += descentCost * counts.searchEntries;
+
+
 	/*
 	 * Calculate cache effects if more than one scan due to nestloops or array
 	 * quals.  The result is pro-rated per nestloop scan, but the array qual
@@ -7672,7 +7706,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * Here we use random page cost because logically-close pages could be far
 	 * apart on disk.
 	 */
-	*indexStartupCost = (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
+	*indexStartupCost += (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
 
 	/*
 	 * Now compute the number of data pages fetched during the scan.
-- 
2.37.0

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#1)
1 attachment(s)
Re: Fix gin index cost estimation

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

Following the bug report at [1], I sent the attached patch to pgsql-bugs
mailing list. I'm starting a thread here to add it to the next commitfest.

That link didn't work easily for me (possibly because it got split across
lines). Here's another one for anybody having similar issues:

/messages/by-id/CABs3KGQnOkyQ42-zKQqiE7M0Ks9oWDSee=+Jx3-TGq=68xqWYw@mail.gmail.com

The problem I'm trying to solve is that, contrary to btree, gist and sp-gist
indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

As I said in the bug report thread, I think we really need to take a look
at all of our index AMs not just GIN. I extended your original reproducer
script to check all the AMs (attached), and suppressed memoize because it
seemed to behave differently for different AMs. Here's what I see for the
estimated costs of the inner indexscan, and the actual runtime, for each:

btree:
-> Index Only Scan using t1_btree_index on t1 (cost=0.28..0.30 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=20000)
Execution Time: 19.763 ms

gin (gin_trgm_ops):
-> Bitmap Heap Scan on t1 (cost=0.01..0.02 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=20000)
-> Bitmap Index Scan on t1_gin_index (cost=0.00..0.01 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=20000)
Execution Time: 75.216 ms

gist:
-> Index Only Scan using t1_gist_index on t1 (cost=0.14..0.16 rows=1 width=4) (actual time=0.014..0.014 rows=1 loops=20000)
Execution Time: 277.799 ms

spgist:
-> Index Only Scan using t1_spgist_index on t1 (cost=0.14..0.16 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=20000)
Execution Time: 51.407 ms

hash:
-> Index Scan using t1_hash_index on t1 (cost=0.00..0.02 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=20000)
Execution Time: 13.090 ms

brin:
-> Bitmap Heap Scan on t1 (cost=0.03..18.78 rows=1 width=4) (actual time=0.049..0.093 rows=1 loops=20000)
-> Bitmap Index Scan on t1_brin_index (cost=0.00..0.03 rows=1500 width=0) (actual time=0.003..0.003 rows=70 loops=20000)
Execution Time: 1890.161 ms

bloom:
-> Bitmap Heap Scan on t1 (cost=11.25..11.26 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=20000)
-> Bitmap Index Scan on t1_bloom_index (cost=0.00..11.25 rows=1 width=0) (actual time=0.003..0.003 rows=2 loops=20000)
Execution Time: 88.703 ms

(These figures shouldn't be trusted too much because I did nothing
to suppress noise. They seem at least somewhat reproducible, though.)

So, taking btree as our reference point, gin has clearly got a problem
because it's estimating less than a tenth as much cost despite actually
being nearly 4X slower. gist and spgist are not as bad off, but
nonetheless they claim to be cheaper than btree when they are not.
The result for hash looks suspicious as well, though at least we'd
make the right index choice for this particular case. brin and bloom
correctly report being a lot more expensive than btree, so at least
for the moment I'm not worried about them.

BTW, the artificially small random_page_cost doesn't really affect
this much. If I set it to a perfectly reasonable value like 1.0,
gin produces a saner cost estimate but gist, spgist, and hash do
not change their estimates at all. btree's estimate doesn't change
either, which seems like it might be OK for index-only scans but
I doubt I believe it for index scans. In any case, at least one
of gin and hash is doing it wrong.

In short, I think gist and spgist probably need a minor tweak to
estimate more CPU cost than they do now, and hash needs a really
hard look at whether it's sane at all.

That's all orthogonal to the merits of your patch for gin,
so I'll respond separately about that.

regards, tom lane

Attachments:

costtest.sql (text/plain)
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#1)
Re: Fix gin index cost estimation

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

The problem I'm trying to solve is that, contrary to btree, gist and sp-gist
indexes, gin indexes do not charge any cpu-cost for descending the entry tree.

I looked this over briefly. I think you are correct to charge an
initial-search cost per searchEntries count, but don't we also need to
scale up by arrayScans, similar to the "corrections for cache effects"?

+	 * We model index descent costs similarly to those for btree, but we also
+	 * need an idea of the tree_height.
+	 * We use numEntries / numEntryPages as the fanout factor.

I'm not following that calculation? It seems like it'd be correct
only for a tree height of 1, although maybe I'm just misunderstanding
this (overly terse, perhaps) comment.

+	 * We charge descentCost once for every entry
+	 */
+	if (numTuples > 1)
+	{
+		descentCost = ceil(log(numTuples) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+	}

I had to read this twice before absorbing the point of the numTuples
test. Maybe help the reader a bit:

+ if (numTuples > 1) /* ensure positive log() */

Personally I'd duplicate the comments from nbtreecostestimate rather
than just assuming the reader will go consult them. For that matter,
why didn't you duplicate nbtree's logic for charging for SA scans?
This bit seems just as relevant for GIN:

* If there are ScalarArrayOpExprs, charge this once per SA scan. The
* ones after the first one are not startup cost so far as the overall
* plan is concerned, so add them only to "total" cost.

Keep in mind also that pgindent will have its own opinions about how to
format these comments, and it can out-stubborn you. Either run the
comments into single paragraphs, or if you really want them to be two
paras then leave an empty comment line between. Another formatting
nitpick is that you seem to have added a number of unnecessary blank
lines.

regards, tom lane

#4Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Tom Lane (#3)
Re: Fix gin index cost estimation

Thank you for looking at it.

I looked this over briefly. I think you are correct to charge an
initial-search cost per searchEntries count, but don't we also need to
scale up by arrayScans, similar to the "corrections for cache effects"?

+	 * We model index descent costs similarly to those for btree, but we also
+	 * need an idea of the tree_height.
+	 * We use numEntries / numEntryPages as the fanout factor.

I'm not following that calculation? It seems like it'd be correct
only for a tree height of 1, although maybe I'm just misunderstanding
this (overly terse, perhaps) comment.

I don't really understand why that would work only with a tree height of one?
Every entry page contains a certain number of entries, and as such computing
the average number of entries per page seems to be a good approximation for
the fanout. But I may have misunderstood what was done in other index types.

For consistency, maybe we should just use a hard coded value of 100 for the
fanout factor, similarly to what we do for other index types.

But I realised that another approach might be better suited: since we want to
charge a cpu cost for every page visited, actually basing that on the already
estimated entryPagesFetched and dataPagesFetched would be better, instead of
copying what is done for other indexes type and estimating the tree height. It
would be simpler, as we don't need to estimate the tree height anymore.

I will submit a patch doing that.

+	 * We charge descentCost once for every entry
+	 */
+	if (numTuples > 1)
+	{
+		descentCost = ceil(log(numTuples) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+	}

I had to read this twice before absorbing the point of the numTuples
test. Maybe help the reader a bit:

+ if (numTuples > 1) /* ensure positive log() */

Ok. On second read, I think that part was actually wrong: what we care about
is not the number of tuples here, but the number of entries.

Personally I'd duplicate the comments from nbtreecostestimate rather
than just assuming the reader will go consult them. For that matter,
why didn't you duplicate nbtree's logic for charging for SA scans?
This bit seems just as relevant for GIN:

* If there are ScalarArrayOpExprs, charge this once per SA scan.  The
* ones after the first one are not startup cost so far as the overall
* plan is concerned, so add them only to "total" cost.

You're right. So what we need to do here is scale up whatever we charge for
the startup cost by the number of arrayscans for the total cost.

Keep in mind also that pgindent will have its own opinions about how to
format these comments, and it can out-stubborn you. Either run the
comments into single paragraphs, or if you really want them to be two
paras then leave an empty comment line between. Another formatting
nitpick is that you seem to have added a number of unnecessary blank
lines.

Thanks, noted.

I'll submit a new patch as soon as I've resolved some of the problems I
have when accounting for scalararrayops.

Best regards,

--
Ronan Dunklau

#5Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Ronan Dunklau (#4)
1 attachment(s)
Re: Fix gin index cost estimation

On Monday, September 12, 2022 at 16:41:16 CEST, Ronan Dunklau wrote:

But I realised that another approach might be better suited: since we want to
charge a cpu cost for every page visited, actually basing that on the already
estimated entryPagesFetched and dataPagesFetched would be better, instead of
copying what is done for other index types and estimating the tree height. It
would be simpler, as we don't need to estimate the tree height anymore.

I will submit a patch doing that.

The attached does that and is much simpler. I only took into account
entryPagesFetched, not sure if we should also charge something for data pages.

Instead of trying to estimate the height of the tree, we rely on the
(imperfect) estimation of the number of entry pages fetched, and charge 50
times cpu_operator_cost to that, in addition to the cpu_operator_cost charged
per entry visited.

I also adapted it to take into account multiple scans induced by scalar array
operations.

As it is, I don't understand the following calculation:

/*
* Estimate number of entry pages read. We need to do
* counts.searchEntries searches. Use a power function as it should be,
* but tuples on leaf pages usually is much greater. Here we include all
* searches in entry tree, including search of first entry in partial
* match algorithm
*/
entryPagesFetched += ceil(counts.searchEntries * rint(pow(numEntryPages,
0.15)));

Is the power(0.15) used as an approximation for a log? If so, why? Also,
shouldn't we round that up?
It seems to me it's unlikely to affect the total too much in normal cases
(adding at worst random_page_cost), but if we start to charge cpu operator
costs as proposed here it makes a big difference, and it is probably safer
to overestimate a bit than the opposite.

With those changes, the gin cost (purely cpu-wise) stays above the btree one
as I think it should be.

--
Ronan Dunklau

Attachments:

v2-0001-Fix-gin-costing.patch (text/x-patch)
From 8635a22ee5f90297756f072199c69a318bd17ea2 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Mon, 12 Sep 2022 15:40:18 +0200
Subject: [PATCH v2] Fix gin costing.

GIN index scans were not taking any descent CPU-based cost into account. That made
them look cheaper than other types of indexes when they shouldn't be.

We use the same heuristic as for btree indexes, but multiplying it by
the number of searched entries.

Per report of Hung Nguyen.
---
 src/backend/utils/adt/selfuncs.c | 34 +++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index c746759eef..881e470a07 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -7419,6 +7419,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 				qual_arg_cost,
 				spc_random_page_cost,
 				outer_scans;
+	Cost		descentCost;
 	Relation	indexRel;
 	GinStatsData ginStats;
 	ListCell   *lc;
@@ -7622,7 +7623,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * searches in entry tree, including search of first entry in partial
 	 * match algorithm
 	 */
-	entryPagesFetched += ceil(counts.searchEntries * rint(pow(numEntryPages, 0.15)));
+	entryPagesFetched += ceil(counts.searchEntries * ceil(pow(numEntryPages, 0.15)));
 
 	/*
 	 * Add an estimate of entry pages read by partial match algorithm. It's a
@@ -7643,6 +7644,33 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 */
 	dataPagesFetched = ceil(numDataPages * partialScale);
 
+	*indexStartupCost = 0;
+	*indexTotalCost = 0;
+
+	/*
+	 * Add a CPU-cost component to represent the costs of initial entry btree
+	 * descent.  We don't charge any I/O cost for touching upper btree levels,
+	 * since they tend to stay in cache, but we still have to do about log2(N)
+	 * comparisons to descend a btree of N leaf tuples.  We charge one
+	 * cpu_operator_cost per comparison.
+	 *
+	 * If there are ScalarArrayOpExprs, charge this once per SA scan.  The
+	 * ones after the first one are not startup cost so far as the overall
+	 * plan is concerned, so add them only to "total" cost.
+	 */
+	if (numEntries > 1)			/* avoid computing log(0) */
+	{
+		descentCost = ceil(log(numEntries) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+		*indexTotalCost += counts.arrayScans * descentCost * counts.searchEntries;
+	}
+
+	/*
+	 * Add a cpu cost per page fetched. This is not amortized over a loop.
+	 */
+	*indexStartupCost += entryPagesFetched * 50.0 * cpu_operator_cost;
+	*indexTotalCost += entryPagesFetched * counts.arrayScans * 50.0 * cpu_operator_cost;
+
 	/*
 	 * Calculate cache effects if more than one scan due to nestloops or array
 	 * quals.  The result is pro-rated per nestloop scan, but the array qual
@@ -7666,7 +7694,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * Here we use random page cost because logically-close pages could be far
 	 * apart on disk.
 	 */
-	*indexStartupCost = (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
+	*indexStartupCost += (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
 
 	/*
 	 * Now compute the number of data pages fetched during the scan.
@@ -7705,7 +7733,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	}
 
 	/* And apply random_page_cost as the cost per page */
-	*indexTotalCost = *indexStartupCost +
+	*indexTotalCost += *indexStartupCost +
 		dataPagesFetched * spc_random_page_cost;
 
 	/*
-- 
2.37.3

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ronan Dunklau (#5)
Re: Fix gin index cost estimation

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

The attached does that and is much simpler. I only took into account
entryPagesFetched, not sure if we should also charge something for data pages.

I think this is wrong, because there is already a CPU charge based on
the number of tuples visited, down at the very end of the routine:

*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);

It's certainly possible to argue that that's incorrectly computed,
but just adding another cost charge for the same topic can't be right.

I do suspect that that calculation is bogus, because it looks like it's
based on the concept of "apply the quals at each index entry", which we
know is not how GIN operates. So maybe we should drop that bit in favor
of a per-page-ish cost like you have here. Not sure. In any case it
seems orthogonal to the question of startup/descent costs. Maybe we'd
better tackle just one thing at a time.

(BTW, given that that charge does exist and is not affected by
repeated-scan amortization, why do we have a problem in the first place?
Is it just too small? I guess that when we're only expecting one tuple
to be retrieved, it would only add about cpu_index_tuple_cost.)

Is the power(0.15) used an approximation for a log? If so why? Also
shouldn't we round that up?

No idea, but I'm pretty hesitant to just randomly fool with that equation
when (a) neither of us know where it came from and (b) exactly no evidence
has been provided that it's wrong.

I note for instance that the existing logic switches from charging 1 page
per search to 2 pages per search at numEntryPages = 15 (1.5 ^ (1/0.15)).
Your version would switch at 2 pages, as soon as the pow() result is even
fractionally above 1.0. Maybe the existing logic is too optimistic there,
but ceil() makes it too pessimistic I think. I'd sooner tweak the power
constant than change from rint() to ceil().

regards, tom lane

#7Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Tom Lane (#6)
Re: Fix gin index cost estimation

On Friday, September 16, 2022 at 22:04:59 CEST, Tom Lane wrote:

Ronan Dunklau <ronan.dunklau@aiven.io> writes:

The attached does that and is much simpler. I only took into account
entryPagesFetched, not sure if we should also charge something for data pages.

I think this is wrong, because there is already a CPU charge based on
the number of tuples visited, down at the very end of the routine:

*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);

It's certainly possible to argue that that's incorrectly computed,
but just adding another cost charge for the same topic can't be right.

I don't think it's the same thing. The entryPagesFetched is computed
independently of the selectivity and the number of tuples. As such, I think it
makes sense to use it to compute the cost of descending the entry tree.

As mentioned earlier, I don't really understand the formula for computing
entryPagesFetched. If we were to estimate the tree height to compute the
descent cost as I first proposed, I feel like we would use two different metrics
for what is essentially the same cost: something proportional to the size of
the entry tree.

I do suspect that that calculation is bogus, because it looks like it's
based on the concept of "apply the quals at each index entry", which we
know is not how GIN operates. So maybe we should drop that bit in favor
of a per-page-ish cost like you have here. Not sure. In any case it
seems orthogonal to the question of startup/descent costs. Maybe we'd
better tackle just one thing at a time.

Hum, good point. Maybe that should be revisited too.

(BTW, given that that charge does exist and is not affected by
repeated-scan amortization, why do we have a problem in the first place?
Is it just too small? I guess that when we're only expecting one tuple
to be retrieved, it would only add about cpu_index_tuple_cost.)

Because with a very low selectivity, we end up under-charging for the cost of
walking the entry tree by a significant amount. As said above, I don't see how
those two things are the same: that charge estimates the cost of applying
index quals to the visited tuples, which is not the same as charging per entry
page visited.

Is the power(0.15) used an approximation for a log? If so why? Also
shouldn't we round that up?

No idea, but I'm pretty hesitant to just randomly fool with that equation
when (a) neither of us know where it came from and (b) exactly no evidence
has been provided that it's wrong.

I note for instance that the existing logic switches from charging 1 page
per search to 2 pages per search at numEntryPages = 15 (1.5 ^ (1/0.15)).
Your version would switch at 2 pages, as soon as the pow() result is even
fractionally above 1.0. Maybe the existing logic is too optimistic there,
but ceil() makes it too pessimistic I think. I'd sooner tweak the power
constant than change from rint() to ceil().

You're right, I was too eager to try to raise the CPU cost proportionally to
the number of array scans (scalararrayop). I'd really like to understand where
this equation comes from though...

Best regards,

--
Ronan Dunklau

#8Michael Paquier
michael@paquier.xyz
In reply to: Ronan Dunklau (#7)
Re: Fix gin index cost estimation

On Mon, Sep 19, 2022 at 09:15:25AM +0200, Ronan Dunklau wrote:

You're right, I was too eager to try to raise the CPU cost proportionally to
the number of array scans (scalararrayop). I'd really like to understand where
this equation comes from though...

So, what's the latest update here?
--
Michael

#9Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Michael Paquier (#8)
Re: Fix gin index cost estimation

You're right, I was too eager to try to raise the CPU cost proportionally to
the number of array scans (scalararrayop). I'd really like to understand where
this equation comes from though...

So, what's the latest update here?

Thanks Michael for reviving this thread.

Before proceeding any further with this, I'd like to understand where we
stand. Tom argued my way of charging cost per entry pages visited boils down
to charging per tuple, which I expressed disagreement with.

If we can come to a consensus whether that's a bogus way of thinking about it
I'll proceed with what we agree on.

--
Ronan Dunklau

#10Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ronan Dunklau (#9)
Re: Fix gin index cost estimation

Hi, Ronan!

On Wed, Oct 12, 2022 at 10:15 AM Ronan Dunklau <ronan.dunklau@aiven.io>
wrote:

You're right, I was too eager to try to raise the CPU cost proportionally to
the number of array scans (scalararrayop). I'd really like to understand where
this equation comes from though...

So, what's the latest update here?

Thanks Michael for reviving this thread.

Before proceeding any further with this, I'd like to understand where we
stand. Tom argued my way of charging cost per entry page visited boils down
to charging per tuple, which I expressed disagreement with.

If we can come to a consensus whether that's a bogus way of thinking about it
I'll proceed with what we agree on.

I briefly read the thread. I think this line is copy-pasted from other index
access methods and tries to estimate the whole index scan CPU cost while
bypassing all the details.

*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);

I think Tom's point was that it's wrong to add a separate entry-tree CPU
cost estimation to another estimation, which tries (very inadequately) to
estimate the whole scan cost. Instead, I propose writing better estimations
for both entry-tree CPU cost and data-trees CPU cost and replacing existing
CPU estimation altogether.

------
Regards,
Alexander Korotkov

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#10)
Re: Fix gin index cost estimation

Alexander Korotkov <aekorotkov@gmail.com> writes:

I think Tom's point was that it's wrong to add a separate entry-tree CPU
cost estimation to another estimation, which tries (very inadequately) to
estimate the whole scan cost. Instead, I propose writing better estimations
for both entry-tree CPU cost and data-trees CPU cost and replacing existing
CPU estimation altogether.

Great idea, if someone is willing to do it ...

regards, tom lane

#12Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Tom Lane (#11)
1 attachment(s)
Re: Fix gin index cost estimation

On Tuesday, October 25, 2022 at 16:18:57 CET, Tom Lane wrote:

Alexander Korotkov <aekorotkov@gmail.com> writes:

I think Tom's point was that it's wrong to add a separate entry-tree CPU
cost estimation to another estimation, which tries (very inadequately) to
estimate the whole scan cost. Instead, I propose writing better estimations
for both entry-tree CPU cost and data-trees CPU cost and replacing existing
CPU estimation altogether.

Great idea, if someone is willing to do it ...

regards, tom lane

Hello,

Sorry for the delay, but here is an updated patch which changes the costing in
the following way:

- add a descent cost, similar to the btree one, for the initial entry-tree
descent
- additionally, apply a charge per page visited, in both the entry tree and
the posting trees / lists
- instead of charging the quals to each tuple, charge them per entry only. We
still charge cpu_index_tuple_cost per tuple though.

With those changes, there is no need to tweak the magic number formula
estimating the number of pages. Maybe we can come up with something better
for estimating those later on?

Best regards,

--
Ronan Dunklau

Attachments:

v3-0001-Fix-gin-costing.patch (text/x-patch)
From 49bc71f25967f9367196803be363509922aac8ab Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Mon, 12 Sep 2022 15:40:18 +0200
Subject: [PATCH v3] Fix gin costing.

GIN index scans were not taking any descent CPU-based cost into account. That made
them look cheaper than other types of indexes when they shouldn't be.

We use the same heuristic as for btree indexes, but multiplying it by
the number of searched entries.

Additionnally, the cpu cost for the tree was based largely on
genericcostestimate. For a GIN index, we should not charge index quals
per tuple, but per entry. On top of this, charge cpu_index_tuple_cost
per actual tuple.

This should fix the cases where a GIN index is preferred over a btree,
and the ones where a memoize node is not added on top of the GIN index
scan because it seemed too cheap.

Per report of Hung Nguyen.
---
 src/backend/utils/adt/selfuncs.c | 50 +++++++++++++++++++++++++++++---
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f116924d3c..1d07ae8e95 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -7445,6 +7445,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 				qual_arg_cost,
 				spc_random_page_cost,
 				outer_scans;
+	Cost		descentCost;
 	Relation	indexRel;
 	GinStatsData ginStats;
 	ListCell   *lc;
@@ -7669,6 +7670,41 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 */
 	dataPagesFetched = ceil(numDataPages * partialScale);
 
+	*indexStartupCost = 0;
+	*indexTotalCost = 0;
+
+	/*
+	 * Add a CPU-cost component to represent the costs of initial entry btree
+	 * descent.  We don't charge any I/O cost for touching upper btree levels,
+	 * since they tend to stay in cache, but we still have to do about log2(N)
+	 * comparisons to descend a btree of N leaf tuples.  We charge one
+	 * cpu_operator_cost per comparison.
+	 *
+	 * If there are ScalarArrayOpExprs, charge this once per SA scan.  The
+	 * ones after the first one are not startup cost so far as the overall
+	 * plan is concerned, so add them only to "total" cost.
+	 */
+	if (numEntries > 1)			/* avoid computing log(0) */
+	{
+		descentCost = ceil(log(numEntries) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+		*indexTotalCost += counts.arrayScans * descentCost * counts.searchEntries;
+	}
+
+	/*
+	 * Add a cpu cost per entry-page fetched. This is not amortized over a loop.
+	 */
+	*indexStartupCost += entryPagesFetched * 50.0 * cpu_operator_cost;
+	*indexTotalCost += entryPagesFetched * counts.arrayScans * 50.0 * cpu_operator_cost;
+
+	/*
+	 * Add a cpu cost per data-page fetched. This is also not amortized over a loop.
+	 * We only charge one data page for the startup cost, and everything else to
+	 * the total cost.
+	 */
+	*indexStartupCost += 50.0 * cpu_operator_cost;
+	*indexTotalCost += dataPagesFetched * counts.arrayScans * 50.0 * cpu_operator_cost;
+
 	/*
 	 * Calculate cache effects if more than one scan due to nestloops or array
 	 * quals.  The result is pro-rated per nestloop scan, but the array qual
@@ -7692,7 +7728,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * Here we use random page cost because logically-close pages could be far
 	 * apart on disk.
 	 */
-	*indexStartupCost = (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
+	*indexStartupCost += (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
 
 	/*
 	 * Now compute the number of data pages fetched during the scan.
@@ -7720,6 +7756,9 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	if (dataPagesFetchedBySel > dataPagesFetched)
 		dataPagesFetched = dataPagesFetchedBySel;
 
+	/* Add once again a CPU-cost for those data pages, before amortizing for cache. */
+	*indexTotalCost += dataPagesFetched * counts.arrayScans * 50.0 * cpu_operator_cost;
+
 	/* Account for cache effects, the same as above */
 	if (outer_scans > 1 || counts.arrayScans > 1)
 	{
@@ -7731,11 +7770,11 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	}
 
 	/* And apply random_page_cost as the cost per page */
-	*indexTotalCost = *indexStartupCost +
+	*indexTotalCost += *indexStartupCost +
 		dataPagesFetched * spc_random_page_cost;
 
 	/*
-	 * Add on index qual eval costs, much as in genericcostestimate.  But we
+	 * Add on index qual eval costs, much as in genericcostestimate. We charge cpu but we
 	 * can disregard indexorderbys, since GIN doesn't support those.
 	 */
 	qual_arg_cost = index_other_operands_eval_cost(root, indexQuals);
@@ -7743,7 +7782,10 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 
 	*indexStartupCost += qual_arg_cost;
 	*indexTotalCost += qual_arg_cost;
-	*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);
+	/* Add a cpu cost per search entry, corresponding to the actual visited entries. */
+	*indexTotalCost += (counts.searchEntries * counts.arrayScans) * (qual_op_cost);
+	/* Now add a cpu cost per tuple in the posting lists / trees */
+	*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost);
 	*indexPages = dataPagesFetched;
 }
 
-- 
2.38.1

#13Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ronan Dunklau (#12)
Re: Fix gin index cost estimation

Hi, Ronan!

On Fri, Dec 2, 2022 at 1:19 PM Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Sorry for the delay, but here is an updated patch which changes the costing in
the following way:

- a descent cost, similar to the one charged for btree, is applied for the
initial entry-tree descent
- additionally, a charge is applied per page in both the entry tree and
posting trees / lists
- instead of charging the quals to each tuple, charge them per entry only. We
still charge cpu_index_tuple_cost per tuple though.

With those changes, there is no need to tweak the magic-number formula estimating
the number of pages. Maybe we can come up with something better for estimating
those later on?

Thank you for your patch. Couple of quick questions.
1) What magic number 50.0 stands for? I think we at least should make
it a macro.
2) "We only charge one data page for the startup cost" – should this
be dependent on number of search entries?

------
Regards,
Alexander Korotkov

#14Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Alexander Korotkov (#13)
Re: Fix gin index cost estimation

Le vendredi 2 décembre 2022, 12:33:33 CET Alexander Korotkov a écrit :

Hi, Ronan!
Thank you for your patch. Couple of quick questions.
1) What magic number 50.0 stands for? I think we at least should make
it a macro.

This is what is used in other tree-descending estimation functions, so I used
that too. Maybe a DEFAULT_PAGE_CPU_COST macro would work for both? If so, I'll
separate this into two patches, one introducing the macro for the other
estimation functions, and this patch for gin.

2) "We only charge one data page for the startup cost" – should this
be dependent on number of search entries?

Good point, multiplying it by the number of search entries would do the trick.

Thank you for looking at this !

Regards,

--
Ronan Dunklau

#15Ronan Dunklau
ronan.dunklau@aiven.io
In reply to: Ronan Dunklau (#14)
2 attachment(s)
Re: Fix gin index cost estimation

Le vendredi 2 décembre 2022, 13:58:27 CET Ronan Dunklau a écrit :

Le vendredi 2 décembre 2022, 12:33:33 CET Alexander Korotkov a écrit :

Hi, Ronan!
Thank you for your patch. Couple of quick questions.
1) What magic number 50.0 stands for? I think we at least should make
it a macro.

This is what is used in other tree-descending estimation functions, so I
used that too. Maybe a DEFAULT_PAGE_CPU_COST macro would work for both? If
so I'll separate this into two patches, one introducing the macro for the
other estimation functions, and this patch for gin.

The 0001 patch does this.

2) "We only charge one data page for the startup cost" – should this
be dependent on number of search entries?

In fact there was another problem. The current code estimates two different
paths for fetching data pages: in the case of a partial match, it takes into
account that all the data pages will have to be fetched. So this is now
taken into account for the CPU cost as well.

For the regular search, we scale the number of data pages by the number of
search entries.

Best regards,

--
Ronan Dunklau

Attachments:

v4-0001-Extract-the-cpu-page-process-cost-multiplier-into.patch
From b3d03a96ecb3d03c4c212e3c434def6e0e77fcd4 Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Tue, 6 Dec 2022 11:08:46 +0100
Subject: [PATCH v4 1/2] Extract the cpu page-process cost multiplier into a
 macro.

Btree, GiST and SP-GiST all charge 50.0 * cpu_operator_cost for
processing an index page. Extract this to a macro to avoid repeated
magic numbers.
---
 src/backend/utils/adt/selfuncs.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index f116924d3c..5baf8cf631 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -140,6 +140,7 @@
 #include "utils/timestamp.h"
 #include "utils/typcache.h"
 
+#define DEFAULT_PAGE_CPU_MULTIPLIER 50.0
 
 /* Hooks for plugins to get control when we ask for stats */
 get_relation_stats_hook_type get_relation_stats_hook = NULL;
@@ -6858,7 +6859,7 @@ btcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * touched.  The number of such pages is btree tree height plus one (ie,
 	 * we charge for the leaf page too).  As above, charge once per SA scan.
 	 */
-	descentCost = (index->tree_height + 1) * 50.0 * cpu_operator_cost;
+	descentCost = (index->tree_height + 1) * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
 	costs.indexStartupCost += descentCost;
 	costs.indexTotalCost += costs.num_sa_scans * descentCost;
 
@@ -7053,7 +7054,7 @@ gistcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	/*
 	 * Likewise add a per-page charge, calculated the same as for btrees.
 	 */
-	descentCost = (index->tree_height + 1) * 50.0 * cpu_operator_cost;
+	descentCost = (index->tree_height + 1) * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
 	costs.indexStartupCost += descentCost;
 	costs.indexTotalCost += costs.num_sa_scans * descentCost;
 
@@ -7108,7 +7109,7 @@ spgcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	/*
 	 * Likewise add a per-page charge, calculated the same as for btrees.
 	 */
-	descentCost = (index->tree_height + 1) * 50.0 * cpu_operator_cost;
+	descentCost = (index->tree_height + 1) * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
 	costs.indexStartupCost += descentCost;
 	costs.indexTotalCost += costs.num_sa_scans * descentCost;
 
-- 
2.38.1

v4-0002-Fix-gin-costing.patch
From 817f02fa5ace98b343e585568165cc2aef84328a Mon Sep 17 00:00:00 2001
From: Ronan Dunklau <ronan.dunklau@aiven.io>
Date: Mon, 12 Sep 2022 15:40:18 +0200
Subject: [PATCH v4 2/2] Fix gin costing.

GIN index scans were not taking any descent CPU-based cost into account. That made
them look cheaper than other types of indexes when they shouldn't be.

We use the same heuristic as for btree indexes, but multiplying it by
the number of searched entries.

Additionally, the cpu cost for the tree was based largely on
genericcostestimate. For a GIN index, we should not charge index quals
per tuple, but per entry. On top of this, charge cpu_index_tuple_cost
per actual tuple.

This should fix the cases where a GIN index is preferred over a btree,
and the ones where a memoize node is not added on top of the GIN index
scan because it seemed too cheap.

Per report of Hung Nguyen.
---
 src/backend/utils/adt/selfuncs.c | 54 +++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 5baf8cf631..0b78fe450b 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -7446,6 +7446,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 				qual_arg_cost,
 				spc_random_page_cost,
 				outer_scans;
+	Cost		descentCost;
 	Relation	indexRel;
 	GinStatsData ginStats;
 	ListCell   *lc;
@@ -7670,6 +7671,43 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 */
 	dataPagesFetched = ceil(numDataPages * partialScale);
 
+	*indexStartupCost = 0;
+	*indexTotalCost = 0;
+
+	/*
+	 * Add a CPU-cost component to represent the costs of initial entry btree
+	 * descent.  We don't charge any I/O cost for touching upper btree levels,
+	 * since they tend to stay in cache, but we still have to do about log2(N)
+	 * comparisons to descend a btree of N leaf tuples.  We charge one
+	 * cpu_operator_cost per comparison.
+	 *
+	 * If there are ScalarArrayOpExprs, charge this once per SA scan.  The
+	 * ones after the first one are not startup cost so far as the overall
+	 * plan is concerned, so add them only to "total" cost.
+	 */
+	if (numEntries > 1)			/* avoid computing log(0) */
+	{
+		descentCost = ceil(log(numEntries) / log(2.0)) * cpu_operator_cost;
+		*indexStartupCost += descentCost * counts.searchEntries;
+		*indexTotalCost += counts.arrayScans * descentCost * counts.searchEntries;
+	}
+
+	/*
+	 * Add a cpu cost per entry-page fetched. This is not amortized over a loop.
+	 */
+	*indexStartupCost += entryPagesFetched * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
+	*indexTotalCost += entryPagesFetched * counts.arrayScans * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
+
+	/*
+	 * Add a cpu cost per data-page fetched. This is also not amortized over a loop.
+	 * Since those are the data pages from the partial match algorithm, charge them as startup cost.
+	 */
+	*indexStartupCost += DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost * dataPagesFetched;
+	/*
+	 * Since we add the startup cost to the total cost later on, remove the initial arrayscan from the total.
+	 */
+	*indexTotalCost += dataPagesFetched * (counts.arrayScans - 1) * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
+
 	/*
 	 * Calculate cache effects if more than one scan due to nestloops or array
 	 * quals.  The result is pro-rated per nestloop scan, but the array qual
@@ -7693,7 +7731,7 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	 * Here we use random page cost because logically-close pages could be far
 	 * apart on disk.
 	 */
-	*indexStartupCost = (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
+	*indexStartupCost += (entryPagesFetched + dataPagesFetched) * spc_random_page_cost;
 
 	/*
 	 * Now compute the number of data pages fetched during the scan.
@@ -7721,6 +7759,11 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	if (dataPagesFetchedBySel > dataPagesFetched)
 		dataPagesFetched = dataPagesFetchedBySel;
 
+	/* Add one page cpu-cost to the startup cost */
+	*indexStartupCost += DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost * counts.searchEntries;
+	/* Add once again a CPU-cost for those data pages, before amortizing for cache. */
+	*indexTotalCost += dataPagesFetched * counts.arrayScans * DEFAULT_PAGE_CPU_MULTIPLIER * cpu_operator_cost;
+
 	/* Account for cache effects, the same as above */
 	if (outer_scans > 1 || counts.arrayScans > 1)
 	{
@@ -7732,11 +7775,11 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 	}
 
 	/* And apply random_page_cost as the cost per page */
-	*indexTotalCost = *indexStartupCost +
+	*indexTotalCost += *indexStartupCost +
 		dataPagesFetched * spc_random_page_cost;
 
 	/*
-	 * Add on index qual eval costs, much as in genericcostestimate.  But we
+	 * Add on index qual eval costs, much as in genericcostestimate. We charge cpu but we
 	 * can disregard indexorderbys, since GIN doesn't support those.
 	 */
 	qual_arg_cost = index_other_operands_eval_cost(root, indexQuals);
@@ -7744,7 +7787,10 @@ gincostestimate(PlannerInfo *root, IndexPath *path, double loop_count,
 
 	*indexStartupCost += qual_arg_cost;
 	*indexTotalCost += qual_arg_cost;
-	*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost + qual_op_cost);
+	/* Add a cpu cost per search entry, corresponding to the actual visited entries. */
+	*indexTotalCost += (counts.searchEntries * counts.arrayScans) * (qual_op_cost);
+	/* Now add a cpu cost per tuple in the posting lists / trees */
+	*indexTotalCost += (numTuples * *indexSelectivity) * (cpu_index_tuple_cost);
 	*indexPages = dataPagesFetched;
 }
 
-- 
2.38.1

#16Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ronan Dunklau (#15)
Re: Fix gin index cost estimation

On Tue, Dec 6, 2022 at 1:22 PM Ronan Dunklau <ronan.dunklau@aiven.io> wrote:

Le vendredi 2 décembre 2022, 13:58:27 CET Ronan Dunklau a écrit :

Le vendredi 2 décembre 2022, 12:33:33 CET Alexander Korotkov a écrit :

Hi, Ronan!
Thank you for your patch. Couple of quick questions.
1) What magic number 50.0 stands for? I think we at least should make
it a macro.

This is what is used in other tree-descending estimation functions, so I
used that too. Maybe a DEFAULT_PAGE_CPU_COST macro would work for both ? If
so I'll separate this into two patches, one introducing the macro for the
other estimation functions, and this patch for gin.

The 0001 patch does this.

2) "We only charge one data page for the startup cost" – should this
be dependent on number of search entries?

In fact there was another problem. The current code estimates two different
paths for fetching data pages: in the case of a partial match, it takes into
account that all the data pages will have to be fetched. So this is now
taken into account for the CPU cost as well.

For the regular search, we scale the number of data pages by the number of
search entries.

Now the patch looks good for me. I made some tests.

# create extension pg_trgm;
# create extension btree_gin;
# create table test1 as (select random() as val from generate_series(1,1000000) i);
# create index test1_gin_idx on test1 using gin (val);

# explain select * from test1 where val between 0.1 and 0.2;
                                          QUERY PLAN
---------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test1  (cost=1186.21..7089.57 rows=98557 width=8)
   Recheck Cond: ((val >= '0.1'::double precision) AND (val <= '0.2'::double precision))
   ->  Bitmap Index Scan on test1_gin_idx  (cost=0.00..1161.57 rows=98557 width=0)
         Index Cond: ((val >= '0.1'::double precision) AND (val <= '0.2'::double precision))
(4 rows)

# create index test1_btree_idx on test1 using btree (val);
# explain select * from test1 where val between 0.1 and 0.2;
                                        QUERY PLAN
-----------------------------------------------------------------------------------------
 Index Only Scan using test1_btree_idx on test1  (cost=0.42..3055.57 rows=98557 width=8)
   Index Cond: ((val >= '0.1'::double precision) AND (val <= '0.2'::double precision))
(2 rows)

Looks reasonable. In this case GIN is much more expensive, because it
can't handle a range query properly and overlaps two partial matches.

# create table test2 as (select 'samplestring' || i as val from generate_series(1,1000000) i);
# create index test2_gin_idx on test2 using gin (val);
# explain select * from test2 where val = 'samplestring500000';
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Bitmap Heap Scan on test2  (cost=20.01..24.02 rows=1 width=18)
   Recheck Cond: (val = 'samplestring500000'::text)
   ->  Bitmap Index Scan on test2_gin_idx  (cost=0.00..20.01 rows=1 width=0)
         Index Cond: (val = 'samplestring500000'::text)
(4 rows)

# create index test2_btree_idx on test2 using btree (val);
# explain select * from test2 where val = 'samplestring500000';
                                   QUERY PLAN
-----------------------------------------------------------------------------------
 Index Only Scan using test2_btree_idx on test2  (cost=0.42..4.44 rows=1 width=18)
   Index Cond: (val = 'samplestring500000'::text)
(2 rows)

This also looks reasonable. GIN is not terribly bad for this case,
but B-tree is much cheaper.

I'm going to push this and backpatch to all supported versions if no objections.

------
Regards,
Alexander Korotkov

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#16)
Re: Fix gin index cost estimation

Alexander Korotkov <aekorotkov@gmail.com> writes:

I'm going to push this and backpatch to all supported versions if no objections.

Push yes, but I'd counsel against back-patching. People don't
generally like unexpected plan changes in stable versions, and
that's what a costing change could produce. There's no argument
that we are fixing a failure or wrong answer here, so it doesn't
seem like back-patch material.

regards, tom lane

#18Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#17)
Re: Fix gin index cost estimation

On Sun, Jan 8, 2023 at 7:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alexander Korotkov <aekorotkov@gmail.com> writes:

I'm going to push this and backpatch to all supported versions if no objections.

Push yes, but I'd counsel against back-patching. People don't
generally like unexpected plan changes in stable versions, and
that's what a costing change could produce. There's no argument
that we are fixing a failure or wrong answer here, so it doesn't
seem like back-patch material.

Pushed to master, thank you. I've noted the reason for not
back-patching in the commit message.

------
Regards,
Alexander Korotkov