Add estimated hit ratio to Memoize in EXPLAIN to explain cost adjustment
Hi,
I was debugging a planner problem on Postgres 14.4 the other day - and the
"bad" plan in question included Memoize - though I don't necessarily
think that Memoize is to blame (and this isn't any of the problems recently
fixed in Memoize costing).
However, what I noticed whilst trying different ways to fix the plan, is
that the Memoize output was a bit hard to reason about - especially since
the plan involving Memoize was expensive to run, and so I was mostly
running EXPLAIN without ANALYZE to look at the costing.
Here is an example of the output I was looking at:
-> Nested Loop (cost=1.00..971672.56 rows=119623 width=0)
-> Index Only Scan using table1_idx on table1
(cost=0.43..372676.50 rows=23553966 width=8)
-> Memoize (cost=0.57..0.61 rows=1 width=8)
Cache Key: table1.table2_id
Cache Mode: logical
-> Index Scan using table2_idx on table2
(cost=0.56..0.60 rows=1 width=8)
Index Cond: (id = table1.table2_id)
The other plan I was comparing with (that I wanted the planner to choose
instead), had a total cost of 1,451,807.35 -- and so I was trying to figure
out why the Nested Loop was costed as 971,672.56.
Simple math makes me expect the Nested Loop should roughly have a total
cost of 14,740,595.76 here (372,676.50 + 23,553,966 * 0.61), ignoring a lot
of the smaller costs. Thus, in this example, it appears Memoize made the
plan cost significantly cheaper (roughly 6% of the regular cost).
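For anyone re-deriving those numbers, the arithmetic can be sketched in a few lines (the inputs are the costs printed in the plan above; the "effective rescan cost" is a derived quantity of mine, not something EXPLAIN prints):

```python
# Numbers taken from the EXPLAIN output quoted above.
outer_total = 372676.50      # Index Only Scan total cost
outer_rows = 23553966        # Index Only Scan row estimate
memoize_total = 0.61         # Memoize per-rescan total cost as printed
loop_total = 971672.56       # Nested Loop total cost as printed

# What the join would roughly cost if every rescan were charged at face value:
naive = outer_total + outer_rows * memoize_total
print(f"naive total: {naive:,.2f}")               # ~14,740,595.76

# The effective per-rescan cost the planner must have actually charged,
# i.e. the implicit "cost reduction" done by Memoize:
effective = (loop_total - outer_total) / outer_rows
print(f"effective rescan cost: {effective:.4f}")  # ~0.025, a ~96% discount
```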
Essentially this comes down to the "cost reduction" performed by Memoize
only being implicitly visible in the Nested Loop's total cost - and with
nothing useful on the Memoize node itself - since the rescan costs are not
shown.
I think explicitly adding the estimated cache hit ratio for Memoize nodes
might make this easier to reason about, like this:
-> Memoize (cost=0.57..0.61 rows=1 width=8)
Cache Key: table1.table2_id
Cache Mode: logical
Cache Hit Ratio Estimated: 0.94
Alternatively (or in addition) we could consider showing the "ndistinct"
value that is calculated in cost_memoize_rescan - since that's the most
significant contributor to the cache hit ratio (and you can influence that
directly by improving the ndistinct statistics).
See attached a patch that implements showing the cache hit ratio as a
discussion starter.
I'll park this in the July commitfest for now.
Thanks,
Lukas
--
Lukas Fittl
Attachments:
v1-0001-Add-Estimated-Cache-Hit-Ratio-for-Memoize-plan-no.patch
On Sun, 5 Mar 2023 at 13:21, Lukas Fittl <lukas@fittl.com> wrote:
Alternatively (or in addition) we could consider showing the "ndistinct" value that is calculated in cost_memoize_rescan - since that's the most significant contributor to the cache hit ratio (and you can influence that directly by improving the ndistinct statistics).
I think the ndistinct estimate plus the est_entries together would be
useful. I think showing just the hit ratio number might often just
raise too many questions about how that's calculated. To calculate the
hit ratio we need to estimate the number of entries that can be kept
in the cache at once and also the number of input rows and the number
of distinct values. We can see the input rows by looking at the outer
side of the join in EXPLAIN, but we've no idea about the ndistinct or
how many items the planner thought could be kept in the cache at once.
The plan node already has est_entries, so it should just be a matter
of storing the ndistinct estimate in the Path and putting it into the
Plan node so the executor has access to it during EXPLAIN.
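For reference, the ratio arithmetic described here lives in cost_memoize_rescan() in costsize.c. Below is a simplified Python sketch of my reading of it - the real function also derives est_cache_entries from hash_mem and the estimated entry size and clamps its inputs, so treat this as an approximation rather than the authoritative code:

```python
def memoize_ratios(calls, ndistinct, est_cache_entries):
    """Simplified sketch of cost_memoize_rescan()'s ratio math."""
    # Lookups beyond the first per key can hit, scaled down when the cache
    # cannot hold all distinct keys at once.
    hit_ratio = ((calls - ndistinct) / calls) * \
                (est_cache_entries / max(ndistinct, est_cache_entries))
    # Fraction of distinct keys that cannot stay cached concurrently.
    evict_ratio = 1.0 - min(est_cache_entries, ndistinct) / ndistinct
    return hit_ratio, evict_ratio

# Hypothetical example: 10000 lookups over 500 distinct keys, cache holds 400
hit, evict = memoize_ratios(10000, 500, 400)
print(f"hit={hit:.2f} evict={evict:.2f}")   # hit=0.76 evict=0.20
```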
David
On 7 Mar 2023, at 10:51, David Rowley <dgrowleyml@gmail.com> wrote:
On Sun, 5 Mar 2023 at 13:21, Lukas Fittl <lukas@fittl.com> wrote:
Alternatively (or in addition) we could consider showing the "ndistinct" value that is calculated in cost_memoize_rescan - since that's the most significant contributor to the cache hit ratio (and you can influence that directly by improving the ndistinct statistics).
I think the ndistinct estimate plus the est_entries together would be
useful. I think showing just the hit ratio number might often just
raise too many questions about how that's calculated. To calculate the
hit ratio we need to estimate the number of entries that can be kept
in the cache at once and also the number of input rows and the number
of distinct values. We can see the input rows by looking at the outer
side of the join in EXPLAIN, but we've no idea about the ndistinct or
how many items the planner thought could be kept in the cache at once.
The plan node already has est_entries, so it should just be a matter
of storing the ndistinct estimate in the Path and putting it into the
Plan node so the executor has access to it during EXPLAIN.
Lukas: do you have an updated patch for this commitfest to address David's
comments?
--
Daniel Gustafsson
On Thu, Jul 6, 2023 at 12:56 AM Daniel Gustafsson <daniel@yesql.se> wrote:
Lukas: do you have an updated patch for this commitfest to address David's
comments?
I have a draft - I should be able to post an updated patch in the next
days. Thanks for checking!
Thanks,
Lukas
--
Lukas Fittl
On 06.07.2023 11:27, Lukas Fittl wrote:
On Thu, Jul 6, 2023 at 12:56 AM Daniel Gustafsson <daniel@yesql.se> wrote:
Lukas: do you have an updated patch for this commitfest to address
David's
comments?
I have a draft - I should be able to post an updated patch in the next
days. Thanks for checking!
Thanks,
Lukas
--
Lukas Fittl
Hi hackers,
While debugging a query execution plan involving Memoize, it'd be nice
to determine how many unique keys would fit into the cache. The
est_entries value provides some insight, but without knowing ndistinct,
it is unclear whether the cache is large enough to hold all unique keys
or if some will be evicted.
Given its potential usefulness, I would like to work on this. I've
attached a v2 patch with the changes.
Example from memoize.sql
EXPLAIN SELECT COUNT(*),AVG(t1.unique1) FROM tenk1 t1
INNER JOIN tenk1 t2 ON t1.unique1 = t2.thousand
WHERE t2.unique1 < 1200;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Aggregate (cost=815.12..815.13 rows=1 width=40)
-> Nested Loop (cost=0.30..809.12 rows=1200 width=4)
-> Seq Scan on tenk1 t2 (cost=0.00..470.00 rows=1200 width=4)
Filter: (unique1 < 1200)
-> Memoize (cost=0.30..0.41 rows=1 width=4)
Cache Key: t2.thousand
Cache Mode: logical
Cache Estimated Entries: 655
Cache Estimated NDistinct: 721
-> Index Only Scan using tenk1_unique1 on tenk1 t1
(cost=0.29..0.40 rows=1 width=4)
Index Cond: (unique1 = t2.thousand)
(11 rows)
Additionally, since this information would only be shown in EXPLAIN when
costs are enabled, it should not cause any performance regression in
normal execution. However, reviewers should be especially careful when
verifying test outputs, as this change could affect plan details in
regression tests.
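As an illustration of how a reader might use these two numbers (using a simplified reading of the ratio math in cost_memoize_rescan - an approximation, not the exact code), the plan above implies roughly a 36% hit ratio and a 9% evict ratio:

```python
def memoize_ratios(calls, ndistinct, est_cache_entries):
    # Simplified reading of cost_memoize_rescan()'s ratio arithmetic.
    hit_ratio = ((calls - ndistinct) / calls) * \
                (est_cache_entries / max(ndistinct, est_cache_entries))
    evict_ratio = 1.0 - min(est_cache_entries, ndistinct) / ndistinct
    return hit_ratio, evict_ratio

# From the plan: 1200 outer rows, 721 distinct keys, room for 655 entries
hit, evict = memoize_ratios(1200, 721, 655)
print(f"hit={hit:.2f} evict={evict:.2f}")   # hit~0.36 evict~0.09
```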
Any thoughts?
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Attachments:
v2-0001-Show-ndistinct-and-est_entries-in-EXPLAIN-for-Memoize.patch
On Thu, 20 Mar 2025 at 21:48, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:
-> Memoize (cost=0.30..0.41 rows=1 width=4)
Cache Key: t2.thousand
Cache Mode: logical
Cache Estimated Entries: 655
Cache Estimated NDistinct: 721
-> Index Only Scan using tenk1_unique1 on tenk1 t1 (cost=0.29..0.40 rows=1 width=4)
Index Cond: (unique1 = t2.thousand)
(11 rows)
Additionally, since this information would only be shown in EXPLAIN when costs are enabled, it should not cause any performance regression in normal execution. However, reviewers should be especially careful when verifying test outputs, as this change could affect plan details in regression tests.
Any thoughts?
+ ExplainIndentText(es);
+ appendStringInfo(es->str, "Cache Estimated Entries: %d\n", ((Memoize *) plan)->est_entries);
+ ExplainIndentText(es);
+ appendStringInfo(es->str, "Cache Estimated NDistinct: %0.0f\n", ((Memoize *) plan)->ndistinct);
est_entries is a uint32, so %u is the correct format character for that type.
I don't think you need to prefix all these properties with "Cache"
just because the other two properties have that prefix. I also don't
think the names you've chosen really reflect the meaning. How about
something like: "Estimated Distinct Lookup Keys: 721 Estimated
Capacity: 655", in that order. I think maybe having that as one line
for format=text is better than 2 lines. The EXPLAIN output is already
often taking up more lines in v18 than in v17, would be good to not
make that even worse unnecessarily.
I see the existing code there could use ExplainPropertyText rather
than have a special case for text and non-text formats. That's likely
my fault. If we're touching this code, then we should probably tidy
that up. Do you want to create a precursor fixup patch for that?
+ double ndistinct; /* Estimated number of distinct memoization keys,
+ * used for cache size evaluation. Kept for EXPLAIN */
Maybe this field in MemoizePath needs a better name. How about
"est_unique_keys"? and also do the rename in struct Memoize.
I'm also slightly concerned about making struct Memoize bigger. I had
issues with a performance regression [1] for 908a96861 when increasing
the WindowAgg struct size last year and the only way I found to make
it go away was to shuffle the fields around so that the struct size
didn't increase. I think we'll need to see a benchmark of a query that
hits Memoize quite hard with a small cache size to see if the
performance decreases as a result of adding the ndistinct field. It's
unfortunate that we'll not have the luxury of squeezing this double
into padding if we do see a slowdown.
David
[1]: /messages/by-id/flat/CAHoyFK9n-QCXKTUWT_xxtXninSMEv+gbJN66-y6prM3f4WkEHw@mail.gmail.com
On 20.03.2025 13:37, David Rowley wrote:
est_entries is a uint32, so %u is the correct format character for that type.
I don't think you need to prefix all these properties with "Cache"
just because the other two properties have that prefix. I also don't
think the names you've chosen really reflect the meaning. How about
something like: "Estimated Distinct Lookup Keys: 721 Estimated
Capacity: 655", in that order. I think maybe having that as one line
for format=text is better than 2 lines. The EXPLAIN output is already
often taking up more lines in v18 than in v17, would be good to not
make that even worse unnecessarily.
LGTM
I see the existing code there could use ExplainPropertyText rather
than have a special case for text and non-text formats. That's likely
my fault. If we're touching this code, then we should probably tidy
that up. Do you want to create a precursor fixup patch for that?
I fully agree with this suggestion. Then I'll begin with this on another
new thread.
+ double ndistinct; /* Estimated number of distinct memoization keys,
+ * used for cache size evaluation. Kept for EXPLAIN */
Maybe this field in MemoizePath needs a better name. How about
"est_unique_keys"? and also do the rename in struct Memoize.
LGTM
I'm also slightly concerned about making struct Memoize bigger. I had
issues with a performance regression [1] for 908a96861 when increasing
the WindowAgg struct size last year and the only way I found to make
it go away was to shuffle the fields around so that the struct size
didn't increase. I think we'll need to see a benchmark of a query that
hits Memoize quite hard with a small cache size to see if the
performance decreases as a result of adding the ndistinct field. It's
unfortunate that we'll not have the luxury of squeezing this double
into padding if we do see a slowdown.
I tried rearranging the fields in the structure with the new 'ndistinct'
field, but unfortunately sizeof(Memoize) did not remain the same. Therefore,
benchmarking Memoize is definitely necessary. Thanks for the information!
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
On 20/3/2025 11:37, David Rowley wrote:
I'm also slightly concerned about making struct Memoize bigger. I had
issues with a performance regression [1] for 908a96861 when increasing
the WindowAgg struct size last year and the only way I found to make
it go away was to shuffle the fields around so that the struct size
didn't increase. I think we'll need to see a benchmark of a query that
hits Memoize quite hard with a small cache size to see if the
performance decreases as a result of adding the ndistinct field. It's
unfortunate that we'll not have the luxury of squeezing this double
into padding if we do see a slowdown.
I quite frequently need the number of distinct values (or groups)
predicted during the Memoize node creation to understand why caching is
sometimes employed or not.
But I had thought about an alternative way: having an extensible EXPLAIN
(thanks to Robert), we may save optimisation-stage data (I have the same
necessity in the case of IncrementalSort, for example) and put it into
the Plan node on-demand. So, the way I want to go is a Plan::extlist
node and create_plan hook, which may allow copying best_path data to the
final plan. So, here, we will add a new parameter and avoid touching the
core code.
But I would give +1 to the current approach if it were done in a shorter time.
--
regards, Andrei Lepikhov
On 20.03.2025 15:32, Andrei Lepikhov wrote:
I quite frequently need the number of distinct values (or groups)
predicted during the Memoize node creation to understand why caching
is sometimes employed or not.
But I had thought about an alternative way: having an extensible
EXPLAIN (thanks to Robert), we may save optimisation-stage data (I
have the same necessity in the case of IncrementalSort, for example)
and put it into the Plan node on-demand. So, the way I want to go is a
Plan::extlist node and create_plan hook, which may allow copying
best_path data to the final plan. So, here, we will add a new
parameter and avoid touching the core code.
But I would give +1 to current approach if it were done in a shorter
time.
Then before proceeding further, I think we need to benchmark this change
to ensure there are no performance regressions. If performance issues
arise, then using extensible EXPLAIN might be the only viable approach.
I have addressed most of the review comments except for the
ExplainPropertyText change. I am attaching v3 of the patch with these
updates. If anyone notices any performance issues, please let me know.
I'll fix the ExplainPropertyText issue for this thread in a follow-up
patch if necessary.
So far, I have tested it on small machines. There are no performance
issues yet. I'll run benchmarks on larger ones soon.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Attachments:
v3-0001-Show-ndistinct-and-est_entries-in-EXPLAIN-for-Memoize.patch
On 20/3/2025 15:03, Ilia Evdokimov wrote:
On 20.03.2025 15:32, Andrei Lepikhov wrote:
I quite frequently need the number of distinct values (or groups)
predicted during the Memoize node creation to understand why caching
is sometimes employed or not.
But I had thought about an alternative way: having an extensible
EXPLAIN (thanks to Robert), we may save optimisation-stage data (I
have the same necessity in the case of IncrementalSort, for example)
and put it into the Plan node on-demand. So, the way I want to go is a
Plan::extlist node and create_plan hook, which may allow copying
best_path data to the final plan. So, here, we will add a new
parameter and avoid touching the core code.
But I would give +1 to current approach if it were done in a shorter
time.
Then before proceeding further, I think we need to benchmark this change
to ensure there are no performance regressions. If performance issues
arise, then using extensible EXPLAIN might be the only viable approach.
I have addressed most of the review comments except for the
ExplainPropertyText change. I am attaching v3 of the patch with these
updates. If anyone notices any performance issues, please let me know.
Issue with ExplainPropertyText for this thread I'll fix in the next
patches if it will be necessary.
So far, I have tested it on small machines. There are no performance
issues yet. I'll run benchmarks on larger ones soon.
I have some doubts here.
The number of distinct values says something only when it is taken
together with the number of calls.
Frequently, one of the caching keys is from outer table A (10 tuples),
and another is from outer table B (100 tuples). Calculating the number
of successful cache fetches predicted by the planner may not be evident
in the case of a composite cache key.
What I may propose here is:
1. Use fraction of calls, for example - 50% duplicated key values.
2. Show the calculated hit and eviction ratio.
The second option looks better to me. It is pretty understandable by a
user and may be compared to the numbers obtained during the execution.
--
regards, Andrei Lepikhov
On Fri, 21 Mar 2025 at 07:54, Andrei Lepikhov <lepihov@gmail.com> wrote:
I have some doubts here.
The number of distinct values says something only when it is taken
together with the number of calls.
Couldn't the reader just look at the Nested Loop's outer side row
estimate for that?
Frequently, one of the caching keys is from outer table A (10 tuples),
and another is from outer table B (100 tuples). Calculating the number
of successful cache fetches predicted by the planner may not be evident
in the case of a composite cache key.
What I may propose here is:
1. Use fraction of calls, for example - 50% duplicated key values.
2. Show the calculated hit and eviction ratio.
My concern with showing just the estimated hit and evict ratios is
that it might lead to more questions, like how those were calculated.
However, I can see the argument for having one or both of these in
addition to the expected unique keys and expected cache capacity.
I think the primary factors in how useful Memoize is are: 1) How many
items do we expect to be able to store in the cache concurrently, and;
2) How many unique lookup keys do we expect to be looked up, and; 3)
The total number of expected lookups. #1 is quite difficult to
figure out (maybe by looking at row width and row estimates) and
there's just no information about #2. #3 is already shown, in the
Nested Loop's outer side.
The reason the hit and evict ratios might also be useful is that you
need a bit of inside knowledge on how the 3 input values are used to
calculate them. For example you might come up with a higher
estimated hit ratio if you assumed the keys arrived in order vs
unordered. If they're unordered and you don't have room for all keys
in the cache at once then that increases the chances that Memoize had
to evict something that will be needed for a future lookup.
Also, I was just looking back at the concern I had with increasing the
size of struct Memoize. I suspect that might not be that much of a
concern. The WindowAgg problem I mentioned was with the executor
state, not the plan node. I see the only time we access the plan node
for Memoize during execution is when we call build_hash_table().
David
On 21/3/2025 03:50, David Rowley wrote:
On Fri, 21 Mar 2025 at 07:54, Andrei Lepikhov <lepihov@gmail.com> wrote:
I have some doubts here.
The number of distinct values says something only when it is taken
together with the number of calls.
Couldn't the reader just look at the Nested Loop's outer side row
estimate for that?
In my cases, key sources are usually spread across dozens of joins, and
it is visually hard to find (especially when we have an EXPLAIN
ANALYSE VERBOSE) the JOIN operator from which to extract the number of
calls. The
hit ratio, meanwhile, may be analysed locally in the Memoize node. For
example, 80% (0.8) is evidently a good one, 40% is questionable, and 5%
is too low and we should avoid Memoize here.
Could this be addressed by just printing the "calls" number in the Memoize output?
Frequently, one of the caching keys is from outer table A (10 tuples),
and another is from outer table B (100 tuples). Calculating the number
of successful cache fetches predicted by the planner may not be evident
in the case of a composite cache key.
What I may propose here is:
1. Use fraction of calls, for example - 50% duplicated key values.
2. Show the calculated hit and eviction ratio.
I think the primary factors in how useful Memoize is are: 1) How many
items do we expect to be able to store in the cache concurrently, and;
2) How many unique lookup keys do we expect to be looked up, and; 3)
The total number of expected lookups. #1 is quite difficult to
figure out (maybe by looking at row width and row estimates) and
there's just no information about #2. #3 is already shown, in the
Nested Loop's outer side.
It depends on the task. If you are asking how precise the group
estimation has been (to check statistics), I agree. In cases I have
seen before, the main question is how effective a Memoize node was (or
may be), i.e. how often the incoming key hits the cache. In that case,
the hit ratio fraction is more understandable for a broad audience.
That's why, in my experience, given a good cache reusability factor,
users are usually okay with increasing the cache size to the necessary
numbers and avoiding evictions at all costs. So, the predicted
evict_ratio also tells us whether incrementing work_mem would enhance
the chances of memoization.
Having written the last sentence, I came back to the question of why
work_mem is so universal that it is used at each node as the criterion
of memory allocation size. But that is a different story, I think.
--
regards, Andrei Lepikhov
After considering the opinions expressed in this discussion, I tend to
agree more with David. If we add info about estimated unique keys and
estimated capacity, then any additional information - such as
evict_ratio and hit_ratio - can also be calculated, as EXPLAIN ANALYZE
provides all the necessary details to compute these values.
For now, I’m attaching a rebased v4 patch, which includes the fix for
ExplainPropertyText.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Attachments:
v4-0001-Show-ndistinct-and-est_entries-in-EXPLAIN-for-Memoize.patch
On Fri, 21 Mar 2025 at 22:02, Andrei Lepikhov <lepihov@gmail.com> wrote:
In cases
I have seen before, the main question is how effective was (or maybe) a
Memoize node == how often the incoming key fits the cache. In that case,
the hit ratio fraction is more understandable for a broad audience.
That's why according to my experience in case of a good cache
reusability factor, users are usually okay with increasing the cache
size to the necessary numbers and avoiding evictions at all costs. So,
the predicted evict_ratio also tells us about incrementing work_mem to
enhance the chances of Memoisation.
Can you explain why "Estimated Capacity" and "Estimated Distinct
Lookup Keys" don't answer that? If there are more distinct lookup
keys than there is capacity to store them, then some will be evicted.
Once again, I'm not necessarily objecting to hit and evict ratios
being shown, I just want to know they're actually useful enough to
show and don't just bloat the EXPLAIN output needlessly. So far your
arguments aren't convincing me that they are.
Having written the last sentence I came back to the point why work_mem
is so universal and is used at each node as a criteria of memory
allocation size? But it is a different story, I think.
We have to set the limit somehow. We could have done this by having a
GUC per node type that uses memory, but it looks like something more
universal was decided, perhaps to save on GUCs. I don't know the exact
history, but once upon a time, sort_mem existed. Perhaps that
disappeared because we grew more node types that needed to allocate
large, otherwise unbounded amounts of memory. We did more recently
grow a hash_mem_multiplier GUC, so it's not true to say that work_mem
solely controls the limits of each node's memory allocation sizes.
David
On 3/23/25 22:16, David Rowley wrote:
On Fri, 21 Mar 2025 at 22:02, Andrei Lepikhov <lepihov@gmail.com> wrote:
Can you explain why "Estimated Capacity" and "Estimated Distinct
Lookup Keys" don't answer that? If there are more distinct lookup
keys than there is capacity to store them, then some will be evicted.
I wouldn't say these parameters don't answer that; I am debating how
usable they are. To be more practical, let me demonstrate the example:
EXPLAIN SELECT * FROM t1,t2 WHERE t1.x=t2.x;
Nested Loop (cost=0.44..7312.65 rows=211330 width=33)
-> Seq Scan on t1 (cost=0.00..492.00 rows=30000 width=22)
-> Memoize (cost=0.44..3.82 rows=7 width=11)
Cache Key: t1.x
Cache Mode: logical
Estimated Capacity: 1001 Estimated Distinct Lookup Keys: 1001
-> Index Scan using t2_x_idx2 on t2 (cost=0.43..3.81 rows=7)
Index Cond: (x = t1.x)
At first, I went looking for documentation because it was unclear what
both new parameters specifically meant. There was no documentation, but
the code is trivial, and after a short discovery I realised the meaning.
The first fact I see from this EXPLAIN is that Postgres estimates it has
enough memory to fit all the entries. Okay, but what does it give me? I
may just increase work_mem and provide the query with more memory if
needed. My main concern is how frequently this cache is planned to be
used. Doing some mental effort, I found the line "rows=30000."
Calculating a bit more, I may suppose it means that we have a 95% chance
to reuse the cache. Okay, I got it.
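The "95%" figure can be reproduced with one line (a back-of-the-envelope sketch; since the estimated capacity covers all 1001 keys, the capacity term of the planner's formula is 1.0 and only the repeat-lookup fraction matters):

```python
calls = 30000      # outer Seq Scan row estimate from the plan above
ndistinct = 1001   # Estimated Distinct Lookup Keys
hit_ratio = (calls - ndistinct) / calls
print(f"{hit_ratio:.3f}")   # ~0.967, i.e. roughly the 95% mentioned above
```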
Now, see:
1. I needed to discover the meaning of the new parameters because they
were different from the earlier "hit" and "miss."
2. I need to find a common JOIN for keys of this node. Imagine a typical
200-row EXPLAIN with 2-3 Memoization keys from different tables.
3. I need to make calculations
By contrast, a hit ratio printed instead is already known by analogy,
already provides me with the necessary cache efficacy data with no need
to look outside the node, and may be easily compared with the actual
value. Am I wrong?
Both approaches provide the data, but which one is more usable?
I think we may ask more people, for example, Nikolay Samokhvalov, who,
as I heard, works a lot with EXPLAIN plans.
Once again, I'm not necessarily objecting to hit and evict ratios
being shown, I just want to know they're actually useful enough to
show and don't just bloat the EXPLAIN output needlessly. So far your
arguments aren't convincing me that they are.
I'm -1 for this redundancy.
--
regards, Andrei Lepikhov
On Tue, 25 Mar 2025 at 10:23, Andrei Lepikhov <lepihov@gmail.com> wrote:
On 3/23/25 22:16, David Rowley wrote:
Once again, I'm not necessarily objecting to hit and evict ratios
being shown, I just want to know they're actually useful enough to
show and don't just bloat the EXPLAIN output needlessly. So far your
arguments aren't convincing me that they are.
I'm -1 for this redundancy.
I'm not following what the -1 is for. Is it for showing both hit and
evict ratios? And your vote is only for adding hit ratio?
Just to make it clear, the evict ratio isn't redundant because we show
hit ratio. If you have 1000 calls and 1000 distinct values and enough
memory to store those, then that's a 0% hit ratio since the first
lookup is a miss. If you change the calls to 2000 then that's a 50%
hit ratio and still 0% evict.
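Those numbers fall straight out of the simplified ratio math (my reading of cost_memoize_rescan; a sketch, not the exact code):

```python
def memoize_ratios(calls, ndistinct, est_cache_entries):
    # Simplified reading of cost_memoize_rescan()'s ratio arithmetic.
    hit_ratio = ((calls - ndistinct) / calls) * \
                (est_cache_entries / max(ndistinct, est_cache_entries))
    evict_ratio = 1.0 - min(est_cache_entries, ndistinct) / ndistinct
    return hit_ratio, evict_ratio

# 1000 calls, 1000 distinct keys, all fit: every key's first lookup misses
print(memoize_ratios(1000, 1000, 1000))   # (0.0, 0.0)
# Doubling the calls yields one hit per key: 50% hit, still 0% evict
print(memoize_ratios(2000, 1000, 1000))   # (0.5, 0.0)
```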
David
David Rowley <dgrowleyml@gmail.com> writes:
I'm not following what the -1 is for. Is it for showing both hit and
evict ratios? And your vote is only for adding hit ratio?
Just to make it clear, the evict ratio isn't redundant because we show
hit ratio. If you have 1000 calls and 1000 distinct values and enough
memory to store those, then that's a 0% hit ratio since the first
lookup is a miss. If you change the calls to 2000 then that's a 50%
hit ratio and still 0% evict.
FWIW, I share these doubts about whether these values are useful
enough to include in the default EXPLAIN output. My main beef
with them though is that they are basically numbers derived along
the way to producing a cost estimate, and I don't think we break
out such intermediate results for other node types.
It's looking like Robert's "pg_overexplain" will hit the tree soon,
so maybe there could be a case for teaching that to emit additional
costing details such as these? I'd kind of like to see a larger
scope than just Memoize for such an addition, though I'm not sure
exactly which other intermediate estimates are worth showing.
regards, tom lane
On Mon, Mar 24, 2025 at 3:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
FWIW, I share these doubts about whether these values are useful
enough to include in the default EXPLAIN output. My main beef
with them though is that they are basically numbers derived along
the way to producing a cost estimate, and I don't think we break
out such intermediate results for other node types.
The main argument I had initially when proposing this, is that Memoize is
different from other plan nodes, in that it makes the child node costs
"cheaper". Clearly seeing the expected cache hit/ratio (that drives that
costing modification) helps interpret why the planner came up with a given
plan.
Put differently, the reason this should be in the default EXPLAIN output
(or at least the VERBOSE output), is because Memoize's impact on costing is
counterintuitive (in my experience), and breaks the user's understanding of
planner costs one can usually derive by looking at the per-node cost
details, which typically flow up (i.e. get larger as you step higher up
in the tree).
It's looking like Robert's "pg_overexplain" will hit the tree soon,
so maybe there could be a case for teaching that to emit additional
costing details such as these?
I don't think that addresses the end-user usability sufficiently - from
what I gathered "pg_overexplain" was meant for a hacker audience, not
DBAs/etc interpreting Postgres plan choices.
Thanks,
Lukas
--
Lukas Fittl
Lukas Fittl <lukas@fittl.com> writes:
The main argument I had initially when proposing this, is that Memoize is
different from other plan nodes, in that it makes the child node costs
"cheaper". Clearly seeing the expected cache hit/ratio (that drives that
costing modification) helps interpret why the planner came up with a given
plan.
Memoize is hardly unique in that respect. Merge Join sometimes
expects that it won't have to read the inner input to completion,
and reduces its cost estimate accordingly, and that confuses people.
LIMIT also reduces the cost estimate of its input, though perhaps
that doesn't surprise people.
As I said, I'm not necessarily averse to showing these numbers
somehow. But I don't think they belong in the default output,
and I'm not even convinced that VERBOSE is the right place.
pg_overexplain seems like it could be an ideal home for this
sort of detail.
regards, tom lane
On Tue, 25 Mar 2025 at 11:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
FWIW, I share these doubts about whether these values are useful
enough to include in the default EXPLAIN output. My main beef
with them though is that they are basically numbers derived along
the way to producing a cost estimate, and I don't think we break
out such intermediate results for other node types.
The only likeness I can think of is the "(originally N)" for the
buckets and batches in EXPLAIN ANALYZE for Hash Joins. I see that it
only shows when one of those isn't the same as the planner's
expectations.
Maybe there's room to show additional details for Memoize in some
similar way only during EXPLAIN ANALYZE. EXPLAIN ANALYZE for Memoize
currently shows things like the number of hits, misses and evictions.
We could calculate what the planner expected those values to be and
show those. For this case, they're almost certain not to match what
the planner expected, but maybe that's ok.
David