Negative cache entries for memoize

Started by Bruce Momjianabout 3 years ago3 messageshackers
Jump to latest
#1Bruce Momjian
bruce@momjian.us

I wrote an optimizer talk that explains memoize, slides 24-25:

https://momjian.us/main/writings/pgsql/beyond.pdf#page=25

During two presentations, I was asked if negative cache entries were
created for cases where inner-side lookups returned no rows.

It seems we don't do that. Has this been considered or is it planned?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#2David Rowley
dgrowleyml@gmail.com
In reply to: Bruce Momjian (#1)
Re: Negative cache entries for memoize

On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce@momjian.us> wrote:

During two presentations, I was asked if negative cache entries were
created for cases where inner-side lookups returned no rows.

It seems we don't do that. Has this been considered or is it planned?

It does allow negative cache entries, so I'm curious about what you
did to test this.

A cache entry is always marked as complete (i.e valid to use for
lookups) when we execute the subnode to completion. In some plan
shapes we might not execute the inner side until it returns NULL, for
example in Nested Loop Semi Joins we skip to the next outer row when
matching the first inner row. This could leave an incomplete cache
entry which Memoize can't be certain if it contains all rows from the
subnode or not.

For the negative entry case, which really there is no special code
for, there are simply just no matching rows so the cache entry will be
marked as complete always as the inner node will return NULL on the
first call. So negative entries will even work in the semi-join case.

Here's a demo of the negative entries working with normal joins:

create table t0 (a int);
insert into t0 select 0 from generate_Series(1,1000000);
create table t1 (a int primary key);
insert into t1 select x from generate_series(1,1000000)x;
vacuum analyze t0,t1;
explain (analyze, costs off, timing off, summary off)
select * from t0 inner join t1 on t0.a=t1.a;
QUERY PLAN
--------------------------------------------------------------------------------
Nested Loop (actual rows=0 loops=1)
-> Seq Scan on t0 (actual rows=1000000 loops=1)
-> Memoize (actual rows=0 loops=1000000)
Cache Key: t0.a
Cache Mode: logical
Hits: 999999 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Only Scan using t1_pkey on t1 (actual rows=0 loops=1)
Index Cond: (a = t0.a)
Heap Fetches: 0

David

#3Bruce Momjian
bruce@momjian.us
In reply to: David Rowley (#2)
Re: Negative cache entries for memoize

On Thu, Apr 6, 2023 at 09:23:31AM +1200, David Rowley wrote:

On Thu, 6 Apr 2023 at 03:12, Bruce Momjian <bruce@momjian.us> wrote:

During two presentations, I was asked if negative cache entries were
created for cases where inner-side lookups returned no rows.

It seems we don't do that. Has this been considered or is it planned?

It does allow negative cache entries, so I'm curious about what you
did to test this.

My mistake. Someone asked in Los Angeles and Jan Wieck checked during
the talk and said he didn't see it, and when someone asked in Moscow, I
repeated that answer. My mistake. I have updated the slides with the
correct information.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.