Reduce planning time for large NOT IN lists containing NULL
Hi hackers,
In this thread [0] an interesting idea came up about avoiding
unnecessary work during selectivity estimation for x <> ALL (NULL, ...)
or x NOT IN (NULL, ...).
Semantically, if the array contains at least one NULL, the selectivity
of x NOT IN (...) is always 0.0, regardless of the other elements in the
list.
Currently, the planner still iterates over all array elements and
invokes the operator's selectivity estimator for each of them. For large
IN / ALL lists, this increases planning time.
For constant arrays I propose adding a simple pre-check before entering
the per-element loop: detect whether the array contains at least one
NULL element (e.g., via memchr() for the deconstructed array case). If
so, and we are in the ALL / NOT IN case, we can immediately return
selectivity = 0.0 and skip all further computation. This would avoid
extra per-element estimation work while preserving semantics.
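To make the idea concrete, here is a self-contained toy sketch (plain C, not actual PostgreSQL code; ToyArray and the function name are invented for illustration) of the proposed early exit over a deconstructed constant array:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a deconstructed constant array: values plus a per-element
 * null flag, roughly what deconstruct_array() would produce. */
typedef struct
{
    int   nelems;
    int  *values;
    bool *nulls;    /* nulls[i] == true means element i is NULL */
} ToyArray;

/* Proposed pre-check: for x <> ALL (...) / x NOT IN (...), one NULL element
 * forces selectivity 0.0 (assuming a strict operator), so we can skip the
 * per-element estimation loop entirely. memchr() scans the null-flag array
 * in a single call instead of looping manually. */
static double
estimate_not_in_selectivity(const ToyArray *arr)
{
    if (arr->nulls != NULL &&
        memchr(arr->nulls, true, arr->nelems) != NULL)
        return 0.0;             /* early exit: a NULL element is present */

    /* ... otherwise fall through to the normal per-element loop ... */
    return -1.0;                /* placeholder meaning "no shortcut taken" */
}
```

This is only a sketch of the control flow; the real patch operates on ArrayType inside the planner's selectivity code.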
In cases where array elements are not known to be constants in advance,
such a pre-check is less straightforward, because each element must
first be inspected to determine whether it is a Const and whether it is
NULL. That already requires iterating through the list, so introducing a
separate early pass would not actually reduce the amount of work.
Therefore, it likely makes sense to keep the current behavior in that
situation.
Thoughts?
[0]: /messages/by-id/CAHza6qdAH99C0xa27YDTixiNVFa99j90QYquUPAxL0JwahmggA@mail.gmail.com
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v1-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+3 -1)
Hi Ilia!
On 18.02.2026 15:11, Ilia Evdokimov wrote:
Hi hackers,
In this thread [0] an interesting idea came up about avoiding
unnecessary work during selectivity estimation for x <> ALL (NULL, ...)
or x NOT IN (NULL, ...)
Semantically, if the array contains at least one NULL, the selectivity
of x NOT IN (...) is always 0.0, regardless of the other elements in the
list.
Currently, the planner still iterates over all array elements and
invokes the operator's selectivity estimator for each of them. For large
IN / ALL lists, this increases planning time.
+1 on the general idea.
For constant arrays I propose adding a simple pre-check before entering
the per-element loop: detect whether the array contains at least one
NULL element (e.g., via memchr() for the deconstructed array case). If
so, and we are in the ALL / NOT IN case, we can immediately return
selectivity = 0.0 and skip all further computation. This would avoid
extra per-element estimation work while preserving semantics.
How much overhead does the memchr() call add? It seems like this
approach optimizes the edge case at the expense of the common case,
which doesn't seem like a good trade-off.
How about instead adding a flag to ArrayType which indicates if the
array contains NULL or not. This flag could be set in
construct_md_array() which already iterates over all elements. The flag
would need to be kept up-to-date, e.g. in array_set_element() and
possibly other functions modifying the array.
In cases where array elements are not known to be constants in advance,
such a pre-check is less straightforward, because each element must
first be inspected to determine whether it is a Const and whether it is
NULL. That already requires iterating through the list, so introducing a
separate early pass would not actually reduce the amount of work.
Therefore, it likely makes sense to keep the current behavior in that
situation.
Agreed.
--
David Geier
Hi David,
Thanks for the review.
On 2/19/26 18:38, David Geier wrote:
+1 on the general idea.
For constant arrays I propose adding a simple pre-check before entering
the per-element loop: detect whether the array contains at least one
NULL element (e.g., via memchr() for the deconstructed array case). If
so, and we are in the ALL / NOT IN case, we can immediately return
selectivity = 0.0 and skip all further computation. This would avoid
extra per-element estimation work while preserving semantics.
How much overhead does the memchr() call add? It seems like this
approach optimizes the edge case at the expense of the common case,
which doesn't seem like a good trade-off.
How about instead adding a flag to ArrayType which indicates if the
array contains NULL or not. This flag could be set in
construct_md_array() which already iterates over all elements. The flag
would need to be kept up-to-date, e.g. in array_set_element() and
possibly other functions modifying the array.
It seems we might be reinventing the wheel.
There is already ARR_HASNULL() which allows us to detect the presence of
NULL in ArrayType.
In cases where array elements are not known to be constants in advance,
such a pre-check is less straightforward, because each element must
first be inspected to determine whether it is a Const and whether it is
NULL. That already requires iterating through the list, so introducing a
separate early pass would not actually reduce the amount of work.
Therefore, it likely makes sense to keep the current behavior in that
situation.
Agreed.
After thinking about this more, it seems reasonable to short-circuit the
loop when we detect a NULL element by checking whether the element is a
Const and NULL.
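A toy sketch of that short-circuit (illustrative C only, not the actual planner code; in PostgreSQL the per-element test would be IsA(elem, Const) combined with the Const's constisnull field):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for planner list elements: each is either a Const
 * (possibly NULL) or some other, non-constant expression. */
typedef struct
{
    bool is_const;
    bool const_is_null;
} ToyExpr;

/* Short-circuit the per-element estimation loop: for <> ALL, the first
 * Const NULL settles the estimate at 0.0 immediately. Other elements are
 * estimated normally (modeled here by counting the work performed). */
static double
estimate_not_in_list(const ToyExpr *elems, int n, int *elems_estimated)
{
    *elems_estimated = 0;
    for (int i = 0; i < n; i++)
    {
        if (elems[i].is_const && elems[i].const_is_null)
            return 0.0;          /* stop: the result can only be NULL/false */
        (*elems_estimated)++;    /* stand-in for per-element estimation work */
    }
    return -1.0;                 /* placeholder meaning "no shortcut taken" */
}
```

The point is that no separate early pass is needed: the check rides along inside the loop that already exists.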
I've attached v2 patch.
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v2-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+7 -1)
On 2/20/26 12:06, Ilia Evdokimov wrote:
There is already ARR_HASNULL() which allows us to detect the presence
of NULL in ArrayType.
I've moved the NULL check higher, placing it immediately after
DatumGetArrayTypeP(). This allows us to detect the presence of NULL
elements before deconstructing the array.
I tested this with several queries of the form:
WHERE x NOT IN (NULL, ...)
In my tests (with column x having detailed statistics, so selectivity
estimation performs more work), planning time decreases from *5-200 ms
before the patch* to *~ 1-2 ms after the patch*, depending on the list size.
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
On 2/20/26 16:25, Ilia Evdokimov wrote:
I've moved the NULL check higher, placing it immediately after
DatumGetArrayTypeP(). This allows us to detect the presence of NULL
elements before deconstructing the array.
Sorry, I forgot to attach the patch in my previous mail. Attaching it now.
--
Best regards.
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v3-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+9 -1)
On 20.02.2026 14:28, Ilia Evdokimov wrote:
On 2/20/26 16:25, Ilia Evdokimov wrote:
I've moved the NULL check higher, placing it immediately after
DatumGetArrayTypeP(). This allows us to detect the presence of NULL
elements before deconstructing the array.
Sorry, I forgot to attach the patch in my previous mail. Attaching it now.
Cool that the array code had that functionality already.
The patch looks good to me and now no longer regresses other cases. The
speedup will be less pronounced once the hash-based [NOT] IN code is
merged [1] but will still save considerable amounts of cycles.
It seems like we don't have a regression test which has a NULL value in
the NOT IN list. Maybe you can find a good place to add that one?
--
David Geier
[1]: /messages/by-id/7db341e0-fbc6-4ec5-922c-11fdafe7be12@tantorlabs.com
On 2/20/26 17:22, David Geier wrote:
It seems like we don't have a regression test which has a NULL value in
the NOT IN list. Maybe you can find a good place to add that one?
There are already regression tests covering NOT IN with NULL values in
expressions.sql. Do you think we need an additional simple case, or is
the current coverage sufficient?
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Hello
I think it would be a good idea to add a test, since there appears to be
a regression with this patch:
CREATE TABLE notin_test AS SELECT generate_series(1, 1000) AS x;
ANALYZE notin_test;
CREATE FUNCTION replace_elem(arr int[], idx int, val int)
RETURNS int[] AS $$
BEGIN
arr[idx] := val;
RETURN arr;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
EXPLAIN SELECT * FROM notin_test WHERE x <> ALL(ARRAY[1,99,3]);
-- same array, constructed from an array with a NULL
EXPLAIN SELECT * FROM notin_test WHERE x <>
ALL(replace_elem(ARRAY[1,NULL,3], 2, 99));
DROP TABLE notin_test;
DROP FUNCTION replace_elem;
ARR_HASNULL probably should be array_contains_nulls, as ARR_HASNULL
simply checks for the existence of a NULL bitmap.
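That distinction can be sketched with a self-contained toy model (illustrative C only; note that PostgreSQL's real bitmap convention is inverted, with a set bit meaning not-null, but the point is the same):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy array header: the null bitmap is allocated lazily the first time a
 * NULL is stored, and is kept around even after that NULL is overwritten.
 * Convention here: a set bit means "element is NULL". */
typedef struct
{
    int      nelems;
    uint8_t *nullbitmap;    /* NULL pointer => no NULL was ever stored */
} ToyArrayHdr;

/* Analogue of ARR_HASNULL(): only checks that a bitmap exists at all. */
static bool
toy_has_null_bitmap(const ToyArrayHdr *a)
{
    return a->nullbitmap != NULL;
}

/* Analogue of array_contains_nulls(): bails immediately when there is no
 * bitmap, otherwise scans it for an actually-set bit. */
static bool
toy_contains_nulls(const ToyArrayHdr *a)
{
    if (a->nullbitmap == NULL)
        return false;
    for (int i = 0; i < a->nelems; i++)
        if (a->nullbitmap[i / 8] & (1 << (i % 8)))
            return true;
    return false;
}
```

An array that once held a NULL but had it replaced keeps its (now all-clear) bitmap, so the bitmap-existence check alone misclassifies it.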
Hi,
On 2/23/26 22:44, Zsolt Parragi wrote:
Hello
I think it would be a good idea to add a test, I think there's a
regression with this patch:
CREATE TABLE notin_test AS SELECT generate_series(1, 1000) AS x;
ANALYZE notin_test;
CREATE FUNCTION replace_elem(arr int[], idx int, val int)
RETURNS int[] AS $$
BEGIN
arr[idx] := val;
RETURN arr;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
EXPLAIN SELECT * FROM notin_test WHERE x <> ALL(ARRAY[1,99,3]);
-- same array, constructed from an array with a NULL
EXPLAIN SELECT * FROM notin_test WHERE x <>
ALL(replace_elem(ARRAY[1,NULL,3], 2, 99));
DROP TABLE notin_test;
DROP FUNCTION replace_elem;
ARR_HASNULL probably should be array_contains_nulls, as ARR_HASNULL
simply checks for the existence of a NULL bitmap.
Could you clarify what exactly this additional test is meant to verify?
The current patch only introduces an early exit from the expensive
per-element selectivity loop in the <> ALL case when a NULL is detected.
If the goal is to verify the correctness of IN / NOT IN semantics, those
cases are already covered in expressions.sql, including scenarios with NULL
elements.
I attached this thread to commitfest:
https://commitfest.postgresql.org/patch/6519/
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
I think it would be a good idea to add a test, I think there's a
regression with this patch:
CREATE TABLE notin_test AS SELECT generate_series(1, 1000) AS x;
ANALYZE notin_test;
CREATE FUNCTION replace_elem(arr int[], idx int, val int)
RETURNS int[] AS $$
BEGIN
arr[idx] := val;
RETURN arr;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
EXPLAIN SELECT * FROM notin_test WHERE x <> ALL(ARRAY[1,99,3]);
-- same array, constructed from an array with a NULL
EXPLAIN SELECT * FROM notin_test WHERE x <>
ALL(replace_elem(ARRAY[1,NULL,3], 2, 99));
DROP TABLE notin_test;
DROP FUNCTION replace_elem;
Good catch. The macro name is misleading here. It should have been
called ARR_HASNULLBITMAP().
+1 on adding an explicit test that says why we care about that case.
ARR_HASNULL probably should be array_contains_nulls, as ARR_HASNULL
simply checks for the existence of a NULL bitmap.
Using array_contains_nulls() seems fine. In case the IN list doesn't
contain NULL, the function can immediately bail thanks to the
!ARR_HASNULL() check in the beginning.
It only needs to iterate over the NULL-bitmap, if it exists. This is the
case if there's actually a NULL element in the array, or if the array
initially contained NULL and all NULLs got removed subsequently.
If we ever find the latter case to matter we could remove the
NULL-bitmap in array_set_element() / array_set_element_expanded(), when
the last NULL element got removed.
Could you clarify what exactly this additional test is meant to verify?
Zsolt's test case creates an array that initially contains NULL. The
NULL element is subsequently replaced by a non-NULL value but
array_set_element_expanded() keeps the NULL-bitmap around. With that,
your ARR_HASNULL() check bails and causes the selectivity estimation to
incorrectly return 0.
I attached this thread to commitfest: https://commitfest.postgresql.org/
patch/6519/
I'll assign myself as reviewer.
--
David Geier
On 2/24/26 11:29, David Geier wrote:
Using array_contains_nulls() seems fine. In case the IN list doesn't
contain NULL, the function can immediately bail thanks to the
!ARR_HASNULL() check in the beginning.
It only needs to iterate over the NULL-bitmap, if it exists. This is the
case if there's actually a NULL element in the array, or if the array
initially contained NULL and all NULLs got removed subsequently.
If we ever find the latter case to matter we could remove the
NULL-bitmap in array_set_element() / array_set_element_expanded(), when
the last NULL element got removed.
Could you clarify what exactly this additional test is meant to verify?
Zsolt's test case creates an array that initially contains NULL. The
NULL element is subsequently replaced by a non-NULL value but
array_set_element_expanded() keeps the NULL-bitmap around. With that,
your ARR_HASNULL() check bails and causes the selectivity estimation to
incorrectly return 0.
Ah, right - thanks for the clarification. I agree.
Regarding the regression test: the suggested test case is good, but
there is not a straightforward way to expose the estimated row count
without also showing the costs, and costs are unstable. To avoid that, I
reused the parsing approach already present in stats_ext.sql to extract
only the estimated row count from EXPLAIN.
Since the test table contains exactly 1000 rows and we run ANALYZE, all
rows are included in the statistics sample. Therefore the estimate for x
<> ALL(array[1, 99, 2]) is deterministically 997 rows, and the test
is stable and ensures we detect the incorrect early-zero estimate.
Let me know if you'd prefer a different approach. I've attached v4 patch.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v4-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+94 -1)
expressions.sql is missing a newline at the end of the file,
otherwise it looks good to me.
On 2/25/26 16:43, Zsolt Parragi wrote:
expressions.sql is missing a newline at the end of the file,
otherwise it looks good to me.
I've fixed this in the v5 patch.
Since the C changes are minimal, the patch is relatively small, and no
further issues were raised, I am moving this patch to "Ready for Committer".
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v5-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+94 -1)
On Tue, 3 Mar 2026 at 04:04, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:
I've fixed this in v5-patch.
I had a look at this and wondered: if we guarantee that no rows will
match, then why can't we perform constant folding on the
ScalarArrayOpExpr when !useOr and the array contains a NULL element
and the operator is strict. Seemingly, one of the reasons for that is
down to the expression returning NULL vs false. Take the following two
tests from expressions.out:
select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 2, null);
?column?
----------

(1 row)
select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null);
?column?
----------
f
(1 row)
Here we see that we return false when we find the left operand in the
array, but NULL when we don't find it and the array contains NULL. So,
unless the left operand is a const, we wouldn't know how to simplify
the ScalarArrayOpExpr during planning as the false or NULL would only
be known during evaluation of the expression.
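The three-valued behavior can be sketched in a standalone toy evaluator (illustrative C only, not executor code), reproducing the two results quoted above:

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;

/* Toy evaluation of "x NOT IN (list)" with a strict <> operator:
 * a match anywhere yields FALSE; otherwise any NULL element makes the
 * overall result NULL (unknown); only a clean miss yields TRUE. */
static TriBool
eval_not_in(int x, const int *vals, const bool *nulls, int n)
{
    bool saw_null = false;

    for (int i = 0; i < n; i++)
    {
        if (nulls[i])
        {
            saw_null = true;    /* strict: x <> NULL is NULL, keep looking */
            continue;
        }
        if (vals[i] == x)
            return TV_FALSE;    /* definite match => NOT IN is false */
    }
    return saw_null ? TV_NULL : TV_TRUE;
}
```

Because both FALSE and NULL reject the row in a WHERE clause, the qual context can treat them as the same, which is what makes the EXPRKIND_QUAL idea viable.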
However, when the expression being simplified is an EXPRKIND_QUAL, it
shouldn't matter if the result is false or NULL as both mean the same
and there shouldn't be any code that cares about the difference.
Currently, we don't pass the "kind" down into
eval_const_expressions(), but I don't really see why we couldn't. It
would be a fair bit of work figuring out with confidence what the
extra arg should be passed as in all the existing call sites of that
function. We'd have to document in the header comment for
eval_const_expressions() that constant-folding on EXPRKIND_QUAL
expressions can enable additional optimisations which disregard the
difference between NULL and false.
For the patch, I imagine it's still a useful optimisation as the
ScalarArrayOpExpr might not be in an EXPRKIND_QUAL.
There are a couple of things I don't like:
1) The new test is in expressions.sql. The comment at the top of that
file reads: "expression evaluation tests that don't fit into a more
specific file". The new test isn't anything to do with expression
evaluation. It's about planner estimation. I see that
misc_function.sql has the explain_mask_costs() function. I'm not sure
that's the right place either, as the usages of that function are for
testing SupportRequestRows prosupport functions. I wonder if we need a
dedicated row_estimate.sql or selectivity_est.sql file. The
explain_mask_costs() wouldn't be out of place if they were moved into
a new test like that. It was me that started putting those in
misc_function.sql, and I don't object to them being moved to a new
test. That'd be a separate commit, however.
2) The new test creates a new table and inserts 1000 rows. There does
not seem to be anything special about the new table. Why don't you use
one of the ones from test_setup.sql?
3) Looking at var_eq_const(), it seems like it's coded to assume the
operator is always strict, per "If the constant is NULL, assume
operator is strict and return zero". If that's good enough for
var_eq_const(), then it should be good enough for the new code. I
think it would be good if you wrote that or something similar in the
new code so that the reader knows taking the short-circuit with
non-strict functions is on purpose.
David
On 3/3/26 04:08, David Rowley wrote:
I had a look at this and wondered: if we guarantee that no rows will
match, then why can't we perform constant folding on the
ScalarArrayOpExpr when !useOr and the array contains a NULL element
and the operator is strict. Seemingly, one of the reasons for that is
down to the expression returning NULL vs false. Take the following two
tests from expressions.out:
select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 2, null);
?column?
----------

(1 row)
select return_int_input(1) not in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null);
?column?
----------
f
(1 row)
Here we see that we return false when we find the left operand in the
array, but NULL when we don't find it and the array contains NULL. So,
unless the left operand is a const, we wouldn't know how to simplify
the ScalarArrayOpExpr during planning as the false or NULL would only
be known during evaluation of the expression.
However, when the expression being simplified is an EXPRKIND_QUAL, it
shouldn't matter if the result is false or NULL as both mean the same
and there shouldn't be any code that cares about the difference.
Currently, we don't pass the "kind" down into
eval_const_expressions(), but I don't really see why we couldn't. It
would be a fair bit of work figuring out with confidence what the
extra arg should be passed as in all the existing call sites of that
function. We'd have to document in the header comment for
eval_const_expressions() that constant-folding on EXPRKIND_QUAL
expressions can enable additional optimisations which disregard the
difference between NULL and false.
For the patch, I imagine it's still a useful optimisation as the
ScalarArrayOpExpr might not be in an EXPRKIND_QUAL.
I agree that this could be a useful optimization, at least
for EXPRKIND_QUAL, where NULL and false are semantically equivalent.
However, passing EXPRKIND through eval_const_expressions() would require
careful auditing of all call sites. If I explore this further and have
something concrete, I'll start a separate thread on that topic.
There are a couple of things I don't like:
1) The new test is in expressions.sql. The comment at the top of that
file reads: "expression evaluation tests that don't fit into a more
specific file". The new test isn't anything to do with expression
evaluation. It's about planner estimation. I see that
misc_function.sql has the explain_mask_costs() function. I'm not sure
that's the right place either, as the usages of that function are for
testing SupportRequestRows prosupport functions. I wonder if we need a
dedicated row_estimate.sql or selectivity_est.sql file. The
explain_mask_costs() wouldn't be out of place if they were moved into
a new test like that. It was me that started putting those in
misc_function.sql, and I don't object to them being moved to a new
test. I'd be as a separate commit, however.
I've moved explain_mask_costs() and all related tests into a new
regression test file selectivity_est.sql. This is done as a separate patch,
v6-0001 containing only the test refactoring, with no behavioral changes.
2) The new test creates a new table and inserts 1000 rows. There does
not seem to be anything special about the new table. Why don't you use
one of the ones from test_setup.sql?
I've switched the test to use tenk1 from test_setup.sql instead of
creating a new table.
3) Looking at var_eq_const(), it seems like it's coded to assume the
operator is always strict, per "If the constant is NULL, assume
operator is strict and return zero". If that's good enough for
var_eq_const(), then it should be good enough for the new code. I
think it would be good if you wrote that or something similar in the
new code so that the reader knows taking the short-circuit with
non-strict functions is on purpose.
The updated comments are included in v6-0002, and the test is now in
selectivity_est.sql.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v6-0002-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+69 -1)
v6-0001-Move-planner-row-estimation-tests-to-selectivity..patch (+305 -241)
Rebased the v7 patches to fix a git apply failure.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
Attachments:
v7-0002-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+69 -1)
v7-0001-Move-planner-row-estimation-tests-to-selectivity..patch (+305 -241)
On Tue, 17 Mar 2026 at 23:51, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:
Rebased the v7-patches to fix git apply failure.
Thanks. I've pushed 0001.
I ended up renaming the new file to planner_est.sql as the function
handles width estimate masking too, so I thought just calling it
selectivity_est was a bit too restrictive. I went with planner_est.
That means 0002 needed rebased. I've done that in the attached. Will
look more closely at that one later.
David
Attachments:
v8-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch (+69 -1)
On 3/18/26 07:32, David Rowley wrote:
Thanks. I've pushed 0001.
I ended up renaming the new file to planner_est.sql as the function
handles width estimate masking too, so I thought just calling it
selectivity_est was a bit too restrictive. I went with planner_est.
+1. Thank you.
That means 0002 needed rebased. I've done that in the attached.
After the new test was committed, I realized that the v8 test relies on
selectivity calculations, which are not guaranteed to remain stable over
time and may vary depending on planner heuristics or platform
differences. Therefore, it seems better to remove those tests from v8.
Instead, we can test the invariant behavior: when NULL is present in a
<> ALL clause, the selectivity is always 0.0.
The v9-patch adds three test cases: a degenerate case with only NULL,
NULL combined with constants, NULL combined with both constants and
a non-constant expression.
Thoughts?
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/