[PATCH] Fix hashed ScalarArrayOp semantics for NULL LHS with non-strict comparators
Hi,
ExecEvalHashedScalarArrayOp() produces results that are not semantically
equivalent to ExecEvalScalarArrayOp() when the LHS is NULL and the
comparison function is non-strict.
With the attached setup, the following two queries disagree:
select (a in
(0::myint,1::myint,2::myint,3::myint,4::myint,
5::myint,6::myint,7::myint,8::myint,9::myint)) is null
from inttest where a is null;
This takes the hashed SAOP path and returns false.
select (a in
(0::myint,1::myint,2::myint,3::myint,4::myint,5::myint)) is null
from inttest where a is null;
This stays on the linear path and returns true.
This only occurs when:
* the planner selects the hashed SAOP path,
* the LHS evaluates to NULL at execution time, and
* the comparison function is non-strict.
The root cause is that ExecEvalHashedScalarArrayOp() only special-cases
NULL LHS for strict functions, and otherwise proceeds to probe the hash
table. This is incorrect for non-strict functions, and can also result
in probing with an undefined Datum.
The first attached patch fixes this by bypassing hash probing when the
LHS is NULL and the comparator is non-strict, falling back to a linear
evaluation consistent with ExecEvalScalarArrayOp(). For NOT IN, only
non-NULL results are inverted.
The second patch is a cleanup that factors out the common array scan and
boolean reduction logic shared by ExecEvalScalarArrayOp() and the new
fallback path. No functional change intended.
The patches include regression tests using a custom type with a
non-strict, hashable “=” operator. A standalone SQL reproducer is also
attached.
--
Best regards,
Chengpeng Yan
Attachments:
v1-0001-Fix-hashed-ScalarArrayOp-NULL-handling-for-non-st.patchapplication/octet-stream; name=v1-0001-Fix-hashed-ScalarArrayOp-NULL-handling-for-non-st.patchDownload+132-5
v1-0002-Factor-out-common-ScalarArrayOp-array-reduction-l.patchapplication/octet-stream; name=v1-0002-Factor-out-common-ScalarArrayOp-array-reduction-l.patchDownload+65-84
Hi Chengpeng,
Thanks for your report. After reading the code, I find that there is still
an issue even if the LHS is not null and the comparator is strict (make
myinteq always return null, see the attached sql for details).
The main reason is that we did not consider whether the comparator's
return value is null during looking up the hash. Not sure whether this
is a big problem.
--
Regards,
ChangAo Chen
Attachments:
On Fri, 17 Apr 2026 at 01:01, Chengpeng Yan <chengpeng_yan@outlook.com> wrote:
The first attached patch fixes this by bypassing hash probing when the
LHS is NULL and the comparator is non-strict, falling back to a linear
evaluation consistent with ExecEvalScalarArrayOp(). For NOT IN, only
non-NULL results are inverted.
Thanks for the bug report.
I don't think we need to fallback on a linear search. If the
non-strict function returns false for NULL = NULL, then as far as I
can see, we can still get the correct result by checking if the hash
table contains any other members. What I'm not certain of is if a
non-strict function must return NULL for NULL = non-NULL. If yes, then
we could just do it as the attached patch. I made this check the hash
table to see if it has non-NULL Datums hashed. This means something
like "WHERE NULL IN (NULL, 1)" for a non-strict function returning
false for NULL = NULL and NULL for NULL = 1 would evaluate the same as
"WHERE false OR NULL", which is NULL. Whereas, "WHERE NULL IN(NULL)"
would be "false".
If we need to assume the non-strict function could return false on
NULL = non-NULL, then we could test for that when inserting the first
datum into the hash table and store the behaviour in the expression.
It may also be worth doing that check for NULL = NULL so that we don't
need to call the equals function every time we see a NULL.
I'll need to dig a bit deeper to see if we've written down any rules
about non-strict equality functions anywhere...
David
Attachments:
v2-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchapplication/octet-stream; name=v2-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchDownload+236-78
On Mon, 20 Apr 2026 at 01:06, David Rowley <dgrowleyml@gmail.com> wrote:
If we need to assume the non-strict function could return false on
NULL = non-NULL, then we could test for that when inserting the first
datum into the hash table and store the behaviour in the expression.
It may also be worth doing that check for NULL = NULL so that we don't
need to call the equals function every time we see a NULL.I'll need to dig a bit deeper to see if we've written down any rules
about non-strict equality functions anywhere...
Of course, it is possible to make the strict function do that, and
non-hashed IN / NOT IN handles it, so the hashed version shouldn't
have an excuse to not do the right thing.
I've attached a version that "probes" the equality function for its
NULL = NULL behaviour and its NULL = non-NULL behaviour and returns
whatever the result of the probe was at the appropriate time.
What I came up with does feel quite elaborate, so I'd quite like a 2nd opinion.
The patch does assume that the non-strict function will return the
same thing for NULL = non-NULL as it will for non-NULL = NULL.
Technically, if you coded the function to do something different
there, the hashed vs non-hashed could differ in their result. My
thoughts there, if someone is expecting anything sane out of such an
equality function, then they're probably going to be disappointed due
to various other optimisations we have.
David
Attachments:
v3-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchapplication/octet-stream; name=v3-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchDownload+395-76
David Rowley <dgrowleyml@gmail.com> writes:
I've attached a version that "probes" the equality function for its
NULL = NULL behaviour and its NULL = non-NULL behaviour and returns
whatever the result of the probe was at the appropriate time.
What I came up with does feel quite elaborate, so I'd quite like a 2nd opinion.
The patch does assume that the non-strict function will return the
same thing for NULL = non-NULL as it will for non-NULL = NULL.
Meh. I think we assume that hashable equality functions satisfy the
symmetric law, ie A = B if and only if B = A, so that part is fine.
However, I do not care for the assumption that any random non-null
input will produce the same answer. As a quick counter-example,
consider a text-like datatype that tries to emulate Oracle's semantics
that an empty string is the same as NULL. Your code would arrive at
different results depending on whether the first non-null input
chanced to be an empty string.
(I've not read this whole thread, so I don't have a global opinion
on what we ought to do here. I suspect it's a tricky subject.)
regards, tom lane
On Apr 20, 2026, at 11:46, David Rowley <dgrowleyml@gmail.com> wrote:
Of course, it is possible to make the strict function do that, and
non-hashed IN / NOT IN handles it, so the hashed version shouldn't
have an excuse to not do the right thing.I've attached a version that "probes" the equality function for its
NULL = NULL behaviour and its NULL = non-NULL behaviour and returns
whatever the result of the probe was at the appropriate time.What I came up with does feel quite elaborate, so I'd quite like a 2nd opinion.
The patch does assume that the non-strict function will return the
same thing for NULL = non-NULL as it will for non-NULL = NULL.
Technically, if you coded the function to do something different
there, the hashed vs non-hashed could differ in their result. My
thoughts there, if someone is expecting anything sane out of such an
equality function, then they're probably going to be disappointed due
to various other optimisations we have.
Hi,
Thanks for the discussion.
I agree with Tom's concern that it does not seem safe to generalize from
NULL = first-non-NULL to all non-NULL values. Unless I am missing one, I
do not know of a planner/executor-visible contract that would justify
that assumption.
For the original NULL-LHS bug, a linear fallback still seems like the
safest baseline fix to me. It is conservative, but it matches
ExecEvalScalarArrayOp() without adding extra assumptions. The obvious
downside is performance, although this path only triggers when the
runtime LHS is NULL and the comparator is non-strict. It may also be
possible to cache the NULL-LHS outcome once per expression, since the
RHS array is constant in the hashed SAOP case, which might help reduce
the cost of that fallback.
ChangAo's example also seems to expose a separate correctness issue. If
the comparator can return NULL even for non-NULL inputs, then a lookup
hit seems sufficient, but a miss is no longer enough to distinguish
FALSE for IN / TRUE for NOT IN from NULL.
A conservative fix there would again be a linear fallback after miss,
which should recover the right semantics, but that case does seem much
more performance-sensitive.
So I would be interested to hear what people think about both cases,
especially if there is a better way to preserve correctness without
paying the full linear-fallback cost.
--
Best regards,
Chengpeng Yan
On Mon, 20 Apr 2026 at 18:17, Chengpeng Yan <chengpeng_yan@outlook.com> wrote:
It may also be
possible to cache the NULL-LHS outcome once per expression, since the
RHS array is constant in the hashed SAOP case, which might help reduce
the cost of that fallback.
Yeah, it doesn't make sense to repeatedly perform a linear search over
the array to check if NULL matches anything in the array. Let's just
do that once when we build the hash table and reuse that cached value
whenever we see a NULL. We can skip that step with strict functions
since we'll short-circuit earlier.
A patch for that is attached.
ChangAo's example also seems to expose a separate correctness issue. If
the comparator can return NULL even for non-NULL inputs, then a lookup
hit seems sufficient, but a miss is no longer enough to distinguish
FALSE for IN / TRUE for NOT IN from NULL.
IMO it's unrealistic to assume we can do anything sane with an
equality function that always returns NULL.
A conservative fix there would again be a linear fallback after miss,
which should recover the right semantics, but that case does seem much
more performance-sensitive.
I really doubt it's worth troubling over that. If we did want to do
something, then it would be more efficient to probe the hash table
directly after we insert a Datum and verify we can find it again. If
we can't find any value we just inserted, mark the entire table as
broken and have it so we check for that and do a linear search.
David
Attachments:
v3-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchapplication/octet-stream; name=v3-0001-Fix-incorrect-logic-for-hashed-IN-NOT-IN-with-non.patchDownload+368-57
Hi,
On Apr 23, 2026, at 07:33, David Rowley <dgrowleyml@gmail.com> wrote:
Yeah, it doesn't make sense to repeatedly perform a linear search over
the array to check if NULL matches anything in the array. Let's just
do that once when we build the hash table and reuse that cached value
whenever we see a NULL. We can skip that step with strict functions
since we'll short-circuit earlier.A patch for that is attached.
Thanks for working on this. Overall, this version looks good to me, and
I'm fine with the current approach. One possible improvement, though not
a blocker, would be to defer the lhs-NULL handling until we actually
encounter the first NULL on the lhs. That could avoid a bit of extra
work in the common case where the lhs contains no NULLs. That said, I
think the current implementation is perfectly OK as-is.
IMO it's unrealistic to assume we can do anything sane with an
equality function that always returns NULL.I really doubt it's worth troubling over that. If we did want to do
something, then it would be more efficient to probe the hash table
directly after we insert a Datum and verify we can find it again. If
we can't find any value we just inserted, mark the entire table as
broken and have it so we check for that and do a linear search.
I tend to agree. Even if such a case can be constructed, it seems rare
enough that I am not sure it is worth adding more complexity, or extra
overhead in the common hashed SAOP path, to handle it in this patch. I
think we can revisit that separately if a concrete case turns up that
seems worth looking into.
--
Best regards,
Chengpeng Yan
Thanks for looking.
On Thu, 23 Apr 2026 at 16:31, Chengpeng Yan <chengpeng_yan@outlook.com> wrote:
One possible improvement, though not
a blocker, would be to defer the lhs-NULL handling until we actually
encounter the first NULL on the lhs. That could avoid a bit of extra
work in the common case where the lhs contains no NULLs.
I thought of it, but didn't do it as it meant having to keep a bit
more state to track if we've filled the cache yet, plus the extra
costs incurred to check if we've done it yet that would have to be
paid for every NULL lookup. We currently have to check if the hash
table has been set up already, so I felt more comfortable installing
the new code in with that.
David
On Apr 23, 2026, at 13:32, David Rowley <dgrowleyml@gmail.com> wrote:
I thought of it, but didn't do it as it meant having to keep a bit
more state to track if we've filled the cache yet, plus the extra
costs incurred to check if we've done it yet that would have to be
paid for every NULL lookup. We currently have to check if the hash
table has been set up already, so I felt more comfortable installing
the new code in with that.
That makes sense to me. I agree that tying it to the existing hash-table
setup is the simpler place to do it, and avoids adding extra state plus
another check on each NULL lookup.
So I am fine with keeping it as-is.
--
Best regards,
Chengpeng Yan