Inconsistent Behavior in JSONB Numeric Array Deletion

Started by Mark Dake7 months ago9 messages
#1Mark Dake
mark.drake@golden-hind.com

Hi all,

I'd like to report what I believe is an inconsistency in the behavior of the
jsonb - operator when applied to numeric arrays.

Problem

PostgreSQL allows us to check for the presence of a scalar inside a JSONB
array:

SELECT jsonb('[2,3,1]') @> to_jsonb(1);

-- Returns true

However, when attempting to remove that value from the array using -, the
operation fails:

SELECT jsonb('[2,3,1]') - to_jsonb(1);

-- ERROR: operator does not exist: jsonb - jsonb

-- HINT: You might need to add explicit type casts.

This behavior differs from index-based removal:

SELECT jsonb('[2,3,1]') - 1;

-- Returns [2, 1]

But logically, if @> to_jsonb(1) is true, then jsonb('[2,3,1]') -
to_jsonb(1) should remove the value, not the index.

Proposal

Support a jsonb - jsonb operator where, if the RHS is a scalar that appears
in the LHS array, the operator removes all matching values:

SELECT jsonb('[2,3,1]') - to_jsonb(1);

-- Expected: [2, 3]

This would mirror similar behavior in many application languages and allow
value-based deletion from JSON arrays without casting back to SQL arrays or
using procedural workarounds.

Impact

The absence of this capability creates a gap in value-level JSONB
manipulation. Developers often have to resort to:

* Procedural code in PL/pgSQL
* Transforming JSONB arrays into SQL arrays (with limited type
support)
* Writing client-side logic

Adding support for this behavior would simplify many API use cases involving
JSON state manipulation.

_____

Happy to clarify further or contribute a patch.

Best regards,
Mark Drake

#2David E. Wheeler
david@justatheory.com
In reply to: Mark Dake (#1)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Jun 7, 2025, at 16:20, Mark Dake <mark.drake@golden-hind.com> wrote:

Support a jsonb - jsonb operator where, if the RHS is a scalar that appears in the LHS array, the operator removes all matching values:
SELECT jsonb('[2,3,1]') - to_jsonb(1);
-- Expected: [2, 3]
This would mirror similar behavior in many application languages and allow value-based deletion from JSON arrays without casting back to SQL arrays or using procedural workarounds.

FWIW, this behavior exists using text values:

david=# select '["a", "b", "c", "b"]'::jsonb - 'b';
?column?
------------
["a", "c"]

But I take your point about using a JSONB value as the second argument. I wonder if it might be slightly confusing, though. The `-` operator is already pretty overloaded with varying behavior based on the type of the right operand, but maybe that ship has sunk.

Impact
The absence of this capability creates a gap in value-level JSONB manipulation. Developers often have to resort to:
• Procedural code in PL/pgSQL
• Transforming JSONB arrays into SQL arrays (with limited type support)
• Writing client-side logic
Adding support for this behavior would simplify many API use cases involving JSON state manipulation.

I like the idea, we just may want to muck with the semantics a bit. Do you have a patch to share?

Best,

David

#3Robert Haas
robertmhaas@gmail.com
In reply to: Mark Dake (#1)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Tue, Jun 10, 2025 at 4:52 PM Mark Dake <mark.drake@golden-hind.com> wrote:

SELECT jsonb('[2,3,1]') @> to_jsonb(1);
-- Returns true

However, when attempting to remove that value from the array using -, the operation fails:
SELECT jsonb('[2,3,1]') - to_jsonb(1);
-- ERROR: operator does not exist: jsonb - jsonb
-- HINT: You might need to add explicit type casts.

This behavior differs from index-based removal:
SELECT jsonb('[2,3,1]') - 1;
-- Returns [2, 1]

But logically, if @> to_jsonb(1) is true, then jsonb('[2,3,1]') - to_jsonb(1) should remove the value, not the index.

I don't see how you can say that there's anything inconsistent here.
The proposed behavior of the operator you want to add would be
inconsistent with the existing integer subtraction operator, because
the former would remove by value and the latter removes by index. But
there's no inconsistency right now because the operator you want to
add doesn't exist yet. And I don't think the fact that @>(jsonb,jsonb)
exists can reasonably be said to be inconsistent with the fact that
-(jsonb,jsonb) doesn't, either.

Now, none of that means that we couldn't define -(jsonb,jsonb) in the
manner you propose. But that's just a feature idea, not an
inconsistency.

--
Robert Haas
EDB: http://www.enterprisedb.com

#4David E. Wheeler
david@justatheory.com
In reply to: Robert Haas (#3)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Jun 11, 2025, at 16:03, Robert Haas <robertmhaas@gmail.com> wrote:

The proposed behavior of the operator you want to add would be
inconsistent with the existing integer subtraction operator, because
the former would remove by value and the latter removes by index.

Bear in mind that `-` currently does both. Of the three current variants, the first two delete from an array by value:

* jsonb - text: Deletes a key (and its value) from a JSON object, or matching string value(s) from a JSON array.

* jsonb - text[] → jsonb: Deletes all matching keys or array elements from the left operand.

* jsonb - integer → jsonb: Deletes the array element with specified index (negative integers count from the end). Throws an error if JSON value is not an array.

Before I went and looked it up, I was also thinking this could use a different operator. But it’s already a bit overloaded, alas. So I could see the new behavior being:

* jsonb - jsonb → jsonb: Deletes the array element with specified value. Throws an error if JSON value is not an array.

Now, none of that means that we couldn't define -(jsonb,jsonb) in the
manner you propose. But that's just a feature idea, not an
inconsistency.

Agreed. One might also think of it as a “gap to fill”, but it’s a new behavior. In any event, whether the feature uses `-` is a point we can resolve by consensus, given a patch that implements the feature.

Best,

David

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David E. Wheeler (#4)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

"David E. Wheeler" <david@justatheory.com> writes:

Bear in mind that `-` currently does both. Of the three current variants, the first two delete from an array by value:

* jsonb - text: Deletes a key (and its value) from a JSON object, or matching string value(s) from a JSON array.

* jsonb - text[] → jsonb: Deletes all matching keys or array elements from the left operand.

* jsonb - integer → jsonb: Deletes the array element with specified index (negative integers count from the end). Throws an error if JSON value is not an array.

Before I went and looked it up, I was also thinking this could use a different operator. But it’s already a bit overloaded, alas. So I could see the new behavior being:

* jsonb - jsonb → jsonb: Deletes the array element with specified value. Throws an error if JSON value is not an array.

I fear that that would cause some problems. Consider

regression=# select '["foo", "bar"]'::jsonb - 'bar';
?column?
----------
["foo"]
(1 row)

Right now we resolve the unlabeled literal as type text.
But if jsonb - jsonb existed, we'd decide it's jsonb, thanks
to the heuristic that prefers same-type-as-the-other-input
(rule 2a at [1]https://www.postgresql.org/docs/current/typeconv-oper.html). So it's pretty nearly certain that
adding jsonb - jsonb would break some existing queries;
or worse, silently cause them to do something different.
Maybe that's acceptable, but it's a demerit of this proposal.

regards, tom lane

[1]: https://www.postgresql.org/docs/current/typeconv-oper.html

#6David E. Wheeler
david@justatheory.com
In reply to: Tom Lane (#5)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Jun 11, 2025, at 17:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I fear that that would cause some problems. Consider

regression=# select '["foo", "bar"]'::jsonb - 'bar';
?column?
----------
["foo"]
(1 row)

Right now we resolve the unlabeled literal as type text.
But if jsonb - jsonb existed, we'd decide it's jsonb, thanks
to the heuristic that prefers same-type-as-the-other-input
(rule 2a at [1]). So it's pretty nearly certain that
adding jsonb - jsonb would break some existing queries;
or worse, silently cause them to do something different.
Maybe that's acceptable, but it's a demerit of this proposal.

Ah. It’s a pity the existing operator behaves differently for different rhs operands. But maybe add a new one that’s defined to operator on contents rather than keys/indexes and deprecate (or un-document) the content behavior in the `-` operator?

Best,

David

#7Mark Drake
mark.drake@golden-hind.com
In reply to: David E. Wheeler (#2)
RE: Inconsistent Behavior in JSONB Numeric Array Deletion

Sorry, not a 'C' coder. A man must know his limits. ☹

-----Original Message-----
From: David E. Wheeler <david@justatheory.com>
Sent: Wednesday, June 11, 2025 11:49 AM
To: Mark Dake <mark.drake@golden-hind.com>
Cc: pgsql-hackers@postgresql.org
Subject: Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Jun 7, 2025, at 16:20, Mark Dake <mark.drake@golden-hind.com> wrote:

Support a jsonb - jsonb operator where, if the RHS is a scalar that appears in the LHS array, the operator removes all matching values:
SELECT jsonb('[2,3,1]') - to_jsonb(1);
-- Expected: [2, 3]
This would mirror similar behavior in many application languages and allow value-based deletion from JSON arrays without casting back to SQL arrays or using procedural workarounds.

FWIW, this behavior exists using text values:

david=# select '["a", "b", "c", "b"]'::jsonb - 'b';
?column?
------------
["a", "c"]

But I take your point about using a JSONB value as the second argument. I wonder if it might be slightly confusing, though. The `-` operator is already pretty overloaded with varying behavior based on the type of the right operand, but maybe that ship has sunk.

Impact
The absence of this capability creates a gap in value-level JSONB manipulation. Developers often have to resort to:
• Procedural code in PL/pgSQL
• Transforming JSONB arrays into SQL arrays (with limited type support)
• Writing client-side logic
Adding support for this behavior would simplify many API use cases involving JSON state manipulation.

I like the idea, we just may want to muck with the semantics a bit. Do you have a patch to share?

Best,

David

#8Mark Drake
mark.drake@golden-hind.com
In reply to: David E. Wheeler (#6)
RE: Inconsistent Behavior in JSONB Numeric Array Deletion

I am certainly not tied to the '-' operator, but I think the ability to remove items from a numeric json array, based on a value would be something that would benefit many users.

-----Original Message-----
From: David E. Wheeler <david@justatheory.com>
Sent: Wednesday, June 11, 2025 2:48 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Robert Haas <robertmhaas@gmail.com>; Mark Dake <mark.drake@golden-hind.com>; pgsql-hackers@postgresql.org
Subject: Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Jun 11, 2025, at 17:43, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I fear that that would cause some problems. Consider

regression=# select '["foo", "bar"]'::jsonb - 'bar'; ?column?
----------
["foo"]
(1 row)

Right now we resolve the unlabeled literal as type text.
But if jsonb - jsonb existed, we'd decide it's jsonb, thanks to the
heuristic that prefers same-type-as-the-other-input (rule 2a at [1]).
So it's pretty nearly certain that adding jsonb - jsonb would break
some existing queries; or worse, silently cause them to do something
different.
Maybe that's acceptable, but it's a demerit of this proposal.

Ah. It’s a pity the existing operator behaves differently for different rhs operands. But maybe add a new one that’s defined to operator on contents rather than keys/indexes and deprecate (or un-document) the content behavior in the `-` operator?

Best,

Davi

#9Robert Haas
robertmhaas@gmail.com
In reply to: Mark Drake (#7)
Re: Inconsistent Behavior in JSONB Numeric Array Deletion

On Tue, Jun 10, 2025 at 4:52 PM Mark Dake <mark.drake@golden-hind.com> wrote:

Happy to clarify further or contribute a patch.

On Thu, Jun 12, 2025 at 9:23 AM Mark Drake <mark.drake@golden-hind.com> wrote:

Sorry, not a 'C' coder. A man must know his limits. ☹

Uh ... what?

--
Robert Haas
EDB: http://www.enterprisedb.com