Partial index on JSON column
Hello
I have a table with ~3 billion events.
Of this, there are a small subset of events which match the following query:
CREATE INDEX index_user_event_for_suggestion_notification ON
public.user_event USING btree ((((parameters ->>
'suggestion_id'::text))::integer), what) WHERE ((parameters ->>
'suggestion_id'::text) IS NOT NULL)
When I do this query:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND what = 'suggestion_notification';
Aggregate (cost=7115959.48..7115959.49 rows=1 width=8)
-> Bitmap Heap Scan on user_event (cost=37360.24..7115907.56 rows=20771
width=0)
Recheck Cond: ((what)::text = 'suggestion_notification'::text)
Filter: (((parameters ->> 'suggestion_id'::text))::integer = 26)
-> Bitmap Index Scan on index_user_event_for_clustering
(cost=0.00..37355.05 rows=4154273 width=0)
Index Cond: ((what)::text = 'suggestion_notification'::text)
It's slow. I need to explicitly add the NULL constraint:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND ((parameters ->> 'suggestion_id'::
text) IS NOT NULL) AND what = 'suggestion_notification';
Aggregate (cost=38871.48..38871.49 rows=1 width=8)
-> Index Scan using index_user_event_for_suggestion_notification on
user_event (cost=0.42..38819.81 rows=20668 width=0)
Index Cond: ((((parameters ->> 'suggestion_id'::text))::integer =
26) AND ((what)::text = 'suggestion_notification'::text))
I feel like the null constraint should be implicit.
That being said:
- Is my partial index wrong? Should I write it differently so the optimiser
knows this?
- Is my query wrong? How can I make the most use of this index without
being explicit?
- Any other suggestions for how I can do this?
Thanks
Samuel
Samuel Williams <space.ship.traveller@gmail.com> writes:
When I do this query:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND what = 'suggestion_notification';
It's slow. I need to explicitly add the NULL constraint:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';
I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold. In any
case you're better off without the runtime type conversion: that
isn't doing much for you except raising the odds of getting an error.
regards, tom lane
Thanks for the quick reply Tom,
I will try your advice.
The reason why I used ::integer for the INDEX is because I assumed it would
be more efficient both in space and performance.
In the JSONB field, it is actually an integer, i.e. {"location_age": 1,
"suggestion_id": 26}
So, now that I think about it, maybe the way I'm using ::text is wrong.
Any further advice is most appreciated.
Kind regards,
Samuel
On Wed, 20 Feb 2019 at 10:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Samuel Williams <space.ship.traveller@gmail.com> writes:
When I do this query:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND what ='suggestion_notification';
It's slow. I need to explicitly add the NULL constraint:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold. In any
case you're better off without the runtime type conversion: that
isn't doing much for you except raising the odds of getting an error.regards, tom lane
On Tue, Feb 19, 2019 at 1:41 PM Samuel Williams <
space.ship.traveller@gmail.com> wrote:
Hello
I have a table with ~3 billion events.
Of this, there are a small subset of events which match the following
query:CREATE INDEX index_user_event_for_suggestion_notification ON
public.user_event USING btree ((((parameters ->>
'suggestion_id'::text))::integer), what) WHERE ((parameters ->>
'suggestion_id'::text) IS NOT NULL)
You may want to use this-
CREATE INDEX index_user_event_for_suggestion_notification ON
public.user_event USING btree (parameters ->> 'suggestion_id', what) WHERE
parameters ? 'suggestion_id';
-Michael
On Wed, 20 Feb 2019 at 10:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Samuel Williams <space.ship.traveller@gmail.com> writes:
When I do this query:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND what ='suggestion_notification';
It's slow. I need to explicitly add the NULL constraint:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold. In any
case you're better off without the runtime type conversion: that
isn't doing much for you except raising the odds of getting an error.
For run-time safe comparisons of something stored in jsonb that I expect to
only have booleans (and I don't trust input to always be valid), I have a
utility function like this-
CREATE OR REPLACE FUNCTION public.util_to_bool( pInput TEXT )
RETURNS BOOL AS
$BODY$
BEGIN
RETURN pInput::BOOL;
EXCEPTION WHEN OTHERS THEN
RETURN FALSE;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
This is probably not best practice however.
Show quoted text
regards, tom lane
So, I checked my original code, and I didn't add `::text`, it was actually
added by Postgres:
EXPLAIN for: SELECT "user_event".* FROM "user_event" WHERE
"user_event"."what" = $1 AND (((parameters ->> 'suggestion_id')::integer) =
119 AND ((parameters ->> 'suggestion_id') IS NOT NULL)) [["what",
"suggestion_notification"]]
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Index Scan using index_user_event_for_suggestion_notification on
user_event (cost=0.42..77193.95 rows=20669 width=138)
Index Cond: ((((parameters ->> 'suggestion_id'::text))::integer = 119)
AND ((what)::text = 'suggestion_notification'::text))
(2 rows)
Is there some way to directly use the integer value in the index with
minimal type coercions?
Thanks
Samuel
On Wed, 20 Feb 2019 at 10:24, Samuel Williams <
space.ship.traveller@gmail.com> wrote:
Show quoted text
Thanks for the quick reply Tom,
I will try your advice.
The reason why I used ::integer for the INDEX is because I assumed it
would be more efficient both in space and performance.In the JSONB field, it is actually an integer, i.e. {"location_age": 1,
"suggestion_id": 26}So, now that I think about it, maybe the way I'm using ::text is wrong.
Any further advice is most appreciated.
Kind regards,
SamuelOn Wed, 20 Feb 2019 at 10:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Samuel Williams <space.ship.traveller@gmail.com> writes:
When I do this query:
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text)::integer = 26) AND what ='suggestion_notification';
It's slow. I need to explicitly add the NULL constraint:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold. In any
case you're better off without the runtime type conversion: that
isn't doing much for you except raising the odds of getting an error.regards, tom lane
I wrote:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';
I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold.
Hmm ... on closer inspection, we do assume that CoerceViaIO is strict,
but not everyplace knows that, so there's something that could be
improved here. In the meantime, try it as stated above.
regards, tom lane
Thanks Tom, I did solve the problem by adding the null constraint for now,
it's a quick solution, and I look forward to the future where this case is
handled appropriately.
On Wed, 20 Feb 2019 at 12:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
I wrote:
Try it like
EXPLAIN SELECT COUNT(*) FROM "user_event" WHERE ((parameters ->>
'suggestion_id'::text) = '26') AND what = 'suggestion_notification';I don't think we assume that CoerceViaIO is strict, and without that
the deduction that the value couldn't be null doesn't hold.Hmm ... on closer inspection, we do assume that CoerceViaIO is strict,
but not everyplace knows that, so there's something that could be
improved here. In the meantime, try it as stated above.regards, tom lane