Simple IN vs IN (VALUES ...) performance
Greetings,
We have queries with IN filters containing long lists of INT values.
Sometimes they run extremely slowly, and I found a suggestion to use the syntax
Field IN (VALUES(1465), (1478), ...
instead of
Field IN (1465, 1478, ...
In some cases it helps, but in others it makes the query run 1000+ times slower.
So the issue is:
- some queries much faster with simple IN
- some queries much faster with IN + VALUES
Here is the EXPLAIN output for a case where it is extremely slow:
->  HashAggregate  (cost=5.78..9.62 rows=385 width=4)
      Group Key: "*VALUES*".column1
      ->  Values Scan on "*VALUES*"  (cost=0.00..4.81 rows=385 width=4)
What is the right way to pass a long list of INT values to an IN filter?
I am using PostgreSQL 13.1 on Ubuntu
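For clarity, here are complete forms of the two variants being compared (the table and column names are invented for illustration; the real queries use a much longer list):

```sql
-- Simple IN list: the planner folds this into field = ANY('{...}')
SELECT * FROM items WHERE field IN (1465, 1478, 1502);

-- IN + VALUES: the planner treats the list as a relation and joins against it
SELECT * FROM items WHERE field IN (VALUES (1465), (1478), (1502));
```

The second form is what produces the Values Scan / HashAggregate nodes shown in the plan above.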
On Monday, February 22, 2021, Oleksandr Voytsekhovskyy <av@uniweb.ua> wrote:
What is the right way to pass long INT values list to IN filter
Don’t.
Pass in a delimited string, then parse that string into an array and use “=
any(array)”.
This has the primary benefit of making the input a single parameter.
David J.
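A minimal sketch of this suggestion, assuming a hypothetical table "items" and a single text parameter $1 holding a comma-delimited list such as '1465,1478,1502':

```sql
-- Parse the delimited string into an int array, then filter with = ANY
SELECT *
FROM items
WHERE field = ANY (string_to_array($1, ',')::int[]);
```

The whole list travels as one bind parameter, so the statement text stays constant regardless of how many values are passed.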
Wouldn't using “= any(array)” change how the query is planned? Or is the
concern just parsing the values?
On 2/22/21 9:00 AM, Oleksandr Voytsekhovskyy wrote:
Insert the values into a TEMPORARY TABLE, then join that to your main table?
--
Angular momentum makes the world go 'round.
Insert the values into a TEMPORARY TABLE, then join that to your main
table?
In my experience, this is very performant, but it needs an ANALYZE command
after populating the temp table to ensure there are statistics so the plan
doesn't go awry. Otherwise, I'm not sure it is different from a
materialized CTE or a subquery with OFFSET 0 at the end.
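A sketch of the temp-table approach and the materialized-CTE alternative mentioned above, assuming a main table "items" with an indexed "field" column (names invented for illustration):

```sql
-- Temp-table approach: load the IDs, then ANALYZE so the planner has statistics
CREATE TEMPORARY TABLE filter_ids (id int PRIMARY KEY);
INSERT INTO filter_ids VALUES (1465), (1478), (1502);
ANALYZE filter_ids;

SELECT i.*
FROM items i
JOIN filter_ids f ON f.id = i.field;

-- Materialized-CTE alternative (PostgreSQL 12+); VALUES exposes "column1" by default
WITH ids AS MATERIALIZED (VALUES (1465), (1478), (1502))
SELECT i.*
FROM items i
JOIN ids ON ids.column1 = i.field;
```

The MATERIALIZED keyword forces the CTE to be evaluated once as an optimization fence, similar in effect to the old OFFSET 0 trick on a subquery.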
Greetings,
I didn't quite get your idea with the materialized CTE or a subquery with OFFSET 0.
Could you please show a simple example?
Thanks for the suggestion.
With temp tables there is another issue: the queries already join 5-6 tables and
have 2-3 IN filters. If I replace the filters with temp tables, I may hit the
query planner's limits and it will start producing terrible query plans, for
example when the genetic query optimizer kicks in.
On Tue, Feb 23, 2021 at 1:45 AM Ron <ronljohnsonjr@gmail.com> wrote:
Hi
On Tue, Feb 23, 2021 at 10:36 Alexander Voytsekhovskyy <
young.inbox@gmail.com> wrote:
you can increase those limits - the defaults are relatively low and can
safely be raised on modern CPUs.
https://www.postgresql.org/docs/current/runtime-config-query.html
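For reference, the relevant settings from that page can be raised per session; the values below are illustrative, not recommendations:

```sql
-- Defaults shown in comments; raising them lets the exhaustive planner
-- handle more FROM items before collapsing stops or GEQO takes over
SET from_collapse_limit = 16;  -- default 8
SET join_collapse_limit = 16;  -- default 8
SET geqo_threshold = 20;       -- default 12; GEQO activates at this many FROM items
```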
Regards
Pavel
Greetings,
I have tested both options, ANY with string parsing and a plain array, and
there are cases where execution time is 100+ times worse than with IN.
On Tue, Feb 23, 2021 at 12:23 AM David G. Johnston <
david.g.johnston@gmail.com> wrote: