How bad is using queries with thousands of values for operators IN or ANY?

Started by Thorsten Schöning · over 5 years ago · 13 messages · general
#1Thorsten Schöning
tschoening@am-soft.de

Hi all,

I have lots of queries in which I need to restrict access to rows
using some decimal row ID, and I'm mostly doing so with the operator
IN in WHERE clauses. Additionally, I'm mostly embedding the IDs as a
","-separated list directly into the query, e.g. because I already
hit the limitation of ~32k bind parameters in the JDBC driver[1] for
Postgres.

I really thought that in most cases simply sending a large number of
IDs embedded into the query is better than looping, because it saves
round trips to the DB, the planner of the DB has all the information
it needs to choose the best strategy, etc. OTOH, with recent tests
and the number of IDs increased by about a factor of 100, I see
additional load in Tomcat before the query is even sent to the DB,
and of course in the DB itself as well. I've attached an example
query and plan.

->  Hash  (cost=242592.66..242592.66 rows=6825 width=39) (actual time=91.117..91.117 rows=40044 loops=3)
      Buckets: 65536 (originally 8192)  Batches: 1 (originally 1)  Memory Usage: 3016kB
      ->  Hash Join  (cost=137.57..242592.66 rows=6825 width=39) (actual time=10.194..82.412 rows=40044 loops=3)
            Hash Cond: (meter.meter_bcd = meter_bcd.id)
            ->  Index Scan using pk_meter on meter  (cost=0.42..242237.10 rows=40044 width=25) (actual time=9.350..71.276 rows=40044 loops=3)
                  Index Cond: (id = ANY ('{[...]}'::integer[]))
            ->  Hash  (cost=135.73..135.73 rows=113 width=22) (actual time=0.830..0.830 rows=113 loops=3)

Do you know of any obvious limitations of the JDBC driver when
handling such large queries? In the end, the query is mostly large
text with only very few bind parameters.

Do you know of any obvious problem in Postgres itself with that query,
e.g. when parsing it? Do things simply take as long as they take,
mostly comparable to looping, or is there additional overhead the
larger the query itself gets? My naive expectation is that comparing
IDs shouldn't care whether they arrive in a loop or all at once.

I'm just trying to collect some input on where to look to optimize
things in the future. Thanks!

[1]: https://github.com/pgjdbc/pgjdbc/issues/90

Kind regards,

Thorsten Schöning

--
Thorsten Schöning E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme http://www.AM-SoFT.de/

Telefon...........05151- 9468- 55
Fax...............05151- 9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow

Attachments:

attachment.zip (application/x-zip-compressed)
#2Tim Cross
theophilusx@gmail.com
In reply to: Thorsten Schöning (#1)
Re: How bad is using queries with thousands of values for operators IN or ANY?


It would help to see the query as well as the plan.

Where are these hundreds of IDs coming from? I sometimes find this a
sign that you could re-structure your query into a join between two
tables, where one table contains the IDs of interest, rather than
trying to embed them into the query as part of a WHERE clause.
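As a minimal sketch of that restructuring (the ID table and its column are hypothetical; only `meter` appears in the attached plan):

```sql
-- Instead of embedding thousands of literals:
--   SELECT * FROM meter WHERE id IN (1, 2, 3 /* ...thousands more... */);
-- keep the IDs of interest in their own table and join against it:
CREATE TABLE interesting_ids (id integer PRIMARY KEY);

SELECT m.*
FROM meter m
JOIN interesting_ids i ON i.id = m.id;
```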

--
Tim Cross

#3Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Thorsten Schöning (#1)
Re: How bad is using queries with thousands of values for operators IN or ANY?

On Mon, 2020-08-31 at 10:04 +0200, Thorsten Schöning wrote:

I have lots of queries in which I need to restrict access to rows
using some decimal row-ID and am mostly doing so with using the
operator IN in WHERE-clauses. Additionally I'm mostly embedding the
IDs as ","-seperated list into the query directly, e.g. because I
already hit a limitation of ~32k parameters of the JDBC-driver[1] for
Postgres.

Is it an option to change that usage pattern?

Very often this is some kind of "poor man's join": first select all
rows from table A that match a certain condition, then select all
rows from B where the foreign key matches any IDs from the first query.
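Sketched with hypothetical tables a and b, that pattern and its rewrite look like:

```sql
-- "Poor man's join": two round trips, with the IDs shipped through the client
--   1) SELECT id FROM a WHERE some_condition;
--   2) SELECT * FROM b WHERE a_id IN (/* IDs returned by step 1 */);

-- The same result as one statement, leaving the strategy to the planner:
SELECT b.*
FROM b
JOIN a ON a.id = b.a_id
WHERE a.some_condition;
```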

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thorsten Schöning (#1)
Re: How bad is using queries with thousands of values for operators IN or ANY?

On Mon, Aug 31, 2020 at 10:04, Thorsten Schöning <tschoening@am-soft.de>
wrote:


It is not good: it increases the memory needed for query parsing, and
the optimizer and executor are slower.

Postgres currently has no optimization for searching in these long
lists, so this search is very slow compared to other methods.

I think this is a signal that something in the database design or
architecture is wrong. Sure, there can be exceptions, but Postgres has
no optimization for this design.

Regards

Pavel


#5Thorsten Schöning
tschoening@am-soft.de
In reply to: Tim Cross (#2)
Re: How bad is using queries with thousands of values for operators IN or ANY?

Hello Tim Cross,
on Monday, August 31, 2020 at 10:21 you wrote:

It would help to see the query as well as the plan.

I've attached both to my mail and my received copy shows it's
available. Maybe the following file hoster works better for you:

https://gofile.io/d/vWPliD
https://gofile.io/d/AiaDmN

Where are these 100s of IDs coming from? I sometimes find this a sign
you could be re-structuring your query to be a join between two tables
where one table contains the IDs of interest rather than trying to embed
them into the query as part of a where clause.

It's somewhat like that, but on a higher level: my app has different
functions, one searching for rows according to some criteria and
returning IDs, and one reading those IDs and taking care of additional
things. It's like that because both have already been available
individually and are needed individually, while sometimes it makes
sense to combine them.

Integrating things is an option of course; that's exactly why I asked
for opinions on how bad the current implementation is. If this
concrete aspect is not too bad, I might simply keep it and look
elsewhere to improve.

Kind regards,

Thorsten Schöning


#6Thorsten Schöning
tschoening@am-soft.de
In reply to: Laurenz Albe (#3)
Re: How bad is using queries with thousands of values for operators IN or ANY?

Hello Laurenz Albe,
on Monday, August 31, 2020 at 10:23 you wrote:

Is it an option to change that usage pattern?

Not everywhere, and not easily, because I simply have places in which
I'm working with IDs only and e.g. need to check whether they are
owned by some user. Those checks are abstracted away into individual
functions instead of always JOINing, as the latter makes queries more
difficult to maintain.

Of course there are alternatives I might look at, like row level
security. And that's exactly why I asked the question: to make up my
mind about whether it's worth it, or necessary, to look into
alternatives at all.
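A row level security setup along those lines might look like this (the policy, the `owner_id` column, and the setting name are hypothetical):

```sql
-- Let the database enforce ownership instead of passing ID lists around:
ALTER TABLE meter ENABLE ROW LEVEL SECURITY;

CREATE POLICY meter_owner ON meter
    USING (owner_id = current_setting('app.current_user_id')::integer);

-- The application then sets the user once per session or transaction:
-- SET app.current_user_id = '42';
```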

Kind regards,

Thorsten Schöning


#7Thorsten Schöning
tschoening@am-soft.de
In reply to: Pavel Stehule (#4)
Re: How bad is using queries with thousands of values for operators IN or ANY?

Hello Pavel Stehule,
on Monday, August 31, 2020 at 10:51 you wrote:

It is not good - it increases the memory necessary for query parsing,
optimizer and executor are slower.

At least memory and I/O were not a problem in my tests; CPU load was
pretty high. But I was unable to tell whether that comes from
comparing IDs or from parsing the query or the like.

Can one see the difference somehow? That would be the only way to
tell whether splitting the large query into multiple smaller ones is
an overall improvement.

Postgres currently has not any optimization for processing searching in
these long lists - so this search is very slow against other methods.

Which other methods do you have in mind, populating a temporary table
and joining against it? I tested things like that in the context of
other queries, and creating those tables, indexes etc. introduced more
overhead than the query benefited.

Additionally, it's not always these large queries. Most of the time
it's far fewer IDs and things are fast. I even introduced embedding
those IDs because the query was faster that way than with some JOIN
comparing the IDs, even though indexes were in place and used etc.

So at what query size or number of IDs in an IN list would you
consider a different approach at all?

Kind regards,

Thorsten Schöning


#8Thomas Kellerer
shammat@gmx.net
In reply to: Thorsten Schöning (#7)
Re: How bad is using queries with thousands of values for operators IN or ANY?

Thorsten Schöning wrote on 31.08.2020 at 12:37:

So for what query size or number of IDs to compare in IN would you
consider a different approach at all?

In my experience "hundreds" of IDs tend to be quite slow if used with an IN clause.

Rewriting the IN to a JOIN against a VALUES clause is very often faster:

So instead of:

select *
from t
where id in (1,2,3, .... ,500);

using this:

select *
from t
  join (
    values (1),(2),(3),...(500)
  ) as x(id) on x.id = t.id;

more often than not produces a more efficient execution plan (assuming no values are duplicated in the IN list).

Obviously I don't know if such a re-write is even feasible though.
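Another common rewrite worth noting (not from this thread's attachments) is binding the whole list as a single array parameter, which also sidesteps the ~32k bind parameter limit of the JDBC driver mentioned in #1; the plan in #1 already shows Postgres handling this form via `id = ANY('{...}'::integer[])`:

```sql
-- One bind parameter regardless of how many IDs. From JDBC:
--   PreparedStatement stmt = conn.prepareStatement("SELECT * FROM t WHERE id = ANY(?)");
--   stmt.setArray(1, conn.createArrayOf("integer", ids));
-- Equivalent with an array literal:
SELECT *
FROM t
WHERE id = ANY('{1,2,3}'::integer[]);
```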

Thomas

#9Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thomas Kellerer (#8)
Re: How bad is using queries with thousands of values for operators IN or ANY?

On Mon, Aug 31, 2020 at 13:29, Thomas Kellerer <shammat@gmx.net> wrote:


Yes, this query will probably have a slow start, but the execution
will be fast. Unfortunately, statistics are not available for the
VALUES list.


#10Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Pavel Stehule (#9)
Re: How bad is using queries with thousands of values for operators IN or ANY?

At Mon, 31 Aug 2020 16:04:43 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in


FWIW, the attached is a dusted-off version of part of a stalled
development of mine, which unconditionally(!) creates on-the-fly
statistics on the VALUES list. It seems to work for certain cases,
although the planning time increases significantly.

=$ CREATE TABLE t1 AS SELECT a, a * 2 AS b FROM generate_series(0, 99999) a;
=$ CREATE INDEX ON t1 (a);

perl q.pl(*) | psql

*: q.pl:

print "explain analyze select b from t1 join (values ";
foreach $i (0..10000) {
    print ", " if ($i > 0);
    printf("(%d)", $i/10 + 1000);
}
print ") as v(v) on (v.v = t1.a);";

patched:

Merge Join  (cost=824.25..1005.19 rows=10001 width=4) (actual time=13.513..24.285 rows=10001 loops=1)
  Merge Cond: (t1.a = "*VALUES*".column1)
  ->  Index Scan using t1_a_idx on t1  (cost=0.29..3050.29 rows=100000 width=8) (actual time=0.033..1.629 rows=2002 loops=1)
  ->  Sort  (cost=789.47..814.47 rows=10001 width=4) (actual time=12.557..14.546 rows=10001 loops=1)
        Sort Key: "*VALUES*".column1
        Sort Method: quicksort  Memory: 931kB
        ->  Values Scan on "*VALUES*"  (cost=0.00..125.01 rows=10001 width=4) (actual time=0.002..8.271 rows=10001 loops=1)
Planning Time: 17.290 ms
Execution Time: 26.344 ms
(9 rows)

master:

Hash Join  (cost=250.03..2168.03 rows=10001 width=4) (actual time=14.482..77.205 rows=10001 loops=1)
  Hash Cond: (t1.a = "*VALUES*".column1)
  ->  Seq Scan on t1  (cost=0.00..1443.00 rows=100000 width=8) (actual time=0.017..23.540 rows=100000 loops=1)
  ->  Hash  (cost=125.01..125.01 rows=10001 width=4) (actual time=13.786..13.788 rows=10001 loops=1)
        Buckets: 16384  Batches: 1  Memory Usage: 480kB
        ->  Values Scan on "*VALUES*"  (cost=0.00..125.01 rows=10001 width=4) (actual time=0.002..8.503 rows=10001 loops=1)
Planning Time: 12.365 ms
Execution Time: 78.567 ms
(8 rows)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Temporary-statistics-on-VALUES-list.patch (text/x-patch)
#11Michael Lewis
mlewis@entrata.com
In reply to: Kyotaro Horiguchi (#10)
Re: How bad is using queries with thousands of values for operators IN or ANY?

On Tue, Sep 1, 2020 at 1:22 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:


We end up abusing the option of creating temp tables and analyzing
them to get around the pain of queries going off the rails because of
bad stats, or a lack of stats, on VALUES lists. I believe most/all of
the core team and perhaps most contributors are against query hints in
general (with some very good reasons), but it would be amazing to have
the option to incur the planning-time cost in at least some cases.
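That temp-table workaround, roughly sketched (table and column names are hypothetical):

```sql
-- Materialize the ID list so the planner gets real statistics:
CREATE TEMP TABLE wanted_ids (id integer) ON COMMIT DROP;
INSERT INTO wanted_ids VALUES (1), (2), (3);  -- ...thousands of rows
ANALYZE wanted_ids;

SELECT t.*
FROM t
JOIN wanted_ids w ON w.id = t.id;
```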

For my case, I think the changes coming in PG v13 or maybe v14 for joins
being helped by multivariate statistics will mitigate the pain point.

#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro Horiguchi (#10)
Re: How bad is using queries with thousands of values for operators IN or ANY?

On Tue, Sep 1, 2020 at 9:22, Kyotaro Horiguchi <horikyota.ntt@gmail.com>
wrote:


nice :)

Pavel


#13Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Pavel Stehule (#12)
Re: How bad is using queries with thousands of values for operators IN or ANY?

At Tue, 1 Sep 2020 18:46:01 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in

nice :)

thanks ;p

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center