Query never completes with low work_mem (at least not within one hour)

Started by Daniel Westermann (DWE) about 9 years ago · 13 messages · pgsql-general
#1 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com

Hi,

PostgreSQL 9.6.2 on CentOS 7.3 x64.

This is my data set:

drop table if exists users;
drop table if exists ids;
create table users ( user_id int
, username varchar(50)
);
with generator as
( select a.*
from generate_series (1,3000000) a
order by random()
)
insert into users ( user_id
, username
)
select a
, md5(a::varchar)
from generator;
create unique index i_users on users ( user_id );
create table ids ( id int );
insert into ids (id) values ( generate_series ( 2500000, 3500000 ) );
create unique index i_ids on ids ( id );
analyze users;
analyze ids;

I have set work_mem to a very low value intentionally for demonstration purposes:

postgres=# show work_mem;
work_mem
----------
16MB
(1 row)

postgres=# show shared_buffers ;
shared_buffers
----------------
128MB
(1 row)

When I run the following query ( I know that "not in" is not a good choice here ):

postgres=# select count(user_id) from users where user_id not in ( select id from ids);

... this seems to never complete (at least not within one hour).

With work_mem set to 32MB, all is fine.

top shows the session at 100% CPU. No waits are listed in pg_stat_activity:

postgres=# select pid,wait_event_type,wait_event,state,query from pg_stat_activity ;
pid | wait_event_type | wait_event | state | query
-------+-----------------+------------+--------+------------------------------------------------------------------------------
17817 | | | active | select count(user_id) from users where user_id not in ( select id from ids);
17847 | | | active | select pid,wait_event_type,wait_event,state,query from pg_stat_activity ;
(2 rows)

strace shows more or less always the same pattern, so something is happening:

read(14, "\0\0\1\0\0\t\30\0\351G1\0\16\0\0\0\1\0\0\t\30\0\352G1\0\16\0\0\0\1\0"..., 8192) = 8192
read(14, "\1\0\0\t\30\0002J1\0\16\0\0\0\1\0\0\t\30\0003J1\0\16\0\0\0\1\0\0\t"..., 8192) = 8192
read(14, "\0\t\30\0{L1\0\16\0\0\0\1\0\0\t\30\0|L1\0\16\0\0\0\1\0\0\t\30\0"..., 8192) = 8192
read(14, "\30\0\304N1\0\16\0\0\0\1\0\0\t\30\0\305N1\0\16\0\0\0\1\0\0\t\30\0\306N"..., 8192) = 8192
read(14, "\rQ1\0\16\0\0\0\1\0\0\t\30\0\16Q1\0\16\0\0\0\1\0\0\t\30\0\17Q1\0"..., 8192) = 8192^C

postgres@pgbox:/u02/pgdata/PG962/ [PG962] ls -la /proc/17817/fd/
total 0
dr-x------. 2 postgres postgres 0 Apr 4 14:45 .
dr-xr-xr-x. 9 postgres postgres 0 Apr 4 14:34 ..
lr-x------. 1 postgres postgres 64 Apr 4 14:45 0 -> /dev/null
l-wx------. 1 postgres postgres 64 Apr 4 14:45 1 -> pipe:[58121]
lrwx------. 1 postgres postgres 64 Apr 4 14:45 10 -> socket:[58881]
lr-x------. 1 postgres postgres 64 Apr 4 14:45 11 -> pipe:[58882]
l-wx------. 1 postgres postgres 64 Apr 4 14:45 12 -> pipe:[58882]
lrwx------. 1 postgres postgres 64 Apr 4 14:45 13 -> /u02/pgdata/PG962/base/13323/16516
lrwx------. 1 postgres postgres 64 Apr 4 14:45 14 -> /u02/pgdata/PG962/base/pgsql_tmp/pgsql_tmp17817.1
l-wx------. 1 postgres postgres 64 Apr 4 14:45 2 -> pipe:[58121]
lrwx------. 1 postgres postgres 64 Apr 4 14:45 3 -> anon_inode:[eventpoll]
lrwx------. 1 postgres postgres 64 Apr 4 14:45 4 -> /u02/pgdata/PG962/base/13323/2601
lrwx------. 1 postgres postgres 64 Apr 4 14:45 5 -> /u02/pgdata/PG962/base/13323/16517
lr-x------. 1 postgres postgres 64 Apr 4 14:45 6 -> pipe:[58120]
lrwx------. 1 postgres postgres 64 Apr 4 14:45 7 -> /u02/pgdata/PG962/base/13323/16520
lrwx------. 1 postgres postgres 64 Apr 4 14:45 8 -> /u02/pgdata/PG962/base/13323/16513
lrwx------. 1 postgres postgres 64 Apr 4 14:45 9 -> socket:[58126]

The size of the temp file does not change over time:

postgres@pgbox:/u02/pgdata/PG962/ [PG962] ls -lha /u02/pgdata/PG962/base/pgsql_tmp/pgsql_tmp17817.2
-rw-------. 1 postgres postgres 14M Apr 4 14:48 /u02/pgdata/PG962/base/pgsql_tmp/pgsql_tmp17817.2

What am I missing here? Shouldn't this also complete with 16MB work_mem, even if slower, in less than one hour? Or is this expected?

Thanks for your help
Daniel

#2 Merlin Moncure
mmoncure@gmail.com
In reply to: Daniel Westermann (DWE) (#1)
Re: Query never completes with low work_mem (at least not within one hour)

On Tue, Apr 4, 2017 at 8:20 AM, Daniel Westermann
<daniel.westermann@dbi-services.com> wrote:

When I run the following query ( I know that "not in" is not a good choice here ):

postgres=# select count(user_id) from users where user_id not in ( select id from ids);

"NOT IN" where the predicate is a table column can lead to very poor
query plans, especially when the haystack is not provably known (at
plan time) to contain only non-null values. By reducing work_mem, the
server has decided it has to repeatedly search the table for the
presence of null values. Try converting the query to NOT EXISTS.

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Merlin Moncure (#2)
Re: Query never completes with low work_mem (at least not within one hour)

"NOT IN" where the predicate is a table column can lead to very poor
query plans, especially when the haystack is not provably known (at
plan time) to contain only non-null values. By reducing work_mem, the
server has decided it has to repeatedly search the table for the
presence of null values. Try converting the query to NOT EXISTS.

Thank you, Merlin. As said, I know that "not in" is not a good choice in this case, but I still do not get what is going on here. Why does the server repeatedly search for NULL values when I decrease work_mem, and why not when increasing work_mem?

Regards
Daniel

#4 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Daniel Westermann (DWE) (#3)
Re: Query never completes with low work_mem (at least not within one hour)

2017-04-05 8:57 GMT+02:00 Daniel Westermann <
daniel.westermann@dbi-services.com>:


Thank you, Merlin. As said I know that "not in" is not a good choice in
this case but I still do not get what is going here. Why does the server
repeatedly search for NULL values when I decrease work_mem and why not when
increasing work_mem?

what is result of EXPLAIN statement for slow and fast cases?

regards

Pavel


#5 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Pavel Stehule (#4)
Re: Query never completes with low work_mem (at least not within one hour)

what is result of EXPLAIN statement for slow and fast cases?

regards

Pavel

For work_mem=32MB

explain (analyze,verbose,buffers) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=83175.01..83175.02 rows=1 width=8) (actual time=7945.592..7945.593 rows=1 loops=1)
Output: count(users.user_id)
Buffers: shared read=29425
-> Seq Scan on public.users (cost=16925.01..79425.01 rows=1500000 width=4) (actual time=1928.665..5888.645 rows=2499999 loops=1)
Output: users.user_id, users.username
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 500001
Buffers: shared read=29425
SubPlan 1
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4) (actual time=11.111..868.382 rows=1000001 loops=1)
Output: ids.id
Buffers: shared read=4425
Planning time: 187.396 ms
Execution time: 7948.108 ms
(14 rows)

Time: 8244.493 ms

For work_mem='16MB' it does not complete with analyze in one hour. For explain only:

explain (verbose) select count(user_id) from users where user_id not in ( select id from ids);
QUERY PLAN
----------------------------------------------------------------------------------------
Aggregate (cost=38748092500.00..38748092500.01 rows=1 width=8)
Output: count(users.user_id)
-> Seq Scan on public.users (cost=0.00..38748088750.00 rows=1500000 width=4)
Output: users.user_id, users.username
Filter: (NOT (SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..23332.01 rows=1000001 width=4)
Output: ids.id
-> Seq Scan on public.ids (cost=0.00..14425.01 rows=1000001 width=4)
Output: ids.id
(10 rows)

#6 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Daniel Westermann (DWE) (#5)
Re: Query never completes with low work_mem (at least not within one hour)

2017-04-05 9:28 GMT+02:00 Daniel Westermann <
daniel.westermann@dbi-services.com>:


There is one more Materialize op.

do you have an index on ids.id?

Pavel

#7 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Pavel Stehule (#6)
Re: Query never completes with low work_mem (at least not within one hour)

do you have an index on ids.id?

Yes:

\d ids
Table "public.ids"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
Indexes:
"i_ids" UNIQUE, btree (id)

#8 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Daniel Westermann (DWE) (#7)
Re: Query never completes with low work_mem (at least not within one hour)

2017-04-05 10:13 GMT+02:00 Daniel Westermann <
daniel.westermann@dbi-services.com>:


hmm .. NOT IN is just bad :(

The second plan is slow because the ids table is stored in a temp file and is repeatedly read from that file. In the first case, the ids table is stored in memory.

SELECT count(user_id) FROM users WHERE NOT EXISTS (SELECT id FROM ids WHERE id = user_id);

Regards

Pavel
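
Besides rewriting the query, the thread's own observation (the hashed subplan comes back at 32MB) suggests a per-session workaround; a sketch:

```sql
-- Sketch: raise work_mem for this session only so the planner can fit
-- the hashed subplan, then restore the previous setting.
SET work_mem = '32MB';
SELECT count(user_id) FROM users WHERE user_id NOT IN (SELECT id FROM ids);
RESET work_mem;
```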

#9 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Pavel Stehule (#8)
Re: Query never completes with low work_mem (at least not within one hour)

hmm .. NOT IN is just bad :(

The second plan is slow because the ids table is stored in a temp file and is repeatedly read from that file. In the first case, the ids table is stored in memory.

Yes, really bad :) ... and I still do not get it. Even when reading from the tempfile all the time it should at least complete within one hour, shouldn't it? The tables are not so big:

select * from pg_size_pretty ( pg_relation_size ('ids' ));
pg_size_pretty
----------------
35 MB
(1 row)
select * from pg_size_pretty ( pg_relation_size ('users' ));
pg_size_pretty
----------------
195 MB
(1 row)

#10 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Daniel Westermann (DWE) (#9)
Re: Query never completes with low work_mem (at least not within one hour)

2017-04-05 10:33 GMT+02:00 Daniel Westermann <
daniel.westermann@dbi-services.com>:

Yes, really bad :) ... and I still do not get it. Even when reading from the tempfile all the time it should at least complete within one hour, shouldn't it? The tables are not so big:

1500000 * few ms ~ big time

#11 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Pavel Stehule (#10)
Re: Query never completes with low work_mem (at least not within one hour)

1500000 * few ms ~ big time

Ok got it
Thanks
Pavel

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Westermann (DWE) (#3)
Re: Query never completes with low work_mem (at least not within one hour)

Daniel Westermann <daniel.westermann@dbi-services.com> writes:

Thank you, Merlin. As said I know that "not in" is not a good choice in this case but I still do not get what is going here. Why does the server repeatedly search for NULL values when I decrease work_mem and why not when increasing work_mem?

The core point is that one plan is using a hashed subplan and the other is
not, because the planner estimated that the hashtable wouldn't fit into
work_mem. With a hashtable you'll have one probe into the hashtable per
outer row, and each probe is O(1) unless you are unlucky about data
distributions, so the runtime is more or less linear. Without a
hashtable, the inner table is rescanned for each outer row, so the
runtime is O(N^2) which gets pretty bad pretty fast. "Materializing"
the inner table doesn't really help: it gets rid of per-inner-row
visibility checks and some buffer locking overhead, so it cuts the
constant factor some, but the big-O situation is still disastrous.

regards, tom lane
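
The big-O point above can be made concrete with the row counts from the earlier plans (a back-of-envelope illustration, not exact planner arithmetic):

```sql
-- ~2.5M outer rows never find a match, so each must scan the entire
-- ~1M-row materialized inner table: on the order of 10^12 comparisons.
SELECT 2500000::bigint * 1000001 AS approx_comparisons;
```

Even at only a few milliseconds per rescan of the inner table, millions of rescans add up to hours, which matches the observed behavior.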


#13 Daniel Westermann (DWE)
daniel.westermann@dbi-services.com
In reply to: Tom Lane (#12)
Re: Query never completes with low work_mem (at least not within one hour)

The core point is that one plan is using a hashed subplan and the other is
not, because the planner estimated that the hashtable wouldn't fit into
work_mem.

Thanks, Tom