Death postgres

Started by Marc Millas · almost 3 years ago · 26 messages · general
#1 Marc Millas
marc.millas@mokadb.com

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

What are the cases where postgres may grow without respecting
temp_file_limit?

thanks,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com

#2 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marc Millas (#1)
Re: Death postgres

On 5/5/23 18:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

The result of EXPLAIN <the query> would be helpful.

What are the cases where postgres may grow without respecting
temp_file_limit?

thanks,

Marc MILLAS
Senior Architect
+33607850334
www.mokadb.com <http://www.mokadb.com>

--
Adrian Klaver
adrian.klaver@aklaver.com

#3 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Marc Millas (#1)
Re: Death postgres

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.
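
Both knobs are visible from any session; a minimal sketch (nothing here is
specific to this thread) showing the disk-side cap next to the RAM-side
settings that the OOM killer actually cares about:

-- temp_file_limit caps per-process temp file space on *disk*.
-- work_mem (per sort/hash node) and hash_mem_multiplier (v13+) bound
-- what each node tries to keep in *RAM* before spilling.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('temp_file_limit', 'work_mem', 'hash_mem_multiplier');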

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

#4 Marc Millas
marc.millas@mokadb.com
In reply to: Peter J. Holzer (#3)
Re: Death postgres

On Sat, May 6, 2023 at 09:46, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.

hp

It's clear that the OOM killer is triggered by RAM and temp_file_limit is a
disk thing...
But the sudden growth of disk space usage and RAM happened at exactly the
same time, with only one user connected and only one query running...


#5 Marc Millas
marc.millas@mokadb.com
In reply to: Adrian Klaver (#2)
Re: Death postgres

On Sat, May 6, 2023 at 06:18, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/5/23 18:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

The result of EXPLAIN <the query> would be helpful.

Sure!

But one of the tables looks "inaccessible" since then.
I.e. even explain select * from the_table didn't answer and had to be killed
by Ctrl-C.


#6 Ron
ronljohnsonjr@gmail.com
In reply to: Marc Millas (#4)
Re: Death postgres

On 5/6/23 07:19, Marc Millas wrote:

On Sat, May 6, 2023 at 09:46, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

"15TB" and "16TB" are pretty low-resolution.  For example, 15.4TB rounds
*down* to 15TB, while 15.6TB rounds *up* to 16TB, yet they are in fact
only 200GB apart.

Heck, even 15.4TB and 15.6TB are low-resolution.  temp_file_limit may
actually be working.

temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.

        hp

It's clear that the OOM killer is triggered by RAM and temp_file_limit is a
disk thing...
But the sudden growth of disk space usage and RAM happened at exactly the
same time, with only one user connected and only one query running...

If your question is about temp_file_limit, don't distract us with OOM issues.

--
Born in Arizona, moved to Babylonia.

#7 Marc Millas
marc.millas@mokadb.com
In reply to: Ron (#6)
Re: Death postgres

On Sat, May 6, 2023 at 15:15, Ron <ronljohnsonjr@gmail.com> wrote:

On 5/6/23 07:19, Marc Millas wrote:

On Sat, May 6, 2023 at 09:46, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

"15TB" and "16TB" are pretty low-resolution. For example, 15.4TB rounds
*down* to 15TB, while 15.6TB rounds *up* to 16TB, yet they are in fact
only 200GB apart.

Heck, even 15.4TB and 15.6TB are low-resolution. temp_file_limit may
actually be working.

It was... 15.2 and became 16.3...


#8 Marc Millas
marc.millas@mokadb.com
In reply to: Ron (#6)
Re: Death postgres

On Sat, May 6, 2023 at 15:15, Ron <ronljohnsonjr@gmail.com> wrote:

On 5/6/23 07:19, Marc Millas wrote:

On Sat, May 6, 2023 at 09:46, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2023-05-06 03:14:20 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

"15TB" and "16TB" are pretty low-resolution. For example, 15.4TB rounds
*down* to 15TB, while 15.6TB rounds *up* to 16TB, yet they are in fact
only 200GB apart.

Heck, even 15.4TB and 15.6TB are low-resolution. temp_file_limit may
actually be working.

temp_file_limit limits the space a process may use on disk while the OOM
killer gets activated when the system runs out of RAM. So these seem to
be unrelated.

hp

It's clear that the OOM killer is triggered by RAM and temp_file_limit is a
disk thing...
But the sudden growth of disk space usage and RAM happened at exactly the
same time, with only one user connected and only one query running...

If your question is about temp_file_limit, don't distract us with OOM
issues.

My question is how postgres can use space without respecting
temp_file_limit. The OOM info is a kind of hint about the context since, as
said, one select generated both things.


#9 Ron
ronljohnsonjr@gmail.com
In reply to: Marc Millas (#8)
Re: Death postgres

On 5/6/23 08:52, Marc Millas wrote:

On Sat, May 6, 2023 at 15:15, Ron <ronljohnsonjr@gmail.com> wrote:

[snip]

If your question is about temp_file_limit, don't distract us with OOM
issues.

My question is how postgres can use space without respecting
temp_file_limit. The OOM info is a kind of hint about the context since, as
said, one select generated both things.

It's a distraction to lead with "OOM killed my process". Evidence of this
is that all respondents have talked about memory, not disk space.

--
Born in Arizona, moved to Babylonia.

#10 Thomas Guyot
tguyot@gmail.com
In reply to: Marc Millas (#1)
Re: Death postgres

On 2023-05-05 21:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

What are the cases where postgres may grow without respecting
temp_file_limit?

thanks,

Some OSes, like IIRC RHEL9, now default to tmpfs for /tmp - if your temp
files are written to a tmpfs then they may very well trigger the OOM
killer, because the temp files use up all RAM.

Check the filesystem type of your temp file's location.
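
Concretely, something along these lines from psql (the path below is an
assumption; substitute the data_directory the server reports):

-- With temp_tablespaces empty, temp files land under
-- <data_directory>/base/pgsql_tmp.
SHOW temp_tablespaces;
SHOW data_directory;
-- psql shell escape: a tmpfs type here means "temp" files are really RAM
\! df -T /var/lib/pgsql/data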

--
Thomas

#11 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marc Millas (#5)
Re: Death postgres

On 5/6/23 05:25, Marc Millas wrote:

On Sat, May 6, 2023 at 06:18, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/5/23 18:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

The result of EXPLAIN <the query> would be helpful.

Sure!

But one of the tables looks "inaccessible" since then.
I.e. even explain select * from the_table didn't answer and had to be killed
by Ctrl-C.

When you restarted the server were there any warnings shown?

Using psql can you \d <the_table>?

Can you select from any other table in the database?

--
Adrian Klaver
adrian.klaver@aklaver.com

#12 Marc Millas
marc.millas@mokadb.com
In reply to: Adrian Klaver (#11)
Re: Death postgres

On Sat, May 6, 2023 at 18:11, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/6/23 05:25, Marc Millas wrote:

On Sat, May 6, 2023 at 06:18, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/5/23 18:14, Marc Millas wrote:

Hi,

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

The result of EXPLAIN <the query> would be helpful.

Sure!

But one of the tables looks "inaccessible" since then.
I.e. even explain select * from the_table didn't answer and had to be killed
by Ctrl-C.

When you restarted the server were there any warnings shown?

Sadly, I cannot. Will be done next Tuesday.

Using psql can you \d <the_table>?

Yes, and no problem to check pg_statistic, ...

Can you select from any other table in the database?

Yes


#13 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marc Millas (#12)
Re: Death postgres

On 5/6/23 10:13, Marc Millas wrote:

When you restarted the server were there any warnings shown?

Sadly, I cannot. Will be done next Tuesday.

Cannot do what:

1) Get to the log to see if there are warnings?

2) Restart the server?

Your original post said the server crashed.

If that was the case, how can you do any of the below without restarting it?

Using psql can you \d <the_table>?

Yes, and no problem to check pg_statistic, ...

Can you select from any other table in the database?

Yes

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#14 Marc Millas
marc.millas@mokadb.com
In reply to: Adrian Klaver (#13)
Re: Death postgres

On Sat, May 6, 2023 at 21:46, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/6/23 10:13, Marc Millas wrote:

When you restarted the server were there any warnings shown?

Sadly, I cannot. Will be done next Tuesday.

Cannot do what:

1) Get to the log to see if there are warnings?

2) Restart the server?

Your original post said the server crashed.

If that was the case, how can you do any of the below without restarting it?

Infrastructure is managed by another company. The db is within a container.

So... if I run a pg_ctl restart, the container... vanishes.
So for each main thing, I must ask... by mail...


#15 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Marc Millas (#14)
Re: Death postgres

On 5/7/23 04:46, Marc Millas wrote:

On Sat, May 6, 2023 at 21:46, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 5/6/23 10:13, Marc Millas wrote:

When you restarted the server were there any warnings shown?

Sadly, I cannot. Will be done next Tuesday.

Cannot do what:

1) Get to the log to see if there are warnings?

2) Restart the server?

Your original post said the server crashed.

If that was the case, how can you do any of the below without restarting it?

Infrastructure is managed by another company. The db is within a container.

So... if I run a pg_ctl restart, the container... vanishes.
So for each main thing, I must ask... by mail...

1) The question remains: if the container crashes when you run pg_ctl,
how can you do the following?:

Using psql can you \d <the_table>?

Yes, and no problem to check pg_statistic, ...

Can you select from any other table in the database?

Yes

2) You need to provide a more detailed explanation of the setup that the
Postgres server is running in.

a) The host OS and version.

b) The container specifications.

1) What container software and version?

2) 'Hardware' specifications.

c) Some indication of the size of the data sets you are querying. Also
the query you used.


--
Adrian Klaver
adrian.klaver@aklaver.com

#16 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Marc Millas (#1)
Re: Death postgres

On Sat, 2023-05-06 at 03:14 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

What are the cases where postgres may grow without respecting temp_file_limit?

That's too little information for a decent answer.
One obvious answer is: if it is not writing temporary files.

Yours,
Laurenz Albe

#17 Marc Millas
marc.millas@mokadb.com
In reply to: Laurenz Albe (#16)
Re: Death postgres

On Sun, May 7, 2023 at 8:42 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

On Sat, 2023-05-06 at 03:14 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer)
after the postgres disk usage grew from 15 TB to 16 TB.

What are the cases where postgres may grow without respecting
temp_file_limit?

That's too little information for a decent answer.
One obvious answer is: if it is not writing temporary files.

Yours,
Laurenz Albe

Logical ...

so here is a little more info:

db=# analyze myschema.table_a;
ANALYZE

db=# with ta as(select 'myschema' as s, 'table_a' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
staattnum | column_name | stanullfrac | stadistinct
-----------+--------------+-------------+-------------
1 | col_ne | 0 | -0.6100224
2 | col_brg | 0.0208 | 6
3 | col_ano | 0 | 447302
4 | col_ine | 0 | -0.5341927
5 | col_cha | 0 | 11
(5 rows)

db=# select count(*) from myschema.table_a;
count
----------
13080776
(1 row)

db=# select count(distinct col_ano) from myschema.table_a;
count
----------
10149937
(1 row)

// stats look somewhat absurd, as analyze guesses 20 times fewer distinct
values than a select distinct finds on column col_ano...
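
-- a possible counter-measure (a sketch, not tried in this thread): raise the
-- per-column statistics target so ANALYZE samples more rows before estimating
-- n_distinct (default is 100, max 10000), instead of pinning n_distinct by
-- hand as done further below:
ALTER TABLE myschema.table_a ALTER COLUMN col_ano SET STATISTICS 5000;
ANALYZE myschema.table_a;
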
db=# analyze myschema.table_b;
ANALYZE
db=# with ta as(select 'myschema' as s, 'table_b' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
staattnum | column_name | stanullfrac | stadistinct
-----------+-----------------+-------------+-------------
1 | col_nerg | 0 | 161828
2 | col_ibi | 0 | 362161
3 | col_imi | 0 | 381023
4 | col_ipi | 0 | 391915
5 | col_ne | 0 | -0.53864235
6 | col_ano | 0 | 482225
(6 rows)

db=# select count(*) from myschema.table_b;
count
----------
14811485
(1 row)

db=# select count(distinct col_ano) from myschema.table_b;
count
----------
10149937
(1 row)

//same remark

db=# explain select distinct t1.col_ine, case when t2.col_ibi is null
then t3.col_ibi else t2.col_ibi end from myschema.table_a t1 left join
myschema.table_b t2 on t1.col_ano=t2.col_ano Left join myschema.table_b t3
on t1.NUM_ENQ=t3.NUM_ENQ;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
   ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97)
         Workers Planned: 5
         ->  Sort  (cost=72377462162.94..72888223577.87 rows=204304565973 width=97)
               Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN t3.col_ibi ELSE t2.col_ibi END)
               ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
                     Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
                     ->  Parallel Hash Left Join  (cost=300803.38..582295.38 rows=4857277 width=52)
                           Hash Cond: ((t1.col_ne)::text = (t3.col_ne)::text)
                           ->  Parallel Seq Scan on table_a t1  (cost=0.00..184052.35 rows=2616335 width=53)
                           ->  Parallel Hash  (cost=243466.06..243466.06 rows=2965306 width=31)
                                 ->  Parallel Seq Scan on table_b t3  (cost=0.00..243466.06 rows=2965306 width=31)
                     ->  Parallel Hash  (cost=243466.06..243466.06 rows=2965306 width=34)
                           ->  Parallel Seq Scan on table_b t2  (cost=0.00..243466.06 rows=2965306 width=34)
 JIT:
   Functions: 19
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(17 rows)

//so.. the planner guesses that those 2 joins will generate 1000 billion
rows...

//so, I try to change the stats

db=# alter table myschema.table_a alter column col_ano set
(n_distinct=-0.8);
ALTER TABLE
db=# analyze myschema.table_a;
ANALYZE
db=# with ta as(select 'myschema' as s, 'table_a' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
staattnum | column_name | stanullfrac | stadistinct
-----------+--------------+-------------+-------------
1 | col_ne | 0 | -0.6694066
2 | col_brg | 0.0224 | 6
3 | col_ano | 0 | -0.8
4 | col_ine | 0 | -0.542651
5 | col_cha | 0 | 12
(5 rows)

db=# alter table myschema.table_b alter column col_ano set
(n_distinct=-0.8);
ALTER TABLE
db=# analyze myschema.table_b;
ANALYZE
db=# with ta as(select 'myschema' as s, 'table_b' as t), p as (select *
from information_schema.columns, ta where table_schema=ta.s and
table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta
where schemaname=ta.s and relname=ta.t) select staattnum, column_name,
stanullfrac, stadistinct from tableid, pg_statistic join p on
p.ordinal_position=staattnum where starelid=tableid.relid order by
staattnum;
staattnum | column_name | stanullfrac | stadistinct
-----------+-----------------+-------------+-------------
1 | col_nerg | 0 | 166151
2 | col_ibi | 0 | 374699
3 | col_imi | 0 | 394296
4 | col_ipi | 0 | 404515
5 | col_ne | 0 | -0.47193906
6 | col_ano | 0 | -0.8
(6 rows)

//doing the explain of the joins again

db=# explain select distinct t1.col_ine, case when t2.col_ibi is null
then t3.col_ibi else t2.col_ibi end from myschema.table_a t1 left join
myschema.table_b t2 on t1.col_ano=t2.col_ano Left join myschema.table_b t3
on t1.NUM_ENQ=t3.NUM_ENQ;
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=62824833042.98..174732492390.53 rows=888686328950 width=97)
   ->  Gather Merge  (cost=62824833042.98..170289060745.78 rows=888686328950 width=97)
         Workers Planned: 5
         ->  Sort  (cost=62824832042.91..63269175207.38 rows=177737265790 width=97)
               Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN t3.col_ibi ELSE t2.col_ibi END)
               ->  Parallel Hash Left Join  (cost=604274.78..1148041043.61 rows=177737265790 width=97)
                     Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
                     ->  Parallel Hash Left Join  (cost=300691.39..581280.67 rows=4424044 width=53)
                           Hash Cond: ((t1.col_ne)::text = (t3.col_ne)::text)
                           ->  Parallel Seq Scan on table_a t1  (cost=0.00..184049.73 rows=2616073 width=54)
                           ->  Parallel Hash  (cost=243427.84..243427.84 rows=2961484 width=31)
                                 ->  Parallel Seq Scan on table_b t3  (cost=0.00..243427.84 rows=2961484 width=31)
                     ->  Parallel Hash  (cost=243427.84..243427.84 rows=2961484 width=34)
                           ->  Parallel Seq Scan on table_b t2  (cost=0.00..243427.84 rows=2961484 width=34)
 JIT:
   Functions: 19
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(17 rows)

//almost equivalent.
//temp_file_limit is set to 210 GB.
//if I run the request, or explain analyze it (same thing), the disk
occupation grows by 1.1 TB, and the RAM occupation grows until the OOM
killer fires.
//BTW, if I simplify the select to keep only the first join, the number of
estimated rows is still "huge" and I didn't try to execute it.

thanks

#18 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Marc Millas (#17)
Re: Death postgres

On Wed, 2023-05-10 at 16:35 +0200, Marc Millas wrote:

postgres 14.2 on Linux redhat

temp_file_limit set around 210 GB.

A select request with 2 left joins crashed the server (OOM killer) after
the postgres disk usage grew from 15 TB to 16 TB.

What are the cases where postgres may grow without respecting temp_file_limit?

That's too little information for a decent answer.
One obvious answer is: if it is not writing temporary files.

so here is a little more info:

                                                        QUERY PLAN                                                        
---------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
   ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97)
         Workers Planned: 5
         ->  Sort  (cost=72377462162.94..72888223577.87 rows=204304565973 width=97)
               Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN t3.col_ibi ELSE t2.col_ibi END)
               ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
                     Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)
                     ->  Parallel Hash Left Join  (cost=300803.38..582295.38 rows=4857277 width=52)
                           Hash Cond: ((t1.col_ne)::text = (t3.col_ne)::text)
                           ->  Parallel Seq Scan on table_a t1  (cost=0.00..184052.35 rows=2616335 width=53)
                           ->  Parallel Hash  (cost=243466.06..243466.06 rows=2965306 width=31)
                                 ->  Parallel Seq Scan on table_b t3  (cost=0.00..243466.06 rows=2965306 width=31)
                     ->  Parallel Hash  (cost=243466.06..243466.06 rows=2965306 width=34)
                           ->  Parallel Seq Scan on table_b t2  (cost=0.00..243466.06 rows=2965306 width=34)
 JIT:
   Functions: 19
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(17 rows)

Perhaps parallel query drives you to OOM. Does the problem also happen if
"max_parallel_workers_per_gather" is set to 0?

Yours,
Laurenz Albe

#19 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Marc Millas (#17)
Re: Death postgres

On 2023-05-10 16:35:04 +0200, Marc Millas wrote:

 Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
   ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97)

...

               ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
                     Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)

...

//so.. the planner guesses that those 2 joins will generate 1000 billion rows...

Are some of the col_ano values very frequent? If say the value 42 occurs
1 million times in both table_a and table_b, the join will create 1
trillion rows for that value alone. That doesn't explain the crash or the
disk usage, but it would explain the crazy cost (and would probably be a
hint that this query is unlikely to finish in any reasonable time).
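
That skew is cheap to check; a sketch against the thread's table names:

-- Most frequent join keys; a non-null key occurring n times on both
-- sides contributes n * n rows to the join output. (GROUP BY collects
-- NULLs into one group here, but NULL = NULL never matches in the join
-- itself.)
SELECT col_ano, count(*) AS n
FROM myschema.table_a
GROUP BY col_ano
ORDER BY n DESC
LIMIT 5;
-- and the same against myschema.table_b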

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

#20 Marc Millas
marc.millas@mokadb.com
In reply to: Peter J. Holzer (#19)
Re: Death postgres

On Wed, May 10, 2023 at 7:24 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

On 2023-05-10 16:35:04 +0200, Marc Millas wrote:

 Unique  (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97)
   ->  Gather Merge  (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97)

...

               ->  Parallel Hash Left Join  (cost=604502.76..1276224253.51 rows=204304565973 width=97)
                     Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text)

...

//so.. the planner guesses that those 2 joins will generate 1000 billion rows...

Are some of the col_ano values very frequent? If say the value 42 occurs
1 million times in both table_a and table_b, the join will create 1
trillion rows for that value alone. That doesn't explain the crash or the
disk usage, but it would explain the crazy cost (and would probably be a
hint that this query is unlikely to finish in any reasonable time).

hp

Good guess, even if a bit surprising: there is one (and only one) "value"
which fits your supposition: NULL.
750000 in each table, which perfectly fits the planner's row estimate.
One question: what is postgres doing when it plans to hash 1000 billion rows?
Did postgres create an appropriate "space" to handle those 1000 billion
hash values?
thanks,
MM


#21 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Marc Millas (#20)

#22 Marc Millas
marc.millas@mokadb.com
In reply to: Peter J. Holzer (#21)

#23 Marc Millas
marc.millas@mokadb.com
In reply to: Peter J. Holzer (#21)

#24 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Marc Millas (#23)

#25 Marc Millas
marc.millas@mokadb.com
In reply to: Peter J. Holzer (#24)

#26 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Marc Millas (#25)