Unkillable processes creating millions of tiny temp files

Started by Jeremy Finzel · about 5 years ago · 5 messages · general
#1Jeremy Finzel
finzelj@gmail.com

Greetings!

We are running postgres 11.9 (were running 11.7 prior to recent restart) on
a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

Within the past few days we have started to see a few queries running for
over 8 hours which we then attempt to terminate, but will not terminate.
These queries are also generating hundreds of thousands of tiny/empty temp
files. In fact, before the restart there were over 23 million files in
pg_tmp which were removed. We also have verified no server settings have
changed for at least a couple weeks, well before this issue started
happening only in the past few days.

Looking back a while at the same query from a couple of weeks ago, we see
this (large file sizes):
2021-02-18 12:01:59.195
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,3,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.0"", size
21299836",,,,,,"
2021-02-18 12:01:59.446
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,4,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.5"", size
4138272",,,,,,"
2021-02-18 12:01:59.496
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,5,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.1"", size
10706416",,,,,,"
2021-02-18 12:01:59.747
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,6,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.3"", size
4150920",,,,,,"
2021-02-18 12:01:59.797
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,7,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.2"", size
10660908",,,,,,"
2021-02-18 12:02:00.050
GMT,"foo_user","mydb",81780,"0.0.0.0:49926",602e5716.13f74,8,"SELECT",2021-02-18
12:01:26 GMT,164/1009590,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp81780.4"", size
4132224",,,,,,"

But here is what we see presently (tiny files):
2021-03-05 21:30:52.712
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73562,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.177717"", size
24",,,,,,"
2021-03-05 21:30:52.735
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73563,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.990067"", size
92",,,,,,"
2021-03-05 21:30:52.950
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73564,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.490"", size
24",,,,,,"
2021-03-05 21:30:53.072
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73565,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.800016"", size
140",,,,,,"
2021-03-05 21:30:53.522
GMT,"foo_user","mydb",67713,"0.0.0.0:48482",60424a1d.10881,73566,"SELECT",2021-03-05
15:11:25 GMT,80/2860,0,LOG,00000,"temporary file: path
""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.500266"", size
48",,,,,,"
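As a quick way to put numbers on this pattern, the "temporary file" messages can be tallied straight from the log. A sketch (the sample file below uses abbreviated stand-ins for real csvlog entries, which are each a single line rather than wrapped as shown above; the log path is hypothetical):

```shell
# Tally temp-file counts and total bytes from "temporary file" log messages.
# The two sample lines are abbreviated stand-ins for real csvlog entries.
cat > /tmp/pg_sample.log <<'EOF'
LOG,00000,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.177717"", size 24",,,,,,"
LOG,00000,"temporary file: path ""pg_tblspc/16428/PG_11_201809051/pgsql_tmp/pgsql_tmp67713.990067"", size 92",,,,,,"
EOF
grep -o 'temporary file: path .*, size [0-9]*' /tmp/pg_sample.log \
  | awk '{n++; total+=$NF} END {printf "%d temp files, %d bytes total\n", n, total}'
# -> 2 temp files, 116 bytes total
```

Run against the real server log, a sudden jump in file count with a collapse in average size would show exactly when the behavior changed.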

After the restart of the db and upgrade to 11.9, we see the same problem.

There are 2 separate queries that have had this behavior with seemingly
unrelated objects. Here is the plan for one of them:

                                 QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=2654520.75..2654520.76 rows=1 width=336)
   ->  Hash Left Join  (cost=1087907.49..2019675.12 rows=14107680 width=69)
         Hash Cond: (t1.team_stadium_id = ra.team_stadium_id)
         ->  Hash Left Join  (cost=1079740.68..1955870.37 rows=14107680 width=69)
               Hash Cond: (t1.team_stadium_id = frg.team_stadium_id)
               ->  Hash Right Join  (cost=1073614.70..1894916.60 rows=14107680 width=69)
                     Hash Cond: (sm_1.id = t1.id)
                     ->  Hash Join  (cost=47373.90..695007.85 rows=170011 width=4)
                           Hash Cond: (sm_1.id = l.id)
                           ->  Index Scan using index_foo_fact_on_id_first_win on foo_fact sm_1  (cost=0.42..647177.59 rows=174010 width=12)
                                 Filter: (CASE WHEN (basketball_renewal_date <= now()) THEN true ELSE false END AND (NOT CASE WHEN (basketball_renewal_date <= (now() - '5 days'::interval)) THEN false ELSE true END))
                           ->  Hash  (cost=34408.14..34408.14 rows=1037227 width=4)
                                 ->  Index Only Scan using index_wins_on_team_covering on wins l  (cost=0.43..34408.14 rows=1037227 width=4)
                                       Filter: (team_status_id <> ALL ('{1,2}'::integer[]))
                     ->  Hash  (cost=684569.80..684569.80 rows=14107680 width=69)
                           ->  Seq Scan on bar_team_stadiums_info t1  (cost=0.00..684569.80 rows=14107680 width=69)
               ->  Hash  (cost=3720.99..3720.99 rows=192399 width=4)
                     ->  Seq Scan on bar_team_stadiums_frg frg  (cost=0.00..3720.99 rows=192399 width=4)
         ->  Hash  (cost=4749.14..4749.14 rows=273414 width=8)
               ->  Seq Scan on sports_team_stadiums ra  (cost=0.00..4749.14 rows=273414 width=8)
(20 rows)

Any ideas here would be much appreciated!!!

Thanks,
Jeremy

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeremy Finzel (#1)
Re: Unkillable processes creating millions of tiny temp files

Jeremy Finzel <finzelj@gmail.com> writes:

> We are running postgres 11.9 (were running 11.7 prior to recent restart) on
> a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.
>
> Within the past few days we have started to see a few queries running for
> over 8 hours which we then attempt to terminate, but will not terminate.
> These queries are also generating hundreds of thousands of tiny/empty temp
> files. In fact, before the restart there were over 23 million files in
> pg_tmp which were removed. We also have verified no server settings have
> changed for at least a couple weeks, well before this issue started
> happening only in the past few days.

Hm. For the query plan you show, I think the only plausible explanation
for using temp files is that a hash join's hash table is exceeding
work_mem so it's spilling batches of tuples to disk. With some bad
luck those could be small not large. But I don't see anything in
our commit logs between 11.7 and 11.9 that looks like it would have
affected any of that behavior. (There were some changes to code
governing temp-file tablespace selection, but that could only affect
where the files get put not how big they are.) So I doubt that this
can be blamed on the update, especially since if I read you correctly
it didn't start happening immediately after the update.
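The spill mechanics described above can be sketched numerically (an illustration with made-up sizes, not PostgreSQL's exact cost logic): each time the in-memory hash table would exceed work_mem, the executor doubles the batch count, and every batch spills to its own temp file, so a badly misestimated or skewed input can produce very many very small files.

```shell
# Hypothetical sizes: a 40GB inner relation against the 1GB work_mem
# reported later in this thread. nbatch doubles until each batch's
# share of the input is expected to fit in work_mem.
work_mem_kb=$((1024 * 1024))    # 1GB
inner_kb=$((40 * 1024 * 1024))  # 40GB inner side (made up)
nbatch=1
while [ $((inner_kb / nbatch)) -gt "$work_mem_kb" ]; do
  nbatch=$((nbatch * 2))        # each doubling splits every batch in two
done
echo "$nbatch batches -> at least $nbatch temp files for this hash join"
# -> 64 batches -> at least 64 temp files for this hash join
```

If the planner's input-size estimate is badly off, this doubling can run away at execution time, which is one way to arrive at millions of tiny spill files.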

I'm wondering about changes in table statistics possibly causing a
poorly-chosen change in the hashing parameters. What have you got
work_mem set to? Can you comment on whether the estimated rowcounts
shown in the query plan are accurate? Does manually ANALYZE'ing the
tables used in the query change anything?

The "unkillable" aspect is odd, but I wonder if that's just a red
herring. A query that's generated lots of temp files will try to
clean them up at termination, so maybe the backend is just sitting
there removing temp files before it'll give control back.

regards, tom lane

#3Jerry Sievers
gsievers19@comcast.net
In reply to: Tom Lane (#2)
Re: Unkillable processes creating millions of tiny temp files

Hi Tom, thx for the quick response and a few remarks below...

I work at the same site that Jeremy does and we're both looking at this
today.

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Jeremy Finzel <finzelj@gmail.com> writes:
>
>> We are running postgres 11.9 (were running 11.7 prior to recent restart) on
>> a large db (10s of TB) with 5 or 6 tablespaces and 1000s of tables/indexes.

Actually it's v11.11 now, but in any case, the issue was evident before
and after the minor version update.

>> Within the past few days we have started to see a few queries running for
>> over 8 hours which we then attempt to terminate, but will not terminate.
>> These queries are also generating hundreds of thousands of tiny/empty temp
>> files. In fact, before the restart there were over 23 million files in
>> pg_tmp which were removed. We also have verified no server settings have
>> changed for at least a couple weeks, well before this issue started
>> happening only in the past few days.

> Hm. For the query plan you show, I think the only plausible explanation
> for using temp files is that a hash join's hash table is exceeding
> work_mem so it's spilling batches of tuples to disk. With some bad
> luck those could be small not large. But I don't see anything in
> our commit logs between 11.7 and 11.9 that looks like it would have
> affected any of that behavior. (There were some changes to code
> governing temp-file tablespace selection, but that could only affect
> where the files get put not how big they are.) So I doubt that this
> can be blamed on the update, especially since if I read you correctly
> it didn't start happening immediately after the update.

It started happening a couple of days ago for no obvious reason, so bad
stats are one of my theories, and we're running a defensive ANALYZE
through the entire system to rule it out.

> I'm wondering about changes in table statistics possibly causing a
> poorly-chosen change in the hashing parameters. What have you got
> work_mem set to? Can you comment on whether the estimated rowcounts

work_mem 1G. We are not aware of the client overriding this in their
code, but it's possible.

> shown in the query plan are accurate? Does manually ANALYZE'ing the
> tables used in the query change anything?

These other points are still under investigation.

> The "unkillable" aspect is odd, but I wonder if that's just a red
> herring. A query that's generated lots of temp files will try to
> clean them up at termination, so maybe the backend is just sitting
> there removing temp files before it'll give control back.

I believe this is confirmed. I see that the backends, after being
sig-term'd, are now cycling through unlinks, as seen by strace -p $pid.

None too quickly I might add and as mentioned earlier, the number of
these files is in the millions so it's hard to predict when cleanup will
finish.

As we did one night prior, a hard shutdown got the rogue jobs closed a
lot more quickly and we noticed that on the recovery restart, Pg took
about 15 minutes to clear >23M files from the temp area. I assume we'd
experience the same again if we need to do it.

I'm going to leave it alone a while longer before taking action.

Thanks again

> regards, tom lane

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jerry Sievers (#3)
Re: Unkillable processes creating millions of tiny temp files

Jerry Sievers <gsievers19@comcast.net> writes:

> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> The "unkillable" aspect is odd, but I wonder if that's just a red
>> herring. A query that's generated lots of temp files will try to
>> clean them up at termination, so maybe the backend is just sitting
>> there removing temp files before it'll give control back.

> I believe this is confirmed. I see that the backends, after being
> sig-term'd, are now cycling through unlinks, as seen by strace -p $pid.
>
> None too quickly I might add and as mentioned earlier, the number of
> these files is in the millions so it's hard to predict when cleanup will
> finish.

Hm. I don't recall exactly what sort of data structure we use to track
open temp files, but it's certainly plausible that it's not too efficient
with millions of temp files :-(

> As we did one night prior, a hard shutdown got the rogue jobs closed a
> lot more quickly and we noticed that on the recovery restart, Pg took
> about 15 minutes to clear >23M files from the temp area. I assume we'd
> experience the same again if we need to do it.

Not sure how fast that is either. If you need to do it again, you could
try manually rm'ing everything under the pgsql_tmp directory before
letting the postmaster start.
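That cleanup might look like the following sketch (the data directory and the leftover file are stand-ins created here so the snippet is self-contained; on a real server the postmaster must be stopped first, and each tablespace's pgsql_tmp under pg_tblspc/ needs the same treatment):

```shell
# Stand-in data directory; on a real system this would be $PGDATA and
# the postmaster must already be stopped before removing anything.
PGDATA=/tmp/demo_pgdata
mkdir -p "$PGDATA/base/pgsql_tmp"
touch "$PGDATA/base/pgsql_tmp/pgsql_tmp1234.0"   # fake leftover temp file

rm -rf "$PGDATA"/base/pgsql_tmp/*                # the actual cleanup step

ls -A "$PGDATA/base/pgsql_tmp" | wc -l           # directory is now empty
```

Doing this with the server down avoids making startup wait on the removal of millions of files.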

regards, tom lane

#5Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#4)
Re: Unkillable processes creating millions of tiny temp files

On 3/5/21 6:57 PM, Tom Lane wrote:

> Not sure how fast that is either. If you need to do it again, you could
> try manually rm'ing everything under the pgsql_tmp directory before
> letting the postmaster start.

That is actually a strategy that works rather well. mv(1) the tmp
directory to something date(1)-based, then kick off a recursive rm(1)
-rf on everything named pgsql_tmp_*. That way nothing is missed in
case the whole server is restarted while the procedure is under way. It
can cause multiple rm(1) processes trampling over each other, but that
has no real ill side effects; they are just trying to unlink files
another one has already removed.

Under normal circumstances the rm(1) will still be cleaning up while the
postmaster is already back up and has possibly created a new pgsql_tmp.
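As a sketch, that mv-then-rm procedure could look like this (the tablespace directory is a stand-in created here so the snippet runs end to end):

```shell
# Stand-in for a tablespace directory such as pg_tblspc/16428/PG_11_201809051.
demo=/tmp/demo_tblspc
mkdir -p "$demo/pgsql_tmp"
touch "$demo/pgsql_tmp/pgsql_tmp1234.0"          # fake leftover temp file
cd "$demo"

# 1. Move pgsql_tmp aside so the postmaster can recreate a fresh one.
mv pgsql_tmp "pgsql_tmp_$(date +%Y%m%d_%H%M%S)"

# 2. Reclaim space in the background; overlapping rm's on the same
#    pattern are harmless, as noted above.
rm -rf pgsql_tmp_* &
wait
ls -d pgsql_tmp_* 2>/dev/null | wc -l            # no renamed dirs remain
```

The rename is near-instant, so the server can come back up while the bulk deletion grinds on in the background.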

Regards, Jan

--
Jan Wieck
Principal Database Engineer
Amazon Web Services