How to limit statement memory allocation
Hi all,
Occasionally we get bad queries on our db that consume a lot of memory.
These are typically accidental full joins or just overly large result sets.
My understanding is that these should spill to a temp file, but apparently
memory allocation is preferred.
Last time a statement hit 150GB of RAM and did not respond to a cancel or
terminate signal, so after a few hours, when another one popped up, we had
to restart the db.
The only relevant settings on the server, I guess, are:
huge_pages = 'on'
shared_buffers = 196GB
work_mem = 1GB
maintenance_work_mem = 4GB
effective_cache_size = 720GB
Thank you in advance,
Rado
Radoslav Nedyalkov <rnedyalkov@gmail.com> writes:
Occasionally we get bad queries on our db that consume a lot of memory.
These typically are full joins by mistake or just too large result sets.
My understanding is these should go to a temp file but apparently memory
allocation is preferred.
Perhaps the accumulation is happening on the client side? libpq doesn't
have any provision for spilling a result set to disk.
If that's it, you could consider revising your application to read results
row-at-a-time, although that might require a good deal of effort.
https://www.postgresql.org/docs/current/libpq-single-row-mode.html
regards, tom lane
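The single-row mode Tom links to can be sketched roughly as below. This is an illustrative libpq fragment, not code from the thread: the connection string and the table name are made up, and it needs a reachable server plus -lpq to actually run. The key point is that PQsetSingleRowMode() is called after PQsendQuery() but before the first PQgetResult(), so libpq hands back one row per PGresult instead of buffering the whole result set in client memory.

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=mydb");   /* hypothetical conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Send the query without waiting for the whole result... */
    if (!PQsendQuery(conn, "SELECT id, payload FROM big_table")) {
        fprintf(stderr, "send failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }
    /* ...and switch to single-row mode before the first PQgetResult. */
    if (!PQsetSingleRowMode(conn))
        fprintf(stderr, "could not enter single-row mode\n");

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL) {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
            printf("id=%s\n", PQgetvalue(res, 0, 0));  /* one row at a time */
        else if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "query failed: %s", PQresultErrorMessage(res));
        PQclear(res);   /* clear every result, including the final one */
    }

    PQfinish(conn);
    return 0;
}
```

Note this only helps when the memory growth really is client-side buffering, which is what Tom is probing for here.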
On Tue, Mar 9, 2021 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Perhaps the accumulation is happening on the client side? libpq doesn't
have any provision for spilling a result set to disk. If that's it, you
could consider revising your application to read results row-at-a-time.
regards, tom lane
Ah, perhaps I used the term "result set" wrongly.
These queries are part of a larger ETL function or statement which at the
end just writes to a table.
The huge join is an intermediate step.
So I'm still wondering.
Thanks and regards,
Rado
Radoslav Nedyalkov <rnedyalkov@gmail.com> writes:
On Tue, Mar 9, 2021 at 6:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Perhaps the accumulation is happening on the client side? libpq doesn't
have any provision for spilling a result set to disk.
Ah, I named it result set wrongly perhaps.
These are queries, part of a larger ETL function or statement which at the
end just write to a table.
The huge join is an intermediate step.
Hm. What's the query plan look like?
The only thing I'm aware of that could consume unbounded memory
on the server side is hash aggregation. (v13 has improved that
situation, but I'm guessing you are running some older version.)
The planner won't choose hash aggregation if it estimates that
the hash table would need to exceed work_mem, but its estimate
could be wrong. If that's the scenario, you could back off
work_mem until hash aggregation isn't used, or you could try to
improve the planner's estimates. If your ETL process involves
intermediate tables, you might need to explicitly ANALYZE those
after you fill them so that the planner has a better idea how
to plan the later steps.
regards, tom lane
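Tom's advice about intermediate tables can be sketched in SQL. The table and column names below are invented for illustration; the point is simply to run ANALYZE between filling a staging table and using it, so the planner sees real statistics rather than defaults:

```sql
-- Hypothetical ETL step: materialize an intermediate result...
CREATE TEMP TABLE stage_merchants AS
SELECT merchant_id, date, amount
FROM   raw_events
JOIN   merchants USING (merchant_id);

-- ...then ANALYZE it so the planner has real statistics (row count,
-- n_distinct, etc.) when planning the later steps.
ANALYZE stage_merchants;

-- Check whether the estimated row counts now match reality.
EXPLAIN ANALYZE
SELECT merchant_id, date, sum(amount)
FROM   stage_merchants
GROUP  BY merchant_id, date;
```

If the estimated and actual row counts in the EXPLAIN ANALYZE output still diverge badly, the planner may still pick a hash aggregate that outgrows work_mem.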
On Tue, Mar 9, 2021 at 6:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hm. What's the query plan look like?
The only thing I'm aware of that could consume unbounded memory
on the server side is hash aggregation. (v13 has improved that
situation, but I'm guessing you are running some older version.)
regards, tom lane
Thanks Tom,
Query and plan attached.
Rado
Attachments:
q_and_plan.txt (text/plain, US-ASCII)
Radoslav Nedyalkov <rnedyalkov@gmail.com> writes:
On Tue, Mar 9, 2021 at 6:53 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The only thing I'm aware of that could consume unbounded memory
on the server side is hash aggregation. (v13 has improved that
situation, but I'm guessing you are running some older version.)
The planner won't choose hash aggregation if it estimates that
the hash table would need to exceed work_mem, but its estimate
could be wrong.
Query and plan attached.
Yeah, so, this looks suspicious:
   ->  HashAggregate  (cost=181320662.52..181323190.52 rows=252800 width=16)
         Group Key: gsm.merchant_id, cer.date
         ->  Merge Left Join  (cost=161250580.17..170174828.82 rows=891666696 width=71)
How accurate is that estimate for the number of merchant_id/date
groups? Is the estimate for the size of the input join reasonable?
Assuming this is the problem, a brute-force fix could be to turn off
enable_hashagg. But it'd be better to try to get the planner's
estimates more in line with reality.
regards, tom lane
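The brute-force fix Tom mentions does not have to be applied globally; it can be scoped to the one problem statement. A sketch, using SET LOCAL so the setting reverts at commit:

```sql
-- Disable hash aggregation for a single transaction only, so the rest
-- of the workload keeps its normal plans.
BEGIN;
SET LOCAL enable_hashagg = off;   -- planner falls back to GroupAggregate
-- ...run the ETL statement here...
COMMIT;
```

A sort-based GroupAggregate can spill to temp files, so it degrades to slow rather than to unbounded memory growth.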
On Tue, Mar 9, 2021 at 8:13 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, so, this looks suspicious:
-> HashAggregate (cost=181320662.52..181323190.52 rows=252800 width=16)
      Group Key: gsm.merchant_id, cer.date
How accurate is that estimate for the number of merchant_id/date
groups? Assuming this is the problem, a brute-force fix could be to
turn off enable_hashagg.
regards, tom lane
Merely taking the number of rows from the cross join inputs, N*M gives 1.4B
rows, plus some multiplication from the left join. That might be the
"surprising" part: the row count seems to be underestimated. The team has
already started rewriting the query.
Thanks for pointing out HashAggregate as a potential danger.
Thanks and regards,
Rado
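For a sense of scale, the back-of-the-envelope arithmetic below shows how an underestimated group count turns into runaway hash-aggregate memory. The 64-bytes-per-entry figure and the 1.4B actual-group count are assumptions for illustration, not measurements from the thread; only the 252,800-row estimate comes from the posted plan:

```python
# How a wrong group-count estimate blows up a hash aggregate's memory use.
BYTES_PER_ENTRY = 64  # assumed per-group hash entry: keys + aggregate state

def hashagg_bytes(n_groups, bytes_per_entry=BYTES_PER_ENTRY):
    """Rough memory needed to hold one hash-table entry per group."""
    return n_groups * bytes_per_entry

planned = hashagg_bytes(252_800)        # planner's estimate from the plan
actual  = hashagg_bytes(1_400_000_000)  # if nearly every joined row is unique

print(f"planned: {planned / 2**20:,.1f} MiB")  # comfortably inside work_mem
print(f"actual:  {actual / 2**30:,.1f} GiB")   # vastly exceeds work_mem
```

With the estimate the planner saw, the hash table looks like it fits in a 1GB work_mem; with near-unique groups it would need tens of gigabytes, which matches the kind of blowup described at the top of the thread.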