Suggestion for memory parameters
Hello All,
In an RDS Postgres database we are seeing some SELECT queries that sort ~50
million rows (they have an ORDER BY clause). A significant portion of the
wait events shows as "IO:BufFileWrite", and the queries run for 20+ minutes.
Going through the document linked below, it states that we should monitor
the "FreeLocalStorage" metric. When monitoring it, I see it peaks at
~535GB, and while these queries run it drops to ~100GB. Note: it's a
db.r7g.8xlarge instance.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/apg-waits.iobuffile.html
We were thinking of bumping work_mem up to a higher value at the database
level; it currently sits at the 4MB default. But we will also have ~100
sessions running at the same time, the majority from other applications
that execute single-row INSERT queries, which I expect will not need a high
work_mem. And setting it at the database level could consume up to 100
times the configured work_mem value. So how should we handle this situation?
Or is it fine to let the queries consume "FreeLocalStorage" as long as it
doesn't go all the way to zero?
I am also confused about the difference between local storage (which shows
as ~535GB) and the 256GB of RAM for this instance class, which has a
~128TB maximum storage limit. How are these kinds of storage different
(mainly the 535GB it shows vs. the 128TB storage limit)?
Appreciate your guidance.
The SELECT query looks something like the one below, with no joins, just a
single-table fetch:

SELECT ...
FROM <table_name>
WHERE ...
ORDER BY column1, column2
LIMIT $b1 OFFSET $B2;
Regards
Yudhi
On Thu, 26 Sept 2024 at 16:33, yudhi s <learnerdatabase99@gmail.com> wrote:
My 2 cents:
I think you should set work_mem at the session level if your sorting
queries come from only a specific handful of sessions, because setting it
at the database level will eat up most of your RAM (which you said is
256GB), and you said 100+ sessions can be spawned at any point in time.
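As a minimal sketch of the session-level approach (the 256MB value is a
placeholder assumption, not a measured requirement for your workload):

```sql
-- Raise work_mem only around the big sort; SET LOCAL reverts the change
-- automatically at COMMIT/ROLLBACK, so the other ~100 sessions keep the
-- 4MB database default.
BEGIN;
SET LOCAL work_mem = '256MB';  -- placeholder value; size from observed spill

SELECT ...
FROM <table_name>
WHERE ...
ORDER BY column1, column2
LIMIT $b1 OFFSET $B2;

COMMIT;
```

A plain SET work_mem = '256MB' (without LOCAL) would instead last for the
whole session until RESET.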
On Fri, Sep 27, 2024 at 9:11 AM veem v <veema0000@gmail.com> wrote:
Thank you.
When I checked pg_stat_statements for this query and divided
temp_blks_read + temp_blks_written by "calls", it came to ~1 million
blocks, which at 8kB per block means ~7GB. So does that mean ~7GB of
work_mem should be allocated for this query?
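That check can be sketched as the query below (column names are from the
standard pg_stat_statements view; the queryid filter is a placeholder):

```sql
-- Estimate the per-call temp spill of a statement from pg_stat_statements.
-- 8192 bytes is the default PostgreSQL block size (see the block_size GUC).
SELECT queryid,
       calls,
       (temp_blks_read + temp_blks_written) / calls AS temp_blks_per_call,
       pg_size_pretty(
           ((temp_blks_read + temp_blks_written) / calls) * 8192
       ) AS temp_per_call
FROM pg_stat_statements
WHERE queryid = <your_queryid>;  -- placeholder for the statement of interest
```

Note that the spilled-file size is only a lower bound: sorting the same
rows fully in memory generally needs more than the on-disk spill size.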
On Sep 26, 2024, at 7:03 AM, yudhi s <learnerdatabase99@gmail.com> wrote:
In a RDS postgres ...
Is it fine to let it use "FreeLocalStorage" unless it goes till zero?
Hi Yudhi,
FreeLocalStorage and some of the other things you ask about are specific to AWS RDS, so you might have better luck getting answers on an RDS-specific mailing list. We also use RDS-hosted Postgres and so I completely understand how Postgres and RDS are intertwined.
We have had runaway queries exhaust FreeLocalStorage. It has been quite a while since that happened, so my memories are hazy, but I’m pretty sure that when we used all of FreeLocalStorage, the result was that Postgres restarted. It might be equivalent to using all memory and disk space on a standalone system. Once there’s no storage left, behavior is unpredictable but we can’t be surprised if things crash. Usually our runaway queries got killed before FreeLocalStorage filled up, but not always.
I second Veem’s suggestion to set work_mem on a per-session basis. Also note that the doc for work_mem says, “the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value."
https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM
Cheers
Philip
On Mon, Sep 30, 2024 at 8:46 PM Philip Semanchuk <
philip@americanefficient.com> wrote:
Thank you.
When I execute the query with EXPLAIN (ANALYZE, BUFFERS), I see the section
below in the plan with "Sort Method" information in three places, each
showing ~75MB, which combined comes to under 250MB. So does that mean it's
enough to set work_mem to ~250MB for these queries before they start?
But somehow this query finishes in a few seconds when I execute it with
EXPLAIN (ANALYZE, BUFFERS), while without EXPLAIN it runs for 10+ minutes.
My expectation was that EXPLAIN ANALYZE actually executes the query fully.
Is my understanding correct, and are the disk-spill statistics I am seeing
accurate enough to go by?
Limit  (cost=557514.75..592517.20 rows=300000 width=1430) (actual time=2269.939..2541.527 rows=300000 loops=1)
  Buffers: shared hit=886206, temp read=38263 written=56947
  I/O Timings: temp read=70.040 write=660.073
  ->  Gather Merge  (cost=557514.75..643393.02 rows=736048 width=1430) (actual time=2269.938..2513.748 rows=300000 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        Buffers: shared hit=886206, temp read=38263 written=56947
        I/O Timings: temp read=70.040 write=660.073
        ->  Sort  (cost=556514.73..557434.79 rows=368024 width=1430) (actual time=2227.392..2279.389 rows=100135 loops=3)
              Sort Key: column1, column2
              Sort Method: external merge  Disk: 77352kB
              Buffers: shared hit=886206, temp read=38263 written=56947
              I/O Timings: temp read=70.040 write=660.073
              Worker 0:  Sort Method: external merge  Disk: 75592kB
              Worker 1:  Sort Method: external merge  Disk: 74440kB
              ->  Parallel Append  (cost=0.00..64614.94 rows=368024 width=1430) (actual time=0.406..570.105 rows=299204 loops=3)
On Tue, Oct 1, 2024 at 2:52 AM yudhi s <learnerdatabase99@gmail.com> wrote:
When I execute the query with EXPLAIN (ANALYZE, BUFFERS), I see the
section below in the plan with "Sort Method" information in three places,
each showing ~75MB, which combined comes to under 250MB. So does that mean
it's enough to set work_mem to ~250MB for these queries before they start?
work_mem is applied per sort or hash operation, so you don't usually need
to combine the values. However, these are parallel workers, so you probably
need to account for the case in which no workers are available, in which
case you DO want to combine the values, but only for parallel workers all
doing the same operation.
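Making that arithmetic concrete with the numbers from the plan above (the
specific work_mem values here are assumptions, and in-memory sorts need
somewhat more memory than the on-disk spill size):

```sql
-- Each of the three processes (leader + 2 workers) spilled ~75MB, so with
-- workers available something comfortably above 75MB per process may keep
-- each sort in memory.  If no workers launch, a single process does the
-- whole sort (77352 + 75592 + 74440 kB, roughly 227MB spilled combined),
-- so covering the worst case means sizing for the combined figure:
SET work_mem = '256MB';  -- placeholder covering the no-workers case
```

Remember this is per operation per process: with 2 workers launched and
work_mem = '256MB', a single parallel sort can use up to ~768MB in total.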
But somehow this query finishes in a few seconds when I execute it with
EXPLAIN (ANALYZE, BUFFERS), while without EXPLAIN it runs for 10+ minutes.
My expectation was that EXPLAIN ANALYZE actually executes the query fully.
Is my understanding correct, and are the disk-spill statistics I am seeing
accurate enough to go by?
Running EXPLAIN ANALYZE does indeed run the actual query, but it throws
away the output. It looks like your LIMIT is set to 300,000 rows (why!?),
which could account for some or all of the time difference: those rows
must be passed back to your client and processed there. But it's hard to
say whether that's the whole reason for the difference without more data.
It might help to see the query, but as a rule of thumb, don't use SELECT *
and keep your LIMIT sane: only pull back the columns and rows your
application absolutely needs.
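As an illustration of that rule of thumb (the column names here are
placeholders, not taken from your schema):

```sql
-- Fetch only the columns the application needs, with a sane page size.
-- column1, column2, and needed_col are hypothetical column names.
SELECT column1, column2, needed_col
FROM <table_name>
WHERE ...
ORDER BY column1, column2
LIMIT 1000;  -- page in small chunks; large OFFSETs force re-sorting and
             -- re-skipping the same rows on every page
```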
Cheers,
Greg