Latency problems with simple queries
I randomly get latency/performance problems even with very simple
queries, for example fetching a row by primary key from a small table.
Since I could not trace it back to specific queries, I decided to give
LatencyTOP (http://www.latencytop.org/) a go. Soon after running a
couple of queries, I saw this in latencytop whilst a query was hanging
in postgres:
Cause                     Maximum         Percentage
Writing a page to disk    19283.9 msec    99.7
The disk configuration is as follows:
RAID controller: LSI MegaRAID 9261
The tablespace is on a dedicated RAID10 volume, xlog is on its own RAID1
volume, and temporary data is on another disk.
Volumes are mounted with noatime,errors=remount-ro.
These are the sysctl.conf changes I made (the machine has 48GB of memory):
kernel.shmmax = 25344188416
kernel.shmall = 6187546
vm.swappiness = 0
vm.overcommit_memory = 2
vm.dirty_background_ratio = 1
vm.dirty_ratio = 2
vm.zone_reclaim_mode = 0
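For reference, here is a rough sketch of what the two dirty-page ratios above work out to in bytes. It assumes the full 48GB counts toward the threshold; the kernel actually applies the percentages to available (free plus reclaimable) memory, so the real limits are somewhat lower. The function name `dirty_threshold_bytes` is made up for illustration.

```python
def dirty_threshold_bytes(mem_bytes: int, ratio_percent: int) -> int:
    """Upper bound on the dirty-page threshold for a given vm.dirty_* ratio."""
    return mem_bytes * ratio_percent // 100

# Assumption: use total RAM (48 GiB) as the base, which overestimates
# the memory the kernel really uses for this calculation.
TOTAL_MEM = 48 * 1024**3

background = dirty_threshold_bytes(TOTAL_MEM, 1)  # vm.dirty_background_ratio = 1
hard_limit = dirty_threshold_bytes(TOTAL_MEM, 2)  # vm.dirty_ratio = 2

print(f"background writeback starts near {background / 1024**2:.0f} MiB of dirty pages")
print(f"writers start blocking near {hard_limit / 1024**2:.0f} MiB of dirty pages")
```

So even with these low ratios, up to roughly half a gigabyte can be dirty before background writeback kicks in, and up to about a gigabyte before processes that write are made to wait.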
Maybe someone has seen this before and can give me some advice.
Adrian
On Thu, 2011-07-07 at 12:13 +0100, Adrian Schreyer wrote:
> I randomly get latency/performance problems even with very simple
> queries, for example fetching a row by primary key from a small table.
> Since I could not trace it back to specific queries, I decided to give
> LatencyTOP (http://www.latencytop.org/) a go. Soon after running a
> couple of queries, I saw this in latencytop whilst a query was hanging
> in postgres:
>
> Cause                     Maximum         Percentage
> Writing a page to disk    19283.9 msec    99.7
What IO scheduler and filesystem are you using?
I think that CFQ has some problems for database workloads. It would be
easy to test: just switch to deadline and/or noop for a while and see if
the problem persists.
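To see which scheduler a device is currently using, a minimal sketch along these lines may help. It assumes the usual Linux sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers and marks the active one in square brackets. The function names and the device name "sda" are only illustrative.

```python
from pathlib import Path

def active_scheduler(sysfs_text: str) -> str:
    """Return the scheduler shown in [brackets], e.g. 'noop deadline [cfq]' -> 'cfq'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active scheduler marked in: {sysfs_text!r}")

def read_scheduler(device: str) -> str:
    """Read the active scheduler for a block device such as 'sda' (hypothetical name)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())
```

Calling read_scheduler("sda") on the database host would then report the scheduler in use. Switching is normally done by writing the new name (for example deadline) into the same sysfs file as root, and that change does not survive a reboot.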
Also, I have heard of a few strange things with ext4. Those issues have
probably been fixed by now, and a different filesystem would be much
harder for you to test.
But it might be worth searching for issues/bugs with your particular
version of the filesystem.
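If it helps to answer the filesystem question, here is a small sketch that pulls the filesystem type and mount options for a mount point out of /proc/mounts. It assumes the standard whitespace-separated format (device, mount point, type, options, ...). The mount point "/pgdata" and the function name are hypothetical.

```python
from typing import List, Optional, Tuple

def mount_info(mounts_text: str, mount_point: str) -> Optional[Tuple[str, List[str]]]:
    """Return (fstype, options) for mount_point, or None if it is not mounted."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: a later mount shadows an earlier one.
            result = (fields[2], fields[3].split(","))
    return result

# Hypothetical sample in /proc/mounts format, not taken from the real machine.
sample = "/dev/sdb1 /pgdata ext4 rw,noatime,errors=remount-ro 0 0\n"
print(mount_info(sample, "/pgdata"))
```

On the real host, the contents of /proc/mounts would be passed in place of the sample string, together with the actual tablespace and xlog mount points.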
Regards,
Jeff Davis