access time performance problem

Started by Louis-Marie Croisez in October 2002 · 8 messages · general
#1Louis-Marie Croisez
louis-marie.croisez@etca.alcatel.be

I have an IBM xSeries 300 with a single CPU, Red Hat installed, 512MB RAM and a SCSI drive with hardware mirroring.
The PostgreSQL database is on an ext3 (journalled file system) partition.
My largest table contains about 30,000 records.

In my project, PostgreSQL is used to feed data to and fetch data from an external hardware device as quickly as possible.
The external device asks the IBM for its configuration data, and the goal is to do a fetch on the database and send the information
back as quickly as possible.
The second scenario is when the external device wants to back up its configuration.
A mean time of 50ms between database accesses is foreseen.
For both scenarios I have chosen auto-commit mode, because every record has to be on disk as quickly as possible.

I have noticed very poor database access times. I then tried another computer: a common desktop PC (Compaq) with an IDE drive,
less memory and a slower CPU. I got better database access times.
Here are the results:

                           delete_records  insert_records  update_records
Compaq mean access time:        2.7ms           4.5ms           4.8ms
IBM mean access time:          22.9ms          24.6ms          25.9ms

While browsing newsgroups, I found that playing with the wal_sync_method parameter could give better results.
I tried wal_sync_method=open_sync, and here are the results:

                           delete_records  insert_records  update_records
Compaq mean access time:        1.0ms           2.6ms           2.6ms
IBM mean access time:           4.0ms           1.3ms           1.3ms

My first question is: how is such a gain in time possible for the IBM between the wal_sync_method=fsync case and the
wal_sync_method=open_sync case?
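For reference, the setting being compared is a single line in postgresql.conf. This is only a sketch: the methods available and the default chosen depend on the platform and PostgreSQL version.

```
# postgresql.conf -- how PostgreSQL forces WAL records to disk
#wal_sync_method = fsync      # common default: write(), then a separate fsync()
wal_sync_method = open_sync   # open the WAL file with O_SYNC; each write is synchronous
```

With open_sync there is no separate fsync() call, and on some disk/driver combinations fsync() flushes more dirty blocks than strictly necessary, which is one plausible explanation for a large gap between the two settings.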

Another problem is the following:
roughly every 1000 database accesses (not at regular intervals), database access hangs for approximately 2500ms.
I suppose this time is used by the OS to flush the memory cache to the hard disk.

My second question is: how can such hangs of the database be avoided? Is it possible to flush one part of the cache while
working on another part, the goal being not to interrupt the whole process?

Thanks in advance for your comments.

--Louis Croisez.

#2scott.marlowe
scott.marlowe@ihs.com
In reply to: Louis-Marie Croisez (#1)
Re: access time performance problem

Quick question, are you regularly vacuuming and analyzing your database?
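If not, a minimal maintenance sketch (the schedule and user are illustrative; vacuumdb ships with PostgreSQL and wraps VACUUM/ANALYZE):

```
# crontab fragment: nightly at 03:00, reclaim dead rows and refresh
# planner statistics for every database
0 3 * * *  postgres  vacuumdb --all --analyze
```

With a table being updated continuously, skipping vacuum lets dead tuples accumulate and access times degrade steadily.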

Also, ext3 can definitely slow things down. If your machine is stable and
on a UPS, it may be worth your while to just run ext2.

Also, have you compared output from bonnie++ on the Compaq against the
IBM? (Run it on the same drive that hosts the database, of course.) It's a
free program you can download to test your drive subsystem's performance.
A SCSI mirror set on 10k drives should be able to read at >30 MB a
second, and an IDE drive should be in the 5 to 15 MB a second range.
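A hedged invocation sketch (the directory, size and user are assumptions; pick a file size around twice RAM so the page cache can't mask the disk):

```
# run bonnie++ against the filesystem that hosts the database cluster
bonnie++ -d /var/lib/pgsql/bench -s 1024 -u postgres
```

Compare the per-character and block sequential write figures between the two machines; a healthy SCSI mirror losing badly to a desktop IDE drive points at the controller or driver rather than PostgreSQL.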

Since PostgreSQL is designed more for integrity and transactions, it may
not be your best choice here. I'm not sure what would be, but
PostgreSQL is not known for being a real-time system with
guarantees on response times.

Also, what processor speeds are these two machines? Just wondering.

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

#3Louis-Marie Croisez
louis-marie.croisez@etca.alcatel.be
In reply to: scott.marlowe (#2)
Re: access time performance problem

Thanks for your response, Scott.
To illustrate my problem, here is a capture of an Excel graph showing the access-time peaks I get:
http://louis.croisez.free.fr/download/capture1.jpg
The test program updates a table (1000 records, chosen randomly) every 50ms.
Approximately every 3 seconds, the update process hangs for about 800ms (these figures were measured on the Compaq; on the
IBM, the results are worse).
This is a big problem for us.
Could you give me some help to correct that?

LM Croisez.

#4Louis-Marie Croisez
louis-marie.croisez@etca.alcatel.be
In reply to: scott.marlowe (#2)
Re: access time performance problem

Sorry, bad URL; use this one instead:
http://louis.croisez.free.fr/capture1.jpg

LM Croisez

#5Richard Huxton
dev@archonet.com
In reply to: Louis-Marie Croisez (#4)
Re: access time performance problem

I'm no expert on filesystem issues, but I do remember reading about bursts of write
activity occurring with ext3 (which you mentioned using in a previous mail,
and Scott remarked upon). IIRC it was something to do with ext3 buffering for
a period and then performing a bunch of writes at the same time. This looks
like a plausible candidate for your problem. The bad news is you'll have to
do some googling, since I can't remember where I heard about it. The good
news is that I seem to remember there was tuning that would help.

You might find vmstat/iostat useful to identify where the slowdown is
occurring.

- Richard Huxton

#6Louis-Marie Croisez
louis-marie.croisez@etca.alcatel.be
In reply to: Richard Huxton (#5)
Re: access time performance problem

Hi Richard,
I have just tested it, remounting the ext3 partition as ext2.
The result is identical.

LM Croisez

#7Richard Huxton
dev@archonet.com
In reply to: Louis-Marie Croisez (#6)
Re: access time performance problem

Hmm - that's ruled the journalling out then. Logically, it can only be two
things:

1. PostgreSQL flushing buffers or similar.
2. Another process flushing buffers or similar.

Can you try updating batches of 200 records rather than 1000? If the time
between delays increases, that would indicate it's PG doing something.
Otherwise it could be related to logging, another process, or the OS.

Try tracing activity with vmstat/iostat - that should show you what's
happening in a general sense. Then we can see if we can pin it down in
detail.
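As a concrete starting point, here is a self-contained sketch of scanning vmstat output for write bursts. The sample lines and the 5000-blocks/s threshold are made up for illustration; in practice you would pipe live "vmstat 1" output through the same awk filter while the test program runs.

```shell
# Flag vmstat samples whose "bo" column (blocks written out per second,
# field 10 in the classic vmstat layout) spikes past a threshold.
# The input below is canned sample output, not measured data.
awk 'NR > 2 && $10 > 5000 { print "write burst:", $10, "blocks/s" }' <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id
 0  0      0 401234  12345  98765    0    0     2    40  110   150  2  1 97
 1  0      0 401100  12345  98770    0    0     0  9200  400   600  5 10 85
EOF
```

If the bursts line up with the stalls the test program sees, the bottleneck is the kernel flushing dirty buffers rather than anything inside PostgreSQL.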

--
Richard Huxton

#8scott.marlowe
scott.marlowe@ihs.com
In reply to: Louis-Marie Croisez (#4)
Re: access time performance problem

I'm gonna make a guess here that it's WAL activity, or log activity.

So first off, are you saving PostgreSQL's output to a log? If so, try
logging to a different drive than your data drive, or turn off logging
for a quick test.

If it's WAL writes that are causing the problem, you can either try
changing the settings for commit_delay and commit_siblings, or symlink the
pg_xlog directory somewhere else. If you aren't sure how to do that,
don't learn how on a live production server with real data.
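The symlink trick above can be sketched as follows. This demo runs on throwaway temporary directories so it is safe to try anywhere; on a real server you would stop the postmaster first and substitute your actual PGDATA and fast-disk paths, which are assumptions here.

```shell
# Relocate the WAL directory to another disk and leave a symlink behind
# (demo on temp dirs; the real pg_xlog lives under $PGDATA).
DATA=$(mktemp -d)/data   # stand-in for PGDATA
FAST=$(mktemp -d)        # stand-in for a separate, faster disk
mkdir -p "$DATA/pg_xlog"
mv "$DATA/pg_xlog" "$FAST/pg_xlog"     # move the WAL onto the other disk
ln -s "$FAST/pg_xlog" "$DATA/pg_xlog"  # PostgreSQL follows the symlink
ls -ld "$DATA/pg_xlog"                 # shows the link and its target
```

Putting the WAL on its own spindle means commit-time flushes no longer compete with data-file writes for the same disk head.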