Need for help!

Started by Đỗ Ngọc Trí Cường · 14 messages · general

Dear all,
I have PostgreSQL 8.2.5 and a ~6 GB database with lots of simple selects
using indexes. I see that they use the shared memory heavily.
Before, my server had 4 GB of RAM, shmmax 1 GB, shared_buffers set to 256 MB,
and effective_cache_size 300 MB. When I tested its performance with options
-c 40 -t 1000, the result was about 54.542 tps, but when I raised the number
of clients above 64 it refused to run. Now my server has 8 GB, shmmax 3 GB,
and shared_buffers 2 GB, so it uses ~7 GB of cache; after the benchmark
(-c 40 -t 1000) the result is 57.658 (???). But after the upgrade the maximum
number of clients is still 64 (?!?). Is this the maximum number of clients
supported by the pgbench program (my server runs PostgreSQL 8.2.5 on Linux;
pgbench runs on Windows with PostgreSQL 8.3.1)? And is 57 tps fast?

Another question: I heard that PostgreSQL does not support HT Technology.
Is that right?

Last question: I don't understand shmmax and shared_buffers very well. After
upgrading my server from 4 GB RAM to 8 GB RAM, I first configured shmmax to
2 GB and shared_buffers to 1 GB and started the server; it ran. After that I
set shmmax to 4 GB and restarted, and it failed (?!?). The error log said
there was not enough shared memory! Finally I set shmmax to 3 GB and
shared_buffers to 2 GB, and it ran. I don't know why; can you explain?
Thanks so much!
Regards,

#2 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Đỗ Ngọc Trí Cường (#1)
Re: Need for help!

On Tue, May 13, 2008 at 2:43 PM, Semi Noob <seminoob@gmail.com> wrote:

But after the upgrade the maximum number of clients is still 64 (?!?). Is
this the maximum number of clients supported by pgbench (my server runs
PostgreSQL 8.2.5 on Linux; pgbench runs on Windows with PostgreSQL 8.3.1)?
And is 57 tps fast?

You did not give CPU and disk info, but 57 still seems a small number.
My guess is that you're running pgbench with scale factor 1 (since you
haven't mentioned a scale factor), and that causes extreme contention on
the small tables with a large number of clients.

Regarding maximum number of clients, check your "max_connections" setting.
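For reference, the scale factor is chosen when the test tables are initialized; a minimal sketch of an s = 100 run (assuming a scratch database named "pgbench" already exists and connection options are set in the environment):

```shell
# Initialize with scale factor 100: creates 100 * 100,000 = 10,000,000
# rows in the accounts table. The default s = 1 leaves the branches
# table with a single row, a severe point of contention under many clients.
pgbench -i -s 100 pgbench

# Run the benchmark: 40 concurrent clients, 1000 transactions each.
pgbench -c 40 -t 1000 pgbench
```

These commands need a running PostgreSQL server, so treat them as a sketch rather than something to paste verbatim.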

Another question: I heard that PostgreSQL does not support HT Technology.
Is that right?

I'm not sure what you mean by HT, but if it's Hyper-Threading, then IMO
that statement is not completely true. Postgres is not multi-threaded, so
a single process (or connection) may not be able to use more than one CPU,
but as long as there are multiple connections (each connection corresponds
to one backend process), that many CPUs will be used.

Last question: I don't understand shmmax and shared_buffers very well. After
upgrading my server from 4 GB RAM to 8 GB RAM, I first configured shmmax to
2 GB and shared_buffers to 1 GB and started the server; it ran. After that I
set shmmax to 4 GB and restarted, and it failed (?!?). The error log said
there was not enough shared memory! Finally I set shmmax to 3 GB and
shared_buffers to 2 GB, and it ran. I don't know why; can you explain?

That doesn't make sense. I am guessing that you are running a 32-bit OS;
a 4 GB shmmax won't work on a 32-bit OS.

Thanks,
Pavan

Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#3 Semi Noob
seminoob@gmail.com
In reply to: Pavan Deolasee (#2)
Re: Need for help!

Thank you for your answer!
*"You did not give CPU and disk info. But still 57 seems a small number.
What I guess is you're running pgbench with scale factor 1 (since you
haven't mentioned scale factor) and that causes extreme contention for
smaller tables with large number of clients."*

My CPU is 2x Intel(R) Xeon(TM) 3.20GHz. The disk system is RAID-5, and the
OS is CentOS. The scale factor in the pgbench initialization is 100, which
generates 10,000,000 rows in the accounts table. The fill factor is the
default.
On another note, I heard that PostgreSQL works better with RAID-10 than
with RAID-5; is that right?

*"Regarding maximum number of clients, check your "max_connections"
setting."*

I set max_connections to 200.

57 seems a small number; according to you, how many tps is normal or fast?
And what is the difference between "shared_buffers" and
"effective_cache_size"?

Thank you once more!
Regards,
Semi Noob


#4 Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Đỗ Ngọc Trí Cường (#3)
Re: Need for help!

On Thu, May 15, 2008 at 3:48 PM, Semi Noob <seminoob@gmail.com> wrote:

I set max_connections to 200.

What error message do you get when you try more than 64 clients?

57 seems a small number; according to you, how many tps is normal or fast?

It's difficult to say how much is good. On my laptop, for s = 10, c = 40,
t = 1000, I get 51 tps. But on a larger box with 2 CPUs, 2 GB of RAM,
3 RAID-0 disks for data, and a separate disk for xlog, I get 232 tps.

And what is the difference between "shared_buffers" and "effective_cache_size"?

"shared_buffers" is the size of the buffer pool that Postgres uses to
cache data blocks.
"effective_cache_size" is usually the size of the shared buffers plus an
estimate of whatever data the OS can cache. The planner uses this
approximation to choose the right plan for execution.

http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html
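As a sketch, for the 8 GB box above the two settings might look like this in postgresql.conf (illustrative values only, not a tuning recommendation):

```
shared_buffers = 2GB          # buffer pool Postgres itself uses to cache data blocks
effective_cache_size = 6GB    # planner's estimate: shared_buffers + OS filesystem cache
```

Note that effective_cache_size does not allocate any memory; it only tells the planner how much caching to assume.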

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#5 Justin
justin@emproshunts.com
In reply to: Pavan Deolasee (#4)
Re: Need for help!

Pavan Deolasee wrote:

On Thu, May 15, 2008 at 3:48 PM, Semi Noob <seminoob@gmail.com> wrote:

I set max_connections to 200.

What error message do you get when you try more than 64 clients?

I have max_connections set to 50. You want to be careful with this
setting: if there are a lot of active users and connections get hung
open, they will eat up memory on the server. This number needs to be set
to the maximum number of users you ever want on the server at any one
time. If you get too many hung-open processes, performance suffers.

57 seems a small number; according to you, how many tps is normal or fast?

It's difficult to say how much is good. On my laptop, for s = 10, c = 40,
t = 1000, I get 51 tps. But on a larger box with 2 CPUs, 2 GB of RAM,
3 RAID-0 disks for data, and a separate disk for xlog, I get 232 tps.

As Pavan has stated, the TPS number is directly related to how fast your
hardware is. 51 tps is not very good given the hardware specs you have
stated.

The server I have gets 1500 to 2000 tps, depending on the test. We had a
pretty detailed discussion about performance numbers back in March:
http://archives.postgresql.org/pgsql-performance/2008-03/thrd3.php#00370
The thread is called "Benchmark: Dell/Perc 6, 8 disk RAID 10".

RAID 5 is a terrible setup for performance. RAID 10 seems to be what
everybody goes with, unless you get into SAN storage or other more
complicated setups.


#6 Shane Ambler
pgsql@Sheeky.Biz
In reply to: Đỗ Ngọc Trí Cường (#3)
Re: Need for help!

Semi Noob wrote:

My CPU is 2CPU: Intel(R) Xeon(TM) CPU 3.20GHz. Disk: disk system is RAID-5;

Early versions of PostgreSQL had issues with P4 HT CPUs, but I believe
they have been resolved.

I am quite certain that it only related to the early P4s, not the Xeons.

--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

#7 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Shane Ambler (#6)
Re: Need for help!

On Thu, May 15, 2008 at 11:08 AM, Shane Ambler <pgsql@sheeky.biz> wrote:

Semi Noob wrote:

My CPU is 2CPU: Intel(R) Xeon(TM) CPU 3.20GHz. Disk: disk system is
RAID-5;

Early versions of postgresql had issues with P4 HT CPU's but I believe they
have been resolved.

I am quite certain that it only related to the early P4's not the Xeon.

The real problem was with the various OS kernels not knowing how to treat
an HT "core" versus a real core. Linux in particular was a bad performer
with HT turned on, and pgsql suffered more than many other apps from the
kernel not knowing the difference. The Linux kernel has known for some
time now how to treat an HT core properly.

#8 Justin
justin@emproshunts.com
In reply to: Scott Marlowe (#7)
Re: Need for help!

Scott Marlowe wrote:

On Thu, May 15, 2008 at 11:08 AM, Shane Ambler <pgsql@sheeky.biz> wrote:

Semi Noob wrote:

My CPU is 2CPU: Intel(R) Xeon(TM) CPU 3.20GHz. Disk: disk system is
RAID-5;

Early versions of postgresql had issues with P4 HT CPU's but I believe they
have been resolved.

I am quite certain that it only related to the early P4's not the Xeon.

The real problem was with the various OS kernels not knowing how to treat
an HT "core" versus a real core. Linux in particular was a bad performer
with HT turned on, and pgsql suffered more than many other apps from the
kernel not knowing the difference. The Linux kernel has known for some
time now how to treat an HT core properly.

From everything I have read about Hyper-Threading, it should just be
turned off. There is so much overhead that it killed its own performance
if the application was not designed to take advantage of it. A really
cool idea that proved unfeasible at the time.

Intel says it is bringing back Hyper-Threading for Nehalem:
http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture)

If you can, I would turn it off and see what the results are.

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin (#8)
Re: Need for help!

Justin <justin@emproshunts.com> writes:

From everything I have read about Hyper-Threading, it should just be
turned off.

Depends entirely on what you're doing. I usually leave it turned on
because compiling Postgres from source is measurably faster with it
than without it on my dual-Xeon box. I'd recommend experimenting
with your own workload before making any decisions.

regards, tom lane

#10 Justin
justin@emproshunts.com
In reply to: Tom Lane (#9)
Re: Need for help!

Tom Lane wrote:

Justin <justin@emproshunts.com> writes:

From everything I have read about Hyper-Threading, it should just be
turned off.

Depends entirely on what you're doing. I usually leave it turned on
because compiling Postgres from source is measurably faster with it
than without it on my dual-Xeon box. I'd recommend experimenting
with your own workload before making any decisions.

regards, tom lane

Since PostgreSQL is not multi-threaded but a single-threaded application
that spawns little exes, how is Hyper-Threading helping PostgreSQL at
all??

To be perfectly honest, my programming skills with multi-threaded apps
are non-existent, as is my experience with the Linux world, but in the
Windows world single-threaded apps saw no measurable performance boost.
Rather the opposite: HT killed the apps' performance, and a lot of
multi-threaded apps also got their performance smashed. And if you really
want to kill performance, turn on HT running W2K.

#11 Scott Marlowe
scott.marlowe@gmail.com
In reply to: Justin (#10)
Re: Need for help!

On Thu, May 15, 2008 at 12:53 PM, Justin <justin@emproshunts.com> wrote:

Since PostgreSQL is not multi-threaded but a single-threaded application
that spawns little exes, how is Hyper-Threading helping PostgreSQL at
all??

Two ways:

* The stats collector / autovacuum / bgwriter can operate on one CPU
while you, the user, are using another (whether they're physically
separate cores on different sockets, dual or quad cores on the same
socket, or the sort-of-extra CPU provided by hyperthreading P4s).
* You can use > 1 connection, and each new connection spawns another process.

Note that this actually makes postgresql scale to a large number of
CPUs better than many multi-threaded apps, which can have a hard time
using > 1 CPU at a time without a lot of extra help.
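A hedged illustration of the second point: launching several connections from a shell, each of which becomes its own backend process that the kernel can schedule on a different (logical) CPU. The database and table names here are just the pgbench defaults, not something from this thread:

```shell
# Launch four concurrent sessions; each psql gets its own backend
# process on the server, so the kernel can run them on different
# CPUs or HT siblings at the same time.
for i in 1 2 3 4; do
    psql -d pgbench -c "SELECT count(*) FROM accounts;" &
done
wait    # block until all four background sessions finish
```

This is only a sketch, since it assumes a reachable server and a pgbench-initialized database.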

#12 Craig Ringer
craig@2ndquadrant.com
In reply to: Justin (#10)
Re: Need for help!

Justin wrote:

[Since] PostgreSql is not multi-threaded but a single thread
application which spawns little exe's how is hyper threading helping
Postgresql at all ??

Multiple threads and multiple processes are two ways to tackle a similar
problem - that of how to do more than one thing on the CPU(s) at once.

Applications that use multiple cooperating processes benefit from more
CPUs, CPU cores, and CPU hardware execution threads (HT) just like
applications that use multiple threads do, so long as there is enough
work to keep multiple CPU cores busy.

There's really not *that* much difference between a multi-threaded
executable and an executable with multiple processes cooperating using
shared memory (like PostgreSQL). Nor is there much difference in how
they use multiple logical CPUs.

The main difference between the two models is that multiple processes
with shared memory don't share address space except where they have
specifically mapped it. This means that it's relatively hard for one
process to mangle other processes' state, especially if it's properly
careful with its shared memory. By contrast, it's depressingly easy for
one thread to corrupt the shared heap or even to corrupt other threads'
stacks in a multi-threaded executable.

On Windows, threads are usually preferred because Windows has such a
horrible per-process overhead, but it's very good at creating threads
quickly and cheaply. On UNIX, which has historically been bad at
threading and very good at creating and destroying processes, the use of
multiple processes is preferred.

It's also worth noting that you can combine multi-process and
multi-threaded operation. For example, if PostgreSQL was ever to support
evaluating a single query on multiple CPU cores one way it could do that
would be to spawn multiple threads within a single backend. (Note: I
know it's not even remotely close to that easy - I've been doing way too
much threaded coding lately).

So ... honestly, whether PostgreSQL is multi-threaded or multi-process
just doesn't matter. Even if it were multi-threaded instead of
multi-process, so long as it can only execute each query on a maximum of
one core then single queries will not benefit (much) from having
multiple CPU cores, multiple physical CPUs, or CPUs with hyperthreading.
However, multiple CPU bound queries running in parallel will benefit
massively from extra cores or physical CPUs, and might also benefit from
hyperthreading.

To perfectly honest my programming skills with milti-threading apps is
non-existent along with Linux world but in the Windows world single
threaded apps saw no measurable performance boost

Of course. They cannot use the extra logical core for anything, so it's
just overhead. You will find, though, that hyperthreading may improve
system responsiveness under load even when using only single threaded
apps, because two different single threaded apps can run (kind of) at
the same time.

It's pretty useless compared to real multiple cores, though.

and allot of multi-threaded apps also got
there performance smashed?

That will depend a lot on details of CPU cache use, exactly what they
were doing on their various threads, how their thread priorities were
set up, etc. Some apps benefit, some lose.

--
Craig Ringer

#13 Justin
justin@emproshunts.com
In reply to: Craig Ringer (#12)
Re: Need for help!

Craig Ringer wrote:


Of course. They cannot use the extra logical core for anything, so
it's just overhead. You will find, though, that hyperthreading may
improve system responsiveness under load even when using only single
threaded apps, because two different single threaded apps can run
(kind of) at the same time.

Isn't that the rub with HT processors? A process running on the HT
virtual processor can lock up a specific chunk of the real processor
that would not normally be locked, so the process running on the real
processor gets blocked and put to sleep until the process running on the
HT side is cleared out. The problem, to my understanding, is that on
Windows the kernel scheduler did not understand that HT was a virtual
processor, so it screwed up scheduling orders and whatnot. This added a
lot of overhead to the processor to sort out what was going on, sending
sleep commands to keep the processor from rolling over and dying.

As time has progressed, I imagine the kernel schedulers have improved to
better understand that this virtual processor locks parts of the real
processor up, so they schedule things a little better and don't keep
dumping things onto the HT processor that they will need a second later
for another process queued for the real processor.

It's pretty useless compared to real multiple cores, though.

and a lot of multi-threaded apps also got their performance smashed?

That will depend a lot on details of CPU cache use, exactly what they
were doing on their various threads, how their thread priorities were
set up, etc. Some apps benefit, some lose.

I understand these things in theory; I just never had to do any of the
programming. Life has taught me that theory/books differ a lot from real
life/fact.


#14 Martijn van Oosterhout
kleptog@svana.org
In reply to: Justin (#13)
Re: Need for help!

On Thu, May 15, 2008 at 04:57:03PM -0400, Justin wrote:

Isn't that the rub with HT processors? A process running on the HT
virtual processor can lock up a specific chunk of the real processor
that would not normally be locked, so the process running on the real
processor gets blocked and put to sleep until the process running on
the HT side is cleared out.

There is no "real" or "virtual" processor; HT is symmetric. Both look
like real CPUs, but internally they share certain resources. The
advantage is that when a program gets a cache miss it stalls for dozens
of cycles waiting for memory. On a CPU without HT that's wasted time; on
an HT CPU there is another thread which can (hopefully) keep running and
use the resources the other isn't using.

For programs like GCC, which are waiting for memory 50% of the time or
so, HT can provide a measurable increase in performance. For
computationally expensive programs it may be worse.

As time has progressed, I imagine the kernel schedulers have improved to
better understand that this virtual processor locks parts of the real
processor up, so they schedule things a little better and don't keep
dumping things onto the HT processor that they will need a second later
for another process queued for the real processor.

The thing is that HT processors share an L1 cache, so switching between
two HT processors on the same die is much less expensive than switching
to another core. But if you have only two running processes, it is
better to split them across two cores. Schedulers know this now; they
didn't at first.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/


Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.