OSDL DBT-2 w/ PostgreSQL 7.3.4 and 7.4beta5
I thought someone might be interested in a data point I have comparing
7.3.4 and 7.4beta5 with results from our DBT-2 workload. Keep in mind I
haven't done much tuning with either version. The following links have
references to iostat, vmstat, sar, readprofile (linux kernel profile), and
oprofile (postgresql profile) statistics.
Results from 7.3.4:
http://developer.osdl.org/markw/dbt2-pgsql/184/
- metric 1354.58
Results from 7.4beta5:
http://developer.osdl.org/markw/dbt2-pgsql/188/
- metric 1446.01
7.4beta5 offers more throughput. One significant difference I see is in
the oprofile for the database. For the additional 7% increase in the
metric, there are about 32% fewer ticks in SearchCatCache.
These are the only database parameters I've explicitly set for each one,
any other differences will be differences in default values:
- shared_buffers = 40000
- tcpip_socket = true
- checkpoint_segments = 200
- checkpoint_timeout = 1800
- stats_start_collector = true
- stats_command_string = true
- stats_block_level = true
- stats_row_level = true
- stats_reset_on_server_start = true
If anyone has any tuning recommendations for either 7.3 or 7.4, I'll be
happy to try them. Or if anyone wants to be able to poke around on the
system, we can arrange that too. Feel free to ask any questions.
--
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436 (fax)
Excellent.
I just noticed that most of the numbers in the system are given the
numeric data type. Is there any particular reason you don't use integer
(test enforced?)?
On Fri, 2003-10-31 at 19:18, markw@osdl.org wrote:
[snip]
markw@osdl.org writes:
7.4beta5 offers more throughput. One significant difference I see is in
the oprofile for the database. For the additional 7% increase in the
metric, there are about 32% less ticks in SearchCatCache.
Hmm. I have been profiling PG for some years now, and I cannot remember
ever seeing a profile in which SearchCatCache topped everything else
(the usual suspects for me are palloc/pfree support code). Can you give
any explanation why it looks like that? Can your profiling code tell
where the hotspot call sites of SearchCatCache are?
regards, tom lane
markw@osdl.org wrote:
Results from 7.4beta5
http://developer.osdl.org/markw/dbt2-pgsql/188/
- metric 1446.01
CPU: P4 / Xeon with 2 hyper-threads, speed 1497.51 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (count cycles when processor is active) count 100000
samples % app name symbol name
15369575 9.6780 postgres SearchCatCache
13714258 8.6357 vmlinux .text.lock.signal
10611912 6.6822 vmlinux do_sigaction
4400461 2.7709 vmlinux rm_from_queue
18% of cpu time in the kernel signal handlers.
What are signals used for by postgres? I've seen SIGALRM used to
implement timeouts; what else?
--
Manfred
Tom Lane wrote:
markw@osdl.org writes:
7.4beta5 offers more throughput. One significant difference I see is in
the oprofile for the database. For the additional 7% increase in the
metric, there are about 32% less ticks in SearchCatCache.
Hmm. I have been profiling PG for some years now, and I cannot remember
ever seeing a profile in which SearchCatCache topped everything else
(the usual suspects for me are palloc/pfree support code). Can you give
any explanation why it looks like that? Can your profiling code tell
where the hotspot call sites of SearchCatCache are?
If I understand the docs correctly, op_to_source -a can do that - the
result is annotated assembly, with percentage numbers for each
instruction. If the sources were compiled with -g2, even source level
annotation is possible.
Mark, do you still have the oprofile output? I don't understand why so
much time is spent in the kernel signal handlers, i.e. I could use
annotated assembly or source of linux/kernel/signal.c.
--
Manfred
I've straced
$ pgbench -c 5 -s 6 -t 1000
total 157k syscalls, 70k of them are rt_sigaction(SIGPIPE):
1754 poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, -1) = 1
1754 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
1754 send(3, "\0\0\0%\0\3\0\0user\0postgres\0database\0t"..., 37, 0) = 37
1754 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
1754 poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
1754 recv(3, "R\0\0\0\10\0\0\0\0S\0\0\0\36client_encoding\0SQ"...,
16384, 0) = 169
1754 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
1754 send(3, "Q\0\0\0\35SET search_path = public\0", 30, 0) = 30
1754 rt_sigaction(SIGPIPE, {SIG_DFL}, {SIG_IGN}, 8) = 0
1754 poll([{fd=3, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
1754 recv(3, "C\0\0\0\10SET\0Z\0\0\0\5I", 16384, 0) = 15
1754 rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_DFL}, 8) = 0
and so on. Is that really necessary?
Mark: could you strace your dbt2 app? I guess your app creates a similar
stream of rt_sigaction calls.
--
Manfred
Manfred Spraul <manfred@colorfullife.com> writes:
Is that really necessary?
Unfortunately, yes. libpq can't change the global setting of SIGPIPE
without breaking the surrounding application, but we don't want to crash
the app if the server connection has disappeared, either. So we have to
set the SIGPIPE handler and then restore it around every send().
On some platforms there might be a better way, but this is the only
portable way I know about.
regards, tom lane
Tom Lane wrote:
Manfred Spraul <manfred@colorfullife.com> writes:
Is that really necessary?
Unfortunately, yes. libpq can't change the global setting of SIGPIPE
without breaking the surrounding application, but we don't want to crash
the app if the server connection has disappeared, either. So we have to
set the SIGPIPE handler and then restore it around every send().
Ok. Ahm. No, wait. libpq is multi-threaded, right?
signal handlers are a process property, not a thread property - that
code is broken for multi-threaded apps.
At least that's how I understand the opengroup man page, and a quick
google confirmed that:
http://groups.google.de/groups?selm=353662BF.9D70F63A%40brighttiger.com
I haven't found a reliable thread-safe approach yet:
My first idea was to block SIGPIPE with pthread_sigmask, check after the
send whether it is pending with sigpending, delete it with sigwait, and
restore the blocked state. But that breaks if SIGPIPE is blocked and a
signal is already pending: there is no way to remove our additional
SIGPIPE. I don't see how we can avoid destroying the realtime signal info.
Mark: Is your dbt2 testapp multithreaded? I don't see the signal
functions near the top in the profiles on the osdl website.
--
Manfred
Manfred Spraul <manfred@colorfullife.com> writes:
signal handlers are a process property, not a thread property - that
code is broken for multi-threaded apps.
Yeah, that's been mentioned before, but I don't see any way around it.
What we really want is to turn off SIGPIPE delivery on our socket
(only), but AFAIK there is no API to do that.
regards, tom lane
Tom Lane wrote:
Manfred Spraul <manfred@colorfullife.com> writes:
signal handlers are a process property, not a thread property - that
code is broken for multi-threaded apps.
Yeah, that's been mentioned before, but I don't see any way around it.
Do not handle SIGPIPE in multithreaded apps, and ask the caller to do
that? The current code doesn't block SIGPIPE reliably, which makes it
totally useless (except that it's a debugging nightmare, because
triggering it depends on the right timing).
What we really want is to turn off SIGPIPE delivery on our socket
(only), but AFAIK there is no API to do that.
Linux has a MSG_NOSIGNAL flag for send(), but that seems to be Linux
specific.
--
Manfred
On Sat, Nov 01, 2003 at 02:37:21PM +0100, Manfred Spraul wrote:
[snip]
Mark, do you still have the oprofile output? I don't understand why so
much time is spent in the kernel signal handlers, i.e. I could use
annotated assembly or source of linux/kernel/signal.c.
I haven't been saving the raw output, but I will start. I'll try to get
some annotated source for the kernel going too.
Mark
On Sat, Nov 01, 2003 at 07:27:01PM +0100, Manfred Spraul wrote:
[snip]
Mark: Is your dbt2 testapp multithreaded? I don't see the signal
functions near the top in the profiles on the osdl website.
Yeah, my dbt2 applications are multithreaded.
Mark
Manfred Spraul <manfred@colorfullife.com> writes:
Tom Lane wrote:
What we really want is to turn off SIGPIPE delivery on our socket
(only), but AFAIK there is no API to do that.
Linux has a MSG_NOSIGNAL flag for send(), but that seems to be Linux
specific.
Hmm ... a Linux-specific solution would be better than none at all.
A bigger objection is that we couldn't get libssl to use it (AFAIK).
The flag really needs to be settable on the socket (eg, via fcntl),
not per-send.
regards, tom lane
Tom Lane wrote:
A bigger objection is that we couldn't get libssl to use it (AFAIK).
The flag really needs to be settable on the socket (eg, via fcntl),
not per-send.
It's a per-send flag; it's not possible to force it on with an fcntl :-(
What about an option to skip the sigaction calls for apps that can
handle SIGPIPE? I'm not sure if an option at connect time, or a flag
accessible through a function like PQsetnonblocking() is the better
approach.
Attached is a patch that adds a connstr option, but I don't like it.
--
Manfred
Attachments:
patch-sigpipe (text/plain), +42 -8
Mark Wong wrote:
Yeah, my dbt2 applications are multithreaded.
Do you need SIGPIPE delivery in your app? If no, could you try what
happens if you apply the attached patch to postgres, and perform the
signal(SIGPIPE, SIG_IGN);
once in your dbt2 app?
--
Manfred
Attachments:
patch-dirty (text/plain), +2 -2
Manfred Spraul <manfred@colorfullife.com> writes:
What about an option to skip the sigaction calls for apps that can
handle SIGPIPE?
If the app is ignoring SIGPIPE globally, then our calls will have no
effect anyway. I don't see that this proposal adds any security.
regards, tom lane
Tom Lane wrote:
Manfred Spraul <manfred@colorfullife.com> writes:
What about an option to skip the sigaction calls for apps that can
handle SIGPIPE?
If the app is ignoring SIGPIPE globally, then our calls will have no
effect anyway.
Wrong. From the opengroup manpage:
<<
SIG_IGN - ignore signal
[snip]
- Setting a signal action to SIG_IGN for a signal that is pending will
cause the pending signal to be discarded, whether or not it is blocked
<<
This is why the kernel spends 20% cpu time processing the SIG_IGN:
it must walk through all threads of the process and check if there
are any SIGPIPE signals pending.
I don't see that this proposal adds any security.
It's not about security: Right now multithreaded apps must call
signal(SIGPIPE, SIG_IGN), otherwise they could get killed by sudden
SIGPIPE signals. Additionally, they can't rely on sigpending, because
the pending bits are cleared regularly. On top of that, they get a
noticeable performance hit.
My proposal means that apps that know what they are doing (SIGPIPE
either SIG_IGN, or blocked, or a suitable handler) can avoid the
signal(SIGPIPE, SIG_IGN) in pqsecure_write, with backward compatibility,
because the current system keeps working for single-threaded apps.
--
Manfred
On Friday 31 October 2003 19:18, markw@osdl.org wrote:
These are the only database parameters I've explicitly set for each one,
any other differences will be differences in default values:
- shared_buffers = 40000
- tcpip_socket = true
- checkpoint_segments = 200
- checkpoint_timeout = 1800
ISTM that these two are fairly unrepresentative of any real-world setups. It
might be better to knock them way back towards their defaults and turn on
"checkpoint_warning" to see if they should be altered.
[snip]
Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
I don't remember making a conscious decision between the numeric and integer
database types. Is that a significant oversight on my part?
On Fri, Oct 31, 2003 at 08:04:34PM -0500, Rod Taylor wrote:
Excellent.
I just noticed that most of the numbers in the system are given the
numeric data type. Is there any particular reason you don't use integer
(test enforced?)?
[snip]
On Sat, Nov 01, 2003 at 02:37:21PM +0100, Manfred Spraul wrote:
[snip]
Mark, do you still have the oprofile output? I don't understand why so
much time is spent in the kernel signal handlers, i.e. I could use
annotated assembly or source of linux/kernel/signal.c.
I've rerun a test, capturing the raw oprofile output and running opannotate
for source and assembly output (links for each should be on the page now).
Let me know if I've missed anything:
http://developer.osdl.org/markw/dbt2-pgsql/190/
I'm running a test with your patch now too. I should have results shortly.
Thanks,
Mark