Performance monitor signal handler
I was going to implement the signal handler like we do with Cancel,
where the signal sets a flag and we check the status of the flag in
various _safe_ places.
Can anyone think of a better way to get information out of a backend?
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
* Bruce Momjian <pgman@candle.pha.pa.us> [010312 12:12] wrote:
I was going to implement the signal handler like we do with Cancel,
where the signal sets a flag and we check the status of the flag in
various _safe_ places.
Can anyone think of a better way to get information out of a backend?
Why not use a static area of the shared memory segment? Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?
Also, this could work for other stuff as well, instead of overloading
a lot of signal handlers one could just periodically poll a region of
the shared segment.
just some ideas..
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/
At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?
I'd suggest that locking the stats area might be a bad idea; there is only
one writer for each backend-specific chunk, and it won't matter a hell of a
lot if a reader gets inconsistent views (since I assume they will be
re-reading every second or so). All the stats area should contain is a
bunch of counters with timestamps, I think, and the cost of writing to it
should be kept to an absolute minimum.
just some ideas..
Unfortunately, based on prior discussions, Bruce seems quite opposed to a
shared memory solution.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
* Philip Warner <pjw@rhyme.com.au> [010312 18:56] wrote:
At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?
I'd suggest that locking the stats area might be a bad idea; there is only
one writer for each backend-specific chunk, and it won't matter a hell of a
lot if a reader gets inconsistent views (since I assume they will be
re-reading every second or so). All the stats area should contain is a
bunch of counters with timestamps, I think, and the cost of writing to it
should be kept to an absolute minimum.
just some ideas..
Unfortunately, based on prior discussions, Bruce seems quite opposed to a
shared memory solution.
Ok, here's another nifty idea.
On receipt of the info signal, the backends collaborate to piece
together a status file. The status file is given a temporary name.
When complete, the status file is rename(2)'d over a well-known
file.
This ought to always give a consistent snapshot of the file to
whomever opens it.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/
This ought to always give a consistent snapshot of the file to
whomever opens it.
I think Tom has previously stated that there are technical reasons not to
do IO in signal handlers, and I have philosophical problems with
performance monitors that ask 50 backends to do file IO. I really do think
shared memory is TWTG.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
At 13:34 12/03/01 -0800, Alfred Perlstein wrote:
Is it possible
to have a spinlock over it so that an external utility can take a snapshot
of it with the spinlock held?
I'd suggest that locking the stats area might be a bad idea; there is only
one writer for each backend-specific chunk, and it won't matter a hell of a
lot if a reader gets inconsistent views (since I assume they will be
re-reading every second or so). All the stats area should contain is a
bunch of counters with timestamps, I think, and the cost of writing to it
should be kept to an absolute minimum.
just some ideas..
Unfortunately, based on prior discussions, Bruce seems quite opposed to a
shared memory solution.
No, I like the shared memory idea. But first, such an idea will have to
wait for 7.2, and second, there are limits to how much shared memory I can
use. Eventually, I think shared memory will be the way to go.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
This ought to always give a consistent snapshot of the file to
whomever opens it.
I think Tom has previously stated that there are technical reasons not to
do IO in signal handlers, and I have philosophical problems with
performance monitors that ask 50 backends to do file IO. I really do think
shared memory is TWTG.
The good news is that right now pgmonitor gets all its information from
'ps', and only shows the query when the user asks for it.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
* Philip Warner <pjw@rhyme.com.au> [010313 06:42] wrote:
This ought to always give a consistent snapshot of the file to
whomever opens it.
I think Tom has previously stated that there are technical reasons not to
do IO in signal handlers, and I have philosophical problems with
performance monitors that ask 50 backends to do file IO. I really do think
shared memory is TWTG.
I wasn't really suggesting any of those courses of action; all I
suggested was using rename(2) to give a separate application a
consistent snapshot of the stats.
Actually, what makes the most sense (although it may be a performance
killer) is to have the backends update a system table that the external
app can query.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/
I think Tom has previously stated that there are technical reasons not to
do IO in signal handlers, and I have philosophical problems with
performance monitors that ask 50 backends to do file IO. I really do think
shared memory is TWTG.
I wasn't really suggesting any of those courses of action; all I
suggested was using rename(2) to give a separate application a
consistent snapshot of the stats.
Actually, what makes the most sense (although it may be a performance
killer) is to have the backends update a system table that the external
app can query.
Yes, it seems storing info in shared memory and having a system table to
access it is the way to go.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On receipt of the info signal, the backends collaborate to piece
together a status file. The status file is given a temporary name.
When complete, the status file is rename(2)'d over a well-known
file.
Reporting to files, particularly well known ones, could lead to race
conditions.
All in all, I think you're better off passing messages through pipes or a
similar communication method.
I really liked the idea of a "server" that could parse/analyze data from
multiple backends.
My 2/100 worth...
* Thomas Swan <tswan-lst@ics.olemiss.edu> [010313 13:37] wrote:
On receipt of the info signal, the backends collaborate to piece
together a status file. The status file is given a temporary name.
When complete, the status file is rename(2)'d over a well-known
file.
Reporting to files, particularly well-known ones, could lead to race
conditions.
All in all, I think you're better off passing messages through pipes or a
similar communication method.
I really liked the idea of a "server" that could parse/analyze data from
multiple backends.
My 2/100 worth...
Take a few moments to think about the semantics of rename(2).
Yes, you would still need synchronization between the backend
processes to do this correctly, but not in any external app.
The external app can just open the file; assuming it exists, it
will always have a complete and consistent snapshot of whatever
the backends agreed on.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/
Bruce Momjian wrote:
Yes, it seems storing info in shared memory and having a system table to
access it is the way to go.
Depends,
first of all, we need to know WHAT we want to collect. If we
talk about block read/write statistics and such on a per-table
basis, which is IMHO the most accurate thing for tuning
purposes, then we're perhaps talking about an unbounded amount
of shared memory.
And shared memory has all the interlocking problems we want
to avoid.
What about a collector daemon, fired up by the postmaster and
receiving UDP packets from the backends? Under heavy load, it
might miss some statistics messages, but that's not as
bad as having locks causing backends to lose performance.
The postmaster could already provide the UDP socket for the
backends, so the collector can know the peer address from
which to accept statistics messages only. Any message from
another peer address is simply ignored. For getting the
statistics out of it, the collector has its own server
socket, using TCP and providing some lookup protocol.
Now whatever the backend has to tell the collector, it simply
throws a UDP packet in its direction. Whether the collector
catches it or not is not the backend's problem.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
Jan Wieck <janwieck@Yahoo.com> writes:
What about a collector daemon, fired up by the postmaster and
receiving UDP packets from the backends? Under heavy load, it
might miss some statistics messages, but that's not as
bad as having locks causing backends to lose performance.
Interesting thought, but we don't want UDP I think; that just opens
up a whole can of worms about checking access permissions and so forth.
Why not a simple pipe? The postmaster creates the pipe and the
collector daemon inherits one end, while all the backends inherit the
other end.
regards, tom lane
Tom Lane wrote:
Jan Wieck <janwieck@Yahoo.com> writes:
What about a collector daemon, fired up by the postmaster and
receiving UDP packets from the backends? Under heavy load, it
might miss some statistics messages, but that's not as
bad as having locks causing backends to lose performance.
Interesting thought, but we don't want UDP I think; that just opens
up a whole can of worms about checking access permissions and so forth.
Why not a simple pipe? The postmaster creates the pipe and the
collector daemon inherits one end, while all the backends inherit the
other end.
I don't think so. I haven't tested the following yet,
but AFAIR it's correct.
Have the postmaster create two UDP sockets before it forks
off the collector. It can examine the peer addresses of both,
so they don't need well-known port numbers; they can be the
random ones assigned by the kernel. Thus, we don't need
SO_REUSEADDR on them either.
Now, since the collector is forked off by the postmaster, it
knows the peer address of the other socket. And since all
backends get forked off from the postmaster as well, they'll
all use the same peer address, don't they? So all the
collector has to look at is the sender address including port
number of the packets. It needs to be what the postmaster
examined, anything else is from someone else and goes to bit
heaven. The same way the backends know where to send their
statistics.
If I'm right that in the case of fork() all children share
the same socket with the same peer address, then it's even
safe in the case the collector dies. The postmaster can still
hold the collector's socket and will notice that the collector
died (due to a wait() returning its PID) and can fire up
another one. Again some packets get lost (plus all the so-far
collected statistics - hmmm, ain't that a cool way to reset
statistics counters, killing the collector?), but it did not
disturb any live backend in any way. They will never get any
signal, don't care about what's done with their statistics
and such. They just do their work...
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
At 06:57 15/03/01 -0500, Jan Wieck wrote:
And shared memory has all the interlocking problems we want
to avoid.
I suspect that if we keep per-backend data in a separate area, then we
don't need locking since there is only one writer. It does not matter if a
reader gets an inconsistent view, the same as if you drop a few UDP packets.
What about a collector daemon, fired up by the postmaster and
receiving UDP packets from the backends.
This does sound appealing; it means that individual backend data (IO etc)
will survive past the termination of the backend. I'd like to see the stats
survive the death of the collector if possible, possibly even survive a
stop/start of the postmaster.
Now whatever the backend has to tell the collector, it simply
throws a UDP packet into his direction. If the collector can
catch it or not, not the backends problem.
If we get the backends to keep the stats they are sending in local
counters as well, then they can send the counter value (not the delta)
each time, which would mean that the collector would not 'miss' anything -
just its operations/sec might see a hiccup. This could have a side benefit
that (if we wanted to?) we could allow a client to query its own counters
to get an idea of the costs of its queries.
When we need to reset the counters, that should be done explicitly, I think.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
* Philip Warner <pjw@rhyme.com.au> [010315 16:14] wrote:
At 06:57 15/03/01 -0500, Jan Wieck wrote:
And shared memory has all the interlocking problems we want
to avoid.
I suspect that if we keep per-backend data in a separate area, then we
don't need locking since there is only one writer. It does not matter if a
reader gets an inconsistent view, the same as if you drop a few UDP packets.
No, this is completely different.
Lost data is probably better than incorrect data. Either use locks
or a copying mechanism. People will depend on the data returned
making sense.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
Lost data is probably better than incorrect data. Either use locks
or a copying mechanism. People will depend on the data returned
making sense.
But with per-backend data, there is only ever *one* writer to a given set
of counters. Everyone else is a reader.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/
* Philip Warner <pjw@rhyme.com.au> [010315 16:46] wrote:
At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
Lost data is probably better than incorrect data. Either use locks
or a copying mechanism. People will depend on the data returned
making sense.
But with per-backend data, there is only ever *one* writer to a given set
of counters. Everyone else is a reader.
This doesn't prevent a reader from getting an inconsistent view.
Think about a 64-bit counter on a 32-bit machine. If you charged per
megabyte, wouldn't it upset you to have a small chance of losing
4 billion units of sale?
(ie, doing a read after an addition that wraps the low 32 bits
but before the carry is done to the most significant 32 bits?)
Ok, what if everything can be read atomically by itself?
You're still busted the minute you need to export any sort of
compound stat.
If A, B, and C need to add up to 100, you have a read race.
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
At 16:55 15/03/01 -0800, Alfred Perlstein wrote:
* Philip Warner <pjw@rhyme.com.au> [010315 16:46] wrote:
At 16:17 15/03/01 -0800, Alfred Perlstein wrote:
Lost data is probably better than incorrect data. Either use locks
or a copying mechanism. People will depend on the data returned
making sense.
But with per-backend data, there is only ever *one* writer to a given set
of counters. Everyone else is a reader.
This doesn't prevent a reader from getting an inconsistent view.
Think about a 64-bit counter on a 32-bit machine. If you charged per
megabyte, wouldn't it upset you to have a small chance of losing
4 billion units of sale?
(ie, doing a read after an addition that wraps the low 32 bits
but before the carry is done to the most significant 32 bits?)
I assume this means we can not rely on the existence of any kind of
interlocked add on 64 bit machines?
Ok, what if everything can be read atomically by itself?
You're still busted the minute you need to export any sort of
compound stat.
Which is why the backends should not do anything other than maintain the
raw data. If there is atomic data that can cause inconsistency, then a
dropped UDP packet will do the same.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/