Configuration of statistical views

Started by Jan Wieck over 24 years ago, 14 messages
#1 Jan Wieck
JanWieck@Yahoo.com

Hi,

OK, all the frequently called functions of the pgstat
stuff are now macros (will commit that later today).

Now about the per-database configuration. The thing is that I
don't know if it is worth doing it in too much detail. #ifdef'ing
out the functionality, I get the following wallclock runtimes
for the regression test on a 500MHz P-III:

Backend does nothing:                            1:03
Backend sends per-table scan and block IO:       1:05
Backend sends per-table info plus querystring:   1:10
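
Read as relative overhead, these timings come to roughly 3% for the per-table counters and roughly 11% once the querystring is sent as well. A minimal sketch of that arithmetic, with helper names that are purely illustrative and not part of pgstat:

```c
/* Convert a "minutes:seconds" wallclock reading into seconds. */
int to_seconds(int minutes, int seconds)
{
    return minutes * 60 + seconds;
}

/* Overhead of a run relative to the baseline run, in percent. */
double overhead_percent(int baseline_s, int run_s)
{
    return 100.0 * (double)(run_s - baseline_s) / (double)baseline_s;
}
```

For example, `overhead_percent(to_seconds(1, 3), to_seconds(1, 10))` compares the 1:10 run against the 1:03 baseline. These are single wallclock runs, so the percentages are only a rough indication.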

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

What do others think?

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#1)
Re: Configuration of statistical views

Jan Wieck <JanWieck@Yahoo.com> writes:

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

Do you even need a per-database default? Why not an installation-wide
default in postgresql.conf plus on/off commands? The great advantage
of doing it that way is that it's simply a GUC variable or three, and
you don't need to expend any work on developing infrastructure. So
I'd recommend doing it that way to get started, even if you later decide
that something more complex is warranted.

regards, tom lane

#3 Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#2)
Re: Configuration of statistical views

Tom Lane wrote:

Jan Wieck <JanWieck@Yahoo.com> writes:

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

Do you even need a per-database default? Why not an installation-wide
default in postgresql.conf plus on/off commands? The great advantage
of doing it that way is that it's simply a GUC variable or three, and
you don't need to expend any work on developing infrastructure. So
I'd recommend doing it that way to get started, even if you later decide
that something more complex is warranted.

Personally, I can live with no options at all, because I
think that amount of performance loss is worth being able
to look at a query just in case. You know, if it's a config
option, it tends to always be off when the errors happen.

Jan


#4 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Jan Wieck (#1)
Re: Configuration of statistical views

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

Sounds fine. You may be able to just do a GUC/SET option and not do a
per-database field. GUC doesn't do per-database and having a database
flag and GUC would be confusing. Let's roll with just GUC/SET and see
how it goes.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

#5 Jan Wieck
JanWieck@Yahoo.com
In reply to: Bruce Momjian (#4)
Re: Configuration of statistical views

Bruce Momjian wrote:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

Sounds fine. You may be able to just do a GUC/SET option and not do a
per-database field. GUC doesn't do per-database and having a database
flag and GUC would be confusing. Let's roll with just GUC/SET and see
how it goes.

No per backend on/off statement - is that what you mean?
That'd be easiest to get started.

Jan


#6 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Jan Wieck (#5)
Re: Configuration of statistical views

Bruce Momjian wrote:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

So I can see value in a per database default in pg_database
plus the ability to switch it on/off via statement to analyze
single commands.

Sounds fine. You may be able to just do a GUC/SET option and not do a
per-database field. GUC doesn't do per-database and having a database
flag and GUC would be confusing. Let's roll with just GUC/SET and see
how it goes.

No per backend on/off statement - is that what you mean?
That'd be easiest to get started.

GUC as the default, and SET for per-backend. I am liking GUC more and
more.


#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: Configuration of statistical views

Bruce Momjian <pgman@candle.pha.pa.us> writes:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

"On by default", sure. "On all the time", I'm not sold on.

But anyway, we seem to be converging on the conclusion that setting
up a GUC variable will do fine, at least until there is definite
evidence that it won't.

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

regards, tom lane
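
The two-level scheme described in this message can be sketched as a single decision function. This is only an illustration under stated assumptions: the enum, the struct, and every field name below are hypothetical stand-ins, not actual GUC variables or pgstat code.

```c
#include <stdbool.h>

/* Hypothetical kinds of messages a backend might send to the collector. */
typedef enum {
    MSG_BACKEND_START,
    MSG_BACKEND_STOP,
    MSG_TABLE_STATS,
    MSG_QUERYSTRING
} StatsMsgKind;

/* Hypothetical settings; the names are illustrative only. */
typedef struct {
    bool collector_started;  /* postmaster-level: fixed at server start */
    bool send_table_stats;   /* per-backend: may change during a session */
    bool send_querystring;   /* per-backend: may change during a session */
} StatsSettings;

/* Decide whether a message of the given kind should be sent. */
bool should_send(const StatsSettings *s, StatsMsgKind kind)
{
    if (!s->collector_started)
        return false;        /* no collector, so nothing to send to */

    switch (kind) {
    case MSG_BACKEND_START:
    case MSG_BACKEND_STOP:
        return true;         /* reported whenever the collector runs */
    case MSG_TABLE_STATS:
        return s->send_table_stats;
    case MSG_QUERYSTRING:
        return s->send_querystring;
    }
    return false;
}
```

Checking the postmaster-level flag first means the per-backend switches have no effect when no collector is running, while start/stop events ignore the per-backend switches entirely.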

#8 Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#7)
Re: Configuration of statistical views

Bruce Momjian <pgman@candle.pha.pa.us> writes:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

"On by default", sure. "On all the time", I'm not sold on.

So we will have GUCs for stats and query string. Fine. Set query
string on by default and stats off by default. Good.

But anyway, we seem to be converging on the conclusion that setting
up a GUC variable will do fine, at least until there is definite
evidence that it won't.

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

And another one to control whether the daemon is even running. OK.


#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: Configuration of statistical views

Jan Wieck <JanWieck@yahoo.com> writes:

backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

I don't consider backend start/stop messages to be critical,
although we get some complaints already about connection
slowness - well, this is somewhere in the microseconds. And
it'd be a little messy because the start message is sent by
the backend while the stop message is sent by the postmaster.
So where exactly to put it?

This is exactly why I think they should be sent unconditionally.
It doesn't matter if a particular backend turns its reporting on and
off while it runs (I hope), but I'd think the stats collector would
get confused if it saw, say, a start and no stop message for a
particular backend.

OTOH, given that we need to treat the transmission channel as
unreliable, it would be a bad idea anyway if the stats collector got
seriously confused by not seeing the start or the stop message.

regards, tom lane

#10 Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#7)
Re: Configuration of statistical views

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

"On by default", sure. "On all the time", I'm not sold on.

But anyway, we seem to be converging on the conclusion that setting
up a GUC variable will do fine, at least until there is definite
evidence that it won't.

Up to now, only three full-time PG developers have spoken up.
Maybe someone else would like to comment on it too and hasn't
had the time yet. Let's be a little patient.

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

I don't consider backend start/stop messages to be critical,
although we get some complaints already about connection
slowness - well, this is somewhere in the microseconds. And
it'd be a little messy because the start message is sent by
the backend while the stop message is sent by the postmaster.
So where exactly to put it?

Jan


#11 Jan Wieck
JanWieck@Yahoo.com
In reply to: Bruce Momjian (#8)
Re: Configuration of statistical views

Bruce Momjian wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

If somebody wants to see an application's querystring (at
least the first 512 bytes) just in case something goes wrong
and the client hangs, he'd have to run querystring reporting
all the time either way.

Agreed. That should be on all the time.

"On by default", sure. "On all the time", I'm not sold on.

So we will have GUCs for stats and query string. Fine. Set query
string on by default and stats off by default. Good.

But anyway, we seem to be converging on the conclusion that setting
up a GUC variable will do fine, at least until there is definite
evidence that it won't.

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

And another one to control whether the daemon is even running. OK.

Forcing the other two to stay off if no daemon is present.

Jan


#12 Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#9)
Re: Configuration of statistical views

Tom Lane wrote:

Jan Wieck <JanWieck@yahoo.com> writes:

backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

I don't consider backend start/stop messages to be critical,
although we get some complaints already about connection
slowness - well, this is somewhere in the microseconds. And
it'd be a little messy because the start message is sent by
the backend while the stop message is sent by the postmaster.
So where exactly to put it?

This is exactly why I think they should be sent unconditionally.
It doesn't matter if a particular backend turns its reporting on and
off while it runs (I hope), but I'd think the stats collector would
get confused if it saw, say, a start and no stop message for a
particular backend.

OTOH, given that we need to treat the transmission channel as
unreliable, it would be a bad idea anyway if the stats collector got
seriously confused by not seeing the start or the stop message.

Hmmm - that's a good point. Right now, the collector is
totally lax about all of that. Missing start packet - no
problem, we create the backend slot on the fly. Missing stats
packet - well, the counters aren't 100% correct, so be it.
But OTOH a missing stop packet causes it to remember the dead
backend for the postmaster's lifetime, unless a PID wraparound
happens to fix it someday. Maybe it should periodically (every
10 minutes or even longer) check with a zero-kill (kill() with
signal 0) whether all the backends it knows about are really
alive.
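
A minimal sketch of such a periodic liveness sweep, assuming a POSIX system. The `BackendSlot` type and both function names are hypothetical; only `kill()` with signal 0 is a real call, and it delivers no signal while still performing the existence and permission checks.

```c
#define _POSIX_C_SOURCE 200809L  /* expose kill() under strict C modes */

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical per-backend slot kept by the collector. */
typedef struct {
    pid_t pid;
    bool  in_use;
} BackendSlot;

/* Probe a PID with signal 0 to see whether the process still exists. */
bool backend_is_alive(pid_t pid)
{
    if (pid <= 0)
        return false;   /* 0 and negative values address process groups */
    if (kill(pid, 0) == 0)
        return true;
    /* EPERM: the process exists but we are not allowed to signal it. */
    return errno == EPERM;
}

/* Free every slot whose backend has gone away; returns slots freed. */
int reap_dead_backends(BackendSlot *slots, int nslots)
{
    int freed = 0;

    for (int i = 0; i < nslots; i++) {
        if (slots[i].in_use && !backend_is_alive(slots[i].pid)) {
            slots[i].in_use = false;
            freed++;
        }
    }
    return freed;
}
```

The sweep cannot tell a live backend from an unrelated process that has been handed a recycled PID, so a slot could still survive longer than it should after a PID wraparound.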

Jan


#13 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#7)
Re: Configuration of statistical views

Tom Lane writes:

Probably there need to be at least 2 variables: (a) a PGC_POSTMASTER
variable that controls whether the stats collector is even started,
and (b) PGC_USERSET variable(s) that enable a particular backend to
send particular kinds of data to the collector. Note that, for example,
backend start/stop events probably need to be reported whenever the
postmaster variable is set, even if all the USERSET variables are off.

I'm not familiar with the kinds of statistics that are supposed to be
gathered here, but I suppose their usefulness would be greatly increased
if they were gathered across all data/actions, not only the ones that the
users turned them on for. So I think ordinary users have no business
controlling these settings.

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#13)
Re: Configuration of statistical views

Peter Eisentraut <peter_e@gmx.net> writes:

I'm not familiar with the kinds of statistics that are supposed to be
gathered here, but I suppose their usefulness would be greatly increased
if they were gathered across all data/actions, not only the ones that the
users turned them on for. So I think ordinary users have no business
controlling these settings.

Okay, the per-backend GUC variables should be SUSET instead of USERSET.
I don't have a problem with that ...

regards, tom lane
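
The difference between the three setting levels mentioned in this thread comes down to who may change a value once the server is running. A rough sketch, assuming simplified semantics: the enum below only mirrors the POSTMASTER, SUSET and USERSET levels named above, and the function is hypothetical rather than actual GUC code.

```c
#include <stdbool.h>

/* Hypothetical mirror of the three setting levels discussed above. */
typedef enum {
    CTX_POSTMASTER,  /* fixed when the server starts */
    CTX_SUSET,       /* changeable in a session by superusers only */
    CTX_USERSET      /* changeable in a session by any user */
} SettingContext;

/* May the current session change a setting of this level with SET? */
bool may_change_in_session(SettingContext ctx, bool is_superuser)
{
    switch (ctx) {
    case CTX_POSTMASTER:
        return false;
    case CTX_SUSET:
        return is_superuser;
    case CTX_USERSET:
        return true;
    }
    return false;
}
```

Moving the per-backend reporting switches from the last level to the middle one keeps ordinary users from turning collection off for their own sessions, which is the concern raised in #13.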