keeping track of connections
I would love a way to keep track of the connections/attempted
connections to the postmaster. I'm thinking that when the postmaster
accept()s a connection, it can just insert a record into a table
(system catalog or not) with the information, which can be updated
after the authentication succeeds/fails or whatnot.
Something like 'smbstatus' for the Samba system.
So, my question is: how should I go about doing this? Should I look
into SPI, which I know nothing about? Or what? I don't think the
catalog cache stuff needs to be changed; it isn't as if this info
needs to be immediately accessible.
Good question. Postmaster does not have access to the system tables, so
it can't access them. You could add a debug option to show it in the
server logs, or add it to the -d2 debug option that already shows SQL
statements.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
On Wed, 3 June 1998, at 00:11:01, Bruce Momjian wrote:
Good question. Postmaster does not have access to the system tables, so
it can't access them. You could add a debug option to show it in the
server logs, or add it to the -d2 debug option that already shows SQL
statements.
How about something like this: a pool of shared memory where this
information is stored, and then a view which calls a set of functions
to return the information from the shared memory?
Apache does something similar.
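For concreteness, here is a rough sketch in C of what such a shared-memory
pool might look like. All names and sizes are invented for illustration;
this is not actual PostgreSQL code:

#include <time.h>
#include <netinet/in.h>

#define MAX_BACKENDS 64            /* invented limit */

typedef enum { CONN_ACCEPTED, CONN_AUTH_OK, CONN_AUTH_FAILED } ConnState;

typedef struct ConnStatusSlot
{
    int            pid;            /* backend pid, 0 means slot free */
    struct in_addr client_addr;    /* remote address from accept() */
    char           user[32];       /* postgres user, once known */
    char           database[32];   /* database name, once known */
    ConnState      state;          /* authentication progress */
    time_t         conn_time;      /* when the connection arrived */
} ConnStatusSlot;

/* One fixed-size table of these would be carved out of shared memory;
   the postmaster fills a slot on accept(), the backend updates it as
   authentication proceeds, and the "view" functions just read slots. */
typedef struct ConnStatusTable
{
    ConnStatusSlot slots[MAX_BACKENDS];
} ConnStatusTable;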
Brett M writes:
On Wed, 3 June 1998, at 00:11:01, Bruce Momjian wrote:
Good question. Postmaster does not have access to the system tables, so
it can't access them. You could add a debug option to show it in the
server logs, or add it to the -d2 debug option that already shows SQL
statements.

How about something like this: a pool of shared memory where this
information is stored, and then a view which calls a set of functions
to return the information from the shared memory?

Apache does something similar.
I am curious, what is it you are trying to accomplish with this? Are you
trying to build a persistent log that you can query later for billing
or load management/capacity planning information? Are you trying to monitor
login attempts for security auditing? Are you trying to catch logins in
real time for some sort of middleware integration?

Here we are discussing solutions, but I don't even know what the problem
is. So, please describe what is needed in terms of requirements/functionality.
Thanks
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
On Wed, 3 June 1998, at 01:05:17, David Gould wrote:
I am curious, what is it you are trying to accomplish with this? Are you
trying to build a persistent log that you can query later for billing
or load management/capacity planning information? Are you trying to monitor
login attempts for security auditing? Are you trying to catch logins in
real time for some sort of middleware integration?
The problem is that when I do a process listing for the postgres user,
I see many backends. There's no (convenient) way to see what those
backends are doing, what db they're connected to or the remote
host/postgres user.
My required functionality is this: a list of all backends and
connection details. IP, queries issued, listens/notifications
requested/served, bytes transferred, postgres user, db, current query,
client version, etc.
What problem am I trying to solve? It is purely a desire for this
information. I also feel it will help me debug problems. It would be
nice to track down my clients that are now failing because of password
authentication, but I do admit that this would not help much.
What I shall be doing is hacking libpq to report the name of the
process and related information like environment when connecting to a
database. This would let me track down those programs. As it is, I
have programs failing, and I don't know which ones. Obviously they
aren't very crucial, but it would be nice to know how much more it is
than me typing 'psql' on the host and expecting to connect.
Obviously, this is unrelated. But it is purely a desire for
information. The more info the better. The debug log is quite
heinous when trying to figure out what's going on, especially with
lots of connections.
On another unrelated note, the postmaster has been dying lately,
leaving children hanging about. I thought something might be
corrupted (disk full at one point) so I did a dump/reload. We'll see
what happens.
Call it a feature.
Brett M writes:
How about something like this: a pool of shared memory where this
information is stored, and then a view which calls a set of functions
to return the information from the shared memory?

Apache does something similar.
Yes, that would work. Are you looking for something to show current
backend status? What type of info would be in there?
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Brett McCormick writes:
The problem is that when I do a process listing for the postgres user,
I see many backends. There's no (convenient) way to see what those
backends are doing, what db they're connected to or the remote
host/postgres user.

My required functionality is this: a list of all backends and
connection details. IP, queries issued, listens/notifications
requested/served, bytes transferred, postgres user, db, current query,
client version, etc.
That's a lot of info. One solution for database and username would be
to modify argv[1] and argv[2] for the postgres backend so it shows this
information on the ps command line. As long as these args are already
used as part of startup (and they are when started under the
postmaster), we could set argv to whatever values we are interested in,
and clear the rest of them so the output would look nice.
This would be easy to do, and I would be glad to do it.
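The trick works roughly like this (a minimal sketch, assuming the usual
Unix behavior that ps(1) reads the process's argv strings; the function
name and display format here are invented):

#include <stdio.h>
#include <string.h>

static void
set_ps_status(int argc, char *argv[], const char *user, const char *db)
{
    /* argv strings are laid out contiguously; find the writable area */
    char  *start = argv[0];
    char  *end = argv[argc - 1] + strlen(argv[argc - 1]);
    size_t avail = end - start;          /* at least 1 in practice */
    char   buf[256];

    snprintf(buf, sizeof(buf), "postgres: %s %s", user, db);

    memset(start, 0, avail);             /* clear the old arguments */
    strncpy(start, buf, avail - 1);      /* what ps will now display */
}

The one hard rule is that the new text must fit in the space the
original arguments occupied; anything longer gets truncated.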
What problem am I trying to solve? It is purely a desire for this
information. I also feel it will help me debug problems. It would be
nice to track down my clients that are now failing because of password
authentication, but I do admit that this would not help much.
I think you need a log entry for that, and it would be a good idea.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Brett McCormick <brett@work.chicken.org> writes:
The problem is that when I do a process listing for the postgres user,
I see many backends. There's no (convenient) way to see what those
backends are doing, what db they're connected to or the remote
host/postgres user.

My required functionality is this: a list of all backends and
connection details. IP, queries issued, listens/notifications
requested/served, bytes transferred, postgres user, db, current query,
client version, etc.
....
Can backend monitoring be compatible with one or more extant
monitoring techniques?
1. syslog
2. HTML (like Apache's real time status)
3. SNMP/SMUX/AgentX
Can backend monitoring be compatible with one or more extant
monitoring techniques?

1. syslog
2. HTML (like Apache's real time status)
3. SNMP/SMUX/AgentX
Oooh. An SNMP agent for Postgres. That would be slick...
- Tom
On Wed, 3 Jun 1998, Hal Snyder wrote:
Can backend monitoring be compatible with one or more extant
monitoring techniques?

1. syslog
2. HTML (like Apache's real time status)
I like this method the best...it makes it easier for clients to
monitor as well, without having access to the machines...but does it pose
any security implications?
Hi,
Brett McCormick writes:
The problem is that when I do a process listing for the postgres user,
I see many backends. There's no (convenient) way to see what those
backends are doing, what db they're connected to or the remote
host/postgres user.
Perhaps a wild guess ...
Massimo had a patch, which added the pid in the first field of the
debug output (and I guess a timestamp). So you can easily
sort/grep/trace the debug output.
Perhaps this would help and should be really easy.
BTW., I think this feature is so neat, it should be integrated even
if it doesn't solve *your* problem ;-)
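As a sketch of the idea (assuming the patch works roughly as described
above; the function below is illustrative, not Massimo's actual code),
every debug line gets a pid/timestamp prefix:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

static void
debug_line(const char *msg)
{
    char   stamp[32];
    time_t now = time(NULL);

    strftime(stamp, sizeof(stamp), "%y%m%d.%H%M%S", localtime(&now));
    fprintf(stderr, "%d %s: %s\n", (int) getpid(), stamp, msg);
}

Output from thirty interleaved backends can then be pulled apart with a
simple grep on the pid field.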
Ciao
Ulrich
Ulrich Voss \ \ / /__ / ___|__ _| |
VoCal web publishing \ \ / / _ \| | / _` | |
voss@vocalweb.de \ V / (_) | |__| (_| | |
http://www.vocalweb.de \_/ \___/ \____\__,_|_|
Tel: (++49) 203-306-1560 web publishing
Ulrich Voss writes:
Massimo had a patch, which added the pid in the first field of the
debug output (and I guess a timestamp). So you can easily
sort/grep/trace the debug output.

Perhaps this would help and should be really easy.
BTW., I think this feature is so neat, it should be integrated even
if it doesn't solve *your* problem ;-)
This is very, very helpful when trying to debug interactions between backends
too. For example, if something blows up in the lock manager this can give
a record of who did what to whom, and when.
Great idea.
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
Hal Snyder writes:
Can backend monitoring be compatible with one or more extant
monitoring techniques?

1. syslog
2. HTML (like Apache's real time status)
3. SNMP/SMUX/AgentX
In Illustra, we use (gasp) SQL for this.
select * from procs;
proc_pid |proc_xid |proc_database |proc_locktab |proc_locktid |proc_locktype |proc_prio |proc_licenseid |proc_status  |proc_user |proc_host            |proc_display |proc_spins |proc_buffers
---------+---------+--------------+-------------+-------------+--------------+----------+---------------+-------------+----------+---------------------+-------------+-----------+------------
4787     |0        |58.201e       |tables       |(7,0)        |R             |0         |0              |lock wait    |miadmin   |warbler.illustra.com |/dev/pts/4   |[]         |[]
3997     |0        |58.201e       |-            |(-1,0)       |              |0         |0              |client input |miadmin   |warbler.illustra.com |/dev/pts/11  |[]         |[]
29597    |1320638  |58.201e       |-            |(-1,0)       |              |0         |0              |running *    |miadmin   |warbler.illustra.com |/dev/pts/5   |[]         |[]
4790     |1320646  |58.7          |-            |(-1,0)       |              |0         |0              |running *    |miadmin   |warbler.illustra.com |/dev/pts/4   |[6]        |[]
"procs" is a pseudo-table that is generated on the fly from the process data
structures in the shared memory when queried. There are also pseudo-tables
for locks and traces and other information.
The advantage of using SQL is that the data can be selected into other tables,
grouped, projected, joined or whatever. The other advantage is that all the
existing clients can take advantage of the data. So if you wanted to write
a graphical status monitor, you could do so quite simply in pgtcl.
Illustra also provides a set of prewritten functions (which are just sql
funcs) to provide convenient access to many kinds of common catalog queries.
I often see posts on this list that overlook the fact that postgresql is
a "relational database system" and also "an SQL system". Relational
systems are meant to be both "complete" and "reflexive". That is, the
query language (SQL) should suffice to do _any_ task needed. And any
meta-information about the system itself should be available and manageable
through the query language.
That is why we have system catalogs describing things like columns, tables,
types, indexes etc. The system maintains its metadata by doing queries and
updates to the catalogs in the same way that a user can query the catalogs.
This reflexivity is the main reason relational systems have such power.
So, whenever you are thinking about managing information related to a
database system, think about using the system itself to do it. Managing
information is what database systems are _for_. That is, if the current
SQL facilities cannot implement your feature, extend the SQL system,
don't invent some other _kind_ of facility.
The observation that Apache provides status in HTML means that the Apache
team _understand_ that *Apache is a web server*. The natural form of
interaction with a web server is HTML.
Postgres is a SQL database server. The natural form of interaction with
a database server is relational queries and tuples.
Sorry if this is a bit of a rant, but I really think we will have a much
better system if we understand what our system _is_ and try to extend it
in ways that make it better at that rather than to let it go all shapeless
and bloated with unrelated features and interfaces.
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
If simplicity worked, the world would be overrun with insects.
Sorry if this is a bit of a rant, but I really think we will have a much
better system if we understand what our system _is_ and try to extend it
in ways that make it better at that rather than to let it go all shapeless
and bloated with unrelated features and interfaces.
I'll wait for this discussion to come down to earth, thanks. :-)
Meaning, wow, that sounds nice, but sounds pretty hard too.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
On Wed, 3 Jun 1998, David Gould wrote:
In Illustra, we use (gasp) SQL for this. "procs" is a pseudo-table that
is generated on the fly from the process data structures in the shared
memory when queried. There are also pseudo-tables for locks and traces
and other information.

The advantage of using SQL is that the data can be selected into other
tables, grouped, projected, joined or whatever. The other advantage is
that all the existing clients can take advantage of the data. So if you
wanted to write a graphical status monitor, you could do so quite simply
in pgtcl.

Illustra also provides a set of prewritten functions (which are just SQL
funcs) to provide convenient access to many kinds of common catalog
queries.
I definitely like this...it keeps us self-contained as far as the
data is concerned, and everyone that is using it knows enough about SQL
(or should) to be able to glean information as required...
What would it take to do this though? The 'postmaster' itself,
unless I've misunderstood a good many of the conversations on this, can't
access the tables themselves, only 'flat files' (re: the password issue),
so it would have to be done in the fork'd process itself. That, IMHO,
would pose a possible (if perhaps inconsequential) problem though...what
if the backend dies? Its 'record' in the proc table wouldn't be removed,
which would be like having our own internal 'process zombies'...
I think this does bear further discussion though...one 'branch' of
this would be to have a dynamic table for 'live' processes, but also one
that contains a history of past ones...?
Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Wed, 3 Jun 1998, Bruce Momjian wrote:
Good question. Postmaster does not have access to the system tables, so
it can't access them. You could add a debug option to show it in the
server logs, or add it to the -d2 debug option that already shows SQL
statements.
Here's one for you...and don't laugh at me, eh? :)
postmaster starts up to listen for connections, and then starts up its own
backend to handle database queries? So, on a quiet system, you would have
two processes running, one postmaster, and one postgres...
basically, the idea is that postmaster can't talk to a table, only
postgres can...so, set up postmaster the same way that any other interface
is set up...connect to a backend and pass its transactions through that
way...
Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
On Wed, 3 Jun 1998, David Gould wrote:
I am curious, what is it you are trying to accomplish with this? Are you
trying to build a persistent log that you can query later for billing
or load management/capacity planning information? Are you trying to monitor
login attempts for security auditing? Are you trying to catch logins in
real time for some sort of middleware integration?

Here we are discussing solutions, but I don't even know what the problem
is. So, please describe what is needed in terms of
requirements/functionality.
I think the uses could be many. Keep track, on a per 'backend'
basis, of max memory used during the life of the process, so that you can
estimate memory requirements/upgrades. Average query times for the
duration of the process? Or maybe even bring it down to 'per query'
logging, so that you know what the query was, how long it took, and what
resources were required? Tie that to a table of processes, maybe with a
timestamp for when the process started up and when it ended.
Then, using a simple query, you could figure out peak times for
processes, or number of processes per hour, or...
Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Marc G. Fournier writes:
Here's one for you...and don't laugh at me, eh? :)

postmaster starts up to listen for connections, and then starts up its own
backend to handle database queries? So, on a quiet system, you would have
two processes running, one postmaster, and one postgres...

basically, the idea is that postmaster can't talk to a table, only
postgres can...so, set up postmaster the same way that any other interface
is set up...connect to a backend and pass its transactions through that
way...
So have the postmaster use the libpq library to open a database
connection and execute queries. Sounds interesting.
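A sketch of what that could look like using the real libpq C API (the
pg_connlog table and the function itself are invented for illustration):

#include <stdio.h>
#include <libpq-fe.h>

static void
log_connection(const char *remote_host, const char *dbname)
{
    PGconn   *conn;
    PGresult *res;
    char      query[256];

    conn = PQconnectdb("dbname=template1");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connlog: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return;
    }

    snprintf(query, sizeof(query),
             "insert into pg_connlog values ('%s', '%s', 'now')",
             remote_host, dbname);
    res = PQexec(conn, query);
    PQclear(res);
    PQfinish(conn);
}

As David points out further down, the catch is that the single-process
postmaster would block on every such query.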
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
that's a really cool idea. I think I'll try that.
On Wed, 3 June 1998, at 18:46:02, The Hermit Hacker wrote:
postmaster starts up to listen for connections, and then starts up its own
backend to handle database queries? So, on a quiet system, you would have
two processes running, one postmaster, and one postgres...

basically, the idea is that postmaster can't talk to a table, only
postgres can...so, set up postmaster the same way that any other interface
is set up...connect to a backend and pass its transactions through that
way...
On Wed, 3 June 1998, at 18:40:10, The Hermit Hacker wrote:
select * from procs;
<stuff deleted>
"procs" is a pseudo-table that is generated on the fly from the process
data structures in the shared memory when queried. There are also
pseudo-tables for locks and traces and other information.
That's exactly what I envision. Probably not what I articulated.
The advantage of using SQL is that the data can be selected into other
tables, grouped, projected, joined or whatever. The other advantage is
that all the existing clients can take advantage of the data. So if you
wanted to write a graphical status monitor, you could do so quite simply
in pgtcl.
Exactly.
Bruce Momjian gently chides:
I wrote:
Sorry if this is a bit of a rant, but I really think we will have a much
better system if we understand what our system _is_ and try to extend it
in ways that make it better at that rather than to let it go all shapeless
and bloated with unrelated features and interfaces.

I'll wait for this discussion to come down to earth, thanks. :-)
Meaning, wow, that sounds nice, but sounds pretty hard too.
Really? Most of the data we need to collect is in the process table, or lock
manager data structure or could be added fairly readily.
So you need a few things:
- parser/planner needs to recognize the special tables and flag them in
the query plan. Easy way to do this is to store catalog and type info
for them in the normal places except that the tables table entry would
have a flag that says "I'm special", and maybe a function oid to the
actual iterator function (see next item).
The idea is that you rewrite the query "select * from procs" into
"select * from pg_pseudo_procs()".
- you then need an iterator function (returns next row per call) for each
fake table. This function reads the data from whatever the in memory
structure is and returns a tuple. That is, to the caller it looks a lot
like heapgetnext() or whatever we call that.
The rest of this, joins, projections, grouping, insert to another table etc
pretty much falls out of the basic functionality of the system for free.
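An illustrative sketch of that second item. Every name here is
hypothetical -- the scan state, the tuple-building helper, and the slot
structure (reusing the ConnStatusTable sketched earlier in the thread)
are stand-ins for whatever the real implementation would use:

typedef struct ProcScanState
{
    ConnStatusTable *table;     /* the shared-memory status table */
    int              index;     /* next slot to examine */
} ProcScanState;

/* Return one tuple per call, NULL when exhausted -- the pseudo-table's
   answer to heapgetnext(). */
HeapTuple
pg_pseudo_procs_getnext(ProcScanState *state)
{
    while (state->index < MAX_BACKENDS)
    {
        ConnStatusSlot *slot = &state->table->slots[state->index++];

        if (slot->pid == 0)
            continue;                   /* free slot, skip it */

        /* form a tuple (pid, user, db, status, ...) from the slot;
           build_proc_tuple() is a hypothetical helper */
        return build_proc_tuple(slot);
    }
    return NULL;                        /* no more backends */
}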
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
Marc G. Fournier writes:
Here's one for you...and don't laugh at me, eh? :)

postmaster starts up to listen for connections, and then starts up its own
backend to handle database queries? So, on a quiet system, you would have
two processes running, one postmaster, and one postgres...

basically, the idea is that postmaster can't talk to a table, only
postgres can...so, set up postmaster the same way that any other interface
is set up...connect to a backend and pass its transactions through that
way...
Ok, can I laugh now?
Seriously, if we are going to have a separate backend to do the table access
(and I agree that this is both necessary and reasonable), why not have it
be a plain ordinary backend like all the others and just connect to it from
the client? Why get the postmaster involved at all?
First, modifying the postmaster to add services has a couple of problems:
- we have to modify the postmaster. This adds code bloat and bugs etc, and
since the same binary is also the backend, it means the backends carry
around extra baggage that only is used in the postmaster.
- more importantly, if the postmaster is busy processing a big select from
a pseudo table or log (well, forwarding results etc), then it cannot also
respond to a new connection request. Unless we multithread the postmaster.
Second, it really isn't required to get the postmaster involved except in
maintaining its portion of the shared memory. Anyone that wants to do
status monitoring can connect in the normal way from a client to a backend
and query the pseudo-tables every second or however often they want. I
imagine an event log in a circular buffer could even be maintained in the
shared memory and made available as a pseudo-table for those who want that
sort of thing.
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
On Wed, 3 June 1998, at 16:38:43, Ulrich Voss wrote:
Massimo had a patch, which added the pid in the first field of the
debug output (and I guess a timestamp). So you can easily
sort/grep/trace the debug output.
I'm looking for a little more than that.
Perhaps this would help and should be really easy.
BTW., I think this feature is so neat, it should be integrated even
if it doesn't solve *your* problem ;-)
There isn't much of a problem, I just would love to have the feature I
mentioned. What are you referring to, the above?
Hi,
On Wed, 3 June 1998, at 16:38:43, Ulrich Voss wrote:
Massimo had a patch, which added the pid in the first field of the
debug output (and I guess a timestamp). So you can easily
sort/grep/trace the debug output.

I'm looking for a little more than that.
OK, but step one is simple, Massimo's patch could possibly be
integrated in two or three hours. And it adds valuable debugging
info.
(Btw., Massimo's patch was the first (and I hope last) very helpful
patch which for obscure reasons never made it into the official
distribution. It had this simple pid/time patch (not in current
cvs), it had a spinlock patch (not in current cvs), a better deadlock
detection (than 6.2.1, not 6.3) and an async listen option (also the
6.4 version will be much better I guess). That's why we still use
6.2.1p6 + Massimo's patch.)
Perhaps this would help and should be really easy.
BTW., I think this feature is so neat, it should be integrated even
if it doesn't solve *your* problem ;-)

There isn't much of a problem, I just would love to have the feature I
mentioned. What are you referring to, the above?
Yeah, fine. Monitoring the backend is wonderful, but the
pid/timestamp addition is simple and useful too.
Thanks again for a great product!
Ulrich
Ulrich Voss \ \ / /__ / ___|__ _| |
VoCal web publishing \ \ / / _ \| | / _` | |
voss@vocalweb.de \ V / (_) | |__| (_| | |
http://www.vocalweb.de \_/ \___/ \____\__,_|_|
Tel: (++49) 203-306-1560 web publishing
Basically I would rename template1 to pg_master and connect postmaster
to this database. Then it would have access to system global sql info.
The problem is: postgres backends would need to do a reconnect to their
own database instead of a connect.
Andreas
Ulrich Voss writes:
it had a spinlock patch (not in current cvs), a better deadlock ...
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Well, uhmmm, yes there is a spinlock patch in the current CVS. Please look
again.
Btw, I am about to update the spinlock patch based on some testing I did to
resolve some of Bruce Momjian's performance concerns. I will post the results
of the testing (which are quite interesting if you are a performance fanatic)
later today, and the patch this weekend.
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
On Wed, 3 June 1998, at 20:29:52, David Gould wrote:
Ok, can I laugh now?

Seriously, if we are going to have a separate backend to do the table access
(and I agree that this is both necessary and reasonable), why not have it
be a plain ordinary backend like all the others and just connect to it from
the client? Why get the postmaster involved at all?

I'm confused, I guess.
First, modifying the postmaster to add services has a couple of problems:
I wasn't quite suggesting this, I think we should just modify the
postmaster to store the information. As you say below, doing queries
is probably bad, shared memory seems like the way to go. I'll assume
we'll use a different block of shared memory than the one currently
used.
Oh, ok. Some suggestions have been made that the postmaster would open a
connection to its own backend to do queries. I was responding to this.
I agree that we should just store the information in shared memory.
do you know how shared memory is currently used? I'm fairly clueless
on this aspect.
The shared memory stores the process table, the lock table, the buffer cache,
and the shared invalidate list, and a couple of other minor things that all
the backends need to know about.
Strangely, the shared memory does not share a copy of the system catalog
cache. This seems like a real misfeature as the catalog data is very useful
to all the backends.
The shared memory is managed by its own allocator. It is not hard to carve
out a block for a new use, the only real trick is to make sure you account
for it when the system starts up so it can get the size right as the shared
memory is not extendable.
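In other words (a tiny sketch with invented names), each consumer of
shared memory reports its size up front, and the total is fixed before
the segment is created:

#include <stddef.h>

/* Reports how much of the segment this subsystem needs. */
size_t
conn_status_shmem_size(void)
{
    return sizeof(ConnStatusTable);   /* the slot table sketched above */
}

/*
 * At startup the postmaster would sum all such requests before creating
 * the segment, something like:
 *
 *   total = buffer_pool_size() + lock_table_size()
 *         + proc_table_size() + conn_status_shmem_size();
 */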
- we have to modify the postmaster. This adds code bloat and bugs etc, and
since the same binary is also the backend, it means the backends carry
around extra baggage that only is used in the postmaster.

the reverse could also be said -- why does the postmaster need the
bloat of a backend?
Well, right now the postmaster and the backend are the same binary. This
has the advantage of keeping them in sync as we make changes, and now with
Bruce's patch we can avoid an exec() on backend startup. Illustra has a
separate backend and postmaster binary. This works too, but they share a
lot of code and sometimes a change in something you thought was only in the
backend will break the postmaster.
- more importantly, if the postmaster is busy processing a big select from
a pseudo table or log (well, forwarding results etc), then it cannot also
respond to a new connection request. Unless we multithread the postmaster.

good point. I think storing this information in shared memory and
accessing it from a view is good -- how do other dbs do this sort of
thing?
Well, it is not really a view, although a view is a good analogy. The term
of art is pseudo-table. That is, a table you generate on the fly. This concept
is very useful as you can use it to read text files or rows from some other
database (think gateways) etc. It is also pretty common. Sybase and Informix
both support system specific pseudo-tables. Illustra supports extendable
access methods where you can plug a set of functions (opentable, getnext,
update, delete, insert etc) into the server and they create a table interface
to whatever datasource you want.
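The shape of such an interface, sketched as a C function table. Names
and the HeapTuple type are placeholders here -- this is neither
Illustra's nor PostgreSQL's actual definition:

typedef struct PseudoTableAM
{
    void      *(*opentable) (const char *name);      /* begin a scan */
    HeapTuple  (*getnext) (void *scan);              /* next row, or NULL */
    int        (*insert) (void *scan, HeapTuple t);  /* optional writes */
    int        (*update) (void *scan, HeapTuple t);
    int        (*delete_row) (void *scan, HeapTuple t);
    void       (*closetable) (void *scan);           /* end the scan */
} PseudoTableAM;

/* Registering one of these for "procs" would make "select * from procs"
   read live process slots instead of a heap file. */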
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software 300 Lakeside Drive Oakland, CA 94612
- A child of five could understand this! Fetch me a child of five.
David Gould writes:
Well, uhmmm, yes there is a spinlock patch in the current CVS. Please look
again.
Sorry. I meant in the current stable release ...
Btw, I am about to update the spinlock patch based on some testing I did to
resolve some of Bruce Momjian's performance concerns. I will post the results
of the testing (which are quite interesting if you are a performance fanatic)
later today, and the patch this weekend.
I hope a patch for 6.3.2 will make its way someday ...
Ciao
Ulrich
Ulrich Voss \ \ / /__ / ___|__ _| |
VoCal web publishing \ \ / / _ \| | / _` | |
voss@vocalweb.de \ V / (_) | |__| (_| | |
http://www.vocalweb.de \_/ \___/ \____\__,_|_|
Tel: (++49) 203-306-1560 web publishing
David Gould writes:
Strangely, the shared memory does not share a copy of the system catalog
cache. This seems like a real misfeature as the catalog data is very useful
to all the backends.
On TODO list. Vadim wants to do this, perhaps for 6.4 (not sure):
* Shared catalog cache, reduce lseek()'s by caching table size in shared area
Well, right now the postmaster and the backend are the same binary. This
has the advantage of keeping them in sync as we make changes, and now with
Bruce's patch we can avoid an exec() on backend startup. Illustra has a
separate backend and postmaster binary. This works too, but they share a
lot of code and sometimes a change in something you thought was only in the
backend will break the postmaster.
Then a good reason not to split them up.
Well, it is not really a view, although a view is a good analogy. The term
of art is pseudo-table. That is, a table you generate on the fly.
<stuff deleted>
Yes, this would be nice, but don't we have more important items on the
TODO list to address?
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
On Thu, 4 Jun 1998, David Gould wrote:
Oh, ok. Some suggestions have been made that the postmaster would open a
connection to its own backend to do queries. I was responding to this.
I agree that we should just store the information in shared memory.
How does one get a history for long term monitoring and statistics
by storing in shared memory?
Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Marc G. Fournier writes:
How does one get a history for long term monitoring and statistics
by storing in shared memory?
My thought was a circular event buffer which could provide short term
history. If someone wanted to store long term history (most sites probably
won't, but I agree it can be useful), they would have an application which
queried the short term history and saved it to whatever long term history
they wanted. Eg:
FOREVER {
    sleep(1);
    insert into long_term_hist
        select * from pg_eventlog
        where event_num > highest_seen_so_far;
}
Obviously some details need to be worked out to make sure no history is
ever lost (if that is important). But the basic mechanism is general and
useful for many purposes.
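A sketch of such a circular buffer in C (sizes and field names invented
here); the sequence numbers are what let a poller ask for "everything
since event_num N" and notice if it has fallen behind:

#include <stdio.h>

#define EVENT_LOG_SIZE 1024             /* invented capacity */

typedef struct Event
{
    unsigned long event_num;            /* monotonically increasing */
    int           pid;                  /* backend that logged it */
    char          text[80];             /* what happened */
} Event;

typedef struct EventLog
{
    unsigned long next_num;             /* next sequence number */
    Event         ring[EVENT_LOG_SIZE]; /* oldest entries overwritten */
} EventLog;

/* Append an event, overwriting the oldest when the ring is full.
   A real version would hold the spinlock protecting the log. */
static void
log_event(EventLog *log, int pid, const char *text)
{
    Event *ev = &log->ring[log->next_num % EVENT_LOG_SIZE];

    ev->event_num = log->next_num++;
    ev->pid = pid;
    snprintf(ev->text, sizeof(ev->text), "%s", text);
}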
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software 300 Lakeside Drive Oakland, CA 94612
- A child of five could understand this! Fetch me a child of five.
Bruce Momjian wrote:
Strangely, the shared memory does not share a copy of the system catalog
cache. This seems like a real misfeature as the catalog data is very useful
to all the backends.

On TODO list. Vadim wants to do this, perhaps for 6.4 (not sure):
* Shared catalog cache, reduce lseek()'s by caching table size in shared area
Yes, for 6.4...
Vadim
Bruce Momjian wrote:
On TODO list. Vadim wants to do this, perhaps for 6.4 (not sure):
* Shared catalog cache, reduce lseek()'s by caching table size in shared area

Yes, for 6.4...
Can you share any other 6.4 plans with us?
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
Ulrich Voss writes:
(Btw., Massimo's patch was the first (and I hope last) very helpful
patch which for obscure reasons never made it into the official
distribution. It had this simple pid/time patch (not in current
cvs), it had a spinlock patch (not in current cvs), a better deadlock
detection (than 6.2.1, not 6.3) and an async listen option (also the
6.4 version will be much better I guess). That's why we still use
6.2.1p6 + Massimo's patch.)
Me too. I'm still using 6.2.1p6 because I didn't find the time to port all
the patches to 6.3. They are almost done except for the lock code, which was
in the meantime modified by Bruce. I hope they will be available before 6.4.
--
Massimo Dal Zotto
+----------------------------------------------------------------------+
| Massimo Dal Zotto e-mail: dz@cs.unitn.it |
| Via Marconi, 141 phone: ++39-461-534251 |
| 38057 Pergine Valsugana (TN) www: http://www.cs.unitn.it/~dz/ |
| Italy pgp: finger dz@tango.cs.unitn.it |
+----------------------------------------------------------------------+
Marc G. Fournier writes:
How does one get a history for long term monitoring and statistics
by storing in shared memory?
Why not simply append history lines to a normal log file? In this way you
don't have the overhead of accessing tables and can do real-time processing
of the data with a simple tail -f on the file.
I use this trick to monitor the log file written by 30 backends and it works
fine for me.
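The whole technique fits in a few lines of C (file name and line format
are arbitrary choices here):

#include <stdio.h>
#include <time.h>
#include <unistd.h>

static void
append_history(const char *event)
{
    FILE   *fp = fopen("/var/log/pgsql_history", "a");
    char    stamp[32];
    time_t  now = time(NULL);

    if (fp == NULL)
        return;
    strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", localtime(&now));
    fprintf(fp, "%s %d %s\n", stamp, (int) getpid(), event);
    fclose(fp);
}

tail -f on that file then gives the real-time view, and a small external
job can roll it into whatever long-term store is wanted.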
--
Massimo Dal Zotto
+----------------------------------------------------------------------+
| Massimo Dal Zotto e-mail: dz@cs.unitn.it |
| Via Marconi, 141 phone: ++39-461-534251 |
| 38057 Pergine Valsugana (TN) www: http://www.cs.unitn.it/~dz/ |
| Italy pgp: finger dz@tango.cs.unitn.it |
+----------------------------------------------------------------------+
Massimo Dal Zotto writes:
Why not simply append history lines to a normal log file? In this way you
don't have the overhead of accessing tables and can do real-time processing
of the data with a simple tail -f on the file.

I use this trick to monitor the log file written by 30 backends and it works
fine for me.
I agree. We have more important items to address.
--
Bruce Momjian | 830 Blythe Avenue
maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
On Mon, 8 Jun 1998, Massimo Dal Zotto wrote:
Me too. I'm still using 6.2.1p6 because I didn't find the time to port all
the patches to 6.3. They are almost done except for the lock code, which was
in the meantime modified by Bruce. I hope they will be available before 6.4.
What are we currently missing?
Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
Massimo Dal Zotto writes:
Why not simply append history lines to a normal log file? In this way you
don't have the overhead of accessing tables and can do real-time processing
of the data with a simple tail -f on the file.

I use this trick to monitor the log file written by 30 backends and it works
fine for me.
I was going to suggest this too, but didn't want to be too much of a
spoilsport.
-dg
David Gould dg@illustra.com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken