contrib/pg_stat_statements

Started by ITAGAKI Takahiroover 17 years ago25 messages
#1ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
2 attachment(s)

Hello,

Postgres has configurations to log statements into server logs, but there
are few convenience ways to summarize the logs as a view. Some external
projects like pgFouine or PQA have the functionality, but they consume
much CPU resources and cannot summarize in real-time -- summarize-time
is longer than write-time of the logs in heavily loads.

I'd like to submit pg_stat_statements contrib module, that counts up
incoming statements in shared memory and summarizes the result as a view.
It is just a statements-version of pg_stat_user_functions.

Information collected by the view are:
- query string
- total_time and cpu_time
- buffer access (gets, reads, writes, local reads and local writes)
- number of rows retrieved or affected
(There is a sample output in sgml in the attached tarball.)

I attach WIP version of the module. auto_explain.patch is also required
because the module uses a new version DefineCustomVariable.
http://archives.postgresql.org/message-id/20081009165157.9BE4.52131E4D@oss.ntt.co.jp
The module allocates fixed share memory at the server start and store
statements statistics in it. On out of memory, least recently used
statements are discarded.

Comments and suggenstions welcome!

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

pgstatstatements.sgmlapplication/octet-stream; name=pgstatstatements.sgmlDownload
pg_stat_statements-1011.tgzapplication/octet-stream; name=pg_stat_statements-1011.tgzDownload
#2Decibel!
decibel@decibel.org
In reply to: ITAGAKI Takahiro (#1)
1 attachment(s)
Re: contrib/pg_stat_statements

On Oct 11, 2008, at 4:05 AM, ITAGAKI Takahiro wrote:

I'd like to submit pg_stat_statements contrib module, that counts up
incoming statements in shared memory and summarizes the result as a
view.
It is just a statements-version of pg_stat_user_functions.

Awesome!

I attach WIP version of the module. auto_explain.patch is also
required
because the module uses a new version DefineCustomVariable.
http://archives.postgresql.org/message-id/
20081009165157.9BE4.52131E4D@oss.ntt.co.jp
The module allocates fixed share memory at the server start and store
statements statistics in it. On out of memory, least recently used
statements are discarded.

How hard would it be to dump this information to a table, or some
other more-permanent form of storage? Of course there would need to
be some means of cleaning that up over time, but if it's a simple
table you can DELETE from, we could put the burden on the users to do
that on occasion (I believe Oracle does something similar). It would
also be good to periodically save everything to the table so that
data wasn't completely lost on a crash.

I'm concerned because ISTM that in a high velocity environment you'd
over-run shared memory pretty quickly if you had a lot of different
queries you were running.

Of course, someone could always just setup a cron job to grab the
stats once a minute, so if this greatly complicates the patch I
wouldn't worry about it.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#3ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Decibel! (#2)
Re: contrib/pg_stat_statements

Decibel! <decibel@decibel.org> wrote:

How hard would it be to dump this information to a table, or some
other more-permanent form of storage? Of course there would need to
be some means of cleaning that up over time, but if it's a simple
table you can DELETE from, we could put the burden on the users to do
that on occasion (I believe Oracle does something similar). It would
also be good to periodically save everything to the table so that
data wasn't completely lost on a crash.

I had tried to use a normal table for store stats information,
but several acrobatic hacks are needed to keep performance.

We need to avoid using normal UPDATEs to increment counters
because it requires row-level exclusive locks and kills concurrency.
My idea was modifying heap tuples directly in pages:

buffer = ReadBuffer(stats_rel, blknum);
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
htup = PageGetItem(BufferGetPage(buffer), itemid);
statobj = ((char *) htup + htup->t_hoff);
statobj->calls++;
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

It seemed to work in my test, but of course it is too dangerous.
(If we had supported ISAM-like storage engine, we might use it here.)

Another idea is saving stats information into a file at the stop of server.
Server crashes are still problem, but we keep statistics after server restart.
However, there is no callback at the end of server for plugins.
_PG_fini is not called at shutdown; it is only called at re-LOAD.

Hmmm... can I use on_shmem_exit() for the purpose?
i.e,
on_shmem_exit( save_stats_to_file_if_i_am_postmaster )

I'm concerned because ISTM that in a high velocity environment you'd
over-run shared memory pretty quickly if you had a lot of different
queries you were running.

It might be good to ignore queries using simple protocol, but I'm
not sure how to distinguish them from extended or prepared queries.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#4Alvaro Herrera
alvherre@commandprompt.com
In reply to: ITAGAKI Takahiro (#3)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

We need to avoid using normal UPDATEs to increment counters
because it requires row-level exclusive locks and kills concurrency.
My idea was modifying heap tuples directly in pages:

buffer = ReadBuffer(stats_rel, blknum);
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
htup = PageGetItem(BufferGetPage(buffer), itemid);
statobj = ((char *) htup + htup->t_hoff);
statobj->calls++;
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

It seemed to work in my test, but of course it is too dangerous.
(If we had supported ISAM-like storage engine, we might use it here.)

heap_inplace_update?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Alvaro Herrera (#4)
Re: contrib/pg_stat_statements

Alvaro Herrera <alvherre@commandprompt.com> wrote:

ITAGAKI Takahiro wrote:

My idea was modifying heap tuples directly in pages:

heap_inplace_update?

It writes WAL. I don't think we need WAL here,
and it is enough to write out pages every checkpoints.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#6Vladimir Sitnikov
sitnikov.vladimir@gmail.com
In reply to: ITAGAKI Takahiro (#3)
Re: contrib/pg_stat_statements

Decibel! <decibel@decibel.org> wrote:

I had tried to use a normal table for store stats information,
but several acrobatic hacks are needed to keep performance.

I guess it is not really required to synchronize the stats into some
physical table immediately.
I would suggest keeping all the data in memory, and having a job that
periodically dumps snapshots into physical tables (with WAL etc).
In that case one would be able to compute database workload as a difference
between two given snapshots. From my point of view, it does not look like a
performance killer to have snapshots every 15 minutes. It does not look too
bad to get the statistics of last 15 minutes lost in case of database crash
either.

Regards,
Vladimir Sitnikov

#7Magnus Hagander
magnus@hagander.net
In reply to: ITAGAKI Takahiro (#1)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

Hello,

I'd like to submit pg_stat_statements contrib module, that counts up
incoming statements in shared memory and summarizes the result as a view.
It is just a statements-version of pg_stat_user_functions.

Sounds very good, but why contrib and not along with the rest of the
stats stuff in the main backend (with an on/off switch if the overhead
is high)?

//Magnus

#8ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Magnus Hagander (#7)
Re: contrib/pg_stat_statements

Magnus Hagander <magnus@hagander.net> wrote:

I'd like to submit pg_stat_statements contrib module

Sounds very good, but why contrib and not along with the rest of the
stats stuff in the main backend (with an on/off switch if the overhead
is high)?

That's because it could be done as a contrib module ;-)
(Yeah! Postgres is a very extensible database!)

I'm not sure what should be in the main and what should not.
Why is pg_freespacemap still in contrib?

I think the module is not mature yet and users will want to
modify it to their satisfaction. It will be easier if the
module is separated from the core codes. (The same can be
said for contrib/auto_explan, which is my other proposal.)

Now I'm working on storing statistics into disks on server
shutdown. If it is impossible unless the module is in core,
I would change my policy...

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#9Magnus Hagander
magnus@hagander.net
In reply to: ITAGAKI Takahiro (#8)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

Magnus Hagander <magnus@hagander.net> wrote:

I'd like to submit pg_stat_statements contrib module

Sounds very good, but why contrib and not along with the rest of the
stats stuff in the main backend (with an on/off switch if the overhead
is high)?

That's because it could be done as a contrib module ;-)
(Yeah! Postgres is a very extensible database!)

It can also go as a pgfoundry project, no? ;-)

Given that making it a contrib module instead of core will significantly
decrease the exposure, I personally consider that a bad thing..

I'm not sure what should be in the main and what should not.
Why is pg_freespacemap still in contrib?

I don't know, why is it? :-)

I think the module is not mature yet and users will want to
modify it to their satisfaction. It will be easier if the
module is separated from the core codes. (The same can be
said for contrib/auto_explan, which is my other proposal.)

Yes, that is a reasonable argument for keeping it in contrib. Or perhaps
even better, on pgFoundry until it is ready.

//Magnus

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Magnus Hagander (#9)
Re: contrib/pg_stat_statements

Magnus Hagander wrote:

ITAGAKI Takahiro wrote:

Magnus Hagander <magnus@hagander.net> wrote:
I'm not sure what should be in the main and what should not.
Why is pg_freespacemap still in contrib?

I don't know, why is it? :-)

I guess that was a joke, given the smiley, but I'll bite:

1. pg_freespacemap is intimately tied to the implementation of the FSM.
It was completely overhauled in the FSM rewrite, and there's no
guarantee it won't change again in the future to reflect changes in the
FSM implementation. And if it does, we won't bother to provide an easy
upgrade path, so existing queries using it will brake. For the same
reason it's in contrib and not pgFoundry: it's very dependent on the
server version.

2. It's not useful for the general public.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#11Martin Pihlak
martin.pihlak@gmail.com
In reply to: ITAGAKI Takahiro (#1)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

I'd like to submit pg_stat_statements contrib module, that counts up
incoming statements in shared memory and summarizes the result as a view.
It is just a statements-version of pg_stat_user_functions.

Nice work! There is one feature I'd like to request -- we need to be able
to also track nested statements. This would greatly simplify diagnosing
performance problems in complex stored procedures. Perhaps the GUC
track_statements could be made an enum - none, top, all?

PS. Similar feature would be useful for auto-explain as well.

regards,
Martin

#12ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: ITAGAKI Takahiro (#8)
pg_stat_statements in core

I wrote:

Now I'm working on storing statistics into disks on server
shutdown. If it is impossible unless the module is in core,
I would change my policy...

I reconsidered this part and found that pg_stat_statements needs to be
in core to write stats in file on server shutdown because:

1. shared_preload_libraries are only loaded in postmaster and backends.
2. Postmaster should write stats file because it is the only process
that knows when to shutdown.
3. Postmaster cannot attach shared memory which contains statement stats
because it cannot acquire LWLocks at all.

My next plan is put pg_stat_statements into core and treat it just like as
freespacemap in 8.3 or before. Stats file is read by bootstrap process and
written down by bgwriter process. All of the codes in pg_stat_statements
will be merged into pgstat.c and some of interface functions will be called
from ExecutorRun instead of using ExecutorRun_hook.

If no objections, I'll go ahead in the direction.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#13Decibel!
decibel@decibel.org
In reply to: Vladimir Sitnikov (#6)
1 attachment(s)
Re: contrib/pg_stat_statements

On Oct 17, 2008, at 4:30 AM, Vladimir Sitnikov wrote:

Decibel! <decibel@decibel.org> wrote:

I had tried to use a normal table for store stats information,
but several acrobatic hacks are needed to keep performance.
I guess it is not really required to synchronize the stats into
some physical table immediately.
I would suggest keeping all the data in memory, and having a job
that periodically dumps snapshots into physical tables (with WAL etc).
In that case one would be able to compute database workload as a
difference between two given snapshots. From my point of view, it
does not look like a performance killer to have snapshots every 15
minutes. It does not look too bad to get the statistics of last 15
minutes lost in case of database crash either.

Yeah, that's exactly what I had in mind. I agree that trying to write
to a real table for every counter update would be insane.

My thought was to treat the shared memory area as a buffer of stats
counters. When you go to increment a counter, if it's not in the
buffer then you'd read it out of the table, stick it in the buffer
and increment it. As items age, they'd get pushed out of the buffer.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: ITAGAKI Takahiro (#12)
Re: pg_stat_statements in core

ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

Now I'm working on storing statistics into disks on server
shutdown. If it is impossible unless the module is in core,
I would change my policy...

I'm really not happy with a proposal to put such a feature in core.
Once it's in core we'll have pretty strong backwards-compatibility
constraints to meet, and I don't think you are anywhere near being
able to demonstrate that you have a solid API that won't require
changes. It needs to be a contrib or pgfoundry package for awhile,
to shake out feature issues in a context where users will understand
the API is subject to change. (As an example of why I'm skittish
about this: just a few days ago someone was complaining about the
plans to get rid of pg_autovacuum, despite the fact that it's been
clearly documented as subject to change or removal since day one.
People expect stability in core features.)

It seems to me that all you're really missing is a shutdown hook
someplace, which would be a reasonable core addition.

regards, tom lane

#15ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#14)
Re: pg_stat_statements in core

Tom Lane <tgl@sss.pgh.pa.us> wrote:

It needs to be a contrib or pgfoundry package for awhile,
to shake out feature issues in a context where users will understand
the API is subject to change.

Agreed. That is what I want to do at first.

It seems to me that all you're really missing is a shutdown hook
someplace, which would be a reasonable core addition.

A shutdown hook is probably not needed because I can use proc_exit()
or shmem_exit() for the purpose.

- On Windows, shared_preload_libraries will be loaded not only
by postmaster and backends but also by auxiliary processes.
(I think it is also required by rmgr hooks.)

- Add a function BgWriterIAm() to export a variable am_bg_writer.
It is used for dumping stats data on shutdown as following:
on_proc_exit( dump_stats_if_BgWriterIAm );

- It might be cleaner to add a callback function on_startup().
Registered functions are called by startup process after REDO.
It is used for loading stats data, but is not always necessary
because the first user of the module can load the data.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#16ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Martin Pihlak (#11)
Re: contrib/pg_stat_statements

Martin Pihlak <martin.pihlak@gmail.com> wrote:

ITAGAKI Takahiro wrote:

I'd like to submit pg_stat_statements contrib module

Nice work! There is one feature I'd like to request -- we need to be able
to also track nested statements. This would greatly simplify diagnosing
performance problems in complex stored procedures. Perhaps the GUC
track_statements could be made an enum - none, top, all?

I tried your request, but found it's hard to determine query text
where the executing plan comes. We can get the query text from
ActivePortal->sourceText only for top statements. Stored procedures
doesn't use portals and uses executor directly.

It might be possbile to add a queryText field into QueryDesc
and all of the codes using QueryDesc initialize the field with
their own query texts, but it requires modifications in many
different modules and copying query text might be needed.
I don't want to do it only for this contrib module,
so I'll support only top statements at the moment. Sorry.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#17ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: ITAGAKI Takahiro (#15)
3 attachment(s)
contrib/pg_stat_statements v2

Here is an updated version of contrib/pg_stat_statements module.

I wrote:

A shutdown hook is probably not needed because I can use proc_exit()
or shmem_exit() for the purpose.

I added shutdown hook eventually because MyProc has been reset already
in on_shmem_exit(). A callback that resets MyProc is registered to
on_shmem_exit after shared_preload_libraries are loaded and callbacks
are called in order of LIFO.

Attached files are:

* startup+shutdown_hooks-1027.patch
The patch modifies corecode in the 3 points:
- startup_hook:
Called on server startup by startup process
where LoadFreeSpaceMap() in 8.3 had been called.
- shutdown_hook:
Called on server shutdown by bgwriter
where DumpFreeSpaceMap() in 8.3 had been called.
- shared_preload_libraries are loaded by auxiliary processes:
Windows port requires it.

* pg_stat_statements-1027.tgz
The pg_stat_statements contrib module. It requires patches
startup+shutdown_hooks-1027.patch (above) and auto_explain.patch
( http://archives.postgresql.org/message-id/20081009165157.9BE4.52131E4D@oss.ntt.co.jp ) .

Now it dumps statistics into file at server shutdown and load it
at the next restart. The default path of the saved file is
'global/pg_stat_statements.dump'.

* pgstatstatements.sgml
Documentation of the module.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

pg_stat_statements-1027.tgzapplication/octet-stream; name=pg_stat_statements-1027.tgzDownload
pgstatstatements.sgmlapplication/octet-stream; name=pgstatstatements.sgmlDownload
startup+shutdown_hooks-1027.patchapplication/octet-stream; name=startup+shutdown_hooks-1027.patchDownload
diff -cpr HEAD/src/backend/bootstrap/bootstrap.c startup+shutdown_hooks/src/backend/bootstrap/bootstrap.c
*** HEAD/src/backend/bootstrap/bootstrap.c	Tue Sep 30 19:52:11 2008
--- startup+shutdown_hooks/src/backend/bootstrap/bootstrap.c	Mon Oct 27 13:03:17 2008
*************** typedef struct _IndexList
*** 193,198 ****
--- 193,199 ----
  
  static IndexList *ILHead = NULL;
  
+ startup_hook_type	startup_hook = NULL;
  
  /*
   *	 AuxiliaryProcessMain
*************** AuxiliaryProcessMain(int argc, char *arg
*** 419,424 ****
--- 420,427 ----
  			bootstrap_signals();
  			StartupXLOG();
  			BuildFlatFiles(false);
+ 			if (startup_hook)
+ 				startup_hook();
  			proc_exit(0);		/* startup done */
  
  		case BgWriterProcess:
diff -cpr HEAD/src/backend/postmaster/bgwriter.c startup+shutdown_hooks/src/backend/postmaster/bgwriter.c
*** HEAD/src/backend/postmaster/bgwriter.c	Tue Oct 14 17:06:39 2008
--- startup+shutdown_hooks/src/backend/postmaster/bgwriter.c	Mon Oct 27 13:03:17 2008
*************** static BgWriterShmemStruct *BgWriterShme
*** 141,146 ****
--- 141,148 ----
  /* interval for calling AbsorbFsyncRequests in CheckpointWriteDelay */
  #define WRITES_PER_ABSORB		1000
  
+ shutdown_hook_type	shutdown_hook = NULL;
+ 
  /*
   * GUC parameters
   */
*************** BackgroundWriterMain(void)
*** 396,401 ****
--- 398,405 ----
  			 */
  			ExitOnAnyError = true;
  			/* Close down the database */
+ 			if (shutdown_hook)
+ 				shutdown_hook();
  			ShutdownXLOG(0, 0);
  			/* Normal exit from the bgwriter is here */
  			proc_exit(0);		/* done */
diff -cpr HEAD/src/backend/storage/lmgr/proc.c startup+shutdown_hooks/src/backend/storage/lmgr/proc.c
*** HEAD/src/backend/storage/lmgr/proc.c	Tue Jun 10 03:23:05 2008
--- startup+shutdown_hooks/src/backend/storage/lmgr/proc.c	Mon Oct 27 13:03:17 2008
*************** InitAuxiliaryProcess(void)
*** 383,388 ****
--- 383,392 ----
  	if (MyProc != NULL)
  		elog(ERROR, "you already exist");
  
+ #ifdef EXEC_BACKEND
+ 	process_shared_preload_libraries();
+ #endif
+ 
  	/*
  	 * We use the ProcStructLock to protect assignment and releasing of
  	 * AuxiliaryProcs entries.
diff -cpr HEAD/src/include/storage/ipc.h startup+shutdown_hooks/src/include/storage/ipc.h
*** HEAD/src/include/storage/ipc.h	Thu Apr 17 08:59:40 2008
--- startup+shutdown_hooks/src/include/storage/ipc.h	Mon Oct 27 13:03:17 2008
***************
*** 20,25 ****
--- 20,30 ----
  
  typedef void (*pg_on_exit_callback) (int code, Datum arg);
  
+ typedef void (*startup_hook_type) (void);
+ extern PGDLLIMPORT startup_hook_type startup_hook;
+ typedef void (*shutdown_hook_type) (void);
+ extern PGDLLIMPORT shutdown_hook_type shutdown_hook;
+ 
  /*----------
   * API for handling cleanup that must occur during either ereport(ERROR)
   * or ereport(FATAL) exits from a block of code.  (Typical examples are
#18Hannu Krosing
hannu@2ndQuadrant.com
In reply to: ITAGAKI Takahiro (#16)
Re: contrib/pg_stat_statements

On Mon, 2008-10-27 at 17:00 +0900, ITAGAKI Takahiro wrote:

Martin Pihlak <martin.pihlak@gmail.com> wrote:

ITAGAKI Takahiro wrote:

I'd like to submit pg_stat_statements contrib module

Nice work! There is one feature I'd like to request -- we need to be able
to also track nested statements. This would greatly simplify diagnosing
performance problems in complex stored procedures. Perhaps the GUC
track_statements could be made an enum - none, top, all?

I tried your request, but found it's hard to determine query text
where the executing plan comes.

Are you sure you need query _text_ ?

Why not just aggregate on function OID and reslve OID to text only at
display stage ?

----------------
Hannu

#19ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Hannu Krosing (#18)
Re: contrib/pg_stat_statements

Hannu Krosing <hannu@2ndQuadrant.com> wrote:

Martin Pihlak <martin.pihlak@gmail.com> wrote:

we need to be able to also track nested statements.

I tried your request, but found it's hard to determine query text
where the executing plan comes.

Are you sure you need query _text_ ?
Why not just aggregate on function OID and reslve OID to text only at
display stage ?

He wants to track pieces of functions.
Function OID and *positions in the source text* would be needed.
Per-function tracking is already done in pg_stat_user_functions,
that uses function OIDs.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

#20Martin Pihlak
martin.pihlak@gmail.com
In reply to: ITAGAKI Takahiro (#16)
1 attachment(s)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

It might be possbile to add a queryText field into QueryDesc
and all of the codes using QueryDesc initialize the field with
their own query texts, but it requires modifications in many
different modules and copying query text might be needed.
I don't want to do it only for this contrib module,
so I'll support only top statements at the moment. Sorry.

Indeed, this turned out to be more difficult than I initally thought.
Nevertheless I still think that tracking nested statements is worth the
effort. Attached is a patch that adds sourceText to QueryDesc (should
apply cleanly after auto_explain.patch). This is very WIP, for one thing
it assumes that the source text pointers can be passed around freely.
But is the idea of extending QueryDesc generally acceptable? Is it OK
to make a copy of the query string?

I tested with modified pg_stat_statements (removed toplevel checks),
and much to my delight it didn't crash immediately :) I've only done
some limited testing but a lot of interesting stuff seems to turns out --
for instance we get to see the lookups generated by referential integrity
checks (could help to identify missing indexes on FK fields).

Regards,
Martin

Attachments:

querydesc.patchtext/x-diff; name=querydesc.patchDownload
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***************
*** 1056,1062 **** DoCopy(const CopyStmt *stmt, const char *queryString)
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  											InvalidSnapshot,
! 											dest, NULL, false);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
--- 1056,1062 ----
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  											InvalidSnapshot,
! 											dest, NULL, false, queryString);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 172,178 **** ExplainOneQuery(Query *query, ExplainStmt *stmt, const char *queryString,
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate);
  	}
  }
  
--- 172,178 ----
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate, queryString);
  	}
  }
  
***************
*** 219,225 **** ExplainOneUtility(Node *utilityStmt, ExplainStmt *stmt,
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
--- 219,226 ----
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate,
! 			   const char *queryString)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
***************
*** 237,243 **** ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
  	queryDesc = CreateQueryDesc(plannedstmt,
  								GetActiveSnapshot(), InvalidSnapshot,
  								None_Receiver, params,
! 								stmt->analyze);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
--- 238,244 ----
  	queryDesc = CreateQueryDesc(plannedstmt,
  								GetActiveSnapshot(), InvalidSnapshot,
  								None_Receiver, params,
! 								stmt->analyze, queryString);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
*** a/src/backend/commands/prepare.c
--- b/src/backend/commands/prepare.c
***************
*** 697,703 **** ExplainExecuteQuery(ExecuteStmt *execstmt, ExplainStmt *stmt,
  				pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate);
  		}
  		else
  		{
--- 697,703 ----
  				pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate, queryString);
  		}
  		else
  		{
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***************
*** 309,320 **** postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
  								 snapshot, InvalidSnapshot,
  								 None_Receiver,
! 								 fcache->paramLI, false);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  										snapshot,
  										None_Receiver,
! 										fcache->paramLI);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
--- 309,320 ----
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
  								 snapshot, InvalidSnapshot,
  								 None_Receiver,
! 								 fcache->paramLI, false, fcache->src);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  										snapshot,
  										None_Receiver,
! 										fcache->paramLI, fcache->src);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
***************
*** 1795,1801 **** _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  				qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  										snap, crosscheck_snapshot,
  										dest,
! 										paramLI, false);
  				res = _SPI_pquery(qdesc, fire_triggers,
  								  canSetTag ? tcount : 0);
  				FreeQueryDesc(qdesc);
--- 1795,1802 ----
  				qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  										snap, crosscheck_snapshot,
  										dest,
! 										paramLI, false,
! 										plansource->query_string);
  				res = _SPI_pquery(qdesc, fire_triggers,
  								  canSetTag ? tcount : 0);
  				FreeQueryDesc(qdesc);
*** a/src/backend/tcop/pquery.c
--- b/src/backend/tcop/pquery.c
***************
*** 37,43 **** bool		force_instrument = false;
  static void ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag);
  static void FillPortalStore(Portal portal, bool isTopLevel);
  static uint32 RunFromStore(Portal portal, ScanDirection direction, long count,
  			 DestReceiver *dest);
--- 37,44 ----
  static void ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag,
! 			 const char *sourceText);
  static void FillPortalStore(Portal portal, bool isTopLevel);
  static uint32 RunFromStore(Portal portal, ScanDirection direction, long count,
  			 DestReceiver *dest);
***************
*** 64,70 **** CreateQueryDesc(PlannedStmt *plannedstmt,
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
--- 65,72 ----
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument,
! 				const char *sourceText)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
***************
*** 77,82 **** CreateQueryDesc(PlannedStmt *plannedstmt,
--- 79,85 ----
  	qd->dest = dest;			/* output dest */
  	qd->params = params;		/* parameter values passed into query */
  	qd->doInstrument = force_instrument || doInstrument;	/* instrumentation wanted? */
+ 	qd->sourceText = sourceText;
  
  	/* null these fields until set by ExecutorStart */
  	qd->tupDesc = NULL;
***************
*** 93,99 **** QueryDesc *
  CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
--- 96,103 ----
  CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params,
! 					   const char *sourceText)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
***************
*** 105,110 **** CreateUtilityQueryDesc(Node *utilitystmt,
--- 109,115 ----
  	qd->dest = dest;			/* output dest */
  	qd->params = params;		/* parameter values passed into query */
  	qd->doInstrument = false;	/* uninteresting for utilities */
+ 	qd->sourceText = sourceText;
  
  	/* null these fields until set by ExecutorStart */
  	qd->tupDesc = NULL;
***************
*** 152,158 **** static void
  ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag)
  {
  	QueryDesc  *queryDesc;
  
--- 157,164 ----
  ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag,
! 			 const char *sourceText)
  {
  	QueryDesc  *queryDesc;
  
***************
*** 168,174 **** ProcessQuery(PlannedStmt *plan,
  	 */
  	queryDesc = CreateQueryDesc(plan,
  								GetActiveSnapshot(), InvalidSnapshot,
! 								dest, params, false);
  
  	/*
  	 * Set up to collect AFTER triggers
--- 174,180 ----
  	 */
  	queryDesc = CreateQueryDesc(plan,
  								GetActiveSnapshot(), InvalidSnapshot,
! 								dest, params, false, sourceText);
  
  	/*
  	 * Set up to collect AFTER triggers
***************
*** 504,510 **** PortalStart(Portal portal, ParamListInfo params, Snapshot snapshot)
  											InvalidSnapshot,
  											None_Receiver,
  											params,
! 											false);
  
  				/*
  				 * We do *not* call AfterTriggerBeginQuery() here.	We assume
--- 510,517 ----
  											InvalidSnapshot,
  											None_Receiver,
  											params,
! 											false,
! 											portal->sourceText);
  
  				/*
  				 * We do *not* call AfterTriggerBeginQuery() here.	We assume
***************
*** 1252,1265 **** PortalRunMulti(Portal portal, bool isTopLevel,
  				/* statement can set tag string */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 dest, completionTag);
  			}
  			else
  			{
  				/* stmt added by rewrite cannot set tag */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 altdest, NULL);
  			}
  
  			if (log_executor_stats)
--- 1259,1272 ----
  				/* statement can set tag string */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 dest, completionTag, portal->sourceText);
  			}
  			else
  			{
  				/* stmt added by rewrite cannot set tag */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 altdest, NULL, portal->sourceText);
  			}
  
  			if (log_executor_stats)
*** a/src/include/commands/explain.h
--- b/src/include/commands/explain.h
***************
*** 39,45 **** extern void ExplainOneUtility(Node *utilityStmt, ExplainStmt *stmt,
  				  TupOutputState *tstate);
  
  extern void ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate);
  
  extern void ExplainOneResult(StringInfo str, QueryDesc *queryDesc,
  							 bool analyze, bool verbose);
--- 39,46 ----
  				  TupOutputState *tstate);
  
  extern void ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate,
! 			   const char *queryString);
  
  extern void ExplainOneResult(StringInfo str, QueryDesc *queryDesc,
  							 bool analyze, bool verbose);
*** a/src/include/executor/execdesc.h
--- b/src/include/executor/execdesc.h
***************
*** 44,54 **** typedef struct QueryDesc
--- 44,56 ----
  	DestReceiver *dest;			/* the destination for tuple output */
  	ParamListInfo params;		/* param values being passed in */
  	bool		doInstrument;	/* TRUE requests runtime instrumentation */
+ 	const char *sourceText;		/* Source text for the query */
  
  	/* These fields are set by ExecutorStart */
  	TupleDesc	tupDesc;		/* descriptor for result tuples */
  	EState	   *estate;			/* executor's query-wide state */
  	PlanState  *planstate;		/* tree of per-plan-node state */
+ 
  } QueryDesc;
  
  /* in pquery.c */
***************
*** 57,68 **** extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument);
  
  extern QueryDesc *CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params);
  
  extern void FreeQueryDesc(QueryDesc *qdesc);
  
--- 59,72 ----
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument,
! 				const char *sourceText);
  
  extern QueryDesc *CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params,
! 					   const char *sourceText);
  
  extern void FreeQueryDesc(QueryDesc *qdesc);
  
#21ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Martin Pihlak (#20)
1 attachment(s)
Re: contrib/pg_stat_statements

Martin Pihlak <martin.pihlak@gmail.com> wrote:

Attached is a patch that adds sourceText to QueryDesc.

It worked fine surprisingly :) . Internal and C functions don't use
executor, so we can ignore trivial function calls (ex. operators).
Furthermore, it is ok if QueryDesc doesn't have queryText
because the result is counted up in the upper statement.

But is the idea of extending QueryDesc generally acceptable? Is it OK
to make a copy of the query string?

The only thing I'm worried about is that QueryDesc lives longer than
its queryText. Can I assume it never occurs?

I tested with modified pg_stat_statements (removed toplevel checks),

Stack of counters would be better. The attached is modified to do so.
It might be worth thinking about adding counters that are equivalent
to total_time and self_time in in pg_stat_user_functions.

=# CREATE OR REPLACE FUNCTION plfn(integer) RETURNS bigint AS
$$
DECLARE
i bigint;
BEGIN
SELECT count(*) INTO i FROM pg_class;
SELECT count(*) INTO i FROM pg_class;
SELECT count(*) INTO i FROM generate_series(1, $1);
RETURN i;
END;
$$
LANGUAGE plpgsql;

=# SELECT sum(plfn(10000)) FROM generate_series(1, 100);
=# SELECT query, calls, total_time, rows FROM pg_stat_statements;

query | calls | total_time | rows
-------------------------------------------------------+-------+------------+------
SELECT sum(plfn(10000)) FROM generate_series(1, 100); | 1 | 428 | 1
SELECT count(*) FROM pg_class | 200 | 32 | 200
SELECT count(*) FROM generate_series(1, $1 ) | 100 | 309 | 100
(3 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

pg_stat_statements.capplication/octet-stream; name=pg_stat_statements.cDownload
#22Martin Pihlak
martin.pihlak@gmail.com
In reply to: ITAGAKI Takahiro (#21)
1 attachment(s)
Re: contrib/pg_stat_statements

ITAGAKI Takahiro wrote:

But is the idea of extending QueryDesc generally acceptable? Is it OK
to make a copy of the query string?

The only thing I'm worried about is that QueryDesc lives longer than
its queryText. Can I assume it never occurs?

I just finished validating this -- it seems OK. All of the query strings
that make it to CreateQueryDesc are either pstrdup-ed to Portal, in long
lived memory context or just literals. So no need to make extra copies :)

regards,
Martin

Attachments:

querydesc.patchtext/x-diff; name=querydesc.patchDownload
*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***************
*** 1056,1062 **** DoCopy(const CopyStmt *stmt, const char *queryString)
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  											InvalidSnapshot,
! 											dest, NULL, false);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
--- 1056,1062 ----
  		/* Create a QueryDesc requesting no output */
  		cstate->queryDesc = CreateQueryDesc(plan, GetActiveSnapshot(),
  											InvalidSnapshot,
! 											dest, NULL, false, queryString);
  
  		/*
  		 * Call ExecutorStart to prepare the plan for execution.
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***************
*** 172,178 **** ExplainOneQuery(Query *query, ExplainStmt *stmt, const char *queryString,
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate);
  	}
  }
  
--- 172,178 ----
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, params, stmt, tstate, queryString);
  	}
  }
  
***************
*** 219,225 **** ExplainOneUtility(Node *utilityStmt, ExplainStmt *stmt,
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
--- 219,226 ----
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate,
! 			   const char *queryString)
  {
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
***************
*** 238,244 **** ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
  	queryDesc = CreateQueryDesc(plannedstmt,
  								GetActiveSnapshot(), InvalidSnapshot,
  								None_Receiver, params,
! 								stmt->analyze);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
--- 239,245 ----
  	queryDesc = CreateQueryDesc(plannedstmt,
  								GetActiveSnapshot(), InvalidSnapshot,
  								None_Receiver, params,
! 								stmt->analyze, queryString);
  
  	INSTR_TIME_SET_CURRENT(starttime);
  
*** a/src/backend/commands/prepare.c
--- b/src/backend/commands/prepare.c
***************
*** 697,703 **** ExplainExecuteQuery(ExecuteStmt *execstmt, ExplainStmt *stmt,
  				pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate);
  		}
  		else
  		{
--- 697,703 ----
  				pstmt->intoClause = execstmt->into;
  			}
  
! 			ExplainOnePlan(pstmt, paramLI, stmt, tstate, queryString);
  		}
  		else
  		{
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
***************
*** 309,320 **** postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
  								 snapshot, InvalidSnapshot,
  								 None_Receiver,
! 								 fcache->paramLI, false);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  										snapshot,
  										None_Receiver,
! 										fcache->paramLI);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
--- 309,320 ----
  		es->qd = CreateQueryDesc((PlannedStmt *) es->stmt,
  								 snapshot, InvalidSnapshot,
  								 None_Receiver,
! 								 fcache->paramLI, false, fcache->src);
  	else
  		es->qd = CreateUtilityQueryDesc(es->stmt,
  										snapshot,
  										None_Receiver,
! 										fcache->paramLI, fcache->src);
  
  	/* We assume we don't need to set up ActiveSnapshot for ExecutorStart */
  
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
***************
*** 1795,1801 **** _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
  				qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  										snap, crosscheck_snapshot,
  										dest,
! 										paramLI, false);
  				res = _SPI_pquery(qdesc, fire_triggers,
  								  canSetTag ? tcount : 0);
  				FreeQueryDesc(qdesc);
--- 1795,1802 ----
  				qdesc = CreateQueryDesc((PlannedStmt *) stmt,
  										snap, crosscheck_snapshot,
  										dest,
! 										paramLI, false,
! 										plansource->query_string);
  				res = _SPI_pquery(qdesc, fire_triggers,
  								  canSetTag ? tcount : 0);
  				FreeQueryDesc(qdesc);
*** a/src/backend/tcop/pquery.c
--- b/src/backend/tcop/pquery.c
***************
*** 37,43 **** Portal		ActivePortal = NULL;
  static void ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag);
  static void FillPortalStore(Portal portal, bool isTopLevel);
  static uint32 RunFromStore(Portal portal, ScanDirection direction, long count,
  			 DestReceiver *dest);
--- 37,44 ----
  static void ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag,
! 			 const char *sourceText);
  static void FillPortalStore(Portal portal, bool isTopLevel);
  static uint32 RunFromStore(Portal portal, ScanDirection direction, long count,
  			 DestReceiver *dest);
***************
*** 64,70 **** CreateQueryDesc(PlannedStmt *plannedstmt,
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
--- 65,72 ----
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument,
! 				const char *sourceText)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
***************
*** 77,82 **** CreateQueryDesc(PlannedStmt *plannedstmt,
--- 79,85 ----
  	qd->dest = dest;			/* output dest */
  	qd->params = params;		/* parameter values passed into query */
  	qd->doInstrument = doInstrument;	/* instrumentation wanted? */
+ 	qd->sourceText = sourceText;
  
  	/* null these fields until set by ExecutorStart */
  	qd->tupDesc = NULL;
***************
*** 93,99 **** QueryDesc *
  CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
--- 96,103 ----
  CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params,
! 					   const char *sourceText)
  {
  	QueryDesc  *qd = (QueryDesc *) palloc(sizeof(QueryDesc));
  
***************
*** 105,110 **** CreateUtilityQueryDesc(Node *utilitystmt,
--- 109,115 ----
  	qd->dest = dest;			/* output dest */
  	qd->params = params;		/* parameter values passed into query */
  	qd->doInstrument = false;	/* uninteresting for utilities */
+ 	qd->sourceText = sourceText;
  
  	/* null these fields until set by ExecutorStart */
  	qd->tupDesc = NULL;
***************
*** 152,158 **** static void
  ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag)
  {
  	QueryDesc  *queryDesc;
  
--- 157,164 ----
  ProcessQuery(PlannedStmt *plan,
  			 ParamListInfo params,
  			 DestReceiver *dest,
! 			 char *completionTag,
! 			 const char *sourceText)
  {
  	QueryDesc  *queryDesc;
  
***************
*** 168,174 **** ProcessQuery(PlannedStmt *plan,
  	 */
  	queryDesc = CreateQueryDesc(plan,
  								GetActiveSnapshot(), InvalidSnapshot,
! 								dest, params, false);
  
  	/*
  	 * Set up to collect AFTER triggers
--- 174,180 ----
  	 */
  	queryDesc = CreateQueryDesc(plan,
  								GetActiveSnapshot(), InvalidSnapshot,
! 								dest, params, false, sourceText);
  
  	/*
  	 * Set up to collect AFTER triggers
***************
*** 504,510 **** PortalStart(Portal portal, ParamListInfo params, Snapshot snapshot)
  											InvalidSnapshot,
  											None_Receiver,
  											params,
! 											false);
  
  				/*
  				 * We do *not* call AfterTriggerBeginQuery() here.	We assume
--- 510,517 ----
  											InvalidSnapshot,
  											None_Receiver,
  											params,
! 											false,
! 											portal->sourceText);
  
  				/*
  				 * We do *not* call AfterTriggerBeginQuery() here.	We assume
***************
*** 1252,1265 **** PortalRunMulti(Portal portal, bool isTopLevel,
  				/* statement can set tag string */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 dest, completionTag);
  			}
  			else
  			{
  				/* stmt added by rewrite cannot set tag */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 altdest, NULL);
  			}
  
  			if (log_executor_stats)
--- 1259,1272 ----
  				/* statement can set tag string */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 dest, completionTag, portal->sourceText);
  			}
  			else
  			{
  				/* stmt added by rewrite cannot set tag */
  				ProcessQuery(pstmt,
  							 portal->portalParams,
! 							 altdest, NULL, portal->sourceText);
  			}
  
  			if (log_executor_stats)
*** a/src/include/commands/explain.h
--- b/src/include/commands/explain.h
***************
*** 39,44 **** extern void ExplainOneUtility(Node *utilityStmt, ExplainStmt *stmt,
  				  TupOutputState *tstate);
  
  extern void ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate);
  
  #endif   /* EXPLAIN_H */
--- 39,45 ----
  				  TupOutputState *tstate);
  
  extern void ExplainOnePlan(PlannedStmt *plannedstmt, ParamListInfo params,
! 			   ExplainStmt *stmt, TupOutputState *tstate,
! 			   const char *queryString);
  
  #endif   /* EXPLAIN_H */
*** a/src/include/executor/execdesc.h
--- b/src/include/executor/execdesc.h
***************
*** 42,47 **** typedef struct QueryDesc
--- 42,48 ----
  	DestReceiver *dest;			/* the destination for tuple output */
  	ParamListInfo params;		/* param values being passed in */
  	bool		doInstrument;	/* TRUE requests runtime instrumentation */
+ 	const char *sourceText;		/* Source text for the query */
  
  	/* These fields are set by ExecutorStart */
  	TupleDesc	tupDesc;		/* descriptor for result tuples */
***************
*** 55,66 **** extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument);
  
  extern QueryDesc *CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params);
  
  extern void FreeQueryDesc(QueryDesc *qdesc);
  
--- 56,69 ----
  				Snapshot crosscheck_snapshot,
  				DestReceiver *dest,
  				ParamListInfo params,
! 				bool doInstrument,
! 				const char *sourceText);
  
  extern QueryDesc *CreateUtilityQueryDesc(Node *utilitystmt,
  					   Snapshot snapshot,
  					   DestReceiver *dest,
! 					   ParamListInfo params,
! 					   const char *sourceText);
  
  extern void FreeQueryDesc(QueryDesc *qdesc);
  
#23Vladimir Sitnikov
sitnikov.vladimir@gmail.com
In reply to: ITAGAKI Takahiro (#17)
Re: contrib/pg_stat_statements v2

Hello,

I have two concerns regarding the patch:

A) I am not sure if it is good to have a single contention point (pgss->lock
= LWLockAssign()). I guess that would impact scalability, especially on a
multi-cpu systems. I guess the real solution will come when PostgreSQL have
a pool for sql statements, hovewer it could make sense to split pgss->lock
into several ones to reduce contention on it.

B) I really do not like the idea of ResetBufferUsage.

I do vote for eliminating ResetBufferUsage from the sources (even from the
core sources)
The reason is as follows:
1) No one really tries to reset "current timestamp" counter. Why do we
reset buffer usage every now and then?
2) As new call sites of ResetBufferUsage appear it becomes more likely to
fetch "wrong" statistics from that counters due to "accidental" reset.
3) When it comes to fetch "buffer usage", one might use the same approach
as with timings: calculate the difference between two measurements. I do not
believe it is noticeably slower than reset+measure.

I wish PostgreSQL had some kind of pg_session_statistics view that reports
resource usage statistics for each session.
For instance, it could expose "buffer usage" to the client, so it could get
more details on resource usage. For instance, I would like to see a new tab
in pgAdmin that shows "total number of buffer gets", "number of WAL records
created", "number of rows sorted" and similar information after query
finishes (even in plain "execute" mode).
The second application of that statistics could be server health monitoring:
provided there is an interface for ongoing integral statistics, one could
create a job that takes snapshots, computes the difference and plots it on a
graph.

Sincerely,
Vladimir Sitnikov

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martin Pihlak (#22)
Re: contrib/pg_stat_statements

Martin Pihlak <martin.pihlak@gmail.com> writes:

ITAGAKI Takahiro wrote:

But is the idea of extending QueryDesc generally acceptable? Is it OK
to make a copy of the query string?

The only thing I'm worried about is that QueryDesc lives longer than
its queryText. Can I assume it never occurs?

I just finished validating this -- it seems OK. All of the query strings
that make it to CreateQueryDesc are either pstrdup-ed to Portal, in long
lived memory context or just literals. So no need to make extra copies :)

I'm trying to figure out what is the status of this patch? I'm not sure
if there's any point in applying it, when contrib/pg_stat_statements
hasn't been updated to make use of it.

regards, tom lane

#25ITAGAKI Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#24)
1 attachment(s)
Re: contrib/pg_stat_statements

Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm trying to figure out what is the status of this patch? I'm not sure
if there's any point in applying it, when contrib/pg_stat_statements
hasn't been updated to make use of it.

Here is an updated patch to use QueryDesc.queryText and supports nested
statements. The patch needs to be applied after contrib/auto_explain patch.

Now we can choose statistics.track_statements from 'none', 'top' and 'all'.
Nested statements are collected in the case of 'all'.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

pg_stat_statements.1115.tar.gzapplication/octet-stream; name=pg_stat_statements.1115.tar.gzDownload