WIP patch - INSERT-able log statements

Started by FAST PostgreSQL · almost 19 years ago · 34 messages
#1 FAST PostgreSQL
fastpgs@fast.fujitsu.com.au
1 attachment(s)

Hi,

I've been working on the following TODO item and attached is an initial patch. (It is only partial and not yet completely functional)

"Allow server log information to be output as INSERT statements
This would allow server log information to be easily loaded into a database for analysis. "

I want to confirm that what I have done so far is what the community is looking for, and also to clear up some doubts.

What is done so far
---------------

Two postgresql.conf variables

#log_output_type = 'text' #Valid values are 'SQL' or 'text'
#log_output_table_name = 'auditlogs'

These control how the log is output. It defaults to 'text', which is the status quo. If it is set to 'SQL', the log will be output as INSERT commands.

The second variable is of interest. We need to specify a table name in the INSERT command. My preferred option is for the user to supply one, creating it if and when they want to. The alternative is that we decide the table name and have initdb create it.

The proposed log output structure
------------------
INSERT INTO user_defined_table values( timestamp_with_milliseconds, timestamp, username, databasename, sessionid, host_and_port, host, proc_id, command_tag, session_start, transaction_id, error_severity, SQL_State_Code, error_message);

All these columns will follow the current rules of log output, i.e. unless explicitly requested by the user, the columns will be NULL. The user can still give log_line_prefix in any order, and the logger will output the values in the appropriate columns. The code has been modified to do this.
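
For illustration only, here is a minimal sketch of a table that such INSERTs could target. The column names and types are assumptions derived from the column list above; the patch itself leaves the table definition entirely up to the user (via log_output_table_name):

CREATE TABLE auditlogs (
    log_time_ms     timestamp(3) with time zone,   -- %m, timestamp with milliseconds
    log_time        timestamp with time zone,      -- %t
    username        text,                          -- %u
    databasename    text,                          -- %d
    sessionid       text,                          -- %c
    host_and_port   text,                          -- %r
    host            text,                          -- %h
    proc_id         integer,                       -- %p
    command_tag     text,                          -- %i
    session_start   timestamp with time zone,      -- %s
    transaction_id  text,                          -- %x
    error_severity  text,
    sql_state_code  text,
    error_message   text
);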

Issues/Questions are:
- What about the statement duration log? This arrives at the logger as a single string, after the query has executed. In the existing log we can make sense of a duration entry by matching it with the statement above it, or with the statement printed beside it (again as a single string). But when this is loaded into a table it doesn't make much sense unless everything is in a single row. (My preferred option is to add another column, 'duration', to the table structure defined above; see the sketch after this list. But I haven't figured out how to achieve this, because the statement is printed first and the duration then arrives as another log entry.)

- If the SQL log output goes to syslog, it becomes pretty awkward and possibly useless, because our current syslog writer function breaks the log up into several lines to accommodate various platforms. Syslog then also adds other information before outputting it, which cannot be loaded into a table. Is the preferred option to educate the user through documentation that SQL-type log output is best served when it is output to stderr and redirected to a file? The same goes for other aspects such as verbose output and the various statistics logs.

- There are also other minor issues. For example, the actual query currently gets output in the log as 'Statement: CREATE ........'. For SQL-type log output we may not need the 'Statement:' part, as it will be in its own column. Do we remove it in both the text and SQL outputs?
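
To make the duration question in the first issue above concrete, here is a purely hypothetical sketch (not part of the patch) of the single-row option, assuming the auditlogs table sketched earlier plus one extra column; the values are made up:

ALTER TABLE auditlogs ADD COLUMN duration interval;   -- hypothetical extra column

-- One row carrying both the statement and its duration, instead of two log entries:
INSERT INTO auditlogs VALUES (
    '2007-02-19 10:15:30.123+11', '2007-02-19 10:15:30+11',
    'myuser', 'mydb', '45d9d615.4abe', '192.168.0.10(51234)', '192.168.0.10',
    12345, 'SELECT', '2007-02-19 10:00:00+11', '0',
    'LOG', '00000', 'SELECT * FROM lieu LIMIT 10',
    '185.223 milliseconds');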

Rgds,
Arul Shaji

Attachments:

sql_log_new.patch (text/x-diff; charset=iso-8859-15)
*** pgsql/src/backend/utils/misc/postgresql.conf.sample	2007-02-01 11:38:25.000000000 +1100
--- workingpgsql/src/backend/utils/misc/postgresql.conf.sample	2007-02-14 15:30:45.000000000 +1100
***************
*** 257,262 ****
--- 257,267 ----
  #syslog_facility = 'LOCAL0'
  #syslog_ident = 'postgres'
  
+ # - How to Log -
+ 
+ #log_output_type = 'text'		#Valid values are 'SQL' or 'text'
+ #log_output_table_name = 'auditlogs'
+ 
  
  # - When to Log -
  
*** pgsql/src/backend/utils/misc/guc.c	2007-02-14 15:16:24.000000000 +1100
--- workingpgsql/src/backend/utils/misc/guc.c	2007-02-14 16:14:03.000000000 +1100
***************
*** 153,158 ****
--- 153,159 ----
  static const char *show_tcp_keepalives_idle(void);
  static const char *show_tcp_keepalives_interval(void);
  static const char *show_tcp_keepalives_count(void);
+ static const char *check_logtype_combination(const char *facility, bool doit, GucSource source);
  
  /*
   * GUC option variables that are exported from this module
***************
*** 210,215 ****
--- 211,217 ----
  static char *log_statement_str;
  static char *log_min_error_statement_str;
  static char *log_destination_string;
+ static char *log_output_type_string;
  
  #ifdef HAVE_SYSLOG
  static char *syslog_facility_str;
***************
*** 324,329 ****
--- 326,333 ----
  	gettext_noop("Reporting and Logging"),
  	/* LOGGING_WHERE */
  	gettext_noop("Reporting and Logging / Where to Log"),
+ 	/* LOGGING_HOW */
+ 	gettext_noop("Reporting and Logging / How to Log"),
  	/* LOGGING_WHEN */
  	gettext_noop("Reporting and Logging / When to Log"),
  	/* LOGGING_WHAT */
***************
*** 2158,2163 ****
--- 2162,2187 ----
  	},
  #endif
  
+ 	/*
+ 	 * For the logs to be output as INSERT statements
+ 	 */
+ 	{
+ 		{"log_output_table_name", PGC_POSTMASTER, LOGGING_HOW,
+ 			gettext_noop("The name of the table where the logs will be uploaded to."),
+ 			NULL
+ 		},
+ 		&Log_output_table_name,
+ 		"", NULL, NULL
+ 	},
+ 	{
+ 		{"log_output_type", PGC_POSTMASTER, LOGGING_HOW,
+ 			gettext_noop("Decides the output type of the log. Text or SQL."),
+ 			gettext_noop("Valid values are text, SQL.")
+ 		},
+ 		&log_output_type_string,
+ 		"text", check_logtype_combination, NULL
+ 	},
+ 
  	{
  		{"TimeZone", PGC_USERSET, CLIENT_CONN_LOCALE,
  			gettext_noop("Sets the time zone for displaying and interpreting time stamps."),
***************
*** 6022,6027 ****
--- 6046,6068 ----
  	return value;
  }
  
+ /*
+  * Check and set the log output type. SQL or text.
+  * Defaults to text.
+  */
+ static const char *
+ check_logtype_combination(const char *value, bool doit, GucSource source)
+ {
+ 	Log_output_sql = false;
+ 
+ 	if ( (value != NULL) && (pg_strcasecmp(value, "SQL") == 0) )
+ 	{
+ 			Log_output_sql = true;
+ 	}
+ 
+ 	return value;
+ }
+ 
  #ifdef HAVE_SYSLOG
  
  static const char *
*** pgsql/src/include/utils/elog.h	2007-01-16 16:29:39.000000000 +1100
--- workingpgsql/src/include/utils/elog.h	2007-02-14 16:21:21.000000000 +1100
***************
*** 273,278 ****
--- 273,280 ----
  extern PGErrorVerbosity Log_error_verbosity;
  extern char *Log_line_prefix;
  extern int	Log_destination;
+ extern bool Log_output_sql;
+ extern char *Log_output_table_name;
  
  /* Log destination bitmap */
  #define LOG_DESTINATION_STDERR	 1
*** pgsql/src/include/utils/guc_tables.h	2007-01-16 16:29:40.000000000 +1100
--- workingpgsql/src/include/utils/guc_tables.h	2007-02-14 16:22:41.000000000 +1100
***************
*** 59,64 ****
--- 59,65 ----
  	QUERY_TUNING_OTHER,
  	LOGGING,
  	LOGGING_WHERE,
+ 	LOGGING_HOW,
  	LOGGING_WHEN,
  	LOGGING_WHAT,
  	STATS,
*** pgsql/src/backend/utils/error/elog.c	2007-02-14 15:16:22.000000000 +1100
--- workingpgsql/src/backend/utils/error/elog.c	2007-02-15 11:36:57.000000000 +1100
***************
*** 83,88 ****
--- 83,91 ----
  char	   *Log_line_prefix = NULL;		/* format for extra log line info */
  int			Log_destination = LOG_DESTINATION_STDERR;
  
+ bool	Log_output_sql = false;
+ char	*Log_output_table_name = NULL;
+ 
  #ifdef HAVE_SYSLOG
  static bool openlog_done = false;
  static char *syslog_ident = NULL;
***************
*** 116,122 ****
  	} while (0)
  
  
! static void log_line_prefix(StringInfo buf);
  static void send_message_to_server_log(ErrorData *edata);
  static void send_message_to_frontend(ErrorData *edata);
  static char *expand_fmt_string(const char *fmt, ErrorData *edata);
--- 119,125 ----
  	} while (0)
  
  
! static void log_line_prefix(char *loglineprefix, StringInfo buf);
  static void send_message_to_server_log(ErrorData *edata);
  static void send_message_to_frontend(ErrorData *edata);
  static char *expand_fmt_string(const char *fmt, ErrorData *edata);
***************
*** 124,130 ****
  static const char *error_severity(int elevel);
  static void append_with_tabs(StringInfo buf, const char *str);
  static bool is_log_level_output(int elevel, int log_min_level);
! 
  
  /*
   * errstart --- begin an error-reporting cycle
--- 127,135 ----
  static const char *error_severity(int elevel);
  static void append_with_tabs(StringInfo buf, const char *str);
  static bool is_log_level_output(int elevel, int log_min_level);
! static void get_SQL_type_server_log(StringInfo buf, ErrorData *edata);
! static void get_timestamp(StringInfo buf);
! static void get_error_message(StringInfo buf, ErrorData *edata);
  
  /*
   * errstart --- begin an error-reporting cycle
***************
*** 1330,1336 ****
   * Format tag info for log lines; append to the provided buffer.
   */
  static void
! log_line_prefix(StringInfo buf)
  {
  	/* static counter for line numbers */
  	static long log_line_number = 0;
--- 1335,1341 ----
   * Format tag info for log lines; append to the provided buffer.
   */
  static void
! log_line_prefix(char *loglineprefix, StringInfo buf)
  {
  	/* static counter for line numbers */
  	static long log_line_number = 0;
***************
*** 1353,1369 ****
  	}
  	log_line_number++;
  
! 	if (Log_line_prefix == NULL)
  		return;					/* in case guc hasn't run yet */
  
! 	format_len = strlen(Log_line_prefix);
  
  	for (i = 0; i < format_len; i++)
  	{
! 		if (Log_line_prefix[i] != '%')
  		{
  			/* literal char, just copy */
! 			appendStringInfoChar(buf, Log_line_prefix[i]);
  			continue;
  		}
  		/* go to char after '%' */
--- 1358,1374 ----
  	}
  	log_line_number++;
  
! 	if (loglineprefix == NULL)
  		return;					/* in case guc hasn't run yet */
  
! 	format_len = strlen(loglineprefix);
  
  	for (i = 0; i < format_len; i++)
  	{
! 		if (loglineprefix[i] != '%')
  		{
  			/* literal char, just copy */
! 			appendStringInfoChar(buf, loglineprefix[i]);
  			continue;
  		}
  		/* go to char after '%' */
***************
*** 1372,1378 ****
  			break;				/* format error - ignore it */
  
  		/* process the option */
! 		switch (Log_line_prefix[i])
  		{
  			case 'u':
  				if (MyProcPort)
--- 1377,1383 ----
  			break;				/* format error - ignore it */
  
  		/* process the option */
! 		switch (loglineprefix[i])
  		{
  			case 'u':
  				if (MyProcPort)
***************
*** 1557,1641 ****
  
  	initStringInfo(&buf);
  
! 	log_line_prefix(&buf);
! 	appendStringInfo(&buf, "%s:  ", error_severity(edata->elevel));
! 
! 	if (Log_error_verbosity >= PGERROR_VERBOSE)
! 		appendStringInfo(&buf, "%s: ", unpack_sql_state(edata->sqlerrcode));
! 
! 	if (edata->message)
! 		append_with_tabs(&buf, edata->message);
  	else
- 		append_with_tabs(&buf, _("missing error text"));
- 
- 	if (edata->cursorpos > 0)
- 		appendStringInfo(&buf, _(" at character %d"),
- 						 edata->cursorpos);
- 	else if (edata->internalpos > 0)
- 		appendStringInfo(&buf, _(" at character %d"),
- 						 edata->internalpos);
- 
- 	appendStringInfoChar(&buf, '\n');
- 
- 	if (Log_error_verbosity >= PGERROR_DEFAULT)
  	{
! 		if (edata->detail)
! 		{
! 			log_line_prefix(&buf);
! 			appendStringInfoString(&buf, _("DETAIL:  "));
! 			append_with_tabs(&buf, edata->detail);
! 			appendStringInfoChar(&buf, '\n');
! 		}
! 		if (edata->hint)
! 		{
! 			log_line_prefix(&buf);
! 			appendStringInfoString(&buf, _("HINT:  "));
! 			append_with_tabs(&buf, edata->hint);
! 			appendStringInfoChar(&buf, '\n');
! 		}
! 		if (edata->internalquery)
! 		{
! 			log_line_prefix(&buf);
! 			appendStringInfoString(&buf, _("QUERY:  "));
! 			append_with_tabs(&buf, edata->internalquery);
! 			appendStringInfoChar(&buf, '\n');
! 		}
! 		if (edata->context)
! 		{
! 			log_line_prefix(&buf);
! 			appendStringInfoString(&buf, _("CONTEXT:  "));
! 			append_with_tabs(&buf, edata->context);
! 			appendStringInfoChar(&buf, '\n');
! 		}
  		if (Log_error_verbosity >= PGERROR_VERBOSE)
  		{
! 			/* assume no newlines in funcname or filename... */
! 			if (edata->funcname && edata->filename)
  			{
! 				log_line_prefix(&buf);
! 				appendStringInfo(&buf, _("LOCATION:  %s, %s:%d\n"),
! 								 edata->funcname, edata->filename,
! 								 edata->lineno);
  			}
! 			else if (edata->filename)
  			{
! 				log_line_prefix(&buf);
! 				appendStringInfo(&buf, _("LOCATION:  %s:%d\n"),
! 								 edata->filename, edata->lineno);
  			}
  		}
! 	}
! 
! 	/*
! 	 * If the user wants the query that generated this error logged, do it.
! 	 */
! 	if (edata->elevel >= log_min_error_statement && debug_query_string != NULL)
! 	{
! 		log_line_prefix(&buf);
! 		appendStringInfoString(&buf, _("STATEMENT:  "));
! 		append_with_tabs(&buf, debug_query_string);
! 		appendStringInfoChar(&buf, '\n');
! 	}
  
  #ifdef HAVE_SYSLOG
  	/* Write to syslog, if enabled */
--- 1562,1646 ----
  
  	initStringInfo(&buf);
  
! 	if ((Log_output_sql == true) &&
! 		(Log_output_table_name != NULL) &&
! 			(pg_strcasecmp(Log_output_table_name, "") != 0))
! 	{
! 		get_SQL_type_server_log(&buf, edata);					  	
! 	}
  	else
  	{
! 		log_line_prefix(Log_line_prefix, &buf);
! 		appendStringInfo(&buf, "%s:  ", error_severity(edata->elevel));
!   
  		if (Log_error_verbosity >= PGERROR_VERBOSE)
+ 			appendStringInfo(&buf, "%s: ", unpack_sql_state(edata->sqlerrcode));
+   
+ 		/* Get the error message and cursor position if any. */
+ 		get_error_message(&buf, edata);
+   
+ 		appendStringInfoChar(&buf, '\n');
+ 
+ 		if (Log_error_verbosity >= PGERROR_DEFAULT)
  		{
! 			if (edata->detail)
! 			{
! 			log_line_prefix(Log_line_prefix, &buf);
! 				appendStringInfoString(&buf, _("DETAIL:  "));
! 				append_with_tabs(&buf, edata->detail);
! 				appendStringInfoChar(&buf, '\n');
! 			}
! 			if (edata->hint)
! 			{
! 				log_line_prefix(Log_line_prefix, &buf);
! 				appendStringInfoString(&buf, _("HINT:  "));
! 				append_with_tabs(&buf, edata->hint);
! 				appendStringInfoChar(&buf, '\n');
! 			}
! 			if (edata->internalquery)
  			{
! 				log_line_prefix(Log_line_prefix, &buf);
! 				appendStringInfoString(&buf, _("QUERY:  "));
! 				append_with_tabs(&buf, edata->internalquery);
! 				appendStringInfoChar(&buf, '\n');
  			}
! 			if (edata->context)
  			{
! 				log_line_prefix(Log_line_prefix, &buf);
! 				appendStringInfoString(&buf, _("CONTEXT:  "));
! 				append_with_tabs(&buf, edata->context);
! 				appendStringInfoChar(&buf, '\n');
! 			}
! 			if (Log_error_verbosity >= PGERROR_VERBOSE)
! 			{
! 				/* assume no newlines in funcname or filename... */
! 				if (edata->funcname && edata->filename)
! 				{
! 					log_line_prefix(Log_line_prefix, &buf);
! 					appendStringInfo(&buf, _("LOCATION:  %s, %s:%d\n"),
! 									 edata->funcname, edata->filename,
! 									 edata->lineno);
! 				}
! 				else if (edata->filename)
! 				{
! 					log_line_prefix(Log_line_prefix, &buf);
! 					appendStringInfo(&buf, _("LOCATION:  %s:%d\n"),
! 									 edata->filename, edata->lineno);
! 				}
  			}
  		}
! 	
! 		/*
! 		 * If the user wants the query that generated this error logged, do it.
! 		 */
! 		if (edata->elevel >= log_min_error_statement && debug_query_string != NULL)
! 		{
! 			log_line_prefix(Log_line_prefix, &buf);
! 			appendStringInfoString(&buf, _("STATEMENT:  "));
! 			append_with_tabs(&buf, debug_query_string);
! 			appendStringInfoChar(&buf, '\n');
! 		}
! }
  
  #ifdef HAVE_SYSLOG
  	/* Write to syslog, if enabled */
***************
*** 2079,2081 ****
--- 2084,2323 ----
  
  	return false;
  }
+ 
+ /*
+  * This function is to construct the log string into a SQL format.
+  * For text based log, the output string is based on the user's input whereas
+  * for SQL we need a fixed format. So a seperate method to simplify.
+  * Benefit to the user is that he can continue to specify arbitrary
+  * log_line_prefix.
+  */
+ static void
+ get_SQL_type_server_log(StringInfo buf, ErrorData *edata)
+ {
+ 	int format_len = 0;
+ 	int i;
+ 	int proc_id = 0;
+ 	char **loglineprefixvalues = NULL;
+ 
+ 	StringInfoData msgbuf;
+ 
+ 	initStringInfo(&msgbuf);
+ 
+ 	loglineprefixvalues = (char **) palloc(10 * sizeof(char *));
+ 
+ 	for (i=0; i < 10; i++)
+ 		 strcpy(loglineprefixvalues[i], "NULL");
+ 
+ 	if (Log_line_prefix != NULL)
+ 		format_len = strlen(Log_line_prefix);
+ 
+ 	for (i = 0; i < format_len; i++)
+ 	{
+ 		if (Log_line_prefix[i] != '%')
+ 		{
+ 			/* literal char, ignore for SQL style output */
+ 			continue;
+ 		}
+ 		/* go to char after '%' */
+ 		i++;
+ 		if (i >= format_len)
+ 			break;				/* format error - ignore it */
+ 
+ 		/* process the option */
+ 
+ 		switch (Log_line_prefix[i])
+ 		{
+ 			case 'm':
+ 				{
+ 					initStringInfo(&msgbuf);
+ 					log_line_prefix("%m", &msgbuf);
+ 					loglineprefixvalues[0] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 't':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%t", &msgbuf);
+ 					loglineprefixvalues[1] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'u':
+ 				{
+ 			  	initStringInfo(&msgbuf);
+ 					log_line_prefix("%u", &msgbuf);
+ 					loglineprefixvalues[2] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'd':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%d", &msgbuf);
+ 					loglineprefixvalues[3] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'c':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%c", &msgbuf);
+  					loglineprefixvalues[4] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'r':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%r", &msgbuf);
+ 					loglineprefixvalues[5] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'h':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%h", &msgbuf);
+ 					loglineprefixvalues[6] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 
+ 			case 'p':
+ 				proc_id = MyProcPid;
+ 				break;
+ 			case 'i':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%i", &msgbuf);
+ 					loglineprefixvalues[7] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 's':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%s", &msgbuf);
+ 					loglineprefixvalues[8] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			case 'x':
+ 				{
+ 					initStringInfo(&msgbuf);			
+ 					log_line_prefix("%x", &msgbuf);				
+ 					loglineprefixvalues[9] = pstrdup(msgbuf.data);
+ 				}
+ 				break;
+ 			default:
+ 				/* Nothing else for SQL type. ignore it */
+ 				break;
+ 		}
+ 	}
+ 	
+ 	/*
+ 	 * We need timestamp as the unique column. So we get it even if the user
+ 	 *  hasn't given it as an option.
+ 	 */
+ 	if ( (loglineprefixvalues[0] == NULL) || (strcasecmp(loglineprefixvalues[0],"NULL") == 0) )
+ 	{
+ 		initStringInfo(&msgbuf);
+ 		get_timestamp(&msgbuf);
+ 		loglineprefixvalues[0] = pstrdup(msgbuf.data);
+ 	}
+ 
+ 	/* Construct the SQL type log string. */
+ 	initStringInfo(&msgbuf);
+ 	appendStringInfo(&msgbuf, "INSERT INTO %s VALUES (" , Log_output_table_name);
+ 
+ 	appendStringInfo(&msgbuf, "%s,%s,%s,%s,%s,%s,%s,%d,%s,%s, %s,",
+ 						loglineprefixvalues[0],
+ 						loglineprefixvalues[1],
+ 						loglineprefixvalues[2],
+ 						loglineprefixvalues[3],
+ 						loglineprefixvalues[4],
+ 						loglineprefixvalues[5],
+ 						loglineprefixvalues[6],
+ 						proc_id,
+ 						loglineprefixvalues[7],
+ 						loglineprefixvalues[8],
+ 						loglineprefixvalues[9]
+ 									);
+ 	appendStringInfo(&msgbuf, "%s,", error_severity(edata->elevel));
+ 	appendStringInfo(&msgbuf, "%s,", unpack_sql_state(edata->sqlerrcode));
+ 	get_error_message(&msgbuf, edata);
+ 
+ 	/*
+ 	 * If the user wants the query that generated this error logged, do it.
+ 	 */
+ 	if (edata->elevel >= log_min_error_statement && debug_query_string != NULL)
+ 	{
+ 		appendStringInfoString(&msgbuf, _(",STATEMENT:  "));
+ 		append_with_tabs(&msgbuf, debug_query_string);
+ 	}
+ 	appendStringInfoString(&msgbuf, _(")"));
+ 
+ 	appendStringInfo(buf, pstrdup(msgbuf.data));
+ 
+ 	pfree(loglineprefixvalues);
+ 	pfree (msgbuf.data); 
+ }
+ 
+ /*
+  * Appends the buffer with the error message and the cursor position.
+  */
+ static void
+ get_error_message(StringInfo buf, ErrorData *edata)
+ {
+ 	StringInfoData msgbuf;
+ 
+ 	initStringInfo(&msgbuf);
+ 
+ 	if (edata->message)
+ 		append_with_tabs(&msgbuf, edata->message);
+ 	else
+ 		append_with_tabs(&msgbuf, _("missing error text"));
+ 
+ 	if (edata->cursorpos > 0)
+ 		appendStringInfo(&msgbuf, _(" at character %d"),
+ 						 edata->cursorpos);
+ 	else if (edata->internalpos > 0)
+ 		appendStringInfo(&msgbuf, _(" at character %d"),
+ 						 edata->internalpos);
+ 	appendStringInfo(buf, "%s", pstrdup(msgbuf.data));
+ }
+ 
+ /*
+  * Calculates and returns the timestamp
+  */
+ static void
+ get_timestamp(StringInfo buf)
+ {
+ 	/*
+ 	* Note: for %m, %t, and %s we deliberately use the C
+ 	* library's strftime/localtime, and not the equivalent
+ 	* functions from src/timezone.  This ensures that all
+ 	* backends will report log entries in the same timezone,
+ 	* namely whatever C-library setting they inherit from the
+ 	* postmaster.	If we used src/timezone then local
+ 	* settings of the TimeZone GUC variable would confuse the
+ 	* log.
+ 	*/
+ 	time_t		stamp_time;
+ 	char		strfbuf[128],
+ 			msbuf[8];
+ 	struct timeval tv;
+ 
+ 	gettimeofday(&tv, NULL);
+ 	stamp_time = tv.tv_sec;
+ 
+ 	strftime(strfbuf, sizeof(strfbuf),
+ 	/* leave room for milliseconds... */
+ 	/* Win32 timezone names are too long so don't print them. */
+ #ifndef WIN32
+ 	"%Y-%m-%d %H:%M:%S     %Z",
+ #else
+ 	"%Y-%m-%d %H:%M:%S     ",
+ #endif
+ 	localtime(&stamp_time));
+ 
+ 	/* 'paste' milliseconds into place... */
+ 	sprintf(msbuf, ".%03d", (int) (tv.tv_usec / 1000));
+ 	strncpy(strfbuf + 19, msbuf, 4);
+ 
+ 	appendStringInfoString(buf, strfbuf);
+ }
#2 Greg Smith
gsmith@gregsmith.com
In reply to: FAST PostgreSQL (#1)
Re: WIP patch - INSERT-able log statements

On Sat, 17 Feb 2007, FAST PostgreSQL wrote:

#log_output_type = 'text' #Valid values are 'SQL' or 'text'
Defaults to 'text' which is status quo. If it is set to 'SQL' log will
be output as INSERT commands.

This assumes someone wants either the INSERT-able logs or the current,
easily readable ones. I know I don't want either--I want both. There are
times I want to look through the logs with a text editor, there are times
where I want to query against them.

I would suggest treating this similarly to how the Windows eventlog is
handled: make SQL INSERT format another option available to
log_destination, so it can be combined with the existing formats. In
addition to the syslog concerns you already mentioned (which are
themselves a showstopper for using this feature in companies that rely on
or aggregate syslogs), I know I'd want to keep the existing logs rolling
in parallel while I tested out the SQL-based version for a while, before
cutting over to exclusively INSERT format logs.

I've thought a bit about how to implement this TODO already (I have a log
file parser and I hate maintaining it), and the only thing that made sense
to me was giving a new parameter with the filename to output to in this
format. For example, make a new log_sql_filename with the same syntax
already used for log_filename. There will probably need to be a second
parameter for the table name to insert into as you've already commented
on. And like Joshua has already suggested, the main useful applications
for this feature I've thought of all involve reading from the INSERT-able
logs in real-time, using something like "tail -f", and pumping that data
immediately into a logger table.

Also, I feel that supporting the whole log_line_prefix syntax for this
feature is not just overkill, it's a bad idea. Output everything in a
standard, complete format instead, and then it becomes easy for the
community at large to build tools on top of that to analyze the log
database entries instead of having so many ad-hoc approaches. You want a
subset, use a view or copy just the fields you want into another table.
I would guess this simplifies the patch as well.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#3 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Greg Smith (#2)
Re: WIP patch - INSERT-able log statements

Added -hackers to CC:.

On 2/18/07, Greg Smith <gsmith@gregsmith.com> wrote:

I've thought a bit about how to implement this TODO already (I have a log
file parser and I hate maintaining it)

Any problem using pgFouine?

Also, I feel that supporting the whole log_line_prefix syntax for this
feature is not just overkill, it's a bad idea. Output everything in a
standard, complete format instead, and then it becomes easy for the
community at large to build tools on top of that to analyze the log
database entries instead of having so many ad-hoc approaches. You want a
subset, use a view or copy just the fields you want into another table.
I would guess this simplifies the patch as well.

I agree with you on this point. We need all the information we can put
in the database (database, user and so on) in a structured way.

Logging statements is not the only thing to do IMHO. Logging errors in
another table is important too. I'm pretty sure there are a lot of people
who don't know if there are errors in their statements.

Anyway, if something is developed to do that, I'll extend pgFouine to
support it (or I'll write another application to do it) and I'm pretty
sure others will too. The most annoying and difficult part of the work
on pgFouine/PQA/... is maintaining the parser. If we don't have to do
that anymore, we'll have more time to think about new features.
In fact, it has long been a plan for pgFouine to be able to store the
results of its analysis in a database, so it's good news if it's
implemented in core.

--
Guillaume

#4 Greg Smith
gsmith@gregsmith.com
In reply to: FAST PostgreSQL (#1)
Re: WIP patch - INSERT-able log statements

On Tue, 20 Feb 2007, FAST PostgreSQL wrote:

I think adding the 'format' of the log as an option in the 'destination'
may be confusing. We can have a new boolean variable like
'output_sql_log' or 'log_sql_format' which will trigger the output of
INSERT-able log in addition to syslog/stderr/eventlog in text format as
it is now.

What's confusing about it? Consider this hypothetical postgresql.conf
snippet:

log_destination = 'stderr,sql' # Valid values are combinations of
# stderr, syslog, sql, and eventlog,
# depending on platform.

# These are relevant when logging to sql:
log_sql_table = 'pg_log' # Table SQL formatted logs INSERT into
# Default is 'pg_log'

Adding new GUC variables isn't without cost; no reason to add a new one
when there's a perfectly good destination one already whose syntax is
completely appropriate for this task.

I personally would just ignore the duration two entries per statement
problem and make that the log analyzer software's job to fix, but I'm
likely biased here because I don't ever do anything with that data.

My preference would be for the sql-logs to use the same variables as
redirect_stderr. In the directory the user specifies in log_directory we
just output the sql logs in a file with '.SQL' appended to the filename
specified by the user in log_filename. This also means we can make use of
the log_truncation and log_rotate options (I hope. I will have a look at
that).

You're talking along the same lines here I was trying to suggest in my
previous message. Keep all the current behavior as far as truncation and
rotation go, just provide a different name for the file. If you just
appended a suffix like .sql to the existing name, that would remove
another GUC variable requirement. I think eventually people will complain
about this, and want a separately formatted filename altogether, but
there's nothing wrong with the approach you suggest for a first version of
this feature. I know I'm so desperate for a better solution to log
importing that I'd be happy with any workable improvement.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#5 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Greg Smith (#4)
Re: WIP patch - INSERT-able log statements

On 2/19/07, Greg Smith <gsmith@gregsmith.com> wrote:

log_destination = 'stderr,sql' # Valid values are combinations of
# stderr, syslog, sql, and eventlog,
# depending on platform.

+1

# These are relevant when logging to sql:
log_sql_table = 'pg_log' # Table SQL formatted logs INSERT into
# Default is 'pg_log'

Is there really a need for that? Why not simply put something like
%log_table% in the sql file and let the admin replace it with sed or
whatever he likes?
And it could allow us to move to COPY without having to drop the code
added to manage the new GUC variable.

I personally would just ignore the duration two entries per statement
problem and make that the log analyzer software's job to fix, but I'm
likely biased here because I don't ever do anything with that data.

We have basically 4 different cases:
* log_min_duration_statement = 0: every query is logged with the
duration on the same line,
* log_statement = all: we don't have any duration,
* log_statement = all & log_duration = on: we have every query and the
duration on another line,
* log_min_duration_statement = n > 0 & log_duration = on: we have
duration lines for every query and statement + duration if the query
is slower than n ms.

There are other variants but I think these are the main cases to deal with.

--
Guillaume

#6 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Guillaume Smet (#5)
Re: WIP patch - INSERT-able log statements

Guillaume Smet escribió:

On 2/19/07, Greg Smith <gsmith@gregsmith.com> wrote:

log_destination = 'stderr,sql' # Valid values are combinations of
# stderr, syslog, sql, and eventlog,
# depending on platform.

+1

Please don't do that. We already have a "combined GUC option" that is
used to change two different things (DateStyle) and I regularly see
people confused about how to use it.

Also, "sql" is not really a destination -- it is a format. Maybe having a
"format=plain,sql" GUC var is best: plain format is logged as currently,
and SQL format is logged somewhere else (maybe use the same name, and
stash a .sql suffix to the filename, or use .sql.log). That way you can
choose to have one or the other, or both if you're really dumb.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#7 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Alvaro Herrera (#6)
Re: WIP patch - INSERT-able log statements

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Please don't do that. We already have a "combined GUC option" that is
used to change two different things (DateStyle) and I regularly see
people confused about how to use it.

Perhaps I don't understand your "combined GUC option" but ISTM it's
already the case for log_destination. I don't use this ability but the
current comment in postgresql.conf seems to be self explanatory:
log_destination = 'syslog' # Valid values are combinations of
# stderr, syslog and eventlog,
# depending on platform.

Also, "sql" is not really a destination -- it is a format. Maybe having a
"format=plain,sql" GUC var is best: plain format is logged as currently,
and SQL format is logged somewhere else (maybe use the same name, and
stash a .sql suffix to the filename, or use .sql.log). That way you can
choose to have one or the other, or both if you're really dumb.

I don't see how you will deal with your format GUC variable. Or won't
it be possible to have plain and sql at the same time? My problem
is how you will separate plain and sql if you put them both into stderr
or syslog.

The configuration that interests me at first is to have syslog
plain text output to run pgFouine daily and from time to time use the
SQL output (I won't be able to insert my daily logs into a db, so I
will do it occasionally if I need more details, the ability to look for a
particular query and so on).
I don't know how other people plan to use it though.

--
Guillaume

#8 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Guillaume Smet (#7)
Re: WIP patch - INSERT-able log statements

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Please don't do that. We already have a "combined GUC option" that is
used to change two different things (DateStyle) and I regularly see
people confused about how to use it.

Perhaps I don't understand your "combined GUC option" but ISTM it's
already the case for log_destination. I don't use this ability but the
current comment in postgresql.conf seems to be self explanatory:
log_destination = 'syslog' # Valid values are combinations of
# stderr, syslog and eventlog,
# depending on platform.

That defines one thing: where the log output goes. It can be syslog,
stderr, eventlog, or any combination thereof. But it's only one thing
that you're configuring.

In DateStyle, you set two things: output format and input format.
That's problematic because people change one of the settings and then
expect the other to change too, and get confused because it doesn't
work. What I'm saying is that this was not a very good design and
please let's not repeat the mistake here.

Also, "sql" is not really a destination -- it is a format. Maybe having a
"format=plain,sql" GUC var is best: plain format is logged as currently,
and SQL format is logged somewhere else (maybe use the same name, and
stash a .sql suffix to the filename, or use .sql.log). That way you can
choose to have one or the other, or both if you're really dumb.

I don't see how you will deal with your format GUC variable. Or won't
it be possible to have plain and sql at the same time? My problem
is how you will separate plain and sql if you put them both into stderr
or syslog.

Well, to syslog or eventlog you don't send the SQL format, only plain.
If you configure the SQL format, then it's expected that stderr is
enabled, so it would be an error if you enable SQL format but not
stderr. If redirect_stderr is also enabled then you could enable both
plain format and SQL format, and have each logged to a different file
(though I repeat that IMHO that would be a dumb thing to do).

The configuration that interests me at first is to have syslog
plain text output to run pgFouine daily and from time to time use the
SQL output (I won't be able to insert my daily logs into a db, so I
will do it occasionally if I need more details, the ability to look for a
particular query and so on).
I don't know how other people plan to use it though.

I don't think the syslog option is so nice to use these days; the
redirect_stderr stuff is more powerful and easy to use.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#9 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Alvaro Herrera (#8)
Re: WIP patch - INSERT-able log statements

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

I don't think the syslog option is so nice to use these days; the
redirect_stderr stuff is more powerful and easy to use.

Have you ever analyzed the logs of a highly loaded platform using stderr?
It's impossible to guarantee the consistency of the queries because
you can have:
Query 1 Line 1
Query 2 Line 1
Query 3 Line 1
Query 1 Line 2
Query 2 Line 2
Query 1 Line 3
Query 3 Line 2
(each query is run on a different backend). With stderr you don't have
sufficient information to know the query you should append the text
to. Syslog adds useful context information (pid, command line, command
number) which allows you to guarantee the consistency.
It's a real problem for us and that's why I don't recommend using
stderr when you use pgFouine.
Moreover, syslog can send the log over the network, which moves the I/O
needed for logging onto another box, and that's necessary in our case.

--
Guillaume

#10 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Guillaume Smet (#9)
Re: WIP patch - INSERT-able log statements

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

I don't think the syslog option is so nice to use these days; the
redirect_stderr stuff is more powerful and easy to use.

Have you ever analyzed the logs of a highly loaded platform using stderr?
It's impossible to guarantee the consistency of the queries because
you can have:
Query 1 Line 1
Query 2 Line 1
Query 3 Line 1
Query 1 Line 2
Query 2 Line 2
Query 1 Line 3
Query 3 Line 2
(each query is run on a different backend). With stderr you don't have
sufficient information to know the query you should append the text
to. Syslog adds useful context information (pid, command line, command
number) which allows you to guarantee the consistency.

So add the session ID (%c) to log_line_prefix.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#11 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Alvaro Herrera (#10)
Re: WIP patch - INSERT-able log statements

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

So add the session ID (%c) to log_line_prefix.

It could work if log_line_prefix was added before every line but it's
definitely not the case:
myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT *
FROM lieu
LIMIT 10;
if you execute:
SELECT *
FROM lieu
LIMIT 10;

I worked a *lot* on query logging and I really couldn't find any
way to use stderr safely, which is why I don't use it at all
and don't recommend it (even though pgFouine supports it, it outputs a
warning as soon as the parser finds a multiline query).

--
Guillaume

#12 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Guillaume Smet (#11)
Re: WIP patch - INSERT-able log statements

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

So add the session ID (%c) to log_line_prefix.

It could work if log_line_prefix was added before every line but it's
definitely not the case:
myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT *
FROM lieu
LIMIT 10;
if you execute:
SELECT *
FROM lieu
LIMIT 10;

Interesting. I wonder why you didn't report this as a bug before?
Maybe we could have discussed it and fixed it.

This is irrelevant to this particular discussion anyway if we introduce the
SQL format, because then the newline should be part of the COPY or INSERT
data. Let's just make sure not to make the same mistakes again.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#13 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Alvaro Herrera (#12)
Re: [PATCHES] WIP patch - INSERT-able log statements

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

So add the session ID (%c) to log_line_prefix.

It could work if log_line_prefix was added before every line but it's
definitely not the case:
myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT *
FROM lieu
LIMIT 10;
if you execute:
SELECT *
FROM lieu
LIMIT 10;

Interesting. I wonder why you didn't report this as a bug before?
Maybe we could have discussed it and fixed it.

Perhaps because I thought it was not really a bug but the intended behaviour.
Syslog has the same behaviour and it's quite logical when you consider
how queries are logged (I've spent a few hours in the logging code).
Syslog has exactly the same behaviour but adds the necessary context
information.
I'm pretty sure I have explained the problem a few times on the lists,
though perhaps it was just on IRC.

From the feedback I have on pgFouine, very few people think it's a
real problem, probably because they don't use query logging as we do:
our production servers have it enabled all the time and we have a high
load on them, so this particular case is common for us.

(Second try to move this discussion to -hackers)

--
Guillaume

#14 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Guillaume Smet (#13)
Re: [PATCHES] WIP patch - INSERT-able log statements

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Guillaume Smet escribió:

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

So add the session ID (%c) to log_line_prefix.

It could work if log_line_prefix was added before every line but it's
definitely not the case:
myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement :

SELECT *

FROM lieu
LIMIT 10;
if you execute:
SELECT *
FROM lieu
LIMIT 10;

Interesting. I wonder why you didn't report this as a bug before?
Maybe we could have discussed it and fixed it.

Perhaps because I thought it was not really a bug but the intended
behaviour.
Syslog has the same behaviour and it's quite logical when you consider
how queries are logged (I've spent a few hours in the logging code).
Syslog has exactly the same behaviour but adds the necessary context
information.

If it adds necessary context then it clearly does not have "the same
behavior", because the problem is precisely that the context is missing.
I'd propose adding a log_entry_prefix separate from log_line_prefix; the
entry prefix would contain most of the stuff, and log_line_prefix would
be a minimal thing intended to be put in front of each _line_, so the
example you show above could be

myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT *
45d9d615.4abe FROM lieu
45d9d615.4abe LIMIT 10;

where you have

log_entry_prefix="%d %u "
log_line_prefix="%c "

Really, prefixing with a tab does not strike me as a great idea
precisely because it's ambiguous.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#15 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Alvaro Herrera (#14)
Re: [PATCHES] WIP patch - INSERT-able log statements

On 2/19/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:

If it adds necessary context then it clearly does not have "the same
behavior",

I mean log_line_prefix behaviour is the same. The other information
is syslog specific.

I'd propose adding a log_entry_prefix separate from log_line_prefix; the
entry prefix would contain most of the stuff, and log_line_prefix would
be a minimal thing intended to be put in front of each _line_, so the
example you show above could be

It could be a good idea.
It won't make me use stderr output but it will allow other people to
do so without any disadvantage :).

Really, prefixing with a tab does not strike me as a great idea
precisely because it's ambiguous.

Sure.

--
Guillaume

#16 Greg Smith
gsmith@gregsmith.com
In reply to: Guillaume Smet (#5)
Re: [PATCHES] WIP patch - INSERT-able log statements

On Mon, 19 Feb 2007, Guillaume Smet wrote:

Why not simply put something like %log_table% in the sql file and let
the admin replace it with sed or whatever he likes?

This is a reasonable approach. I would suggest that no special characters
be used though, so that the SQL could be used as-is by a DBA who doesn't
even know about or want to use tools like sed. I also think the default
should be a pg_ name like the pg_log I suggested, partly because I'd like
this to be completely internal one day--just push the logs into the
database directly without even passing through an external file first.
Also, something like pg_log is unlikely to cause a conflict with existing
tables. I would bet there are already databases out there with tables
called log_table, for example, but everyone already avoids naming
application tables starting with pg_.

A workable syntax might be

INSERT INTO "pg_log" ...

The redundant quotation marks will make it easier to do a search/replace
to change the table name without worrying about accidentally impacting the
text of the message, so that even people who aren't aware how to build a
regular expression that only modifies the first match will probably be OK.
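
As a purely illustrative sketch of that point (the table names, columns, and values below are made up, not the patch's format): an emitted line, and the same line after a plain textual replacement of the quoted table name:

INSERT INTO "pg_log" VALUES ('2007-02-20 09:00:00.123', 'LOG', 'statement: SELECT 1;');
INSERT INTO "my_server_log" VALUES ('2007-02-20 09:00:00.123', 'LOG', 'statement: SELECT 1;');

Because the target name is the only double-quoted token on the line, replacing "pg_log" (quotes included) is unlikely to touch the message text even when that text happens to mention pg_log.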

I consider using the same name as the default log directory helpful, but
would understand that others might consider it confusing to overload the
name like that.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#17 Greg Smith
gsmith@gregsmith.com
In reply to: Alvaro Herrera (#6)
Re: [PATCHES] WIP patch - INSERT-able log statements

On Mon, 19 Feb 2007, Alvaro Herrera wrote:

We already have a "combined GUC option" that is used to change two
different things (DateStyle) and I regularly see people confused about
how to use it.

You already have a combined GUC option called log_destination that's
sitting in the appropriate area of the configuration file, doing something
similar to what's needed for the new feature. People confused by that are
already confused.

Also, "sql" is not really a destination -- it is a format.

A log file with a different name is another destination. eventlog is
certainly a different format and it's sitting happily as an option there.
I haven't heard anyone make a useful argument yet as to how insert/sql
logs are any different than the current way that stderr, syslog, and
eventlog are all possibilities now for log_destination, each with their
own little quirks (and in the case of syslog, their own additional GUC
parameters).

That way you can choose to have one or the other, or both if you're
really dumb.

The fact that you're characterizing people who might want both as "really
dumb" tells me you're not familiar with enterprise logging requirements. I
already commented on situations where wanting both types of output going
at once is going to absolutely be a requirement in some environments for
this feature addition to be useful; there are a lot of large operations
that rely heavily on features like syslog to help manage their systems.
Most of the places I've worked at, the syslog server where the analysis is
running wasn't necessarily even in the same state as the machine
generating the log entries.

I know I can't deploy this feature unless it operates in parallel with the
existing text-based info going to syslog, both because of that and because
of transition issues--I can't disrupt the existing logs to test a new log
mechanism until that new mechanism has proven itself. I'll probably
deploy it with both turned on forever once it's available.

As for your comments on syslog vs. stderr, I completely agree with
Guillaume's response to you on that subject. The stderr output is
difficult to use for the reasons he describes, but the kind of
environments that use complicated logging aren't relying on that anyway.
I wouldn't get distracted by fixing that implementation when it's
functional enough for most who are satisfied with stderr output.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#16)
Re: [PATCHES] WIP patch - INSERT-able log statements

Greg Smith <gsmith@gregsmith.com> writes:

A workable syntax might be
INSERT INTO "pg_log" ...

Why is this still under discussion? I thought we'd agreed that COPY
format was the way to go.

regards, tom lane

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#17)
Re: [PATCHES] WIP patch - INSERT-able log statements

Greg Smith <gsmith@gregsmith.com> writes:

On Mon, 19 Feb 2007, Alvaro Herrera wrote:

Also, "sql" is not really a destination -- it is a format.

A log file with a different name is another destination. eventlog is
certainly a different format and it's sitting happily as an option there.
I haven't heard anyone make a useful argument yet as to how insert/sql
logs are any different than the current way that stderr, syslog, and
eventlog are all possibilities now for log_destination, each with their
own little quirks (and in the case of syslog, their own additional GUC
parameters).

Since the "sql" format doesn't make any sense for syslog or eventlog
output, I tend to agree that treating it as a destination is a
reasonable answer. It's going to be a bit non-orthogonal no matter
which way we jump, but this seems like the most natural and useful
extension from where we are. To me anyway ... YMMV ...

regards, tom lane

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Guillaume Smet (#9)
Re: WIP patch - INSERT-able log statements

"Guillaume Smet" <guillaume.smet@gmail.com> writes:

(each query is run on a different backend). With stderr you don't have
sufficient information to know the query you should append the text
to. Syslog adds useful context information (pid, command line, command
number) which allows you to guarantee the consistency.
It's a real problem for us and that's why I don't recommend using
stderr when you use pgFouine.

Of course, the other side of that coin is that syslog is known to drop
messages altogether under sufficient load. (At least on some platforms;
dunno about yours.)

regards, tom lane

#21 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#20)
Re: WIP patch - INSERT-able log statements

Tom Lane escribió:

"Guillaume Smet" <guillaume.smet@gmail.com> writes:

(each query is run on a different backend). With stderr you don't have
sufficient information to know the query you should append the text
to. Syslog adds useful context information (pid, command line, command
number) which allows you to guarantee the consistency.
It's a real problem for us and that's why I don't recommend using
stderr when you use pgFouine.

Of course, the other side of that coin is that syslog is known to drop
messages altogether under sufficient load. (At least on some platforms;
dunno about yours.)

So let's fix stderr by having a true per-line prefix?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#22 Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#18)
Re: [PATCHES] WIP patch - INSERT-able log statements

On Mon, 19 Feb 2007, Tom Lane wrote:

Why is this still under discussion? I thought we'd agreed that COPY
format was the way to go.

Joshua Drake said "COPY would be a good option, but INSERT is probably
what I would use as the default. The most use I see for this is something
where I am tailing out the log and inserting live into a log db..." and I
completely agreed with him--that's also how all the applications I'd like
to build around this feature are expected to operate. No one said
anything else on this topic to defend COPY as the right choice until you
just brought it back up here.

The arguments for COPY are performance and that you don't need to specify
the table name. INSERT is slower and you need a name, but it's easier to
build a UNIX tool style pipeline to import it in real-time.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#23 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#22)
Re: [PATCHES] WIP patch - INSERT-able log statements

Greg Smith <gsmith@gregsmith.com> writes:

The arguments for COPY are performance and that you don't need to specify
the table name. INSERT is slower and you need a name, but it's easier to
build a UNIX tool style pipeline to import it in real-time.

I can't believe that any production situation could tolerate the
overhead of one-commit-per-log-line. So a realistic tool for this
is going to have to be able to wrap blocks of maybe 100 or 1000 or so
log lines with BEGIN/COMMIT, and that is exactly as difficult as
wrapping them with a COPY command. Thus, I disbelieve your argument.
We should not be designing this around an assumed use-case that will
only work for toy installations.
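
A rough sketch of that comparison; the three-column pg_log table below is a deliberate simplification (an assumption for brevity, not the format under discussion), and columns in the COPY data are tab-separated:

CREATE TABLE pg_log (log_time timestamp, severity text, message text);

-- An import tool batching INSERT-format log lines in one transaction...
BEGIN;
INSERT INTO pg_log VALUES ('2007-02-20 09:00:00.001', 'LOG', 'statement: SELECT 1;');
INSERT INTO pg_log VALUES ('2007-02-20 09:00:00.153', 'LOG', 'statement: SELECT 2;');
COMMIT;

-- ...has to do essentially the same framing work as wrapping those lines as COPY data:
COPY pg_log FROM stdin;
2007-02-20 09:00:00.001	LOG	statement: SELECT 1;
2007-02-20 09:00:00.153	LOG	statement: SELECT 2;
\.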

regards, tom lane

#24 FAST PostgreSQL
fastpgs@fast.fujitsu.com.au
In reply to: Greg Smith (#2)
Re: WIP patch - INSERT-able log statements

On Sun, 18 Feb 2007 18:25, Greg Smith wrote:

On Sat, 17 Feb 2007, FAST PostgreSQL wrote:

#log_output_type = 'text' #Valid values are 'SQL' or 'text'
Defaults to 'text' which is status quo. If it is set to 'SQL' log will
be output as INSERT commands.

This assumes someone wants either the INSERT-able logs or the current,
easily readable ones. I know I don't want either--I want both. There are
times I want to look through the logs with a text editor, there are times
where I want to query against them.

I would suggest treating this similarly to how the Windows eventlog is
handled: made SQL INSERT format another option available to
log_destination, so it can be combined with the existing formats. In

I think adding the 'format' of the log as an option in the 'destination' may
be confusing. We can have a new boolean variable like 'output_sql_log' or
'log_sql_format' which will trigger the output of INSERT-able log in addition
to syslog/stderr/eventlog in text format as it is now.

addition to the syslog concerns you already mentioned (which are
themselves a showstopper for using this feature in companies that rely on
or aggregate syslogs), I know I'd want to keep the existing logs rolling
in parallel while I tested out the SQL-based version for a while, before
cutting over to exclusively INSERT format logs.

I've thought a bit about how to implement this TODO already (I have a log

Any thoughts on how to output the duration in the same INSERT as that of the
query, without having to output two entries per statement as is done in
text now?

file parser and I hate maintaining it), and the only thing that made sense
to me was giving a new parameter with the filename to output to in this
format. For example, make a new log_sql_filename with the same syntax
already used for log_filename. There will probably need to be a second
parameter for the table name to insert into as you've already commented
on. And like Joshua has already suggested, the main useful applications

This would mean that the user will have to maintain two sets of variables: one
for text output (if redirect_stderr is on) and another for sql output. My
preference would be for the sql-logs to use the same variables as
redirect_stderr. In the directory the user specifies in log_directory we
just output the sql logs in a file with '.SQL' appended to the filename
specified by the user in log_filename. This also means we can make use of
the log_truncation and log_rotate options (I hope. I will have a look at
that). Of course we document this, making the user aware of it. (The other
option is that we output the logs in a subdirectory called 'SQL' which we can
silently create, if we are allowed to!)

This also means only two new variables are created in postgresql.conf for this
feature: one to trigger the output, and one for the table name. (Or we could
introduce only the table name variable, which would automatically trigger the
SQL log output. Maybe confusing?)

Rgds,
Arul Shaji

for this feature I've thought of all involve reading from the INSERT-able
logs in real-time, using something like "tail -f", and pumping that data
immediately into a logger table.

Also, I feel that supporting the whole log_line_prefix syntax for this
feature is not just overkill, it's a bad idea. Output everything in a
standard, complete format instead, and then it becomes easy for the
community at large to build tools on top of that to analyze the log
database entries instead of having so many ad-hoc approaches. You want a
subset, use a view or copy just the fields you want into another table.
I would guess this simplifies the patch as well.

#25 Guillaume Smet
guillaume.smet@gmail.com
In reply to: Tom Lane (#20)
Re: WIP patch - INSERT-able log statements

On 2/20/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Of course, the other side of that coin is that syslog is known to drop
messages altogether under sufficient load. (At least on some platforms;
dunno about yours.)

Yes, I know. That's one of the reasons why I asked for the behaviour of
7.4's log_duration back in 8.2. It's a good compromise which allows us
not to lose lines and to have a good level of information (at least, the
best we can have). Async I/O helps.

Moreover we use syslog to send the log lines via UDP so we know that
it's not perfect. But it works nice most of the time. We know that we
can't log every query (we use a combination of log_duration and
log_min_duration_statement - I patched 8.1 for that) because if we do
so we lose a lot of lines and queries are not consistent but we can't
do it locally with stderr anyway due to I/O. This method has been
reliable for more than a year and our daily reports are consistent.

--
Guillaume

#26Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#23)
Re: [PATCHES] WIP patch - INSERT-able log statements

On Tue, 20 Feb 2007, Tom Lane wrote:

I can't believe that any production situation could tolerate the
overhead of one-commit-per-log-line.

There aren't that many log lines, and a production environment with lots
of commit throughput won't even notice. The installation I tune does 300
small commits per second on a bad day; whether or not I'm importing the
log files at the same time barely makes a measurable difference. The
situation obviously changes if you're logging per-query level detail.

So a realistic tool for this is going to have to be able to wrap blocks
of maybe 100 or 1000 or so log lines with BEGIN/COMMIT, and that is
exactly as difficult as wrapping them with a COPY command. Thus, I
disbelieve your argument. We should not be designing this around an
assumed use-case that will only work for toy installations.

Wrapping the commits in blocks to lower overhead is appropriate for toy
installations, and probably medium-sized ones too. Serious installations,
with battery-backed write caches and similar commit-throughput-enhancing
hardware, can commit a low-volume stream like the logs whenever they
please. That's the environment my use-case comes from.

Anyway, it doesn't really matter; I can build a tool with COPY-style
output as well, it just won't be trivial the way the INSERT one would be.
My reasons for "would slightly prefer INSERT" clearly aren't strong enough
to override the issues you bring up with the average case.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
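To make the batching trade-off being discussed here concrete, below is a
sketch of the two loading styles an importing tool could emit, written
against the hypothetical server_log table sketched earlier in the thread;
the sample values are invented.

-- One commit per batch of log lines, instead of one commit per line:
BEGIN;
INSERT INTO server_log (timestamp_with_milliseconds, username, databasename,
                        error_severity, sql_state_code, message)
VALUES ('2007-02-20 12:00:00.123+11', 'app_user', 'proddb', 'ERROR',
        '23505', 'duplicate key violates unique constraint "foo_pkey"');
-- ... a few hundred more INSERTs per batch ...
COMMIT;

-- The COPY equivalent; in psql the data rows would follow in COPY text
-- format, one line per log entry, terminated by a line containing only \.
COPY server_log (timestamp_with_milliseconds, username, databasename,
                 error_severity, sql_state_code, message)
FROM STDIN;
\.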

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: FAST PostgreSQL (#1)
Re: WIP patch - INSERT-able log statements

"FAST PostgreSQL" <fastpgs@fast.fujitsu.com.au> writes:

- The log output will be in COPY format and will include the following
information, irrespective of the log_line_prefix setting.
( timestamp_with_milliseconds, timestamp, username, databasename,
sessionid, host_and_port, host, proc_id, command_tag, session_start,
transaction_id, error_severity, SQL_State_Code, statement/error_message);

How exactly are you fitting the message structure
(primary/detail/context lines) into this? It looks like your proposal
loses that structure ...

A smaller problem is that this forces people to incur a gettimeofday
call for every message logged; depending on your hardware that can be a
pretty nasty overhead. Some people might find some of the other columns
not worth their weight, either. Is it worth providing a knob to
determine the set of columns emitted?

regards, tom lane
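For anyone unfamiliar with the structure Tom is referring to: a single log
entry can carry DETAIL, HINT, and CONTEXT lines in addition to the primary
message, and the column list quoted above only has a slot for the primary
text. Purely as a sketch against the hypothetical server_log table used
earlier in the thread (not anything the patch currently does), one way to
keep that structure would be extra nullable columns:

ALTER TABLE server_log
    ADD COLUMN detail  text,   -- DETAIL: line, if present
    ADD COLUMN hint    text,   -- HINT: line, if present
    ADD COLUMN context text;   -- CONTEXT: line(s), if present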

#28Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#27)
Re: WIP patch - INSERT-able log statements

On Tue, 20 Feb 2007, Tom Lane wrote:

A smaller problem is that this forces people to incur a gettimeofday
call for every message logged

I'm stumped trying to think of an application that would require importing
the logs into a database to analyze them, but not need the timestamp.
I'd expect it to be the primary key on the data.

Is it worth providing a knob to determine the set of columns emitted?

Guillaume and I felt that having the format be standardized had
significant value from a downstream application perspective; it would be
nice to know that everyone can work together to write one simple tool
chain to process these things and it would work everywhere. The current
level of log output customization is part of what makes log analysis tools
so much of a pain.

How about this as a simple way to proceed: have the patch include
everything, as Arul already planned. When it's done, do some benchmarking
with it turned on or off. If it really seems like a drag, then consider a
GUC addition to trim it down. Why optimize prematurely? It's not like
this will be on by default. My guess is that anyone sophisticated enough
to analyze their logs probably has an installation that can support the
overhead.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#29daveg
daveg@sonic.net
In reply to: Guillaume Smet (#25)
Re: WIP patch - INSERT-able log statements

On Thu, Feb 22, 2007 at 11:50:06AM +1100, FAST PostgreSQL wrote:

- The log output will be in COPY format and will include the following
information, irrespective of the log_line_prefix setting.
( timestamp_with_milliseconds, timestamp, username, databasename,
sessionid, host_and_port, host, proc_id, command_tag, session_start,
transaction_id, error_severity, SQL_State_Code, statement/error_message);

Why are there two timestamps?

What about durations? Do they include all this?

Is the statement log written when the statement starts or when it completes?

-dg

--
David Gould daveg@sonic.net
If simplicity worked, the world would be overrun with insects.

#30FAST PostgreSQL
fastpgs@fast.fujitsu.com.au
In reply to: Guillaume Smet (#25)
Re: WIP patch - INSERT-able log statements

Ok. Summarizing, the key changes required to my patch based on the
discussions so far are:

- log_destination will include a new option 'sql'. This can be given in
combination with stderr, syslog or eventlog.
- The SQL logs will be written in log_directory, in a file log_filename.SQL
- The log output will be in COPY format and will include the following
information, irrespective of the log_line_prefix setting:
( timestamp_with_milliseconds, timestamp, username, databasename,
sessionid, host_and_port, host, proc_id, command_tag, session_start,
transaction_id, error_severity, SQL_State_Code, statement/error_message);

Anything else missing?

Rgds,
Arul Shaji
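As a usage sketch of the summary above: if the server writes COPY-format
data into the .SQL file, loading it for analysis could be a single COPY
with the fixed column list. The table and column names come from the
hypothetical server_log sketch earlier in the thread, and the path is only
an example of what log_filename plus '.SQL' might expand to; the exact
layout depends on the final patch.

COPY server_log (timestamp_with_milliseconds, "timestamp", username,
                 databasename, sessionid, host_and_port, host, proc_id,
                 command_tag, session_start, transaction_id,
                 error_severity, sql_state_code, message)
FROM '/usr/local/pgsql/data/pg_log/postgresql-2007-02-22_000000.log.SQL';

The same load works from a client machine with psql's \copy, which reads
the file on the client side instead of the server.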



#31FAST PostgreSQL
fastpgs@fast.fujitsu.com.au
In reply to: Tom Lane (#27)
Re: WIP patch - INSERT-able log statements

On Wed, 21 Feb 2007 12:08, Tom Lane wrote:

"FAST PostgreSQL" <fastpgs@fast.fujitsu.com.au> writes:

- The log output will be in COPY format and will include the following
information, irrespective of the log_line_prefix setting.
( timestamp_with_milliseconds, timestamp, username, databasename,
sessionid, host_and_port, host, proc_id, command_tag, session_start,
transaction_id, error_severity, SQL_State_Code, statement/error_message);

How exactly are you fitting the message structure
(primary/detail/context lines) into this? It looks like your proposal
loses that structure ...

Sorry, I didn't understand. Can you please elaborate?

A smaller problem is that this forces people to incur a gettimeofday
call for every message logged; depending on your hardware that can be a
pretty nasty overhead. Some people might find some of the other columns
not worth their weight, either. Is it worth providing a knob to
determine the set of columns emitted?

Totally agree. My original patch in fact uses log_line_prefix, so the user
can fill in the columns he wants by turning on the appropriate settings in
log_line_prefix. The columns he hasn't turned on will be output as NULL in
the SQL output.

But I can also see merit in the others' request that it is ideal to have
all possible info in the SQL log, so that once the log is loaded into the
table it can be queried, split into sub-tables, or wrapped in views to
analyze it in whatever way is needed.
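As an illustration of that kind of analysis, once the rows are in the
hypothetical server_log table used in the earlier sketches, a quick error
breakdown per database might look like:

SELECT databasename, error_severity, sql_state_code, count(*) AS entries
FROM server_log
WHERE error_severity IN ('ERROR', 'FATAL', 'PANIC')
GROUP BY databasename, error_severity, sql_state_code
ORDER BY entries DESC;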

Rgds,
Arul Shaji

regards, tom lane


#32FAST PostgreSQL
fastpgs@fast.fujitsu.com.au
In reply to: Greg Smith (#28)
Re: WIP patch - INSERT-able log statements

On Wed, 21 Feb 2007 14:59, Greg Smith wrote:

On Tue, 20 Feb 2007, Tom Lane wrote:

A smaller problem is that this forces people to incur a gettimeofday
call for every message logged

I'm stumped trying to think of an application that would require importing
the logs into a database to analyze them, but not need the timestamp.
I'd expect it to be the primary key on the data.

Is it worth providing a knob to determine the set of columns emitted?

Guillaume and I felt that having the format be standardized had
significant value from a downstream application perspective; it would be
nice to know that everyone can work together to write one simple tool
chain to process these things and it would work everywhere. The current
level of log output customization is part of what makes log analysis tools
so much of a pain.

How about this as a simple way to proceed: have the patch include
everything, as Arul already planned. When it's done, do some benchmarking
with it turned on or off. If it really seems like a drag, then consider a
GUC addition to trim it down. Why optimize prematurely? It's not like
this will be on by default. My guess is that anyone sophisticated enough
to analyze their logs probably has an installation that can support the
overhead.

Come to think of it, this may not be ideal after all. As we are triggering
the SQL output in log_destination, if the user gives 'syslog,sql' as
options he is going to get two different-looking logs (in terms of
contents) depending on his settings.

But if we take the settings from log_line_prefix, then the log contents
are the same, plus it gives the user the flexibility to control what he
wants. If a user wants everything, he only has to fill in log_line_prefix
completely.

Also, for a meaningful SQL log output we may need to tell the user not to
turn on verbose output, print_plan, statistics, etc. With a uniform log
output it will be clear in that sense: what he sets in postgresql.conf is
what he gets, both in the syslog and in the SQL log. This is not so much
an optimization as an option which is there if any optimization turns out
to be necessary.

I am happy to implement it either way though. My requirement is the same
as yours: I want some sort of SQL logging, pronto.

Rgds,
Arul Shaji


#33Greg Smith
gsmith@gregsmith.com
In reply to: FAST PostgreSQL (#32)
Re: WIP patch - INSERT-able log statements

On Thu, 22 Feb 2007, FAST PostgreSQL wrote:

As we are triggering the sql output in log_destination, if the user
gives 'syslog,sql' as options he is going to get two different looking
logs (in terms of contents) depending upon his settings.

Yes, exactly; it's a good thing. People add and remove things from the
text logs to make them easier to read. It's one of the reasons they're
harder to process. Since readability isn't a requirement for the
SQL-formatted ones, you can pack a lot more in there and still make it
available easily.

I keep having every part of this conversation twice, so here's take two on
this one. The things that people want out of the text logs are not
necessarily the same things they want from the SQL ones. For example, I
have a situation where the semantics of the syslog output are being driven
by Sarbanes-Oxley-related mechanics. But the SQL logs are based on my
requirements, which are to include close enough to everything that it
might as well be the whole set, in case there's something I forgot that I
find I need later. The SQL logs are *completely* different from the syslog
setup.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#34Bruce Momjian
bruce@momjian.us
In reply to: FAST PostgreSQL (#1)
Re: WIP patch - INSERT-able log statements

What is the status of this patch?



--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +