BUG #3799: csvlog skips some logs

Started by depesz · about 18 years ago · 24 messages
#1 depesz
depesz@depesz.com

The following bug has been logged online:

Bug reference: 3799
Logged by: depesz
Email address: depesz@depesz.com
PostgreSQL version: 8.3 HEAD
Operating system: linux, 32bit
Description: csvlog skips some logs
Details:

when the settings are set to:
# select name, setting, unit from pg_settings where name ilike 'log%' order by name;
name | setting | unit
-----------------------------+--------------------------------+--------
log_autovacuum_min_duration | 0 | ms
log_checkpoints | on | [null]
log_connections | on | [null]
log_destination | stderr | [null]
log_directory | pg_log | [null]
log_disconnections | off | [null]
log_duration | off | [null]
log_error_verbosity | default | [null]
log_executor_stats | off | [null]
log_filename | postgresql-%Y-%m-%d_%H%M%S.log | [null]
logging_collector | on | [null]
log_hostname | off | [null]
log_line_prefix | [%t] [%p] <%u@%d> | [null]
log_lock_waits | off | [null]
log_min_duration_statement | 0 | ms
log_min_error_statement | error | [null]
log_min_messages | notice | [null]
log_parser_stats | off | [null]
log_planner_stats | off | [null]
log_rotation_age | 1440 | min
log_rotation_size | 102400 | kB
log_statement | none | [null]
log_statement_stats | off | [null]
log_temp_files | -1 | kB
log_timezone | Poland | [null]
log_truncate_on_rotation | off | [null]
(26 rows)

running the faulty query in psql gives this output:
# select * from (select * from test);
ERROR: subquery in FROM must have an alias
HINT: For example, FROM (SELECT ...) [AS] foo.

and this is logged to the logfile:
[2007-12-05 21:29:27 CET] [23689] <depesz@depesz> ERROR: subquery in FROM must have an alias
[2007-12-05 21:29:27 CET] [23689] <depesz@depesz> HINT: For example, FROM (SELECT ...) [AS] foo.
[2007-12-05 21:29:27 CET] [23689] <depesz@depesz> STATEMENT: select * from (select * from test);

but when I change the settings to:
# select name, setting, unit from pg_settings where name ilike 'log%' order by name;
name | setting | unit
-----------------------------+--------------------------------+--------
log_autovacuum_min_duration | 0 | ms
log_checkpoints | on | [null]
log_connections | on | [null]
log_destination | csvlog | [null]
log_directory | pg_log | [null]
log_disconnections | off | [null]
log_duration | off | [null]
log_error_verbosity | default | [null]
log_executor_stats | off | [null]
log_filename | postgresql-%Y-%m-%d_%H%M%S.log | [null]
logging_collector | on | [null]
log_hostname | off | [null]
log_line_prefix | [%t] [%p] <%u@%d> | [null]
log_lock_waits | off | [null]
log_min_duration_statement | 0 | ms
log_min_error_statement | error | [null]
log_min_messages | notice | [null]
log_parser_stats | off | [null]
log_planner_stats | off | [null]
log_rotation_age | 1440 | min
log_rotation_size | 102400 | kB
log_statement | none | [null]
log_statement_stats | off | [null]
log_temp_files | -1 | kB
log_timezone | Poland | [null]
log_truncate_on_rotation | off | [null]
(26 rows)
(only log_destination changed) and issue the same query, only this gets logged:

2007-12-05 21:30:42.148 CET,"depesz","depesz",47570a4e.5cfb,[local],23803,5,"idle",2007-12-05 21:30:06 CET,1/4,0,ERROR,42601,"subquery in FROM must have an alias"

the output on the console is the same:
# select * from (select * from test);
ERROR: subquery in FROM must have an alias
HINT: For example, FROM (SELECT ...) [AS] foo.

the documentation doesn't mention this kind of difference (i.e. skipping
parts of the logs), so it looks like a bug.

depesz

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: depesz (#1)
Re: BUG #3799: csvlog skips some logs

"depesz" <depesz@depesz.com> writes:

Description: csvlog skips some logs

The point here is that CSV-format log output doesn't include the
DETAIL, HINT, or context (QUERY/STATEMENT/CONTEXT) lines that
you might get with normal output.

I suppose this was intentional in order to keep the CSV output
format manageable, but I have to wonder whether it's really a
good idea. I can see the argument that you probably don't need
to log HINTs, but the other stuff might be important. Particularly
the STATEMENT.

Comments?

regards, tom lane

#3 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Tom Lane (#2)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

"depesz" <depesz@depesz.com> writes:

Description: csvlog skips some logs

The point here is that CSV-format log output doesn't include the
DETAIL, HINT, or context (QUERY/STATEMENT/CONTEXT) lines that
you might get with normal output.

I suppose this was intentional in order to keep the CSV output
format manageable, but I have to wonder whether it's really a
good idea. I can see the argument that you probably don't need
to log HINTs, but the other stuff might be important. Particularly
the STATEMENT.

Hmm. I think it could be a problem in cases where an error message with
identical wording is disambiguated by context. In my VH opinion we
shouldn't be losing log info just because we're using a different
format.

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Linux transform� mi computadora, de una `m�quina para hacer cosas',
en un aparato realmente entretenido, sobre el cual cada d�a aprendo
algo nuevo" (Jaime Salinas)

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

"depesz" <depesz@depesz.com> writes:

Description: csvlog skips some logs

The point here is that CSV-format log output doesn't include the
DETAIL, HINT, or context (QUERY/STATEMENT/CONTEXT) lines that
you might get with normal output.

I suppose this was intentional in order to keep the CSV output
format manageable, but I have to wonder whether it's really a
good idea. I can see the argument that you probably don't need
to log HINTs, but the other stuff might be important. Particularly
the STATEMENT.

Comments?

I don't recall any such conscious intention - and I'm not sure about others
whose fingers have been in the pie. More likely it's just an oversight. In
general, I'd say that the log content should be independent of the
format. I can't see any very good reason for text logs to have different
content from CSV logs.

cheers

andrew

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan <andrew@dunslane.net> writes:

I can't see any very good reason for text logs to have different
content from CSV logs.

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

regards, tom lane

#6 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I can't see any very good reason for text logs to have different
content from CSV logs.

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

Really? Six? In any case, would that be so bad? It would mean six extra
commas per line in the log file, and nothing much in the log table
unless there were content in those fields.

cheers

andrew

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

Really? Six? In any case, would that be so bad? It would mean six extra
commas per line in the log file, and nothing much in the log table
unless there were content in those fields.

Yeah --- the lines output in the plain-stderr case that are not covered
in the other are

DETAIL
HINT
QUERY (this is an internally-generated query that failed)
CONTEXT (think "stack trace")
LOCATION (reference to code file/line reporting the error)
STATEMENT (user query that led to the error)

One issue here is that CONTEXT is potentially multiple lines. I'm not
sure that there is much we can do about that, especially not at the last
minute. If we had some time to rewrite internal APIs it might be fun to
think about emitting that as array of text not just text, but I fear
it's much too late to consider that now.

Another thing that I notice is that the CSV code emulates a couple of
not-very-orthogonal behaviors of send_message_to_server_log():
substituting "missing error text" for a NULL error field, and emitting
cursor pos as a tack-on to the error text instead of a separate field.
I think both of those are less than defensible. So if you're willing
to entertain redefining the CSV column set at this late date, I'd
propose throwing in a seventh new field too: CURSORPOS.

Another thing: for stderr output, we have various log verbosity options
that determine whether these additional fields get printed. Should
those options also function in the CSV-output case, or are we just going
to do our best to exhaust disk space as fast as possible all the time?

regards, tom lane
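
To make the proposal concrete, here is a rough sketch of how the documented
postgres_log sample table might grow if the six lines above, plus the
proposed CURSORPOS, became separate CSV columns. The column names and order
here are only illustrative; Alvaro's patch further down in the thread
settles the actual definition.

    -- Sketch only: possible extra columns for the postgres_log sample table,
    -- mirroring the DETAIL, HINT, QUERY, CONTEXT, LOCATION and STATEMENT
    -- lines plus a cursor position field.
    ALTER TABLE postgres_log
      ADD COLUMN detail text,          -- DETAIL line
      ADD COLUMN hint text,            -- HINT line
      ADD COLUMN internal_query text,  -- QUERY: internally-generated query
      ADD COLUMN context text,         -- CONTEXT: "stack trace"
      ADD COLUMN query text,           -- STATEMENT: user query
      ADD COLUMN cursor_pos integer,   -- proposed CURSORPOS field
      ADD COLUMN location text;        -- LOCATION: source file/line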

#8 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#7)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

Really? Six? In any case, would that be so bad? It would mean six extra
commas per line in the log file, and nothing much in the log table
unless there were content in those fields.

Yeah --- the lines output in the plain-stderr case that are not covered
in the other are

DETAIL
HINT
QUERY (this is an internally-generated query that failed)
CONTEXT (think "stack trace")
LOCATION (reference to code file/line reporting the error)
STATEMENT (user query that led to the error)

One issue here is that CONTEXT is potentially multiple lines. I'm not
sure that there is much we can do about that, especially not at the last
minute. If we had some time to rewrite internal APIs it might be fun to
think about emitting that as array of text not just text, but I fear
it's much too late to consider that now.

I'm not sure that putting all this into a single extra field would be so
wrong. As for an array of text, that doesn't seem very portable. I don't
think we should assume that Postgres is the only intended program
destination of CSV logs.

Another thing that I notice is that the CSV code emulates a couple of
not-very-orthogonal behaviors of send_message_to_server_log():
substituting "missing error text" for a NULL error field, and emitting
cursor pos as a tack-on to the error text instead of a separate field.
I think both of those are less than defensible. So if you're willing
to entertain redefining the CSV column set at this late date, I'd
propose throwing in a seventh new field too: CURSORPOS.

Seems like over-egging the pudding to me somewhat, but OK if we decide
to go with a whole bunch of fields.

Another thing: for stderr output, we have various log verbosity options
that determine whether these additional fields get printed. Should
those options also function in the CSV-output case, or are we just going
to do our best to exhaust disk space as fast as possible all the time?

I think for sanity's sake we need one (maximal) format. I'm not so
concerned about disk space. It's not like this is the only logging
option available.

cheers

andrew

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

One issue here is that CONTEXT is potentially multiple lines. I'm not
sure that there is much we can do about that, especially not at the last
minute. If we had some time to rewrite internal APIs it might be fun to
think about emitting that as array of text not just text, but I fear
it's much too late to consider that now.

I'm not sure that putting all this into a single extra field would be so
wrong. As for an array of text, that doesn't seem very portable. I don't
think we should assume that Postgres is the only intended program
destination of CSV logs.

Well, I don't see that "{some text,more text,yet more text}" is going
to be harder to cram into the average CSV-reader than "some text
more text
yet more text". However, in most cases split_to_array on newlines
would be a good enough way of deconstructing the field in Postgres,
so maybe it's not worth worrying about.

Anyway, I think that we should just make the CSV fields be the same as
the existing divisions in the textual log format, which seem to have
stood up well enough in use since 7.4 or whenever we put that scheme in.

regards, tom lane
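
Tom's "split_to_array" presumably refers to the standard string_to_array
function; a minimal sketch of the idea, assuming the CSV log has already been
loaded into a postgres_log table with a context column as in the patch below:

    -- Break a multiline context field back into one array element per line.
    SELECT string_to_array(context, E'\n') AS context_lines
      FROM postgres_log
     WHERE context IS NOT NULL;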

#10 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

One issue here is that CONTEXT is potentially multiple lines. I'm not
sure that there is much we can do about that, especially not at the last
minute. If we had some time to rewrite internal APIs it might be fun to
think about emitting that as array of text not just text, but I fear
it's much too late to consider that now.

I'm not sure that putting all this into a single extra field would be so
wrong. As for an array of text, that doesn't seem very portable. I don't
think we should assume that Postgres is the only intended program
destination of CSV logs.

Well, I don't see that "{some text,more text,yet more text}" is going
to be harder to cram into the average CSV-reader than "some text
more text
yet more text". However, in most cases split_to_array on newlines
would be a good enough way of deconstructing the field in Postgres,
so maybe it's not worth worrying about.

Anyway, I think that we should just make the CSV fields be the same as
the existing divisions in the textual log format, which seem to have
stood up well enough in use since 7.4 or whenever we put that scheme in.

OK, works for me. I'll try to look at it after I have attended to the
Windows build issues. My plate is pretty full right now, though.

cheers

andrew

#11 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andrew Dunstan (#10)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan wrote:

OK, works for me. I'll try to look at it after I have attended to the
Windows build issues. My plate is pretty full right now, though.

FYI I'm having a look at it now.

--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"Llegar� una �poca en la que una investigaci�n diligente y prolongada sacar�
a la luz cosas que hoy est�n ocultas" (S�neca, siglo I)

#12 Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#11)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera wrote:

Andrew Dunstan wrote:

OK, works for me. I'll try to look at it after I have attended to the
Windows build issues. My plate is pretty full right now, though.

FYI I'm having a look at it now.

Great. Thanks.

cheers

andrew

#13 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Tom Lane (#7)
1 attachment
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

Really? Six? In any case, would that be so bad? It would mean six extra
commas per line in the log file, and nothing much in the log table
unless there were content in those fields.

Yeah --- the lines output in the plain-stderr case that are not covered
in the other are

DETAIL
HINT
QUERY (this is an internally-generated query that failed)
CONTEXT (think "stack trace")
LOCATION (reference to code file/line reporting the error)
STATEMENT (user query that led to the error)

Here is a patch to do this. It emits all of these as separate columns,
which are output empty if they are not present. Of course, the commas
are emitted all the time.

I changed some other things. For example, the original code puts
"[unknown]" when the username or database name is unknown. I think this
is pointless; an empty field is fine. I noticed that "unknown" VXIDs
and TransactionIds are printed as xx/-1 and 0 respectively. Should we
just leave the columns empty?

The error location (function name, file name, and line number) is only
reported if log_error_verbosity is set to verbose, just as in the main logs.
The commas are emitted anyway. It is emitted as a single field, so it is
a single column which looks like
"transformColumnRef, /pgsql/source/00head/src/backend/parser/parse_expr.c:420"
I very much doubt that it is worth separating that any further, but if
anyone disagrees, this is the time to change it.

I tried importing the resulting file into Gnumeric, and after some
fooling around to make it use commas as separators instead of colons (it
"autodetects" the separators and seems to consider colons best for these
files for some reason), it works fine, including multiline context
fields. It does look a bit weird though, because they are not multiline
in the spreadsheet but instead it shows the funny square reserved for
non-printable chars with the [0A] byte in it.

I imported it with COPY WITH CSV and it looks fine in psql; multiline
context is displayed correctly.
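
For concreteness, the import amounts to something like the following (the
file path is a placeholder, and the table is the postgres_log definition
from the docs change in the patch below):

    -- Load a CSV-format log file into the sample postgres_log table.
    COPY postgres_log FROM '/path/to/pg_log/somefile.csv' WITH CSV;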

Another change I did was to change a "%.*s" to "%*s". The "precision"
marker seems useless AFAICT.

One issue here is that CONTEXT is potentially multiple lines. I'm not
sure that there is much we can do about that, especially not at the last
minute. If we had some time to rewrite internal APIs it might be fun to
think about emitting that as array of text not just text, but I fear
it's much too late to consider that now.

In psql it looks like this. The first message is from when I forgot to use
the CSV option to COPY the first time. The second is from when I renamed a
file out from under CHECKPOINT, to see what it looks like when there are no
usernames, etc. The third is a nested PL/pgSQL function call, to show
multiline context.

alvherre=# select message, context from postgres_log where context is not null;
-[ RECORD 1 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
message | la sintaxis de entrada no es válida para tipo timestamp with time zone: «2007-12-10 09:43:32.473 CLST,"alvherre","alvherre",475d33a0.3c78,[local],15480,3,"CREATE TABLE",2007-12-10 09:40:00 CLST,1/4,387,NOTICE,00000,"CREATE TABLE / PRIMARY KEY creará el índice implícito «postgres_log_pkey» para la tabla «postgres_log»",,,,,,,,»
context | COPY postgres_log, línea 1, columna log_time: «2007-12-10 09:43:32.473 CLST,"alvherre","alvherre",475d33a0.3c78,[local],15480,3,"CREATE TABLE",2007...»
-[ RECORD 2 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
message | no se pudo abrir la relación 1663/16384/16391: No existe el fichero o el directorio
context | escribiendo el bloque 0 de la relaci�n 1663/16384/16391
-[ RECORD 3 ]-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
message | no existe la columna «foo»
context | PL/pgSQL function "foo" line 1 at SQL statement
: sentencia SQL: «select foo()»
: PL/pgSQL function "bar" line 1 at SQL statement
: sentencia SQL: «select bar()»
: PL/pgSQL function "baz" line 1 at SQL statement

The internal_query thing (notice how "cursor_pos" is only reported if
not zero):

alvherre=# select message, context, internal_query, internal_pos, query, cursor_pos from postgres_log where internal_query is not null;
-[ RECORD 1 ]--+------------------------------------------------
message | no existe la columna «foo»
context | PL/pgSQL function "foo" line 1 at SQL statement
: sentencia SQL: «select foo()»
: PL/pgSQL function "bar" line 1 at SQL statement
: sentencia SQL: «select bar()»
: PL/pgSQL function "baz" line 1 at SQL statement
internal_query | select foo
internal_pos | 8
query | select baz();
cursor_pos |

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"I personally became interested in Linux while I was dating an English major
who wouldn't know an operating system if it walked up and bit him."
(Val Henson)

Attachments:

csvlog.patch (text/x-diff; charset=us-ascii)
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.158
diff -c -p -r1.158 config.sgml
*** doc/src/sgml/config.sgml	28 Nov 2007 15:42:30 -0000	1.158
--- doc/src/sgml/config.sgml	10 Dec 2007 13:18:42 -0000
*************** local0.*    /var/log/postgresql
*** 3169,3175 ****
          with these columns: timestamp with milliseconds, user name, database
          name, session ID, host:port number, process ID, per-process line
          number, command tag, session start time, virtual transaction ID,
!         regular transaction id, error severity, SQL state code, error message.
          Here is a sample table definition for storing CSV-format log output:
  
  <programlisting>
--- 3169,3180 ----
          with these columns: timestamp with milliseconds, user name, database
          name, session ID, host:port number, process ID, per-process line
          number, command tag, session start time, virtual transaction ID,
!         regular transaction id, error severity, SQL state code, error message,
!         error message detail, hint, the internal query that led to the error
!         if any, the error position thereof, the user query that led to the
!         error if any, the error position thereof, and the location of the
!         error in the PostgreSQL source code (this field is only emitted
!         if <varname>log_error_verbosity</> is set to <literal>verbose</>).
          Here is a sample table definition for storing CSV-format log output:
  
  <programlisting>
*************** CREATE TABLE postgres_log
*** 3189,3194 ****
--- 3194,3207 ----
    error_severity text,
    sql_state_code text,
    message text,
+   detail text,
+   context text,
+   hint text,
+   internal_query text,
+   internal_pos integer,
+   query text,
+   cursor_pos integer,
+   location text,
    PRIMARY KEY (session_id, process_line_num)
  );
  </programlisting>
Index: src/backend/utils/error/elog.c
===================================================================
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/utils/error/elog.c,v
retrieving revision 1.198
diff -c -p -r1.198 elog.c
*** src/backend/utils/error/elog.c	15 Nov 2007 21:14:40 -0000	1.198
--- src/backend/utils/error/elog.c	10 Dec 2007 12:49:51 -0000
*************** static const char *error_severity(int el
*** 134,140 ****
  static void append_with_tabs(StringInfo buf, const char *str);
  static bool is_log_level_output(int elevel, int log_min_level);
  static void write_pipe_chunks(char *data, int len, int dest);
- static void get_csv_error_message(StringInfo buf, ErrorData *edata);
  static void write_csvlog(ErrorData *edata);
  
  /*
--- 134,139 ----
*************** log_line_prefix(StringInfo buf)
*** 1616,1629 ****
  /*
   * append a CSV'd version of a string to a StringInfo
   * We use the PostgreSQL defaults for CSV, i.e. quote = escape = '"'
   */
- 
  static inline void
  appendCSVLiteral(StringInfo buf, const char *data)
  {
  	const char *p = data;
  	char		c;
  
  	appendStringInfoCharMacro(buf, '"');
  	while ((c = *p++) != '\0')
  	{
--- 1615,1632 ----
  /*
   * append a CSV'd version of a string to a StringInfo
   * We use the PostgreSQL defaults for CSV, i.e. quote = escape = '"'
+  * If it's NULL, append nothing.
   */
  static inline void
  appendCSVLiteral(StringInfo buf, const char *data)
  {
  	const char *p = data;
  	char		c;
  
+ 	/* avoid confusing an empty string with NULL */
+ 	if (p == NULL)
+ 		return;
+ 
  	appendStringInfoCharMacro(buf, '"');
  	while ((c = *p++) != '\0')
  	{
*************** appendCSVLiteral(StringInfo buf, const c
*** 1642,1649 ****
  static void
  write_csvlog(ErrorData *edata)
  {
- 	StringInfoData msgbuf;
  	StringInfoData buf;
  
  	/* static counter for line numbers */
  	static long log_line_number = 0;
--- 1645,1652 ----
  static void
  write_csvlog(ErrorData *edata)
  {
  	StringInfoData buf;
+ 	bool	print_stmt = false;
  
  	/* static counter for line numbers */
  	static long log_line_number = 0;
*************** write_csvlog(ErrorData *edata)
*** 1664,1670 ****
  	}
  	log_line_number++;
  
- 	initStringInfo(&msgbuf);
  	initStringInfo(&buf);
  
  	/*
--- 1667,1672 ----
*************** write_csvlog(ErrorData *edata)
*** 1715,1740 ****
  
  	/* username */
  	if (MyProcPort)
! 	{
! 		const char *username = MyProcPort->user_name;
! 
! 		if (username == NULL || *username == '\0')
! 			username = _("[unknown]");
! 
! 		appendCSVLiteral(&buf, username);
! 	}
  	appendStringInfoChar(&buf, ',');
  
  	/* databasename */
  	if (MyProcPort)
! 	{
! 		const char *dbname = MyProcPort->database_name;
! 
! 		if (dbname == NULL || *dbname == '\0')
! 			dbname = _("[unknown]");
! 
! 		appendCSVLiteral(&buf, dbname);
! 	}
  	appendStringInfoChar(&buf, ',');
  
  	/* session id */
--- 1717,1728 ----
  
  	/* username */
  	if (MyProcPort)
! 		appendCSVLiteral(&buf, MyProcPort->user_name);
  	appendStringInfoChar(&buf, ',');
  
  	/* databasename */
  	if (MyProcPort)
! 		appendCSVLiteral(&buf, MyProcPort->database_name);
  	appendStringInfoChar(&buf, ',');
  
  	/* session id */
*************** write_csvlog(ErrorData *edata)
*** 1762,1774 ****
  	/* PS display */
  	if (MyProcPort)
  	{
  		const char *psdisp;
  		int			displen;
  
  		psdisp = get_ps_display(&displen);
! 		appendStringInfo(&msgbuf, "%.*s", displen, psdisp);
  		appendCSVLiteral(&buf, msgbuf.data);
! 		resetStringInfo(&msgbuf);
  	}
  	appendStringInfoChar(&buf, ',');
  
--- 1750,1766 ----
  	/* PS display */
  	if (MyProcPort)
  	{
+ 		StringInfoData msgbuf;
  		const char *psdisp;
  		int			displen;
  
+ 		initStringInfo(&msgbuf);
+ 
  		psdisp = get_ps_display(&displen);
! 		appendStringInfo(&msgbuf, "%*s", displen, psdisp);
  		appendCSVLiteral(&buf, msgbuf.data);
! 	
! 		pfree(msgbuf.data);
  	}
  	appendStringInfoChar(&buf, ',');
  
*************** write_csvlog(ErrorData *edata)
*** 1796,1808 ****
  	appendStringInfoChar(&buf, ',');
  
  	/* Error severity */
! 	appendStringInfo(&buf, "%s,", error_severity(edata->elevel));
  
  	/* SQL state code */
! 	appendStringInfo(&buf, "%s,", unpack_sql_state(edata->sqlerrcode));
  
! 	/* Error message and cursor position if any */
! 	get_csv_error_message(&buf, edata);
  
  	appendStringInfoChar(&buf, '\n');
  
--- 1788,1863 ----
  	appendStringInfoChar(&buf, ',');
  
  	/* Error severity */
! 	appendStringInfo(&buf, "%s", error_severity(edata->elevel));
! 	appendStringInfoChar(&buf, ',');
  
  	/* SQL state code */
! 	appendStringInfo(&buf, "%s", unpack_sql_state(edata->sqlerrcode));
! 	appendStringInfoChar(&buf, ',');
! 
! 	/* errmessage */
! 	appendCSVLiteral(&buf, edata->message);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* errdetail */
! 	appendCSVLiteral(&buf, edata->detail);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* errcontext -- XXX maybe multiline -- need to do something about that? */
! 	appendCSVLiteral(&buf, edata->context);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* errhint */
! 	appendCSVLiteral(&buf, edata->hint);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* internal query */
! 	appendCSVLiteral(&buf, edata->internalquery);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* if printed internal query, print internal pos too */
! 	if (edata->internalpos > 0 && edata->internalquery != NULL)
! 		appendStringInfo(&buf, "%d", edata->internalpos);
! 	appendStringInfoCharMacro(&buf, ',');
! 
! 	/* the user query is only reported if not disabled by the caller */
! 	if (is_log_level_output(edata->elevel, log_min_error_statement) &&
! 		debug_query_string != NULL &&
! 		!edata->hide_stmt)
! 		print_stmt = true;
! 	if (print_stmt)
! 		appendCSVLiteral(&buf, debug_query_string);
! 	appendStringInfoCharMacro(&buf, ',');
! 	if (print_stmt && edata->cursorpos > 0)
! 		appendStringInfo(&buf, "%d", edata->cursorpos);
! 	appendStringInfoCharMacro(&buf, ',');
  
! 	/* file error location */
! 	if (Log_error_verbosity >= PGERROR_VERBOSE)
! 	{
! 		/* assume no newlines in funcname or filename... */
! 		if (edata->funcname && edata->filename)
! 		{
! 			StringInfoData	msgbuf;
! 
! 			initStringInfo(&msgbuf);
! 			appendStringInfo(&msgbuf, "%s, %s:%d",
! 							 edata->funcname, edata->filename,
! 							 edata->lineno);
! 			appendCSVLiteral(&buf, msgbuf.data);
! 			pfree(msgbuf.data);
! 		}
! 		else if (edata->filename)
! 		{
! 			StringInfoData	msgbuf;
! 
! 			initStringInfo(&msgbuf);
! 			appendStringInfo(&msgbuf, "%s:%d",
! 							 edata->filename, edata->lineno);
! 			appendCSVLiteral(&buf, msgbuf.data);
! 			pfree(msgbuf.data);
! 		}
! 	}
  
  	appendStringInfoChar(&buf, '\n');
  
*************** write_csvlog(ErrorData *edata)
*** 1812,1851 ****
  	else
  		write_pipe_chunks(buf.data, buf.len, LOG_DESTINATION_CSVLOG);
  
- 	pfree(msgbuf.data);
  	pfree(buf.data);
  }
  
  /*
-  * Appends the buffer with the error message and the cursor position, all
-  * CSV escaped.
-  */
- static void
- get_csv_error_message(StringInfo buf, ErrorData *edata)
- {
- 	char	   *msg = edata->message ? edata->message : _("missing error text");
- 	char		c;
- 
- 	appendStringInfoCharMacro(buf, '"');
- 
- 	while ((c = *msg++) != '\0')
- 	{
- 		if (c == '"')
- 			appendStringInfoCharMacro(buf, '"');
- 		appendStringInfoCharMacro(buf, c);
- 	}
- 
- 	if (edata->cursorpos > 0)
- 		appendStringInfo(buf, _(" at character %d"),
- 						 edata->cursorpos);
- 	else if (edata->internalpos > 0)
- 		appendStringInfo(buf, _(" at character %d"),
- 						 edata->internalpos);
- 
- 	appendStringInfoCharMacro(buf, '"');
- }
- 
- /*
   * Unpack MAKE_SQLSTATE code. Note that this returns a pointer to a
   * static buffer.
   */
--- 1867,1876 ----
#14 Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#13)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera wrote:

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

Well, if we want to cram all that stuff in there, how shall we do it?
It seems wrong to put all those lines into one text field, but I'm
not sure I want to add six more text fields to the CSV format
either. Thoughts?

Really? Six? In any case, would that be so bad? It would mean six extra
commas per line in the log file, and nothing much in the log table
unless there were content in those fields.

Yeah --- the lines output in the plain-stderr case that are not covered
in the other are

DETAIL
HINT
QUERY (this is an internally-generated query that failed)
CONTEXT (think "stack trace")
LOCATION (reference to code file/line reporting the error)
STATEMENT (user query that led to the error)

Here is a patch to do this. It emits all of these as separate columns,
which are output empty if they are not present. Of course, the commas
are emitted all the time.

Thanks. I will look at it in detail later today.

Not sure I understand what this comment I noticed on a very brief glance
is about:

/* assume no newlines in funcname or filename... */

If it's about what to quote, we need to quote anything that might contain a newline, quote or comma. Filenames certainly come into that category.

cheers

andrew

#15 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andrew Dunstan (#14)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan wrote:

Not sure I understand what this comment I noticed on a very brief glance is
about:

/* assume no newlines in funcname or filename... */

If it's about what to quote, we need to quote anything that might contain a newline, quote or comma. Filenames certainly come into that category.

Huh, sorry, that's a leftover comment. The inserted string is quoted
inside appendCSVLiteral. Thanks, I'll fix it.

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"No es bueno caminar con un hombre muerto"

#16 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#15)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera wrote:

Andrew Dunstan wrote:

Not sure I understand what this comment I noticed on a very brief glance is
about:

/* assume no newlines in funcname or filename... */

If it's about what to quote, we need to quote anything that might contain a newline, quote or comma. Filenames certainly come into that category.

Huh, sorry, that's a leftover comment. The inserted string is quoted
inside appendCSVLiteral. Thanks, I'll fix it.

This part now looks like this (it was a bit duplicative):

    /* file error location */
    if (Log_error_verbosity >= PGERROR_VERBOSE)
    {
        StringInfoData msgbuf;

        initStringInfo(&msgbuf);

        if (edata->funcname && edata->filename)
            appendStringInfo(&msgbuf, "%s, %s:%d",
                             edata->funcname, edata->filename,
                             edata->lineno);
        else if (edata->filename)
            appendStringInfo(&msgbuf, "%s:%d",
                             edata->filename, edata->lineno);
        appendCSVLiteral(&buf, msgbuf.data);
        pfree(msgbuf.data);
    }

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
<Schwern> It does it in a really, really complicated way
<crab> why does it need to be complicated?
<Schwern> Because it's MakeMaker.

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#14)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Andrew Dunstan <andrew@dunslane.net> writes:

If it's about what to quote, we need to quote anything that might
contain a newline, quote or comma. Filenames certainly come into that
category.

These are not general file paths, these are base names of our own C
source code files. I don't have a problem with legislating that we will
never have such characters in backend source file names ...

regards, tom lane

#18 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Tom Lane (#17)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

If it's about what to quote, we need to quote anything that might
contain a newline, quote or comma. Filenames certainly come into that
category.

These are not general file paths, these are base names of our own C
source code files. I don't have a problem with legislating that we will
never have such characters in backend source file names ...

Yeah, I was about to say the same thing, but the problem is that the
field is constructed like

function, file:line

so there is a comma anyway.

(We could split it, but since this is only emitted in the
log_error_verbosity=verbose case, I don't think it makes much sense.)

--
Alvaro Herrera http://www.flickr.com/photos/alvherre/
"I suspect most samba developers are already technically insane...
Of course, since many of them are Australians, you can't tell." (L. Torvalds)
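
Since the combined field keeps the "function, file:line" form, anyone who
does want the parts separately can split them after loading the log; a rough
sketch, using the location column name from the docs change in the patch and
assuming both function and file were reported:

    -- Split "function, file:line" back into its two pieces.
    SELECT split_part(location, ', ', 1) AS function_name,
           split_part(location, ', ', 2) AS file_and_line
      FROM postgres_log
     WHERE location IS NOT NULL;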

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#13)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Another change I did was to change a "%.*s" to "%*s". The "precision"
marker seems useless AFAICT.

This is wrong, broken, will cause crashes on platforms where the PS
string is not null-terminated. (Hint: .* is a maximum width, * is a
minimum width.)

Another thing I thought strange was the order of the added columns;
why isn't it more like the order in which they appear in the text
version? In particular hint should probably come just after detail
and before context, and internal-query should also come before context
because when relevant it's usually more closely nested than the context
stack.

Otherwise the patch looks pretty sane to me. I didn't do any testing
though.

regards, tom lane

#20 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Tom Lane (#19)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Hmm, I'm now wondering if the log line number is correctly positioned.
Right now we have it just after the PID. So it suggests that following
PID and log line number is enough for tracking what a session does.
While this is not entirely incorrect, ISTM more logical to put it closer to
the session ID, and to change the name so it is less misleading in that
sense.

Currently we have

session_id | 475e91da.291f
connection_from | [local]
process_id | 10527
process_line_num | 3

I propose we change it to

process_id | 10527
connection_from | [local]
session_id | 475e91da.291f
session_line_num | 3

Note the changed column name. I also suggest we change the description of
%l in log_line_prefix to

    Number of the log line for each session, starting at 1

(the original is "Number of the log line for each process, starting at 1").

Thoughts?

--
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"No necesitamos banderas
No reconocemos fronteras" (Jorge Gonz�lez)

#21 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#20)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Another problem I just noticed is that it seems the bgwriter is
inheriting the session id from Postmaster; it doesn't have one of its
own.

--
Alvaro Herrera Developer, http://www.PostgreSQL.org/
"People get annoyed when you try to debug them." (Larry Wall)

#22 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#21)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera wrote:

Another problem I just noticed is that it seems the bgwriter is
inheriting the session id from Postmaster; it doesn't have one of its
own.

Huh, sorry, I'm just blind, I neglected to look at the last digit ;-)

--
Alvaro Herrera Valdivia, Chile ICBM: S 39° 49' 18.1", W 73° 13' 56.4"
"I must say, I am absolutely impressed with what pgsql's implementation of
VALUES allows me to do. It's kind of ridiculous how much "work" goes away in
my code. Too bad I can't do this at work (Oracle 8/9)." (Tom Allison)
http://archives.postgresql.org/pgsql-general/2007-06/msg00016.php
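
For context on the "last digit" remark: the session ID that %c and the CSV
log report is two hexadecimal numbers separated by a dot - the backend start
time and the backend PID - so the bgwriter's ID differs from the postmaster's
in its second half even when the first halves (start times) match. A quick
check against the values shown in message #20:

    -- The second half of session_id 475e91da.291f is the PID in hex.
    SELECT x'291f'::int AS pid;   -- 10527, matching process_id above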

#23 Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Tom Lane (#19)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Tom Lane wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Another change I did was to change a "%.*s" to "%*s". The "precision"
marker seems useless AFAICT.

This is wrong, broken, will cause crashes on platforms where the PS
string is not null-terminated. (Hint: .* is a maximum width, * is a
minimum width.)

Oh, OK.

Another thing I thought strange was the order of the added columns;
why isn't it more like the order in which they appear in the text
version? In particular hint should probably come just after detail
and before context, and internal-query should also come before context
because when relevant it's usually more closely nested than the context
stack.

OK, I changed it as you suggest. I didn't make any other order changes.
I still propose that the log line number should be moved w.r.t. session
identifier.

I changed two more things: the VXID is not reported when not in a backend
(because AuxiliaryProcesses are said to never have one), and I added quotes
around the hostname, because in a test here it seems to be possible to
create an alias for my loopback interface with a name like "a,b".

--
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"All rings of power are equal,
But some rings of power are more equal than others."
(George Orwell's The Lord of the Rings)

#24 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#23)
Re: [HACKERS] BUG #3799: csvlog skips some logs

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

I still propose that the log line number should be moved w.r.t. session
identifier.

No objection here.

I changed two more things: the VXID is not reported if not in a backend
(because AuxiliaryProcesses are said to never have one), and added
quotes surrounding the hostname, because in a test here it seems
supported to create an alias for my loopback interface with a name like
"a,b".

Sounds reasonable, as long as autovac processes still report VXID.

regards, tom lane