bug: json format and auto_explain

Started by Euler Taveira de Oliveiraabout 16 years ago9 messages

Hi,

While testing the EXPLAIN BUFFERS patch I found a bug. I'm too tired to
provide a fix right now but if someone can take it I will appreciate. It seems
ExplainJSONLineEnding() doesn't expect es->grouping_stack list as a null pointer.

euler@harman $ ./install/bin/psql
psql (8.5devel)
Type "help" for help.

euler=# load 'auto_explain';
LOAD
euler=# set auto_explain.log_min_duration to 0;
SET
euler=# set auto_explain.log_format to 'json';
SET
euler=# explain (format json) select * from pgbench_branches inner join
pgbench_history using (bid) where bid > 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
euler=# \q
euler@harman /a/pgsql/dev $ gdb ./install/bin/postgres ./data/core
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html&gt;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /usr/lib/libxml2.so.2...done.
Loaded symbols for /usr/lib/libxml2.so.2
Reading symbols from /usr/lib/libssl.so.0.9.8...done.
Loaded symbols for /usr/lib/libssl.so.0.9.8
Reading symbols from /usr/lib/libcrypto.so.0.9.8...done.
Loaded symbols for /usr/lib/libcrypto.so.0.9.8
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libldap-2.4.so.2...done.
Loaded symbols for /usr/lib/libldap-2.4.so.2
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libz.so.1...done.
Loaded symbols for /lib/libz.so.1
Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/libkrb5.so.3...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /lib/libcom_err.so.2...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libk5crypto.so.3...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/liblber-2.4.so.2...done.
Loaded symbols for /usr/lib/liblber-2.4.so.2
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/lib/libgnutls.so.26...done.
Loaded symbols for /usr/lib/libgnutls.so.26
Reading symbols from /usr/lib/libkrb5support.so.0...done.
Loaded symbols for /usr/lib/libkrb5support.so.0
Reading symbols from /lib/libpthread.so.0...done.
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /usr/lib/libtasn1.so.3...done.
Loaded symbols for /usr/lib/libtasn1.so.3
Reading symbols from /usr/lib/libgcrypt.so.11...done.
Loaded symbols for /usr/lib/libgcrypt.so.11
Reading symbols from /usr/lib/libgpg-error.so.0...done.
Loaded symbols for /usr/lib/libgpg-error.so.0
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /a/pgsql/dev/install/lib/auto_explain.so...done.
Loaded symbols for /a/pgsql/dev/install/lib/auto_explain.so
Core was generated by `postgres: euler euler [local] EXPLAIN '.
Program terminated with signal 11, Segmentation fault.
[New process 2751]
#0 0x081a2158 in ExplainJSONLineEnding (es=0xbfd117dc) at
/a/pgsql/dev/postgresql/src/backend/commands/explain.c:1885

warning: Source file is more recent than executable.
1885 if (linitial_int(es->grouping_stack) != 0)
(gdb) print es->grouping_stack
$1 = (List *) 0x0
(gdb) print es->str->data
$2 = 0x865b09c ""
(gdb) bt
#0 0x081a2158 in ExplainJSONLineEnding (es=0xbfd117dc) at
/a/pgsql/dev/postgresql/src/backend/commands/explain.c:1885
#1 0x081a1ada in ExplainOpenGroup (objtype=0x84fcb64 "Plan",
labelname=0x84fcb64 "Plan", labeled=1 '\001', es=0xbfd117dc) at
/a/pgsql/dev/postgresql/src/backend/commands/explain.c:1684
#2 0x0819faac in ExplainNode (plan=0x8652c98, planstate=0x8654868,
outer_plan=0x0, relationship=0x0, plan_name=0x0, es=0xbfd117dc) at
/a/pgsql/dev/postgresql/src/backend/commands/explain.c:739
#3 0x0819f393 in ExplainPrintPlan (es=0xbfd117dc, queryDesc=0x8653724) at
/a/pgsql/dev/postgresql/src/backend/commands/explain.c:474
#4 0xb7ef0010 in explain_ExecutorEnd (queryDesc=0x8653724) at
/a/pgsql/dev/postgresql/contrib/auto_explain/auto_explain.c:238
#5 0x081f1094 in ExecutorEnd (queryDesc=0x8653724) at
/a/pgsql/dev/postgresql/src/backend/executor/execMain.c:314
#6 0x0819f23d in ExplainOnePlan (plannedstmt=0x86536d8, es=0xbfd11958,
queryString=0x862ad0c "explain (format json) select * from pgbench_branches
inner join pgbench_history using (bid) where bid > 100;",
params=0x0) at /a/pgsql/dev/postgresql/src/backend/commands/explain.c:433
#7 0x0819eee6 in ExplainOneQuery (query=0x85d3620, es=0xbfd11958,
queryString=0x862ad0c "explain (format json) select * from pgbench_branches
inner join pgbench_history using (bid) where bid > 100;",
params=0x0) at /a/pgsql/dev/postgresql/src/backend/commands/explain.c:285
#8 0x0819ebd1 in ExplainQuery (stmt=0x862baf0, queryString=0x862ad0c "explain
(format json) select * from pgbench_branches inner join pgbench_history using
(bid) where bid > 100;", params=0x0,
dest=0x85d29a4) at /a/pgsql/dev/postgresql/src/backend/commands/explain.c:192
#9 0x082e89ce in ProcessUtility (parsetree=0x862baf0, queryString=0x862ad0c
"explain (format json) select * from pgbench_branches inner join
pgbench_history using (bid) where bid > 100;", params=0x0,
isTopLevel=1 '\001', dest=0x85d29a4, completionTag=0xbfd11b70 "") at
/a/pgsql/dev/postgresql/src/backend/tcop/utility.c:922
#10 0x082e6a7f in PortalRunUtility (portal=0x85d0874, utilityStmt=0x862baf0,
isTopLevel=1 '\001', dest=0x85d29a4, completionTag=0xbfd11b70 "") at
/a/pgsql/dev/postgresql/src/backend/tcop/pquery.c:1192
#11 0x082e681a in FillPortalStore (portal=0x85d0874, isTopLevel=1 '\001') at
/a/pgsql/dev/postgresql/src/backend/tcop/pquery.c:1066
#12 0x082e626e in PortalRun (portal=0x85d0874, count=2147483647, isTopLevel=1
'\001', dest=0x862bcb4, altdest=0x862bcb4, completionTag=0xbfd11cd6 "") at
/a/pgsql/dev/postgresql/src/backend/tcop/pquery.c:802
#13 0x082e0acf in exec_simple_query (query_string=0x862ad0c "explain (format
json) select * from pgbench_branches inner join pgbench_history using (bid)
where bid > 100;")
at /a/pgsql/dev/postgresql/src/backend/tcop/postgres.c:1046
#14 0x082e47d9 in PostgresMain (argc=2, argv=0x85b5358, username=0x85b5330
"euler") at /a/pgsql/dev/postgresql/src/backend/tcop/postgres.c:3624
#15 0x082a72df in BackendRun (port=0x85d4860) at
/a/pgsql/dev/postgresql/src/backend/postmaster/postmaster.c:3366
#16 0x082a69ab in BackendStartup (port=0x85d4860) at
/a/pgsql/dev/postgresql/src/backend/postmaster/postmaster.c:3073
#17 0x082a3d1c in ServerLoop () at
/a/pgsql/dev/postgresql/src/backend/postmaster/postmaster.c:1399
#18 0x082a344d in PostmasterMain (argc=3, argv=0x85b33e0) at
/a/pgsql/dev/postgresql/src/backend/postmaster/postmaster.c:1064
#19 0x0822b54b in main (argc=3, argv=0x85b33e0) at
/a/pgsql/dev/postgresql/src/backend/main/main.c:188
(gdb) quit

--
Euler Taveira de Oliveira
http://www.timbira.com/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Euler Taveira de Oliveira (#1)
Re: bug: json format and auto_explain

Euler Taveira de Oliveira <euler@timbira.com> writes:

While testing the EXPLAIN BUFFERS patch I found a bug. I'm too tired to
provide a fix right now but if someone can take it I will appreciate. It seems
ExplainJSONLineEnding() doesn't expect es->grouping_stack list as a null pointer.

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

regards, tom lane

#3Itagaki Takahiro
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#2)
1 attachment(s)
Re: bug: json format and auto_explain

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

They were added by XML formatter; I suppose it worked on 8.4.

Explain{Begin/End}Output are static functions, so we cannot call them
from an external contrib module. Instead, I'll suggest to call them
automatically from ExplainPrintPlan. The original codes in ExplainPrintPlan
was moved into ExplainOneResult, that name might be debatable.

Patch attached.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachments:

auto_explain_fix_20091207.patchapplication/octet-stream; name=auto_explain_fix_20091207.patchDownload
diff -cprN head/src/backend/commands/explain.c work/src/backend/commands/explain.c
*** head/src/backend/commands/explain.c	2009-11-05 07:26:04.000000000 +0900
--- work/src/backend/commands/explain.c	2009-12-07 16:29:14.250487980 +0900
*************** static void ExplainOneQuery(Query *query
*** 53,58 ****
--- 53,59 ----
  static void report_triggers(ResultRelInfo *rInfo, bool show_relname,
  				ExplainState *es);
  static double elapsed_time(instr_time *starttime);
+ static void ExplainOneResult(ExplainState *es, QueryDesc *queryDesc);
  static void ExplainNode(Plan *plan, PlanState *planstate,
  				Plan *outer_plan,
  				const char *relationship, const char *plan_name,
*************** ExplainOnePlan(PlannedStmt *plannedstmt,
*** 377,383 ****
  	ExplainOpenGroup("Query", NULL, true, es);
  
  	/* Create textual dump of plan tree */
! 	ExplainPrintPlan(es, queryDesc);
  
  	/*
  	 * If we ran the command, run any AFTER triggers it queued.  (Note this
--- 378,384 ----
  	ExplainOpenGroup("Query", NULL, true, es);
  
  	/* Create textual dump of plan tree */
! 	ExplainOneResult(es, queryDesc);
  
  	/*
  	 * If we ran the command, run any AFTER triggers it queued.  (Note this
*************** ExplainOnePlan(PlannedStmt *plannedstmt,
*** 448,466 ****
  	ExplainCloseGroup("Query", NULL, true, es);
  }
  
! /*
!  * ExplainPrintPlan -
!  *	  convert a QueryDesc's plan tree to text and append it to es->str
!  *
!  * The caller should have set up the options fields of *es, as well as
!  * initializing the output buffer es->str.  Other fields in *es are
!  * initialized here.
!  *
!  * NB: will not work on utility statements
!  */
! void
! ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc)
! {
  	Assert(queryDesc->plannedstmt != NULL);
  	es->pstmt = queryDesc->plannedstmt;
  	es->rtable = queryDesc->plannedstmt->rtable;
--- 449,478 ----
  	ExplainCloseGroup("Query", NULL, true, es);
  }
  
! /*
!  * ExplainPrintPlan - setup output and print a plan tree
!  */
! void
! ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc)
! {
! 	ExplainBeginOutput(es);
! 	ExplainOneResult(es, queryDesc);
! 	ExplainEndOutput(es);
! }
! 
! /*
!  * ExplainOneResult -
!  *	  convert a QueryDesc's plan tree to text and append it to es->str
!  *
!  * The caller should have set up the options fields of *es, as well as
!  * initializing the output buffer es->str.  Other fields in *es are
!  * initialized here.
!  *
!  * NB: will not work on utility statements
!  */
! static void
! ExplainOneResult(ExplainState *es, QueryDesc *queryDesc)
! {
  	Assert(queryDesc->plannedstmt != NULL);
  	es->pstmt = queryDesc->plannedstmt;
  	es->rtable = queryDesc->plannedstmt->rtable;
#4Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#2)
Re: bug: json format and auto_explain

Tom Lane wrote:

Euler Taveira de Oliveira <euler@timbira.com> writes:

While testing the EXPLAIN BUFFERS patch I found a bug. I'm too tired to
provide a fix right now but if someone can take it I will appreciate. It seems
ExplainJSONLineEnding() doesn't expect es->grouping_stack list as a null pointer.

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

Fortunately the author of auto_explain can now commit the fix by himself ...
Kudos, BTW :-)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

In reply to: Alvaro Herrera (#4)
Re: bug: json format and auto_explain

Alvaro Herrera escreveu:

Fortunately the author of auto_explain can now commit the fix by himself ...
Kudos, BTW :-)

Congratulations!

--
Euler Taveira de Oliveira
http://www.timbira.com/

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#3)
Re: bug: json format and auto_explain

Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

Explain{Begin/End}Output are static functions, so we cannot call them
from an external contrib module. Instead, I'll suggest to call them
automatically from ExplainPrintPlan. The original codes in ExplainPrintPlan
was moved into ExplainOneResult, that name might be debatable.

This isn't an adequate solution I'm afraid --- what about the other
functions that are exported by explain.h?

I too am not totally thrilled with the idea of exporting
Explain{Begin/End}Output, but it might be the best solution.
We might also need to think about refactoring those functions:
there seem to be two different things going on there, one being
format-specific initialization which will certainly be necessary,
and one being output of a wrapper structure which might or might
not be appropriate for what auto_explain or other callers want.

In any case you need to expend more effort on the comments for the
functions. Inadequate specification of ExplainPrintPlan's call
requirements is what got us into this problem in the first place,
and the proposed patch makes that worse not better.

regards, tom lane

#7Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#4)
Re: bug: json format and auto_explain

On Mon, Dec 7, 2009 at 10:42 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Tom Lane wrote:

Euler Taveira de Oliveira <euler@timbira.com> writes:

While testing the EXPLAIN BUFFERS patch I found a bug. I'm too tired to
provide a fix right now but if someone can take it I will appreciate. It seems
ExplainJSONLineEnding() doesn't expect es->grouping_stack list as a null pointer.

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

Fortunately the author of auto_explain can now commit the fix by himself ...
Kudos, BTW :-)

I just tested and this has been broken since it was committed (by
Tom). However, I believe I did test it on the last version of the
patch that I submitted, and I think it worked then. So we have lots
of choices for who should fix this:

- me, since it's my feature
- Tom, since it appears that he broke it
- Itagaki-san, since he's the auto_explain maintainer

:-)

...Robert

#8Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: bug: json format and auto_explain

On Mon, Dec 7, 2009 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Itagaki Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Looks like auto_explain is under the illusion that it need not call
ExplainBeginOutput/ExplainEndOutput.

Explain{Begin/End}Output are static functions, so we cannot call them
from an external contrib module. Instead, I'll suggest to call them
automatically from ExplainPrintPlan. The original codes in ExplainPrintPlan
was moved into ExplainOneResult, that name might be debatable.

This isn't an adequate solution I'm afraid --- what about the other
functions that are exported by explain.h?

I too am not totally thrilled with the idea of exporting
Explain{Begin/End}Output, but it might be the best solution.
We might also need to think about refactoring those functions:
there seem to be two different things going on there, one being
format-specific initialization which will certainly be necessary,
and one being output of a wrapper structure which might or might
not be appropriate for what auto_explain or other callers want.

In any case you need to expend more effort on the comments for the
functions.  Inadequate specification of ExplainPrintPlan's call
requirements is what got us into this problem in the first place,
and the proposed patch makes that worse not better.

There's an awful lot of layers of function calls happening in
explain.c for no really discernable purpose that I can see.
ExplainOneQuery() calls either ExplainOneUtility() if it has a utility
command or ExplainOneQuery_hook() if that's defined or else it plans
the query and then calls ExplainOnePlan(). ExplainOneUtility() in
turn calls ExplainExecuteQuery for execute statements and handles
everything else internally. The placement of the hook seems pretty
dubious to me (does anyone actually use it?), and the whole structure
seems like it could probably be flattened a bit.

...Robert

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#8)
Re: bug: json format and auto_explain

Robert Haas <robertmhaas@gmail.com> writes:

There's an awful lot of layers of function calls happening in
explain.c for no really discernable purpose that I can see.
ExplainOneQuery() calls either ExplainOneUtility() if it has a utility
command or ExplainOneQuery_hook() if that's defined or else it plans
the query and then calls ExplainOnePlan(). ExplainOneUtility() in
turn calls ExplainExecuteQuery for execute statements and handles
everything else internally. The placement of the hook seems pretty
dubious to me (does anyone actually use it?), and the whole structure
seems like it could probably be flattened a bit.

IIRC, a fair amount of the messiness has to do with handling prepared
statements. Maybe refactoring the division of labor between prepare.c
and explain.c would help. Whether it's worth the trouble is
questionable though. The immediate issue seems to be insufficient
thought about how to manage the initialization conditions for the
various formatters, and I don't think that any flattening or refactoring
of the rest of the logic would have made any difference there.

regards, tom lane