making EXPLAIN extensible

Started by Robert Haasabout 1 year ago88 messages
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
ANALYZE. Now, we're up to 12 options, which is already quite a lot,
and there's plenty more things that somebody might like to do.
However, not all of those things necessarily need to be part of the
core code. My original reason for wanting to extend EXPLAIN was that I
was thinking about an extension that would want to do a bunch of
things and one of those things would be to add some information to the
EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

However, there are other use cases, too, basically any of the normal
reasons why extensibility is useful and desirable. You might need to
get some information out a query plan that 99% of people don't care
about. You could come up with your own way of formatting a query plan,
but that's a big pain. It's a lot nicer if you can just add the detail
that you care about to the EXPLAIN output without needing to modify
PostgreSQL itself. Even if you think of something that really ought to
be included in the EXPLAIN output by PostgreSQL, you can roll an
extension out much quicker than you can get a change upstreamed and
released. So I think EXPLAIN extensibility is, as a general concept,
useful.

So here are some patches.

0001 allows a loadable module to register new EXPLAIN options.
Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
work, this patch is for you. This patch also allows you to stash some
state related to your new option, or options, in the ExplainState.
Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
es->fungus, but you can get about the same effect using the new
facilities provided here.

0002 provides hooks that you can use to make your new EXPLAIN options
actually do something. In particular, this adds a new hook that is
called once per PlanState node, and a new nook that is called once per
PlannedStmt. Each is called at an appropriate point for you to tack on
more output after what EXPLAIN would already produce.

0003 adds a new contrib module called pg_overexplain, which adds
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
quite useful for planner hacking, and maybe a few more options would
be, too. Right now, if you want to see stuff that EXPLAIN doesn't
clearly show, you have to use SET debug_print_plan = true, and that
output is so verbose that finding the parts you actually want to see
is quite difficult. Assuming it gives you the details you need,
EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
committing these patches I anticipate using this semi-regularly.

There are plenty of debatable things in this patch set, and I mention
some of them in the commit messages. The hook design in 0002 is a bit
simplistic and could be made more complex; there's lots of stuff that
could be added to or removed from 0003, much of which comes down to
what somebody hacking on the planner would actually want to see. I'm
happy to bikeshed all of that stuff; this is all quite preliminary and
I'm not committed to the details. The only thing that would disappoint
me is if somebody said "this whole idea of making EXPLAIN extensible
is stupid and pointless and we shouldn't ever do it." I will argue
against that vociferously. I think even what I have here is enough to
disprove that hypothesis, but I have a bunch of ideas about how to do
more. Some of those require additional infrastructure and are best
proposed with that other infrastructure; some can be done with just
this, but I ran out of time to code up examples so here is what I have
got so far.

Hope you like it, sorry if you don't.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v1-0003-pg_overexplain-Additional-EXPLAIN-options-for-deb.patchapplication/octet-stream; name=v1-0003-pg_overexplain-Additional-EXPLAIN-options-for-deb.patchDownload+806-1
v1-0002-Add-some-new-hooks-so-extensions-can-add-details-.patchapplication/octet-stream; name=v1-0002-Add-some-new-hooks-so-extensions-can-add-details-.patchDownload+36-1
v1-0001-Make-it-possible-for-loadable-modules-to-add-EXPL.patchapplication/octet-stream; name=v1-0001-Make-it-possible-for-loadable-modules-to-add-EXPL.patchDownload+461-199
#2Thom Brown
thom@linux.com
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

On Fri, 28 Feb 2025 at 19:26, Robert Haas <robertmhaas@gmail.com> wrote:

Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
ANALYZE. Now, we're up to 12 options, which is already quite a lot,
and there's plenty more things that somebody might like to do.
However, not all of those things necessarily need to be part of the
core code. My original reason for wanting to extend EXPLAIN was that I
was thinking about an extension that would want to do a bunch of
things and one of those things would be to add some information to the
EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

However, there are other use cases, too, basically any of the normal
reasons why extensibility is useful and desirable. You might need to
get some information out a query plan that 99% of people don't care
about. You could come up with your own way of formatting a query plan,
but that's a big pain. It's a lot nicer if you can just add the detail
that you care about to the EXPLAIN output without needing to modify
PostgreSQL itself. Even if you think of something that really ought to
be included in the EXPLAIN output by PostgreSQL, you can roll an
extension out much quicker than you can get a change upstreamed and
released. So I think EXPLAIN extensibility is, as a general concept,
useful.

So here are some patches.

0001 allows a loadable module to register new EXPLAIN options.
Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
work, this patch is for you. This patch also allows you to stash some
state related to your new option, or options, in the ExplainState.
Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
es->fungus, but you can get about the same effect using the new
facilities provided here.

0002 provides hooks that you can use to make your new EXPLAIN options
actually do something. In particular, this adds a new hook that is
called once per PlanState node, and a new nook that is called once per
PlannedStmt. Each is called at an appropriate point for you to tack on
more output after what EXPLAIN would already produce.

0003 adds a new contrib module called pg_overexplain, which adds
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
quite useful for planner hacking, and maybe a few more options would
be, too. Right now, if you want to see stuff that EXPLAIN doesn't
clearly show, you have to use SET debug_print_plan = true, and that
output is so verbose that finding the parts you actually want to see
is quite difficult. Assuming it gives you the details you need,
EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
committing these patches I anticipate using this semi-regularly.

There are plenty of debatable things in this patch set, and I mention
some of them in the commit messages. The hook design in 0002 is a bit
simplistic and could be made more complex; there's lots of stuff that
could be added to or removed from 0003, much of which comes down to
what somebody hacking on the planner would actually want to see. I'm
happy to bikeshed all of that stuff; this is all quite preliminary and
I'm not committed to the details. The only thing that would disappoint
me is if somebody said "this whole idea of making EXPLAIN extensible
is stupid and pointless and we shouldn't ever do it." I will argue
against that vociferously. I think even what I have here is enough to
disprove that hypothesis, but I have a bunch of ideas about how to do
more. Some of those require additional infrastructure and are best
proposed with that other infrastructure; some can be done with just
this, but I ran out of time to code up examples so here is what I have
got so far.

Hope you like it, sorry if you don't.

"pg_overexplain"? I love this name! And the idea sounds like a natural
evolution, so +1.

Some questions:

One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

In what order would the options be applied? Would it be deterministic,
or weighted within the extension's configuration, or based on the
order of the options in the list?

Would explain extensions be capable of modifying pre-existing core
option output, or just append to output?

Should there be a way of determining which lines are output by which
option? An extension may output similar output to core output, making
it difficult or impossible to discern which is which.

Does there need to be any security considerations so that things like
RLS don't inadvertently become leaky?

Thom

#3Isaac Morland
isaac.morland@gmail.com
In reply to: Thom Brown (#2)
Re: making EXPLAIN extensible

On Fri, 28 Feb 2025 at 15:09, Thom Brown <thom@linux.com> wrote:

One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

Could the call that processes the registration automatically prepend the
extension name to the supplied explain option name? So if extension X
registers option O it would be registered as X_O rather than returning an
error if O doesn't follow the proper pattern.

#4Robert Haas
robertmhaas@gmail.com
In reply to: Thom Brown (#2)
Re: making EXPLAIN extensible

On Fri, Feb 28, 2025 at 3:09 PM Thom Brown <thom@linux.com> wrote:

"pg_overexplain"? I love this name! And the idea sounds like a natural
evolution, so +1.

Thanks. I thought about things like pg_hyperexplain or
pg_explain_debug, but in the end I didn't like any of them better than
overexplain. :-)

One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

I considered that. One advantage of doing that is that you could
support autoloading. Right now, you have to LOAD 'pg_overexplain' or
put it in session_preload_libraries or shared_preload_libraries in
order to use it. If you required people to type EXPLAIN
(pg_overexplain.range_table) instead of just EXPLAIN (range_table),
then you could react to not finding any such option by trying to
autoload a .so with the part of the name before the dot.

But you can probably see that this idea has a couple of pretty serious
weaknesses:

1. It is much more verbose. I theorize that people will be unhappy
about having to type EXPLAIN (pg_overexplain.range_table) rather than
just EXPLAIN (range_table). One could try to address this by renaming
the extension to something shorter, like just 'oe'. Having to type
EXPLAIN (oe.range_table) wouldn't be nearly as annoying. However, this
seems like a pretty clear case of letting the tail wag the dog.

2. autoloading could have security concerns. This is probably fixable,
but we'd need to be sure that providing a new way to trigger loading a
module didn't open up any security holes.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

That's true. Of course, a lot depends on whether we end up with 3 or 5
or 8 EXPLAIN extensions or more like 30 or 50 or 80. In the former
case, the people writing those extensions will probably mostly know
about each other and can just use different names. In the latter case
it's a problem. My guess is it's the former case.

In what order would the options be applied? Would it be deterministic,
or weighted within the extension's configuration, or based on the
order of the options in the list?

I'm not entirely sure I know which question you're asking here. If
you're asking what happens if two modules try to register the same
EXPLAIN option name and then a user uses it, one of the registrations
will win and the other will lose. I think the second one wins. As I
say above, I assume we'll find a way to not try to do that. However, I
think more likely you're asking: if you load pg_fingernailexplain and
pg_toenailexplain and then do EXPLAIN (toenail, fingernail) SELECT
..., in what order will the options take effect? For the answer to
that question, see the commit message for 0002.

Would explain extensions be capable of modifying pre-existing core
option output, or just append to output?

The interfaces we have are really only going to work for appending.
Modifying would be cool, but I think it's mostly impractical. We have
a framework for emitting stuff into EXPLAIN output in a way that takes
into account whether you're in text mode or json or yaml or whatever,
and this patch just builds on that existing framework to allow you to
make extra calls to those emit-some-output functions at useful places.
As a result, the patch is small and simple. If we had an existing
framework for modifying stuff, then we could perhaps provide suitable
places to call those functions, too. But they don't exist, and it's
not easy to see how they could be created. I think you would need some
kind of major redesign of explain.c, and I don't know how to do that
without making it bloated, slow, and unmaintainable.

If somebody comes up with a way of allowing certain limited types of
modifications to EXPLAIN output with small, elegant-looking code
changes, and if those changes seem like useful things for an extension
to want to do, I'm totally on board. But I currently don't have an
idea like that.

Should there be a way of determining which lines are output by which
option? An extension may output similar output to core output, making
it difficult or impossible to discern which is which.

I don't think this is really going to be a problem.

Does there need to be any security considerations so that things like
RLS don't inadvertently become leaky?

It's possible that there may be some security considerations, and
that's worth thinking about. However, RLS disclaims support for
side-channel attacks, so it's already understood to be (very) leaky.

--
Robert Haas
EDB: http://www.enterprisedb.com

#5Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#4)
Re: making EXPLAIN extensible

EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

Making EXPLAIN extensible sounds like a good idea.. FWIW, There is a
discussion [0]/messages/by-id/CAA5RZ0vXiOiodrNQ-Va4FCAkXMpGA=GZDeKjFBRgRvHGuW7s7Q@mail.gmail.com
for showing FDW remote plans ( postgres_fdw specifically), and I think
we will need to
add some new options to EXPLAIN to make that possible.

Have not looked at your patches, but I will do so now.

Regards,

Sami Imseih
Amazon Web Services (AWS)

[0]: /messages/by-id/CAA5RZ0vXiOiodrNQ-Va4FCAkXMpGA=GZDeKjFBRgRvHGuW7s7Q@mail.gmail.com

#6Srinath Reddy Sadipiralla
srinath2133@gmail.com
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

Hi Robert,
thanks for working on this and +1 for the idea.
i have reviewed 1,2 patches using 3rd patch(pg_overexplain module) and they
LGTM,will review more the 3rd patch.

Regards,
Srinath Reddy Sadipiralla
EDB:http://www.enterprisedb.com

#7Alena Rybakina
a.rybakina@postgrespro.ru
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

Hi! I agree with your idea to expand the explain.

On 28.02.2025 22:26, Robert Haas wrote:

0002 provides hooks that you can use to make your new EXPLAIN options
actually do something. In particular, this adds a new hook that is
called once per PlanState node, and a new nook that is called once per
PlannedStmt. Each is called at an appropriate point for you to tack on
more output after what EXPLAIN would already produce.

0003 adds a new contrib module called pg_overexplain, which adds
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
quite useful for planner hacking, and maybe a few more options would
be, too. Right now, if you want to see stuff that EXPLAIN doesn't
clearly show, you have to use SET debug_print_plan = true, and that
output is so verbose that finding the parts you actually want to see
is quite difficult. Assuming it gives you the details you need,
EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
committing these patches I anticipate using this semi-regularly.

There are plenty of debatable things in this patch set, and I mention
some of them in the commit messages. The hook design in 0002 is a bit
simplistic and could be made more complex; there's lots of stuff that
could be added to or removed from 0003, much of which comes down to
what somebody hacking on the planner would actually want to see. I'm
happy to bikeshed all of that stuff; this is all quite preliminary and
I'm not committed to the details. The only thing that would disappoint
me is if somebody said "this whole idea of making EXPLAIN extensible
is stupid and pointless and we shouldn't ever do it." I will argue
against that vociferously. I think even what I have here is enough to
disprove that hypothesis, but I have a bunch of ideas about how to do
more. Some of those require additional infrastructure and are best
proposed with that other infrastructure; some can be done with just
this, but I ran out of time to code up examples so here is what I have
got so far.

Hope you like it, sorry if you don't.

while writing the AQO extension [0]https://github.com/postgrespro/aqo we were just adding a hook (we
called it ExplainOnePlan_hook) similar to this one to add a description
for the node used in the plan, in particular its identifier and the
cardinality that is used during query planning. We also added a guc that
allows user to disable this debug information, since it can only be
useful when analyzing a problematic query, but not all the time.
Therefore, I think this work is necessary and needed to provide the
necessary output of additional information about the plan, which may be
necessary for extensions like this.

[0]: https://github.com/postgrespro/aqo

--
Regards,
Alena Rybakina
Postgres Professional

#8Matheus Alcantara
matheusssilv97@gmail.com
In reply to: Robert Haas (#4)
Re: making EXPLAIN extensible

Hi, +1 for the idea. I Haven't reviewed the patches yet, but I would like to
share some thoughts.

On Fri, Feb 28, 2025 at 5:32 PM Robert Haas <robertmhaas@gmail.com> wrote:

One thing I am wondering is whether extensions should be required to
prefix their EXPLAIN option with the extension name to avoid
collisions.

I considered that. One advantage of doing that is that you could
support autoloading. Right now, you have to LOAD 'pg_overexplain' or
put it in session_preload_libraries or shared_preload_libraries in
order to use it. If you required people to type EXPLAIN
(pg_overexplain.range_table) instead of just EXPLAIN (range_table),
then you could react to not finding any such option by trying to
autoload a .so with the part of the name before the dot.

But you can probably see that this idea has a couple of pretty serious
weaknesses:

1. It is much more verbose. I theorize that people will be unhappy
about having to type EXPLAIN (pg_overexplain.range_table) rather than
just EXPLAIN (range_table). One could try to address this by renaming
the extension to something shorter, like just 'oe'. Having to type
EXPLAIN (oe.range_table) wouldn't be nearly as annoying. However, this
seems like a pretty clear case of letting the tail wag the dog.

2. autoloading could have security concerns. This is probably fixable,
but we'd need to be sure that providing a new way to trigger loading a
module didn't open up any security holes.

If two extensions happen to choose the same name, it won't be possible
to use both simultaneously.

That's true. Of course, a lot depends on whether we end up with 3 or 5
or 8 EXPLAIN extensions or more like 30 or 50 or 80. In the former
case, the people writing those extensions will probably mostly know
about each other and can just use different names. In the latter case
it's a problem. My guess is it's the former case.

It would make sense (or possible) to have some kind of validation that returns
an error when using an option that is ambiguous? If a option is unique for all
options registered via extensions than the extension name prefix is not needed,
otherwise an error is returned forcing the user to specify the extension name.

This is a similar behaviour when, e.g we have a where clause that is
referencing a column that is present in multiple tables used in the query:
ERROR: 42702: column reference "b" is ambiguous LINE 1: select * from t inner
join t2 on t.a = t2.a where b = 10;

--
Matheus Alcantara

#9Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

On Fri, 28 Feb 2025 at 20:26, Robert Haas <robertmhaas@gmail.com> wrote:

Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
ANALYZE.

I think you meant "some time prior to PostgreSQL 10".
PostgreSQL 9.0 had 5 options, of which COSTS, BUFFERS, and FORMAT were
newly added, so only before 9.0 we had 2 options.
PostgreSQL 9.2 then added TIMING on top of that, for a total of 6
options prior to PostgreSQL 10.

It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

+1, Neon would greatly appreciate infrastructure to allow extending EXPLAIN.

0001 allows a loadable module to register new EXPLAIN options.
Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
work, this patch is for you. This patch also allows you to stash some
state related to your new option, or options, in the ExplainState.
Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
es->fungus, but you can get about the same effect using the new
facilities provided here.

Does this work with parallel workers' stats?
I can't seem to figure out whether or where parallel workers would
pass through their extended explain statistics, mostly because of the
per-backend nature of ID generation making the pointers of
ExplainState->extension_state unshareable.

Kind regards,

Matthias van de Meent

#10Robert Haas
robertmhaas@gmail.com
In reply to: Matheus Alcantara (#8)
Re: making EXPLAIN extensible

First, thanks to all who have said they like this idea. I wasn't
expecting this much enthusiasm, to be honest. Woohoo!

On Mon, Mar 3, 2025 at 8:27 AM Matheus Alcantara
<matheusssilv97@gmail.com> wrote:

It would make sense (or possible) to have some kind of validation that returns
an error when using an option that is ambiguous? If a option is unique for all
options registered via extensions than the extension name prefix is not needed,
otherwise an error is returned forcing the user to specify the extension name.

I'm not saying that this couldn't be done -- it definitely could --
but to me it seems somewhat pointless. I mean, you'd just get the
error when you try to load the second of the two extensions, and then
what are you supposed to do about it at that point? Really, it's
incumbent on EXPLAIN-extension developers to avoid picking names that
conflict with other EXPLAIN extensions that the same users might want
to use. If they don't do that, everything sucks. By jiggering things
around you can cause the result to be either (a) some options are
ignored (current behavior) or (b) errors occur during extension
loading (your proposal) or (c) errors occur at runtime or even (d)
suddenly the conflicting options need to be spelled in a very verbose
way with a module name prefix instead of using them normally. But I
think no matter which of those things happen, life is pretty bad for
the user. The only real solution, AFAICS, is for the extension
developers to be thoughtful about the option names that they pick.

I'm not saying that's definitely going to happen, and I can be
convinced to add something to the patch to cater to it. But only if we
all agree on exactly what to add and exactly why that's going to be
better than doing nothing. My current view - as explained above - is
that anything we add is just going to be increasing the amount of code
without really making the user experience any better. If that's
correct, it's better to just keep it simple, as I have done.

--
Robert Haas
EDB: http://www.enterprisedb.com

#11Robert Haas
robertmhaas@gmail.com
In reply to: Matthias van de Meent (#9)
Re: making EXPLAIN extensible

On Mon, Mar 3, 2025 at 9:14 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

I think you meant "some time prior to PostgreSQL 10".
PostgreSQL 9.0 had 5 options, of which COSTS, BUFFERS, and FORMAT were
newly added, so only before 9.0 we had 2 options.
PostgreSQL 9.2 then added TIMING on top of that, for a total of 6
options prior to PostgreSQL 10.

Probably I meant 9 rather than 10, then.

+1, Neon would greatly appreciate infrastructure to allow extending EXPLAIN.

Cool.

Does this work with parallel workers' stats?
I can't seem to figure out whether or where parallel workers would
pass through their extended explain statistics, mostly because of the
per-backend nature of ID generation making the pointers of
ExplainState->extension_state unshareable.

I don't fully understand what your question is. I think there are a
couple of separate points to consider here.

First, I don't think we ever store an ExplainState in DSM. If we do,
then the per-backend nature of ID generation is a fundamental design
issue and needs to be rethought. Otherwise, I don't see why it
matters.

Second, I did not add a hook to allow an extension to add data to a
"Worker N" section. I'm open to suggestions.

Third, regardless of parallel query, there is a general problem with
this infrastructure if what you want to do is print out some
instrumentation data. Sure, the hooks might allow you to get control
at a point where you can print some stuff, but how are you supposed to
get the stuff you want to print? planduration, bufusage, and
mem_counters are passed down to ExplainOnePlan(); and there's other
stuff in struct Instrumentation that is used in ExplainNode(), but
those approaches don't seem to scale nicely to arbitrary new things
that somebody might want to measure. While I welcome ideas about how
to fix that, my current view is that it's a job for a separate patch
set.

In general, it's expected that each parallel-aware node may register a
shm_toc entry using the plan_node_id as the key. So if you wanted
per-worker instrumentation of any sort of some particular node, you
could possibly add it to that chunk of memory. This would work well,
for example, for a custom scan, or any other case where the node is
under the control over the same code that is trying to instrument
stuff. A patch to core could extend both the node's DSM footprint and
the explain.c code that prints data from it. However, if you want to
do something like "for every executor node, count the number of flying
spaghetti monster tendrils that pass through the computer during the
execution of that node," there's not really any great way of doing
that today, with or without this patch, and with or without parallel
query. I mean, you can patch core, but that's it; there's no
extensibility here.

I'm not sure if any of this is responsive to your actual question; if
not, please help me get on the right track.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

#12Guillaume Lelarge
guillaume@lelarge.info
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

On 28/02/2025 20:26, Robert Haas wrote:

Prior to PostgreSQL 10, EXPLAIN had just 2 options: VACUUM and
ANALYZE. Now, we're up to 12 options, which is already quite a lot,
and there's plenty more things that somebody might like to do.
However, not all of those things necessarily need to be part of the
core code. My original reason for wanting to extend EXPLAIN was that I
was thinking about an extension that would want to do a bunch of
things and one of those things would be to add some information to the
EXPLAIN output. It wouldn't make sense for core to have an EXPLAIN
option whose whole purpose is to cater to the needs of some extension,
so that made me think of providing some extensibility infrastructure.

However, there are other use cases, too, basically any of the normal
reasons why extensibility is useful and desirable. You might need to
get some information out a query plan that 99% of people don't care
about. You could come up with your own way of formatting a query plan,
but that's a big pain. It's a lot nicer if you can just add the detail
that you care about to the EXPLAIN output without needing to modify
PostgreSQL itself. Even if you think of something that really ought to
be included in the EXPLAIN output by PostgreSQL, you can roll an
extension out much quicker than you can get a change upstreamed and
released. So I think EXPLAIN extensibility is, as a general concept,
useful.

So here are some patches.

0001 allows a loadable module to register new EXPLAIN options.
Currently, EXPLAIN (FUNGUS) will error out, but if you want to make it
work, this patch is for you. This patch also allows you to stash some
state related to your new option, or options, in the ExplainState.
Core options have hard-coded structure members; e.g. EXPLAIN (BUFFERS)
sets es->buffers. If you add EXPLAIN (FUNGUS), there won't be an
es->fungus, but you can get about the same effect using the new
facilities provided here.

0002 provides hooks that you can use to make your new EXPLAIN options
actually do something. In particular, this adds a new hook that is
called once per PlanState node, and a new nook that is called once per
PlannedStmt. Each is called at an appropriate point for you to tack on
more output after what EXPLAIN would already produce.

0003 adds a new contrib module called pg_overexplain, which adds
EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE). I actually think this is
quite useful for planner hacking, and maybe a few more options would
be, too. Right now, if you want to see stuff that EXPLAIN doesn't
clearly show, you have to use SET debug_print_plan = true, and that
output is so verbose that finding the parts you actually want to see
is quite difficult. Assuming it gives you the details you need,
EXPLAIN (RANGE_TABLE) looks way, way better to me, and if we end up
committing these patches I anticipate using this semi-regularly.

There are plenty of debatable things in this patch set, and I mention
some of them in the commit messages. The hook design in 0002 is a bit
simplistic and could be made more complex; there's lots of stuff that
could be added to or removed from 0003, much of which comes down to
what somebody hacking on the planner would actually want to see. I'm
happy to bikeshed all of that stuff; this is all quite preliminary and
I'm not committed to the details. The only thing that would disappoint
me is if somebody said "this whole idea of making EXPLAIN extensible
is stupid and pointless and we shouldn't ever do it." I will argue
against that vociferously. I think even what I have here is enough to
disprove that hypothesis, but I have a bunch of ideas about how to do
more. Some of those require additional infrastructure and are best
proposed with that other infrastructure; some can be done with just
this, but I ran out of time to code up examples so here is what I have
got so far.

Hope you like it, sorry if you don't.

I definitely LOVE it. I tried your patches and it works great. No real
surprise here :) I tried to code my own library (entirely based on
yours), and it's quite nice. Patch attached, not intended to be applied
on the repo, but just a nice use case.

This library adds "Tip" line for each tip it can give on a specific
node. Right now, it only handles "Rows Removed by Filter" on a
sequential scan, but there's much more we could add to it.

Here is an example on how to use it:

postgres=# show shared_preload_libraries ;
shared_preload_libraries
--------------------------
pg_explaintips
(1 row)

postgres=# create table t1 (id integer);

CREATE TABLE

postgres=# insert into t1 select generate_series(1, 1000);

INSERT 0 1000
postgres=# explain (analyze,costs off,tips) select * from t1 where id>2;

QUERY PLAN
---------------------------------------------------------------
Seq Scan on t1 (actual time=0.042..0.337 rows=998.00 loops=1)
Filter: (id > 2)
Rows Removed by Filter: 2
Buffers: shared hit=5
Planning:
Buffers: shared hit=4
Planning Time: 0.079 ms
Execution Time: 0.479 ms
(8 rows)

postgres=# explain (analyze,costs off,tips) select * from t1 where id<2;

QUERY PLAN
-------------------------------------------------------------
Seq Scan on t1 (actual time=0.014..0.113 rows=1.00 loops=1)
Filter: (id < 2)
Rows Removed by Filter: 999
Buffers: shared hit=5
Tips: You should probably add an index!
Planning Time: 0.035 ms
Execution Time: 0.127 ms
(7 rows)

postgres=# explain (analyze,costs off,tips off) select * from t1 where
id<2;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on t1 (actual time=0.009..0.067 rows=1.00 loops=1)
Filter: (id < 2)
Rows Removed by Filter: 999
Buffers: shared hit=5
Planning:
Buffers: shared hit=5
Planning Time: 0.070 ms
Execution Time: 0.076 ms
(8 rows)

Just great. Hope your patchs will find their way in the 18 release.
Thanks a lot.

--
Guillaume Lelarge
Consultant
https://dalibo.com

Attachments:

pg_explaintips.tgzapplication/x-compressed-tar; name=pg_explaintips.tgzDownload
#13Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#1)
Re: making EXPLAIN extensible

On 28/2/2025 20:26, Robert Haas wrote:

So here are some patches.

Yes, this is a big pain for extension developers. As I remember, it was
discussed multiple times in the hackers' mailing list.
Because there is no explain node hook, I use a patch in almost each of
my extensions: I write optimisation helpers, and it is necessary to show
which node was influenced and how. I guess pg_hint_plan will also profit
from this extra extensibility.

Passing through the patches, I would say that changing the order of 0001
and 0002 would make them more independent.
Also, I'm ok with the floating order of extension messages in the
explain output. We get used to living with dependencies on extension
load order (pg_stat_statements quite annoyingly impacts queryid, for
example), and this issue should be solved generally, in my opinion.
I support the way where extensions are allowed to print info but not
restructure or remove something.
Wait for the commit!

--
regards, Andrei Lepikhov

#14Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#13)
Re: making EXPLAIN extensible

On Tue, Mar 4, 2025 at 8:56 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

Passing through the patches, I would say that changing the order of 0001
and 0002 would make them more independent.

Hmm, I thought this order made sense, but I could reorder them if
there's some compelling reason to do so. If there's no particularly
compelling reason, it would be less work to commit them in this order.

Also, I'm ok with the floating order of extension messages in the
explain output. We get used to living with dependencies on extension
load order (pg_stat_statements quite annoyingly impacts queryid, for
example), and this issue should be solved generally, in my opinion.

I've often thought that the solution to this class of problems could
be to have extensions not manipulate a hook variable directly, but
instead call a function to which they pass their callback function and
an integer priority. Then we could call hook functions in priority
order. One difficulty is that this requires extension authors to agree
on what the priority order should be. In some cases that might not be
too hard, but it isn't apparent how it would work here.

IMHO, it's reasonable for the author of an EXPLAIN extension to say
"well, I see Robert already created an extension with an option called
DEBUG, so I will name my option TROUBLESHOOT," or something of that
sort. But if Robert gave the DEBUG hook priority 50, should I also
give my hook priority 50, or should I make it 40 or 25 or 1 or 100 or
what? Even if I know about all of the other extensions it's not really
clear what I should do. Actually, in this case, it feels like it would
be better if the user could control the ordering somehow, but I feel
like that might be over-engineering.

I support the way where extensions are allowed to print info but not
restructure or remove something.
Wait for the commit!

Cool, thanks.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#14)
Re: making EXPLAIN extensible

On 4/3/2025 15:23, Robert Haas wrote:

On Tue, Mar 4, 2025 at 8:56 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

Passing through the patches, I would say that changing the order of 0001
and 0002 would make them more independent.

Hmm, I thought this order made sense, but I could reorder them if
there's some compelling reason to do so. If there's no particularly
compelling reason, it would be less work to commit them in this order.

I have no compelling reason except avoiding adding/removing explain.h
into the head of auto_explain.c and file_fdw.c

Also, I'm ok with the floating order of extension messages in the
explain output. We get used to living with dependencies on extension
load order (pg_stat_statements quite annoyingly impacts queryid, for
example), and this issue should be solved generally, in my opinion.

I've often thought that the solution to this class of problems could
be to have extensions not manipulate a hook variable directly, but
instead call a function to which they pass their callback function and
an integer priority. Then we could call hook functions in priority
order. One difficulty is that this requires extension authors to agree
on what the priority order should be. In some cases that might not be
too hard, but it isn't apparent how it would work here.

IMHO, it's reasonable for the author of an EXPLAIN extension to say
"well, I see Robert already created an extension with an option called
DEBUG, so I will name my option TROUBLESHOOT," or something of that
sort. But if Robert gave the DEBUG hook priority 50, should I also
give my hook priority 50, or should I make it 40 or 25 or 1 or 100 or
what? Even if I know about all of the other extensions it's not really
clear what I should do. Actually, in this case, it feels like it would
be better if the user could control the ordering somehow, but I feel
like that might be over-engineering.

I think the same way. It would be clearer for an observer to have a
dependency on load order everywhere than different orders in different
places with no strong guarantees.

Also, I think this feature is quite close to the discussion on the
possibility of adding an extensible list field into Query, PlanState,
Plan, etc. nodes to let extensions gather and transfer some additional
data starting with the first 'analyze' hook up to the end of execution.
For example, in solving user issues, I frequently need to know
predictions on the number of groups in Memoize, IncrementalSort and some
other nodes. Such extensibility could allow an extension to gather such
internal data during the planning stage and show it in the explain
without any changes in the core!

--
regards, Andrei Lepikhov

#16Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#15)
Re: making EXPLAIN extensible

On Tue, Mar 4, 2025 at 10:12 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

Also, I think this feature is quite close to the discussion on the
possibility of adding an extensible list field into Query, PlanState,
Plan, etc. nodes to let extensions gather and transfer some additional
data starting with the first 'analyze' hook up to the end of execution.
For example, in solving user issues, I frequently need to know
predictions on the number of groups in Memoize, IncrementalSort and some
other nodes. Such extensibility could allow an extension to gather such
internal data during the planning stage and show it in the explain
without any changes in the core!

If you're saying there's a -hackers discussion on this, could you
provide a link? I haven't seen it.

--
Robert Haas
EDB: http://www.enterprisedb.com

#17Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#16)
Re: making EXPLAIN extensible

On 4/3/2025 16:14, Robert Haas wrote:

On Tue, Mar 4, 2025 at 10:12 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

Also, I think this feature is quite close to the discussion on the
possibility of adding an extensible list field into Query, PlanState,
Plan, etc. nodes to let extensions gather and transfer some additional
data starting with the first 'analyze' hook up to the end of execution.
For example, in solving user issues, I frequently need to know
predictions on the number of groups in Memoize, IncrementalSort and some
other nodes. Such extensibility could allow an extension to gather such
internal data during the planning stage and show it in the explain
without any changes in the core!

If you're saying there's a -hackers discussion on this, could you
provide a link? I haven't seen it.

I wouldn't say there is a thread in the mailing list. I mentioned this
direction of extensibility multiple times (for example, [1,2]) with no
reaction. However, letting extensions show data in explan gives this
idea additional impulse.

[1]: /messages/by-id/30113d59-8678-49ca-a8fb-bbceacf7efb0@gmail.com
/messages/by-id/30113d59-8678-49ca-a8fb-bbceacf7efb0@gmail.com
[2]: /messages/by-id/CA+TgmoYXgBVCnFhrW3X1NxpdjWtJCYRKP38PQ-AdR-RJziTBUQ@mail.gmail.com
/messages/by-id/CA+TgmoYXgBVCnFhrW3X1NxpdjWtJCYRKP38PQ-AdR-RJziTBUQ@mail.gmail.com

--
regards, Andrei Lepikhov

#18Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#17)
Re: making EXPLAIN extensible

On Tue, Mar 4, 2025 at 10:26 AM Andrei Lepikhov <lepihov@gmail.com> wrote:

I wouldn't say there is a thread in the mailing list. I mentioned this
direction of extensibility multiple times (for example, [1,2]) with no
reaction. However, letting extensions show data in explan gives this
idea additional impulse.

I agree that it's worth considering. I'd like to get this patch set
committed first, because I feel like it's already good enough to let
people do a bunch of cool stuff, and then we can build on top of it
later to let people do even more cool stuff. I do have some ideas that
involve piping plan-time data through to the final plan so that it can
be made visible via EXPLAIN. However, I think there are several
challenging design questions that need to be figured out, including:

(1) exactly how do we pipe that plan-time data through to the final plan?

(2) how should the plan-time hooks be designed to let people do as
many interesting things as possible with as few hooks as possible?

It sounds like you might already have some ideas about how those
questions should be answered, but I haven't thought about it enough
yet to feel confident and I don't want to make those decisions now.
Let's keep this thread focused on these patches, and we can look at
what else to do once that's done.

--
Robert Haas
EDB: http://www.enterprisedb.com

#19Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#4)
Re: making EXPLAIN extensible

+1 to the general idea, I didn't look at the patches yet.

On Fri, 2025-02-28 at 15:32 -0500, Robert Haas wrote:

1. It is much more verbose. I theorize that people will be unhappy
about having to type EXPLAIN (pg_overexplain.range_table) rather than
just EXPLAIN (range_table).

That was my first reaction.

That's true. Of course, a lot depends on whether we end up with 3 or
5
or 8 EXPLAIN extensions or more like 30 or 50 or 80. In the former
case, the people writing those extensions will probably mostly know
about each other and can just use different names.

I don't expect there to be zillions of extensions that only add new and
exciting explain options. Instead, it seems more likely that all
TableAM and CustomScan extensions will have 1-3 new explain options,
and that some of those might collide. For example, several may have a
EXPLAIN(PUSHDOWN) option that explains what work is being pushed down
into the TableAM/CustomScan.

In that case it's not even clear to me that a collision is a problem.
Would you really only want pushdown information from extension A, and
be upset that it also emits pushdown information from extension B?
Maybe we should just allow multiple extensions to use the same option
name?

Regards,
Jeff Davis

#20Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#19)
Re: making EXPLAIN extensible

On Tue, Mar 4, 2025 at 1:53 PM Jeff Davis <pgsql@j-davis.com> wrote:

I don't expect there to be zillions of extensions that only add new and
exciting explain options. Instead, it seems more likely that all
TableAM and CustomScan extensions will have 1-3 new explain options,
and that some of those might collide. For example, several may have a
EXPLAIN(PUSHDOWN) option that explains what work is being pushed down
into the TableAM/CustomScan.

In that case it's not even clear to me that a collision is a problem.
Would you really only want pushdown information from extension A, and
be upset that it also emits pushdown information from extension B?
Maybe we should just allow multiple extensions to use the same option
name?

One fairly big problem with that idea is that options need not be
simple Booleans. If extension A adds PUSHDOWN { 'summary' | 'detail'
} and extension B adds PUSHDOWN { 'scan' | 'join' | 'agg' }, it's
definitely not going to work, even if we arrange to call handlers for
both extensions. If we want to have EXPLAIN keywords that have
well-defined meanings that span across different
FDWs/tableAMs/whatever, I think we should add those to core and
document what we expect extensions to do. This mechanism is really for
cases where you need a completely new option that only you will care
about.

Actually, I don't think custom scans or table AMs are the design
center for this feature. Keep in mind that, for a custom scan, we
already have ExplainCustomScan as part of CustomScanState->methods. We
don't currently have anything similar for table AMs, and you could
work around that with these hooks, by checking every node to see
whether it's a scan node and if so whether it scans a relation that
matches your table AM, but it would probably be better to have a
method for it, similar to what we have for CustomScan, if this is
something people want to do. It would be more efficient and require
less code.

Granted, those interfaces don't let you add completely new options,
but I think actually for custom scans and table AMs you most likely
want to just display whatever details you have unconditionally, or
conditional on es->verbose. I'm not sure there's a real reason to add
a new option that gates, say, postgres_fdw's display of join pushdown
information. That seems awfully specific. You'd need a reason why you
would want to control that display separately from everything else,
and the use case for that seems thin.

Where I see this being more useful is for people who want to display
additional information for plan nodes that they did not implement. For
example, my EXPLAIN (RANGE_TABLE) option dumps every
range-table-related fact it can find in the Plan tree. That includes
both information about which plan nodes (as displayed by EXPLAIN) are
scanning which RTIs, and also a list of all the RTIs and a bunch of
properties of each one. When you're trying to show information by
topic, as in this case, hooks on individual node types like custom
scan or (hypothetically) table AM callbacks don't help you get the job
done -- and neither do the existing options.

That having been said, the vigorous response to this proposal so far
suggests to me that people probably will want to use this for things
that are a bit different than what I had in mind, and that's fine.
Maybe there's even some way to adjust this proposal so that it can
suit some of those other use cases better. But, I'm doubtful that
letting unrelated extensions try to share the same option name is that
thing.

--
Robert Haas
EDB: http://www.enterprisedb.com

#21Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#20)
#22Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#20)
#23Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#20)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#22)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Jeff Davis (#23)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#24)
#27Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#27)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#29)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#31)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#32)
#34Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#31)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#33)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#34)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#36)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#35)
#39Sami Imseih
samimseih@gmail.com
In reply to: Tom Lane (#38)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sami Imseih (#39)
#41Sami Imseih
samimseih@gmail.com
In reply to: Tom Lane (#40)
#42Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#41)
#43Andrei Lepikhov
lepihov@gmail.com
In reply to: Sami Imseih (#41)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#42)
#45Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#44)
#46Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#35)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#45)
#48Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#46)
#49Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#48)
#50Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#49)
#51Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#47)
#52Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#50)
#53Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#52)
#54Sami Imseih
samimseih@gmail.com
In reply to: Sami Imseih (#51)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#54)
#56Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#55)
#57Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#56)
#58Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#57)
#59Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#58)
#60Andrei Lepikhov
lepihov@gmail.com
In reply to: Sami Imseih (#58)
#61Sami Imseih
samimseih@gmail.com
In reply to: Andrei Lepikhov (#60)
#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sami Imseih (#61)
#63Andrei Lepikhov
lepihov@gmail.com
In reply to: Sami Imseih (#61)
#64Sami Imseih
samimseih@gmail.com
In reply to: Andrei Lepikhov (#63)
#65Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#63)
#66Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#65)
#67Robert Haas
robertmhaas@gmail.com
In reply to: Sami Imseih (#66)
#68Sami Imseih
samimseih@gmail.com
In reply to: Robert Haas (#67)
#69Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sami Imseih (#68)
#70Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#69)
#71Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#70)
#72Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#71)
#73Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#72)
#74Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#73)
#75Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#73)
#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#74)
#77Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#76)
#78Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#73)
#79zengman
zengman@halodbtech.com
In reply to: Robert Haas (#77)
#80Pavel Luzanov
p.luzanov@postgrespro.ru
In reply to: zengman (#79)
#81Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#78)
#82Robert Haas
robertmhaas@gmail.com
In reply to: zengman (#79)
#83Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Luzanov (#80)
#84Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#81)
#85Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#84)
#86Andrei Lepikhov
lepihov@gmail.com
In reply to: Robert Haas (#85)
#87Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#86)
#88Robert Haas
robertmhaas@gmail.com
In reply to: Andrei Lepikhov (#17)