WIP: document the hook system
Please find attached :)
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachments:
v1-0001-WIP-Document-the-hooks-system.patch (text/x-diff; charset=us-ascii)
From 5152b3f5255ec91b6ea34b76ea765a26d392d3ac Mon Sep 17 00:00:00 2001
From: David Fetter <david@fetter.org>
Date: Wed, 30 Dec 2020 19:13:57 -0800
Subject: [PATCH v1] WIP: Document the hooks system
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.29.2"
This is a multi-part message in MIME format.
--------------2.29.2
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit
Outline of something that should probably have more detail, but it's
probably better than nothing at all.
create mode 100644 doc/src/sgml/hooks.sgml
--------------2.29.2
Content-Type: text/x-patch; name="v1-0001-WIP-Document-the-hooks-system.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="v1-0001-WIP-Document-the-hooks-system.patch"
diff --git doc/src/sgml/filelist.sgml doc/src/sgml/filelist.sgml
index 38e8aa0bbf..16f78b371a 100644
--- doc/src/sgml/filelist.sgml
+++ doc/src/sgml/filelist.sgml
@@ -58,6 +58,7 @@
<!ENTITY external-projects SYSTEM "external-projects.sgml">
<!ENTITY func-ref SYSTEM "func-ref.sgml">
<!ENTITY infoschema SYSTEM "information_schema.sgml">
+<!ENTITY hooks SYSTEM "hooks.sgml">
<!ENTITY libpq SYSTEM "libpq.sgml">
<!ENTITY lobj SYSTEM "lobj.sgml">
<!ENTITY rules SYSTEM "rules.sgml">
diff --git doc/src/sgml/hooks.sgml doc/src/sgml/hooks.sgml
new file mode 100644
index 0000000000..5891f74dc8
--- /dev/null
+++ doc/src/sgml/hooks.sgml
@@ -0,0 +1,321 @@
+<!-- doc/src/sgml/hooks.sgml -->
+
+<chapter id="hooks">
+ <title>Hooks System</title>
+
+ <indexterm>
+ <primary>Hook</primary>
+ </indexterm>
+
+ <para>
+ This chapter explains the <productname>PostgreSQL</productname>
+ hooks system, a way to inject functionality into many different parts
+ of the database via function pointers set by shared libraries. Such libraries
+ need only honor the hooks API, which consists roughly of the following:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Initialize a hook of the correct type for what you are doing, conventionally
+ using a function with signature:
+<programlisting>
+void _PG_init(void)
+</programlisting>
+ which initializes whatever the hook code needs initialized. Here, you will cache
+ any previous hook(s) with code that looks like:
+<programlisting>
+prev_Foo = Foo_hook;
+prev_Bar = Bar_hook;
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Let your imagination run wild, but try very hard not to crash while doing so.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Restore cached hooks in a function conventionally called
+<programlisting>
+void _PG_fini(void)
+</programlisting>
+ with code that looks like:
+<programlisting>
+Foo_hook = prev_Foo;
+Bar_hook = prev_Bar;
+</programlisting>
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ </para>
+
+ <sect1 id="available-hooks">
+ <title>Available Hooks</title>
+
+ <para>
+ The following backend hooks are available for your use:
+ </para>
+
+ <sect2 id="analyze-hook">
+ <title>Analyze hook</title>
+
+ <para>
+ Take control at the end of parse analysis.
+ </para>
+ </sect2>
+
+ <sect2 id="auth-hook">
+ <title>Authentication hook</title>
+
+ <para>
+ Take control in <literal>ClientAuthentication</literal>.
+ </para>
+ </sect2>
+
+ <sect2 id="elog-hook">
+ <title>Logging hook</title>
+
+ <para>
+ Intercept messages before they are sent to the server log.
+ </para>
+ </sect2>
+
+ <sect2 id="executor-hook">
+ <title>Executor hooks</title>
+
+ <para>
+ The following hooks control the executor:
+ </para>
+
+ <sect3 id="executor-start-hook">
+ <title>Executor Start Hook</title>
+
+ <para>
+ Take control in <literal>ExecutorStart()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="executor-run-hook">
+ <title>Executor Run Hook</title>
+
+ <para>
+ Take control in <literal>ExecutorRun()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="executor-finish-hook">
+ <title>Executor Finish Hook</title>
+
+ <para>
+ Take control in <literal>ExecutorFinish()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="executor-end-hook">
+ <title>Executor End Hook</title>
+
+ <para>
+ Take control in <literal>ExecutorEnd()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="executor-check-permissions-hook">
+ <title>Executor Check Permissions Hook</title>
+
+ <para>
+ Take control in <literal>ExecutorCheckRTPerms()</literal>.
+ </para>
+ </sect3>
+
+ </sect2>
+
+ <sect2 id="explain-hook">
+ <title>Explain hook</title>
+
+ <para>
+ Take control of <command>EXPLAIN</command>.
+ </para>
+
+ <sect3 id="explain-one-query-hook">
+ <title>Explain One Query</title>
+
+ <para>
+ Take control in <literal>ExplainOneQuery()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="explain-get-index-name">
+ <title>Explain Get Index Name</title>
+
+ <para>
+ Take control in <literal>explain_get_index_name()</literal>.
+ </para>
+ </sect3>
+
+ </sect2>
+
+ <sect2 id="fmgr-hook">
+ <title>Function Call Hooks</title>
+
+ <sect3 id="needs-fmgr-hook">
+ <title>Needs Function Manager</title>
+
+ <para>
+ Look around and decide whether the function manager hook is needed.
+ </para>
+ </sect3>
+
+ <sect3 id="fmgr-execution-hook">
+ <title>Function Manager</title>
+
+ <para>
+ Take control at start, end, or abort of a <literal>ROUTINE</literal>.
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="ipc-hook">
+ <title>Shared Memory Startup</title>
+
+ <para>
+ Take control of chunks of shared memory at startup.
+ </para>
+ </sect2>
+
+ <sect2 id="openssl-tls-init-hook">
+ <title>OpenSSL TLS Init</title>
+
+ <para>
+ Take control in initialization of OpenSSL.
+ </para>
+ </sect2>
+
+ <sect2 id="get-attavgwidth-hook">
+ <title>Get Attribute Average Width</title>
+
+ <para>
+ Take control in <literal>get_attavgwidth()</literal>.
+ </para>
+ </sect2>
+
+ <sect2 id="object-access-hook">
+ <title>Object Access Hooks</title>
+
+ <para>
+ Take control just before or just after <command>CREATE</command>,
+ <command>DROP</command>, <command>ALTER</command>,
+ namespace search, function execution, and <command>TRUNCATE</command>.
+ </para>
+ </sect2>
+
+ <sect2 id="path-hook">
+ <title>Path (Optimizer) Hooks</title>
+
+ <para>
+ Take control during query planning.
+ </para>
+
+ <sect3 id="set-rel-pathlist-hook">
+ <title>Set Relation Pathlist</title>
+
+ <para>
+ Take control in <literal>set_rel_pathlist()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="set-join-pathlist-hook">
+ <title>Set Join Pathlist</title>
+
+ <para>
+ Take control in <literal>set_join_pathlist()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="join-search-hook">
+ <title>Join Search</title>
+
+ <para>
+ Take control in <literal>join_search()</literal>.
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="get-relation-info-hook">
+ <title>Get Relation Info Hook</title>
+
+ <para>
+ Take control in <literal>get_relation_info()</literal>.
+ </para>
+ </sect2>
+
+ <sect2 id="planner-hook">
+ <title>Planner Hooks</title>
+
+ <para>
+ Take control in parts of the planner intended to be called by the optimizer.
+ </para>
+
+ <sect3 id="planner-hook-planner">
+ <title>Planner Function Hook</title>
+
+ <para>
+ Take control in <literal>planner()</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="planner-hook-grouping-planner">
+ <title>Grouping Planner Hook</title>
+
+ <para>
+ Take control in <literal>grouping_planner()</literal>.
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="selectivity-hook">
+ <title>Selectivity Hooks</title>
+
+ <para>
+ Take control in bits that estimate selectivity.
+ </para>
+
+ <sect3 id="relation-stats-hook">
+ <title>Relation Stats Hook</title>
+
+ <para>
+ Take control when we ask for relation stats.
+ </para>
+ </sect3>
+
+ <sect3 id="index-stats-hook">
+ <title>Index Stats Hook</title>
+
+ <para>
+ Take control when we ask for index stats.
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="user-hook">
+ <title>Check Password Hook</title>
+
+ <para>
+ Take control when calling <literal>CreateRole()</literal> and
+ <literal>AlterRole()</literal>.
+ </para>
+ </sect2>
+
+ <sect2 id="process-utility-hook">
+ <title>Process Utility Hook</title>
+
+ <para>
+ Take control in <literal>ProcessUtility()</literal>.
+ </para>
+ </sect2>
+ </sect1>
+
+</chapter>
diff --git doc/src/sgml/postgres.sgml doc/src/sgml/postgres.sgml
index 730d5fdc34..f45d395acd 100644
--- doc/src/sgml/postgres.sgml
+++ doc/src/sgml/postgres.sgml
@@ -233,6 +233,7 @@ break is not needed in a wider output rendering.
&bgworker;
&logicaldecoding;
&replication-origins;
+ &hooks;
</part>
--------------2.29.2--
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
On Fri, Jan 15, 2021 at 12:28 AM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
Typo - the entity definition hooks should be listed before infoschema, not
after.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
Yeah, there seems to be a point further along this path we want to reach.
In particular, having a complete example would be nice. Also needing
explaining is the whole hook swapping thing (I don't think "cache" is the
right term to use here) - like "why" it is important and how it's useful
given that the "Foo_hook" is never assigned away from "prev_Foo" so the
restoration in _PG_fini seems redundant (I see the full example does assign
the various old values away - but why is that needed and why doesn't another
module doing the same thing end up clobbering this one in a last-one-wins
way?)
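As a rough illustration of the hook swapping asked about above, here is a minimal standalone C sketch. It is not PostgreSQL code: Foo_hook and prev_Foo are the placeholder names from the patch, while standard_Foo, the trace buffer, and the two modules A and B are hypothetical stand-ins invented for this example. It assumes the usual convention that a module's hook function calls the previously installed hook (or the standard routine when there was none), which is what lets a second module chain onto the first instead of clobbering it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a backend hook variable; NULL means "no hook installed". */
typedef void (*Foo_hook_type) (char *trace, size_t size);
static Foo_hook_type Foo_hook = NULL;

/* Append s to trace if it fits; lets us observe the call order. */
static void
trace_append(char *trace, size_t size, const char *s)
{
    size_t used;

    assert(trace != NULL);
    used = strlen(trace);
    if (used + strlen(s) < size)
        strcpy(trace + used, s);
}

/* Hypothetical "core" behaviour that runs when no hook takes over. */
static void
standard_Foo(char *trace, size_t size)
{
    trace_append(trace, size, "core");
}

/* Hypothetical call site: use the hook if one is set, else the default. */
void
Foo(char *trace, size_t size)
{
    if (Foo_hook)
        Foo_hook(trace, size);
    else
        standard_Foo(trace, size);
}

/* ---- hypothetical module A ---- */
static Foo_hook_type a_prev_Foo = NULL;

static void
a_Foo(char *trace, size_t size)
{
    trace_append(trace, size, "A>");
    /* Pass control down the chain so earlier modules still run. */
    if (a_prev_Foo)
        a_prev_Foo(trace, size);
    else
        standard_Foo(trace, size);
}

void a_init(void) { a_prev_Foo = Foo_hook; Foo_hook = a_Foo; }
void a_fini(void) { Foo_hook = a_prev_Foo; }

/* ---- hypothetical module B, loaded after A ---- */
static Foo_hook_type b_prev_Foo = NULL;

static void
b_Foo(char *trace, size_t size)
{
    trace_append(trace, size, "B>");
    if (b_prev_Foo)
        b_prev_Foo(trace, size);
    else
        standard_Foo(trace, size);
}

void b_init(void) { b_prev_Foo = Foo_hook; Foo_hook = b_Foo; }
void b_fini(void) { Foo_hook = b_prev_Foo; }
```

In this sketch, calling a_init() and then b_init() makes Foo() run B's hook, then A's, then the standard routine. The saved pointer is what keeps A reachable after B overwrites Foo_hook, and restoring it only works cleanly if modules are removed in reverse order of installation.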
I would also be curious whether a static listing of hooks in the
documentation could instead be accomplished by writing a query and having
the build routine populate a "pg_hooks" catalog table which would
be referenced, and ideally could be queried at runtime to enumerate
installed hooks.
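The runtime-enumeration idea could look roughly like the following standalone C sketch. Everything in it is hypothetical: PostgreSQL has no pg_hooks catalog, and the HookEntry table, the single demo_hook_type signature, and the helper names are assumptions made purely for illustration (real hooks each have their own function-pointer type, so a real registry would need more care).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: every demo hook shares one signature, unlike real hooks. */
typedef void (*demo_hook_type) (void);

/* Placeholder hook variables, named after the patch's examples. */
static demo_hook_type Foo_hook = NULL;
static demo_hook_type Bar_hook = NULL;

/* One row of the hypothetical registry. */
typedef struct HookEntry
{
    const char     *name;       /* hook variable name */
    const char     *call_site;  /* where the hook is invoked */
    demo_hook_type *slot;       /* address of the hook variable */
} HookEntry;

/* Static listing that a build step could, in principle, generate. */
static const HookEntry hook_registry[] = {
    {"Foo_hook", "Foo()", &Foo_hook},
    {"Bar_hook", "Bar()", &Bar_hook},
};

#define N_HOOKS (sizeof(hook_registry) / sizeof(hook_registry[0]))

/* Count the hooks that currently have a function installed. */
size_t
count_installed_hooks(void)
{
    size_t n = 0;

    for (size_t i = 0; i < N_HOOKS; i++)
        if (*hook_registry[i].slot != NULL)
            n++;
    return n;
}

/* Print one line per known hook with its current state. */
void
list_hooks(FILE *out)
{
    assert(out != NULL);
    for (size_t i = 0; i < N_HOOKS; i++)
        fprintf(out, "%-10s %-8s %s\n",
                hook_registry[i].name,
                hook_registry[i].call_site,
                *hook_registry[i].slot ? "installed" : "empty");
}

/* A do-nothing hook function, used only to demonstrate installation. */
static void
my_Foo(void)
{
}

void
install_demo_Foo(void)
{
    Foo_hook = my_Foo;
}
```

Calling list_hooks(stdout) before and after install_demo_Foo() would show Foo_hook change from empty to installed. Whether such a table belongs in a catalog, a view, or only in generated documentation is left open here.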
Pointing out the presence of src/test/modules/test_rls_hooks would be
advised (in addition to a more minimal "hello world" like example in the
documentation itself). Having the test module point to the documentation
for more explanations would be good as well.
Coming in with fresh eyes the main thing I would care about is that these
exist, a brief idea of how they operate without having to dig into the
source code, and pointers on where to learn which ones exist (ideally
without digging into source code) and how to go about writing one (which
builds upon material already documented about extending the service using
the C programming language - so links there). I'm good with running a
catalog query to learn about which ones exist instead of reading them in
the documentation - though the latter has some appeal and if it can be
maintained as a build artefact alongside the catalog entries that would be
a bonus.
David J.
On Fri, Jan 15, 2021 at 8:28 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
Even just having a list of available hooks would be a nice improvement though :)
But maybe it's something that belongs better in a README file instead,
since as you say it's unlikely to be properly useful without looking
at the source anyway. But just a list of hooks and a *very* high
overview of where each of them hooks in would definitely be useful to
have somewhere, I think. Having to find with "grep" whether there may
or may not exist a hook for approximately what it is you're looking
for is definitely a process to improve on.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On 17.01.2021 16:53, Magnus Hagander wrote:
On Fri, Jan 15, 2021 at 8:28 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
Even just having a list of available hooks would be a nice improvement though :)
But maybe it's something that belongs better in a README file instead,
since as you say it's unlikely to be properly useful without looking
at the source anyway. But just a list of hooks and a *very* high
overview of where each of them hooks in would definitely be useful to
have somewhere, I think. Having to find with "grep" whether there may
or may not exist a hook for approximately what it is you're looking
for is definitely a process to improve on.
+1 for README.
Hooks are intended for developers and can be quite dangerous without
proper understanding of the internal code.
I also want to remind everyone about a README gathered by mentees [1]. It was
done under a PostgreSQL license, so we can use it.
By the way, is there any agreement on the plain-text format of
PostgreSQL README files, or can we use md?
[1]: https://github.com/AmatanHead/psql-hooks/blob/master/Detailed.md
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 15.01.21 08:28, Peter Eisentraut wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
There hasn't been any meaningful progress on this, and no new patch to
look at, so I'm proposing to set this as returned with feedback.
On 3/4/21 10:00 AM, Peter Eisentraut wrote:
On 15.01.21 08:28, Peter Eisentraut wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal.
I think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
There hasn't been any meaningful progress on this, and no new patch to
look at, so I'm proposing to set this as returned with feedback.
+1. I'll close it on March 9 unless there are objections.
Regards,
--
-David
david@pgmasters.net
On Fri, Feb 12, 2021 at 08:02:51PM +0300, Anastasia Lubennikova wrote:
On 17.01.2021 16:53, Magnus Hagander wrote:
On Fri, Jan 15, 2021 at 8:28 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
On 2020-12-31 04:28, David Fetter wrote:
This could probably use a lot of filling in, but having it in the
actual documentation beats needing to know folklore even to know
that the capability is there.
This patch seems quite short of a state where one could begin to
evaluate it. Documenting the hooks better seems a worthwhile goal. I
think the question is whether we can develop documentation that is
genuinely useful by itself without studying the relevant source code.
This submission does not address that question.
Even just having a list of available hooks would be a nice improvement though :)
But maybe it's something that belongs better in a README file instead,
since as you say it's unlikely to be properly useful without looking
at the source anyway. But just a list of hooks and a *very* high
overview of where each of them hooks in would definitely be useful to
have somewhere, I think. Having to find with "grep" whether there may
or may not exist a hook for approximately what it is you're looking
for is definitely a process to improve on.
+1 for README.
Hooks are intended for developers and can be quite dangerous without proper
understanding of the internal code.
I also want to remind about a readme gathered by mentees [1]. It was done
under a PostgreSQL license, so we can use it.
By the way, is there any agreement on the plain-text format of PostgreSQL
README files or we can use md?
[1] https://github.com/AmatanHead/psql-hooks/blob/master/Detailed.md
This is much more thorough than what I've done so far, and a much
better document in terms of pointing to actual hunks of the source for
context.
I'm -1 on making a README alone. These are public APIs, and as such,
the fact of their existence shouldn't be a mystery discoverable only
by knowing that there's something to look for in the source tree and
then running an appropriate grep command to find the current ones.
Would a document simply listing current hooks and pointing to
something along the lines of README_hooks in src work better?
Best,
David.
David Fetter <david@fetter.org> writes:
I'm -1 on making a README alone. These are public APIs, and as such,
the fact of their existence shouldn't be a mystery discoverable only
by knowing that there's something to look for in the source tree and
then running an appropriate grep command to find the current ones.
Meh. Almost always, effective use of a hook requires a substantial
amount of code-reading, so I don't have much use for the idea that
hook users shouldn't need to be familiar with how to find things in
the source tree. Now, you could argue "if we had a higher standard
of documentation for hooks, that wouldn't be necessary" ... to
which I'd reply "if we enforced such a standard of documentation,
there would be approximately zero hooks". Almost all the ones that
exist got in there partly because of the low overhead involved in
adding one.
Moreover, if by "public API" you mean something we're promising
to hold stable, then there won't be approximately zero hooks,
there will be *precisely* zero hooks. I can't think of any
interesting hook that isn't in a place where relevant APIs
change regularly. (I think the SPI entry points might be the
only backend-internal functions that we treat as stable APIs
in that sense.) The more documentation you expect to exist for
a hook, the more likely that some of it will be out of date.
This situation won't be helped any by our community's proven
track record of failing to update comments that are more than
three lines away from the code they're changing. (OK, I'm being
unduly negative here, perhaps. But this is a very real problem.)
I think that the best you should hope for here is that people are
willing to add a short, not-too-detailed para to a markup-free
plain-text README file that lists all the hooks. As soon as it
gets any more complex than that, either the doco aspect will be
ignored, or there simply won't be any more hooks.
(I'm afraid I likewise don't believe in the idea of carrying a test
module for each hook. Again, requiring that is a good way to
ensure that new hooks just won't happen.)
regards, tom lane
On Sat, Mar 6, 2021 at 08:32:43PM -0500, Tom Lane wrote:
I think that the best you should hope for here is that people are
willing to add a short, not-too-detailed para to a markup-free
plain-text README file that lists all the hooks. As soon as it
gets any more complex than that, either the doco aspect will be
ignored, or there simply won't be any more hooks.
(I'm afraid I likewise don't believe in the idea of carrying a test
module for each hook. Again, requiring that is a good way to
ensure that new hooks just won't happen.)
Agreed. If you document the hooks too much, it allows them to drift
away from matching the code, which makes the hook documentation actually
worse than having no hook documentation at all.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On 3/9/21 12:20 PM, Bruce Momjian wrote:
On Sat, Mar 6, 2021 at 08:32:43PM -0500, Tom Lane wrote:
I think that the best you should hope for here is that people are
willing to add a short, not-too-detailed para to a markup-free
plain-text README file that lists all the hooks. As soon as it
gets any more complex than that, either the doco aspect will be
ignored, or there simply won't be any more hooks.
(I'm afraid I likewise don't believe in the idea of carrying a test
module for each hook. Again, requiring that is a good way to
ensure that new hooks just won't happen.)
Agreed. If you document the hooks too much, it allows them to drift
away from matching the code, which makes the hook documentation actually
worse than having no hook documentation at all.
There doesn't seem to be agreement on how to proceed here, so closing.
David, if you do decide to proceed with a README then it would probably
be best to create a new thread/entry.
Regards,
--
-David
david@pgmasters.net
On Wed, Mar 10, 2021 at 09:38:39AM -0500, David Steele wrote:
On 3/9/21 12:20 PM, Bruce Momjian wrote:
On Sat, Mar 6, 2021 at 08:32:43PM -0500, Tom Lane wrote:
I think that the best you should hope for here is that people are
willing to add a short, not-too-detailed para to a markup-free
plain-text README file that lists all the hooks. As soon as it
gets any more complex than that, either the doco aspect will be
ignored, or there simply won't be any more hooks.
(I'm afraid I likewise don't believe in the idea of carrying a test
module for each hook. Again, requiring that is a good way to
ensure that new hooks just won't happen.)
Agreed. If you document the hooks too much, it allows them to drift
away from matching the code, which makes the hook documentation actually
worse than having no hook documentation at all.
There doesn't seem to be agreement on how to proceed here, so closing.
David, if you do decide to proceed with a README then it would probably be
best to create a new thread/entry.
Thanks for the work on this and the helpful feedback!
Best,
David.