[PATCH] pg_stat_activity: make slow/hanging authentication more visible

Started by Jacob Championalmost 2 years ago66 messages
Jump to latest
#1Jacob Champion
jacob.champion@enterprisedb.com

Hi all,

Recently I dealt with a server where PAM had hung a connection
indefinitely, suppressing our authentication timeout and preventing a
clean shutdown. Worse, the xmin that was pinned by the opening
transaction cascaded to replicas and started messing things up
downstream.

The DBAs didn't know what was going on, because pg_stat_activity
doesn't report the authenticating connection or its open transaction.
It just looked like a Postgres bug. And while talking about it with
Euler, he mentioned he'd seen similar "invisible" hangs with
misbehaving LDAP deployments. I think we can do better to show DBAs
what's happening.

0001, attached, changes InitPostgres() to report a nearly-complete
pgstat entry before entering client authentication, then fills it in
the rest of the way once we know who the user is. Here's a sample
entry for a client that's hung during a SCRAM exchange:

=# select * from pg_stat_activity where state = 'authenticating';
-[ RECORD 1 ]----+------------------------------
datid |
datname |
pid | 745662
leader_pid |
usesysid |
usename |
application_name |
client_addr | 127.0.0.1
client_hostname |
client_port | 38304
backend_start | 2024-05-06 11:25:23.905923-07
xact_start |
query_start |
state_change |
wait_event_type | Client
wait_event | ClientRead
state | authenticating
backend_xid |
backend_xmin | 784
query_id |
query |
backend_type | client backend

0002 goes even further, and adds wait events for various forms of
external authentication, but it's not fully baked. The intent is for a
DBA to be able to see when a bunch of connections are piling up
waiting for PAM/Kerberos/whatever. (I'm also motivated by my OAuth
patchset, where there's a server-side plugin that we have no control
over, and we'd want to be able to correctly point fingers at it if
things go wrong.)

= Open Issues, Idle Thoughts =

Maybe it's wishful thinking, but it'd be cool if a misbehaving
authentication exchange did not impact replicas in any way. Is there a
way to make that opening transaction lighterweight?

0001 may be a little too much code. There are only two parts of
pgstat_bestart() that need to be modified: omit the user ID, and fill
in the state as 'authenticating' rather than unknown. I could just add
the `pre_auth` boolean to the signature of pgstat_bestart() directly,
if we don't mind adjusting all the call sites. We could also avoid
changing the signature entirely, and just assume that we're
authenticating if SessionUserId isn't set. That felt like a little too
much global magic to me, though.

Would anyone like me to be more aggressive, and create a pgstat entry
as soon as we have the opening transaction? Or... as soon as a
connection is made?

0002 is abusing the "IPC" wait event class. If the general idea seems
okay, maybe we could add an "External" class that encompasses the
general idea of "it's not our fault, it's someone else's"?

I had trouble deciding how granular to make the areas that are covered
by the new wait events. Ideally they would kick in only when we call
out to an external system, but for some authentication types, that's a
lot of calls to wrap. On the other extreme, we don't want to go too
high in the call stack and accidentally nest wait events (such as
those generated during pq_getmessage()). What I have now is not very
principled.

I haven't decided how to test these patches. Seems like a potential
use case for injection points, but I think I'd need to preload an
injection library rather than using the existing extension. Does that
seem like an okay way to go?

Thanks,
--Jacob

Attachments:

0001-pgstat-report-in-earlier-with-STATE_AUTHENTICATING.patchapplication/octet-stream; name=0001-pgstat-report-in-earlier-with-STATE_AUTHENTICATING.patchDownload+47-5
0002-WIP-report-external-auth-calls-as-wait-events.patchapplication/octet-stream; name=0002-WIP-report-external-auth-calls-as-wait-events.patchDownload+49-13
#2Noah Misch
noah@leadboat.com
In reply to: Jacob Champion (#1)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Mon, May 06, 2024 at 02:23:38PM -0700, Jacob Champion wrote:

=# select * from pg_stat_activity where state = 'authenticating';
-[ RECORD 1 ]----+------------------------------
datid |
datname |
pid | 745662
leader_pid |
usesysid |
usename |
application_name |
client_addr | 127.0.0.1
client_hostname |
client_port | 38304
backend_start | 2024-05-06 11:25:23.905923-07
xact_start |
query_start |
state_change |
wait_event_type | Client
wait_event | ClientRead
state | authenticating
backend_xid |
backend_xmin | 784
query_id |
query |
backend_type | client backend

That looks like a reasonable user experience. Is any field newly-nullable?

= Open Issues, Idle Thoughts =

Maybe it's wishful thinking, but it'd be cool if a misbehaving
authentication exchange did not impact replicas in any way. Is there a
way to make that opening transaction lighterweight?

You could release the xmin before calling PAM or LDAP. If you've copied all
relevant catalog content to local memory, that's fine to do. That said, it
may be more fruitful to arrange for authentication timeout to cut through PAM
etc. Hanging connection slots hurt even if they lack an xmin. I assume it
takes an immediate shutdown to fix them?

Would anyone like me to be more aggressive, and create a pgstat entry
as soon as we have the opening transaction? Or... as soon as a
connection is made?

All else being equal, I'd like backends to have one before taking any lmgr
lock or snapshot.

I haven't decided how to test these patches. Seems like a potential
use case for injection points, but I think I'd need to preload an
injection library rather than using the existing extension. Does that
seem like an okay way to go?

Yes.

Thanks,
nm

#3Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Noah Misch (#2)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Sun, Jun 30, 2024 at 10:48 AM Noah Misch <noah@leadboat.com> wrote:

That looks like a reasonable user experience. Is any field newly-nullable?

Technically I think the answer is no, since backends such as walwriter
already have null database and user fields. It's new for a client
backend to have nulls there, though.

That said, it
may be more fruitful to arrange for authentication timeout to cut through PAM
etc.

That seems mostly out of our hands -- the misbehaving modules are free
to ignore our signals (and do). Is there another way to force the
issue?

Hanging connection slots hurt even if they lack an xmin.

Oh, would releasing the xmin not really move the needle, then?

I assume it
takes an immediate shutdown to fix them?

That's my understanding, yeah.

Would anyone like me to be more aggressive, and create a pgstat entry
as soon as we have the opening transaction? Or... as soon as a
connection is made?

All else being equal, I'd like backends to have one before taking any lmgr
lock or snapshot.

I can look at this for the next patchset version.

I haven't decided how to test these patches. Seems like a potential
use case for injection points, but I think I'd need to preload an
injection library rather than using the existing extension. Does that
seem like an okay way to go?

Yes.

I misunderstood how injection points worked. No preload module needed,
so v2 adds a waitpoint and a test along with a couple of needed tweaks
to BackgroundPsql. I think 0001 should probably be applied
independently.

Thanks,
--Jacob

Attachments:

v2-0001-BackgroundPsql-handle-empty-query-results.patchapplication/octet-stream; name=v2-0001-BackgroundPsql-handle-empty-query-results.patchDownload+5-10
v2-0002-Test-Cluster-let-background_psql-work-asynchronou.patchapplication/octet-stream; name=v2-0002-Test-Cluster-let-background_psql-work-asynchronou.patchDownload+26-8
v2-0004-WIP-report-external-auth-calls-as-wait-events.patchapplication/octet-stream; name=v2-0004-WIP-report-external-auth-calls-as-wait-events.patchDownload+49-13
v2-0003-pgstat-report-in-earlier-with-STATE_AUTHENTICATIN.patchapplication/octet-stream; name=v2-0003-pgstat-report-in-earlier-with-STATE_AUTHENTICATIN.patchDownload+128-5
#4Noah Misch
noah@leadboat.com
In reply to: Jacob Champion (#3)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Mon, Jul 08, 2024 at 02:09:21PM -0700, Jacob Champion wrote:

On Sun, Jun 30, 2024 at 10:48 AM Noah Misch <noah@leadboat.com> wrote:

That said, it
may be more fruitful to arrange for authentication timeout to cut through PAM
etc.

That seems mostly out of our hands -- the misbehaving modules are free
to ignore our signals (and do). Is there another way to force the
issue?

Two ways at least (neither of them cheap):
- Invoke PAM in a subprocess, and SIGKILL that process if needed.
- Modify the module to be interruptible.

Hanging connection slots hurt even if they lack an xmin.

Oh, would releasing the xmin not really move the needle, then?

It still moves the needle.

#5Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Noah Misch (#2)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Sun, Jun 30, 2024 at 10:48 AM Noah Misch <noah@leadboat.com> wrote:v

Would anyone like me to be more aggressive, and create a pgstat entry
as soon as we have the opening transaction? Or... as soon as a
connection is made?

All else being equal, I'd like backends to have one before taking any lmgr
lock or snapshot.

v3-0003 pushes the pgstat creation as far back as I felt comfortable,
right after the PGPROC registration by InitProcessPhase2(). That
function does lock the ProcArray, but if it gets held forever due to
some bug, you won't be able to use pg_stat_activity to debug it
anyway. And with this ordering, pg_stat_get_activity() will be able to
retrieve the proc entry by PID without a race.

This approach ends up registering an early entry for more cases than
the original patchset. For example, autovacuum and other background
workers will now briefly get their own "authenticating" state, which
seems like it could potentially confuse people. Should I rename the
state, or am I overthinking it?

You could release the xmin before calling PAM or LDAP. If you've copied all
relevant catalog content to local memory, that's fine to do.

I played with the xmin problem a little bit, but I've shelved it for
now. There's probably a way to do that safely; I just don't understand
enough about the invariants to do it. For example, there's a comment
later on that says

* We established a catalog snapshot while reading pg_authid and/or
* pg_database;

and I'm a little nervous about invalidating the snapshot halfway
through that process. Even if PAM and LDAP don't rely on pg_authid or
other shared catalogs today, shouldn't they be allowed to in the
future, without being coupled to InitPostgres implementation order?
And I don't think we can move the pg_database checks before
authentication.

As for the other patches, I'll ping Andrew about 0001, and 0004
remains in its original WIP state. Anyone excited about that wait
event idea?

Thanks!
--Jacob

Attachments:

v3-0002-Test-Cluster-let-background_psql-work-asynchronou.patchapplication/octet-stream; name=v3-0002-Test-Cluster-let-background_psql-work-asynchronou.patchDownload+26-8
v3-0001-BackgroundPsql-handle-empty-query-results.patchapplication/octet-stream; name=v3-0001-BackgroundPsql-handle-empty-query-results.patchDownload+5-10
v3-0003-pgstat-report-in-earlier-with-STATE_AUTHENTICATIN.patchapplication/octet-stream; name=v3-0003-pgstat-report-in-earlier-with-STATE_AUTHENTICATIN.patchDownload+134-8
v3-0004-WIP-report-external-auth-calls-as-wait-events.patchapplication/octet-stream; name=v3-0004-WIP-report-external-auth-calls-as-wait-events.patchDownload+49-13
#6Andrew Dunstan
andrew@dunslane.net
In reply to: Jacob Champion (#5)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On 2024-08-29 Th 4:44 PM, Jacob Champion wrote:

As for the other patches, I'll ping Andrew about 0001,

Patch 0001 looks sane to me.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#7Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#6)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Fri, Aug 30, 2024 at 04:10:32PM -0400, Andrew Dunstan wrote:

On 2024-08-29 Th 4:44 PM, Jacob Champion wrote:

As for the other patches, I'll ping Andrew about 0001,

Patch 0001 looks sane to me.

So does 0002 to me. I'm not much a fan of the addition of
pgstat_bestart_pre_auth(), which is just a shortcut to set a different
state in the backend entry to tell that it is authenticating. Is
authenticating the term for this state of the process startups,
actually? Could it be more transparent to use a "startup" or
"starting"" state instead that gets also used by pgstat_bestart() in
the case of the patch where !pre_auth?

The addition of the new wait event states in 0004 is a good idea,
indeed, and these can be seen in pg_stat_activity once we get out of
PGSTAT_END_WRITE_ACTIVITY() (err.. Right?).
--
Michael

#8Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#7)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Aug 30, 2024 at 04:10:32PM -0400, Andrew Dunstan wrote:

Patch 0001 looks sane to me.

So does 0002 to me.

Thanks both!

I'm not much a fan of the addition of
pgstat_bestart_pre_auth(), which is just a shortcut to set a different
state in the backend entry to tell that it is authenticating. Is
authenticating the term for this state of the process startups,
actually? Could it be more transparent to use a "startup" or
"starting"" state instead

Yeah, I think I should rename that. Especially if we adopt new wait
states to make it obvious where we're stuck.

"startup", "starting", "initializing", "connecting"...?

that gets also used by pgstat_bestart() in
the case of the patch where !pre_auth?

To clarify, do you want me to just add the new boolean directly to
pgstat_bestart()'s parameter list?

The addition of the new wait event states in 0004 is a good idea,
indeed,

Thanks! Any thoughts on the two open questions for it?:
1) Should we add a new wait event class rather than reusing IPC?
2) Is the level at which I've inserted calls to
pgstat_report_wait_start()/_end() sane and maintainable?

and these can be seen in pg_stat_activity once we get out of
PGSTAT_END_WRITE_ACTIVITY() (err.. Right?).

It doesn't look like pgstat_report_wait_start() uses that machinery.

--Jacob

#9Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#8)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 03, 2024 at 02:47:57PM -0700, Jacob Champion wrote:

On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier <michael@paquier.xyz> wrote:

that gets also used by pgstat_bestart() in
the case of the patch where !pre_auth?

To clarify, do you want me to just add the new boolean directly to
pgstat_bestart()'s parameter list?

No. My question was about splitting pgstat_bestart() and
pgstat_bestart_pre_auth() in a cleaner way, because authenticated
connections finish by calling both, meaning that we do twice the same
setup for backend entries depending on the authentication path taken.
That seems like a waste.

The addition of the new wait event states in 0004 is a good idea,
indeed,

Thanks! Any thoughts on the two open questions for it?:
1) Should we add a new wait event class rather than reusing IPC?

A new category would be more adapted. IPC is not adapted because are
not waiting for another server process. Perhaps just use a new
"Authentication" class, as in "The server is waiting for an
authentication operation to complete"?

2) Is the level at which I've inserted calls to
pgstat_report_wait_start()/_end() sane and maintainable?

These don't worry me. You are adding twelve event points with only 5
new wait names. Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

I am not really on board with the test based on injection points
proposed, though. It checks that the "authenticating" flag is set in
pg_stat_activity, but it does nothing else. That seems limited. Or
are you planning for more?
--
Michael

#10Noah Misch
noah@leadboat.com
In reply to: Michael Paquier (#9)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:

You are adding twelve event points with only 5
new wait names. Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

No, I think the patch's level of detail is better. One shouldn't expect the
two ldap_simple_bind_s() calls to have different-enough performance
characteristics to justify exposing that level of detail to the DBA.
ldap_search_s() and InitializeLDAPConnection() differ more, but the DBA mostly
just needs to know the scale of their LDAP responsiveness problem.

(Someday, it might be good to expose the file:line and/or backtrace associated
with a wait, like we do for ereport(). As a way to satisfy rare needs for
more detail, I'd prefer that over giving every pgstat_report_wait_start() a
different name.)

#11Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#10)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 10, 2024 at 1:27 PM Noah Misch <noah@leadboat.com> wrote:

On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:

You are adding twelve event points with only 5
new wait names. Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

No, I think the patch's level of detail is better. One shouldn't expect the
two ldap_simple_bind_s() calls to have different-enough performance
characteristics to justify exposing that level of detail to the DBA.
ldap_search_s() and InitializeLDAPConnection() differ more, but the DBA mostly
just needs to know the scale of their LDAP responsiveness problem.

(Someday, it might be good to expose the file:line and/or backtrace associated
with a wait, like we do for ereport(). As a way to satisfy rare needs for
more detail, I'd prefer that over giving every pgstat_report_wait_start() a
different name.)

I think unique names are a good idea. If a user doesn't care about the
difference between sdgjsA and sdjgsB, they can easily ignore the
trailing suffix, and IME, people typically do that without really
stopping to think about it. If on the other hand the two are lumped
together as sdjgs and a user needs to distinguish them, they can't. So
I see unique names as having much more upside than downside.

--
Robert Haas
EDB: http://www.enterprisedb.com

#12Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#11)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 10, 2024 at 02:51:23PM -0400, Robert Haas wrote:

On Tue, Sep 10, 2024 at 1:27 PM Noah Misch <noah@leadboat.com> wrote:

On Tue, Sep 10, 2024 at 02:29:57PM +0900, Michael Paquier wrote:

You are adding twelve event points with only 5
new wait names. Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

No, I think the patch's level of detail is better. One shouldn't expect the
two ldap_simple_bind_s() calls to have different-enough performance
characteristics to justify exposing that level of detail to the DBA.
ldap_search_s() and InitializeLDAPConnection() differ more, but the DBA mostly
just needs to know the scale of their LDAP responsiveness problem.

(Someday, it might be good to expose the file:line and/or backtrace associated
with a wait, like we do for ereport(). As a way to satisfy rare needs for
more detail, I'd prefer that over giving every pgstat_report_wait_start() a
different name.)

I think unique names are a good idea. If a user doesn't care about the
difference between sdgjsA and sdjgsB, they can easily ignore the
trailing suffix, and IME, people typically do that without really
stopping to think about it. If on the other hand the two are lumped
together as sdjgs and a user needs to distinguish them, they can't. So
I see unique names as having much more upside than downside.

I agree a person can ignore the distinction, but that requires the person to
be consuming the raw event list. It's reasonable to tell your monitoring tool
to give you the top N wait events. Individual AuthnLdap* events may all miss
the cut even though their aggregate would have made the cut. Before you know
to teach that monitoring tool to group AuthnLdap* together, it won't show you
any of those names.

I felt commit c789f0f also chose sub-optimally in this respect, particularly
with the DblinkGetConnect/DblinkConnect pair. I didn't feel strongly enough
to complain at the time, but a rule of "each wait event appears in one
pgstat_report_wait_start()" would be a rule I don't want. One needs
familiarity with the dblink implementation internals to grasp the
DblinkGetConnect/DblinkConnect distinction, and a plausible refactor of dblink
would make those names cease to fit. I see this level of fine-grained naming
as making the event name a sort of stable proxy for FILE:LINE. I'd value
exposing such a proxy, all else being equal, but I don't think wait event
names like AuthLdapBindLdapbinddn/AuthLdapBindUser are the right way. Wait
event names should be more independent of today's code-level details.

#13Michael Paquier
michael@paquier.xyz
In reply to: Noah Misch (#12)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 10, 2024 at 01:58:50PM -0700, Noah Misch wrote:

On Tue, Sep 10, 2024 at 02:51:23PM -0400, Robert Haas wrote:

I think unique names are a good idea. If a user doesn't care about the
difference between sdgjsA and sdjgsB, they can easily ignore the
trailing suffix, and IME, people typically do that without really
stopping to think about it. If on the other hand the two are lumped
together as sdjgs and a user needs to distinguish them, they can't. So
I see unique names as having much more upside than downside.

I agree a person can ignore the distinction, but that requires the person to
be consuming the raw event list. It's reasonable to tell your monitoring tool
to give you the top N wait events. Individual AuthnLdap* events may all miss
the cut even though their aggregate would have made the cut. Before you know
to teach that monitoring tool to group AuthnLdap* together, it won't show you
any of those names.

That's a fair point. I use a bunch of aggregates with group bys for
any monitoring queries looking for event point patterns. In my
experience, when dealing with enough connections, patterns show up
anyway even if there is noise because some of the events that I was
looking for are rather short-term, like a sync events interleaving
with locks storing an average of the events into a secondary table
with some INSERT SELECT.

I felt commit c789f0f also chose sub-optimally in this respect, particularly
with the DblinkGetConnect/DblinkConnect pair. I didn't feel strongly enough
to complain at the time, but a rule of "each wait event appears in one
pgstat_report_wait_start()" would be a rule I don't want. One needs
familiarity with the dblink implementation internals to grasp the
DblinkGetConnect/DblinkConnect distinction, and a plausible refactor of dblink
would make those names cease to fit. I see this level of fine-grained naming
as making the event name a sort of stable proxy for FILE:LINE. I'd value
exposing such a proxy, all else being equal, but I don't think wait event
names like AuthLdapBindLdapbinddn/AuthLdapBindUser are the right way. Wait
event names should be more independent of today's code-level details.

Depends. I'd rather choose more granularity to know exactly which
part of the code I am dealing with, especially in the case of this
thread where these are embedded around external function calls. If,
for example, one notices that a stack of pg_stat_activity scans are
complaining about a specific step in the authentication process, it is
going to offer a much better hint than having to guess which part of
the authentication step is slow, like in LDAP.

Wait event additions are also kind of cheap in terms of maintenance in
core, creating a new translation cost. So I also think there are more
upsides to be wilder here with more points and more granularity.
--
Michael

#14Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#12)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Tue, Sep 10, 2024 at 4:58 PM Noah Misch <noah@leadboat.com> wrote:

... a rule of "each wait event appears in one
pgstat_report_wait_start()" would be a rule I don't want.

As the original committer of the wait event stuff, I intended for the
rule that you do not want to be the actual rule. However, I see that I
didn't spell that out anywhere in the commit message, or the commit
itself.

I see this level of fine-grained naming
as making the event name a sort of stable proxy for FILE:LINE. I'd value
exposing such a proxy, all else being equal, but I don't think wait event
names like AuthLdapBindLdapbinddn/AuthLdapBindUser are the right way. Wait
event names should be more independent of today's code-level details.

I don't agree with that. One of the most difficult parts of supporting
PostgreSQL, in my experience, is that it's often very difficult to
find out what has gone wrong when a system starts behaving badly. It
is often necessary to ask customers to install a debugger and do stuff
with it, or give them an instrumented build, in order to determine the
root cause of a problem that in some cases is not even particularly
complicated. While needing to refer to specific source code details
may not be a common experience for the typical end user, it is
extremely common for me. This problem commonly arises with error
messages, because we have lots of error messages that are exactly the
same, although thankfully it has become less common due to "could not
find tuple for THINGY %u" no longer being a message that no longer
typically reaches users. But even when someone has a complaint about
an error message and there are multiple instances of that error
message, I know that:

(1) I can ask them to set the error verbosity to verbose. I don't have
that option for wait events.

(2) The primary function of the error message is to be understandable
to the user, which means that it needs to be written in plain English.
The primary function of a wait event is to make it possible to
understand the behavior of the system and troubleshoot problems, and
it becomes much less effective as soon as it starts saying that thing
A and thing B are so similar that nobody will ever care about the
distinction. It is very hard to be certain of that. When somebody
reports that they've got a whole bunch of wait events on some wait
event that nobody has ever complained about before, I want to go look
at the code in that specific place and try to figure out what's
happening. If I have to start imagining possible scenarios based on 2
or more call sites, or if I have to start by getting them to install a
modified build with those properly split apart and trying to reproduce
the problem, it's a lot harder.

In my experience, the number of distinct wait events that a particular
installation experiences is rarely very large. It is probably measured
in dozens. A user who wishes to disregard the distinction between
similarly-named wait events won't find it prohibitively difficult to
look over the list of all the wait events they ever see and decide
which ones they'd like to merge for reporting purposes. But a user who
really needs things separated out and finds that they aren't is simply
out of luck.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#9)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Mon, Sep 9, 2024 at 10:30 PM Michael Paquier <michael@paquier.xyz> wrote:

No. My question was about splitting pgstat_bestart() and
pgstat_bestart_pre_auth() in a cleaner way, because authenticated
connections finish by calling both, meaning that we do twice the same
setup for backend entries depending on the authentication path taken.
That seems like a waste.

I can try to separate them out. I'm a little wary of messing with the
CRITICAL_SECTION guarantees, though. I thought the idea was that you
filled in the entire struct to prevent tearing. (If I've misunderstood
that, please let me know :D)

Perhaps just use a new
"Authentication" class, as in "The server is waiting for an
authentication operation to complete"?

Sounds good.

Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

(I have no strong opinions on this myself, but while the debate is
ongoing, I'll work on a version of the patch with more detailed wait
events. It's easy to collapse them again if that gets the most votes.)

I am not really on board with the test based on injection points
proposed, though. It checks that the "authenticating" flag is set in
pg_stat_activity, but it does nothing else. That seems limited. Or
are you planning for more?

I can test for specific contents of the entry, if you'd like. My
primary goal was to test that an entry shows up if that part of the
code hangs. I think a regression would otherwise go completely
unnoticed.

Thanks!
--Jacob

#16Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#15)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Wed, Sep 11, 2024 at 02:29:49PM -0700, Jacob Champion wrote:

On Mon, Sep 9, 2024 at 10:30 PM Michael Paquier <michael@paquier.xyz> wrote:

No. My question was about splitting pgstat_bestart() and
pgstat_bestart_pre_auth() in a cleaner way, because authenticated
connections finish by calling both, meaning that we do twice the same
setup for backend entries depending on the authentication path taken.
That seems like a waste.

I can try to separate them out. I'm a little wary of messing with the
CRITICAL_SECTION guarantees, though. I thought the idea was that you
filled in the entire struct to prevent tearing. (If I've misunderstood
that, please let me know :D)

Hm, yeah. We surely should be careful about the consequences of that.
Setting up twice the structure as the patch proposes is kind of
a weird concept, but it feels to me that we should split that and set
the fields in the pre-auth step and ignore the irrelevant ones, then
complete the rest in a second step. We are going to do that anyway if
we want to be able to have backend entries earlier in the
authentication phase.

Couldn't it be better to have a one-one mapping
instead, adding twelve entries in wait_event_names.txt?

(I have no strong opinions on this myself, but while the debate is
ongoing, I'll work on a version of the patch with more detailed wait
events. It's easy to collapse them again if that gets the most votes.)

Thanks. Robert is arguing upthread about more granularity, which is
also what I understand is the original intention of the wait events.
Noah has a different view. Let's see where it goes but I've given my
opinion.

I can test for specific contents of the entry, if you'd like. My
primary goal was to test that an entry shows up if that part of the
code hangs. I think a regression would otherwise go completely
unnoticed.

Perhaps that would be useful, not sure. Based on my first
impressions, I'd tend to say no to these extra test cycles, but I'm
okay to be proved wrong, as well.
--
Michael

#17Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#14)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Wed, Sep 11, 2024 at 09:00:33AM -0400, Robert Haas wrote:

On Tue, Sep 10, 2024 at 4:58 PM Noah Misch <noah@leadboat.com> wrote:

... a rule of "each wait event appears in one
pgstat_report_wait_start()" would be a rule I don't want.

As the original committer of the wait event stuff, I intended for the
rule that you do not want to be the actual rule. However, I see that I
didn't spell that out anywhere in the commit message, or the commit
itself.

I see this level of fine-grained naming
as making the event name a sort of stable proxy for FILE:LINE. I'd value
exposing such a proxy, all else being equal, but I don't think wait event
names like AuthLdapBindLdapbinddn/AuthLdapBindUser are the right way. Wait
event names should be more independent of today's code-level details.

I don't agree with that. One of the most difficult parts of supporting
PostgreSQL, in my experience, is that it's often very difficult to
find out what has gone wrong when a system starts behaving badly. It
is often necessary to ask customers to install a debugger and do stuff
with it, or give them an instrumented build, in order to determine the
root cause of a problem that in some cases is not even particularly
complicated. While needing to refer to specific source code details
may not be a common experience for the typical end user, it is
extremely common for me. This problem commonly arises with error
messages

That is a problem. Half the time, error verbosity doesn't disambiguate enough
for me, and I need backtrace_functions. I now find it hard to believe how
long we coped without backtrace_functions.

I withdraw the objection to "each wait event appears in one
pgstat_report_wait_start()".

#18Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Noah Misch (#17)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

Hi all,

Here's a v4, with a separate wait event for each location. (I could
use some eyes on the specific phrasing I've chosen for each one.)

On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier <michael@paquier.xyz> wrote:

Could it be more transparent to use a "startup" or
"starting"" state instead that gets also used by pgstat_bestart() in
the case of the patch where !pre_auth?

Done. (I've used "starting".)

On Mon, Sep 9, 2024 at 10:30 PM Michael Paquier <michael@paquier.xyz> wrote:

A new category would be more adapted. IPC is not adapted because are
not waiting for another server process. Perhaps just use a new
"Authentication" class, as in "The server is waiting for an
authentication operation to complete"?

Added a new "Auth" class (to cover both authn and authz during
startup), plus documentation.

On Wed, Sep 11, 2024 at 4:42 PM Michael Paquier <michael@paquier.xyz> wrote:

Setting up twice the structure as the patch proposes is kind of
a weird concept, but it feels to me that we should split that and set
the fields in the pre-auth step and ignore the irrelevant ones, then
complete the rest in a second step.

The more I look at this, the more uneasy I feel about the goal. Best I
can tell, the pre-auth step can't ignore irrelevant fields, because
they may contain junk from the previous owner of the shared memory. So
if we want to optimize, we can only change the second step to skip
fields that were already filled in by the pre-auth step.

That has its own problems: not every backend type uses the pre-auth
step in the current patch. Which means a bunch of backends that don't
benefit from the two-step initialization nevertheless have to either
do two PGSTAT_BEGIN_WRITE_ACTIVITY() dances in a row, or else we
duplicate a bunch of the logic to make sure they maintain the same
efficient code path as before.

Finally, if we're okay with all of that, future maintainers need to be
careful about which fields get copied in the first (preauth) step, the
second step, or both. GSS, for example, can be set up during transport
negotiation (first step) or authentication (second step), so we have
to duplicate the logic there. SSL is currently first-step-only, I
think -- but are we sure we want to hardcode the assumption that cert
auth can't change any of those parameters after the transport has been
established? (I've been brainstorming ways we might use TLS 1.3's
post-handshake CertificateRequest, for example.)

So before I commit to this path, I just want to double-check that all
of the above sounds good and non-controversial. :)

--

In the meantime, is anyone willing and able to commit 0001 and/or 0002?

Thanks!
--Jacob

Attachments:

since-v3.diff.txttext/plain; charset=US-ASCII; name=since-v3.diff.txtDownload
v4-0001-BackgroundPsql-handle-empty-query-results.patchapplication/octet-stream; name=v4-0001-BackgroundPsql-handle-empty-query-results.patchDownload+5-10
v4-0002-Test-Cluster-let-background_psql-work-asynchronou.patchapplication/octet-stream; name=v4-0002-Test-Cluster-let-background_psql-work-asynchronou.patchDownload+26-8
v4-0003-pgstat-report-in-earlier-with-STATE_STARTING.patchapplication/octet-stream; name=v4-0003-pgstat-report-in-earlier-with-STATE_STARTING.patchDownload+140-8
v4-0004-Report-external-auth-calls-as-wait-events.patchapplication/octet-stream; name=v4-0004-Report-external-auth-calls-as-wait-events.patchDownload+90-12
#19Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#18)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Fri, Nov 01, 2024 at 02:47:38PM -0700, Jacob Champion wrote:

On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier <michael@paquier.xyz> wrote:

Could it be more transparent to use a "startup" or
"starting"" state instead that gets also used by pgstat_bestart() in
the case of the patch where !pre_auth?

Done. (I've used "starting".)

0003 looks much cleaner this way.

Added a new "Auth" class (to cover both authn and authz during
startup), plus documentation.

+PAM_ACCT_MGMT	"Waiting for the local PAM service to validate the user account."
+PAM_AUTHENTICATE	"Waiting for the local PAM service to authenticate the user."

Is "local" required for both? Perhaps just use "the PAM service".

+SSPI_LOOKUP_ACCOUNT_SID "Waiting for Windows to find the user's account SID."

We don't document SID in doc/. So perhaps this should add be "SID
(system identifier)".

+SSPI_MAKE_UPN "Waiting for Windows to translate a Kerberos UPN."

UPN is mentioned once in doc/ already. Perhaps this one is OK left
alone..

Except for these tweaks 0004 looks OK.

The more I look at this, the more uneasy I feel about the goal. Best I
can tell, the pre-auth step can't ignore irrelevant fields, because
they may contain junk from the previous owner of the shared memory. So
if we want to optimize, we can only change the second step to skip
fields that were already filled in by the pre-auth step.

That has its own problems: not every backend type uses the pre-auth
step in the current patch. Which means a bunch of backends that don't
benefit from the two-step initialization nevertheless have to either
do two PGSTAT_BEGIN_WRITE_ACTIVITY() dances in a row, or else we
duplicate a bunch of the logic to make sure they maintain the same
efficient code path as before.

Finally, if we're okay with all of that, future maintainers need to be
careful about which fields get copied in the first (preauth) step, the
second step, or both. GSS, for example, can be set up during transport
negotiation (first step) or authentication (second step), so we have
to duplicate the logic there. SSL is currently first-step-only, I
think -- but are we sure we want to hardcode the assumption that cert
auth can't change any of those parameters after the transport has been
established? (I've been brainstorming ways we might use TLS 1.3's
post-handshake CertificateRequest, for example.)

The future field maintenance and what one would need to think more
about in the future is a good point. I still feel slightly uneasy
about the way 0003 is shaped with its new pgstat_bestart_pre_auth(),
but I think that I'm just going to put my hands down on 0003 and see
if I can finish with something I'm a bit more comfortable with. Let's
see..

So before I commit to this path, I just want to double-check that all
of the above sounds good and non-controversial. :)

The goal of the thread is sound.

I'm OK with 0002 to add the wait parameter to BackgroundPsql and be
able to take some actions until a manual wait_connect(). I'll go do
this one. Also perhaps 0001 while on it but I am a bit puzzled by the
removal of the three ok() calls in 037_invalid_database.pl.
--
Michael

#20Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#19)
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible

On Wed, Nov 06, 2024 at 02:48:31PM +0900, Michael Paquier wrote:

I'm OK with 0002 to add the wait parameter to BackgroundPsql and be
able to take some actions until a manual wait_connect(). I'll go do
this one. Also perhaps 0001 while on it but I am a bit puzzled by the
removal of the three ok() calls in 037_invalid_database.pl.

0002 has been done as ba08edb06545 after adding a bit more
documentation that was missing. 0001 as well with 70291a3c66ec. The
original expectation of 037_invalid_database.pl with the banner data
expected in the output was interesting..

Note that 0003 is lacking an EXTRA_INSTALL in the Makefile of
src/test/authentication/, or the test would fail if doing for example
a `make check` in this path.

The following nit is also required in the script for installcheck, to
skip the test if the module is not installed:
if (!$node->check_extension('injection_points'))
{
plan skip_all => 'Extension injection_points not installed';
}

See src/test/modules/test_misc/t/005_timeouts.pl as one example. (I
know, these are tricky to know about..)

007_injection_points.pl is a name too generic as it could apply in a
lot more places, without being linked to injection points. How about
something like 007_pre_auth.pl?
--
Michael

#21Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#21)
#23Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#22)
#24Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#23)
#25Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#25)
#27Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#26)
#28Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#27)
#29Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#28)
#30Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#29)
#31Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#30)
#32Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#31)
#33Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#32)
#34Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#33)
#35Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#34)
#36Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#35)
#37Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#36)
#38Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#37)
#39Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#38)
#40Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#39)
#41Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#40)
#42Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#41)
#43Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#41)
#44Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#43)
#45Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#44)
#46Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#45)
#47Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#46)
#48Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#47)
#49Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#48)
#50Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#49)
#51Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#50)
#52Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#51)
#53Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#52)
#54Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#53)
#55Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Michael Paquier (#54)
#56Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#55)
#57Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#56)
#58Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#57)
#59Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#58)
#60Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#59)
#61Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#60)
#62Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#61)
#63Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#62)
#64Andres Freund
andres@anarazel.de
In reply to: Jacob Champion (#63)
#65Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Andres Freund (#64)
#66Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#65)