Add new option 'all' to pg_stat_reset_shared()

Started by torikoshiaover 2 years ago41 messageshackers
Jump to latest
#1torikoshia
torikoshia@oss.nttdata.com

Hi,

After 96f052613f3, we have below 6 types of parameter for
pg_stat_reset_shared().

"archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch",
"wal"

How about adding a new option 'all' to delete all targets above?

I imagine there are cases where people want to initialize all of them at
the same time in addition to initializing one at a time.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: torikoshia (#1)
Re: Add new option 'all' to pg_stat_reset_shared()

At Mon, 30 Oct 2023 16:35:19 +0900, torikoshia <torikoshia@oss.nttdata.com> wrote in

Hi,

After 96f052613f3, we have below 6 types of parameter for
pg_stat_reset_shared().

"archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch",
"wal"

How about adding a new option 'all' to delete all targets above?

I imagine there are cases where people want to initialize all of them
at the same time in addition to initializing one at a time.

FWIW, I fairly often wanted it, but forgot about that:p

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Kyotaro Horiguchi (#2)
Re: Add new option 'all' to pg_stat_reset_shared()

On Mon, Oct 30, 2023 at 1:47 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 30 Oct 2023 16:35:19 +0900, torikoshia <torikoshia@oss.nttdata.com> wrote in

Hi,

After 96f052613f3, we have below 6 types of parameter for
pg_stat_reset_shared().

"archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch",
"wal"

How about adding a new option 'all' to delete all targets above?

I imagine there are cases where people want to initialize all of them
at the same time in addition to initializing one at a time.

FWIW, I fairly often wanted it, but forgot about that:p

Isn't calling pg_stat_reset_shared() for all stats types helping you
out? Is there any problem with it? Can you be more specific about the
use-case?

IMV, I don't see any point for adding another pseudo (rather
non-existent) shared stats target which might confuse users - it's
easy to specify pg_stat_reset_shared('all'); to clear things out when
someone actually doesn't want to reset all - an accidental usage of
the 'all' option will reset all shared memory stats.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#4Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#3)
Re: Add new option 'all' to pg_stat_reset_shared()

At Mon, 30 Oct 2023 14:15:53 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in

I imagine there are cases where people want to initialize all of them
at the same time in addition to initializing one at a time.

FWIW, I fairly often wanted it, but forgot about that:p

Isn't calling pg_stat_reset_shared() for all stats types helping you
out? Is there any problem with it? Can you be more specific about the
use-case?

Oh. Sorry, it's my bad. pg_stat_reset_shared() is sufficient.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#5torikoshia
torikoshia@oss.nttdata.com
In reply to: Kyotaro Horiguchi (#4)
Re: Add new option 'all' to pg_stat_reset_shared()

On Mon, Oct 30, 2023 at 5:46 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks for the comments!

Isn't calling pg_stat_reset_shared() for all stats types helping you
out? Is there any problem with it? Can you be more specific about the
use-case?

Yes, calling pg_stat_reset_shared() for all stats types can do what I
wanted to do.
But calling it with 6 different parameters seems tiresome and I thought
it would be convenient to have a parameter to delete all cluster-wide
statistics at once.

I may be wrong, but I imagine that it's more common to want to delete
all of the statistics for an entire cluster rather than just a portion
of it.

IMV, I don't see any point for adding another pseudo (rather
non-existent) shared stats target which might confuse users - it's
easy to specify pg_stat_reset_shared('all'); to clear things out when
someone actually doesn't want to reset all - an accidental usage of
the 'all' option will reset all shared memory stats.

I once considered changing the pg_stat_reset_shared() to delete all
stats when called without parameters like pg_stat_statements_reset(),
but gave it up since it can confuse users as you described.

I was hoping that the need to specify 'all' would remind users that the
target can be specified individually.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#6Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#5)
Re: Add new option 'all' to pg_stat_reset_shared()

On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote:

Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted
to do.
But calling it with 6 different parameters seems tiresome and I thought it
would be convenient to have a parameter to delete all cluster-wide
statistics at once.

I may be wrong, but I imagine that it's more common to want to delete all of
the statistics for an entire cluster rather than just a portion of it.

If more flexibility is wanted in this function, could it be an option
to consider a flavor like pg_stat_reset_shared(text[]), where it is
possible to specify a list of shared stats types to reset? Perhaps
there are no real use cases for it, just wanted to mention it anyway
regarding the fact that it could have benefits to refactor this code
to use a bitwise operator for its internals with bit flags for each
type.
--
Michael

#7Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#6)
Re: Add new option 'all' to pg_stat_reset_shared()

On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote:

Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted
to do.
But calling it with 6 different parameters seems tiresome and I thought it
would be convenient to have a parameter to delete all cluster-wide
statistics at once.

I may be wrong, but I imagine that it's more common to want to delete all of
the statistics for an entire cluster rather than just a portion of it.

If more flexibility is wanted in this function, could it be an option
to consider a flavor like pg_stat_reset_shared(text[]), where it is
possible to specify a list of shared stats types to reset? Perhaps
there are no real use cases for it, just wanted to mention it anyway
regarding the fact that it could have benefits to refactor this code
to use a bitwise operator for its internals with bit flags for each
type.

I don't see a strong reason to introduce yet-another API when someone
can just call things in a loop. I could recollect a recent analogy - a
proposal to have a way to define multiple custom wait events with a
single function call instead of callers defining in a loop didn't draw
much interest.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#8Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#7)
Re: Add new option 'all' to pg_stat_reset_shared()

On Thu, 2 Nov 2023 at 20:26, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote:

Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted
to do.
But calling it with 6 different parameters seems tiresome and I thought it
would be convenient to have a parameter to delete all cluster-wide
statistics at once.

I may be wrong, but I imagine that it's more common to want to delete all of
the statistics for an entire cluster rather than just a portion of it.

If more flexibility is wanted in this function, could it be an option
to consider a flavor like pg_stat_reset_shared(text[]), where it is
possible to specify a list of shared stats types to reset? Perhaps
there are no real use cases for it, just wanted to mention it anyway
regarding the fact that it could have benefits to refactor this code
to use a bitwise operator for its internals with bit flags for each
type.

I don't see a strong reason to introduce yet-another API when someone
can just call things in a loop. I could recollect a recent analogy - a
proposal to have a way to define multiple custom wait events with a
single function call instead of callers defining in a loop didn't draw
much interest.

Knowing that your metrics have a shared starting point can be quite
valuable, as it allows you to do some math that would otherwise be
much less accurate when working with stats over a short amount of
time. I've not used these stats systems much myself, but skew between
metrics caused by different reset points can be difficult to detect
and debug, so I think an atomic call to reset all these stats could be
worth implementing.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#9Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#7)
Re: Add new option 'all' to pg_stat_reset_shared()

Hi,

On 2023-11-03 00:55:00 +0530, Bharath Rupireddy wrote:

On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote:

Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted
to do.
But calling it with 6 different parameters seems tiresome and I thought it
would be convenient to have a parameter to delete all cluster-wide
statistics at once.
I may be wrong, but I imagine that it's more common to want to delete all of
the statistics for an entire cluster rather than just a portion of it.

Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), instead of
pg_stat_reset_shared('all') for this purpose.

If more flexibility is wanted in this function, could it be an option
to consider a flavor like pg_stat_reset_shared(text[]), where it is
possible to specify a list of shared stats types to reset? Perhaps
there are no real use cases for it, just wanted to mention it anyway
regarding the fact that it could have benefits to refactor this code
to use a bitwise operator for its internals with bit flags for each
type.

I don't think there is much point in such an API - if the caller actually
wants to delete individual stats, it's not too hard to loop.

But most of the time resetting individual stats doesn't make sense. IMO it was
a mistake to ever add the ability. But that ship has sailed.

I don't see a strong reason to introduce yet-another API when someone
can just call things in a loop.

I don't agree at all. That requires callers to know the set of possible values
that stats need to be reset for - which has grown over time. But nearly all
the time the right thing to do is to reset *all* shared stats, not just some.

I could recollect a recent analogy - a
proposal to have a way to define multiple custom wait events with a
single function call instead of callers defining in a loop didn't draw
much interest.

That's not analoguous - in your example the caller by definition knows the set
of wait events it wants to create. Introducing a batch API wouldn't change
that. But in case of resetting all stats the caller does *not* inherently
know the set of stats types.

Greetings,

Andres Freund

#10torikoshia
torikoshia@oss.nttdata.com
In reply to: Andres Freund (#9)
Re: Add new option 'all' to pg_stat_reset_shared()

Thanks all for the comments!

On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Knowing that your metrics have a shared starting point can be quite
valuable, as it allows you to do some math that would otherwise be
much less accurate when working with stats over a short amount of
time. I've not used these stats systems much myself, but skew between
metrics caused by different reset points can be difficult to detect
and debug, so I think an atomic call to reset all these stats could be
worth implementing.

Since each stats, except wal_prefetch was reset acquiring LWLock,
attached PoC patch makes the call atomic by using these LWlocks.

If this is the right direction, I'll try to make wal_prefetch also take
LWLock.

On 2023-11-04 10:49, Andres Freund wrote:

Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(),
instead of
pg_stat_reset_shared('all') for this purpose.

In the attached PoC patch the shared statistics are reset by calling
pg_stat_reset_shared() not pg_stat_reset_shared('all').

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

Attachments:

v1-0001-WIP-Add-ability-to-reset-all-statistics-to-pg_sta.patchtext/x-diff; name=v1-0001-WIP-Add-ability-to-reset-all-statistics-to-pg_sta.patchDownload+93-2
#11Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#9)
Re: Add new option 'all' to pg_stat_reset_shared()

On Sat, Nov 4, 2023 at 7:19 AM Andres Freund <andres@anarazel.de> wrote:

On 2023-11-03 00:55:00 +0530, Bharath Rupireddy wrote:

On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote:

Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted
to do.
But calling it with 6 different parameters seems tiresome and I thought it
would be convenient to have a parameter to delete all cluster-wide
statistics at once.
I may be wrong, but I imagine that it's more common to want to delete all of
the statistics for an entire cluster rather than just a portion of it.

Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), instead of
pg_stat_reset_shared('all') for this purpose.

An overloaded function seems a better choice than another target
'all'. I'm all +1 for it as others seem to concur with the idea of
having something to reset all shared stats.

If more flexibility is wanted in this function, could it be an option
to consider a flavor like pg_stat_reset_shared(text[]), where it is
possible to specify a list of shared stats types to reset? Perhaps
there are no real use cases for it, just wanted to mention it anyway
regarding the fact that it could have benefits to refactor this code
to use a bitwise operator for its internals with bit flags for each
type.

I don't think there is much point in such an API - if the caller actually
wants to delete individual stats, it's not too hard to loop.

-1 for text[] version.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#12Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#10)
Re: Add new option 'all' to pg_stat_reset_shared()

On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

Thanks all for the comments!

On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Knowing that your metrics have a shared starting point can be quite
valuable, as it allows you to do some math that would otherwise be
much less accurate when working with stats over a short amount of
time. I've not used these stats systems much myself, but skew between
metrics caused by different reset points can be difficult to detect
and debug, so I think an atomic call to reset all these stats could be
worth implementing.

Since each stats, except wal_prefetch was reset acquiring LWLock,
attached PoC patch makes the call atomic by using these LWlocks.

If this is the right direction, I'll try to make wal_prefetch also take
LWLock.

+    // Acquire LWLocks
+    LWLock *locks[] = {&stats_archiver->lock,  &stats_bgwriter->lock,
+                       &stats_checkpointer->lock, &stats_wal->lock};
+
+    for (int i = 0; i < STATS_SHARED_NUM_LWLOCK; i++)
+        LWLockAcquire(locks[i], LW_EXCLUSIVE);
+
+    for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+    {
+        LWLock    *bktype_lock = &stats_io->locks[i];
+        LWLockAcquire(bktype_lock, LW_EXCLUSIVE);
+    }

Well, that's a total of ~17 LWLocks this new function takes to make
the stats reset atomic. I'm not sure if this atomicity is worth the
effort which can easily be misused - what if someone runs something
like SELECT pg_stat_reset_shared() FROM generate_series(1,
100000....n); to cause heavy lock acquisition and release cycles?

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

On 2023-11-04 10:49, Andres Freund wrote:

Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(),
instead of
pg_stat_reset_shared('all') for this purpose.

In the attached PoC patch the shared statistics are reset by calling
pg_stat_reset_shared() not pg_stat_reset_shared('all').

Some quick comments:

1.
+/*
+pg_stat_reset_shared_all(PG_FUNCTION_ARGS)
+{
+    pgstat_reset_shared_all();
+    PG_RETURN_VOID();
+}

IMO, simpler way is to move the existing code in
pg_stat_reset_shared() to a common internal function like
pgstat_reset_shared(char *target) and the pg_stat_reset_shared_all()
can just loop over all the stats targets.

2.
+{ oid => '8000',
+  descr => 'statistics: reset collected statistics shared across the cluster',
+  proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void',
+  proargtypes => '', prosrc => 'pg_stat_reset_shared_all' },

Why a new function consuming the oid? Why can't we just do the trick
of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else
{reset specified stats kind} like the pg_stat_reset_slru()?

3. I think the new reset all stats function must also consider
resetting all SLRU stats, no?
/* stats for fixed-numbered objects */
PGSTAT_KIND_ARCHIVER,
PGSTAT_KIND_BGWRITER,
PGSTAT_KIND_CHECKPOINTER,
PGSTAT_KIND_IO,
PGSTAT_KIND_SLRU,
PGSTAT_KIND_WAL,

4. I think the new reset all stats function must also consider
resetting recovery_prefetch.

5.
+        If no argument is specified, reset all these views at once.
+        Note current patch is WIP and pg_stat_recovery_prefetch is not reset.
        </para>

How about "If the argument is NULL, all counters shown in all of these
views are reset."?

6. Add a test case to cover the code in stats.sql.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Bharath Rupireddy (#12)
Re: Add new option 'all' to pg_stat_reset_shared()

At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in

On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

Since each stats, except wal_prefetch was reset acquiring LWLock,
attached PoC patch makes the call atomic by using these LWlocks.

If this is the right direction, I'll try to make wal_prefetch also take
LWLock.

I must say, I have reservations about this approach. The main concern
is the duplication of reset code, which has been efficiently
encapsulated for individual targets, into this location. This practice
degrades the maintainability and clarity of the code.

Well, that's a total of ~17 LWLocks this new function takes to make
the stats reset atomic. I'm not sure if this atomicity is worth the
effort which can easily be misused - what if someone runs something
like SELECT pg_stat_reset_shared() FROM generate_series(1,
100000....n); to cause heavy lock acquisition and release cycles?

...

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

The infrequent use of this feature, coupled with the fact that there
is no inherent need for these counters to be reset simultaneoulsy,
leads me to think that there is little point in cnetralizing the
locks.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#14Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#13)
Re: Add new option 'all' to pg_stat_reset_shared()

On Wed, Nov 08, 2023 at 10:08:42AM +0900, Kyotaro Horiguchi wrote:

At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in
I must say, I have reservations about this approach. The main concern
is the duplication of reset code, which has been efficiently
encapsulated for individual targets, into this location. This practice
degrades the maintainability and clarity of the code.

+{ oid => '8000',
This OID pick had me smile.

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

The infrequent use of this feature, coupled with the fact that there
is no inherent need for these counters to be reset simultaneoulsy,
leads me to think that there is little point in centralizing the
locks.

Each stat listed with fixed_amount has meaning taken in isolation, so
I don't see why this patch has to be that complicated. I'd expect one
code path that just calls a series of pgstat_reset_of_kind(). There
could be an argument for a new routine in pgstat.c that loops over the
pgstat_kind_infos and triggers the callbacks where .fixed_amount is
set, but that's less transparent than the other approach. The reset
time should be consistent across all the calls as we rely on
GetCurrentTimestamp().
--
Michael

#15Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#12)
Re: Add new option 'all' to pg_stat_reset_shared()

Hi,

On 2023-11-06 14:00:13 +0530, Bharath Rupireddy wrote:

On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

Thanks all for the comments!

On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Knowing that your metrics have a shared starting point can be quite
valuable, as it allows you to do some math that would otherwise be
much less accurate when working with stats over a short amount of
time. I've not used these stats systems much myself, but skew between
metrics caused by different reset points can be difficult to detect
and debug, so I think an atomic call to reset all these stats could be
worth implementing.

Since each stats, except wal_prefetch was reset acquiring LWLock,
attached PoC patch makes the call atomic by using these LWlocks.

If this is the right direction, I'll try to make wal_prefetch also take
LWLock.

+    // Acquire LWLocks
+    LWLock *locks[] = {&stats_archiver->lock,  &stats_bgwriter->lock,
+                       &stats_checkpointer->lock, &stats_wal->lock};
+
+    for (int i = 0; i < STATS_SHARED_NUM_LWLOCK; i++)
+        LWLockAcquire(locks[i], LW_EXCLUSIVE);
+
+    for (int i = 0; i < BACKEND_NUM_TYPES; i++)
+    {
+        LWLock    *bktype_lock = &stats_io->locks[i];
+        LWLockAcquire(bktype_lock, LW_EXCLUSIVE);
+    }

Well, that's a total of ~17 LWLocks this new function takes to make
the stats reset atomic. I'm not sure if this atomicity is worth the
effort which can easily be misused - what if someone runs something
like SELECT pg_stat_reset_shared() FROM generate_series(1,
100000....n); to cause heavy lock acquisition and release cycles?

Yea, this seems like an *extremely* bad idea to me. Without careful analysis
it could very well cause deadlocks.

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

Yea. Additionally it's not really atomic regardless of the lwlocks, due to
various processes all accumulating in local counters first, and only
occasionally updating the shared data. So even after holding all the locks at
the same time, the shared stats would still not actually represent a truly
atomic state.

2.
+{ oid => '8000',
+  descr => 'statistics: reset collected statistics shared across the cluster',
+  proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void',
+  proargtypes => '', prosrc => 'pg_stat_reset_shared_all' },

Why a new function consuming the oid? Why can't we just do the trick
of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else
{reset specified stats kind} like the pg_stat_reset_slru()?

It's not like oids are a precious resource. It's a more confusing API to have
to have to specify a NULL as an argument than not having to do so. If we
really want to avoid a separate oid, a more sensible path would be to add a
default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in
system_functions.sql).

Greetings,

Andres Freund

#16Andres Freund
andres@anarazel.de
In reply to: Kyotaro Horiguchi (#13)
Re: Add new option 'all' to pg_stat_reset_shared()

Hi,

On 2023-11-08 10:08:42 +0900, Kyotaro Horiguchi wrote:

At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in

On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote:

Since each stats, except wal_prefetch was reset acquiring LWLock,
attached PoC patch makes the call atomic by using these LWlocks.

If this is the right direction, I'll try to make wal_prefetch also take
LWLock.

I must say, I have reservations about this approach. The main concern
is the duplication of reset code, which has been efficiently
encapsulated for individual targets, into this location. This practice
degrades the maintainability and clarity of the code.

Yes, as-is this seems to evolve the code in precisely the wrong direction. We
want less central awareness of different types of stats, not more. The
proposed new code is far longer than the current pg_stat_reset(), despite
doing something conceptually simpler.

Greetings,

Andres Freund

#17Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#15)
Re: Add new option 'all' to pg_stat_reset_shared()

On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote:

2.
+{ oid => '8000',
+  descr => 'statistics: reset collected statistics shared across the cluster',
+  proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void',
+  proargtypes => '', prosrc => 'pg_stat_reset_shared_all' },

Why a new function consuming the oid? Why can't we just do the trick
of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else
{reset specified stats kind} like the pg_stat_reset_slru()?

It's not like oids are a precious resource. It's a more confusing API to have
to have to specify a NULL as an argument than not having to do so. If we
really want to avoid a separate oid, a more sensible path would be to add a
default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in
system_functions.sql).

+1. Attached the patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Add-no-argument-support-for-pg_stat_reset_slru.texttext/plain; charset=US-ASCII; name=v1-0001-Add-no-argument-support-for-pg_stat_reset_slru.textDownload+14-7
#18Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Andres Freund (#15)
Re: Add new option 'all' to pg_stat_reset_shared()

On Wed, 8 Nov 2023 at 05:13, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-11-06 14:00:13 +0530, Bharath Rupireddy wrote:

Well, that's a total of ~17 LWLocks this new function takes to make
the stats reset atomic. I'm not sure if this atomicity is worth the
effort which can easily be misused - what if someone runs something
like SELECT pg_stat_reset_shared() FROM generate_series(1,
100000....n); to cause heavy lock acquisition and release cycles?

Yea, this seems like an *extremely* bad idea to me. Without careful analysis
it could very well cause deadlocks.

I didn't realize that it'd take 17 LwLocks to reset those stats; I
thought it was one shared system using the same lock, or a very
limited set of locks. Aquiring 17 locks is quite likely not worth the
chance of having to wait for some stats lock or another and thus
generating 'bubbles' in other stats gathering pipelines.

IMV, atomicity is not something that applies for the stats reset
operation because stats are approximate numbers by nature after all.
If the pg_stat_reset_shared() resets stats for only a bunch of stats
types and fails, it's the basic application programming style that
when a query fails it's the application that needs to have a retry
mechanism. FWIW, the atomicity doesn't apply today if someone wants to
reset stats in a loop for all stats types.

Yea. Additionally it's not really atomic regardless of the lwlocks, due to
various processes all accumulating in local counters first, and only
occasionally updating the shared data. So even after holding all the locks at
the same time, the shared stats would still not actually represent a truly
atomic state.

Good points that I hadn't thought much about yet. I agree that atomic
reset isn't worth implementing in this stats system - it's too costly
and complex to do it correctly.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#19Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#17)
Re: Add new option 'all' to pg_stat_reset_shared()

On Wed, Nov 08, 2023 at 02:15:24PM +0530, Bharath Rupireddy wrote:

On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote:

It's not like oids are a precious resource. It's a more confusing API to have
to have to specify a NULL as an argument than not having to do so. If we
really want to avoid a separate oid, a more sensible path would be to add a
default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in
system_functions.sql).

+1. Attached the patch.

-- Test that multiple SLRUs are reset when no specific SLRU provided to reset function
-SELECT pg_stat_reset_slru(NULL);
+SELECT pg_stat_reset_slru();

For the SLRU part, why not.

Hmm. What's the final plan for pg_stat_reset_shared(), then? An
equivalent that calls a series of pgstat_reset_of_kind()?
--
Michael

#20torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#19)
Re: Add new option 'all' to pg_stat_reset_shared()

On 2023-11-09 08:58, Michael Paquier wrote:

On Wed, Nov 08, 2023 at 02:15:24PM +0530, Bharath Rupireddy wrote:

On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de>
wrote:

It's not like oids are a precious resource. It's a more confusing API
to have
to have to specify a NULL as an argument than not having to do so. If
we
really want to avoid a separate oid, a more sensible path would be to
add a
default argument to pg_stat_reset_slru() (by doing a CREATE OR
REPLACE in
system_functions.sql).

+1. Attached the patch.

-- Test that multiple SLRUs are reset when no specific SLRU provided 
to reset function
-SELECT pg_stat_reset_slru(NULL);
+SELECT pg_stat_reset_slru();

For the SLRU part, why not.

Hmm. What's the final plan for pg_stat_reset_shared(), then? An
equivalent that calls a series of pgstat_reset_of_kind()?

Sorry for late reply and thanks for the feedbacks everyone.

As your 1st suggestion, I think "calls a series of
pgstat_reset_of_kind()" would be enough.

I am a little concerned about that the reset time is not the same and
that GetCurrentTimestamp() is called multiple times, but I think it
would be acceptable because the function is probably not used that often
and the reset time is not atomic in practice.

I'll attach the patch.

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation

#21Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#21)
#23torikoshia
torikoshia@oss.nttdata.com
In reply to: Andres Freund (#22)
#24Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#23)
#25torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#24)
#26Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Michael Paquier (#24)
#28torikoshia
torikoshia@oss.nttdata.com
In reply to: Andres Freund (#27)
#29Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#26)
#30Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#28)
#31torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#30)
#32Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#31)
#33Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: torikoshia (#31)
#34Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#33)
#35Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#34)
#36torikoshia
torikoshia@oss.nttdata.com
In reply to: torikoshia (#31)
#37Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#36)
#38torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#37)
#39Michael Paquier
michael@paquier.xyz
In reply to: torikoshia (#38)
#40Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#39)
#41torikoshia
torikoshia@oss.nttdata.com
In reply to: Michael Paquier (#40)