pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Hi,
Since
commit 960869da0803427d14335bba24393f414b476e2c
Author: Magnus Hagander <magnus@hagander.net>
Date: 2021-01-17 13:34:09 +0100
Add pg_stat_database counters for sessions and session time
pgstat_report_stat() does another timestamp computation via
pgstat_send_connstats(), despite typically having computed one just a few
lines before.
Given that timestamp computation isn't all that cheap, that's not great. Even
more, that additional timestamp computation makes things *less* accurate:
void
pgstat_report_stat(bool disconnect)
...
/*
* Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL
* msec since we last sent one, or the backend is about to exit.
*/
now = GetCurrentTransactionStopTimestamp();
if (!disconnect &&
!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
return;
/* for backends, send connection statistics */
if (MyBackendType == B_BACKEND)
pgstat_send_connstats(disconnect, last_report);
last_report = now;
and then pgstat_send_connstats() does:
/* session time since the last report */
TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report),
GetCurrentTimestamp(),
&secs, &usecs);
msg.m_session_time = secs * 1000000 + usecs;
We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.
I'm also not all that happy with sending yet another UDP packet for this. This
doubles the UDP packets to the stats collector in the common cases (unless
more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
been accessed). We already send plenty of "summary" information via
PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
separate message for connection stats?
Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c. That'd give us *more*
information for less costs, because we then could see idle/active times for
individual connections.
That'd likely be more change than what we would want to do at this point in
the release cycle though. But I think we ought to do something about the
increased overhead...
Greetings,
Andres Freund
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
Since
commit 960869da0803427d14335bba24393f414b476e2c
Author: Magnus Hagander <magnus@hagander.net>
Date: 2021-01-17 13:34:09 +0100Add pg_stat_database counters for sessions and session time
pgstat_report_stat() does another timestamp computation via
pgstat_send_connstats(), despite typically having computed one just a few
lines before.Given that timestamp computation isn't all that cheap, that's not great. Even
more, that additional timestamp computation makes things *less* accurate:void
pgstat_report_stat(bool disconnect)
...
/*
* Don't send a message unless it's been at least PGSTAT_STAT_INTERVAL
* msec since we last sent one, or the backend is about to exit.
*/
now = GetCurrentTransactionStopTimestamp();
if (!disconnect &&
!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
return;/* for backends, send connection statistics */
if (MyBackendType == B_BACKEND)
pgstat_send_connstats(disconnect, last_report);last_report = now;
and then pgstat_send_connstats() does:
/* session time since the last report */
TimestampDifference(((last_report == 0) ? MyStartTimestamp : last_report),
GetCurrentTimestamp(),
&secs, &usecs);
msg.m_session_time = secs * 1000000 + usecs;We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.
Makes sense to me. How about passing "now", which was just initialized from
GetCurrentTransactionStopTimestamp(), as additional parameter to
pgstat_send_connstats() and use that value instead of taking the current time?
I'm also not all that happy with sending yet another UDP packet for this. This
doubles the UDP packets to the stats collector in the common cases (unless
more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
been accessed). We already send plenty of "summary" information via
PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
separate message for connection stats?
Are you suggesting that connection statistics should be shoehorned into
some other statistics message? That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.
Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c. That'd give us *more*
information for less costs, because we then could see idle/active times for
individual connections.
That was my original take, but if I remember right, Magnus convinced me that
it would be more useful to have statistics for open sessions as well.
With a connection pool, connections can stay open for a very long time,
and the accuracy and usefulness of the statistics would become questionable.
That'd likely be more change than what we would want to do at this point in
the release cycle though. But I think we ought to do something about the
increased overhead...
If you are talking about the extra GetCurrentTimestamp() call, and my first
suggestion is acceptable, that should be simple.
Yours,
Laurenz Albe
Hi,
On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.Makes sense to me. How about passing "now", which was just initialized from
GetCurrentTransactionStopTimestamp(), as additional parameter to
pgstat_send_connstats() and use that value instead of taking the current time?
Yes.
I'm also not all that happy with sending yet another UDP packet for this. This
doubles the UDP packets to the stats collector in the common cases (unless
more than TABSTAT_QUANTUM tables have stats to report, or shared tables have
been accessed). We already send plenty of "summary" information via
PgStat_MsgTabstat, one of which is sent unconditionally, why do we need a
separate message for connection stats?Are you suggesting that connection statistics should be shoehorned into
some other statistics message? That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.
That ship already has sailed. Look at struct PgStat_MsgTabstat
typedef struct PgStat_MsgTabstat
{
PgStat_MsgHdr m_hdr;
Oid m_databaseid;
int m_nentries;
int m_xact_commit;
int m_xact_rollback;
PgStat_Counter m_block_read_time; /* times in microseconds */
PgStat_Counter m_block_write_time;
PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
} PgStat_MsgTabstat;
Given that we transport number of commits/commits, block read/write time
adding the time the connection was active/inactive doesn't really seem like it
makes things meaningfully worse?
Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c. That'd give us *more*
information for less costs, because we then could see idle/active times for
individual connections.That was my original take, but if I remember right, Magnus convinced me that
it would be more useful to have statistics for open sessions as well.
With a connection pool, connections can stay open for a very long time,
and the accuracy and usefulness of the statistics would become questionable.
That's not a contradiction to what I propose. Having the data available via
backend_status.c allows to sum up the data for active connections and the data
for past connections.
I think it's also just cleaner to not constantly report changing preliminary
data as pgstat_send_connstats() does.
That'd likely be more change than what we would want to do at this point in
the release cycle though. But I think we ought to do something about the
increased overhead...If you are talking about the extra GetCurrentTimestamp() call, and my first
suggestion is acceptable, that should be simple.
Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14. It increases the likelihood of
dropping stats messages on busy systems, which can have downstream impacts.
Greetings,
Andres Freund
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.Makes sense to me. How about passing "now", which was just initialized from
GetCurrentTransactionStopTimestamp(), as additional parameter to
pgstat_send_connstats() and use that value instead of taking the current time?Yes.
Here is a patch for that.
I'm also not all that happy with sending yet another UDP packet for this.
Are you suggesting that connection statistics should be shoehorned into
some other statistics message? That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.That ship already has sailed. Look at struct PgStat_MsgTabstat
Given that we transport number of commits/commits, block read/write time
adding the time the connection was active/inactive doesn't really seem like it
makes things meaningfully worse?
Point taken.
I looked at the other statistics sent in pgstat_report_stat(), and I see
none that are sent unconditionally. Are you thinking of this:
/*
* Send partial messages. Make sure that any pending xact commit/abort
* gets counted, even if there are no table stats to send.
*/
if (regular_msg.m_nentries > 0 ||
pgStatXactCommit > 0 || pgStatXactRollback > 0)
pgstat_send_tabstat(®ular_msg);
if (shared_msg.m_nentries > 0)
pgstat_send_tabstat(&shared_msg);
I can't think of a way to hack this up that wouldn't make my stomach turn.
Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c.That was my original take, but if I remember right, Magnus convinced me that
it would be more useful to have statistics for open sessions as well.
With a connection pool, connections can stay open for a very long time,
and the accuracy and usefulness of the statistics would become questionable.That's not a contradiction to what I propose. Having the data available via
backend_status.c allows to sum up the data for active connections and the data
for past connections.I think it's also just cleaner to not constantly report changing preliminary
data as pgstat_send_connstats() does.
Currently, the data are kept in static variables in the backend process.
That would have to change for such an approach, right?
Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14.
Ok, but I don't know how to go about it.
Yours,
Laurenz Albe
Attachments:
0001-Improve-performance-and-accuracy-of-session-statisti.patchtext/x-patch; charset=UTF-8; name=0001-Improve-performance-and-accuracy-of-session-statisti.patchDownload+4-5
On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote:
Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14. It increases the likelihood of
dropping stats messages on busy systems, which can have downstream impacts.
I think by "common workloads" you mean one with many, shortlived sessions.
That does sounds like a concern, and I added this as an Opened Item.
--
Justin
At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
On 2021-08-17 10:44:51 +0200, Laurenz Albe wrote:
On Sun, 2021-08-01 at 13:55 -0700, Andres Freund wrote:
We maintain last_report as GetCurrentTransactionStopTimestamp(), but then use
a separate timestamp in pgstat_send_connstats() to compute the difference from
last_report, which is then set to GetCurrentTransactionStopTimestamp()'s
return value.Makes sense to me. How about passing "now", which was just initialized from
GetCurrentTransactionStopTimestamp(), as additional parameter to
pgstat_send_connstats() and use that value instead of taking the current time?Yes.
Here is a patch for that.
FWIW, looks good to me.
I'm also not all that happy with sending yet another UDP packet for this.
Are you suggesting that connection statistics should be shoehorned into
some other statistics message? That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.That ship already has sailed. Look at struct PgStat_MsgTabstat
Given that we transport number of commits/commits, block read/write time
adding the time the connection was active/inactive doesn't really seem like it
makes things meaningfully worse?Point taken.
I looked at the other statistics sent in pgstat_report_stat(), and I see
none that are sent unconditionally. Are you thinking of this:
IIUC, that means that pg_stat_report sends at least one
PgStat_MsgTabstat struct for the database stats purpose if any stats
are sent. So the connection stats can piggy-back on the packet.
/*
* Send partial messages. Make sure that any pending xact commit/abort
* gets counted, even if there are no table stats to send.
*/
if (regular_msg.m_nentries > 0 ||
pgStatXactCommit > 0 || pgStatXactRollback > 0)
pgstat_send_tabstat(®ular_msg);
if (shared_msg.m_nentries > 0)
pgstat_send_tabstat(&shared_msg);I can't think of a way to hack this up that wouldn't make my stomach turn.
No need to change the condition. It's sufficient that the connection
stats are sent at the same time with transaction stats are sent.
Alternatively we could just not send an update to connection stats every 500ms
for every active connection, but only do so at connection end. The database
stats would only contain stats for disconnected sessions, while the stats for
"live" connections are maintained via backend_status.c.That was my original take, but if I remember right, Magnus convinced me that
it would be more useful to have statistics for open sessions as well.
With a connection pool, connections can stay open for a very long time,
and the accuracy and usefulness of the statistics would become questionable.That's not a contradiction to what I propose. Having the data available via
backend_status.c allows to sum up the data for active connections and the data
for past connections.I think it's also just cleaner to not constantly report changing preliminary
data as pgstat_send_connstats() does.Currently, the data are kept in static variables in the backend process.
That would have to change for such an approach, right?
Total session time can be summarized from beentry any time, but I'm
not sure how we can summarize active/idle time.. Anyway it's not
needed if the attached works.
Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14.Ok, but I don't know how to go about it.
The attached is a heavy-WIP on:
- remove redundant gettimeofday().
- avoid sending dedicate UCP packet for connection stats.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
remove_connstats_udp_wip.patchtext/x-patch; charset=us-asciiDownload+76-112
On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote:
At Wed, 18 Aug 2021 05:16:38 +0200, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
On Tue, 2021-08-17 at 02:14 -0700, Andres Freund wrote:
I'm also not all that happy with sending yet another UDP packet for this.
Are you suggesting that connection statistics should be shoehorned into
some other statistics message? That would reduce the number of UDP packets,
but it sounds ugly and confusing to me.That ship already has sailed. Look at struct PgStat_MsgTabstat
Given that we transport number of commits/commits, block read/write time
adding the time the connection was active/inactive doesn't really seem like it
makes things meaningfully worse?Point taken.
I looked at the other statistics sent in pgstat_report_stat(), and I see
none that are sent unconditionally.IIUC, that means that pg_stat_report sends at least one
PgStat_MsgTabstat struct for the database stats purpose if any stats
are sent. So the connection stats can piggy-back on the packet.No need to change the condition. It's sufficient that the connection
stats are sent at the same time with transaction stats are sent.
Doubling the number of UDP messages in common workloads seems also problematicenough that it should be addressed for 14.
Ok, but I don't know how to go about it.
The attached is a heavy-WIP on:
- remove redundant gettimeofday().
- avoid sending dedicate UCP packet for connection stats.
Thank you.
Perhaps I misread that, but doesn't that mean that the session statistics
could be sent several times? "pgstat_send_tabstat()" could be called more than
once, right?
Yours,
Laurenz Albe
At Tue, 24 Aug 2021 12:34:25 +0200, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
On Tue, 2021-08-24 at 15:12 +0900, Kyotaro Horiguchi wrote:
- remove redundant gettimeofday().
- avoid sending dedicate UCP packet for connection stats.Thank you.
Perhaps I misread that, but doesn't that mean that the session statistics
could be sent several times? "pgstat_send_tabstat()" could be called more than
once, right?
Yes, it can be called two or more times for a call to
pgstat_report_stat, but the patch causes only the first of them convey
effective connection stats numbers. This is the same way as how
transaction stats are treated. If no table activities have taken
place at all, pgStatXactCommit/Rollback are not consumed then the
following condition:
/*
* Send partial messages. Make sure that any pending xact commit/abort
* gets counted, even if there are no table stats to send.
*/
if (regular_msg.m_nentries > 0 ||
pgStatXactCommit > 0 || pgStatXactRollback > 0)
leads to a call to pgstat_send_tabstat() and it sends a tabstat
message without a table stats, which is a "partial message".
In this logic the condition term (pgStatXactCommit > 0 ||
pgStatXactRollback > 0) acts as a single-shot trigger.
So we can piggy-back on the condition to send something only once.
The patch sets "0" = (DISCONNECT_NOT_YET) to m_disconnect as the
"not-effective" number, but it would be better to add
DISCONNECT_something to express that state.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote:
Yes, it can be called two or more times for a call to
pgstat_report_stat, but the patch causes only the first of them convey
effective connection stats numbers. This is the same way as how
transaction stats are treated. If no table activities have taken
place at all, pgStatXactCommit/Rollback are not consumed then the
following condition:
I was looking at this WIP patch, and plugging in the connection
statistics to the table-access statistics looks like the wrong
abstraction to me. I find much cleaner the approach of HEAD to use a
separate API to report this information, as of
pgstat_send_connstats().
As of the two problems discussed on this thread, aka the increased
number of UDP packages and the extra timestamp computations, it seems
to me that we had better combine the following ideas for HEAD and 14,
for now:
- Avoid the extra timestamp computation as proposed by Laurenz in [1]/messages/by-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.camel@cybertec.at
- Throttle the frequency where the connection stat packages are sent,
as of [2]/messages/by-id/20210801205501.nyxzxoelqoo4x2qc@alap3.anarazel.de -- Michael.
Magnus, this open item is assigned to you as the committer of
960869d. Could you comment on those issues?
[1]: /messages/by-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.camel@cybertec.at
[2]: /messages/by-id/20210801205501.nyxzxoelqoo4x2qc@alap3.anarazel.de -- Michael
--
Michael
At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in
On Wed, Aug 25, 2021 at 10:12:41AM +0900, Kyotaro Horiguchi wrote:
Yes, it can be called two or more times for a call to
pgstat_report_stat, but the patch causes only the first of them convey
effective connection stats numbers. This is the same way as how
transaction stats are treated. If no table activities have taken
place at all, pgStatXactCommit/Rollback are not consumed then the
following condition:I was looking at this WIP patch, and plugging in the connection
statistics to the table-access statistics looks like the wrong
abstraction to me. I find much cleaner the approach of HEAD to use a
separate API to report this information, as of
pgstat_send_connstats().As of the two problems discussed on this thread, aka the increased
number of UDP packages and the extra timestamp computations, it seems
to me that we had better combine the following ideas for HEAD and 14,
for now:
- Avoid the extra timestamp computation as proposed by Laurenz in [1]
- Throttle the frequency where the connection stat packages are sent,
as of [2].Magnus, this open item is assigned to you as the committer of
960869d. Could you comment on those issues?[1]: /messages/by-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.camel@cybertec.at
[2]: /messages/by-id/20210801205501.nyxzxoelqoo4x2qc@alap3.anarazel.de
About [2], we need to maintain session active/idel times on additional
menbers in backend status. Letting gpgstat_report_activity to
directly scribble on backend status array would work?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
At Wed, 25 Aug 2021 13:21:52 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
At Wed, 25 Aug 2021 12:51:58 +0900, Michael Paquier <michael@paquier.xyz> wrote in
- Throttle the frequency where the connection stat packages are sent,
as of [2].Magnus, this open item is assigned to you as the committer of
960869d. Could you comment on those issues?[1]: /messages/by-id/4095ceb328780d1bdba77ac6d9785fd7577ed047.camel@cybertec.at
[2]: /messages/by-id/20210801205501.nyxzxoelqoo4x2qc@alap3.anarazel.deAbout [2], we need to maintain session active/idel times on additional
menbers in backend status. Letting gpgstat_report_activity to
directly scribble on backend status array would work?
So the attached is roughly that (just a PoC, of course).
- accumulate active and idle time on backend status array.
(pgstat_report_activity)
- pgstat_get_db_session_time() and friends read pgstat file then sum
up relevant members in backend status array. (So it scans on the
array for every number of every database X().
- The function pgstat_send_connstats is called only at the end of a
connection. It reads backend status element then send the numbers
to stats collector. pgstat_shutdown_hook needs to be moved from
on_shmem_exit to before_shmem_exit to read MyBEEntry.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
reduce_frequency_of_connstats_PoC.txttext/plain; charset=us-asciiDownload+90-40
Hi,
On 2021-08-20 14:27:20 -0500, Justin Pryzby wrote:
On Tue, Aug 17, 2021 at 02:14:20AM -0700, Andres Freund wrote:
Doubling the number of UDP messages in common workloads seems also problematic
enough that it should be addressed for 14. It increases the likelihood of
dropping stats messages on busy systems, which can have downstream impacts.I think by "common workloads" you mean one with many, shortlived sessions.
You don't need short-lived sessions. You just need sessions that don't
process queries all the time (so that there's only one or a few queries
within each PGSTAT_STAT_INTERVAL). The connection stats aren't sent once
per session, they're sent once per PGSTAT_STAT_INTERVAL.
Greetings,
Andres Freund
Hi,
On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
I was looking at this WIP patch, and plugging in the connection
statistics to the table-access statistics looks like the wrong
abstraction to me. I find much cleaner the approach of HEAD to use a
separate API to report this information, as of
pgstat_send_connstats().
As I said before, this ship has long sailed:
typedef struct PgStat_MsgTabstat
{
PgStat_MsgHdr m_hdr;
Oid m_databaseid;
int m_nentries;
int m_xact_commit;
int m_xact_rollback;
PgStat_Counter m_block_read_time; /* times in microseconds */
PgStat_Counter m_block_write_time;
PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
} PgStat_MsgTabstat;
As of the two problems discussed on this thread, aka the increased
number of UDP packages and the extra timestamp computations, it seems
to me that we had better combine the following ideas for HEAD and 14,
for now:
- Avoid the extra timestamp computation as proposed by Laurenz in [1]
- Throttle the frequency where the connection stat packages are sent,
as of [2].
I think in that case we'd have to do the bigger redesign and move "live"
connection stats to backend_status.c...
Greetings,
Andres Freund
On Wed, Aug 25, 2021 at 01:20:03AM -0700, Andres Freund wrote:
On 2021-08-25 12:51:58 +0900, Michael Paquier wrote:
As I said before, this ship has long sailed:typedef struct PgStat_MsgTabstat
{
PgStat_MsgHdr m_hdr;
Oid m_databaseid;
int m_nentries;
int m_xact_commit;
int m_xact_rollback;
PgStat_Counter m_block_read_time; /* times in microseconds */
PgStat_Counter m_block_write_time;
PgStat_TableEntry m_entry[PGSTAT_NUM_TABENTRIES];
} PgStat_MsgTabstat;
Well, I kind of misread what you meant upthread then.
PgStat_MsgTabstat has a name a bit misleading, especially if you
assign connection stats to it.
As of the two problems discussed on this thread, aka the increased
number of UDP packages and the extra timestamp computations, it seems
to me that we had better combine the following ideas for HEAD and 14,
for now:
- Avoid the extra timestamp computation as proposed by Laurenz in [1]
- Throttle the frequency where the connection stat packages are sent,
as of [2].I think in that case we'd have to do the bigger redesign and move "live"
connection stats to backend_status.c...
Hmm. A redesign is not really an option for 14 at this stage. And I
am not really comfortable with the latest proposal from upthread to
plug in that to pgstat_send_tabstat to report things once per
transaction, either. It really looks like this needs more thoughts,
and it would mean that a revert may be the most appropriate choice
for the moment. That's the last-resort option, surely, but we are
post-beta3 so there is no much margin left.
--
Michael
On Fri, 2021-08-27 at 13:57 +0900, Michael Paquier wrote:
I think in that case we'd have to do the bigger redesign and move "live"
connection stats to backend_status.c...Hmm. A redesign is not really an option for 14 at this stage. And I
am not really comfortable with the latest proposal from upthread to
plug in that to pgstat_send_tabstat to report things once per
transaction, either. It really looks like this needs more thoughts,
and it would mean that a revert may be the most appropriate choice
for the moment. That's the last-resort option, surely, but we are
post-beta3 so there is no much margin left.
In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds? That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem. And it should even out the increase
in statistics messages, except in the case of lots of short-lived
sessions. But in that scenario you cannot have session statistics
without lots of extra messages, and such a workload has enough performance
problems as it is, so I don't think we have to specifically worry about it.
Yours,
Laurenz Albe
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds? That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem. And it should even out the increase
in statistics messages, except in the case of lots of short-lived
sessions. But in that scenario you cannot have session statistics
without lots of extra messages, and such a workload has enough performance
problems as it is, so I don't think we have to specifically worry about it.
Perhaps we could do that. Now, increasing an interval for the sake of
balancing the extra load created by a feature while impacting the
whole set of stats is not really appealing.
--
Michael
Hi,
On August 31, 2021 6:33:15 PM PDT, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds? That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem. And it should even out the increase
in statistics messages, except in the case of lots of short-lived
sessions. But in that scenario you cannot have session statistics
without lots of extra messages, and such a workload has enough performance
problems as it is, so I don't think we have to specifically worry about it.Perhaps we could do that. Now, increasing an interval for the sake of
balancing the extra load created by a feature while impacting the
whole set of stats is not really appealing.
I think it's not helpful. Still increases the number of messages substantially in workloads with a lot of connections doing occasional queries. Which is common.
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds? That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem.I think it's not helpful. Still increases the number of messages substantially in workloads
with a lot of connections doing occasional queries. Which is common.
How come? If originally you send table statistics every 500ms, and now you send
table statistics and session statistics every second, that should amount to the
same thing. Where is my misunderstanding?
Yours,
Laurenz Albe
On Wed, 2021-09-01 at 10:33 +0900, Michael Paquier wrote:
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds?Perhaps we could do that. Now, increasing an interval for the sake of
balancing the extra load created by a feature while impacting the
whole set of stats is not really appealing.
I agree. But if the best fix is too invasive at this point, the
alternatives are reverting the patch or choosing a less appealing
solution.
Yours,
Laurenz Albe
On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
milliseconds?� That would mean slightly less up-to-date statistics, but
I doubt that that will be a problem.I think it's not helpful. Still increases the number of messages substantially in workloads
with a lot of connections doing occasional queries. Which is common.How come? If originally you send table statistics every 500ms, and now you send
table statistics and session statistics every second, that should amount to the
same thing. Where is my misunderstanding?
Consider the case of one query a second.