Inefficiency in SLRU stats collection

Started by Tom Laneover 5 years ago4 messages

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

I happened to come across this code added by 28cac71bd:

static PgStat_MsgSLRU *
slru_entry(SlruCtl ctl)
{
int idx = pgstat_slru_index(ctl->shared->lwlock_tranche_name);

Assert((idx >= 0) && (idx < SLRU_NUM_ELEMENTS));

return &SLRUStats[idx];
}

which is invoked by all the pgstat_count_slru_XXX routines.
This seems mightily inefficient --- the count functions are
just there to increment integer counters, but first they
have to do up to half a dozen strcmp's to figure out which
counter to increment.

We could improve this by adding another integer field to
SlruSharedData (which is already big enough that no one
would notice) and recording the result of pgstat_slru_index()
there as soon as the lwlock_tranche_name is set. (In fact,
it looks like we could stop saving the tranche name as such
altogether, thus buying back way more shmem than the integer
field would occupy.)

This does require the assumption that all backends agree
on the SLRU stats index for a particular SLRU cache. But
AFAICS we're locked into that already, since the backends
use those indexes to tell the stats collector which cache
they're sending stats for.

Thoughts?

regards, tom lane

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Tom Lane (#1)

Re: Inefficiency in SLRU stats collection

At Tue, 12 May 2020 15:50:35 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in

I happened to come across this code added by 28cac71bd:

static PgStat_MsgSLRU *
slru_entry(SlruCtl ctl)
{
int idx = pgstat_slru_index(ctl->shared->lwlock_tranche_name);

Assert((idx >= 0) && (idx < SLRU_NUM_ELEMENTS));

return &SLRUStats[idx];
}

which is invoked by all the pgstat_count_slru_XXX routines.
This seems mightily inefficient --- the count functions are
just there to increment integer counters, but first they
have to do up to half a dozen strcmp's to figure out which
counter to increment.

We could improve this by adding another integer field to
SlruSharedData (which is already big enough that no one
would notice) and recording the result of pgstat_slru_index()
there as soon as the lwlock_tranche_name is set. (In fact,
it looks like we could stop saving the tranche name as such
altogether, thus buying back way more shmem than the integer
field would occupy.)

I noticed that while trying to move that stuff into shmem-stats patch.

I think we can get rid of SlruCtl->shared->lwlock_tranche_name since
the only user is the slru_entry() and no external modules don't look
into that depth and there's a substitute way to know the name for
them.

This does require the assumption that all backends agree
on the SLRU stats index for a particular SLRU cache. But
AFAICS we're locked into that already, since the backends
use those indexes to tell the stats collector which cache
they're sending stats for.

Thoughts?

AFAICS it is right and the change suggested looks reasonable to me.
One arguable point might be whether it is right that SlruData holds
pgstats internal index from the standpoint of modularity. (It is one
of the reasons I didn't propose a patch for that..)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Fujii Masao

masao.fujii@oss.nttdata.com

over 5 years ago

In reply to: Kyotaro Horiguchi (#2)

Re: Inefficiency in SLRU stats collection

On 2020/05/13 11:26, Kyotaro Horiguchi wrote:

At Tue, 12 May 2020 15:50:35 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in

I happened to come across this code added by 28cac71bd:

static PgStat_MsgSLRU *
slru_entry(SlruCtl ctl)
{
int idx = pgstat_slru_index(ctl->shared->lwlock_tranche_name);

Assert((idx >= 0) && (idx < SLRU_NUM_ELEMENTS));

return &SLRUStats[idx];
}

which is invoked by all the pgstat_count_slru_XXX routines.
This seems mightily inefficient --- the count functions are
just there to increment integer counters, but first they
have to do up to half a dozen strcmp's to figure out which
counter to increment.

We could improve this by adding another integer field to
SlruSharedData (which is already big enough that no one
would notice) and recording the result of pgstat_slru_index()
there as soon as the lwlock_tranche_name is set. (In fact,
it looks like we could stop saving the tranche name as such
altogether, thus buying back way more shmem than the integer
field would occupy.)

Sounds good to me.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Kyotaro Horiguchi (#2)

Re: Inefficiency in SLRU stats collection

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

AFAICS it is right and the change suggested looks reasonable to me.
One arguable point might be whether it is right that SlruData holds
pgstats internal index from the standpoint of modularity. (It is one
of the reasons I didn't propose a patch for that..)

Yeah, this is a fair point. On the other hand, the existing code has
pgstat.c digging into the SLRU control structure, which is as bad or
worse a modularity violation. Perhaps we could ditch that by having
slru.c obtain and store the integer index which it then passes to
the pgstat.c counter routines, rather than passing a SlruCtl pointer.

I'll have to look at whether 28cac71bd exposed a data structure that
was formerly private, but if it did I'd be VERY strongly inclined
to revert that.

regards, tom lane