resowner "cold start" overhead

Started by Andres Freund · about 3 years ago · 4 messages
#1Andres Freund
andres@anarazel.de

Hi,

As part of [1] I made IOs-in-progress be tracked by resowner.c. Unfortunately,
benchmarking showed that to cause a small slowdown on workloads that often have
to read data, but where that data is guaranteed to be in the kernel cache.

I was a bit surprised, given that we also use the resowner.c mechanism for
buffer pins, which are obviously more common. But those (and e.g. relcache
references) actually also show up in profiles...

The obvious answer is to have a few "embedded" elements in each ResourceArray,
so that no allocation is needed for the first few remembered objects in each
category. In a prototype I went with four, since that avoided allocations for
trivial queries. That works nicely, delivering small but measurable speedups.

However, that approach does increase the size of a ResourceOwner. I don't know
if it matters that much: my prototype made the size go from 544 to 928 bytes -
which afaict would basically be free currently, because of aset.c rounding
up. But it'd take just two more ResourceArrays to go above that boundary.
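
(If I have aset.c's rounding right, that works out as: allocations this size
are rounded up to the next power of two, so both 544 and 928 bytes land in a
1024-byte chunk; at 64 bytes per ResourceArray, two more would make 1056 bytes
and force a 2048-byte chunk.)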

struct ResourceArray {
Datum * itemsarr; /* 0 8 */
Datum invalidval; /* 8 8 */
uint32 capacity; /* 16 4 */
uint32 nitems; /* 20 4 */
uint32 maxitems; /* 24 4 */
uint32 lastidx; /* 28 4 */
Datum initialarr[4]; /* 32 32 */

/* size: 64, cachelines: 1, members: 7 */
};
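
To make the embedded-elements idea concrete, here is a minimal, self-contained
sketch in plain C. This is not the actual prototype: NUM_EMBEDDED is an
invented name, malloc/realloc stand in for the real palloc machinery, and
error handling is omitted.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t Datum;

#define NUM_EMBEDDED 4          /* four avoided allocations for trivial queries */

typedef struct ResourceArray
{
    Datum      *itemsarr;       /* points at initialarr until it overflows */
    uint32_t    capacity;
    uint32_t    nitems;
    Datum       initialarr[NUM_EMBEDDED];
} ResourceArray;

static void
resarray_init(ResourceArray *ra)
{
    ra->itemsarr = ra->initialarr;  /* no allocation on the cold-start path */
    ra->capacity = NUM_EMBEDDED;
    ra->nitems = 0;
}

static void
resarray_add(ResourceArray *ra, Datum item)
{
    if (ra->nitems == ra->capacity)
    {
        uint32_t    newcap = ra->capacity * 2;

        if (ra->itemsarr == ra->initialarr)
        {
            /* first overflow: move the embedded elements to the heap */
            ra->itemsarr = malloc(newcap * sizeof(Datum));
            memcpy(ra->itemsarr, ra->initialarr, ra->nitems * sizeof(Datum));
        }
        else
            ra->itemsarr = realloc(ra->itemsarr, newcap * sizeof(Datum));
        ra->capacity = newcap;
    }
    ra->itemsarr[ra->nitems++] = item;
}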

One way to reduce the size increase would be to use the space for initialarr
to store variables we don't need while initialarr is used. E.g. itemsarr,
maxitems, lastidx are candidates. But I suspect that the code complication
isn't worth it.
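
For what it's worth, the overlay would look roughly like this (purely
illustrative, reusing the Datum typedef from the sketch above):

typedef struct ResourceArray
{
    Datum       invalidval;
    uint32_t    capacity;
    uint32_t    nitems;
    union
    {
        struct                  /* valid only after spilling to the heap */
        {
            Datum      *itemsarr;
            uint32_t    maxitems;
            uint32_t    lastidx;
        }           heap;
        Datum       initialarr[4];  /* valid while the data is embedded */
    }           u;
} ResourceArray;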

A different approach would be to skip the "embedded initial elements" idea and
instead not delete resource owners / resource arrays inside
ResourceOwnerDelete(). We could stash them in a bounded list of resource
owners, to be reused by ResourceOwnerCreate(). We do end up creating several
resource owners even for the simplest queries.

The advantage of that scheme is that it'd save more and that we'd only reserve
space for ResourceArrays that are actually used in the current workload -
often the majority of arrays won't be.

A potential problem would be that we don't want to use the "hashing" style
ResourceArrays forever; I don't think they'll be as fast for other cases. But
we could reset the arrays when they get large.
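
Concretely, the stash/reuse scheme could look something like this, building on
the ResourceArray sketch above. NARRAY_KINDS, MAX_FREE_OWNERS and the function
names are invented, and the real ResourceOwnerData has parent/child links etc.
that are omitted here:

#define NARRAY_KINDS 16         /* stand-in: one ResourceArray per kind */
#define MAX_FREE_OWNERS 8       /* invented bound for the stash */

typedef struct ResourceOwnerData
{
    ResourceArray arrays[NARRAY_KINDS];
} ResourceOwnerData;
typedef ResourceOwnerData *ResourceOwner;

static ResourceOwner free_owners[MAX_FREE_OWNERS];
static int  n_free_owners = 0;

static ResourceOwner
owner_create(void)
{
    if (n_free_owners > 0)
        return free_owners[--n_free_owners];    /* reuse: arrays preserved */

    ResourceOwner owner = calloc(1, sizeof(ResourceOwnerData));

    for (int i = 0; i < NARRAY_KINDS; i++)
        resarray_init(&owner->arrays[i]);
    return owner;
}

static void
owner_delete(ResourceOwner owner)
{
    for (int i = 0; i < NARRAY_KINDS; i++)
        owner->arrays[i].nitems = 0;    /* forget contents, keep memory */

    if (n_free_owners < MAX_FREE_OWNERS)
        free_owners[n_free_owners++] = owner;   /* stash for reuse */
    else
    {
        for (int i = 0; i < NARRAY_KINDS; i++)
            if (owner->arrays[i].itemsarr != owner->arrays[i].initialarr)
                free(owner->arrays[i].itemsarr);
        free(owner);
    }
}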

Greetings,

Andres Freund

[1] /messages/by-id/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#1)
Re: resowner "cold start" overhead

At Sat, 29 Oct 2022 13:00:25 -0700, Andres Freund <andres@anarazel.de> wrote in

> One way to reduce the size increase would be to use the space for
> initialarr to store variables we don't need while initialarr is used.
> E.g. itemsarr, maxitems, lastidx are candidates. But I suspect that the
> code complication isn't worth it.

+1

> A different approach would be to skip the "embedded initial elements"
> idea and instead not delete resource owners / resource arrays inside
> ResourceOwnerDelete(). We could stash them in a bounded list of
> resource owners, to be reused by ResourceOwnerCreate(). We do end up
> creating several resource owners even for the simplest queries.

We often do end up creating several resource owners that don't acquire
any elements at all. On the other hand, a few resource owners sometimes
grow to 2048 (several times) or 4096 (one time) elements during a run of
the regression tests. (I saw catlist, tupdesc and relref grow to 2048 or
more elements.)

> The advantage of that scheme is that it'd save more and that we'd only
> reserve space for ResourceArrays that are actually used in the current
> workload - often the majority of arrays won't be.

Thus I believe preserving resource owners works well. Preserving
resource arrays would also help time efficiency, but some resource
owners may end up keeping a large amount of memory unnecessarily for
most of the backend's lifetime. I guess that amount is far less than the
possible bloat from the catcache, though.

> A potential problem would be that we don't want to use the "hashing"
> style ResourceArrays forever; I don't think they'll be as fast for
> other cases. But we could reset the arrays when they get large.

I'm not sure a linear search (am I correct?) wouldn't hurt with 2048 or
more elements. I think the "hashing" style doesn't prevent the arrays
from being reset (freed) at transaction end (or at resource owner
deletion). That would allow releasing unused elements while in a
transaction, but I'm not sure we need to be so keen to reclaim space
during a transaction.
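
For illustration, resetting an oversized array back to embedded, linear-scan
mode at release time could look like this (continuing the sketch from
upthread; the threshold is invented):

#define SHRINK_THRESHOLD 64     /* invented cutoff: reclaim anything bigger */

static void
resarray_reset(ResourceArray *ra)
{
    ra->nitems = 0;
    if (ra->capacity > SHRINK_THRESHOLD && ra->itemsarr != ra->initialarr)
    {
        /* don't let a one-off burst (e.g. 4096 entries) pin memory forever */
        free(ra->itemsarr);
        ra->itemsarr = ra->initialarr;
        ra->capacity = NUM_EMBEDDED;
    }
}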

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#3Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Kyotaro Horiguchi (#2)
Re: resowner "cold start" overhead

On 31/10/2022 04:28, Kyotaro Horiguchi wrote:

> At Sat, 29 Oct 2022 13:00:25 -0700, Andres Freund <andres@anarazel.de> wrote in
>
>> One way to reduce the size increase would be to use the space for
>> initialarr to store variables we don't need while initialarr is used.
>> E.g. itemsarr, maxitems, lastidx are candidates. But I suspect that
>> the code complication isn't worth it.
>
> +1
>
>> A different approach would be to skip the "embedded initial elements"
>> idea and instead not delete resource owners / resource arrays inside
>> ResourceOwnerDelete(). We could stash them in a bounded list of
>> resource owners, to be reused by ResourceOwnerCreate(). We do end up
>> creating several resource owners even for the simplest queries.
>
> We often do end up creating several resource owners that don't acquire
> any elements at all. On the other hand, a few resource owners sometimes
> grow to 2048 (several times) or 4096 (one time) elements during a run
> of the regression tests. (I saw catlist, tupdesc and relref grow to
> 2048 or more elements.)
>
>> The advantage of that scheme is that it'd save more and that we'd only
>> reserve space for ResourceArrays that are actually used in the current
>> workload - often the majority of arrays won't be.
>
> Thus I believe preserving resource owners works well. Preserving
> resource arrays would also help time efficiency, but some resource
> owners may end up keeping a large amount of memory unnecessarily for
> most of the backend's lifetime. I guess that amount is far less than
> the possible bloat from the catcache, though.
>
>> A potential problem would be that we don't want to use the "hashing"
>> style ResourceArrays forever; I don't think they'll be as fast for
>> other cases. But we could reset the arrays when they get large.
>
> I'm not sure a linear search (am I correct?) wouldn't hurt with 2048
> or more elements. I think the "hashing" style doesn't prevent the
> arrays from being reset (freed) at transaction end (or at resource
> owner deletion). That would allow releasing unused elements while in a
> transaction, but I'm not sure we need to be so keen to reclaim space
> during a transaction.

What do you think of my ResourceOwner refactoring patches [1]? Reminded
by this, I rebased and added it to the upcoming commitfest again.

With that patch, all resources are stored in the same array and hash.
The array is part of ResourceOwnerData, so it saves the allocation
overhead, like the "initialarr" that you suggested. And it always uses
the array for recently remembered resources, and spills over to the hash
for more long-lived resources.
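
Roughly, the shape being described seems to be the following (field names and
the array size here are guesses for illustration, not the patch's actual
definitions; Datum as in the sketches upthread):

typedef struct ResourceElem
{
    Datum       item;
    const void *kind;           /* identifies the resource type */
} ResourceElem;

#define RESOWNER_ARRAY_SIZE 32  /* illustrative */

typedef struct ResourceOwnerData
{
    /* recently remembered resources; part of the struct, no allocation */
    ResourceElem arr[RESOWNER_ARRAY_SIZE];
    uint32_t    narr;

    /* older entries spill into this hash once arr fills up */
    ResourceElem *hash;
    uint32_t    nhash;
    uint32_t    hashcapacity;
} ResourceOwnerData;

/*
 * Remember() appends to arr; when arr is full, its entries are moved to the
 * hash, so the array always holds the most recent entries and Forget()
 * usually finds its target with a short scan.
 */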

Andres, could you repeat your benchmark with [1], to see if it helps?

[1] /messages/by-id/2e10b71b-352e-b97b-1e47-658e2669cecb@iki.fi

- Heikki

#4Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#3)
Re: resowner "cold start" overhead

Hi,

On 2022-10-31 11:05:32 +0100, Heikki Linnakangas wrote:

> What do you think of my ResourceOwner refactoring patches [1]? Reminded
> by this, I rebased and added it to the upcoming commitfest again.
>
> With that patch, all resources are stored in the same array and hash.
> The array is part of ResourceOwnerData, so it saves the allocation
> overhead, like the "initialarr" that you suggested. And it always uses
> the array for recently remembered resources, and spills over to the
> hash for more long-lived resources.
>
> Andres, could you repeat your benchmark with [1], to see if it helps?
>
> [1] /messages/by-id/2e10b71b-352e-b97b-1e47-658e2669cecb@iki.fi

Just for future readers of this thread: Replied on the other thread.

It does seem to address the performance issue, but I have some architectural
concerns.

Greetings,

Andres Freund