[PATCH] fast-path locks compile-time options
While replaying our production workload, we found Postgres spending a
lot of time inside the TimescaleDB planner. The planner needs to know
whether each table involved is a TimescaleDB hypertable, so it has to
access TimescaleDB's internal metadata tables. This access becomes
extremely slow when many tables are involved and the backend goes
beyond the fast-path lock limit.
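To illustrate the limit mentioned above, here is a toy model, not PostgreSQL code. It assumes only what the thread states: a backend has a fixed number of fast-path slots (16 in REL_15_STABLE), and any relation lock that does not fit must take a slower path. The names ToyBackend, toy_lock_relation and toy_count_overflow are hypothetical, and the real lock manager is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Slot count in REL_15_STABLE; the patch makes this a compile-time option. */
#define FP_LOCK_SLOTS_PER_BACKEND 16

/* Hypothetical, simplified stand-in for the per-backend fast-path state. */
typedef struct ToyBackend
{
    uint32_t fpRelId[FP_LOCK_SLOTS_PER_BACKEND]; /* 0 means "slot free" */
    int      slow_path_locks;                    /* locks that did not fit */
} ToyBackend;

/* Try to record a relation lock in a free fast-path slot. */
static bool
toy_lock_relation(ToyBackend *backend, uint32_t relid)
{
    assert(backend != NULL);

    for (size_t i = 0; i < FP_LOCK_SLOTS_PER_BACKEND; i++)
    {
        if (backend->fpRelId[i] == 0)
        {
            backend->fpRelId[i] = relid;
            return true;    /* cheap case: only per-backend state is touched */
        }
    }

    /* All slots are taken: in the real server this goes to the shared lock table. */
    backend->slow_path_locks++;
    return false;
}

/* Lock nrels distinct relations and report how many missed the fast path. */
int
toy_count_overflow(int nrels)
{
    ToyBackend backend = {0};

    for (int i = 0; i < nrels; i++)
        toy_lock_relation(&backend, (uint32_t) (1000 + i));

    return backend.slow_path_locks;
}
```

In this model toy_count_overflow(20) reports 4 locks that no longer fit in the 16 slots, while raising the slot count to 64 would leave none. A real query usually locks indexes as well as tables, so the actual overflow for a given query can be larger.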
This small example reaches only 2330 tps on REL_15_STABLE, but 27041 tps
on the patched version with 64 slots for fast-path locks.
\set bid random(1,1000)
BEGIN;
select bbalance from pgbench_branches where bid = :bid
UNION
select bbalance from pgbench_branches2 where bid = :bid
UNION
select bbalance from pgbench_branches3 where bid = :bid
UNION
select bbalance from pgbench_branches4 where bid = :bid
UNION
select bbalance from pgbench_branches5 where bid = :bid
UNION
select bbalance from pgbench_branches6 where bid = :bid
UNION
select bbalance from pgbench_branches7 where bid = :bid
UNION
select bbalance from pgbench_branches8 where bid = :bid
UNION
select bbalance from pgbench_branches9 where bid = :bid
UNION
select bbalance from pgbench_branches10 where bid = :bid
UNION
select bbalance from pgbench_branches11 where bid = :bid
UNION
select bbalance from pgbench_branches12 where bid = :bid
UNION
select bbalance from pgbench_branches13 where bid = :bid
UNION
select bbalance from pgbench_branches14 where bid = :bid
UNION
select bbalance from pgbench_branches15 where bid = :bid
UNION
select bbalance from pgbench_branches16 where bid = :bid
UNION
select bbalance from pgbench_branches17 where bid = :bid
UNION
select bbalance from pgbench_branches18 where bid = :bid
UNION
select bbalance from pgbench_branches19 where bid = :bid
UNION
select bbalance from pgbench_branches20 where bid = :bid;
END;
First I tried to make the number of fast-path locks a GUC parameter, but
that requires a lot of changes to the PGPROC structure. So I implemented
it as a compile-time parameter instead.
Hope this patch will be useful.
On Mon, Sep 18, 2023 at 5:47 PM Sergey Sergey <ioxgrey@gmail.com> wrote:
Attachments:
fastpath.patch (text/x-patch; charset=US-ASCII), +89 -20
On Mon, Sep 18, 2023 at 05:49:51PM +0300, Sergey Sergey wrote:
Hope this patch will be usefull/
- uint64 fpLockBits; /* lock modes held for each fast-path slot */
+ uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes
If my maths are right, this makes PGPROC 8 bytes larger with 16 slots
by default. That is not a good idea.
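The size arithmetic above can be checked with a short sketch. It counts only the two fast-path arrays and ignores any padding the compiler may add inside PGPROC, so the real numbers may differ slightly. The Oid typedef is a 4-byte stand-in, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;       /* stand-in for PostgreSQL's 4-byte Oid */

#define OLD_FP_SLOTS 16     /* slot count before the patch */

/* Bytes used before the patch: one uint64 bitmask plus 16 relation OIDs. */
size_t
fp_bytes_before(void)
{
    return sizeof(uint64_t) + OLD_FP_SLOTS * sizeof(Oid);
}

/* Bytes used after the patch: one uint8 per slot plus one Oid per slot. */
size_t
fp_bytes_after(size_t slots)
{
    return slots * sizeof(uint8_t) + slots * sizeof(Oid);
}

/* Growth of the fast-path arrays for a given compile-time slot count. */
size_t
fp_growth(size_t slots)
{
    return fp_bytes_after(slots) - fp_bytes_before();
}
```

With 16 slots the growth is 8 bytes, matching the figure above. With the 64 slots used in the benchmark it is 248 bytes per PGPROC under the same assumptions.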
+ --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
And this points out that ./configure has been generated with one of
Debian's autoreconf commands, which is something to avoid.
I am not sure that this patch is a good idea long-term. Wouldn't it
be better to invent new and more scalable concepts able to tackle
bottlenecks around these code paths instead of using compile-time
tweaks like that?
--
Michael
Thank you for the response.
On Tue, Sep 19, 2023 at 2:52 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Sep 18, 2023 at 05:49:51PM +0300, Sergey Sergey wrote:
Hope this patch will be usefull/
- uint64 fpLockBits; /* lock modes held for each fast-path slot */
+ uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes
If my maths are right, this makes PGPROC 8 bytes larger with 16 slots
by default. That is not a good idea.
Your maths are correct. I can't estimate the overall effect of this
PGPROC growth.
Our typical setup has 768 GB of RAM, so to us it looks like a
space-for-time optimization.
I ran ordinary pgbench against the patched and unpatched versions. The
total average tps is the same, and the patched version shows very stable
tps values during the test.
+ --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
And this points out that ./configure has been generated with one of
Debian's autoreconf commands, which is something to avoid.
Yes, at first I tried to build it the Debian way.
I can regenerate ./configure with autoconf.
I am not sure that this patch is a good idea long-term. Wouldn't it
be better to invent new and more scalable concepts able to tackle
bottlenecks around these code paths instead of using compile-time
tweaks like that?
Another way is to replace the fixed arrays inside PGPROC
uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes */
Oid fpRelId[FP_LOCK_SLOTS_PER_BACKEND]; /* slots for rel oids */
with pointers to arrays allocated outside PGPROC.
We could also use the C99 flexible array member feature. For that we
would define a structure like
struct FPLock
{
    uint8 fpLockBit;
    Oid fpRelid;
};
put an array of struct FPLock at the end of the PGPROC structure, and
calculate the memory allocation for PGPROC from some GUC variable.
Both of these approaches seem rather complex to me.
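A minimal sketch of the flexible array member idea, assuming a heavily reduced stand-in for PGPROC. ToyProc, toy_proc_size and toy_proc_alloc are hypothetical names, calloc stands in for the shared memory allocation the server would actually use, and n_slots plays the role of the GUC value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Oid;       /* stand-in for PostgreSQL's 4-byte Oid */

typedef struct FPLock
{
    uint8_t fpLockBit;      /* lock modes held for this slot */
    Oid     fpRelid;        /* relation OID stored in this slot */
} FPLock;

/* Hypothetical, heavily reduced stand-in for PGPROC. */
typedef struct ToyProc
{
    int     pid;
    int     nFpLocks;       /* number of elements in fpLocks[] */
    FPLock  fpLocks[];      /* C99 flexible array member, must be last */
} ToyProc;

/* Size of one ToyProc holding n_slots fast-path slots. */
size_t
toy_proc_size(int n_slots)
{
    return offsetof(ToyProc, fpLocks) + (size_t) n_slots * sizeof(FPLock);
}

/* Allocate a zeroed ToyProc sized at run time; returns NULL on failure. */
ToyProc *
toy_proc_alloc(int n_slots)
{
    ToyProc *proc;

    assert(n_slots > 0);

    proc = calloc(1, toy_proc_size(n_slots));
    if (proc != NULL)
        proc->nFpLocks = n_slots;
    return proc;
}
```

One consequence is that entries of this type no longer have a compile-time size, so an array of them cannot be indexed with plain pointer arithmetic and the stride has to be computed with toy_proc_size(). That is part of why this route touches much more code than a compile-time constant.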