Thread: [PATCH] fast-path locks compile time option
While replaying our production workload we found Postgres spending a lot of time inside the TimescaleDB planner. The planner needs to know whether each table involved is a TimescaleDB hypertable, so it has to access TimescaleDB's internal metadata tables. This planner access becomes extremely slow when many tables are involved and you are beyond the fast-path lock limit.
This humble example makes only 2330 tps on REL_15_STABLE, but 27041 tps on the patched version with 64 slots for fast-path locks:
\set bid random(1,1000)
BEGIN;
select bbalance from pgbench_branches where bid = :bid
UNION
select bbalance from pgbench_branches2 where bid = :bid
UNION
select bbalance from pgbench_branches3 where bid = :bid
UNION
select bbalance from pgbench_branches4 where bid = :bid
UNION
select bbalance from pgbench_branches5 where bid = :bid
UNION
select bbalance from pgbench_branches6 where bid = :bid
UNION
select bbalance from pgbench_branches7 where bid = :bid
UNION
select bbalance from pgbench_branches8 where bid = :bid
UNION
select bbalance from pgbench_branches9 where bid = :bid
UNION
select bbalance from pgbench_branches10 where bid = :bid
UNION
select bbalance from pgbench_branches11 where bid = :bid
UNION
select bbalance from pgbench_branches12 where bid = :bid
UNION
select bbalance from pgbench_branches13 where bid = :bid
UNION
select bbalance from pgbench_branches14 where bid = :bid
UNION
select bbalance from pgbench_branches15 where bid = :bid
UNION
select bbalance from pgbench_branches16 where bid = :bid
UNION
select bbalance from pgbench_branches17 where bid = :bid
UNION
select bbalance from pgbench_branches18 where bid = :bid
UNION
select bbalance from pgbench_branches19 where bid = :bid
UNION
select bbalance from pgbench_branches20 where bid = :bid;
END;
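For reference, the repetitive script above can be generated with a short shell loop. The output file name branches.sql and the table count of 20 simply mirror the example; this is a convenience sketch, not part of the patch:

```shell
#!/bin/sh
# Generate the pgbench script shown above: one SELECT per table,
# pgbench_branches plus pgbench_branches2..pgbench_branches20,
# combined with UNION so a single transaction touches 20 relations.
OUT=branches.sql   # output file name is arbitrary

{
  printf '\\set bid random(1,1000)\n'
  printf 'BEGIN;\n'
  printf 'select bbalance from pgbench_branches where bid = :bid\n'
  for i in $(seq 2 20); do
    printf 'UNION\n'
    printf 'select bbalance from pgbench_branches%s where bid = :bid\n' "$i"
  done
  printf ';\nEND;\n'
} > "$OUT"
```

The resulting file can then be fed to pgbench with its `-f` option.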
First I tried to make the number of fast-path locks a GUC parameter, but that implies a lot of changes to the PGPROC structure. So I implemented it as a compile-time parameter instead.
Hope this patch will be useful.
On Mon, Sep 18, 2023 at 5:47 PM Sergey Sergey <ioxgrey@gmail.com> wrote:
Attachment
On Mon, Sep 18, 2023 at 05:49:51PM +0300, Sergey Sergey wrote:
> Hope this patch will be usefull/

- uint64 fpLockBits; /* lock modes held for each fast-path slot */
+ uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes

If my maths are right, this makes PGPROC 8 bytes larger with 16 slots
by default. That is not a good idea.

+ --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]

And this points out that ./configure has been generated with one of
Debian's autoreconf commands, which is something to avoid.

I am not sure that this patch is a good idea long-term. Wouldn't it
be better to invent new and more scalable concepts able to tackle
bottlenecks around these code paths instead of using compile-time
tweaks like that?
--
Michael
Thank you for the response.
On Tue, Sep 19, 2023 at 2:52 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Mon, Sep 18, 2023 at 05:49:51PM +0300, Sergey Sergey wrote:
> > Hope this patch will be usefull/
>
> - uint64 fpLockBits; /* lock modes held for each fast-path slot */
> + uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes
>
> If my maths are right, this makes PGPROC 8 bytes larger with 16 slots
> by default. That is not a good idea.
Your maths are correct. I can't estimate the overall effect of this PGPROC growth.
Our typical setup includes 768 GB of RAM, so it looks like a space-for-time optimization.
I ran ordinary pgbench against the patched and unpatched versions. Total average tps
is the same, and the patched version shows very stable tps values during the test.
> + --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
>
> And this points out that ./configure has been generated with one of
> Debian's autoreconf commands, which is something to avoid.
Yes, at first I tried to build it the Debian way.
I can rebuild ./configure with plain autoconf.
> I am not sure that this patch is a good idea long-term. Wouldn't it
> be better to invent new and more scalable concepts able to tackle
> bottlenecks around these code paths instead of using compile-time
> tweaks like that?
Another way is to replace the fixed arrays inside PGPROC,

uint8 fpLockBits[FP_LOCK_SLOTS_PER_BACKEND]; /* lock modes */
Oid fpRelId[FP_LOCK_SLOTS_PER_BACKEND]; /* slots for rel oids */

with pointers to arrays allocated outside PGPROC.

We could also use the C99 flexible array member feature. That way we would define
a structure like

struct FPLock
{
    uint8 fpLockBit;
    Oid fpRelid;
};

place an array of struct FPLock at the end of the PGPROC structure, and calculate
the memory allocation for PGPROC using some GUC variable.
Both of these ways seem too complex to me.
> --
> Michael