Re: a pool for parallel worker - Mailing list pgsql-hackers

From Kirill Reshke
Subject Re: a pool for parallel worker
Date
Msg-id CALdSSPgJMaLeegakjVyWrk7nkqmsbT7hwyZao4FJj9w267GXDA@mail.gmail.com
In response to a pool for parallel worker  (Andy Fan <zhihuifan1213@163.com>)
List pgsql-hackers
On Tue, 11 Mar 2025 at 17:38, Andy Fan <zhihuifan1213@163.com> wrote:
>
>
>
> Hi,
>

Hi!

> Currently, when a query needs some parallel workers, the postmaster spawns
> some backends for this query, and when the work is done, the backends
> exit.  There is some waste here, e.g. the syscache, relcache, smgr cache,
> and vfd cache, plus the fork/exit syscalls themselves.
>
> I am wondering whether we should preallocate (or create lazily) some backends
> as a pool for parallel workers. The benefits include:
>
> (1) Lower the startup cost of a parallel worker.
> (2) Make the core more suitable for cases where the executor needs a
> new worker to run a piece of a plan. I think this is needed by some
> data-redistribution-related executors in a distributed database.
>
> I guess both cases can share some well-designed code, such as costing or
> transferring data between worker and leader.

Surely forking from the postmaster is costly.

> The annoying thing about the pool is that it is [dbid + userId] based, by which
> I mean that if the dbid or userId differs from that of the backend in the pool,
> the backend can't be reused.  To reduce the effect of userId, I think we can
> start the pool as a superuser and then switch the user information
> with 'SET ROLE xxx'. The pool can also be created lazily.

I don't think this is secure. Currently, once a PostgreSQL backend
has started under a superuser role, there is no way to undo that.
Consider a worker in the pool running a user query that calls a UDF.
Inside this UDF, one can simply RESET SESSION AUTHORIZATION and proceed
to do anything with superuser rights.
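
To make the concern concrete, here is a minimal toy model in Python. It is
not PostgreSQL code: the class ToySession, its methods, and the role names
"postgres" and "alice" are all made up for illustration, and membership
checks are omitted. It only mirrors the documented idea that SET ROLE
changes the current user while the session keeps remembering the user it
was originally authenticated as.

```python
class ToySession:
    """Toy model of a backend's user identifiers (hypothetical, not PostgreSQL code)."""

    def __init__(self, authenticated_user, superusers):
        self.authenticated_user = authenticated_user  # fixed at backend start
        self.session_user = authenticated_user
        self.current_user = authenticated_user
        self.superusers = set(superusers)

    def set_role(self, role):
        # Models SET ROLE: only the current user changes.
        self.current_user = role

    def reset_session_authorization(self):
        # Models RESET SESSION AUTHORIZATION: both identifiers go back
        # to the originally authenticated user.
        self.session_user = self.authenticated_user
        self.current_user = self.authenticated_user

    def is_superuser(self):
        return self.current_user in self.superusers


def untrusted_udf(session):
    # Stand-in for user-supplied code running inside the pooled worker.
    session.reset_session_authorization()


# The pool starts the worker as a superuser, then switches to the query's role.
worker = ToySession("postgres", superusers={"postgres"})
worker.set_role("alice")
dropped = not worker.is_superuser()

# The user's query calls a UDF, which undoes the switch.
untrusted_udf(worker)
escalated = worker.is_superuser()
```

In this model the switch to "alice" is only as strong as the guarantee
that nothing running in the worker can reset it, which is exactly the
guarantee a pooled superuser backend lacks.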

> Any comments on this idea?
>
> --
> Best Regards
> Andy Fan
>
>
>


-- 
Best regards,
Kirill Reshke


