Framing this differently, how about a patch that lets extension authors choose to implement alternative formulas or even provide GUC-driven constants into the planner at the existing spot instead of having to choose a best algorithm. IOW, what would it take to make the proposed patch an extension that a DBA could choose to install and override the current log3 algorithm?
I've added code to make this new GUC, with the default behavior being the old behavior. I don't think I know enough postgresql to code an extension.
As an example showing this works, where the default algorithm assigns 5 works, the new one assigns 12.:
CREATE TABLE Departments (code VARCHAR(5), UNIQUE (code));
CREATE TABLE Towns (
id SERIAL UNIQUE NOT NULL,
code VARCHAR(10) NOT NULL, -- not unique
article TEXT,
name TEXT NOT NULL, -- not unique
department VARCHAR(5) NOT NULL,
UNIQUE (code, department)
);
insert into towns (
code, article, name, department
)
select
left(md5(i::text), 10),
md5(random()::text),
md5(random()::text),
left(md5(random()::text), 5)
from generate_series(1, 10000000) s(i);
insert into departments (
code
)
select
left(md5(i::text), 5)
from generate_series(1, 1000) s(i);
analyze departments;
analyze towns;
postgres@fedora:~$ /usr/local/pgsql/bin/psql test
psql (19devel)
Type "help" for help.
test=# show parallel_worker_algorithm ;
parallel_worker_algorithm
---------------------------
log3
(1 row)
test=# show max_parallel_workers ;
max_parallel_workers
----------------------
24
(1 row)
test=# explain (costs off) select count(*) from departments, towns where towns.department = departments.code;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize Aggregate
-> Gather
Workers Planned: 5
-> Partial Aggregate
-> Hash Join
Hash Cond: ((towns.department)::text = (departments.code)::text)
-> Parallel Seq Scan on towns
-> Hash
-> Seq Scan on departments
(9 rows)
test=# set parallel_worker_algorithm = sqrt;
SET
test=# explain (costs off) select count(*) from departments, towns where towns.department = departments.code;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize Aggregate
-> Gather
Workers Planned: 12
-> Partial Aggregate
-> Hash Join
Hash Cond: ((towns.department)::text = (departments.code)::text)
-> Parallel Seq Scan on towns
-> Hash
-> Seq Scan on departments
(9 rows)
test=#