Home > mailing lists

Re: Getting an index scan to be a parallel index scan - Mailing list pgsql-performance

From	Ranier Vilela
Subject	Re: Getting an index scan to be a parallel index scan
Date	February 1, 2023 11:17:17
Msg-id	CAEudQAqisJrxGsHzekOFHwXAscqb3K_AVrO9d8F9sL8QbQKVEA@mail.gmail.com Whole thread
In response to	Getting an index scan to be a parallel index scan (Alex Kaiser <alextkaiser@gmail.com>)
List	pgsql-performance

Tree view

Em qua., 1 de fev. de 2023 às 02:39, Alex Kaiser <alextkaiser@gmail.com> escreveu:

Hello,

I'm trying to get the following query to use a plan with parallelism, but I haven't been successful and would like some advice.

The schema and table that I'm using is this:

CREATE TABLE testing(
id INT,
info INT,
data_one TEXT,
data_two TEXT,
primary key(id, info)
);

INSERT INTO testing(id, info, data_one, data_two)
SELECT idx, idx, md5(random()::text), md5(random()::text)
FROM generate_series(1,10000000) idx;

Then the query that I'm trying to run is this (I'll include the full query at the very end of the email because it is long:

select * from testing where id in (1608377,5449811, ... <1000 random ids> ,4654284,3558460);

Essentially I have a list of 1000 ids and I would like the rows for all of those ids.

This seems like it would be pretty easy to parallelize, if you have X threads then you would split the list of IDs into 1000/X sub lists and give one to each thread to go find the rows for ids in the given list. Even when I use the following configs I don't get a query plan that actually uses any parallelism:

psql (15.1 (Debian 15.1-1.pgdg110+1))
Type "help" for help.

postgres=# show max_parallel_workers;
max_parallel_workers
----------------------
8
(1 row)

postgres=# set max_parallel_workers_per_gather = 8;
SET
postgres=# set parallel_setup_cost = 0;
SET
postgres=# set parallel_tuple_cost = 0;
SET
postgres=# set force_parallel_mode = on;
SET
postgres=# explain select * from testing where id in (1608377,5449811, ... <removed for brevity> ... ,4654284,3558460);

Can you try:

select * from testing where id any = (values(1608377),(5449811),(5334677) ... <removed for brevity> ... ,(4654284),(3558460));

Or alternately you can use EXTEND STATISTICS to improve Postgres planner choice.

regards,

Ranier Vilela

pgsql-performance by date:

From: Alex Kaiser
Date: 01 February 2023, 05:39:06
Subject: Getting an index scan to be a parallel index scan

From: David Rowley
Date: 01 February 2023, 11:30:31
Subject: Re: Getting an index scan to be a parallel index scan

Re: Getting an index scan to be a parallel index scan - Mailing list pgsql-performance

Previous

Next