Home > mailing lists

Re: How to solve my slow disk i/o throughput during index scan - Mailing list pgsql-performance

From	Andrei Lepikhov
Subject	Re: How to solve my slow disk i/o throughput during index scan
Date	July 5, 2024 02:04:31
Msg-id	d07e4b85-0572-453a-93c8-2ca8b34aa662@gmail.com Whole thread Raw
In response to	RE: How to solve my slow disk i/o throughput during index scan ("FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)" <simon.freyburger@sncf.fr>)
Responses	RE: How to solve my slow disk i/o throughput during index scan
List	pgsql-performance

Tree view

On 7/4/24 22:23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE 
TGV / DM RMP YIELD MANAGEMENT) wrote:
> Hello,
> 
> Thank you, splitting in “OR” query definitely enables bitmap heap scans, 
> and thus parallelized read to disk 😃! I though did not understand your 
> second point, what is parallel append, and how to enable it ?
Just for example:

DROP TABLE IF EXISTS t CASCADE;
CREATE TABLE t (id int not null, payload text) PARTITION BY RANGE (id);
CREATE TABLE p1 PARTITION OF t FOR VALUES FROM (0) TO (1000);
CREATE TABLE p2 PARTITION OF t FOR VALUES FROM (1000) TO (2000);
CREATE TABLE p3 PARTITION OF t FOR VALUES FROM (2000) TO (3000);
CREATE TABLE p4 PARTITION OF t FOR VALUES FROM (3000) TO (4000);
INSERT INTO t SELECT x % 4000, repeat('a',128) || x FROM 
generate_series(1,1E5) AS x;
ANALYZE t;

SET enable_parallel_append = on;
SET parallel_setup_cost = 0.00001;
SET parallel_tuple_cost = 0.00001;
SET max_parallel_workers_per_gather = 8;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;

EXPLAIN (COSTS OFF)
SELECT t.id, t.payload FROM t WHERE t.id % 2 = 0
GROUP BY t.id, t.payload;

  Group
    Group Key: t.id, t.payload
    ->  Gather Merge
          Workers Planned: 6
          ->  Sort
                Sort Key: t.id, t.payload
                ->  Parallel Append
                      ->  Parallel Seq Scan on p1 t_1
                            Filter: ((id % 2) = 0)
                      ->  Parallel Seq Scan on p2 t_2
                            Filter: ((id % 2) = 0)
                      ->  Parallel Seq Scan on p3 t_3
                            Filter: ((id % 2) = 0)
                      ->  Parallel Seq Scan on p4 t_4
                            Filter: ((id % 2) = 0)

Here the table is scanned in parallel. It also works with IndexScan.

-- 
regards, Andrei Lepikhov

pgsql-performance by date:

From: James Pang
Date: 05 July 2024, 00:50:18
Subject: Re: Hash Right join and seq scan

From: David Rowley
Date: 05 July 2024, 02:15:42
Subject: Re: Hash Right join and seq scan

Re: How to solve my slow disk i/o throughput during index scan - Mailing list pgsql-performance

Previous

Next