Re: Parallel Seq Scan vs kernel read ahead - Mailing list pgsql-hackers

From: Ranier Vilela
Subject: Re: Parallel Seq Scan vs kernel read ahead
Msg-id: CAEudQApHw1Meay=Xdp0sws3femxsh4BduYo+qXWKHCZpfAz4fA@mail.gmail.com
In response to: Re: Parallel Seq Scan vs kernel read ahead (David Rowley <dgrowleyml@gmail.com>)
Responses: Re: Parallel Seq Scan vs kernel read ahead
List: pgsql-hackers
On Mon, 22 Jun 2020 at 02:53, David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 22 Jun 2020 at 16:54, David Rowley <dgrowleyml@gmail.com> wrote:
> I also tested this on an AMD machine running Ubuntu 20.04 on kernel
> version 5.4.0-37. I used the same 100GB table I mentioned in [1], but
> with the query "select * from t where a < 0;", which saves having to
> do any aggregate work.
I just wanted to add a note here that Thomas and I just discussed this
a bit offline. He recommended I try setting the kernel readahead a bit
higher.
It was set to 128kB, so I cranked it up to 2MB with:
sudo blockdev --setra 4096 /dev/nvme0n1p2
I didn't want to run the full test again as it took quite a long time,
so I just tried with 32 workers.
The first two results here are taken from the test results I just
posted 1 hour ago.
Master readahead=128kB = 89921.283 ms
v2 patch readahead=128kB = 36085.642 ms
Master readahead=2MB = 60984.905 ms
v2 patch readahead=2MB = 22611.264 ms
Notebook with i5, 8GB RAM, 256GB SSD
Windows 10 64-bit (build 2004)
MSVC 2019 64-bit
PostgreSQL head (with v2 patch)
Configuration: none (defaults)
Connection: local IPv4 (not localhost)
create table t (a int, b text);
insert into t select x,md5(x::text) from
generate_series(1,1000000*1572.7381809)x;
vacuum freeze t;
insert into t select x,md5(x::text) from
generate_series(1,1000000*1572.7381809)x;
vacuum freeze t;
set max_parallel_workers_per_gather = 0;
Time: 354211.826 ms (05:54.212)
set max_parallel_workers_per_gather = 1;
Time: 332805.773 ms (05:32.806)
set max_parallel_workers_per_gather = 2;
Time: 282566.711 ms (04:42.567)
set max_parallel_workers_per_gather = 3;
Time: 263383.945 ms (04:23.384)
set max_parallel_workers_per_gather = 4;
Time: 255728.259 ms (04:15.728)
set max_parallel_workers_per_gather = 5;
Time: 238288.720 ms (03:58.289)
set max_parallel_workers_per_gather = 6;
Time: 238647.792 ms (03:58.648)
set max_parallel_workers_per_gather = 7;
Time: 231295.763 ms (03:51.296)
set max_parallel_workers_per_gather = 8;
Time: 232502.828 ms (03:52.503)
set max_parallel_workers_per_gather = 9;
Time: 230970.604 ms (03:50.971)
set max_parallel_workers_per_gather = 10;
Time: 232104.182 ms (03:52.104)
set max_parallel_workers_per_gather = 8;
postgres=# explain select count(*) from t;
QUERY PLAN
-------------------------------------------------------------------------------------------
Finalize Aggregate (cost=15564556.43..15564556.44 rows=1 width=8)
-> Gather (cost=15564555.60..15564556.41 rows=8 width=8)
Workers Planned: 8
-> Partial Aggregate (cost=15563555.60..15563555.61 rows=1 width=8)
-> Parallel Seq Scan on t (cost=0.00..15072074.88 rows=196592288 width=0)
(5 rows)
Questions:
1. Why acquire and release the lock inside the retry: loop?
Wouldn't this be better?
    /* Grab the spinlock. */
    SpinLockAcquire(&pbscan->phs_mutex);
retry:
    /*
     * If the scan's startblock has not yet been initialized, we must do so
     * now.  If this is not a synchronized scan, we just start at block 0, but
     * if it is a synchronized scan, we must get the starting position from
     * the synchronized scan machinery.  We can't hold the spinlock while
     * doing that, though, so release the spinlock, get the information we
     * need, and retry.  If nobody else has initialized the scan in the
     * meantime, we'll fill in the value we fetched on the second time
     * through.
     */
    if (pbscan->phs_startblock == InvalidBlockNumber)
    {
        if (!pbscan->base.phs_syncscan)
            pbscan->phs_startblock = 0;
        else if (sync_startpage != InvalidBlockNumber)
            pbscan->phs_startblock = sync_startpage;
        else
        {
            sync_startpage = ss_get_location(rel, pbscan->phs_nblocks);
            goto retry;
        }
    }
    SpinLockRelease(&pbscan->phs_mutex);
That is, acquire the lock once, before the retry: label?
2. Is there any configuration setting that would improve this performance?
regards,
Ranier Vilela