
Parallel Seq Scan vs kernel read ahead - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Parallel Seq Scan vs kernel read ahead
Date	May 20, 2020 01:53:24
Msg-id	CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
Responses	Re: Parallel Seq Scan vs kernel read ahead
List	pgsql-hackers


Hello hackers,

Parallel sequential scan relies on the kernel detecting sequential
access, but we don't make the job easy.  The resulting striding
pattern works terribly on strict next-block systems like FreeBSD UFS,
and degrades rapidly when you add too many workers on sliding window
systems like Linux.
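
To illustrate the striding pattern, here is a toy Python model (not
PostgreSQL code).  It assumes that blocks are handed out from a shared
counter, `chunk` blocks per request, and that the participating
processes take turns in strict round-robin order, which is an
idealisation.  The function names are made up for this sketch.

```python
def assign_blocks(nblocks, nworkers, chunk=1):
    """Hand out blocks from a shared counter, `chunk` blocks per request.

    Assumption: processes take turns in strict round-robin order.
    Returns one list of block numbers per process.
    """
    per_worker = [[] for _ in range(nworkers)]
    next_block = 0
    turn = 0
    while next_block < nblocks:
        end = min(next_block + chunk, nblocks)
        per_worker[turn % nworkers].extend(range(next_block, end))
        next_block = end
        turn += 1
    return per_worker


def sequential_fraction(blocks):
    """Fraction of reads that directly follow the previously read block."""
    if len(blocks) < 2:
        return 1.0
    hits = sum(1 for a, b in zip(blocks, blocks[1:]) if b == a + 1)
    return hits / (len(blocks) - 1)
```

With two processes and one block per request, each process sees every
other block, so none of its reads directly follows the previous one.
Handing out 4 blocks per request in a 16-block example already makes 6
of each process's 7 block-to-block transitions sequential.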

Demonstration using FreeBSD on UFS on a virtual machine, taking
ballpark figures from iostat:

  create table t as select generate_series(1, 200000000)::int i;

  set max_parallel_workers_per_gather = 0;
  select count(*) from t;
  -> execution time 13.3s, average read size = ~128kB, ~500MB/s

  set max_parallel_workers_per_gather = 1;
  select count(*) from t;
  -> execution time 24.9s, average read size = ~32kB, ~250MB/s

Note the small read size, which means that there was no read
clustering happening at all: that's the logical block size of this
filesystem.

That explains some complaints I've heard about PostgreSQL performance
on that filesystem: parallel query destroys I/O performance.

As a quick experiment, I tried teaching the block allocator to
allocate ranges of up to 64 blocks at a time, ramping up incrementally,
and ramping down at the end, and I got:

  set max_parallel_workers_per_gather = 1;
  select count(*) from t;
  -> execution time 7.5s, average read size = ~128kB, ~920MB/s

  set max_parallel_workers_per_gather = 3;
  select count(*) from t;
  -> execution time 5.2s, average read size = ~128kB, ~1.2GB/s

I've attached the quick and dirty patch I used for that.
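
For readers who want the general shape of the idea without opening the
patch, here is a rough Python sketch of one possible ramp-up/ramp-down
policy.  It is not the attached patch: the doubling on the way up, the
"never more than half of what remains" rule on the way down, and the
name `block_ranges` are all assumptions chosen for illustration.  Only
the cap of 64 blocks comes from the experiment above.

```python
def block_ranges(nblocks, max_chunk=64):
    """Split blocks 0..nblocks-1 into contiguous (start, size) ranges.

    Hypothetical policy: request sizes double from 1 up to max_chunk,
    then shrink again near the end of the relation.
    """
    ranges = []
    start = 0
    size = 1
    while start < nblocks:
        remaining = nblocks - start
        # Ramp down: never hand out more than half of what is left,
        # so the final ranges get smaller and no process is left
        # holding a large range while the others sit idle.
        step = max(1, min(size, remaining // 2))
        ranges.append((start, step))
        start += step
        # Ramp up: double the next request until the cap is reached.
        size = min(size * 2, max_chunk)
    return ranges
```

For a 10-block relation with `max_chunk=4` this yields range sizes
1, 2, 3, 2, 1, 1.  Each process that asks for work would take the next
range from such a sequence, so most of its reads are consecutive.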

Attachment

0001-Use-larger-step-sizes-for-Parallel-Seq-Scan.patch
