Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date
Msg-id 375562.1760995276@sss.pgh.pa.us
Whole thread Raw
In response to [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward  (Dimitrios Apostolou <jimis@gmx.net>)
Responses Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward
Re: [PATCH v4] parallel pg_restore: avoid disk seeks when jumping short distance forward
List pgsql-hackers
Dimitrios Apostolou <jimis@gmx.net> writes:
> Regarding the attached patch (rebased and edited commit message), it
> basically replaces seek(up to 1MB forward) with read(). The 1MB number
> comes a bit out of the top of my head. But tweaking it between 128KB and
> 1MB wouldn't really change anything, given that the block size is now
> 128KB: The read() will always be chosen against the seek(). Do you know
> of a real-world case with block sizes >128KB?

Yeah, with the recent changes I'd expect table data to pretty much
always consist of blocks around 128K, unless the table is smaller
than that of course.

I experimented with this patch locally and came away not too
impressed; it seems the results may be highly platform-dependent.

In the interests of having a common benchmark case that's easy to
replicate, here's precisely what I did:

Use non-assert build of current HEAD (4bea91f21 at the moment).

$ createdb bench
$ time pgbench -i -s 10000 bench

real    14m40.474s
user    1m26.717s
sys     0m5.045s
$ psql bench
...
bench=# create table zedtable(f1 int);
CREATE TABLE
bench=# insert into zedtable values(42);
INSERT 0 1
bench=# \q
$ time pg_dump -Fc --compress=none bench | cat >bench10000.dump

real    7m48.969s
user    0m36.334s
sys     1m35.209s

(At this -s value, the database occupies about 147G and the dump
file about 95G.  It's important the dump file not fit in RAM.)

$ time pg_restore -f /dev/null -t zedtable bench10000.dump

real    1m12.646s
user    0m0.355s
sys     0m5.083s

This compares rather favorably to "cat":

$ time cat bench10000.dump >/dev/null

real    3m6.627s
user    0m0.167s
sys     0m30.999s

I then applied your patch and repeated the restore run:

$ time pg_restore -f /dev/null -t zedtable bench10000.dump

real    2m39.138s
user    0m0.386s
sys     0m28.493s

So for me, the proposed patch actually makes it 2X slower.

Watching it with "iostat 1", I'm seeing about 40MB/s disk read
rate with HEAD, and 500MB/s with the patch; "cat" also shows
read rate around 500MB/s.  So yeah, we can saturate the disk
interface by doing all reads and no seeks, but that doesn't
net out faster.

I did this on a few-years-old Dell Precision 5820 workstation.
The specs for it are a bit vague about the disk subsystem:
  Storage Drive Controllers
  Integrated Intel AHCI SATA chipset controller (8x 6.0Gb/s), SW RAID 0,1,5,10
  Storage Drive
  2.5 1.92TB SATA AG Enterprise Solid State Drive
and hdparm isn't enormously helpful either:

ATA device, with non-removable media
    Model Number:       SSDSC2KB019T8R
    Serial Number:      PHYF1291017A1P9DGN
    Firmware Revision:  XCV1DL69
    Media Serial Num:
    Media Manufacturer:
    Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x006d)
    Supported: 10 9 8 7 6 5
    Likely used: 10

I'm running RHEL 8.10, file system is xfs.

So I find this a bit discouraging.  It's not clear why you're seeing a
win and I'm not, and it's even less clear whether there'd be enough
of a win across enough platforms to make it worth trying to engineer
a solution that helps many more people than it hurts.

            regards, tom lane



pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: abi-compliance-check failure due to recent changes to pg_{clear,restore}_{attribute,relation}_stats()
Next
From: "David E. Wheeler"
Date:
Subject: Re: abi-compliance-check failure due to recent changes to pg_{clear,restore}_{attribute,relation}_stats()