PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans - Mailing list pgsql-general

From Gerhard Wiesinger
Subject PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans
Msg-id alpine.LFD.2.00.0909271756270.1116@bbs.intern
Responses Re: PostgreSQL reads each 8k block - no larger blocks are used - even on sequential scans  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-general
Hello,

As block sizes, random I/O and sequential I/O are critical I/O performance
parameters, I had a look at PostgreSQL and at the database of a commercial
software vendor.

To measure this I enhanced a SystemTap script:
http://www.wiesinger.com/opensource/systemtap/disktop_gw.stp

Output per 5 seconds on a sequential scan:
     UID      PID     PPID                       CMD   DEVICE    T        BYTES     REQUESTS    BYTES/REQ
      26     4263     4166                postmaster     dm-1    R    168542208        20574         8192
=> 32MB/s

So I saw that even on sequential scans (and also on bitmap heap scan
access) PostgreSQL issues only 8k read requests. I think that's a major
I/O bottleneck.

A commercial database vendor solved the problem by reading multiple
contiguous 8k blocks in a single request, up to a maximum threshold.
Output per 5 seconds on an equivalent "sequential scan":
     UID      PID     PPID                       CMD   DEVICE    T        BYTES     REQUESTS    BYTES/REQ
    1001     5381        1                   process     dm-1    R    277754638         2338       118800
=> 53 MB/s
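
As a cross-check of the two samples above, here is a small Python sketch
that recomputes throughput and average request size from the BYTES and
REQUESTS columns. The function name is only illustrative, and it assumes
the 5 second sampling interval stated above and MB meaning 2^20 bytes.

```python
INTERVAL_S = 5  # sampling interval of the SystemTap output above


def summarize(total_bytes, requests, interval_s=INTERVAL_S):
    """Return (MiB per second, average bytes per request, requests per second)."""
    mib_per_s = total_bytes / interval_s / (1024 * 1024)
    return mib_per_s, total_bytes / requests, requests / interval_s


# Values copied from the BYTES and REQUESTS columns of the two samples.
pg = summarize(168542208, 20574)
other = summarize(277754638, 2338)

print("postmaster: %.1f MiB/s, %.0f bytes/req, %.0f req/s" % pg)
print("process:    %.1f MiB/s, %.0f bytes/req, %.0f req/s" % other)
```

By these numbers the second process moves about 1.65 times the data with
roughly one ninth of the requests, at about 14.5 times the request size.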

A Google search showed that Gregory Stark has already worked on this
issue (see references below), but as far as I can see only for bitmap
heap scans.

I think this is one of the most critical performance showstoppers of
PostgreSQL on the I/O side.

What's the current status of Gregory Stark's patch? Is there a timeframe
for integrating it?
Does it also work for sequential scans? Are there any plans for a generic
"multi block read count" solution?
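
To make clear what I mean by a "multi block read count", here is a
minimal Python sketch of the idea, not PostgreSQL code: contiguous block
numbers are merged into runs, capped at a threshold, and each run is
fetched with one pread() call. The names coalesce, read_runs and
MAX_BLOCKS_PER_READ, as well as the threshold value, are my own
assumptions for illustration.

```python
import os

BLOCK_SIZE = 8192          # PostgreSQL default block size
MAX_BLOCKS_PER_READ = 16   # hypothetical threshold: 128k per request


def coalesce(block_numbers, max_blocks=MAX_BLOCKS_PER_READ):
    """Group block numbers into (first_block, block_count) runs.

    Adjacent blocks are merged into one run until the run reaches
    max_blocks; a gap or a full run starts a new one.
    """
    runs = []
    for blk in sorted(set(block_numbers)):
        if runs and blk == runs[-1][0] + runs[-1][1] and runs[-1][1] < max_blocks:
            runs[-1][1] += 1
        else:
            runs.append([blk, 1])
    return [tuple(r) for r in runs]


def read_runs(fd, runs, block_size=BLOCK_SIZE):
    """Issue one pread() per run instead of one per block.

    Note: pread() may return fewer bytes than requested (for example at
    end of file); a real implementation would have to handle that.
    """
    for first, count in runs:
        yield first, os.pread(fd, count * block_size, first * block_size)
```

For a sequential scan the block list is simply 0..N-1, so every request
reaches the threshold; for a bitmap heap scan only the blocks that happen
to be adjacent would be merged.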

Any comments?

Thanks.

Ciao,
Gerhard

--
http://www.wiesinger.com/

http://wiki.postgresql.org/wiki/Todo#Concurrent_Use_of_Resources
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00027.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00395.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00088.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00092.php
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00098.php

http://archives.postgresql.org/pgsql-hackers/2006-10/msg00820.php

http://markmail.org/message/a5osy4qptxk2jgu3#query:+page:1+mid:hz7uzhwxtkbzncy2+state:results
http://markmail.org/message/a5osy4qptxk2jgu3#query:+page:1+mid:a5osy4qptxk2jgu3+state:results
