Thread: Block size: 8K or 16K?

Block size: 8K or 16K?

From
mlw
Date:
I am going to compare a 16KB PostgreSQL system to an 8KB system. I am working
on the assumption that 16K takes about as long to read as 8K, and that the CPU
overhead of working with a 16K block is not significant.

I know that with TOAST, block size is no longer an issue for large values, but 8K
is not a lot these days, and it seems like a lot of syscall and block management
overhead could be reduced by doubling it. Any comments?

The test system is a dual 850MHz PIII, 1GB memory, Red Hat 7.2, two 18GB IBM SCSI
hard disks, and an Intel motherboard with onboard Adaptec SCSI ULVD.

Besides pgbench, anyone have any tests that they would like to try?

Has anyone already done this test and found it useful/useless?


Re: Block size: 8K or 16K?

From
mlw
Date:
Jean-Paul ARGUDO wrote:
> 
> > I know with toast, block size is no longer an issue, but 8K is not a lot these
> > days, and it seems like a lot of syscall and block management overhead could be
> > reduced by doubling it. Any comments?
> 
> IMHO, I think this would enhance performance only if tuple length is
> above 8k, huh?..
> 
> I mean, I think this would enhance databases with many large objects. On
> the contrary, databases with classical varchars and integers won't benefit
> from it, don't you think?

See, I'm not sure. I can make many arguments pro or con, and I could defend
either, but my gut tells me that using 16K blocks will increase performance
over 8K. Already I have seen a sequential scan of a large table go from 20
seconds using 8K to 17.3 seconds using 16K.

select * from zsong where song like '%fubar%';


I am copying my pgbench database to the new block size to test that.

8K vs 16K

Pros:
- A sequential scan requires half the number of system calls for the same
  amount of data.
- Block "cache" management costs are cut in half for the same amount of data.
- More index information per fetch.
- Larger tuples can be stored without TOASTing.

Cons:
- Time to search a block for a tuple may increase.
- More memory is used per block (these days, I don't think this is much of an
  issue).

This is based on the assumption that reading an 8K chunk costs about as much as
reading a 16K chunk. If that assumption is not true, the arguments do not hold.
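The syscall-count half of the argument is easy to sanity-check outside the
database. A minimal sketch in plain Python (nothing PostgreSQL-specific; the
1MB scratch file simply stands in for a table segment) that reads the same file
with 8K and 16K buffers and counts the read() calls each requires:

```python
import os
import tempfile

def count_reads(path, block_size):
    """Read the whole file block_size bytes at a time; return (calls, bytes)."""
    calls = total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # EOF
                break
            calls += 1
            total += len(chunk)
    finally:
        os.close(fd)
    return calls, total

# A 1MB scratch file stands in for a table segment.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * (1024 * 1024))
    path = f.name

calls_8k, _ = count_reads(path, 8 * 1024)    # 128 read() calls for 1MB
calls_16k, _ = count_reads(path, 16 * 1024)  # 64 read() calls: exactly half
os.unlink(path)
```

Wrapping the two count_reads() calls in time.perf_counter() would show whether
halving the syscall count actually moves the wall-clock needle on a given
machine, which is the part the assumption above leaves open.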


Re: Block size: 8K or 16K?

From
Neil Conway
Date:
On Thu, 25 Apr 2002 09:04:07 -0400
"mlw" <markw@mohawksoft.com> wrote:
> I am going to compare a 16KB PostgreSQL system to an 8KB system. I am working
> on the assumption that 16K takes about as long to read as 8K, and that the CPU
> overhead of working with a 16K block is not significant.
> 
> I know with toast, block size is no longer an issue, but 8K is not a lot these
> days, and it seems like a lot of syscall and block management overhead could be
> reduced by doubling it. Any comments?

It's something I was planning to investigate, FWIW. I'd be interested to see
the results...

> The test system is a dual 850MHz PIII, 1GB memory, Red Hat 7.2, two 18GB IBM SCSI
> hard disks, and an Intel motherboard with onboard Adaptec SCSI ULVD.
> 
> Besides pgbench, anyone have any tests that they would like to try?

Perhaps OSDB? http://osdb.sf.net

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC


Re: Block size: 8K or 16K?

From
Curt Sampson
Date:
On Thu, 25 Apr 2002, mlw wrote:

> ...but my gut tells me that using 16K blocks will increase performance
> over 8K. Already I have seen a sequential scan of a large table go from 20
> seconds using 8K to 17.3 seconds using 16K.

You should be able to get the same performance increase with 8K
blocks by reading two blocks at a time while doing sequential scans.
That's why I've been promoting this idea of changing postgres to
do its own read-ahead.

Of course, Bruce might be right that the OS read-ahead may take
care of this anyway, but then why would switching to 16K blocks
improve sequential scans? Possibly because I'm missing something here.

Anyway, we now know how to test the change, should someone do it:
compare sequential scans with and without readahead on 8K blocks,
and then compare that against a server without readahead but with a
block size equal to the readahead window (64K, I propose--oh wait, we
can only do 32K....)

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC



Re: Block size: 8K or 16K?

From
Bruce Momjian
Date:
Curt Sampson wrote:
> On Thu, 25 Apr 2002, mlw wrote:
> 
> > ...but my gut tells me that using 16K blocks will increase performance
> > over 8K. Already I have seen a sequential scan of a large table go from 20
> > seconds using 8K to 17.3 seconds using 16K.
> 
> You should be able to get the same performance increase with 8K
> blocks by reading two blocks at a time while doing sequential scans.
> That's why I've been promoting this idea of changing postgres to
> do its own read-ahead.
> 
> Of course, Bruce might be right that the OS read-ahead may take
> care of this anyway, but then why would switching to 16K blocks
> improve sequential scans? Possibly because I'm missing something here.

I am almost sure that increasing the block size or doing read-ahead in
the db will only improve performance if someone is performing seeks in
the file at the same time, and hence OS readahead is being turned off.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Block size: 8K or 16K?

From
mlw
Date:
Bruce Momjian wrote:
> 
> Curt Sampson wrote:
> > On Thu, 25 Apr 2002, mlw wrote:
> >
> > > ...but my gut tells me that using 16K blocks will increase performance
> > > over 8K. Already I have seen a sequential scan of a large table go from 20
> > > seconds using 8K to 17.3 seconds using 16K.
> >
> > You should be able to get the same performance increase with 8K
> > blocks by reading two blocks at a time while doing sequential scans.
> > That's why I've been promoting this idea of changing postgres to
> > do its own read-ahead.
> >
> > Of course, Bruce might be right that the OS read-ahead may take
> > care of this anyway, but then why would switching to 16K blocks
> > improve sequential scans? Possibly because I'm missing something here.
> 
> I am almost sure that increasing the block size or doing read-ahead in
> the db will only improve performance if someone is performing seeks in
> the file at the same time, and hence OS readahead is being turned off.

I largely agree with you; however, don't underestimate the overhead of a read()
call. By doubling the block size, the number of read() calls in my full table
scan was cut in half, and the scan went from 20 seconds to 17. (That was on a
machine running only one query, not one under full load, so under real load the
effect may be much more subtle.)

In fact, I posted some results of a comparison between 16K and 8K blocks; I saw
very little difference on most tests, while a couple looked pretty interesting.