Thread: Has anybody think about changing BLCKSZ to an option of initdb?

Has anybody think about changing BLCKSZ to an option of initdb?

From
"Jacky Leng"
Date:
After all, re-running initdb is much easier than rebuilding the whole package.

And there seems to be nothing difficult in implementing this. Is that true?




Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Tom Lane
Date:
"Jacky Leng" <lengjianquan@163.com> writes:
> And there seems to be nothing difficult in implementing this. Is that true?

No.
        regards, tom lane


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Greg Stark
Date:
On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> And there seems to be nothing difficult in implementing this. Is that true?
>
> No.

Eh? There's nothing difficult in implementing it.

But there are a lot of other constants dependent on this value which
are currently compile-time constants. The only downside I'm aware of
is that with this change they become dynamically calculated values,
which might have a CPU cost since they'll be recalculated quite often.
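
To make the trade-off concrete, here is a minimal C sketch. It is not PostgreSQL source: the names DEMO_BLCKSZ, DEMO_PAGE_HEADER_SIZE, DEMO_ITEM_SIZE, demo_block_size and demo_max_items_per_page are hypothetical, and the sizes are illustrative assumptions. It contrasts a derived limit that the compiler can fold into a constant with the same limit computed from a run-time block size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, illustrative sizes; not PostgreSQL's real values. */
#define DEMO_BLCKSZ            8192
#define DEMO_PAGE_HEADER_SIZE  24
#define DEMO_ITEM_SIZE         32

/* Compile-time form: the compiler folds this into a single constant. */
#define DEMO_MAX_ITEMS_PER_PAGE \
    ((DEMO_BLCKSZ - DEMO_PAGE_HEADER_SIZE) / DEMO_ITEM_SIZE)

/*
 * Run-time form: the block size is an ordinary variable that would be
 * set once at startup (assumption: read from the cluster's metadata).
 */
static size_t demo_block_size = 8192;

/* Every call now costs a load, a subtraction and a division. */
static size_t
demo_max_items_per_page(void)
{
    assert(demo_block_size > DEMO_PAGE_HEADER_SIZE);
    return (demo_block_size - DEMO_PAGE_HEADER_SIZE) / DEMO_ITEM_SIZE;
}
```

Caching the computed value in a second variable at startup would avoid the repeated division, but it would still be a memory load rather than an immediate constant, and it could not be used where the language requires a constant expression.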

-- 
greg


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Tom Lane
Date:
Greg Stark <stark@enterprisedb.com> writes:
> On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> And there seems to be nothing difficult in implementing this. Is that true?
>> 
>> No.

> Eh? There's nothing difficult in implementing it.

> But there are a lot of other constants dependent on this value which
> are currently compile-time constants.

Exactly, and we rely on them being constants, e.g. to size arrays.
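
As an illustration of the array-sizing point, here is a hedged C sketch. The types DemoPageScan and DemoDynPageScan, the limit DEMO_MAX_ITEMS_PER_PAGE and the helper functions are hypothetical names invented for this example, not taken from the PostgreSQL tree. With a compile-time limit the array can be embedded in a fixed-layout struct; with a run-time limit it has to be allocated, checked and freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical compile-time limit derived from the block size. */
#define DEMO_MAX_ITEMS_PER_PAGE 255

/* Constant limit: the array is part of the struct, no allocation needed. */
typedef struct DemoPageScan
{
    int            nitems;
    unsigned short offsets[DEMO_MAX_ITEMS_PER_PAGE];
} DemoPageScan;

/* Run-time limit: the array becomes a separately allocated buffer. */
typedef struct DemoDynPageScan
{
    int             nitems;
    size_t          capacity;
    unsigned short *offsets;
} DemoDynPageScan;

/* Allocate a scan state sized for max_items; returns NULL on failure. */
static DemoDynPageScan *
demo_dyn_scan_create(size_t max_items)
{
    DemoDynPageScan *scan;

    assert(max_items > 0);
    scan = malloc(sizeof(*scan));
    if (scan == NULL)
        return NULL;
    scan->offsets = calloc(max_items, sizeof(scan->offsets[0]));
    if (scan->offsets == NULL)
    {
        free(scan);
        return NULL;
    }
    scan->nitems = 0;
    scan->capacity = max_items;
    return scan;
}

/* Release a scan state created by demo_dyn_scan_create(). */
static void
demo_dyn_scan_free(DemoDynPageScan *scan)
{
    if (scan != NULL)
    {
        free(scan->offsets);
        free(scan);
    }
}
```

The second form is not hard to write once, but every struct and local array sized this way would need the same conversion, plus error handling that the fixed-size version never needed.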

There's no free lunch, and in this particular case there is no evidence
whatsoever that it'd be worth the trouble to support run-time-variable
BLCKSZ.
        regards, tom lane


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Greg Stark
Date:
On Wed, Mar 11, 2009 at 1:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Greg Stark <stark@enterprisedb.com> writes:
>> On Wed, Mar 11, 2009 at 12:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> And there seems to be nothing difficult in implementing this. Is that true?
>>>
>>> No.
>
>> Eh? There's nothing difficult in implementing it.
>
>> But there are a lot of other constants dependent on this value which
>> are currently compile-time constants.
>
> Exactly, and we rely on them being constants, e.g. to size arrays.
>
> There's no free lunch, and in this particular case there is no evidence
> whatsoever that it'd be worth the trouble to support run-time-variable
> BLCKSZ.

The main advantage would be for circumstances such as the Windows
installer where users are installing precompiled binaries. They don't
get an opportunity to choose the block size at all. (Similarly for
users of binary-only commercial products such as EDB's but the Windows
installer makes a pretty good argument on its own). I think the
question hinges on whether there's any real benefit to changing the
block size at all.

The current situation is that the facility is available for people to
test and demonstrate that it's helpful. But there are so many
variables -- filesystem type, filesystem block size, RAID array stripe
size, OS readahead, database workload -- that nobody's done that kind
of testing extensively enough to separate the effects of block size
from other effects.

If we had a solid use case for adjusting block size at all I think we
would also need to make it adjustable at initdb time for those
binary-only installs. Until we do, leaving the compile-time
configuration in for people to experiment with is sufficient.

-- 
greg


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Martijn van Oosterhout
Date:
On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote:
> The main advantage would be for circumstances such as the Windows
> installer where users are installing precompiled binaries. They don't
> get an opportunity to choose the block size at all. (Similarly for
> users of binary-only commercial products such as EDB's but the Windows
> installer makes a pretty good argument on its own).

And all the linux distributions which ship precompiled binaries. I'm
sure there are people who compile postgres themselves but I think there
are more who don't.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
"Joshua D. Drake"
Date:
On Sat, 2009-03-14 at 13:53 +0100, Martijn van Oosterhout wrote:
> On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote:
> > The main advantage would be for circumstances such as the Windows
> > installer where users are installing precompiled binaries. They don't
> > get an opportunity to choose the block size at all. (Similarly for
> > users of binary-only commercial products such as EDB's but the Windows
> > installer makes a pretty good argument on its own).
> 
> And all the linux distributions which ship precompiled binaries. I'm
> sure there are people who compile postgres themselves but I think there
> are more who don't.

I think that is an understatement. I would say 99% of postgresql users
do NOT compile from source. Heck the only time I compile from source is
when I need to fix mis-configured defaults in RH packages (which is why
we now have rpms that fix those defaults) or when we have back patched
something for a customer.

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
 



Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Gregory Stark
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:

> On Sat, 2009-03-14 at 13:53 +0100, Martijn van Oosterhout wrote:
>> On Wed, Mar 11, 2009 at 01:29:43PM +0000, Greg Stark wrote:
>> > The main advantage would be for circumstances such as the Windows
>> > installer where users are installing precompiled binaries. They don't
>> > get an opportunity to choose the block size at all. (Similarly for
>> > users of binary-only commercial products such as EDB's but the Windows
>> > installer makes a pretty good argument on its own).
>> 
>> And all the linux distributions which ship precompiled binaries. I'm
>> sure there are people who compile postgres themselves but I think there
>> are more who don't.
>
> I think that is an understatement. I would say 99% of postgresql users
> do NOT compile from source. Heck the only time I compile from source is
> when I need to fix mis-configured defaults in RH packages (which is why
> we now have rpms that fix those defaults) or when we have back patched
> something for a customer.

So has anyone here done any experiments with live systems with different block
sizes? What were your experiences? 

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
"Joshua D. Drake"
Date:
On Sat, 2009-03-14 at 15:29 +0000, Gregory Stark wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:

> > I think that is an understatement. I would say 99% of postgresql users
> > do NOT compile from source. Heck the only time I compile from source is
> > when I need to fix mis-configured defaults in RH packages (which is why
> > we now have rpms that fix those defaults) or when we have back patched
> > something for a customer.
> 
> So has anyone here done any experiments with live systems with different block
> sizes? What were your experiences? 

I tested with 4k once. The system tanked. This might be a good one for
the performance lab.

Joshua D. Drake




Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> So has anyone here done any experiments with live systems with different block
> sizes? What were your experiences? 

That should really have been the *first* question.  We are not going to
make this a tunable unless there is some pretty strong evidence that
it's worth twiddling.  Aside from the implementation costs of making
it variable, there is the oft repeated refrain that Postgres has too
many configuration knobs already.
        regards, tom lane


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
"Joshua D. Drake"
Date:
On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
> > So has anyone here done any experiments with live systems with different block
> > sizes? What were your experiences? 
> 
> That should really have been the *first* question.  We are not going to
> make this a tunable unless there is some pretty strong evidence that
> it's worth twiddling.  Aside from the implementation costs of making
> it variable, there is the oft repeated refrain that Postgres has too
> many configuration knobs already.

Well, that "too many knobs" argument doesn't apply to this scenario.
Anyone who is making use of these knobs needs them. It is the other 98%,
who really just need to crank up half a dozen parameters to get a
blazing-fast PostgreSQL, that make that argument (which is why we should
rip everything out of the postgresql.conf).

Joshua D. Drake





Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote:
>> ... Aside from the implementation costs of making
>> it variable, there is the oft repeated refrain that Postgres has too
>> many configuration knobs already.

> Well, that "too many knobs" argument doesn't apply to this scenario.
> Anyone who is making use of these knobs needs them.

That's nonsense --- on that argument, any variable no matter how obscure
should be exposed as a tunable because there might be somebody somewhere
who could benefit from it.  You are ignoring the costs to everybody else
who don't need it, but still have to study a GUC variable definition and
try to figure out whether it needs changing for their usage.  Not to
mention the people who set it to a bad value and suffer lost performance
as a result (cf vacuum_cost_delay).

Note that I am not saying "no", I am saying "give us some evidence
*first*".  The costs in implementation effort and user confusion are
certain, the benefits are not.
        regards, tom lane


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Josh Berkus
Date:
Tom Lane wrote:
> Gregory Stark <stark@enterprisedb.com> writes:
>> So has anyone here done any experiments with live systems with different block
>> sizes? What were your experiences? 

Mark tested this back in the OSDL days.  His finding on DBT2 was that 
the right *combination* of OS and PG block sizes gave up to a 5% 
performance increase, I think.  Hardly enough to make it worth the 
headache of running with non-default PG and non-default Linux block 
sizes, especially since the wrong combination resulted in a decrease in 
performance, sometimes dramatically so.

However, at Greenplum I remember determining that larger PG block sizes, 
if matched with larger filesystem block sizes, did significantly help 
the performance of data warehouses which do a lot of seq scans -- but that 
our ceiling of 32K was still too small to really make this work.  I 
don't have the figures for that, though; Luke, are you reading this?

--Josh


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
Alvaro Herrera
Date:
Josh Berkus wrote:

> However, at Greenplum I remember determining that larger PG block sizes,
> if matched with larger filesystem block sizes, did significantly help
> the performance of data warehouses which do a lot of seq scans -- but that
> our ceiling of 32K was still too small to really make this work.  I
> don't have the figures for that, though; Luke, are you reading this?

And did they study the effect of tuning the kernel's readahead?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
"Joshua D. Drake"
Date:
On Sat, 2009-03-14 at 12:25 -0400, Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > On Sat, 2009-03-14 at 11:47 -0400, Tom Lane wrote:
> >> ... Aside from the implementation costs of making
> >> it variable, there is the oft repeated refrain that Postgres has too
> >> many configuration knobs already.
> 
> > Well, that "too many knobs" argument doesn't apply to this scenario.
> > Anyone who is making use of these knobs needs them.
> 
> That's nonsense --- on that argument, any variable no matter how obscure
> should be exposed as a tunable because there might be somebody somewhere
> who could benefit from it.  You are ignoring the costs to everybody else
> who don't need it, but still have to study a GUC variable definition and
> try to figure out whether it needs changing for their usage.  Not to
> mention the people who set it to a bad value and suffer lost performance
> as a result (cf vacuum_cost_delay).

I think you misunderstood me. I wasn't actually arguing for the
variable. I was arguing that if the variable were required, those are
the people who would need it. I frankly don't see a need for this
variable but again, I think that the performance lab would provide
the information we need to make such a determination.


> Note that I am not saying "no", I am saying "give us some evidence
> *first*".  The costs in implementation effort and user confusion are
> certain, the benefits are not.

I do not disagree with this.

Sincerely,

Joshua D. Drake




Re: Has anybody think about changing BLCKSZ to an option of initdb?

From
ITAGAKI Takahiro
Date:
"Joshua D. Drake" <jd@commandprompt.com> wrote:

> > So has anyone here done any experiments with live systems with different block
> > sizes? What were your experiences? 
> 
> I tested with 4k once. The system tanked. This might be a good one for
> the performance lab.

I'm using 16k blocks for one system. There are tables with 5kB+/row.
Performance was worse with 8kB blocks because of the many TOASTed fields
and unusable space.
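
The unusable-space effect can be approximated with some simple arithmetic, sketched here in C. This is a deliberately simplified model and not how PostgreSQL lays out pages: it assumes each row is stored whole in a single block (so it ignores TOAST entirely), it ignores per-row overhead, and the 24-byte per-page overhead in DEMO_PAGE_OVERHEAD is an assumed figure. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed fixed per-page overhead in bytes; illustrative only. */
#define DEMO_PAGE_OVERHEAD 24

/* Number of whole rows of row_size bytes that fit in one block. */
static size_t
demo_rows_per_block(size_t block_size, size_t row_size)
{
    assert(row_size > 0 && block_size > DEMO_PAGE_OVERHEAD);
    return (block_size - DEMO_PAGE_OVERHEAD) / row_size;
}

/* Bytes left unused in a block after packing as many whole rows as fit. */
static size_t
demo_unused_bytes(size_t block_size, size_t row_size)
{
    size_t usable = block_size - DEMO_PAGE_OVERHEAD;

    return usable - demo_rows_per_block(block_size, row_size) * row_size;
}
```

Under these assumptions, a 5120-byte row in an 8192-byte block gives one row per block with 3048 bytes unused, while a 16384-byte block holds three such rows with 1000 bytes unused.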

There are some users who don't want to recompile postgres because they
think recompiled versions of postgres are not well tested and not supported
by companies and 3rd-party tools. Their database designs are bad, of course,
but they want to resolve their problems using database knobs.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center