Thread: Re: [HACKERS] Disk block size issues.

Re: [HACKERS] Disk block size issues.

From

darrenk@insightdist.com (Darren King)

Date:

09 January 1998, 11:36:46

> > A few things that I have noticed will be affected by allowing the
> > disk block size to be other than 8k. (4k, 8k, 16k or 32k)
> >
> > 1. Rules
> >
> > The rule system currently stores plans as tuples in pg_rewrite.
> > Making the block size smaller will accordingly reduce the size of
> > the rules you can create.
>
> I say make it match the given block size at compile time.

For now it does.  There's a comment in rewriteDefine.c though that
indicates the original pg coders thought about putting the stored
plans into large objects if 8k was too limiting.

Could be nice to have the type limits stored in a system table so
the user or a program could query the limits of the current db.

> > 2. Attribute limits
> >
> > Should the size limits of the varchar/char be driven by the chosen
> > block size?
>
> Yes, they should be calculated based on the compile block size.
> ...
> Just make the max size based on the block size.
> ...
> This is an interesting point.  While we can compute most of the changes
> at compile time, we will have to communicate with clients that were
> compiled with different max limits.
>
> I recommend we increase the max client buffer size to what we believe is
> the largest block size anyone would ever reasonably choose.  That way,
> all can communicate.  I recommend you contact Peter Mount for JDBC,
> Openlink for ODBC, and all the other client maintainers and let them
> know the changes will be in 6.3 so they can be ready with new version
> when 6.3 starts beta on February 1.

So the buffer size will be defined in one place also that they should all
reference when compiling or running?  In include/config.h I assume?

This could be difficult for the ODBC and JDBC drivers to determine
automagically since they are usually compiled on different systems than
the postgres src.

Other stuff...

Could the block size be made into a command line option, like "-k 8192"?

Would only require that the BLCKSZ define become a variable and that it
be passed to the backends too.  Much easier than having to recompile/install
postgres to change the block size.  Could have multiple postmasters running
different block-sized databases without having to have a binary around for
each size.
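
As a rough sketch of what checking such a switch might involve, assuming
the four sizes listed at the top of this thread (4k, 8k, 16k, 32k) are the
only legal values.  parse_block_size() is a hypothetical helper, not
existing postgres code, and "-k" itself is only a proposal here.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the argument of a hypothetical "-k" switch.
 * Returns 0 and stores the size in *out on success; returns -1 if the
 * text is not a plain decimal number or is not one of the block sizes
 * discussed in this thread (4k, 8k, 16k, 32k).
 */
static int
parse_block_size(const char *arg, long *out)
{
    char *end;
    long  val;

    assert(arg != NULL && out != NULL);

    errno = 0;
    val = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;              /* empty, trailing junk, or overflow */

    switch (val)
    {
        case 4096:
        case 8192:
        case 16384:
        case 32768:
            *out = val;
            return 0;
        default:
            return -1;          /* numeric, but not a supported size */
    }
}
```

A caller would pass optarg and the address of whatever variable replaces
BLCKSZ, and refuse to start if the function returns -1.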

Renaming BLCKSZ...

How about PG_BLOCK_SIZE?  Or if it's made a variable, DiskBlockSize, keeping
it in the tradition of SortMem, ShowStats, etc.

darrenk

Re: [HACKERS] Disk block size issues.

From

Bruce Momjian

Date:

09 January 1998, 12:30:22

>
> > > A few things that I have noticed will be affected by allowing the
> > > disk block size to be other than 8k. (4k, 8k, 16k or 32k)
> > >
> > > 1. Rules
> > >
> > > The rule system currently stores plans as tuples in pg_rewrite.
> > > Making the block size smaller will accordingly reduce the size of
> > > the rules you can create.
> >
> > I say make it match the given block size at compile time.
>
> For now it does.  There's a comment in rewriteDefine.c though that
> indicates the original pg coders thought about putting the stored
> plans into large objects if 8k was too limiting.

Yep, I saw that too.

> Could be nice to have the type limits stored in a system table so
> the user or a program could query the limits of the current db.

Someday.

>
> > > 2. Attribute limits
> > >
> > > Should the size limits of the varchar/char be driven by the chosen
> > > block size?
> >
> > Yes, they should be calculated based on the compile block size.
> > ...
> > Just make the max size based on the block size.
> > ...
> > This is an interesting point.  While we can compute most of the changes
> > at compile time, we will have to communicate with clients that were
> > compiled with different max limits.
> >
> > I recommend we increase the max client buffer size to what we believe is
> > the largest block size anyone would ever reasonably choose.  That way,
> > all can communicate.  I recommend you contact Peter Mount for JDBC,
> > Openlink for ODBC, and all the other client maintainers and let them
> > know the changes will be in 6.3 so they can be ready with new version
> > when 6.3 starts beta on February 1.
>
> So the buffer size will be defined in one place also that they should all
> reference when compiling or running?  In include/config.h I assume?

Yes, in config.h, and let's call it PG... so it is clear, and everything
can key off of that.

>
> This could be difficult for the ODBC and JDBC drivers to determine
> automagically since they are usually compiled on different systems than
> the postgres src.

I think they will need to handle the maximum size someone could ever
choose.  Let's face it, 32k or 64k is not too much to ask for a buffer.
I just hope there are not too many of them.  I only see it in one place
in libpq.  The others are malloc'ed based on how big the result is when
it comes back from the socket.

I recommend we add a test in config.h to make sure they do not set the
max size greater than some predefined limit, and mention why we test
there (for clients).  The interface/* files will not use the backend
block size, but will use another config.h define called PGMAXBLCKSZ, or
something like that, so they can interoperate with all backends.
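
A minimal sketch of such a test, assuming a ceiling of 64k (the larger of
the two figures mentioned above) and the PGMAXBLCKSZ name floated here.
The values and the client_store() helper are illustrative assumptions,
not the actual config.h or libpq code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLCKSZ      8192    /* backend disk block size (example value) */
#define PGMAXBLCKSZ 65536   /* assumed ceiling every client must handle */

/*
 * Clients size their buffers from PGMAXBLCKSZ, not from the backend's
 * BLCKSZ, so a backend built with a larger block could overrun them.
 * Fail the build instead.
 */
#if BLCKSZ > PGMAXBLCKSZ
#error "BLCKSZ exceeds PGMAXBLCKSZ; clients size their buffers from PGMAXBLCKSZ"
#endif

/* Client-side buffer, independent of any one backend's block size. */
static char client_buffer[PGMAXBLCKSZ];

/* Copy len bytes into the client buffer; -1 if they would not fit. */
static int
client_store(const char *data, size_t len)
{
    assert(data != NULL);

    if (len > sizeof(client_buffer))
        return -1;
    memcpy(client_buffer, data, len);
    return 0;
}
```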

>
> Other stuff...
>
> Could the block size be made into a command line option, like "-k 8192"?

Too scary for me.

>
> Would only require that the BLCKSZ define become a variable and that it
> be passed to the backends too.  Much easier than having to recompile/install
> postgres to change the block size.  Could have multiple postmasters running
> different block-sized databases without having to have a binary around for
> each size.

Yes, we could do that, but if they ever start the postmaster with a
different value, they are lost.  I thought we couldn't make it variable
because of the bit fields and the cases where BLCKSZ is used in macros
to define sized arrays.

I think we should make it a config.h constant for now, but I am not firm
on this.
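
To illustrate the concern about bit fields and sized arrays, here is a
small sketch.  The LinePointer struct, its field names, and the 13-bit
width are assumptions chosen for illustration, not a copy of the real
page layout.

```c
#include <assert.h>
#include <stddef.h>

#define BLCKSZ 8192             /* compile-time constant, as today */

/* A file-scope array like this needs its size at compile time. */
static char page_buffer[BLCKSZ];

/*
 * Illustrative line pointer (names and widths are assumptions).
 * A bit-field width must be a constant expression, and 13 bits can
 * only hold offsets 0..8191, so the layout quietly assumes 8k pages.
 */
#define OFFSET_BITS 13

typedef struct
{
    unsigned int offset:OFFSET_BITS;    /* byte offset within the page */
    unsigned int length:OFFSET_BITS;    /* item length in bytes */
} LinePointer;

/* Can every byte offset of a page this large be stored in 'offset'? */
static int
offset_fits(unsigned long block_size)
{
    LinePointer lp;

    assert(block_size > 0);
    lp.offset = (1u << OFFSET_BITS) - 1;    /* largest storable offset */
    return block_size - 1 <= (unsigned long) lp.offset;
}

/* Size of the fixed page buffer, as the compiler sees it. */
static size_t
page_buffer_size(void)
{
    return sizeof(page_buffer);
}
```

If BLCKSZ became a run-time variable, neither the array declaration nor
the field widths could follow it without restructuring the code.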

>
> Renaming BLCKSZ...
>
> How about PG_BLOCK_SIZE?  Or if it's made a variable, DiskBlockSize, keeping
> it in the tradition of SortMem, ShowStats, etc.

I like that new name.

--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] Disk block size issues.

From

The Hermit Hacker

Date:

09 January 1998, 13:10:22

On Fri, 9 Jan 1998, Darren King wrote:

> How about PG_BLOCK_SIZE?  Or if it's made a variable, DiskBlockSize, keeping
> it in the tradition of SortMem, ShowStats, etc.

    I know of one site that builds their Virtual Websites into
chroot()'d environments...something like this would be perfect for them,
as it would prevent them having to recompile for each individual size...

    But...initdb would have to have an appropriate option...and we'd
have to have a mechanism in place that checks that -k parameter is
actually appropriate.

    Would it not make a little more sense to have a pg_block_size file
created in the data directory that postmaster reads at startup?
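
    A rough sketch of the reading side, assuming the file holds nothing
but the size as a decimal number.  read_block_size_file() and the file
format are assumptions for illustration; no such file exists today.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read <datadir>/pg_block_size (hypothetical file) and store the
 * value in *out.  Returns 0 on success, -1 if the file is missing,
 * unreadable, or does not start with a positive decimal number.
 */
static int
read_block_size_file(const char *datadir, long *out)
{
    char  path[1024];
    FILE *fp;
    long  val;
    int   n;

    assert(datadir != NULL && out != NULL);

    n = snprintf(path, sizeof(path), "%s/pg_block_size", datadir);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;              /* path would not fit */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    n = fscanf(fp, "%ld", &val);
    fclose(fp);

    if (n != 1 || val <= 0)
        return -1;

    *out = val;
    return 0;
}
```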

Re: [HACKERS] Disk block size issues.

From

Bruce Momjian

Date:

09 January 1998, 14:00:13

>
> On Fri, 9 Jan 1998, Darren King wrote:
>
> > How about PG_BLOCK_SIZE?  Or if it's made a variable, DiskBlockSize, keeping
> > it in the tradition of SortMem, ShowStats, etc.
>
>     I know of one site that builds their Virtual Websites into
> chroot()'d environments...something like this would be perfect for them,
> as it would prevent them having to recompile for each individual size...
>
>     But...initdb would have to have an appropriate option...and we'd
> have to have a mechanism in place that checks that -k parameter is
> actually appropriate.
>
>     Would it not make a little more sense to have a pg_block_size file
> created in the data directory that postmaster reads at startup?

I like that, but the postmaster and each backend would have to read that
file before starting, or the postmaster can pass it down into the
postgres backend via a command-line option.

--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] Disk block size issues.

From

The Hermit Hacker

Date:

09 January 1998, 16:53:12

On Fri, 9 Jan 1998, Bruce Momjian wrote:

> > Other stuff...
> >
> > Could the block size be made into a command line option, like "-k 8192"?
>
> Too scary for me.

    I kinda like this one...if it can be implemented relatively easily.  The main
reason I like it is that, like -B and -S, it means that someone could deal
with "tweaking" a system without having to recompile from scratch...

    That said, I'd much rather the -k option be something that is
an option only available when *creating* the database (ie. initdb) with a
pg_blocksize file being created and checked when postmaster starts up.

    Essentially, make '-k 8192' an option only available to the postgres
process, not the postmaster process.  And not settable by the -O option to
postmaster...
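
    A minimal sketch of the initdb side of that idea, assuming POSIX
open() and a file containing only the decimal size.  The function name
and file format are hypothetical.  O_EXCL makes the call fail if the file
already exists, so the size recorded at creation time cannot be silently
replaced later.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create <datadir>/pg_blocksize (hypothetical file) holding the block
 * size chosen at initdb time.  Returns 0 on success, -1 on any error,
 * including the case where the file already exists.
 */
static int
write_block_size_file(const char *datadir, long block_size)
{
    char path[1024];
    char text[32];
    int  fd;
    int  n;
    int  len;

    assert(datadir != NULL && block_size > 0);

    n = snprintf(path, sizeof(path), "%s/pg_blocksize", datadir);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;

    /* O_EXCL: refuse to touch a file left by an earlier initdb. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    len = snprintf(text, sizeof(text), "%ld\n", block_size);
    if (write(fd, text, (size_t) len) != len)
    {
        close(fd);
        return -1;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

The postmaster would then only ever read this file at startup and never
accept the size on its own command line.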

> Yes, we could do that, but if they ever start the postmaster with a
> different value, they are lost.

    See above...it should only be something that is settable at initdb time,
not accessible via 'postmaster' itself...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

Re: [HACKERS] Disk block size issues.

From

Shiby Thomas

Date:

09 January 1998, 17:42:48

=>     I kinda like this one...if it can be implemented relatively easily.  The main
=> reason I like it is that, like -B and -S, it means that someone could deal
=> with "tweaking" a system without having to recompile from scratch...
=>
The -S flag for the postmaster seems to be setting the silentflag, but the
FAQ says it can be used to set the sort memory.  The following is the 6.2.1
code in src/backend/postmaster/postmaster.c:
            case 'S':

                /*
                 * Start in 'S'ilent mode (disassociate from controlling
                 * tty). You may also think of this as 'S'ysV mode since
                 * it's most badly needed on SysV-derived systems like
                 * SVR4 and HP-UX.
                 */
                silentflag = 1;
                break;

Am I looking at the wrong file?  Can someone please tell me how to increase
the sort memory size?

Thanks
--shiby

Re: [HACKERS] Disk block size issues.

From

Bruce Momjian

Date:

09 January 1998, 17:54:17

Bug in FAQ, fixed now.  The -S in postmaster is silent, the -S in
postgres is sort.  The FAQ had it as postmaster when it should have been
postgres.

>
>
> =>     I kinda like this one...if it can be implemented relatively easily.  The main
> => reason I like it is that, like -B and -S, it means that someone could deal
> => with "tweaking" a system without having to recompile from scratch...
> =>
> The -S flag for the postmaster seems to be setting the silentflag. But the
> FAQ says, it can be used to set the sort memory. The following is 6.2.1 version
> code in src/backend/postmaster/postmaster.c
>             case 'S':
>
>                 /*
>                  * Start in 'S'ilent mode (disassociate from controlling
>                  * tty). You may also think of this as 'S'ysV mode since
>                  * it's most badly needed on SysV-derived systems like
>                  * SVR4 and HP-UX.
>                  */
>                 silentflag = 1;
>                 break;
>
> Am I looking at the wrong file? Can someone please tell me how to increase
> the sort memory size.
>
> Thanks
> --shiby
>
>
>


--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] Disk block size issues.

From

Peter T Mount

Date:

10 January 1998, 07:39:23

On Fri, 9 Jan 1998, Bruce Momjian wrote:

> > > This is an interesting point.  While we can compute most of the changes
> > > at compile time, we will have to communicate with clients that were
> > > compiled with different max limits.
> > >
> > > I recommend we increase the max client buffer size to what we believe is
> > > the largest block size anyone would ever reasonably choose.  That way,
> > > all can communicate.  I recommend you contact Peter Mount for JDBC,
> > > Openlink for ODBC, and all the other client maintainers and let them
> > > know the changes will be in 6.3 so they can be ready with new version
> > > when 6.3 starts beta on February 1.

I'll be ready :-)

> > So the buffer size will be defined in one place also that they should all
> > reference when compiling or running?  In include/config.h I assume?
>
> Yes, in config.h, and let's call it PG... so it is clear, and everything
> can key off of that.
>
> >
> > This could be difficult for the ODBC and JDBC drivers to determine
> > automagically since they are usually compiled on different systems than
> > the postgres src.

Not necessarily for JDBC. Because of its nature, there is no real reason
why we can't even include it precompiled with the source - the same jar
file runs on any platform.

In fact, this does bring up the same problem we were discussing
earlier, where we were thinking about changing the protocol on startup. If
that change occurs, then this value is an ideal candidate to add to the
startup packet.

> I think they will need to handle the maximum size someone could ever
> choose.  Let's face it, 32k or 64k is not too much to ask for a buffer.
> I just hope there are not too many of them.  I only see it in one place
> in libpq.  The others are malloc'ed based on how big the result is when
> it comes back from the socket.
>
> I recommend we add a test in config.h to make sure they do not set the
> max size greater than some predefined limit, and mention why we test
> there (for clients).  The interface/* files will not use the backend
> block size, but will use another config.h define called PGMAXBLCKSZ, or
> something like that, so they can interoperate with all backends.

Slight problem with JDBC (or Java in general), in that we don't use .h
files, so settings in config.h are useless to us. So far, certain
constants have been duplicated in the source.

I was thinking of possibly adding a couple of functions to the backend, to
allow us to get certain details about the backend, which are needed for
certain DatabaseMetaData methods. Perhaps adding PGMAXBLCKSZ to that may
get round the problem.

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk