Thread: initdb change

initdb change

From
David Fetter
Date:
Folks,

While initdb allows you to choose a directory for transaction logs, it
can't already exist, so it can't be in its usual place under $PGDATA.
I'd like to propose that this be allowed by having an alternate syntax
for the -X option, namely, "existing."

When -X is set to "existing", it would check whether pg_xlog is a
directory and the only thing in $PGDATA.  One way to do that is to add
a new return code to check_data_dir() and a new branch of the case
statement after it's called.
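
As a rough illustration of the check described above, here is a minimal shell sketch. It is not initdb's C code: the function name classify_data_dir and the result labels are hypothetical, and the requirement that pg_xlog itself be empty is an assumption added for the sketch.

```shell
#!/bin/sh
# Hypothetical helper: classify a would-be data directory.
# Prints one of: missing, empty, only_xlog, not_empty.
classify_data_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo missing
        return 0
    fi
    # List every entry, including dot files, but not "." and "..".
    entries=$(ls -A "$dir")
    if [ -z "$entries" ]; then
        echo empty
    elif [ "$entries" = "pg_xlog" ] && [ -d "$dir/pg_xlog" ] \
         && [ -z "$(ls -A "$dir/pg_xlog")" ]; then
        # Assumption: pg_xlog must be a directory (or a symlink to one)
        # and must itself be empty to count as the new case.
        echo only_xlog
    else
        echo not_empty
    fi
}
```

A caller could treat the only_xlog result the same way as empty, which corresponds to the extra branch after the call.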

What say?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: initdb change

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> While initdb allows you to choose a directory for transaction logs, it
> can't already exist, so it can't be in its usual place under $PGDATA.
> I'd like to propose that this be allowed by having an alternate syntax
> for the -X option, namely, "existing."

> When -X is set to "existing", it would check whether pg_xlog is a
> directory and the only thing in $PGDATA.  One way to do that is to add
> a new return code to check_data_dir() and a new branch of the case
> statement after it's called.

Why is this useful?  It seems like just extra complication.
        regards, tom lane


Re: initdb change

From
Joshua Drake
Date:
On Mon, 25 Aug 2008 08:40:17 -0700
David Fetter <david@fetter.org> wrote:

> Folks,
> 
> While initdb allows you to choose a directory for transaction logs, it
> can't already exist, so it can't be in its usual place under $PGDATA.
> I'd like to propose that this be allowed by having an alternate syntax
> for the -X option, namely, "existing."
> 
> When -X is set to "existing", it would check whether pg_xlog is a
> directory and the only thing in $PGDATA.  One way to do that is to add
> a new return code to check_data_dir() and a new branch of the case
> statement after it's called.
> 


Why not just implicitly do it? If the directory already exists we throw
a notice saying:

NOTICE: pg_xlog destination already exists, creating pg_xlog underneath
(or some such thing)

> What say?
> 
> Cheers,
> David.


-- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




Re: initdb change

From
David Fetter
Date:
On Mon, Aug 25, 2008 at 11:48:29AM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > While initdb allows you to choose a directory for transaction
> > logs, it can't already exist, so it can't be in its usual place
> > under $PGDATA.  I'd like to propose that this be allowed by having
> > an alternate syntax for the -X option, namely, "existing."
> 
> > When -X is set to "existing", it would check whether pg_xlog is a
> > directory and the only thing in $PGDATA.  One way to do that is to
> > add a new return code to check_data_dir() and a new branch of the
> > case statement after it's called.
> 
> Why is this useful?  It seems like just extra complication.

Letting people put a separate I/O channel in for pg_xlog in the usual
spot at initdb time makes it easier on everybody.  The person tasked
with solving a problem is not left blearily wondering where pg_xlog
went when their phone rings at 0300, as such phones are wont to do. :)

Another approach to this is to look by default for pg_xlog in the
$PGDATA-to-be, testing it for all the appropriate properties
(directory-ness, permissions, emptiness).

Cheers,
David.


Re: initdb change

From
Joshua Drake
Date:
On Mon, 25 Aug 2008 09:04:01 -0700
David Fetter <david@fetter.org> wrote:

> > > When -X is set to "existing", it would check whether pg_xlog is a
> > > directory and the only thing in $PGDATA.  One way to do that is to
> > > add a new return code to check_data_dir() and a new branch of the
> > > case statement after it's called.
> > 
> > Why is this useful?  It seems like just extra complication.

At no time should I ask a DBA to do this:

service postgresql stop
mv /var/lib/pgsql/data/pg_xlog /xlogs/pg_xlog
ln -sf /xlogs/pg_xlog /var/lib/pgsql/data/pg_xlog
service postgresql start

We need to either provide a way to initialize it at initdb time, allow
xlogs to live in a tablespace, or add a GUC for the location.

Sincerely,

Joshua D. Drake




Re: initdb change

From
David Fetter
Date:
On Mon, Aug 25, 2008 at 09:29:03AM -0700, Joshua D. Drake wrote:
> On Mon, 25 Aug 2008 09:04:01 -0700
> David Fetter <david@fetter.org> wrote:
> 
> > > > When -X is set to "existing", it would check whether pg_xlog
> > > > is a directory and the only thing in $PGDATA.  One way to do
> > > > that is to add a new return code to check_data_dir() and a new
> > > > branch of the case statement after it's called.
> > > 
> > > Why is this useful?  It seems like just extra complication.
> 
> At no time should I ask a DBA to do this:
> 
> service postgresql stop
> mv /var/lib/pgsql/data/pg_xlog /xlogs/pg_xlog
> ln -sf /xlogs/pg_xlog /var/lib/pgsql/data/pg_xlog
> service postgresql start
> 
> We either need to provide a way to initialize it at initdb, allow
> xlogs to be in table space or add a GUC for the location.

There's already a way to specify where xlogs should be via
-X/--xlogdir.  What that doesn't do is put the xlogdir where a DBA
would naturally expect to find it.  When that DBA doesn't find it in
the place they expect, very bad knock-on decisions are likely to
result.

Cheers,
David.


Re: initdb change

From
Joshua Drake
Date:
On Mon, 25 Aug 2008 09:42:21 -0700
David Fetter <david@fetter.org> wrote:

> > We either need to provide a way to initialize it at initdb, allow
> > xlogs to be in table space or add a GUC for the location.
> 
> There's already a way to specify where xlogs should be via
> -X/--xlogdir. 

Sorry, I should have checked the 8.3 initdb instead of 8.2.

> What that doesn't do is put the xlogdir where a DBA
> would naturally expect to find it.  When that DBA doesn't find it in
> the place they expect, very bad knock-on decisions are likely to
> result.

O.k. when using 8.3 I did this:

initdb -D /tmp/foo -X /tmp/xlogs

And I got:

/tmp/foo/pg_xlog which is a link to /tmp/xlogs

That seems perfectly logical. If I (without removing the old initdb) do
this:

/usr/lib/postgresql/8.3/bin/initdb -D /tmp/bar -X /tmp/xlog

I get:

initdb: directory "/tmp/xlog" exists but is not empty
If you want to store the transaction log there, either
remove or empty the directory "/tmp/xlog".
initdb: removing data directory "/tmp/bar"

I just reread your original message more slowly, and I see that what
you want is this:

If /var/lib/pgsql/data/ exists but is empty, you can initdb within that
directory. However, if there is anything in it, you cannot. You are
asking that if pg_xlog exists but is empty, we still be able to use
the DATADIR, and that you can pass "existing" so that initdb will also
use pg_xlog if it is empty.

My take would be to not add a new flag, but instead to allow it
implicitly: if initdb finds that DATADIR contains nothing but an empty
pg_xlog, it will use both.


Sincerely,

Joshua D. Drake




Re: initdb change

From
"Heikki Linnakangas"
Date:
David Fetter wrote:
> There's already a way to specify where xlogs should be via
> -X/--xlogdir.  What that doesn't do is put the xlogdir where a DBA
> would naturally expect to find it.  When that DBA doesn't find it in
> the place they expect, very bad knock-on decisions are likely to
> result.

I don't understand what that natural place is that you refer to, and it 
seems that others don't either. Could you walk us through how one would 
use the new option?

mount something somewhere
???
initdb -D /mnt/data -X ???
postgres -D /mnt/data

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: initdb change

From
Andrew Dunstan
Date:

David Fetter wrote:
> On Mon, Aug 25, 2008 at 11:48:29AM -0400, Tom Lane wrote:
>   
>> David Fetter <david@fetter.org> writes:
>>     
>>> While initdb allows you to choose a directory for transaction
>>> logs, it can't already exist, so it can't be in its usual place
>>> under $PGDATA.  I'd like to propose that this be allowed by having
>>> an alternate syntax for the -X option, namely, "existing."
>>>       
>>> When -X is set to "existing", it would check whether pg_xlog is a
>>> directory and the only thing in $PGDATA.  One way to do that is to
>>> add a new return code to check_data_dir() and a new branch of the
>>> case statement after it's called.
>>>       
>> Why is this useful?  It seems like just extra complication.
>>     
>
> Letting people put a separate I/O channel in for pg_xlog in the usual
> spot at initdb time makes it easier on everybody.  The person tasked
> with solving a problem is not left blearily wondering where pg_xlog
> went when their phone rings at 0300, as such phones are wont to do. :)
>
> Another approach to this is to look by default for pg_xlog in the
> $PGDATA-to-be, testing it for all the appropriate properties
> (directory-ness, permissions, emptiness).
>
>
>   

This is totally unclear to me.

First, your statement that the directory must not exist is factually 
wrong, according to my inspection of the initdb code.

Second, the whole point of this switch is to allow you to put the xlog 
dir outside the data dir.

Third, there is not the slightest reason I can see for any confusion 
about where it is - initdb creates a symlink in the datadir pointing to 
the real location if you use this option.

Fourth, I utterly fail to see how adding some extra behaviour to
initdb will save you from confusion at 0300.
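
To make the third point concrete: because pg_xlog inside the data directory is a symlink when -X is used, the real location can be recovered by resolving it. This is a minimal sketch for a POSIX shell. The helper name xlog_location is hypothetical and not part of PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helper: print the physical path behind $1/pg_xlog,
# following a symlink if there is one.
# Returns non-zero if $1/pg_xlog is not a reachable directory.
xlog_location() {
    # Run in a subshell so the caller's working directory is unchanged;
    # unset CDPATH so cd cannot print or search anything unexpected.
    (unset CDPATH; cd "$1/pg_xlog" 2>/dev/null && pwd -P)
}
```

Calling xlog_location with the data directory as its argument prints where the transaction log physically lives, whether or not pg_xlog is a symlink.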

cheers

andrew


Re: initdb change

From
David Fetter
Date:
On Mon, Aug 25, 2008 at 09:54:26AM -0700, Joshua D. Drake wrote:
> On Mon, 25 Aug 2008 09:42:21 -0700
> David Fetter <david@fetter.org> wrote:
> 
> > > We either need to provide a way to initialize it at initdb, allow
> > > xlogs to be in table space or add a GUC for the location.
> > 
> > There's already a way to specify where xlogs should be via
> > -X/--xlogdir. 
> 
> Sorry should have checked 8.3 initdb instead of 8.2.
> 
> > What that doesn't do is put the xlogdir where a DBA
> > would naturally expect to find it.  When that DBA doesn't find it in
> > the place they expect, very bad knock-on decisions are likely to
> > result.
> 
> O.k. when using 8.3 I did this:
> 
> initdb -D /tmp/foo -X /tmp/xlogs
> 
> And I got:
> 
> /tmp/foo/pg_xlog which is a link to /tmp/xlogs

Oops.  Well, this isn't quite the foot-gun I'd previously thought :P

> That seems perfectly logical. If I (without removing the old initdb) do
> this:
> 
> /usr/lib/postgresql/8.3/bin/initdb -D /tmp/bar -X /tmp/xlog
> 
> I get:
> 
> initdb: directory "/tmp/xlog" exists but is not empty
> If you want to store the transaction log there, either
> remove or empty the directory "/tmp/xlog".
> initdb: removing data directory "/tmp/bar"
> 
> I just reread your original message a little slower and see that what
> you want is if:
> 
> /var/lib/pgsql/data/ exists but is empty you can initdb within that
> directory. However if there is anything in it you can not. You are
> asking that if pg_xlog exists but is empty that we still be able to use
> the DATADIR and you can pass existing so that it will also use pg_xlog
> if it is empty.
> 
> My take would be to not add a new flag. Instead to implicitly allow it.
> If initdb finds that DATADIR and pg_xlog is empty it will use both. 

Is there some reason why initdb shouldn't just Do The Right Thing™
when it finds an empty extant $PGDATA/pg_xlog directory that passes
the same tests an empty extant $PGDATA would?

Cheers,
David.


Re: initdb change

From
Joshua Drake
Date:
On Mon, 25 Aug 2008 10:12:03 -0700
David Fetter <david@fetter.org> wrote:

> > /var/lib/pgsql/data/ exists but is empty you can initdb within that
> > directory. However if there is anything in it you can not. You are
> > asking that if pg_xlog exists but is empty that we still be able to
> > use the DATADIR and you can pass existing so that it will also use
> > pg_xlog if it is empty.
> >
> > My take would be to not add a new flag. Instead to implicitly allow
> > it. If initdb finds that DATADIR and pg_xlog is empty it will use
> > both.
>
> Is there some reason why initdb shouldn't just Do The Right Thing™
> when it finds an empty extant $PGDATA/pg_xlog directory that passes
> the same tests an empty extant $PGDATA would?

That is what I was suggesting.

Joshua D. Drake




Re: initdb change

From
Andrew Dunstan
Date:

Joshua Drake wrote:
>>
>> Is there some reason why initdb shouldn't just Do The Right Thing™
>> when it finds an empty extant $PGDATA/pg_xlog directory that passes
>> the same tests an empty extant $PGDATA would?
>>     
>
> That is what I was suggesting.
>
>   
>   

Why should the xlog directory be treated specially? We don't do this for 
any other subdirectory of PGDATA. The extra logic would be a nuisance 
and for no great gain in functionality that I can see.

This whole discussion springs from a misconception apparently, so let's 
just move on.

cheers

andrew


Re: initdb change

From
Joshua Drake
Date:
On Mon, 25 Aug 2008 13:56:16 -0400
Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> > That is what I was suggesting.
> > 
> 
> Why should the xlog directory be treated specially?

Consider the following:

mount /dev/sda1 /var/lib/pgsql
mount /dev/sdb1 /srv1/pgsql/pg_xlog (which has a link
from /var/lib/pgsql/data/pg_xlog)

initdb -D /var/lib/pgsql/data -X /var/lib/pgsql/data/pg_xlog

Will fail; now you have multiple steps to get everything where it
should be.


> We don't do this
> for any other subdirectory of PGDATA. The extra logic would be a

Well, the only other directory it would even matter for would be pg_clog
(maybe). I grant that it is a very small feature that we could live
without.

> nuisance and for no great gain in functionality that I can see.
> 

In an environment where you are provisioning many spindle machines over
many different mounts and RAID configurations, it could be useful. The
question is: is it worth it? I don't know. I was just trying to
understand exactly what David was talking about and offer some
suggestions.

> This whole discussion springs from a misconception apparently, so
> let's just move on.

Well that pretty much describes life doesn't it? :)

Sincerely,

Joshua D. Drake




Re: initdb change

From
Robert Treat
Date:
On Monday 25 August 2008 14:05:21 Joshua Drake wrote:
> On Mon, 25 Aug 2008 13:56:16 -0400
>
> Andrew Dunstan <andrew@dunslane.net> wrote:
> > > That is what I was suggesting.
> >
> > Why should the xlog directory be treated specially?
>
> Consider the following:
>
> mount /dev/sda1 /var/lib/pgsql
> mount /dev/sdb1 /srv1/pgsql/pg_xlog (which has a link
> from /var/lib/pgsql/data/pg_xlog)
>
> initdb -D /var/lib/pgsql/data -X /var/lib/pgsql/data/pg_xlog
>
> Will fail; now you have multiple steps to get everything where it
> should be.
>
> > We don't do this
> > for any other subdirectory of PGDATA. The extra logic would be a
>
> Well the only other directory it would even matter for would be pg_clog
> (maybe). I grant that it is a very little feature that could be lived
> without.
>
> > nuisance and for no great gain in functionality that I can see.
>
> In an environment where you are provisioning many spindle machines over
> many different mounts and RAID configurations, it could be useful. The
> question is: is it worth it? I don't know. I was just trying to
> understand exactly what David was talking about and offer some
> suggestions.
>

I would have thought the place you need this is where you have SAs who set up
a machine, creating a $PGDATA and $PGDATA/xlog on separate mountpoints where
the postgres user has full rights to use those directories, but not to create
directories in those locations. In that scenario, the DBA couldn't create the
directories even if he wanted to, so allowing initdb to use an existing
directory would be helpful.
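
The scenario above can be sketched as a pre-flight check: the directory already exists and is usable, even though the user could not have created it. This is a minimal sketch. The function name usable_existing_dir is hypothetical, and this is not what initdb itself runs.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a directory set up by someone else.
# Succeeds only if $1 exists, is a directory, lets the current user
# create files inside it, and is empty.
usable_existing_dir() {
    d=$1
    [ -d "$d" ] || return 1                  # must already exist as a directory
    [ -w "$d" ] && [ -x "$d" ] || return 1   # must allow creating files inside
    [ -z "$(ls -A "$d")" ] || return 1       # must be empty (dot files count)
    return 0
}
```

The check deliberately never looks at the parent directory, since in this scenario the postgres user has no rights there.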

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL


Re: initdb change

From
Andrew Dunstan
Date:

Robert Treat wrote:
> I would have thought the place you need this is where you have SAs who set up
> a machine, creating a $PGDATA and $PGDATA/xlog on separate mountpoints where
> the postgres user has full rights to use those directories, but not to create
> directories in those locations. In that scenario, the DBA couldn't create the
> directories even if he wanted to, so allowing initdb to use an existing
> directory would be helpful.
>
>   

As I have already pointed out in this thread, the allegation that you 
cannot use an existing directory is false.

See below for proof.

cheers

andrew

[andrew@constanza inst.8.3.5707]$ sudo mkdir /bk/xl
[andrew@constanza inst.8.3.5707]$ sudo chown andrew:andrew /bk/xl
[andrew@constanza inst.8.3.5707]$ bin/initdb -X /bk/xl blurfl
The files belonging to this database system will be owned by user "andrew".
This user must also own the server process.

The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".

creating directory blurfl ... ok
fixing permissions on existing directory /bk/xl ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers/max_fsm_pages ... 32MB/204800
creating configuration files ... ok
creating template1 database in blurfl/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

Success. You can now start the database server using:
   bin/postgres -D blurfl
or   bin/pg_ctl -D blurfl -l logfile start