Thread: tablespaces inside $PGDATA considered harmful

tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
I think everyone who has read this mailing list for a while is
probably already aware of this problem.  When you create a tablespace
somewhere inside the data directory, weird things happen. If you
pg_upgrade and then incautiously run the delete_old_cluster.sh script
thus created, you will blow away large chunks of your data.[1]  If you
try to use pg_basebackup, it will back up your data twice and maybe
throw some warnings.[2]  You can also induce pg_database_size() to
give wrong results --- it'll count pg_tblspace/$TABLESPACE_OID as well
as pg_tblspace/some-stupid-tablespace-name, the former being a symlink
to the latter.

Given all this, it seems like a good idea to at least give a warning
if somebody tries to create a tablespace instead the data directory.
Arguably, we should prohibit it altogether, but there are obviously
people that want to do it, and there could even be somewhat valid
reasons for that, like wanting to set per-tablespace settings
differently for different tablespaces.  Possibly we should prohibit it
anyway, or maybe there should be an option to create a tablespace
whose directory is a real directory, not a symlink.  So then:

CREATE TABLESPACE foo LOCATION '/home/rhaas/pgdata/pg_tblspc/foo';

...would fail, but if you really want a separate tablespace inside the
data directory, we could allow:

CREATE TABLESPACE foo NO LOCATION;

...which would just create a bare directory where the symlink would normally go.

In the short term, I favor just adding a warning, so that people get
some clue that they are doing something that might be a bad idea.  In
the long term, we might want to do more.  Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.postgresql.org/message-id/B6F6FD62F2624C4C9916AC0175D56D880CE46DB7@jenmbs01.ad.intershop.net
[2] http://www.postgresql.org/message-id/CABUevExkhE+KcQa+fLUeaizP5i5QVCnNjZ2j0ZZQcaMJFheQDw@mail.gmail.com



Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Fri, Jan 30, 2015 at 11:12:43AM -0500, Robert Haas wrote:
> I think everyone who has read this mailing list for a while is
> probably already aware of this problem.  When you create a tablespace
> somewhere inside the data directory, weird things happen. If you
> pg_upgrade and then incautiously run the delete_old_cluster.sh script
> thus created, you will blow away large chunks of your data.[1]  If you

pg_upgrade doesn't create the deletion script in this case, and warns
the user:
       Could not create a script to delete the old cluster's data       files because user-defined tablespaces exist in
theold cluster       directory.  The old cluster's contents must be deleted manually.
 

> In the short term, I favor just adding a warning, so that people get
> some clue that they are doing something that might be a bad idea.  In
> the long term, we might want to do more.  Thoughts?

Yes, good idea.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: tablespaces inside $PGDATA considered harmful

From
"Joshua D. Drake"
Date:
On 01/30/2015 08:19 AM, Bruce Momjian wrote:
>
> On Fri, Jan 30, 2015 at 11:12:43AM -0500, Robert Haas wrote:
>> I think everyone who has read this mailing list for a while is
>> probably already aware of this problem.  When you create a tablespace
>> somewhere inside the data directory, weird things happen. If you
>> pg_upgrade and then incautiously run the delete_old_cluster.sh script
>> thus created, you will blow away large chunks of your data.[1]  If you
>
> pg_upgrade doesn't create the deletion script in this case, and warns
> the user:
>
>          Could not create a script to delete the old cluster's data
>          files because user-defined tablespaces exist in the old cluster
>          directory.  The old cluster's contents must be deleted manually.
>
>> In the short term, I favor just adding a warning, so that people get
>> some clue that they are doing something that might be a bad idea.  In
>> the long term, we might want to do more.  Thoughts?
>
> Yes, good idea.

Uhm, wouldn't it be a rather simple patch to say:

if tablespace_create() in $PGDATA:  ERROR!

?

I mean yes a warning is good but it is after the fact, the tablespace is 
already created. We know that tablespaces in $PGDATA are a bad idea, why 
not protect the user?

JD


>


-- 
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should             not be surprised when they come back as
Romans."



Re: tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
On Fri, Jan 30, 2015 at 11:43 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
> I mean yes a warning is good but it is after the fact, the tablespace is
> already created. We know that tablespaces in $PGDATA are a bad idea, why not
> protect the user?

Please go back and read the discussion of that option in the OP.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: tablespaces inside $PGDATA considered harmful

From
David Steele
Date:
On 1/30/15 11:43 AM, Joshua D. Drake wrote:
> On 01/30/2015 08:19 AM, Bruce Momjian wrote:
>>
>> On Fri, Jan 30, 2015 at 11:12:43AM -0500, Robert Haas wrote:
>>> I think everyone who has read this mailing list for a while is
>>> probably already aware of this problem.  When you create a tablespace
>>> somewhere inside the data directory, weird things happen. If you
>>> pg_upgrade and then incautiously run the delete_old_cluster.sh script
>>> thus created, you will blow away large chunks of your data.[1]  If you
>>
>> pg_upgrade doesn't create the deletion script in this case, and warns
>> the user:
>>
>>          Could not create a script to delete the old cluster's data
>>          files because user-defined tablespaces exist in the old cluster
>>          directory.  The old cluster's contents must be deleted
>> manually.
>>
>>> In the short term, I favor just adding a warning, so that people get
>>> some clue that they are doing something that might be a bad idea.  In
>>> the long term, we might want to do more.  Thoughts?
>>
>> Yes, good idea.
>
> Uhm, wouldn't it be a rather simple patch to say:
>
> if tablespace_create() in $PGDATA:
>   ERROR!
>
> ?
>
> I mean yes a warning is good but it is after the fact, the tablespace
> is already created. We know that tablespaces in $PGDATA are a bad
> idea, why not protect the user?

I would be in favor of an error.  It would then be OK for basebackup,
pg_upgrade, and friends to error when a tablespace lives in $PGDATA,
rather than trying to deal with the situation in strange ways.

If the user really wants tablespaces in $PGDATA they can always change
the links manually in the filesystem and deal with any consequences on
their own.

--
- David Steele
david@pgmasters.net



Re: tablespaces inside $PGDATA considered harmful

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> Given all this, it seems like a good idea to at least give a warning
> if somebody tries to create a tablespace instead the data directory.

A warning seems like a good idea.  I actually thought we *did* prevent
it..

> Arguably, we should prohibit it altogether, but there are obviously
> people that want to do it, and there could even be somewhat valid
> reasons for that, like wanting to set per-tablespace settings
> differently for different tablespaces.  Possibly we should prohibit it
> anyway, or maybe there should be an option to create a tablespace
> whose directory is a real directory, not a symlink.  So then:
>
> CREATE TABLESPACE foo LOCATION '/home/rhaas/pgdata/pg_tblspc/foo';
>
> ...would fail, but if you really want a separate tablespace inside the
> data directory, we could allow:
>
> CREATE TABLESPACE foo NO LOCATION;
>
> ...which would just create a bare directory where the symlink would normally go.

I actually really like this 'NO LOCATION' idea.  Are there reasons why
that would be difficult or ill-advised to do?

I could see the NO LOCATION approach being useful for migrating between
systems, in particular, or a way to have pg_basebackup work that doesn't
involve having to actually map all the tablespaces...
Thanks!
    Stephen

Re: tablespaces inside $PGDATA considered harmful

From
David G Johnston
Date:
Robert Haas wrote
> Arguably, we should prohibit it altogether, but there are obviously
> people that want to do it, and there could even be somewhat valid
> reasons for that, 

Lots of hand-waving here and it is just as likely they simply are not aware
of the downsides and the only reason they put it is $PGDATA is that
it seemed like a logical place to put a directory that is intended to hold
database data.


> like wanting to set per-tablespace settings differently for different
> tablespaces.

I do not follow where this has anything to do with the location of the
physical tablespace directory?


> Possibly we should prohibit it
> anyway, or maybe there should be an option to create a tablespace
> whose directory is a real directory, not a symlink.  So then:
> 
> CREATE TABLESPACE foo LOCATION '/home/rhaas/pgdata/pg_tblspc/foo';
> 
> ...would fail, but if you really want a separate tablespace inside the
> data directory, we could allow:
> 
> CREATE TABLESPACE foo NO LOCATION;
> 
> ...which would just create a bare directory where the symlink would
> normally go.

CREATE TABLE foo LOCATION INTERNAL

The creators of tablespaces seem to have envisioned their usage as a means
of pulling in
disparate file systems and not simply for namespaces within the main
filesystem
that $PGDATA exists on.

This seems arbitrary and while the internal location specification likely
doesn't buy one much in terms of real options it doesn't seem like it has
any serious downsides either.


> In the short term, I favor just adding a warning, so that people get
> some clue that they are doing something that might be a bad idea.  In
> the long term, we might want to do more.  Thoughts?

If this is intended to be back-patched then I'd go with just a warning.  If
this is strictly 9.5 material then I'd say that since our own tools behave
badly in the current situation we should simply outright disallow it.  In
either case we should consider what tools we can provide to detect the
now-illegal configuration and, during pg_upgrade, configure the new cluster
to adhere to the correct configuration or help the user migrate their
internalized tablespaces to a different part of their filesystem.

Writing this I ponder the situation that someone would mount a different
file system directly under $PGDATA so that they get both benefits - single
parent and the different properties of the filesystems they are using.  If
we force "Internal" to be in the same location as the default tablespace we
only accomplish half of their goals.

David J.



--
View this message in context:
http://postgresql.nabble.com/tablespaces-inside-PGDATA-considered-harmful-tp5836161p5836180.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: tablespaces inside $PGDATA considered harmful

From
Josh Berkus
Date:
On 01/30/2015 09:19 AM, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Given all this, it seems like a good idea to at least give a warning
>> if somebody tries to create a tablespace instead the data directory.
> 
> A warning seems like a good idea.  I actually thought we *did* prevent
> it..
> 
>> Arguably, we should prohibit it altogether, but there are obviously
>> people that want to do it, and there could even be somewhat valid
>> reasons for that, like wanting to set per-tablespace settings
>> differently for different tablespaces.  Possibly we should prohibit it
>> anyway, or maybe there should be an option to create a tablespace
>> whose directory is a real directory, not a symlink.  So then:
>>
>> CREATE TABLESPACE foo LOCATION '/home/rhaas/pgdata/pg_tblspc/foo';
>>
>> ...would fail, but if you really want a separate tablespace inside the
>> data directory, we could allow:
>>
>> CREATE TABLESPACE foo NO LOCATION;
>>
>> ...which would just create a bare directory where the symlink would normally go.
> 
> I actually really like this 'NO LOCATION' idea.  Are there reasons why
> that would be difficult or ill-advised to do?
> 
> I could see the NO LOCATION approach being useful for migrating between
> systems, in particular, or a way to have pg_basebackup work that doesn't
> involve having to actually map all the tablespaces...

I like this idea too.  And it would make tablespaces more manageable for
people who are using them for reasons other than putting them on
different disks.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: tablespaces inside $PGDATA considered harmful

From
Josh Berkus
Date:
Robert, Stephen, etc.:

Apparently you can create a tablespace in the tablespace directory:

josh=# create tablespace tbl location '/home/josh/pg94/data/pg_tblspc/';
CREATE TABLESPACE
josh=# create table test_tbl ( test text ) tablespace tbl;
CREATE TABLE
josh=# \q
josh@Radegast:~/pg94/data/pg_tblspc$ ls
17656  PG_9.4_201409291
josh@Radegast:~/pg94/data/pg_tblspc$ ls -l
total 4
lrwxrwxrwx 1 josh josh   30 Jan 30 13:02 17656 ->
/home/josh/pg94/data/pg_tblspc
drwx------ 3 josh josh 4096 Jan 30 13:02 PG_9.4_201409291
josh@Radegast:~/pg94/data/pg_tblspc$

In theory if I could guess the next OID, I could cause a failure there,
but that appears to be obscure enough to be not worth bothering about.

What is a real problem is that we don't block creating tablespaces
anywhere at all, including in obviously problematic places like the
transaction log directory:

josh=# create tablespace tbl2 location '/home/josh/pg94/data/pg_xlog/';
CREATE TABLESPACE

It really seems like we ought to block *THAT*.  Of course, if we block
tablespace creation in PGDATA generally, then that's covered.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: tablespaces inside $PGDATA considered harmful

From
Marc Mamin
Date:
> it is just as likely they simply are not aware
> of the downsides and the only reason they put it is $PGDATA is that
> it seemed like a logical place to put a directory that is intended to hold
> database data.

Yes, this is the reason why we got in this issue. The name PGDATA is misleading.

> The creators of tablespaces seem to have envisioned their usage as a means
> of pulling in disparate file systems and not simply for namespaces within the main
> filesystem that $PGDATA exists on.

true too. We have a lot of tablespaces. I'd probably won't go that way by now, but it still has the advantage to help
quicklymove parts of the data to  manage filesystem usage. 

> Given all this, it seems like a good idea to at least give a warning
> if somebody tries to create a tablespace instead the data directory.

IMHO the first place to put a warning is within the
documentation:http://www.postgresql.org/docs/9.4/interactive/manage-ag-tablespaces.htmlandpossibly a crosslink in
http://www.postgresql.org/docs/9.4/interactive/sql-createtablespace.html
>If this is intended to be back-patched then I'd go with just a warning. If
>this is strictly 9.5 material then I'd say that since our own tools behave
>badly in the current situation we should simply outright disallow it.

We have a lot of maintenance scripts that rely on our architecture
($PGDADAT -> symlinks -> tablespace locations).
We already made a quick evaluation on how to fix this, but gave it up
for now due to the work amount.
So please be cautious about disallowing it too abruptly.
Back-patching a change that disallow our current architecture could prevent us
to apply minor releases for a while...

regards,

Marc Mamin


Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Fri, Jan 30, 2015 at 01:26:22PM -0800, Josh Berkus wrote:
> Robert, Stephen, etc.:
>
> Apparently you can create a tablespace in the tablespace directory:
>
> josh=# create tablespace tbl location '/home/josh/pg94/data/pg_tblspc/';
> CREATE TABLESPACE
> josh=# create table test_tbl ( test text ) tablespace tbl;
> CREATE TABLE
> josh=# \q
> josh@Radegast:~/pg94/data/pg_tblspc$ ls
> 17656  PG_9.4_201409291
> josh@Radegast:~/pg94/data/pg_tblspc$ ls -l
> total 4
> lrwxrwxrwx 1 josh josh   30 Jan 30 13:02 17656 ->
> /home/josh/pg94/data/pg_tblspc
> drwx------ 3 josh josh 4096 Jan 30 13:02 PG_9.4_201409291
> josh@Radegast:~/pg94/data/pg_tblspc$
>
> In theory if I could guess the next OID, I could cause a failure there,
> but that appears to be obscure enough to be not worth bothering about.
>
> What is a real problem is that we don't block creating tablespaces
> anywhere at all, including in obviously problematic places like the
> transaction log directory:
>
> josh=# create tablespace tbl2 location '/home/josh/pg94/data/pg_xlog/';
> CREATE TABLESPACE
>
> It really seems like we ought to block *THAT*.  Of course, if we block
> tablespace creation in PGDATA generally, then that's covered.

I have developed the attached patch to warn about creating tablespaces
inside the data directory.  The case this doesn't catch is referencing a
symbolic link that points to the same directory.  We can't make it an
error so people can use pg_upgrade these setups.  This would be for 9.5
only.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +

Attachment

Re: tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
On Wed, Apr 22, 2015 at 10:41 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> What is a real problem is that we don't block creating tablespaces
>> anywhere at all, including in obviously problematic places like the
>> transaction log directory:
>>
>> josh=# create tablespace tbl2 location '/home/josh/pg94/data/pg_xlog/';
>> CREATE TABLESPACE
>>
>> It really seems like we ought to block *THAT*.  Of course, if we block
>> tablespace creation in PGDATA generally, then that's covered.
>
> I have developed the attached patch to warn about creating tablespaces
> inside the data directory.  The case this doesn't catch is referencing a
> symbolic link that points to the same directory.  We can't make it an
> error so people can use pg_upgrade these setups.  This would be for 9.5
> only.

I think this is a good thing to do, but I sure wish we could go
further and block it completely.  That may require more thought than
we have time to put in at this stage of the release cycle, though, so
+1 for doing at least this much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Thu, Apr 23, 2015 at 09:13:52AM -0400, Robert Haas wrote:
> On Wed, Apr 22, 2015 at 10:41 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >> What is a real problem is that we don't block creating tablespaces
> >> anywhere at all, including in obviously problematic places like the
> >> transaction log directory:
> >>
> >> josh=# create tablespace tbl2 location '/home/josh/pg94/data/pg_xlog/';
> >> CREATE TABLESPACE
> >>
> >> It really seems like we ought to block *THAT*.  Of course, if we block
> >> tablespace creation in PGDATA generally, then that's covered.
> >
> > I have developed the attached patch to warn about creating tablespaces
> > inside the data directory.  The case this doesn't catch is referencing a
> > symbolic link that points to the same directory.  We can't make it an
> > error so people can use pg_upgrade these setups.  This would be for 9.5
> > only.
> 
> I think this is a good thing to do, but I sure wish we could go
> further and block it completely.  That may require more thought than
> we have time to put in at this stage of the release cycle, though, so
> +1 for doing at least this much.

OK, good.  Thinking to 9.6, I am not sure how we could throw an error
because we have allowed this in the past and pg_dump is going to be
restored with a raw SQL CREATE TABLESPACE command.  

We have had this type of problem before, but never resolved it.  We
almost need pg_dump to set a GUC variable telling the backend it is
restoring a dump and issue a warning, but throw an error if the same
command was issued outside of a pg_dump restore.  FYI, pg_upgrade
already throws a warning related to the non-creation of a delete script.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: tablespaces inside $PGDATA considered harmful

From
Andres Freund
Date:
On 2015-04-23 11:00:43 -0400, Bruce Momjian wrote:
> On Thu, Apr 23, 2015 at 09:13:52AM -0400, Robert Haas wrote:
> > I think this is a good thing to do, but I sure wish we could go
> > further and block it completely.  That may require more thought than
> > we have time to put in at this stage of the release cycle, though, so
> > +1 for doing at least this much.
> 
> OK, good.  Thinking to 9.6, I am not sure how we could throw an error
> because we have allowed this in the past and pg_dump is going to be
> restored with a raw SQL CREATE TABLESPACE command.  

We could just document that you need to pre-create the tablespace and
ignore the resulting error. This isn't going to affect too many people.

Greetings,

Andres Freund



Re: tablespaces inside $PGDATA considered harmful

From
Jim Nasby
Date:
On 4/22/15 9:41 PM, Bruce Momjian wrote:
> The case this doesn't catch is referencing a
> symbolic link that points to the same directory.  We can't make it an
> error so people can use pg_upgrade these setups.

Couldn't we make it an ERROR unless IsBinaryUpgrade?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Thu, Apr 23, 2015 at 05:05:14PM +0200, Andres Freund wrote:
> On 2015-04-23 11:00:43 -0400, Bruce Momjian wrote:
> > On Thu, Apr 23, 2015 at 09:13:52AM -0400, Robert Haas wrote:
> > > I think this is a good thing to do, but I sure wish we could go
> > > further and block it completely.  That may require more thought than
> > > we have time to put in at this stage of the release cycle, though, so
> > > +1 for doing at least this much.
> > 
> > OK, good.  Thinking to 9.6, I am not sure how we could throw an error
> > because we have allowed this in the past and pg_dump is going to be
> > restored with a raw SQL CREATE TABLESPACE command.  
> 
> We could just document that you need to pre-create the tablespace and
> ignore the resulting error. This isn't going to affect too many people.

This approach is going to cause any object in that tablespace to not
restore --- are we sure that enough people check for restore errors that
we will not have people losing data on a restore?

Also, the error is going to cause pg_upgrade to fail.  We could have
pg_upgrade --check detect these cases and force people to fix their
setups before they run pg_upgrade --- at least that would be consistent
with the pg_dump behavior.  Jim Nasby suggested throwing an error unless
IsBinaryUpgrade is set, and that would work, but it means we are
allowing such tablespaces to be upgraded using pg_upgrade only, which
seems kind of odd.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: tablespaces inside $PGDATA considered harmful

From
Andres Freund
Date:
On April 23, 2015 6:12:05 PM GMT+03:00, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>On 4/22/15 9:41 PM, Bruce Momjian wrote:
>> The case this doesn't catch is referencing a
>> symbolic link that points to the same directory.  We can't make it an
>> error so people can use pg_upgrade these setups.
>
>Couldn't we make it an ERROR unless IsBinaryUpgrade?

People still upgrade without pg upgrade.

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
On Thu, Apr 23, 2015 at 11:00 AM, Bruce Momjian <bruce@momjian.us> wrote:
>> > I have developed the attached patch to warn about creating tablespaces
>> > inside the data directory.  The case this doesn't catch is referencing a
>> > symbolic link that points to the same directory.  We can't make it an
>> > error so people can use pg_upgrade these setups.  This would be for 9.5
>> > only.
>>
>> I think this is a good thing to do, but I sure wish we could go
>> further and block it completely.  That may require more thought than
>> we have time to put in at this stage of the release cycle, though, so
>> +1 for doing at least this much.
>
> OK, good.  Thinking to 9.6, I am not sure how we could throw an error
> because we have allowed this in the past and pg_dump is going to be
> restored with a raw SQL CREATE TABLESPACE command.
>
> We have had this type of problem before, but never resolved it.  We
> almost need pg_dump to set a GUC variable telling the backend it is
> restoring a dump and issue a warning, but throw an error if the same
> command was issued outside of a pg_dump restore.  FYI, pg_upgrade
> already throws a warning related to the non-creation of a delete script.

Well, we've made backward-incompatible changes before.  Not to this
specific thing, but in general.  I don't think there's anything
preventing us from doing so here, except that we don't want to annoy
too many users.

I don't think the right solution is to add a GUC so that pg_dump
ignores this, and otherwise deny it.  It's bad if you do it as part of
a restore, and it's bad if you do it some other time, too.

What I'd recommend is that we add a GUC stupid_tablespaces=off.  If
you have done this in the past, and you want to upgrade (whether via
pg_dump or pg_upgrade) to a new release, you'll have to configure the
new cluster for stupid_tablespaces=on.  If you don't, you'll get an
error.  If you do, you'll get a warning.  That way, people can still
upgrade, but they have to set the GUC to make it work, so they'll be
clearly aware that they're doing something that is not recommended.

(Of course we might want to call the GUC something like other than
stupid_tablespaces, like allow_tablespaces_in_data_directory, but you
get the idea.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: tablespaces inside $PGDATA considered harmful

From
Andres Freund
Date:
On 2015-04-23 15:46:20 -0400, Robert Haas wrote:
> Well, we've made backward-incompatible changes before.  Not to this
> specific thing, but in general.  I don't think there's anything
> preventing us from doing so here, except that we don't want to annoy
> too many users.

I think the number of users that have done this, and haven't yet
(knowing or unknowningly) been bitten by it is pretty low. In that
scenario it seems much better to break compatibility given that it's
pretty easy to fix during restore (just precreate the tablespace).  It's
not something you have to retest a whole application for.

If you want to avoid that one error you can still do pg_dumpall
--globals, edit and run that script, and only then restore the the
actual databases.

Greetings,

Andres Freund



Re: tablespaces inside $PGDATA considered harmful

From
Jim Nasby
Date:
On 4/23/15 11:01 AM, Andres Freund wrote:
> On April 23, 2015 6:12:05 PM GMT+03:00, Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>> On 4/22/15 9:41 PM, Bruce Momjian wrote:
>>> The case this doesn't catch is referencing a
>>> symbolic link that points to the same directory.  We can't make it an
>>> error so people can use pg_upgrade these setups.
>>
>> Couldn't we make it an ERROR unless IsBinaryUpgrade?
>
> People still upgrade without pg upgrade.

Yes, but only after creating a brand new cluster from scratch, which 
would then disallow them from putting tablespaces in $PGDATA.

Or are you saying people do binary upgrades without pg_upgrade? I don't 
think we have any obligation to support that...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: tablespaces inside $PGDATA considered harmful

From
Andres Freund
Date:
On 2015-04-23 15:17:55 -0500, Jim Nasby wrote:
> Yes, but only after creating a brand new cluster from scratch, which would
> then disallow them from putting tablespaces in $PGDATA.

pg_dumpall output includes tablespaces.



Re: tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
On Thu, Apr 23, 2015 at 3:57 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-04-23 15:46:20 -0400, Robert Haas wrote:
>> Well, we've made backward-incompatible changes before.  Not to this
>> specific thing, but in general.  I don't think there's anything
>> preventing us from doing so here, except that we don't want to annoy
>> too many users.
>
> I think the number of users that have done this, and haven't yet
> (knowing or unknowningly) been bitten by it is pretty low. In that
> scenario it seems much better to break compatibility given that it's
> pretty easy to fix during restore (just precreate the tablespace).  It's
> not something you have to retest a whole application for.
>
> If you want to avoid that one error you can still do pg_dumpall
> --globals, edit and run that script, and only then restore the the
> actual databases.

But pg_upgrade automates all that, so you can't use pg_upgrade in that
case.  If we add a GUC as I suggested, you can still use pg_upgrade.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: tablespaces inside $PGDATA considered harmful

From
Andres Freund
Date:
On 2015-04-23 16:26:09 -0400, Robert Haas wrote:
> But pg_upgrade automates all that, so you can't use pg_upgrade in that
> case.  If we add a GUC as I suggested, you can still use pg_upgrade.

But we also have to live with data directories being in a shit state
forever onward. We won't really be able to remove the option
realistically.

It's not that hard to just move the tablespace out of the data directory
while the server. As long as you move it on the same partition, it's
even fast.

Greetings,

Andres Freund



Re: tablespaces inside $PGDATA considered harmful

From
David Steele
Date:
On 4/23/15 4:30 PM, Andres Freund wrote:
> On 2015-04-23 16:26:09 -0400, Robert Haas wrote:
>> But pg_upgrade automates all that, so you can't use pg_upgrade in that
>> case.  If we add a GUC as I suggested, you can still use pg_upgrade.
>
> But we also have to live with data directories being in a shit state
> forever onward. We won't really be able to remove the option
> realistically.
>
> It's not that hard to just move the tablespace out of the data directory
> while the server. As long as you move it on the same partition, it's
> even fast.

I agree.  It wouldn't be that hard to do a bit of directory manipulation
before upgrading - and that's only for the people who have put
tablespaces in $PGDATA.  I've never seen it before, but I have no doubt
that it happens.  I can see how it might make a weird sort of sense
depending on the level of experience.

--
- David Steele
david@pgmasters.net


Re: tablespaces inside $PGDATA considered harmful

From
Robert Haas
Date:
On Thu, Apr 23, 2015 at 4:30 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-04-23 16:26:09 -0400, Robert Haas wrote:
>> But pg_upgrade automates all that, so you can't use pg_upgrade in that
>> case.  If we add a GUC as I suggested, you can still use pg_upgrade.
>
> But we also have to live with data directories being in a shit state
> forever onward. We won't really be able to remove the option
> realistically.
>
> It's not that hard to just move the tablespace out of the data directory
> while the server. As long as you move it on the same partition, it's
> even fast.

OK, fair point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Wed, Apr 22, 2015 at 10:41:02PM -0400, Bruce Momjian wrote:
> > josh=# create tablespace tbl2 location '/home/josh/pg94/data/pg_xlog/';
> > CREATE TABLESPACE
> > 
> > It really seems like we ought to block *THAT*.  Of course, if we block
> > tablespace creation in PGDATA generally, then that's covered.
> 
> I have developed the attached patch to warn about creating tablespaces
> inside the data directory.  The case this doesn't catch is referencing a
> symbolic link that points to the same directory.  We can't make it an
> error so people can use pg_upgrade these setups.  This would be for 9.5
> only.

OK, based on later discussions, I have updated my 9.5 patch to have
pg_upgrade also display a warning (the warning will also appear in the
pg_upgrade logs, but I doubt the user will see it), e.g.:
Setting next OID for new cluster                            okSync data directory to disk
 okCreating script to analyze new cluster                      okWARNING:  user-defined tablespace locations should not
beinside the data directory, e.g. /u/pgsql.old/data/pg_tblspcUpgrade Complete----------------Optimizer statistics are
nottransferred by pg_upgrade so,once you start the new server, consider running:    ./analyze_new_cluster.shCould not
createa script to delete the old cluster's datafiles because user-defined tablespaces exist in the old
clusterdirectory. The old cluster's contents must be deleted manually.
 

This way, both pg_dump and pg_upgrade will issue warnings, though, of
course, those warnings can be ignored.  I am hopeful these two warnings
will be sufficient and we will not need make these errors, with the
possible inconvenience it will cause.  I am still afraid that someone
will ignore the new errors pg_dump would generate and lose data.  I just
don't remember enough cases where we threw new errors on _data_ restore.

Frankly, those using pg_upgrade already will have to move the old
tablespaces out of the old cluster if they ever want to delete those
clusters, so I am hopeful these additional warnings will help eliminate
this practice, which is already cumbersome and useless.  I am not
planning to revisit this for 9.6.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: tablespaces inside $PGDATA considered harmful

From
Bruce Momjian
Date:
On Fri, Apr 24, 2015 at 01:05:03PM -0400, Bruce Momjian wrote:
> This way, both pg_dump and pg_upgrade will issue warnings, though, of
> course, those warnings can be ignored.  I am hopeful these two warnings
> will be sufficient and we will not need make these errors, with the
> possible inconvenience it will cause.  I am still afraid that someone
> will ignore the new errors pg_dump would generate and lose data.  I just
> don't remember enough cases where we threw new errors on _data_ restore.
> 
> Frankly, those using pg_upgrade already will have to move the old
> tablespaces out of the old cluster if they ever want to delete those
> clusters, so I am hopeful these additional warnings will help eliminate
> this practice, which is already cumbersome and useless.  I am not
> planning to revisit this for 9.6.

Patch applied.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: [HACKERS] tablespaces inside $PGDATA considered harmful

From
Mark Kirkwood
Date:
On 29/04/15 09:35, Bruce Momjian wrote:

> On Fri, Apr 24, 2015 at 01:05:03PM -0400, Bruce Momjian wrote:
>> This way, both pg_dump and pg_upgrade will issue warnings, though, of
>> course, those warnings can be ignored.  I am hopeful these two warnings
>> will be sufficient and we will not need make these errors, with the
>> possible inconvenience it will cause.  I am still afraid that someone
>> will ignore the new errors pg_dump would generate and lose data.  I just
>> don't remember enough cases where we threw new errors on _data_ restore.
>>
>> Frankly, those using pg_upgrade already will have to move the old
>> tablespaces out of the old cluster if they ever want to delete those
>> clusters, so I am hopeful these additional warnings will help eliminate
>> this practice, which is already cumbersome and useless.  I am not
>> planning to revisit this for 9.6.
>

(resurrecting an old thread) I encountered this the other day, a 
customer had created tablespaces with directories inside 
$PGDATA/pg_tblspc. This is just pathalogical - e.g (v11 checkout with 
PGDATA=/data0/pgdata/11):

bench=# CREATE TABLESPACE space1 LOCATION 
'/data0/pgdata/11/pg_tblspc/space1';
WARNING:  tablespace location should not be inside the data directory
CREATE TABLESPACE
bench=# ALTER TABLE pgbench_accounts SET  TABLESPACE space1;
ALTER TABLE

Ok, so I've been warned:

$ pg_basebackup -D .
WARNING:  could not read symbolic link "pg_tblspc/space1": Invalid argument
pg_basebackup: directory "/data0/pgdata/11/pg_tblspc/space1" exists but 
is not empty
pg_basebackup: removing contents of data directory "."

So pg_basebackup is completely broken by this construction - should we 
not prohibit the creation of tablespace directories under $PGDATA (or at 
least $PGDATA/pg_tblspc) at this point?

regards

Mark




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] tablespaces inside $PGDATA considered harmful

From
Mark Kirkwood
Date:
On 26/09/17 20:44, Mark Kirkwood wrote:

>
>
> $ pg_basebackup -D .
> WARNING:  could not read symbolic link "pg_tblspc/space1": Invalid 
> argument
> pg_basebackup: directory "/data0/pgdata/11/pg_tblspc/space1" exists 
> but is not empty
> pg_basebackup: removing contents of data directory "."
>

Err - actually this example is wrong - sorry. In fact pg_basebackup is 
complaining because it does not want to overwrite the contents of the 
tablespace (need to use the -T option as I'm on the same host)!

A correct example of pg_basebackup failing due to tablespaces inside 
$PGDATA/pg_tblspc can be easily demonstrated by trying to set up 
streaming replication on another host:

$ pg_basebackup -h 10.0.119.100 -P -D .
WARNING:  could not read symbolic link "pg_tblspc/space1": Invalid argument
pg_basebackup: could not create directory "./pg_tblspc": File exists

Fortunately this can be worked around by changing to tar format:

$ pg_basebackup -h 10.0.119.100 -Ft -P -D .
WARNING:  could not read symbolic link "pg_tblspc/space1": Invalid argument
1560632/1560632 kB (100%), 2/2 tablespaces

...however, not that great that the plain mode is busted.

regards

Mark




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers