Thread: security labels on databases are bad for dump & restore

security labels on databases are bad for dump & restore

From
Andres Freund
Date:
Hi,

pg_dump dumps security labels on databases. Which makes sense. The
problem is that they're dumped including the database name.

Which means that if you dump a database and restore it into a
differently named one you'll either get a failure because the database
does not exist, or worse you'll update the label of the wrong database.

So I think we need CURRENT_DATABASE (or similar) support for security
labels on databases.

I won't have time to do anything about this anytime soon, but I think we
should fix that at some point.  Shall I put this on the todo? Or do we
want to create an 'open items' page that's not major version specific?

Greetings,

Andres Freund



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Fri, Jul 10, 2015 at 7:57 AM, Andres Freund <andres@anarazel.de> wrote:
> pg_dump dumps security labels on databases. Which makes sense. The
> problem is that they're dumped including the database name.
>
> Which means that if you dump a database and restore it into a
> differently named one you'll either get a failure because the database
> does not exist, or worse you'll update the label of the wrong database.
>
> So I think we need CURRENT_DATABASE (or similar) support for security
> labels on databases.
>
> I won't have time to do anything about this anytime soon, but I think we
> should fix that at some point.  Shall I put this on the todo? Or do we
> want to create an 'open items' page that's not major version specific?

I think adding it to the TODO would be great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Adam Brightwell
Date:
All,

>> I won't have time to do anything about this anytime soon, but I think we
>> should fix that at some point.  Shall I put this on the todo? Or do we
>> want to create an 'open items' page that's not major version specific?
>
> I think adding it to the TODO would be great.

I'd be willing to look/dive into this one further.

-Adam

-- 
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com



Re: security labels on databases are bad for dump & restore

From
Ted Toth
Date:
I'm sort of new to this so maybe I'm missing something but since the
sepgsql SELinux userspace object manager was never integrated into
postgresql (AFAIK KaiGais branch was never merged into the mainline)
who uses these labels? What use are they?

Ted

On Tue, Jul 14, 2015 at 12:09 PM, Adam Brightwell
<adam.brightwell@crunchydatasolutions.com> wrote:
> All,
>
>>> I won't have time to do anything about this anytime soon, but I think we
>>> should fix that at some point.  Shall I put this on the todo? Or do we
>>> want to create an 'open items' page that's not major version specific?
>>
>> I think adding it to the TODO would be great.
>
> I'd be willing to look/dive into this one further.
>
> -Adam
>
> --
> Adam Brightwell - adam.brightwell@crunchydatasolutions.com
> Database Engineer - www.crunchydatasolutions.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
> I'm sort of new to this so maybe I'm missing something but since the
> sepgsql SELinux userspace object manager was never integrated into
> postgresql (AFAIK KaiGais branch was never merged into the mainline)
> who uses these labels? What use are they?

See contrib/sepgsql

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Ted Toth
Date:
That's exactly what I'm talking about like I said KaiGais branch was
never merged into the mainline so I do not believe that it is used at
all.

Ted

On Tue, Jul 14, 2015 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
>> I'm sort of new to this so maybe I'm missing something but since the
>> sepgsql SELinux userspace object manager was never integrated into
>> postgresql (AFAIK KaiGais branch was never merged into the mainline)
>> who uses these labels? What use are they?
>
> See contrib/sepgsql
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Kohei KaiGai
Date:
2015-07-15 2:39 GMT+09:00 Ted Toth <txtoth@gmail.com>:
> That's exactly what I'm talking about like I said KaiGais branch was
> never merged into the mainline so I do not believe that it is used at
> all.
>
It depends on the definition of "integrated".
The PostgreSQL core offers an infrastructure for label based security
mechanism, not only selinux. Also, one extension module that is
usually distributed with PosgreSQL bridges the world of database and
the world of selinux (even though all the features I initially designed
are not yet implemented). I like to say it is integrated.

> On Tue, Jul 14, 2015 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
>>> I'm sort of new to this so maybe I'm missing something but since the
>>> sepgsql SELinux userspace object manager was never integrated into
>>> postgresql (AFAIK KaiGais branch was never merged into the mainline)
>>> who uses these labels? What use are they?
>>
>> See contrib/sepgsql
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company



-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>



Re: security labels on databases are bad for dump & restore

From
Ted Toth
Date:
So if I label a table with an SELinux context and the type of my
client connection does not have policy to be able to access the table
type will an AVC be generated and the access denied?

Ted

On Tue, Jul 14, 2015 at 12:53 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> 2015-07-15 2:39 GMT+09:00 Ted Toth <txtoth@gmail.com>:
>> That's exactly what I'm talking about like I said KaiGais branch was
>> never merged into the mainline so I do not believe that it is used at
>> all.
>>
> It depends on the definition of "integrated".
> The PostgreSQL core offers an infrastructure for label based security
> mechanism, not only selinux. Also, one extension module that is
> usually distributed with PosgreSQL bridges the world of database and
> the world of selinux (even though all the features I initially designed
> are not yet implemented). I like to say it is integrated.
>
>> On Tue, Jul 14, 2015 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
>>>> I'm sort of new to this so maybe I'm missing something but since the
>>>> sepgsql SELinux userspace object manager was never integrated into
>>>> postgresql (AFAIK KaiGais branch was never merged into the mainline)
>>>> who uses these labels? What use are they?
>>>
>>> See contrib/sepgsql
>>>
>>> --
>>> Robert Haas
>>> EnterpriseDB: http://www.enterprisedb.com
>>> The Enterprise PostgreSQL Company
>
>
>
> --
> KaiGai Kohei <kaigai@kaigai.gr.jp>



Re: security labels on databases are bad for dump & restore

From
Kouhei Kaigai
Date:
> So if I label a table with an SELinux context and the type of my
> client connection does not have policy to be able to access the table
> type will an AVC be generated and the access denied?
>
Of course, it depends on the policy of the system.

If client connection come from none-SELinux system, use netlabelctl
to configure default fallback security context. It gives getpeercon(3)
the client label shall be applied when netlabel is not configured on
the connection.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ted Toth
> Sent: Wednesday, July 15, 2015 2:59 AM
> To: Kohei KaiGai
> Cc: Robert Haas; Adam Brightwell; Andres Freund; pgsql-hackers@postgresql.org;
> Alvaro Herrera
> Subject: Re: [HACKERS] security labels on databases are bad for dump & restore
> 
> So if I label a table with an SELinux context and the type of my
> client connection does not have policy to be able to access the table
> type will an AVC be generated and the access denied?
> 
> Ted
> 
> On Tue, Jul 14, 2015 at 12:53 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> > 2015-07-15 2:39 GMT+09:00 Ted Toth <txtoth@gmail.com>:
> >> That's exactly what I'm talking about like I said KaiGais branch was
> >> never merged into the mainline so I do not believe that it is used at
> >> all.
> >>
> > It depends on the definition of "integrated".
> > The PostgreSQL core offers an infrastructure for label based security
> > mechanism, not only selinux. Also, one extension module that is
> > usually distributed with PosgreSQL bridges the world of database and
> > the world of selinux (even though all the features I initially designed
> > are not yet implemented). I like to say it is integrated.
> >
> >> On Tue, Jul 14, 2015 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>> On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
> >>>> I'm sort of new to this so maybe I'm missing something but since the
> >>>> sepgsql SELinux userspace object manager was never integrated into
> >>>> postgresql (AFAIK KaiGais branch was never merged into the mainline)
> >>>> who uses these labels? What use are they?
> >>>
> >>> See contrib/sepgsql
> >>>
> >>> --
> >>> Robert Haas
> >>> EnterpriseDB: http://www.enterprisedb.com
> >>> The Enterprise PostgreSQL Company
> >
> >
> >
> > --
> > KaiGai Kohei <kaigai@kaigai.gr.jp>
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Re: security labels on databases are bad for dump & restore

From
Ted Toth
Date:
That doesn't answer my question. I'm talking about a client and server
running on the same system with SELinux MLS policy so that getpeercon
will return the context of the client process unless it has explicitly
sets the socket create context . So again will postgresql if the
sepgsql module is loaded call a function in sepgsql to compute the
access vector for the source (getpeercon label) contexts access to the
target context (tables context set by SECURITY LABEL) and fail the
operation generating an AVC if access is denied because there is no
policy?

On Tue, Jul 14, 2015 at 8:35 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> So if I label a table with an SELinux context and the type of my
>> client connection does not have policy to be able to access the table
>> type will an AVC be generated and the access denied?
>>
> Of course, it depends on the policy of the system.
>
> If client connection come from none-SELinux system, use netlabelctl
> to configure default fallback security context. It gives getpeercon(3)
> the client label shall be applied when netlabel is not configured on
> the connection.
>
> Thanks,
> --
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
>
>> -----Original Message-----
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Ted Toth
>> Sent: Wednesday, July 15, 2015 2:59 AM
>> To: Kohei KaiGai
>> Cc: Robert Haas; Adam Brightwell; Andres Freund; pgsql-hackers@postgresql.org;
>> Alvaro Herrera
>> Subject: Re: [HACKERS] security labels on databases are bad for dump & restore
>>
>> So if I label a table with an SELinux context and the type of my
>> client connection does not have policy to be able to access the table
>> type will an AVC be generated and the access denied?
>>
>> Ted
>>
>> On Tue, Jul 14, 2015 at 12:53 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> > 2015-07-15 2:39 GMT+09:00 Ted Toth <txtoth@gmail.com>:
>> >> That's exactly what I'm talking about like I said KaiGais branch was
>> >> never merged into the mainline so I do not believe that it is used at
>> >> all.
>> >>
>> > It depends on the definition of "integrated".
>> > The PostgreSQL core offers an infrastructure for label based security
>> > mechanism, not only selinux. Also, one extension module that is
>> > usually distributed with PosgreSQL bridges the world of database and
>> > the world of selinux (even though all the features I initially designed
>> > are not yet implemented). I like to say it is integrated.
>> >
>> >> On Tue, Jul 14, 2015 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> >>> On Tue, Jul 14, 2015 at 1:22 PM, Ted Toth <txtoth@gmail.com> wrote:
>> >>>> I'm sort of new to this so maybe I'm missing something but since the
>> >>>> sepgsql SELinux userspace object manager was never integrated into
>> >>>> postgresql (AFAIK KaiGais branch was never merged into the mainline)
>> >>>> who uses these labels? What use are they?
>> >>>
>> >>> See contrib/sepgsql
>> >>>
>> >>> --
>> >>> Robert Haas
>> >>> EnterpriseDB: http://www.enterprisedb.com
>> >>> The Enterprise PostgreSQL Company
>> >
>> >
>> >
>> > --
>> > KaiGai Kohei <kaigai@kaigai.gr.jp>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers



Re: security labels on databases are bad for dump & restore

From
Kouhei Kaigai
Date:
> That doesn't answer my question. I'm talking about a client and server
> running on the same system with SELinux MLS policy so that getpeercon
> will return the context of the client process unless it has explicitly
> sets the socket create context . So again will postgresql if the
> sepgsql module is loaded call a function in sepgsql to compute the
> access vector for the source (getpeercon label) contexts access to the
> target context (tables context set by SECURITY LABEL) and fail the
> operation generating an AVC if access is denied because there is no
> policy?
>
Yes. You may see AVC denial/allowed message on PostgreSQL log, like:

LOG:  SELinux: allowed { create } scontext=unconfined_u:unconfined_r:unconfined_t:s0
tcontext=unconfined_u:object_r:sepgsql_table_t:s0tclass=db_table name="regtest_schema.regtest_table"
 

scontext comes from getpeercon(3),
tcontext comes from the configuration by SECURITY LABEL command.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-14 13:09:26 -0400, Adam Brightwell wrote:
> All,
> 
> >> I won't have time to do anything about this anytime soon, but I think we
> >> should fix that at some point.  Shall I put this on the todo? Or do we
> >> want to create an 'open items' page that's not major version specific?
> >
> > I think adding it to the TODO would be great.

Done. It's rather telling that it took me a fair while to find a spot in
the todo list where it fits...

> I'd be willing to look/dive into this one further.

Cool.


One thing worth mentioning is that arguably the problem is caused by the
fact that we're here emitting database level information in pg_dump,
normally only done for dumpall.

Greetings,

Andres Freund



Re: security labels on databases are bad for dump & restore

From
Alvaro Herrera
Date:
Andres Freund wrote:

> One thing worth mentioning is that arguably the problem is caused by the
> fact that we're here emitting database level information in pg_dump,
> normally only done for dumpall.

... the reason for which is probably the lack of CURRENT_DATABASE as a
database specifier.  It might make sense to add the rest of
database-level information to pg_dump output, when we get that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-15 12:04:40 +0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > One thing worth mentioning is that arguably the problem is caused by the
> > fact that we're here emitting database level information in pg_dump,
> > normally only done for dumpall.
> 
> ... the reason for which is probably the lack of CURRENT_DATABASE as a
> database specifier.  It might make sense to add the rest of
> database-level information to pg_dump output, when we get that.

I'm not sure. I mean, it's not that an odd idea to assign a label to a
database and then restore data into it, and expect the explicitly
assigned label to survive.  Not sure if there actually is a good
behaviour either way here :/



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Wed, Jul 15, 2015 at 11:08:53AM +0200, Andres Freund wrote:
> On 2015-07-15 12:04:40 +0300, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > One thing worth mentioning is that arguably the problem is caused by the
> > > fact that we're here emitting database level information in pg_dump,
> > > normally only done for dumpall.

Consistency with existing practice would indeed have pg_dump ignore
pg_shseclabel and have pg_dumpall reproduce its entries.

> > ... the reason for which is probably the lack of CURRENT_DATABASE as a
> > database specifier.  It might make sense to add the rest of
> > database-level information to pg_dump output, when we get that.
> 
> I'm not sure. I mean, it's not that an odd idea to assign a label to a
> database and then restore data into it, and expect the explicitly
> assigned label to survive.  Not sure if there actually is a good
> behaviour either way here :/

In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
make non-creating pg_dump reproduce none of those.  Moreover, I would enable
--create by default.  Restoring into a user-provided shell database is
specialized compared to reproducing a database from scratch.



Re: security labels on databases are bad for dump & restore

From
Adam Brightwell
Date:
> Consistency with existing practice would indeed have pg_dump ignore
> pg_shseclabel and have pg_dumpall reproduce its entries.

I think that makes sense, but what about other DATABASE level info
such as COMMENT?  Should that also be ignored by pg_dump as well?  I'm
specifically thinking of the discussion from the following thread:

http://www.postgresql.org/message-id/20150317172459.GM3636@alvh.no-ip.org

If COMMENT is included then why not SECURITY LABEL or others?

> In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
> from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
> make non-creating pg_dump reproduce none of those.

I think the bigger question is "Where is the line drawn between
pg_dump and pg_dumpall?".  At what point does one tool become the
other?

-Adam

-- 
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Mon, Jul 20, 2015 at 07:01:14PM -0400, Adam Brightwell wrote:
> > Consistency with existing practice would indeed have pg_dump ignore
> > pg_shseclabel and have pg_dumpall reproduce its entries.
> 
> I think that makes sense, but what about other DATABASE level info
> such as COMMENT?  Should that also be ignored by pg_dump as well?  I'm
> specifically thinking of the discussion from the following thread:
> 
> http://www.postgresql.org/message-id/20150317172459.GM3636@alvh.no-ip.org
> 
> If COMMENT is included then why not SECURITY LABEL or others?

In any given situation, we should indeed restore both pg_database comments and
pg_database security labels, or we should restore neither.  Restoring neither
is most consistent with history, but several people like the idea of restoring
both.  I won't mind either conclusion.

> > In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
> > from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
> > make non-creating pg_dump reproduce none of those.
> 
> I think the bigger question is "Where is the line drawn between
> pg_dump and pg_dumpall?".  At what point does one tool become the
> other?

That question may be too big for me.



Re: security labels on databases are bad for dump & restore

From
Alvaro Herrera
Date:
Noah Misch wrote:
> On Mon, Jul 20, 2015 at 07:01:14PM -0400, Adam Brightwell wrote:

> > I think that makes sense, but what about other DATABASE level info
> > such as COMMENT?  Should that also be ignored by pg_dump as well?  I'm
> > specifically thinking of the discussion from the following thread:
> > 
> > http://www.postgresql.org/message-id/20150317172459.GM3636@alvh.no-ip.org
> > 
> > If COMMENT is included then why not SECURITY LABEL or others?
> 
> In any given situation, we should indeed restore both pg_database comments and
> pg_database security labels, or we should restore neither.

Agreed.

> > > In a greenfield, I would make "pg_dump --create" reproduce pertinent entries
> > > from datacl, pg_db_role_setting, pg_shseclabel and pg_shdescription.  I would
> > > make non-creating pg_dump reproduce none of those.
> > 
> > I think the bigger question is "Where is the line drawn between
> > pg_dump and pg_dumpall?".  At what point does one tool become the
> > other?
> 
> That question may be too big for me.

I don't think there's any line near pg_dumpall.  That tool seems to
have grown out of desperation without much actual design.  I think it
makes more sense to plan around that's the best pg_dump behavior for the
various use cases.

I like Noah's proposal of having pg_dump --create reproduce all
database-level state.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: security labels on databases are bad for dump & restore

From
Adam Brightwell
Date:
> I don't think there's any line near pg_dumpall.  That tool seems to
> have grown out of desperation without much actual design.  I think it
> makes more sense to plan around that's the best pg_dump behavior for the
> various use cases.

Ok.

> I like Noah's proposal of having pg_dump --create reproduce all
> database-level state.

Should it be enabled by default?  If so, then wouldn't it make more
sense to call it --no-create and do the opposite?  So, --no-create
would exclude rather than include database-level information?  Would
enabling it by default cause issues with the current expected use of
the tool by end users?

How would this handle related global objects? It seems like this part
could get a little tricky.

Taking it one step further, would a --all option that dumps all
databases make sense as well?  Of course I know that's probably a
considerable undertaking and certainly beyond the current scope.
Though, I thought I'd throw it out there.

Also, I think this would potentially conflict with what Fabrízio is
doing with CURRENT_DATABASE on COMMENT, though, I think it might be a
preferable solution.

https://commitfest.postgresql.org/5/229/

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com



Re: security labels on databases are bad for dump & restore

From
Fabrízio de Royes Mello
Date:
<div dir="ltr"><div class="gmail_extra">On Wed, Jul 22, 2015 at 4:42 PM, Adam Brightwell <<a
href="mailto:adam.brightwell@crunchydatasolutions.com">adam.brightwell@crunchydatasolutions.com</a>>wrote:<br />>
<br/>> [...]<br />><br />> Also, I think this would potentially conflict with what Fabrízio is<br />> doing
withCURRENT_DATABASE on COMMENT, though, I think it might be a<br />> preferable solution.<br />><br />> <a
href="https://commitfest.postgresql.org/5/229/">https://commitfest.postgresql.org/5/229/</a><br/>><br /><br
/></div><divclass="gmail_extra">Unfortunately this code is a bit weird and will be better to move to the next
commitfest(I have no time to improve it yet), so we can join efforts and implement all ideas and make the reviewers
lifeeasier with a more consistency patch. <br /><br /></div><div class="gmail_extra">Seems reasonable?<br /></div><div
class="gmail_extra"><br/></div><div class="gmail_extra">Regards,<br /></div><div class="gmail_extra"><br />--<br
/>Fabríziode Royes Mello<br />Consultoria/Coaching PostgreSQL<br />>> Timbira: <a
href="http://www.timbira.com.br">http://www.timbira.com.br</a><br/>>> Blog: <a
href="http://fabriziomello.github.io">http://fabriziomello.github.io</a><br/>>> Linkedin: <a
href="http://br.linkedin.com/in/fabriziomello">http://br.linkedin.com/in/fabriziomello</a><br/>>> Twitter: <a
href="http://twitter.com/fabriziomello">http://twitter.com/fabriziomello</a><br/>>> Github: <a
href="http://github.com/fabriziomello">http://github.com/fabriziomello</a></div></div>

Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Wed, Jul 22, 2015 at 03:42:58PM -0400, Adam Brightwell wrote:
> > I like Noah's proposal of having pg_dump --create reproduce all
> > database-level state.
> 
> Should it be enabled by default?  If so, then wouldn't it make more
> sense to call it --no-create and do the opposite?  So, --no-create
> would exclude rather than include database-level information?  Would
> enabling it by default cause issues with the current expected use of
> the tool by end users?

While I'd favor optional --no-create if we were designing fresh, it's not
worth breaking user scripts by changing that now.

> How would this handle related global objects? It seems like this part
> could get a little tricky.

Like roles and tablespaces?  No need to change their treatment.

> Taking it one step further, would a --all option that dumps all
> databases make sense as well?  Of course I know that's probably a
> considerable undertaking and certainly beyond the current scope.

I agree it's outside the scope of fixing $subject.

Thanks,
nm



Re: security labels on databases are bad for dump & restore

From
Adam Brightwell
Date:
> While I'd favor optional --no-create if we were designing fresh, it's not
> worth breaking user scripts by changing that now.

Agreed.  So, --create would not be enabled by default.

>> How would this handle related global objects? It seems like this part
>> could get a little tricky.
>
> Like roles and tablespaces?  No need to change their treatment.

Yes, those.  Ok.

-Adam



-- 
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Wed, Jul 22, 2015 at 3:42 PM, Adam Brightwell
<adam.brightwell@crunchydatasolutions.com> wrote:
>> I don't think there's any line near pg_dumpall.  That tool seems to
>> have grown out of desperation without much actual design.  I think it
>> makes more sense to plan around that's the best pg_dump behavior for the
>> various use cases.
>
> Ok.
>
>> I like Noah's proposal of having pg_dump --create reproduce all
>> database-level state.
>
> Should it be enabled by default?  If so, then wouldn't it make more
> sense to call it --no-create and do the opposite?  So, --no-create
> would exclude rather than include database-level information?  Would
> enabling it by default cause issues with the current expected use of
> the tool by end users?

This seems a bit hairy to me.  If we want to transfer responsibility
for dumping this stuff from pg_dumpall to pg_dump, I have no problem
with that at all.  But doing it only when --create is specified seems
odd.  Then, does pg_dumpall -g dump it or not?  If it does, then we're
sorta dumping it in two places when --create is used.  If it doesn't,
then when --create is not used we're doing it nowhere.

I may be confused.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Thu, Jul 23, 2015 at 12:14:14PM -0400, Robert Haas wrote:
> On Wed, Jul 22, 2015 at 3:42 PM, Adam Brightwell <adam.brightwell@crunchydatasolutions.com> wrote:
> >> I like Noah's proposal of having pg_dump --create reproduce all
> >> database-level state.
> >
> > Should it be enabled by default?  If so, then wouldn't it make more
> > sense to call it --no-create and do the opposite?  So, --no-create
> > would exclude rather than include database-level information?  Would
> > enabling it by default cause issues with the current expected use of
> > the tool by end users?
> 
> This seems a bit hairy to me.  If we want to transfer responsibility
> for dumping this stuff from pg_dumpall to pg_dump, I have no problem
> with that at all.  But doing it only when --create is specified seems
> odd.  Then, does pg_dumpall -g dump it or not?

The principle I had in mind was to dump ACLs, pg_db_role_setting entries,
comments and security labels if and only if we emit a CREATE statement for the
object they modify.  That is already the rule for objects located inside
databases.  Since "pg_dumpall -g" does not emit CREATE DATABASE statements[1],
it would not dump these attributes of databases.

> If it does, then we're
> sorta dumping it in two places when --create is used.  If it doesn't,
> then when --create is not used we're doing it nowhere.

Yep.  Plain "pg_dump" dumps the contents of a database without dumping the
database itself.  I don't like that as a default, but we're stuck with it.


[1] "pg_dumpall -g --binary-upgrade" _does_ emit CREATE DATABASE statements,
so _it_ would dump these attributes.



Re: security labels on databases are bad for dump & restore

From
Craig Ringer
Date:
On 20 July 2015 at 01:18, Noah Misch <noah@leadboat.com> wrote:
> On Wed, Jul 15, 2015 at 11:08:53AM +0200, Andres Freund wrote:
>> On 2015-07-15 12:04:40 +0300, Alvaro Herrera wrote:
>> > Andres Freund wrote:
>> > > One thing worth mentioning is that arguably the problem is caused by the
>> > > fact that we're here emitting database level information in pg_dump,
>> > > normally only done for dumpall.
>
> Consistency with existing practice would indeed have pg_dump ignore
> pg_shseclabel and have pg_dumpall reproduce its entries.

Existing practice is pretty broken though, and not necessarily a good guide.

COMMENT ON DATABASE and SECURITY LABEL FOR DATABASE are dumped by
pg_dump, but always refer to the database's name at the time it was
dumped, so restoring it can break.

GRANTs on databases are ignored and not dumped by pg_dump or by
pg_dumpall --globals-only. The only way to dump them seems to be to
use pg_dumpall, which nobody uses in the real world.

I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
dumping them in pg_dump --create, and in pg_dump -Fc .

In practice I see zero real use of pg_dumpall without --globals-only,
and almost everyone does pg_dump -Fc . I'd like to see that method
case actually preserve the whole state of the system and do the right
thing sensibly.

A pg_restore option to skip database-level settings could be useful,
but I think by default they should be restored.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Sun, Jul 26, 2015 at 11:43 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 20 July 2015 at 01:18, Noah Misch <noah@leadboat.com> wrote:
>> On Wed, Jul 15, 2015 at 11:08:53AM +0200, Andres Freund wrote:
>>> On 2015-07-15 12:04:40 +0300, Alvaro Herrera wrote:
>>> > Andres Freund wrote:
>>> > > One thing worth mentioning is that arguably the problem is caused by the
>>> > > fact that we're here emitting database level information in pg_dump,
>>> > > normally only done for dumpall.
>>
>> Consistency with existing practice would indeed have pg_dump ignore
>> pg_shseclabel and have pg_dumpall reproduce its entries.
>
> Existing practice is pretty broken though, and not necessarily a good guide.
>
> COMMENT ON DATABASE and SECURITY LABEL FOR DATABASE are dumped by
> pg_dump, but always refer to the database's name at the time it was
> dumped, so restoring it can break.
>
> GRANTs on databases are ignored and not dumped by pg_dump or by
> pg_dumpall --globals-only. The only way to dump them seems to be to
> use pg_dumpall, which nobody uses in the real world.
>
> I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
> ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
> dumping them in pg_dump --create, and in pg_dump -Fc .
>
> In practice I see zero real use of pg_dumpall without --globals-only,
> and almost everyone does pg_dump -Fc . I'd like to see that method
> case actually preserve the whole state of the system and do the right
> thing sensibly.
>
> A pg_restore option to skip database-level settings could be useful,
> but I think by default they should be restored.

Yes, I think we should make restoring the database's properties the
job of pg_dump and remove it completely from pg_dumpall, unless we can
find a case where that's really going to break things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Jul 26, 2015 at 11:43 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> > On 20 July 2015 at 01:18, Noah Misch <noah@leadboat.com> wrote:
> >> On Wed, Jul 15, 2015 at 11:08:53AM +0200, Andres Freund wrote:
> >>> On 2015-07-15 12:04:40 +0300, Alvaro Herrera wrote:
> >>> > Andres Freund wrote:
> >>> > > One thing worth mentioning is that arguably the problem is caused by the
> >>> > > fact that we're here emitting database level information in pg_dump,
> >>> > > normally only done for dumpall.
> >>
> >> Consistency with existing practice would indeed have pg_dump ignore
> >> pg_shseclabel and have pg_dumpall reproduce its entries.
> >
> > Existing practice is pretty broken though, and not necessarily a good guide.
> >
> > COMMENT ON DATABASE and SECURITY LABEL FOR DATABASE are dumped by
> > pg_dump, but always refer to the database's name at the time it was
> > dumped, so restoring it can break.
> >
> > GRANTs on databases are ignored and not dumped by pg_dump or by
> > pg_dumpall --globals-only. The only way to dump them seems to be to
> > use pg_dumpall, which nobody uses in the real world.
> >
> > I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
> > ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
> > dumping them in pg_dump --create, and in pg_dump -Fc .
> >
> > In practice I see zero real use of pg_dumpall without --globals-only,
> > and almost everyone does pg_dump -Fc . I'd like to see that method
> > case actually preserve the whole state of the system and do the right
> > thing sensibly.
> >
> > A pg_restore option to skip database-level settings could be useful,
> > but I think by default they should be restored.
>
> Yes, I think we should make restoring the database's properties the
> job of pg_dump and remove it completely from pg_dumpall, unless we can
> find a case where that's really going to break things.

I believe that means, as discussed, that we'll need to support
"CURRENT_DATABASE" or similar for all database properties, but that
seems like a wholly good thing to do anyway, provided we can do so
without causing problems.

In other words, I agree.
Thanks!
    Stephen

Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-28 14:58:26 -0400, Robert Haas wrote:
> Yes, I think we should make restoring the database's properties the
> job of pg_dump and remove it completely from pg_dumpall, unless we can
> find a case where that's really going to break things.

CREATE DATABASE blarg;
SECURITY LABEL ON blarg IS 'noaccess';
ALTER DATABASE blarg SET default_tablespace = space_with_storage;
pg_restore
-> SECURITY LABEL ON blarg IS 'allow_access';
-> ALTER DATABASE blarg SET default_tablespace = space_without_storage;

That's probably not sufficient reasons not to go that way, but I do
think there's a bunch more issues like that.


At the very least all these need to be emitted as ALTER DATABASE
current_database ... et al. Otherwise it's impossible to rename
databases, which definitely would not be ok.

Andres



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Tue, Jul 28, 2015 at 3:03 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-28 14:58:26 -0400, Robert Haas wrote:
>> Yes, I think we should make restoring the database's properties the
>> job of pg_dump and remove it completely from pg_dumpall, unless we can
>> find a case where that's really going to break things.
>
> CREATE DATABASE blarg;
> SECURITY LABEL ON blarg IS 'noaccess';
> ALTER DATABASE blarg SET default_tablespace = space_with_storage;
> pg_restore
> -> SECURITY LABEL ON blarg IS 'allow_access';
> -> ALTER DATABASE blarg SET default_tablespace = space_without_storage;
>
> That's probably not sufficient reasons not to go that way, but I do
> think there's a bunch more issues like that.

Could you use some complete sentences to describe what the actual
issue is?  I can't make heads or tails of what you wrote there.

> At the very least all these need to be emitted as ALTER DATABASE
> current_database ... et al. Otherwise it's impossible to rename
> databases, which definitely would not be ok.

Yep, I think that's the plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-28 15:05:01 -0400, Robert Haas wrote:
> On Tue, Jul 28, 2015 at 3:03 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2015-07-28 14:58:26 -0400, Robert Haas wrote:
> >> Yes, I think we should make restoring the database's properties the
> >> job of pg_dump and remove it completely from pg_dumpall, unless we can
> >> find a case where that's really going to break things.
> >
> > CREATE DATABASE blarg;
> > SECURITY LABEL ON blarg IS 'noaccess';
> > ALTER DATABASE blarg SET default_tablespace = space_with_storage;
> > pg_restore
> > -> SECURITY LABEL ON blarg IS 'allow_access';
> > -> ALTER DATABASE blarg SET default_tablespace = space_without_storage;
> >
> > That's probably not sufficient reasons not to go that way, but I do
> > think there's a bunch more issues like that.
> 
> Could you use some complete sentences to describe what the actual
> issue is?  I can't make heads or tails of what you wrote there.

DBA creates a database and sets some properties (security labels, gucs,
acls) on it. Then goes on to restore a backup. Unfortunately that backup
might, or might not, overwrite the properties he configured depending on
whether the restored database already contains them and from which
version the backup originates.

Greetings,

Andres Freund



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-28 15:05:01 -0400, Robert Haas wrote:
>> On Tue, Jul 28, 2015 at 3:03 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2015-07-28 14:58:26 -0400, Robert Haas wrote:
>> >> Yes, I think we should make restoring the database's properties the
>> >> job of pg_dump and remove it completely from pg_dumpall, unless we can
>> >> find a case where that's really going to break things.
>> >
>> > CREATE DATABASE blarg;
>> > SECURITY LABEL ON blarg IS 'noaccess';
>> > ALTER DATABASE blarg SET default_tablespace = space_with_storage;
>> > pg_restore
>> > -> SECURITY LABEL ON blarg IS 'allow_access';
>> > -> ALTER DATABASE blarg SET default_tablespace = space_without_storage;
>> >
>> > That's probably not sufficient reasons not to go that way, but I do
>> > think there's a bunch more issues like that.
>>
>> Could you use some complete sentences to describe what the actual
>> issue is?  I can't make heads or tails of what you wrote there.
>
> DBA creates a database and sets some properties (security labels, gucs,
> acls) on it. Then goes on to restore a backup. Unfortunately that backup
> might, or might not, overwrite the properties he configured depending on
> whether the restored database already contains them and from which
> version the backup originates.

Well, I think that's just a potential incompatibility between 9.6 and
previous versions, and a relatively minor one at that.  We can't and
don't guarantee that a dump taken using the 9.3 version of pg_dump
will restore correctly on any server version except 9.3.  It might
work OK on a newer or older version, but then again it might not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-28 15:14:11 -0400, Robert Haas wrote:
> On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> > DBA creates a database and sets some properties (security labels, gucs,
> > acls) on it. Then goes on to restore a backup. Unfortunately that backup
> > might, or might not, overwrite the properties he configured depending on
> > whether the restored database already contains them and from which
> > version the backup originates.
> 
> Well, I think that's just a potential incompatibility between 9.6 and
> previous versions, and a relatively minor one at that.  We can't and
> don't guarantee that a dump taken using the 9.3 version of pg_dump
> will restore correctly on any server version except 9.3.  It might
> work OK on a newer or older version, but then again it might not.

Even within a single major version it'll be a bit confusing that one
time a restore yielded the desired result (previously set property
survives) and the next restore it doesn't, because now the backup does
contain the property.



Re: security labels on databases are bad for dump & restore

From
Josh Berkus
Date:
On 07/28/2015 11:58 AM, Robert Haas wrote:
> I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
>> ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
>> dumping them in pg_dump --create, and in pg_dump -Fc .
>>
>> In practice I see zero real use of pg_dumpall without --globals-only,
>> and almost everyone does pg_dump -Fc . I'd like to see that method
>> case actually preserve the whole state of the system and do the right
>> thing sensibly.
>>
>> A pg_restore option to skip database-level settings could be useful,
>> but I think by default they should be restored.

+++++1

Let's get rid of pg_dumpall -g.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: security labels on databases are bad for dump & restore

From
Stephen Frost
Date:
* Andres Freund (andres@anarazel.de) wrote:
> On 2015-07-28 15:14:11 -0400, Robert Haas wrote:
> > On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> > > DBA creates a database and sets some properties (security labels, gucs,
> > > acls) on it. Then goes on to restore a backup. Unfortunately that backup
> > > might, or might not, overwrite the properties he configured depending on
> > > whether the restored database already contains them and from which
> > > version the backup originates.
> >
> > Well, I think that's just a potential incompatibility between 9.6 and
> > previous versions, and a relatively minor one at that.  We can't and
> > don't guarantee that a dump taken using the 9.3 version of pg_dump
> > will restore correctly on any server version except 9.3.  It might
> > work OK on a newer or older version, but then again it might not.
>
> Even within a single major version it'll be a bit confusing that one
> time a restore yielded the desired result (previously set property
> survives) and the next restore it doesn't, because now the backup does
> contain the property.

I'm not sure that I agree with this at all- you might create one SSL
certificate after you install PG and then you use one of the various
utilities to restore a prior cluster and, blam, you get a different
certificate because that's what was in the backup.

I might see having an option to enable/disable restoring the database
level properies which exist inside a backup as that may be useful
flexibility, but I don't believe this concern should stop us from
including the database properties in the database backup.
Thanks!
    Stephen

Re: security labels on databases are bad for dump & restore

From
Alvaro Herrera
Date:
Josh Berkus wrote:
> On 07/28/2015 11:58 AM, Robert Haas wrote:
> > I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
> >> ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
> >> dumping them in pg_dump --create, and in pg_dump -Fc .
> >>
> >> In practice I see zero real use of pg_dumpall without --globals-only,
> >> and almost everyone does pg_dump -Fc . I'd like to see that method
> >> case actually preserve the whole state of the system and do the right
> >> thing sensibly.
> >>
> >> A pg_restore option to skip database-level settings could be useful,
> >> but I think by default they should be restored.
> 
> +++++1
> 
> Let's get rid of pg_dumpall -g.

Quite the opposite, I think --- let's get rid of pg_dumpall EXCEPT when
invoked as pg_dumpall -g.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Tue, Jul 28, 2015 at 3:16 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-28 15:14:11 -0400, Robert Haas wrote:
>> On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
>> > DBA creates a database and sets some properties (security labels, gucs,
>> > acls) on it. Then goes on to restore a backup. Unfortunately that backup
>> > might, or might not, overwrite the properties he configured depending on
>> > whether the restored database already contains them and from which
>> > version the backup originates.
>>
>> Well, I think that's just a potential incompatibility between 9.6 and
>> previous versions, and a relatively minor one at that.  We can't and
>> don't guarantee that a dump taken using the 9.3 version of pg_dump
>> will restore correctly on any server version except 9.3.  It might
>> work OK on a newer or older version, but then again it might not.
>
> Even within a single major version it'll be a bit confusing that one
> time a restore yielded the desired result (previously set property
> survives) and the next restore it doesn't, because now the backup does
> contain the property.

How would that happen?  We're not gonna back-patch this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Andres Freund
Date:
On 2015-07-28 15:27:51 -0400, Robert Haas wrote:
> On Tue, Jul 28, 2015 at 3:16 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2015-07-28 15:14:11 -0400, Robert Haas wrote:
> >> On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
> >> > DBA creates a database and sets some properties (security labels, gucs,
> >> > acls) on it. Then goes on to restore a backup. Unfortunately that backup
> >> > might, or might not, overwrite the properties he configured depending on
> >> > whether the restored database already contains them and from which
> >> > version the backup originates.
> >>
> >> Well, I think that's just a potential incompatibility between 9.6 and
> >> previous versions, and a relatively minor one at that.  We can't and
> >> don't guarantee that a dump taken using the 9.3 version of pg_dump
> >> will restore correctly on any server version except 9.3.  It might
> >> work OK on a newer or older version, but then again it might not.
> >
> > Even within a single major version it'll be a bit confusing that one
> > time a restore yielded the desired result (previously set property
> > survives) and the next restore it doesn't, because now the backup does
> > contain the property.
>
> How would that happen?  We're not gonna back-patch this.

Hm?  Let me try again: If the admin does a ALTER DATABASE ... SET guc =
... *before* restoring a backup and the backup does contain a setting
for the same guc, but with a different value it'll overwrite the
previous explicit action by the DBA without any warning.  If the backup
does *not* contain that guc the previous action survives.  That's
confusing, because you're more likely to be in the 'the backup does not
contain the guc' situation when testing where it thus will work.



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-07-28 15:27:51 -0400, Robert Haas wrote:
>> On Tue, Jul 28, 2015 at 3:16 PM, Andres Freund <andres@anarazel.de> wrote:
>> > On 2015-07-28 15:14:11 -0400, Robert Haas wrote:
>> >> On Tue, Jul 28, 2015 at 3:10 PM, Andres Freund <andres@anarazel.de> wrote:
>> >> > DBA creates a database and sets some properties (security labels, gucs,
>> >> > acls) on it. Then goes on to restore a backup. Unfortunately that backup
>> >> > might, or might not, overwrite the properties he configured depending on
>> >> > whether the restored database already contains them and from which
>> >> > version the backup originates.
>> >>
>> >> Well, I think that's just a potential incompatibility between 9.6 and
>> >> previous versions, and a relatively minor one at that.  We can't and
>> >> don't guarantee that a dump taken using the 9.3 version of pg_dump
>> >> will restore correctly on any server version except 9.3.  It might
>> >> work OK on a newer or older version, but then again it might not.
>> >
>> > Even within a single major version it'll be a bit confusing that one
>> > time a restore yielded the desired result (previously set property
>> > survives) and the next restore it doesn't, because now the backup does
>> > contain the property.
>>
>> How would that happen?  We're not gonna back-patch this.
>
> Hm?  Let me try again: If the admin does a ALTER DATABASE ... SET guc =
> ... *before* restoring a backup and the backup does contain a setting
> for the same guc, but with a different value it'll overwrite the
> previous explicit action by the DBA without any warning.  If the backup
> does *not* contain that guc the previous action survives.  That's
> confusing, because you're more likely to be in the 'the backup does not
> contain the guc' situation when testing where it thus will work.

True.  But I don't think modifying a database before restoring into it
is terribly supported.  Even pg_dump --clean, which is supposed to do
this sort of thing, doesn't seem to work terribly reliably.  We could
try to fix this by having a command like ALTER DATABASE ... RESET ALL
that we issue before restoring the settings, but I'm afraid that will
take us into all sorts of unreasonable scenarios that are better just
labeled as "don't do that".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Tue, Jul 28, 2015 at 03:36:13PM -0400, Robert Haas wrote:
> On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund <andres@anarazel.de> wrote:
> > Hm?  Let me try again: If the admin does a ALTER DATABASE ... SET guc =
> > ... *before* restoring a backup and the backup does contain a setting
> > for the same guc, but with a different value it'll overwrite the
> > previous explicit action by the DBA without any warning.  If the backup
> > does *not* contain that guc the previous action survives.  That's
> > confusing, because you're more likely to be in the 'the backup does not
> > contain the guc' situation when testing where it thus will work.
> 
> True.  But I don't think modifying a database before restoring into it
> is terribly supported.  Even pg_dump --clean, which is supposed to do
> this sort of thing, doesn't seem to work terribly reliably.  We could
> try to fix this by having a command like ALTER DATABASE ... RESET ALL
> that we issue before restoring the settings, but I'm afraid that will
> take us into all sorts of unreasonable scenarios that are better just
> labeled as "don't do that".

Andres's example is a harbinger of the semantic morass ahead.  Excepting
database objects and the "public" schema object, pg_dump and pg_dumpall mutate
only the objects they CREATE.  They consistently restore object properties
(owner, ACLs, security label, etc.) if and only if issuing a CREATE statement
for the object.  For example, restoring objects contained in a schema without
restoring the schema itself changes none of those schema properties.  pg_dump
and pg_dumpall have mostly followed that rule for databases, too, but they
depart from it for comment and security label.  That was a mistake.  We can't
in general mutate an existing database to match, because we can't mutate the
encoding, datcollate or datctype.  Even discounting that problem, I value
consistency with the rest of the dumpable object types.

I suppose many folks using pg_dump without --create think of it as a dump of a
database itself, not a dump of a database's contents.  They would benefit from
a dump that mutates the target database to match the properties of the source
database.  Helping those folks does not offset the aforementioned losses.

Thanks,
nm



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Wed, Jul 29, 2015 at 12:39 AM, Noah Misch <noah@leadboat.com> wrote:
> On Tue, Jul 28, 2015 at 03:36:13PM -0400, Robert Haas wrote:
>> On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund <andres@anarazel.de> wrote:
>> > Hm?  Let me try again: If the admin does a ALTER DATABASE ... SET guc =
>> > ... *before* restoring a backup and the backup does contain a setting
>> > for the same guc, but with a different value it'll overwrite the
>> > previous explicit action by the DBA without any warning.  If the backup
>> > does *not* contain that guc the previous action survives.  That's
>> > confusing, because you're more likely to be in the 'the backup does not
>> > contain the guc' situation when testing where it thus will work.
>>
>> True.  But I don't think modifying a database before restoring into it
>> is terribly supported.  Even pg_dump --clean, which is supposed to do
>> this sort of thing, doesn't seem to work terribly reliably.  We could
>> try to fix this by having a command like ALTER DATABASE ... RESET ALL
>> that we issue before restoring the settings, but I'm afraid that will
>> take us into all sorts of unreasonable scenarios that are better just
>> labeled as "don't do that".
>
> Andres's example is a harbinger of the semantic morass ahead.  Excepting
> database objects and the "public" schema object, pg_dump and pg_dumpall mutate
> only the objects they CREATE.  They consistently restore object properties
> (owner, ACLs, security label, etc.) if and only if issuing a CREATE statement
> for the object.  For example, restoring objects contained in a schema without
> restoring the schema itself changes none of those schema properties.  pg_dump
> and pg_dumpall have mostly followed that rule for databases, too, but they
> depart from it for comment and security label.  That was a mistake.  We can't
> in general mutate an existing database to match, because we can't mutate the
> encoding, datcollate or datctype.  Even discounting that problem, I value
> consistency with the rest of the dumpable object types.

What we've proven so far (if Craig's comments are to be believed) is
that the oft-recommended formula of pg_dumpall -g plus pg_dump of each
database doesn't completely work.  That's absolutely gotta be fixed.
I'm open to whatever ideas you or others may have about how to fix
that, but it doesn't seem to me that changing the behavior of pg_dump
only when -c is specified gets us any closer to that goal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Wed, Jul 29, 2015 at 10:50:53AM -0400, Robert Haas wrote:
> On Wed, Jul 29, 2015 at 12:39 AM, Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Jul 28, 2015 at 03:36:13PM -0400, Robert Haas wrote:
> >> On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund <andres@anarazel.de> wrote:
> >> > Hm?  Let me try again: If the admin does a ALTER DATABASE ... SET guc =
> >> > ... *before* restoring a backup and the backup does contain a setting
> >> > for the same guc, but with a different value it'll overwrite the
> >> > previous explicit action by the DBA without any warning.  If the backup
> >> > does *not* contain that guc the previous action survives.  That's
> >> > confusing, because you're more likely to be in the 'the backup does not
> >> > contain the guc' situation when testing where it thus will work.
> >>
> >> True.  But I don't think modifying a database before restoring into it
> >> is terribly supported.  Even pg_dump --clean, which is supposed to do
> >> this sort of thing, doesn't seem to work terribly reliably.  We could
> >> try to fix this by having a command like ALTER DATABASE ... RESET ALL
> >> that we issue before restoring the settings, but I'm afraid that will
> >> take us into all sorts of unreasonable scenarios that are better just
> >> labeled as "don't do that".
> >
> > Andres's example is a harbinger of the semantic morass ahead.  Excepting
> > database objects and the "public" schema object, pg_dump and pg_dumpall mutate
> > only the objects they CREATE.  They consistently restore object properties
> > (owner, ACLs, security label, etc.) if and only if issuing a CREATE statement
> > for the object.  For example, restoring objects contained in a schema without
> > restoring the schema itself changes none of those schema properties.  pg_dump
> > and pg_dumpall have mostly followed that rule for databases, too, but they
> > depart from it for comment and security label.  That was a mistake.  We can't
> > in general mutate an existing database to match, because we can't mutate the
> > encoding, datcollate or datctype.  Even discounting that problem, I value
> > consistency with the rest of the dumpable object types.
> 
> What we've proven so far (if Craig's comments are to be believed) is
> that the oft-recommended formula of pg_dumpall -g plus pg_dump of each
> database doesn't completely work.  That's absolutely gotta be fixed.

What exact formula did you have in mind?  It must not be merely

1. "pg_dumpall -g"
2. "pg_dump" (without --create) per database

which _never_ works: it emits no CREATE DATABASE statements.  Perhaps this?

1. "pg_dumpall -g"
2. Issue a handwritten CREATE DATABASE statement per database with correct  encoding, lc_ctype and lc_collate
parameters. All other database  properties can be wrong; the dump will fix them.
 
3. "pg_dump" (without --create) per database

That neglects numerous database properties today, but we could make it work.
Given the problems I described upthread, it's an inferior formula that I
recommend against propping up.  I much prefer making this work completely:

1. "pg_dumpall -g"
2. "pg_dump --create" per database

Another formula I wouldn't mind offering:

1. "pg_dumpall -g"
2. pg_dumpall --empty-databases
3. "pg_dump" (without --create) per database

Code for an --empty-databases option already exists for "pg_dumpall -g
--binary-upgrade".  A patch turning that into a user-facing feature might be
quite compact.  I don't see much point given a complete "pg_dump --create",
but I wouldn't object.

Thanks,
nm



Re: security labels on databases are bad for dump & restore

From
Alvaro Herrera
Date:
Noah Misch wrote:

> What exact formula did you have in mind?  It must not be merely
> 
> 1. "pg_dumpall -g"
> 2. "pg_dump" (without --create) per database
> 
> which _never_ works: it emits no CREATE DATABASE statements.  Perhaps this?
> 
> 1. "pg_dumpall -g"
> 2. Issue a handwritten CREATE DATABASE statement per database with correct
>    encoding, lc_ctype and lc_collate parameters.  All other database
>    properties can be wrong; the dump will fix them.
> 3. "pg_dump" (without --create) per database
> 
> That neglects numerous database properties today, but we could make it work.
> Given the problems I described upthread, it's an inferior formula that I
> recommend against propping up.

Agreed, and IMO it's embarrasing that it's so complicated to get a fully
working backup.

> I much prefer making this work completely:
> 
> 1. "pg_dumpall -g"
> 2. "pg_dump --create" per database

My full support for this proposal.

> Another formula I wouldn't mind offering:
> 
> 1. "pg_dumpall -g"
> 2. pg_dumpall --empty-databases
> 3. "pg_dump" (without --create) per database
> 
> Code for an --empty-databases option already exists for "pg_dumpall -g
> --binary-upgrade".  A patch turning that into a user-facing feature might be
> quite compact.

I don't mind if this one is also made to work, but I don't care about
this case all that much.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: security labels on databases are bad for dump & restore

From
Robert Haas
Date:
On Thu, Jul 30, 2015 at 2:49 AM, Noah Misch <noah@leadboat.com> wrote:
> What exact formula did you have in mind?  It must not be merely
>
> 1. "pg_dumpall -g"
> 2. "pg_dump" (without --create) per database
>
> which _never_ works: it emits no CREATE DATABASE statements.  Perhaps this?
>
> 1. "pg_dumpall -g"
> 2. Issue a handwritten CREATE DATABASE statement per database with correct
>    encoding, lc_ctype and lc_collate parameters.  All other database
>    properties can be wrong; the dump will fix them.
> 3. "pg_dump" (without --create) per database
>
> That neglects numerous database properties today, but we could make it work.
> Given the problems I described upthread, it's an inferior formula that I
> recommend against propping up.  I much prefer making this work completely:
>
> 1. "pg_dumpall -g"
> 2. "pg_dump --create" per database

Gah, OK, I see your point.  But we better document this, because if
you need a PhD in PostgreSQL-ology to take a backup, we're not in a
good place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: security labels on databases are bad for dump & restore

From
Adam Brightwell
Date:
>> 1. "pg_dumpall -g"
>> 2. "pg_dump --create" per database
>
> Gah, OK, I see your point.  But we better document this, because if
> you need a PhD in PostgreSQL-ology to take a backup, we're not in a
> good place.

Agreed.  Though, honestly, I find this to be a cumbersome approach.  I
think it just makes things more confusing, even if it is well
documented.  Perhaps it might be necessary as a bridge to get to a
better place.  But my first question as an end user would be, 'why
can't one tool do this?'.  Also, by using 'pg_dumpall -g' aren't you
potentially getting things that you don't want/need/care about?  For
instance, if database 'foo' is owned by 'user1' and database 'bar' is
owned by 'user2' and neither have any knowledge/relation of/to the
other, then when I dump 'foo', in this manner, wouldn't I also be
including 'user2'?  Said differently, a restore of a 'foo'-only dump
would also include a 'bar' related role.  That seems like a bad idea,
IMHO.  Maybe it can't be avoided, but I'd expect that only relevant
information for the database being dumped would be included.

-Adam

-- 
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Thu, Jul 30, 2015 at 10:37:33PM -0400, Adam Brightwell wrote:
> On Thu, Jul 30, 2015 at 02:26:34PM -0400, Robert Haas wrote:
> >> 1. "pg_dumpall -g"
> >> 2. "pg_dump --create" per database
> >
> > Gah, OK, I see your point.  But we better document this, because if
> > you need a PhD in PostgreSQL-ology to take a backup, we're not in a
> > good place.

Definitely.

> Agreed.  Though, honestly, I find this to be a cumbersome approach.  I
> think it just makes things more confusing, even if it is well
> documented.  Perhaps it might be necessary as a bridge to get to a
> better place.  But my first question as an end user would be, 'why
> can't one tool do this?'.

"pg_dumpall" (without -g) is that one tool.  It has excellent dump fidelity.
It lacks the slicing and dump format options of pg_dump, which are important
to many sites.

> Also, by using 'pg_dumpall -g' aren't you
> potentially getting things that you don't want/need/care about?  For
> instance, if database 'foo' is owned by 'user1' and database 'bar' is
> owned by 'user2' and neither have any knowledge/relation of/to the
> other, then when I dump 'foo', in this manner, wouldn't I also be
> including 'user2'?  Said differently, a restore of a 'foo'-only dump
> would also include a 'bar' related role.  That seems like a bad idea,
> IMHO.  Maybe it can't be avoided, but I'd expect that only relevant
> information for the database being dumped would be included.

Nothing in core PostgreSQL attempts to answer the high-level question "Is
user1 relevant to database bar?"  PostgreSQL has no one concept of a role's
relevance to databases.  Some reasonable heuristics come to mind, but nothing
I'd be eager to memorialize in a core tool like pg_dumpall.

Thanks,
nm



Re: security labels on databases are bad for dump & restore

From
Bruce Momjian
Date:
On Tue, Jul 28, 2015 at 04:23:36PM -0300, Alvaro Herrera wrote:
> Josh Berkus wrote:
> > On 07/28/2015 11:58 AM, Robert Haas wrote:
> > > I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
> > >> ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
> > >> dumping them in pg_dump --create, and in pg_dump -Fc .
> > >>
> > >> In practice I see zero real use of pg_dumpall without --globals-only,
> > >> and almost everyone does pg_dump -Fc . I'd like to see that method
> > >> case actually preserve the whole state of the system and do the right
> > >> thing sensibly.
> > >>
> > >> A pg_restore option to skip database-level settings could be useful,
> > >> but I think by default they should be restored.
> > 
> > +++++1
> > 
> > Let's get rid of pg_dumpall -g.
> 
> Quite the opposite, I think --- let's get rid of pg_dumpall EXCEPT when
> invoked as pg_dumpall -g.

Is this a TODO?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: security labels on databases are bad for dump & restore

From
Noah Misch
Date:
On Mon, Aug 31, 2015 at 05:46:08PM -0400, Bruce Momjian wrote:
> On Tue, Jul 28, 2015 at 04:23:36PM -0300, Alvaro Herrera wrote:
> > Josh Berkus wrote:
> > > On 07/28/2015 11:58 AM, Robert Haas wrote:
> > > > I'd be strongly in favour of teaching GRANT, SECURITY LABEL, COMMENT
> > > >> ON DATABASE, etc to recognise CURRENT_DATABASE as a keyword. Then
> > > >> dumping them in pg_dump --create, and in pg_dump -Fc .
> > > >>
> > > >> In practice I see zero real use of pg_dumpall without --globals-only,
> > > >> and almost everyone does pg_dump -Fc . I'd like to see that method
> > > >> case actually preserve the whole state of the system and do the right
> > > >> thing sensibly.
> > > >>
> > > >> A pg_restore option to skip database-level settings could be useful,
> > > >> but I think by default they should be restored.
> > > 
> > > +++++1
> > > 
> > > Let's get rid of pg_dumpall -g.
> > 
> > Quite the opposite, I think --- let's get rid of pg_dumpall EXCEPT when
> > invoked as pg_dumpall -g.
> 
> Is this a TODO?

Most ideas from this thread had been on TODO for 5+ years.  I've just now
linked the main existing item to this thread.  Removing modes of pg_dumpall
isn't on TODO, but I don't think it has enough support to belong there.

Thanks,
nm