Thread: pg_dump broken for non-super user

pg_dump broken for non-super user

From: Rushabh Lathia
With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
bitmap to represent what to include. With this commit, a non-superuser is
unable to perform a dump.

Consider the below testcase:

postgres=# select version();
                                                   version                                                  
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)

postgres=#
postgres=# create user u1;
CREATE ROLE
postgres=# \c postgres u1
You are now connected to database "postgres" as user "u1".
postgres=> create table t (a int );
CREATE TABLE
postgres=> \q
rushabh@rushabh-centos-vm:postgresql$ ./db/bin/pg_dump postgres -U u1
pg_dump: [archiver (db)] query failed: ERROR:  permission denied for relation pg_authid
pg_dump: [archiver (db)] query was: LOCK TABLE pg_catalog.pg_authid IN ACCESS SHARE MODE


getTables() takes a read lock on target tables to make sure they aren't
DROPPED or altered in schema before we get around to dumping them. It
currently has the following condition for taking a lock:

        if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)

which needs to be replaced with:

        if ((tblinfo[i].dobj.dump & DUMP_COMPONENT_DEFINITION) &&
            tblinfo[i].relkind == RELKIND_RELATION)
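
For reference, the component bitmap introduced by that commit looks roughly
like the below (a sketch based on the DUMP_COMPONENT_* list in pg_dump.h;
exact bit values may differ). The old test on tblinfo[i].dobj.dump alone is
true whenever any component bit is set, which is why the explicit bit test
is needed:

    typedef uint32 DumpComponents;      /* bitmask of what to include */
    #define DUMP_COMPONENT_NONE         (0)
    #define DUMP_COMPONENT_DEFINITION   (1 << 0)
    #define DUMP_COMPONENT_DATA         (1 << 1)
    #define DUMP_COMPONENT_COMMENT      (1 << 2)
    #define DUMP_COMPONENT_SECLABEL     (1 << 3)
    #define DUMP_COMPONENT_ACL          (1 << 4)
    #define DUMP_COMPONENT_POLICY       (1 << 5)
    #define DUMP_COMPONENT_USERMAP      (1 << 6)
    #define DUMP_COMPONENT_ALL          (0xFFFF)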

PFA patch to fix the issue.

Thanks,


--
Rushabh Lathia
Attachment

Re: pg_dump broken for non-super user

From: Craig Ringer
On 3 May 2016 at 19:04, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
bitmap to represent what to include. With this commit, a non-superuser is
unable to perform a dump.

Hm. I think we need a src/bin/pg_dump/t/ test for the TAP suite for this one, because something this fundamental should really have blown the buildfarm up when committed.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: pg_dump broken for non-super user

From: Stephen Frost
* Craig Ringer (craig@2ndquadrant.com) wrote:
> On 3 May 2016 at 19:04, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>
> > With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
> > bitmap to represent what to include. With this commit, a non-superuser is
> > unable to perform a dump.
> >
>
> Hm. I think we need a src/bin/pg_dump/t/ test for the TAP suite for this
> one, because something this fundamental should really have blown the
> buildfarm up when committed.

I'm in the process of building a pretty comprehensive set, as discussed
in another thread.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Stephen Frost
* Rushabh Lathia (rushabh.lathia@gmail.com) wrote:
> With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
> bitmap to represent what to include. With this commit, a non-superuser is
> unable to perform a dump.
[...]
> pg_dump: [archiver (db)] query was: LOCK TABLE pg_catalog.pg_authid IN
> ACCESS SHARE MODE

This does get addressed, in a way, in one of the patches that I've
currently got pending for pg_dump.  That particular change is actually
one for performance and therefore it's only going to help in cases where
none of the catalog tables have had their ACLs changed.

If the ACL has changed for any catalog table then pg_dump will still try
to LOCK the table, because we're going to dump out information about
that table (the ACLs on it).  I'm not sure if that's really an issue or
not..  Generally, if you're using pg_dump as a non-superuser, I'd expect
you to be limiting the tables you're dumping to ones you have access to
normally (using -n).

> getTables() takes a read lock on target tables to make sure they aren't
> DROPPED or altered in schema before we get around to dumping them. It
> currently has the following condition for taking a lock:
>
>         if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
>
> which needs to be replaced with:
>
>         if ((tblinfo[i].dobj.dump & DUMP_COMPONENT_DEFINITION) &&
>             tblinfo[i].relkind == RELKIND_RELATION)
>
> PFA patch to fix the issue.

I don't think we want to limit the cases where we take a lock to just
when we're dumping out the table definition..  Consider what happens if
someone drops the table before we get to dumping out the data in the
table, or gathering the ACLs on it (which happens much, much later).

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Robert Haas
On Tue, May 3, 2016 at 10:57 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Craig Ringer (craig@2ndquadrant.com) wrote:
>> On 3 May 2016 at 19:04, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
>>
>> > With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
>> > bitmap to represent what to include. With this commit, a non-superuser is
>> > unable to perform a dump.
>> >
>>
>> Hm. I think we need a src/bin/pg_dump/t/ test for the TAP suite for this
>> one, because something this fundamental should really have blown the
>> buildfarm up when committed.
>
> I'm in the process of building a pretty comprehensive set, as discussed
> in another thread.

Are you going to review and commit this patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump broken for non-super user

From: Stephen Frost
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, May 3, 2016 at 10:57 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Craig Ringer (craig@2ndquadrant.com) wrote:
> >> On 3 May 2016 at 19:04, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
> >>
> >> > With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
> >> > bitmap to represent what to include. With this commit, a non-superuser is
> >> > unable to perform a dump.
> >> >
> >>
> >> Hm. I think we need a src/bin/pg_dump/t/ test for the TAP suite for this
> >> one, because something this fundamental should really have blown the
> >> buildfarm up when committed.
> >
> > I'm in the process of building a pretty comprehensive set, as discussed
> > in another thread.
>
> Are you going to review and commit this patch?

I've already reviewed and commented on why that patch isn't a good idea,
please see my other email.

I'll be posting the updated set of patches on the other thread shortly
which address the performance concerns previously raised about
(relatively) empty databases and the TAP test suite that I've been
building.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Rushabh Lathia


On Tue, May 3, 2016 at 8:34 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Rushabh Lathia (rushabh.lathia@gmail.com) wrote:
> With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
> bitmap to represent what to include. With this commit, a non-superuser is
> unable to perform a dump.
[...]
> pg_dump: [archiver (db)] query was: LOCK TABLE pg_catalog.pg_authid IN
> ACCESS SHARE MODE

This does get addressed, in a way, in one of the patches that I've
currently got pending for pg_dump.  That particular change is actually
one for performance and therefore it's only going to help in cases where
none of the catalog tables have had their ACLs changed.

If the ACL has changed for any catalog table then pg_dump will still try
to LOCK the table, because we're going to dump out information about
that table (the ACLs on it).  I'm not sure if that's really an issue or
not..  Generally, if you're using pg_dump as a non-superuser, I'd expect
you to be limiting the tables you're dumping to ones you have access to
normally (using -n).

If it is a limitation that someone using pg_dump as a non-superuser should
be limiting the tables using -n, then we had better document that. But I
don't think this is a valid limitation. Comments?
 

> getTables() takes a read lock on target tables to make sure they aren't
> DROPPED or altered in schema before we get around to dumping them. It
> currently has the following condition for taking a lock:
>
>         if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
>
> which needs to be replaced with:
>
>         if ((tblinfo[i].dobj.dump & DUMP_COMPONENT_DEFINITION) &&
>             tblinfo[i].relkind == RELKIND_RELATION)
>
> PFA patch to fix the issue.

I don't think we want to limit the cases where we take a lock to just
when we're dumping out the table definition..  Consider what happens if
someone drops the table before we get to dumping out the data in the
table, or gathering the ACLs on it (which happens much, much later).

Right, the condition should check for (DUMP_COMPONENT_DEFINITION ||
DUMP_COMPONENT_DATA).

I am confused: in getTables() we execute LOCK TABLE, which holds the
share lock on that table until we get around to dumping it (and that
includes dumping data or gathering the ACLs). Isn't that right?

 

Thanks!

Stephen



--
Rushabh Lathia

Re: pg_dump broken for non-super user

From: Stephen Frost
Rushabh,

* Rushabh Lathia (rushabh.lathia@gmail.com) wrote:
> On Tue, May 3, 2016 at 8:34 PM, Stephen Frost <sfrost@snowman.net> wrote:
>
> > * Rushabh Lathia (rushabh.lathia@gmail.com) wrote:
> > > With commit a9f0e8e5a2e779a888988cb64479a6723f668c84, pg_dump now uses a
> > > bitmap to represent what to include. With this commit, a non-superuser is
> > > unable to perform a dump.
> > [...]
> > > pg_dump: [archiver (db)] query was: LOCK TABLE pg_catalog.pg_authid IN
> > > ACCESS SHARE MODE
> >
> > This does get addressed, in a way, in one of the patches that I've
> > currently got pending for pg_dump.  That particular change is actually
> > one for performance and therefore it's only going to help in cases where
> > none of the catalog tables have had their ACLs changed.
> >
> > If the ACL has changed for any catalog table then pg_dump will still try
> > to LOCK the table, because we're going to dump out information about
> > that table (the ACLs on it).  I'm not sure if that's really an issue or
> > not..  Generally, if you're using pg_dump as a non-superuser, I'd expect
> > you to be limiting the tables you're dumping to ones you have access to
> > normally (using -n).
>
> If it is a limitation that someone using pg_dump as a non-superuser should
> be limiting the tables using -n, then we had better document that. But I
> don't think this is a valid limitation. Comments?

There is no such limitation on using pg_dump as a non-superuser.  It's
always been the case that you need to be able to LOCK the table that
you're dumping.  If you don't have rights to LOCK a certain table then
pg_dump is going to throw an error just like the one you saw.

Existing versions of pg_dump would complain just the same if you ran
it with "pg_dump -t pg_authid".

> > > getTables() takes a read lock on target tables to make sure they aren't
> > > DROPPED or altered in schema before we get around to dumping them. It
> > > currently has the following condition for taking a lock:
> > >
> > >         if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
> > >
> > > which needs to be replaced with:
> > >
> > >         if ((tblinfo[i].dobj.dump & DUMP_COMPONENT_DEFINITION) &&
> > >             tblinfo[i].relkind == RELKIND_RELATION)
> > >
> > > PFA patch to fix the issue.
> >
> > I don't think we want to limit the cases where we take a lock to just
> > when we're dumping out the table definition..  Consider what happens if
> > someone drops the table before we get to dumping out the data in the
> > table, or gathering the ACLs on it (which happens much, much later).
>
> Right, the condition should check for (DUMP_COMPONENT_DEFINITION ||
> DUMP_COMPONENT_DATA).

No.  The point of locking the table is to allow us to gather information
about the table later on, which includes all of the components, not just
the definition or the data.

> I am confused: in getTables() we execute LOCK TABLE, which holds the
> share lock on that table until we get around to dumping it (and that
> includes dumping data or gathering the ACLs). Isn't that right?

The patch you've proposed is changing it so that we wouldn't always
execute LOCK TABLE.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Robert Haas
On Wed, May 4, 2016 at 5:29 AM, Stephen Frost <sfrost@snowman.net> wrote:
> There is no such limitation on using pg_dump as a non-superuser.  It's
> always been the case that you need to be able to LOCK the table that
> you're dumping.  If you don't have rights to LOCK a certain table then
> pg_dump is going to throw an error just like the one you saw.

But in Rushabh's example, he's not doing that.  He's trying to do a
full-database dump of a database that contains one object which the
dump user has rights to access.  Previously, that worked.  Now, it
fails with an error about a system catalog.  How is that not broken?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump broken for non-super user

From: Stephen Frost
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, May 4, 2016 at 5:29 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > There is no such limitation on using pg_dump as a non-superuser.  It's
> > always been the case that you need to be able to LOCK the table that
> > you're dumping.  If you don't have rights to LOCK a certain table then
> > pg_dump is going to throw an error just like the one you saw.
>
> But in Rushabh's example, he's not doing that.  He's trying to do a
> full-database dump of a database that contains one object which the
> dump user has rights to access.  Previously, that worked.  Now, it
> fails with an error about a system catalog.  How is that not broken?

As I mentioned up-thread, the optimization to skip tables which are not
"interesting" has been improved in the patch-set posted this morning to
skip over tables whose ACLs haven't changed from the defaults.  With
that patch, we will skip over catalog tables whose ACLs haven't been
changed and Rushabh's command will work as a non-superuser, assuming
none of the ACLs on tables in pg_catalog have been changed.

However, if any of the ACLs have been changed on tables in pg_catalog,
we'll attempt to lock those tables and include those ACLs.  That will
still work in many cases as you only need SELECT access to be able to
lock a table in access share mode, but if the permissions on pg_authid
are changed, the same failure will occur.

I wouldn't recommend depending on those tables being excluded in backup
scripts.  If you wish to dump everything except pg_catalog, the -N
switch can be used to exclude that schema.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Robert Haas
On Wed, May 4, 2016 at 9:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> On Wed, May 4, 2016 at 5:29 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> > There is no such limitation on using pg_dump as a non-superuser.  It's
>> > always been the case that you need to be able to LOCK the table that
>> > you're dumping.  If you don't have rights to LOCK a certain table then
>> > pg_dump is going to throw an error just like the one you saw.
>>
>> But in Rushabh's example, he's not doing that.  He's trying to do a
>> full-database dump of a database that contains one object which the
>> dump user has rights to access.  Previously, that worked.  Now, it
>> fails with an error about a system catalog.  How is that not broken?
>
> As I mentioned up-thread, the optimization to skip tables which are not
> "interesting" has been improved in the patch-set posted this morning to
> skip over tables whose ACLs haven't changed from the defaults.  With
> that patch, we will skip over catalog tables whose ACLs haven't been
> changed and Rushabh's command will work as a non-superuser, assuming
> none of the ACLs on tables in pg_catalog have been changed.

OK.

> However, if any of the ACLs have been changed on tables in pg_catalog,
> we'll attempt to lock those tables and include those ACLs.  That will
> still work in many cases as you only need SELECT access to be able to
> lock a table in access share mode, but if the permissions on pg_authid
> are changed, the same failure will occur.

Hmm, I guess that's reasonable.  It might cause problems for some
users, but then again it might not.  I guess that's what a beta is
for.

> I wouldn't recommend depending on those tables being excluded in backup
> scripts.  If you wish to dump everything except pg_catalog, the -N
> switch can be used to exclude that schema.

Maybe that recommendation should be added to the documentation and/or
release notes someplace.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump broken for non-super user

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> But in Rushabh's example, he's not doing that.  He's trying to do a
>> full-database dump of a database that contains one object which the
>> dump user has rights to access.  Previously, that worked.  Now, it
>> fails with an error about a system catalog.  How is that not broken?

> As I mentioned up-thread, the optimization to skip tables which are not
> "interesting" has been improved in the patch-set posted this morning to
> skip over tables whose ACLs haven't changed from the defaults.  With
> that patch, we will skip over catalog tables whose ACLs haven't been
> changed and Rushabh's command will work as a non-superuser, assuming
> none of the ACLs on tables in pg_catalog have been changed.

> However, if any of the ACLs have been changed on tables in pg_catalog,
> we'll attempt to lock those tables and include those ACLs.  That will
> still work in many cases as you only need SELECT access to be able to
> lock a table in access share mode, but if the permissions on pg_authid
> are changed, the same failure will occur.

I think this is a bad idea, not only because of the issue about
permissions failures, but because the more things pg_dump locks
the greater risk it has of deadlock failures against other sessions.

Why is it that we need to lock a table at all if we're just going to dump
its ACL?  I understand the failure modes that motivate locking when we're
going to dump data or schema, but the ACL is not really subject to that
kind of problem: we are going to see a unitary, unchanging view of
pg_class.relacl in our snapshot, and we aren't relying on any server-side
logic to interpret that AFAIR.
        regards, tom lane



Re: pg_dump broken for non-super user

From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Why is it that we need to lock a table at all if we're just going to dump
> its ACL?  I understand the failure modes that motivate locking when we're
> going to dump data or schema, but the ACL is not really subject to that
> kind of problem: we are going to see a unitary, unchanging view of
> pg_class.relacl in our snapshot, and we aren't relying on any server-side
> logic to interpret that AFAIR.

I think I'm coming around to agree with that, but it seems like it'd be
better to look at each component and say "we know X is safe, so we won't
lock the table if we're only doing X" rather than saying "we only need to
lock the table for case X".

Also, we should document why we need to lock the table in some cases and
not others.  To that end, I'd like to make sure that it's clear what
cases we are considering here.

In particular, dumping the schema definition of a table may involve
calling server-side functions which will see changes to the table which
have happened after our transaction started, primarily due to SysCache
usage in those functions.  Any of the components which only look at the
tables in pg_catalog should be fine and not require us to lock the
table.

When considering the components:

- DEFINITION
 pg_get_viewdef(), pg_get_indexdef(), pg_get_constraintdef(), pg_get_triggerdef(), etc...

- DATA
 Depends on the table structure itself not changing.

- COMMENT
 Shouldn't require a lock, only uses a relatively simple query against pg_description.

- SECLABEL
 Similar to COMMENT, shouldn't require a lock.

- ACL
 ACL info is collected from pg_class relacl without any server-side
 functions being used which would impact the result.

- POLICY
 Uses pg_get_expr(), which at least gets the relation name from SysCache, so we'll want to lock the table.

- USERMAP
 Uses pg_options_to_table(), but I don't think that actually uses SysCache
 at all, it's just taking the array provided and builds a table out of it,
 so I think this case is ok.
 

If the above looks reasonable to others, I can write up a patch which
will skip locking the table if we are only looking to include components
that don't require a lock.  Since we only ever look at pg_catalog for
ACLs (except if a user explicitly asks to dump one of the tables in
pg_catalog), we shouldn't ever attempt to lock those tables.
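
As a minimal sketch of that idea in getTables(), only the lock-requiring
components would be checked before issuing LOCK TABLE (illustrative only;
the mask contents follow the breakdown above, and the actual patch may
differ):

        /*
         * DEFINITION, DATA and POLICY depend on the table staying put
         * (server-side functions using SysCache, or reading the data);
         * the other components come straight from our repeatable-read
         * catalog snapshot, so no lock is needed for them.
         */
        if ((tblinfo[i].dobj.dump & (DUMP_COMPONENT_DEFINITION |
                                     DUMP_COMPONENT_DATA |
                                     DUMP_COMPONENT_POLICY)) &&
            tblinfo[i].relkind == RELKIND_RELATION)
        {
            /* take the ACCESS SHARE lock, as today */
        }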

Of course, the pg_dump would still end up including the ACLs for
pg_authid and whatever other tables the user has changed the ACLs on and
errors will be thrown during restore if the restore is done with a
non-superuser.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Why is it that we need to lock a table at all if we're just going to dump
>> its ACL?

> I think I'm coming around to agree with that, but it seems like it'd be
> better to look at each component and say "we know X is safe, so we won't
> lock the table if we're only doing X" rather than saying "we only need to
> lock the table for case X".

Agreed.  I did not realize you'd broken down the aspects of an object so
finely in pg_dump, but since you have, this is a good way to approach it.

> When considering the components:

> - DEFINITION
> - DATA
>   [ obviously need lock ]

> - COMMENT

>   Shouldn't require a lock, only uses a relatively simple query against
>   pg_description.

> - SECLABEL

>   Similar to COMMENT, shouldn't require a lock.

> - ACL

>   ACL info is collected from pg_class relacl without any server-side
>   functions being used which would impact the result.

> - POLICY

>   Uses pg_get_expr(), which at least gets the relation name from
>   SysCache, so we'll want to lock the table.

> - USERMAP

>   Uses pg_options_to_table(), but I don't think that actually uses
>   SysCache at all, it's just taking the array provided and builds a
>   table out of it, so I think this case is ok.

USERMAP seems a bit squishy and easily broken, perhaps.  Not sure there's
an advantage to distinguishing this case --- why did you break it out
from DEFINITION to start with?  Also, AFAICS, it does not apply to tables
which are the only things we lock anyway.

Seems reasonable otherwise.

> Of course, the pg_dump would still end up including the ACLs for
> pg_authid and whatever other tables the user has changed the ACLs on and
> errors will be thrown during restore if the restore is done with a
> non-superuser.

Right, but at least you have the option of ignoring such errors.
        regards, tom lane



Re: pg_dump broken for non-super user

From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > - USERMAP
>
> >   Uses pg_options_to_table(), but I don't think that actually uses
> >   SysCache at all, it's just taking the array provided and builds a
> >   table out of it, so I think this case is ok.
>
> USERMAP seems a bit squishy and easily broken, perhaps.  Not sure there's
> an advantage to distinguishing this case --- why did you break it out
> from DEFINITION to start with?  Also, AFAICS, it does not apply to tables
> which are the only things we lock anyway.

When it comes to usermaps, they're dumped as part of FOREIGN SERVERs, so
I broke out the usermap as being an independent component from the
definition of the FOREIGN SERVER itself.  I realize we don't currently
have a way to ask pg_dump to exclude user mappings, but it seemed like
it wouldn't be unreasonable for us to have that in the future as user
mappings include role names and we have similar options for avoiding
specific role names in a dump (eg: --no-owner).

I agree that there's nothing to lock for usermaps though, as they aren't
associated with tables.

> Seems reasonable otherwise.

Ok, I'll write up a patch for that, should be pretty trivial.

> > Of course, the pg_dump would still end up including the ACLs for
> > pg_authid and whatever other tables the user has changed the ACLs on and
> > errors will be thrown during restore if the restore is done with a
> > non-superuser.
>
> Right, but at least you have the option of ignoring such errors.

Agreed.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Seems reasonable otherwise.

Attached patch implements this change to not LOCK the table in cases
where we don't need to.  I'll push this with my other changes to pg_dump
tomorrow (and I've included it in an updated, complete, set of patches
sent on the thread where those changes were being discussed already).

Wanted to include it here also for completeness.

Comments welcome, of course.

Thanks!

Stephen

Attachment

Re: pg_dump broken for non-super user

From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> Attached patch implements this change to not LOCK the table in cases
> where we don't need to.  I'll push this with my other changes to pg_dump
> tomorrow (and I've included it in an updated, complete, set of patches
> sent on the thread where those changes were being discussed already).

> Wanted to include it here also for completeness.

> Comments welcome, of course.

Minor suggestion: instead of putting these comments and hardwired
knowledge here, I'd suggest putting them adjacent to the list of
DUMP_COMPONENT #defines, creating a symbol along the lines of
DUMP_COMPONENTS_REQUIRING_TABLE_LOCK.  That approach would make it
far more likely that somebody changing the list of DUMP_COMPONENT
elements in future would notice the possible need to adjust the
requires-lock list.
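
For instance, something along these lines next to the #defines (a sketch;
the membership follows the up-thread analysis, not committed code):

    /*
     * Components that require a lock on the table: dumping these either
     * reads the table's data or calls server-side functions that can see
     * post-snapshot changes via SysCache.  Adjust this when adding new
     * DUMP_COMPONENT_* bits.
     */
    #define DUMP_COMPONENTS_REQUIRING_TABLE_LOCK \
            (DUMP_COMPONENT_DEFINITION | \
             DUMP_COMPONENT_DATA | \
             DUMP_COMPONENT_POLICY)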
        regards, tom lane



Re: pg_dump broken for non-super user

From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Attached patch implements this change to not LOCK the table in cases
> > where we don't need to.  I'll push this with my other changes to pg_dump
> > tomorrow (and I've included it in an updated, complete, set of patches
> > sent on the thread where those changes were being discussed already).
>
> > Wanted to include it here also for completeness.
>
> > Comments welcome, of course.
>
> Minor suggestion: instead of putting these comments and hardwired
> knowledge here, I'd suggest putting them adjacent to the list of
> DUMP_COMPONENT #defines, creating a symbol along the lines of
> DUMP_COMPONENTS_REQUIRING_TABLE_LOCK.  That approach would make it
> far more likely that somebody changing the list of DUMP_COMPONENT
> elements in future would notice the possible need to adjust the
> requires-lock list.

Good thought, I'll do that.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Rushabh Lathia


On Thu, May 5, 2016 at 5:35 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Attached patch implements this change to not LOCK the table in cases
> > where we don't need to.  I'll push this with my other changes to pg_dump
> > tomorrow (and I've included it in an updated, complete, set of patches
> > sent on the thread where those changes were being discussed already).
>
> > Wanted to include it here also for completeness.
>
> > Comments welcome, of course.
>
> Minor suggestion: instead of putting these comments and hardwired
> knowledge here, I'd suggest putting them adjacent to the list of
> DUMP_COMPONENT #defines, creating a symbol along the lines of
> DUMP_COMPONENTS_REQUIRING_TABLE_LOCK.  That approach would make it
> far more likely that somebody changing the list of DUMP_COMPONENT
> elements in future would notice the possible need to adjust the
> requires-lock list.

Good thought, I'll do that.

+1

I like the new approach. Initially, when I was looking around the code,
I also wondered why we need to hold a lock on an object which we are not
interested in dumping. That is the reason I suggested the patch adding a
check for DUMP_COMPONENT_DEFINITION & DUMP_COMPONENT_DATA (but of course
that was not perfect).

Tom's suggestion of adding DUMP_COMPONENTS_REQUIRING_TABLE_LOCK
is a nice way to fix this issue.



Thanks!

Stephen



--
Rushabh Lathia

Re: pg_dump broken for non-super user

From: Simon Riggs
On 4 May 2016 at 16:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
 
Why is it that we need to lock a table at all if we're just going to dump
its ACL? 

We don't, but surely that's the wrong question.

If we don't lock it then we will have an inconsistent dump that will fail later, if dumped while an object is being dropped.
Do we want an inconsistent dump?

For what reason are we changing existing behaviour? There is no bug here, as Stephen explained.

So this is a behaviour change after freeze with uncertain purpose.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_dump broken for non-super user

From: Stephen Frost
Simon,

* Simon Riggs (simon@2ndQuadrant.com) wrote:
> On 4 May 2016 at 16:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Why is it that we need to lock a table at all if we're just going to dump
> > its ACL?
>
> We don't, but surely that's the wrong question.

I tend to agree with this, however...

> If we don't lock it then we will have an inconsistent dump that will fail
> later, if dumped while an object is being dropped.
> Do we want an inconsistent dump?

The dump won't be inconsistent, as Tom pointed out.  The catalog tables
are read using a repeatable read transaction, which will be consistent.

> For what reason are we changing existing behaviour? There is no bug here,
> as Stephen explained.
>
> So this is a behaviour change after freeze with uncertain purpose.

This isn't accurate.  We never locked tables in pg_catalog before, as we
never looked at them, and that's currently the only case where the new
logic will apply.  We may change the behavior for --no-privileges (and
perhaps other options) in the future to also have this logic apply, but
I agree that's 9.7 material.

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Simon Riggs
On 7 May 2016 at 16:14, Stephen Frost <sfrost@snowman.net> wrote:
 
> If we don't lock it then we will have an inconsistent dump that will fail
> later, if dumped while an object is being dropped.
> Do we want an inconsistent dump?

The dump won't be inconsistent, as Tom pointed out.  The catalog tables
are read using a repeatable read transaction, which will be consistent.

The scan is consistent, yes, but the results would not be.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_dump broken for non-super user

From: Stephen Frost
* Simon Riggs (simon@2ndQuadrant.com) wrote:
> On 7 May 2016 at 16:14, Stephen Frost <sfrost@snowman.net> wrote:
> > > If we don't lock it then we will have an inconsistent dump that will fail
> > > later, if dumped while an object is being dropped.
> > > Do we want an inconsistent dump?
> >
> > The dump won't be inconsistent, as Tom pointed out.  The catalog tables
> > are read using a repeatable read transaction, which will be consistent.
>
> The scan is consistent, yes, but the results would not be.

I'm not following: the results are entirely dependent on the scan, so if
the scan is consistent, how could the results not be?

Thanks!

Stephen

Re: pg_dump broken for non-super user

From: Simon Riggs
On 7 May 2016 at 16:21, Stephen Frost <sfrost@snowman.net> wrote:
* Simon Riggs (simon@2ndQuadrant.com) wrote:
> On 7 May 2016 at 16:14, Stephen Frost <sfrost@snowman.net> wrote:
> > > If we don't lock it then we will have an inconsistent dump that will fail
> > > later, if dumped while an object is being dropped.
> > > Do we want an inconsistent dump?
> >
> > The dump won't be inconsistent, as Tom pointed out.  The catalog tables
> > are read using a repeatable read transaction, which will be consistent.
>
> The scan is consistent, yes, but the results would not be.

I'm not following: the results are entirely dependent on the scan, so if
the scan is consistent, how could the results not be?

Objects would no longer exist because of concurrent DROPs.

You agreed before; why did you change?

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: pg_dump broken for non-super user

From: Stephen Frost
* Simon Riggs (simon@2ndQuadrant.com) wrote:
> On 7 May 2016 at 16:21, Stephen Frost <sfrost@snowman.net> wrote:
> > * Simon Riggs (simon@2ndQuadrant.com) wrote:
> > > On 7 May 2016 at 16:14, Stephen Frost <sfrost@snowman.net> wrote:
> > > > > If we don't lock it then we will have an inconsistent dump that
> > > > > will fail later, if dumped while an object is being dropped.
> > > > > Do we want an inconsistent dump?
> > > >
> > > > The dump won't be inconsistent, as Tom pointed out.  The catalog tables
> > > > are read using a repeatable read transaction, which will be consistent.
> > >
> > > The scan is consistent, yes, but the results would not be.
> >
> > I'm not following: the results are entirely dependent on the scan, so if
> > the scan is consistent, how could the results not be?
> >
>
> Objects would no longer exist because of concurrent DROPs.

A concurrent DROP wouldn't actually produce different results than it
did before, except that the DROP would be allowed to proceed, rather
than block behind the dump.  In both cases, the dump would include that
table.

> You agreed before; why did you change?

I realized that Tom was right that we are just reading from the tables
using regular SELECTs in these cases and therefore the repeatable-read
transaction semantics would be used.

There are cases where that doesn't work, as discussed up-thread and in
the comments, but those are cases where we are using server-side
functions that use SysCache and don't respect the repeatable-read
transaction.

Thanks!

Stephen