Thread: Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Stephen Frost
Date:
Andrew,

* Andrew Dunstan (andrew@dunslane.net) wrote:
> Sync pg_dump and pg_dumpall output

This probably should have adjusted all callers of pg_dump in the
regression tests to use the --no-sync option, otherwise we'll end up
spending possibly a good bit of time calling fsync() during the
regression tests unnecessairly.

Thanks!

Stephen

Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Andrew Dunstan
Date:

On 03/22/2017 11:39 AM, Stephen Frost wrote:
> Andrew,
>
> * Andrew Dunstan (andrew@dunslane.net) wrote:
>> Sync pg_dump and pg_dumpall output
> This probably should have adjusted all callers of pg_dump in the
> regression tests to use the --no-sync option, otherwise we'll end up
> spending possibly a good bit of time calling fsync() during the
> regression tests unnecessairly.
>


All of them? The imnpact is not likely to be huge in most cases
(possibly different on Windows). On crake, the bin-check stage actually
took less time after the change than before, so I suspect that the
impact will be pretty small.

Still I agree that we should have tests for both cases.

Michael, do you want to look at that? If not, I'll take a look but it
will probably be next week before I get to it.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Stephen Frost
Date:
Andrew,

* Andrew Dunstan (andrew.dunstan@2ndquadrant.com) wrote:
> On 03/22/2017 11:39 AM, Stephen Frost wrote:
> > * Andrew Dunstan (andrew@dunslane.net) wrote:
> >> Sync pg_dump and pg_dumpall output
> > This probably should have adjusted all callers of pg_dump in the
> > regression tests to use the --no-sync option, otherwise we'll end up
> > spending possibly a good bit of time calling fsync() during the
> > regression tests unnecessairly.
>
> All of them? The imnpact is not likely to be huge in most cases
> (possibly different on Windows). On crake, the bin-check stage actually
> took less time after the change than before, so I suspect that the
> impact will be pretty small.

Well, perhaps not all, but I'd think --no-sync would be better to have
in most cases.  We turn off fsync for all of the TAP tests and all
initdb calls, so it seems like we should here too.  Perhaps it won't be
much overhead on an unloaded system, but not all of the buildfarm
animals seem to be on unloaded systems, nor are they particularly fast
in general or have fast i/o..

> Still I agree that we should have tests for both cases.

Perhaps, though if I recall correctly, we've basically had zero calls
for fsync() until this.  If we don't feel like we need to test that in
the backend then it seems a bit silly to feel like we need it for
pg_dump's regression coverage.

That said, perhaps the right answer is to have a couple tests for both
the backend and for pg_dump which do exercise the fsync-enabled paths.

Thanks!

Stephen

Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Andrew Dunstan
Date:

On 03/22/2017 12:10 PM, Stephen Frost wrote:
>> Still I agree that we should have tests for both cases.
> Perhaps, though if I recall correctly, we've basically had zero calls
> for fsync() until this.  If we don't feel like we need to test that in
> the backend then it seems a bit silly to feel like we need it for
> pg_dump's regression coverage.
>
> That said, perhaps the right answer is to have a couple tests for both
> the backend and for pg_dump which do exercise the fsync-enabled paths.
>
>


+1

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Michael Paquier
Date:
On Thu, Mar 23, 2017 at 1:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Andrew,
>
> * Andrew Dunstan (andrew.dunstan@2ndquadrant.com) wrote:
>> On 03/22/2017 11:39 AM, Stephen Frost wrote:
>> > * Andrew Dunstan (andrew@dunslane.net) wrote:
>> >> Sync pg_dump and pg_dumpall output
>> > This probably should have adjusted all callers of pg_dump in the
>> > regression tests to use the --no-sync option, otherwise we'll end up
>> > spending possibly a good bit of time calling fsync() during the
>> > regression tests unnecessairly.
>>
>> All of them? The imnpact is not likely to be huge in most cases
>> (possibly different on Windows). On crake, the bin-check stage actually
>> took less time after the change than before, so I suspect that the
>> impact will be pretty small.
>
> Well, perhaps not all, but I'd think --no-sync would be better to have
> in most cases.  We turn off fsync for all of the TAP tests and all
> initdb calls, so it seems like we should here too.  Perhaps it won't be
> much overhead on an unloaded system, but not all of the buildfarm
> animals seem to be on unloaded systems, nor are they particularly fast
> in general or have fast i/o..

Agreed.

>> Still I agree that we should have tests for both cases.
>
> Perhaps, though if I recall correctly, we've basically had zero calls
> for fsync() until this.  If we don't feel like we need to test that in
> the backend then it seems a bit silly to feel like we need it for
> pg_dump's regression coverage.
>
> That said, perhaps the right answer is to have a couple tests for both
> the backend and for pg_dump which do exercise the fsync-enabled paths.

And agreed.

The patch attached uses --no-sync in most places for pg_dump, except
in 4 places of 002_pg_dump.pl to stress as well the sync code path.
-- 
Michael

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Noah Misch
Date:
On Fri, Mar 24, 2017 at 12:26:33PM +0900, Michael Paquier wrote:
> On Thu, Mar 23, 2017 at 1:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Andrew Dunstan (andrew.dunstan@2ndquadrant.com) wrote:
> >> On 03/22/2017 11:39 AM, Stephen Frost wrote:
> >> > * Andrew Dunstan (andrew@dunslane.net) wrote:
> >> >> Sync pg_dump and pg_dumpall output
> >> > This probably should have adjusted all callers of pg_dump in the
> >> > regression tests to use the --no-sync option, otherwise we'll end up
> >> > spending possibly a good bit of time calling fsync() during the
> >> > regression tests unnecessairly.
> >>
> >> All of them? The imnpact is not likely to be huge in most cases
> >> (possibly different on Windows). On crake, the bin-check stage actually
> >> took less time after the change than before, so I suspect that the
> >> impact will be pretty small.
> >
> > Well, perhaps not all, but I'd think --no-sync would be better to have
> > in most cases.  We turn off fsync for all of the TAP tests and all
> > initdb calls, so it seems like we should here too.  Perhaps it won't be
> > much overhead on an unloaded system, but not all of the buildfarm
> > animals seem to be on unloaded systems, nor are they particularly fast
> > in general or have fast i/o..
> 
> Agreed.
> 
> >> Still I agree that we should have tests for both cases.
> >
> > Perhaps, though if I recall correctly, we've basically had zero calls
> > for fsync() until this.  If we don't feel like we need to test that in
> > the backend then it seems a bit silly to feel like we need it for
> > pg_dump's regression coverage.
> >
> > That said, perhaps the right answer is to have a couple tests for both
> > the backend and for pg_dump which do exercise the fsync-enabled paths.
> 
> And agreed.
> 
> The patch attached uses --no-sync in most places for pg_dump, except
> in 4 places of 002_pg_dump.pl to stress as well the sync code path.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Andrew,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Noah Misch
Date:
On Wed, Apr 05, 2017 at 02:49:41AM -0400, Noah Misch wrote:
> On Fri, Mar 24, 2017 at 12:26:33PM +0900, Michael Paquier wrote:
> > On Thu, Mar 23, 2017 at 1:10 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > > * Andrew Dunstan (andrew.dunstan@2ndquadrant.com) wrote:
> > >> On 03/22/2017 11:39 AM, Stephen Frost wrote:
> > >> > * Andrew Dunstan (andrew@dunslane.net) wrote:
> > >> >> Sync pg_dump and pg_dumpall output
> > >> > This probably should have adjusted all callers of pg_dump in the
> > >> > regression tests to use the --no-sync option, otherwise we'll end up
> > >> > spending possibly a good bit of time calling fsync() during the
> > >> > regression tests unnecessairly.
> > >>
> > >> All of them? The imnpact is not likely to be huge in most cases
> > >> (possibly different on Windows). On crake, the bin-check stage actually
> > >> took less time after the change than before, so I suspect that the
> > >> impact will be pretty small.
> > >
> > > Well, perhaps not all, but I'd think --no-sync would be better to have
> > > in most cases.  We turn off fsync for all of the TAP tests and all
> > > initdb calls, so it seems like we should here too.  Perhaps it won't be
> > > much overhead on an unloaded system, but not all of the buildfarm
> > > animals seem to be on unloaded systems, nor are they particularly fast
> > > in general or have fast i/o..
> > 
> > Agreed.
> > 
> > >> Still I agree that we should have tests for both cases.
> > >
> > > Perhaps, though if I recall correctly, we've basically had zero calls
> > > for fsync() until this.  If we don't feel like we need to test that in
> > > the backend then it seems a bit silly to feel like we need it for
> > > pg_dump's regression coverage.
> > >
> > > That said, perhaps the right answer is to have a couple tests for both
> > > the backend and for pg_dump which do exercise the fsync-enabled paths.
> > 
> > And agreed.
> > 
> > The patch attached uses --no-sync in most places for pg_dump, except
> > in 4 places of 002_pg_dump.pl to stress as well the sync code path.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Andrew,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
> 
> [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

This PostgreSQL 10 open item is past due for your status update.  Kindly send
a status update within 24 hours, and include a date for your subsequent status
update.  Refer to the policy on open item ownership:
https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Michael Paquier
Date:
On Mon, Apr 10, 2017 at 1:01 PM, Noah Misch <noah@leadboat.com> wrote:
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

Note for reviewers and committers: the patch sent upthread in
https://www.postgresql.org/message-id/CAB7nPqTUOpF792rDOnBkswZ=ZgHwxdB01OQU2tAF1KU4iUuLrw@mail.gmail.com
still applies and passes all the tests.
-- 
Michael



Re: [HACKERS] [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output

From
Andrew Dunstan
Date:

On 04/10/2017 12:43 AM, Michael Paquier wrote:
> On Mon, Apr 10, 2017 at 1:01 PM, Noah Misch <noah@leadboat.com> wrote:
>> This PostgreSQL 10 open item is past due for your status update.  Kindly send
>> a status update within 24 hours, and include a date for your subsequent status
>> update.  Refer to the policy on open item ownership:
>> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
> Note for reviewers and committers: the patch sent upthread in
> https://www.postgresql.org/message-id/CAB7nPqTUOpF792rDOnBkswZ=ZgHwxdB01OQU2tAF1KU4iUuLrw@mail.gmail.com
> still applies and passes all the tests.


Sorry, previous notification flew by me. I Have looked at the patch. On
the surface it looks fine and I will apply after running a test, should
be today.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services