Thread: Dividing progress/debug information in pg_standby, and stat before copy

Dividing progress/debug information in pg_standby, and stat before copy

From
Selena Deckelmann
Date:
Hi!

I poked around a little at pg_standby because of some 'cp: file does
not exist' errors that were cropping up as pg_standby searched around
for .history files.

I added a stat that still outputs debugging information but now we
don't get cp failures, and then added a 'progress' mode. It was a bit
of a semantic stretch, but '-l' had been used for 'link' in the past.

I will also add a timestamp to all of the output in my next patch. But
before I do that..

Questions:

* Could we just re-use '-l' for logging?
* Is there a way to get a non-module to use the ereport/elog system?

-selena



--
http://chesnok.com/daily - me
http://endpoint.com - work

Attachment

Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Josh Berkus
Date:
On 1/25/10 9:45 AM, Selena Deckelmann wrote:
> Hi!
> 
> I poked around a little at pg_standby because of some 'cp: file does
> not exist' errors that were cropping up as pg_standby searched around
> for .history files.
> 
> I added a stat that still outputs debugging information but now we
> don't get cp failures, and then added a 'progress' mode. It was a bit
> of a semantic stretch, but '-l' had been used for 'link' in the past.

We discussed this issue at LCA where I encountered these bogus error
messages when I was doing the demo of HS.  I consider Selena's patch to
be a bug-fix for beta of 9.0, not a feature.  Currently the database
reports a lot of false error messages when running in standby mode, and
once we have 1000's more users using standby mode, we're going to see a
lot of related confusion.

The searching for files it doesn't need also delays standby for 15-45s,
so this is a performance fix as well.

--Josh Berkus


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Greg Smith
Date:
Josh Berkus wrote:
> We discussed this issue at LCA where I encountered these bogus error
> messages when I was doing the demo of HS.  I consider Selena's patch to
> be a bug-fix for beta of 9.0, not a feature.  Currently the database
> reports a lot of false error messages when running in standby mode, and
> once we have 1000's more users using standby mode, we're going to see a
> lot of related confusion.
>   

Does anyone have a complete list of the false error messages and what 
context they show up in so that a proper test case could be constructed?

I extracted some pg_standby changes from Simon last week that have some 
overlap with Selena's patch (better logging, remove bogus link feature, 
throw less false error messages out).  I'm not quite ready to submit 
anything here just yet, I'm going to break that into more targeted 
patches, but I will be later this week.  I share the concern here that 
some of these issues are annoying enough to be considered bugs, and I 
need to fix them regardless of whether anybody else does.  I'd be happy 
to work with Selena as a review pair here, to knock out the worst of the 
problems on this program, now that the use-case for it should be more 
popular.  pg_standby could use a bit of an upgrade based on the rough 
edges found by all its field tests, most of which is in error handling 
and logging.  I don't have anything like her stat check in what I'm 
working on, so there's certainly useful stuff uniquely in each patch.

As far as her questions go:
> * Could we just re-use '-l' for logging?

The patch I'm working on adds "-v verbosity" so that logging can be a 
bit more fine-grained than that even.  Having both debug and a progress 
report boolean can then get folded into a single verbosity level, rather 
than maintain two similar paths.  Just make debug equal to the highest 
verbosity and maybe start deprecating that switch altogether.

One reason I'm not quite ready to submit what I've got yet is that I 
want to unify things better here.  I think that I'd prefer to use the 
same terminology as log_min_messages for the various options, and make a 
macro wrapper like ELOG this code uses instead of all these terrible 
direct fprintf([stderr|stdout]... calls.
> * Is there a way to get a non-module to use the ereport/elog system?

And that work would make this transition easier to make, too, if it 
became feasible.  I fear that's outside of the scope of what anyone 
wants to touch at this point though.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Selena Deckelmann
Date:
Hi!

On Mon, Jan 25, 2010 at 1:34 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Josh Berkus wrote:
>>
>> We discussed this issue at LCA where I encountered these bogus error
>> messages when I was doing the demo of HS.  I consider Selena's patch to
>> be a bug-fix for beta of 9.0, not a feature.  Currently the database
>> reports a lot of false error messages when running in standby mode, and
>> once we have 1000's more users using standby mode, we're going to see a
>> lot of related confusion.
>>
>
> Does anyone have a complete list of the false error messages and what
> context they show up in so that a proper test case could be constructed?

They aren't "false" technically. They are the result of the function
call attempting to copy files that do not exist. It's not a big deal
functionality-wise, but it retries a few times. The stat call "fixes"
it.  I could do a bit more there with the error result, but didn't.

I can scan through the code tonight and look for other cases where
this might be an issue. The main thing I'm looking for is to
distinguish between harmful and non-harmful errors.

> I extracted some pg_standby changes from Simon last week that have some
> overlap with Selena's patch (better logging, remove bogus link feature,
> throw less false error messages out).  I'm not quite ready to submit
> anything here just yet, I'm going to break that into more targeted patches,
> but I will be later this week.  I share the concern here that some of these
> issues are annoying enough to be considered bugs, and I need to fix them
> regardless of whether anybody else does.

The stat issue is one of those issues for users that makes them think:
"this looks like an error, and i've never done this before. maybe
there is SOMETHING WRONG!"

I included the progress/logging/verbosity changes so that the errors
were still generated but were definitely flagged as 'debugging' and
'probably not an issue'. :)

> I'd be happy to work with Selena
> as a review pair here, to knock out the worst of the problems on this

Sweet. I, too, would love to work with you to get this fancied/cleaned up.

> program, now that the use-case for it should be more popular.  pg_standby
> could use a bit of an upgrade based on the rough edges found by all its
> field tests, most of which is in error handling and logging.  I don't have
> anything like her stat check in what I'm working on, so there's certainly
> useful stuff uniquely in each patch.

Thanks!

>> * Could we just re-use '-l' for logging?
>
> The patch I'm working on adds "-v verbosity" so that logging can be a bit
> more fine-grained than that even.  Having both debug and a progress report
> boolean can then get folded into a single verbosity level, rather than
> maintain two similar paths.  Just make debug equal to the highest verbosity
> and maybe start deprecating that switch altogether.
>
> One reason I'm not quite ready to submit what I've got yet is that I want to
> unify things better here.  I think that I'd prefer to use the same
> terminology as log_min_messages for the various options, and make a macro
> wrapper like ELOG this code uses instead of all these terrible direct
> fprintf([stderr|stdout]... calls.

Yes, a wrapper is desperately needed with timestamps.

>> * Is there a way to get a non-module to use the ereport/elog system?
>
> And that work would make this transition easier to make, too, if it became
> feasible.  I fear that's outside of the scope of what anyone wants to touch
> at this point though.

Sure thing.  I scanned what was in contrib and didn't see anything I
could crib in there. Was just throwing it out there if someone had
already done it.

-selena

--
http://chesnok.com/daily - me
http://endpoint.com - work


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Greg Smith
Date:
Selena Deckelmann wrote:
> I can scan through the code tonight and look for other cases where
> this might be an issue. The main thing I'm looking for is to
> distinguish between harmful and non-harmful errors.
>   

Where I think this is going toward is where every line that comes out of 
this program gets tagged with an explicit log level, same as everything 
else in the server and using the same terminology (LOG, INFO, WARNING, 
DEBUG), and we just need to make sure those levels are appropriate given 
the severity of the message.  I was planning a similar sweep through all 
the messages the program produces to classify them like that, I have a 
starter set of suggestions from Simon to chew on I'm working through.

> Yes, a wrapper is desperately needed with timestamps.
>   

Right--that's something I too was planning to add eventually too, so we 
might as well get the right infrastructure in there to make that easy 
while we're mucking around with all this logging anyway.

I'll touch base with you later this week once I've done my initial pass 
through all this; not sure we need to drag this list through all those 
details until we've got a unified patch to propose.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Simon Riggs
Date:
On Mon, 2010-01-25 at 16:34 -0500, Greg Smith wrote:
> Josh Berkus wrote:
> > We discussed this issue at LCA where I encountered these bogus error
> > messages when I was doing the demo of HS.  I consider Selena's patch to
> > be a bug-fix for beta of 9.0, not a feature.  Currently the database
> > reports a lot of false error messages when running in standby mode, and
> > once we have 1000's more users using standby mode, we're going to see a
> > lot of related confusion.
> >   
> 
> Does anyone have a complete list of the false error messages and what 
> context they show up in so that a proper test case could be constructed?

Just committed a fix: the server no longer requests 0000000001.history
at start of archive recovery.

-- Simon Riggs           www.2ndQuadrant.com



Greg Smith <greg@2ndquadrant.com> writes:
> [ Greg and Selena discuss filing some rough edges off pg_standby ]

Maybe I'm missing something, but I thought pg_standby would be mostly
dead once SR hits the streets.  Is it worth spending lots of time on?

The ideas all sound good, I'm just wondering if it's useful effort
at this point.
        regards, tom lane


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Josh Berkus
Date:
On 1/25/10 4:09 PM, Tom Lane wrote:
> Greg Smith <greg@2ndquadrant.com> writes:
>> [ Greg and Selena discuss filing some rough edges off pg_standby ]
> 
> Maybe I'm missing something, but I thought pg_standby would be mostly
> dead once SR hits the streets.  Is it worth spending lots of time on?
> 
> The ideas all sound good, I'm just wondering if it's useful effort
> at this point.

Well, if we offer HS separately from SR, it ought to work, don't you
think?  It's possible that some of our users may prefer just HS.

--Josh Berkus


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Greg Smith
Date:
Tom Lane wrote:
> Maybe I'm missing something, but I thought pg_standby would be mostly
> dead once SR hits the streets.  Is it worth spending lots of time on?
>   

I have to do the work I outlined regardless, to support installs on 
earlier versions (luckily there's few backport issues for this code).  
Also, some people are going to have working WAL shipping installs they 
just don't want to mess with as part of the upgrade--particularly given 
the relative newness of the SR code--that they should be able to convert 
over easily.

One reason we hadn't really brought up merging these pg_standby changes 
into core yet is for the reason you describe.  Simon and I didn't think 
there'd be much community uptake on the idea given it's a less likely 
approach for future installs to use, didn't want to distract everyone 
with this topic.  But if it mainly occupies time from people who have 
similar requirements that drive working on it anyway, like Selena it 
appears, it would be nice to see a "final" pg_standby get shipped as a 
result that has less obvious rough parts.  Doubt much work will go into 
it beyond this release though.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Fujii Masao
Date:
On Tue, Jan 26, 2010 at 9:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Just committed a fix: the server no longer requests 0000000001.history
> at start of archive recovery.

Good.

And I think that writeTimeLineHistory() should also skip the request
of 0000000001.history. Here is the patch to do so. Comments?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Simon Riggs
Date:
On Tue, 2010-01-26 at 11:08 +0900, Fujii Masao wrote:
> On Tue, Jan 26, 2010 at 9:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > Just committed a fix: the server no longer requests 0000000001.history
> > at start of archive recovery.
> 
> Good.
> 
> And I think that writeTimeLineHistory() should also skip the request
> of 0000000001.history. Here is the patch to do so. Comments?

I didn't think it was worth the bother, since you can't easily avoid
requesting 00000002.history or whatever the newTLI will be, so you do
still get some messages that might appear spurious. Sure we could
rewrite that, but it works and we have other matters to attend.

Also, I'm not going to add a Goto to the Postgres source code. You've
spent too long staring at ReadRecord(), it seems. :-)

-- Simon Riggs           www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Magnus Hagander
Date:
2010/1/26 Tom Lane <tgl@sss.pgh.pa.us>:
> Greg Smith <greg@2ndquadrant.com> writes:
>> [ Greg and Selena discuss filing some rough edges off pg_standby ]
>
> Maybe I'm missing something, but I thought pg_standby would be mostly
> dead once SR hits the streets.  Is it worth spending lots of time on?
>
> The ideas all sound good, I'm just wondering if it's useful effort
> at this point.

I think there are definite use-cases for pg_standby as well, even when
we have SR. SR requires you to have a reasonably reliable network
connection that lets you do an arbitrary TCP connection. There are a
lot of scenarios that could still use the
"here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
transfer, and that don't necessarily care that there is a longer
delay.

*Most* people will still use SR, I'm sure.


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Heikki Linnakangas
Date:
Magnus Hagander wrote:
> I think there are definite use-cases for pg_standby as well, even when
> we have SR. SR requires you to have a reasonably reliable network
> connection that lets you do an arbitrary TCP connection. There are a
> lot of scenarios that could still use the
> "here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
> transfer, and that don't necessarily care that there is a longer
> delay.

With the changes to the retry-logic that were discussed (see
http://archives.postgresql.org/message-id/4B5758ED.1060703@enterprisedb.com,
I intend to commit that tomorrow), if standby_mode=on, the server will
keep retrying to restore the next segment using restore_command until
it's found, or the trigger file is found.

*That* makes pg_standby obsolete, not streaming replication per se.
Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
no connection info for walreceiver is more or less the same as using
pg_standby.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Magnus Hagander
Date:
2010/1/26 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> Magnus Hagander wrote:
>> I think there are definite use-cases for pg_standby as well, even when
>> we have SR. SR requires you to have a reasonably reliable network
>> connection that lets you do an arbitrary TCP connection. There are a
>> lot of scenarios that could still use the
>> "here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
>> transfer, and that don't necessarily care that there is a longer
>> delay.
>
> With the changes to the retry-logic that were discussed (see
> http://archives.postgresql.org/message-id/4B5758ED.1060703@enterprisedb.com,
> I intend to commit that tomorrow), if standby_mode=on, the server will
> keep retrying to restore the next segment using restore_command until
> it's found, or the trigger file is found.
>
> *That* makes pg_standby obsolete, not streaming replication per se.
> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
> no connection info for walreceiver is more or less the same as using
> pg_standby.

Ah, ok, missed that. So it basically folds pg_standby into the
backend. In *that* case, I can see how pg_standby would be obsolete.

-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Dimitri Fontaine
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> *That* makes pg_standby obsolete, not streaming replication per se.
> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
> no connection info for walreceiver is more or less the same as using
> pg_standby.

I've yet to understand how the files in the archive get from the master
to the slave in this case, or are you supposing in your example that the
cp in the restore_command is targetting a shared disk setup or
something?

Regards,
-- 
dim


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
tomas@tuxteam.de
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Jan 26, 2010 at 11:24:09AM +0100, Dimitri Fontaine wrote:

[...]

> I've yet to understand how the files in the archive get from the master
> to the slave in this case, or are you supposing in your example that the
> cp in the restore_command is targetting a shared disk setup or
> something?

As far as I understand (at least in the current setting its that way),
cp is just a (minimal) example -- it could be scp, rsync (or UUCP for
the older among us ;)

Typically you would wrap the real work in some shell script anyway.

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD4DBQFLXsTZBcgs9XrR2kYRAmjBAJjMcxAKfhmlGLqnE/FNo3PrnCEEAJ96K7eJ
nKnsebO/IKJsIPQ6WTbqoA==
=dd3Z
-----END PGP SIGNATURE-----


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Heikki Linnakangas
Date:
Dimitri Fontaine wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> *That* makes pg_standby obsolete, not streaming replication per se.
>> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
>> no connection info for walreceiver is more or less the same as using
>> pg_standby.
> 
> I've yet to understand how the files in the archive get from the master
> to the slave in this case, or are you supposing in your example that the
> cp in the restore_command is targetting a shared disk setup or
> something?

Yes. Just like with pg_standby.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Dimitri Fontaine
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Yes. Just like with pg_standby.

Hehe, I'm using walmgr.py from skytools instead, and this discussion
makes me think I'll continue doing so even if using SR, as they are
complementary solutions.

In SR mode, the master continues to archive as usual, and the slave will
either take the WAL on a per-file basis from the restore_command or on
an per-LSN basis from the walreceiver and a live connection to the
master, right?
(You explained it very clearly, so I hope I got it right, but some interrested readers might have skipped the other
threadabout it).
 

Does it mean any working wal shipping setup (pitrtools, walmgr.py) will
continue working unchanged, or should we begin testing those and
scheduling adaptations to 9.0?

Regards,
-- 
dim


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Simon Riggs
Date:
On Tue, 2010-01-26 at 12:12 +0200, Heikki Linnakangas wrote:
> Magnus Hagander wrote:
> > I think there are definite use-cases for pg_standby as well, even when
> > we have SR. SR requires you to have a reasonably reliable network
> > connection that lets you do an arbitrary TCP connection. There are a
> > lot of scenarios that could still use the
> > "here's-a-file-you-choose-how-to-get-it-over-to-the-other-end" style
> > transfer, and that don't necessarily care that there is a longer
> > delay.
> 
> With the changes to the retry-logic that were discussed (see
> http://archives.postgresql.org/message-id/4B5758ED.1060703@enterprisedb.com,
> I intend to commit that tomorrow), if standby_mode=on, the server will
> keep retrying to restore the next segment using restore_command until
> it's found, or the trigger file is found.
> 
> *That* makes pg_standby obsolete, not streaming replication per se.
> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
> no connection info for walreceiver is more or less the same as using
> pg_standby.

It could be, but that is not what you've mentioned before and I don't
think its that simple.

If no connection info is set we use existing defaults and attempt that.
ISTM there is no way to set connection info to be "unset", or to request
that it doesn't keep continually retrying. Currently, we had presumed
that standby_mode = off would be the same as Warm Standby replication,
using pg_standby or other.

pg_standby checks file size before returning. If you just use "cp" as
suggested then it would fail because the copy would start before the
file is ready to be copied.

I'm not against including pg_standby features in backend, as long as you
provide *all* of the features and test that mode of operation. Even so,
you're still likely to remove something someone currently thinks
important.

Please don't be so quick to slam the door on previous ways of working.
When sync rep fails, I would not wish to find that the parachute has
been removed to keep the balloon tidy or that the hole at the top has
been widened to improve the view.

That isn't an argument to spend further time on pg_standby, but if
people feel its important and they have time, I would not stop them.

-- Simon Riggs           www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Heikki Linnakangas
Date:
Dimitri Fontaine wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> Yes. Just like with pg_standby.
> 
> Hehe, I'm using walmgr.py from skytools instead, and this discussion
> makes me think I'll continue doing so even if using SR, as they are
> complementary solutions.
> 
> In SR mode, the master continues to archive as usual, and the slave will
> either take the WAL on a per-file basis from the restore_command or on
> an per-LSN basis from the walreceiver and a live connection to the
> master, right?

Right.

> Does it mean any working wal shipping setup (pitrtools, walmgr.py) will
> continue working unchanged, or should we begin testing those and
> scheduling adaptations to 9.0?

They will continue to work as is, just leave standby_mode=off. But if
you want to take advantage streaming replication, you'll have to switch
it to 'on', and adapt the scripts to work with that.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Heikki Linnakangas
Date:
Simon Riggs wrote:
> Currently, we had presumed
> that standby_mode = off would be the same as Warm Standby replication,
> using pg_standby or other.

That's still true. standby_mode=off is the same as what you have in PG
8.4. You can still use pg_standby etc. with that.

> pg_standby checks file size before returning. If you just use "cp" as
> suggested then it would fail because the copy would start before the
> file is ready to be copied.

True. You could work around that by using a script instead of plain
'cp', or by making sure that no partial files appear in the archive in
the other end.

Or we could check for the partial WAL file case in the server. But
Windows copy command is worse than that, and sets the file size before
finishing copying.

From a user point-of-view, it might be simplest to provide a
"restore_directory" option, besides restore_command, and handle the copy
ourselves.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Dimitri Fontaine
Date:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> From a user point-of-view, it might be simplest to provide a
> "restore_directory" option, besides restore_command, and handle the copy
> ourselves.

Even more user-friendly would be to provide default working archive and
restore commands, maybe simple shell scripts using rsync. But it's not
clear it's the time to do that.

Regards,
-- 
dim


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Robert Haas
Date:
On Tue, Jan 26, 2010 at 6:54 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> From a user point-of-view, it might be simplest to provide a
> "restore_directory" option, besides restore_command, and handle the copy
> ourselves.

Well, that might not handle all possible use cases -- of course, it
could be there as a sort of "fast path" for people for simple cases.
But I think we should get some more experience with what we have
before we add too many bells and whistles (that might turn out not to
be as useful as we thought).

...Robert


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Fujii Masao
Date:
On Tue, Jan 26, 2010 at 8:54 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Simon Riggs wrote:
>> Currently, we had presumed
>> that standby_mode = off would be the same as Warm Standby replication,
>> using pg_standby or other.
>
> That's still true. standby_mode=off is the same as what you have in PG
> 8.4. You can still use pg_standby etc. with that.
>
>> pg_standby checks file size before returning. If you just use "cp" as
>> suggested then it would fail because the copy would start before the
>> file is ready to be copied.
>
> True. You could work around that by using a script instead of plain
> 'cp', or by making sure that no partial files appear in the archive in
> the other end.

How about just retrying SR instead of emitting a FATAL error in
RestoreArchivedFile() when a partial WAL file has been restored?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Dimitri Fontaine <dfontaine@hi-media.com> writes:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> *That* makes pg_standby obsolete, not streaming replication per se.
>> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
>> no connection info for walreceiver is more or less the same as using
>> pg_standby.

> I've yet to understand how the files in the archive get from the master
> to the slave in this case, or are you supposing in your example that the
> cp in the restore_command is targetting a shared disk setup or
> something?

pg_standby didn't address that detail, either.
        regards, tom lane


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Josh Berkus
Date:
> *That* makes pg_standby obsolete, not streaming replication per se.
> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
> no connection info for walreceiver is more or less the same as using
> pg_standby.

What about deletion of no-longer-needed WALfile copies?

--Josh Berkus



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Heikki Linnakangas
Date:
Josh Berkus wrote:
>> *That* makes pg_standby obsolete, not streaming replication per se.
>> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
>> no connection info for walreceiver is more or less the same as using
>> pg_standby.
> 
> What about deletion of no-longer-needed WALfile copies?

Yeah, good point. You can still use a shell script as restore_command,
pass the %r option to it, and do the deletion there.

I didn't intend to replace pg_standby when I started this, it just kind
of happened. Maybe we should provide a sample script similar to
pg_standby, to be used instead of plain 'cp', that does the cleanup too.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Josh Berkus
Date:
On 1/26/10 10:48 AM, Heikki Linnakangas wrote:
> Josh Berkus wrote:
>>> *That* makes pg_standby obsolete, not streaming replication per se.
>>> Setting standby_mode=on, with a valid restore_command using e.g 'cp' and
>>> no connection info for walreceiver is more or less the same as using
>>> pg_standby.
>> What about deletion of no-longer-needed WALfile copies?
> 
> Yeah, good point. You can still use a shell script as restore_command,
> pass the %r option to it, and do the deletion there.
> 
> I didn't intend to replace pg_standby when I started this, it just kind
> of happened. Maybe we should provide a sample script similar to
> pg_standby, to be used instead of plain 'cp', that does the cleanup too.

What I'm pointing out is that you haven't *quite* replaced pg_standby,
and at this late date aren't going to.  Some people will still want to
use it.  Especially for people who for some reason aren't using SR.

--Josh Berkus



Josh Berkus <josh@agliodbs.com> writes:
> On 1/26/10 10:48 AM, Heikki Linnakangas wrote:
>> I didn't intend to replace pg_standby when I started this, it just kind
>> of happened. Maybe we should provide a sample script similar to
>> pg_standby, to be used instead of plain 'cp', that does the cleanup too.

> What I'm pointing out is that you haven't *quite* replaced pg_standby,
> and at this late date aren't going to.  Some people will still want to
> use it.  Especially for people who for some reason aren't using SR.

Right, but the question is: is there enough use-case left in it to
justify spending community effort on polishing rough edges?  It's not
like we haven't got plenty else to do to get 9.0 out the door.
        regards, tom lane


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Josh Berkus
Date:
On 1/26/10 10:58 AM, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> On 1/26/10 10:48 AM, Heikki Linnakangas wrote:
>>> I didn't intend to replace pg_standby when I started this, it just kind
>>> of happened. Maybe we should provide a sample script similar to
>>> pg_standby, to be used instead of plain 'cp', that does the cleanup too.
> 
>> What I'm pointing out is that you haven't *quite* replaced pg_standby,
>> and at this late date aren't going to.  Some people will still want to
>> use it.  Especially for people who for some reason aren't using SR.
> 
> Right, but the question is: is there enough use-case left in it to
> justify spending community effort on polishing rough edges?  It's not
> like we haven't got plenty else to do to get 9.0 out the door.

Can we wait until the alpha to decide that?  I haven't tested Heikki's
code to find out whether it works as he intends, and Simon seems to
think different.  If it's OK to wait until beta to fix the error
messages for pg_standby, then let's wait.

Anyway, if we fix the *core* bogus error messages for standby mode, that
will eliminate half of what's confusing and alarming people (and all of
it for those using SR).

--Josh Berkus


Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Simon Riggs
Date:
On Tue, 2010-01-26 at 11:01 -0800, Josh Berkus wrote:

> Anyway, if we fix the *core* bogus error messages for standby mode, that
> will eliminate half of what's confusing and alarming people (and all of
> it for those using SR).

Just done that.

-- Simon Riggs           www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Greg Smith
Date:
Tom Lane wrote:
> Right, but the question is: is there enough use-case left in it to
> justify spending community effort on polishing rough edges?  It's not
> like we haven't got plenty else to do to get 9.0 out the door.
>   

The point I was trying to make is that I'm on the hook to do this 
particular job whether or not the community feels it's worthwhile.  The 
only real decision is whether there's enough interest in upgrading this 
utility to review and possibly commit the result, and I'm trying to 
minimize the public resources needed even for those parts.

-- 
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com



Re: Dividing progress/debug information in pg_standby, and stat before copy

From
Selena Deckelmann
Date:
On Tue, Jan 26, 2010 at 11:03 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Tom Lane wrote:
>>
>> Right, but the question is: is there enough use-case left in it to
>> justify spending community effort on polishing rough edges?  It's not
>> like we haven't got plenty else to do to get 9.0 out the door.
>>
>
> The point I was trying to make is that I'm on the hook to do this particular
> job whether or not the community feels it's worthwhile.  The only real
> decision is whether there's enough interest in upgrading this utility to
> review and possibly commit the result, and I'm trying to minimize the public
> resources needed even for those parts.

I'm interested as well, so happy to continue to work with you on it, Greg.

-selena

--
http://chesnok.com/daily - me
http://endpoint.com - work