Thread: Issues with two-server Synch Rep

Issues with two-server Synch Rep

From
Josh Berkus
Date:
Simon, Fujii,

What follows are what I see as the major issues with making two-server
synch replication work well.  I would like to have you each answer them,
explaining how your patch and your design addresses each issue.  I
believe this will go a long way towards helping the majority of the
community understand the options we have from your code, as well as
where help is still needed.

Adding a Synch Standby
-----------------------
What is the procedure for adding a new synchronous standby in your
implementation?  That is, how do we go from having a standby server with
an empty PGDATA to having a working synchronous standby?

Snapshot Publication
---------------------
During 9.0 development discussion, one of the things we realized we
needed for synch standby was publication of snapshots back to the master
in order to prevent query cancel on the standby.  Without this, the
synch standby is useless for running read queries.  Does your patch
implement this?  Please describe.

Management
-----------
One of the serious flaws currently in HS/SR is complexity of
administration.  Setting up and configuring even a single master and
single standby requires editing up to 6 configuration files in Postgres,
as well as dealing with file permissions.  As such, any Synch Rep patch
must work together with attempts to simplify administration.  How does
your design do this?

Monitoring
-----------
Synch rep offers severe penalties to availability if a synch standby
gets behind or goes down.  What replication-specific monitoring tools
and hooks are available to allow administators to take action before the
database becomes unavailable?

Degradation
------------
In the event that the synch rep standby falls too far behind or becomes
unavailable, or is deliberately taken offline, what are you envisioning
as the process for the DBA resolving the situation?  Is there any
ability to commit "stuck" transactions?

Client Consistency
---------------------
With a standby in "apply" mode, and a master failure at the wrong time,
there is the possibility that the Standby will apply a transaction at
the same time that the master crashes, causing the client to never
receive a commit message.  Once the client reconnects to the standby,
how will it know whether its transaction was committed or not?

As a lesser case, a standby in "apply" mode will show the results of
committed transactions *before* they are visible on the master.  Is
there any need to handle this?  If so, how?

Performance
------------
As with XA, synch rep has the potential to be so slow as to be unusable.What optimizations to you make in your approach
tosynch rep to make it
 
faster than two-phase commit?  What other performance optimizations have
you added?

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Thu, Oct 7, 2010 at 2:05 PM, Josh Berkus <josh@agliodbs.com> wrote:
> What is the procedure for adding a new synchronous standby in your
> implementation?  That is, how do we go from having a standby server with
> an empty PGDATA to having a working synchronous standby?

I'll take a crack at answering these.  I don't think that the
procedure for setting up a standby server is going to change much.
The idea is presumably that you set up an async standby more or less
as you do now and then make whatever configuration changes are
necessary to flip it to synchronous.

> During 9.0 development discussion, one of the things we realized we
> needed for synch standby was publication of snapshots back to the master
> in order to prevent query cancel on the standby.  Without this, the
> synch standby is useless for running read queries.  Does your patch
> implement this?  Please describe.

This is a completely separate issue from making replication
synchronous.  And, really?  Useless for running read queries?

> One of the serious flaws currently in HS/SR is complexity of
> administration.  Setting up and configuring even a single master and
> single standby requires editing up to 6 configuration files in Postgres,
> as well as dealing with file permissions.  As such, any Synch Rep patch
> must work together with attempts to simplify administration.  How does
> your design do this?

This is also completely out of scope for sync rep.

> Synch rep offers severe penalties to availability if a synch standby
> gets behind or goes down.  What replication-specific monitoring tools
> and hooks are available to allow administators to take action before the
> database becomes unavailable?

I don't think there's much hope of allowing administrators to take
action BEFORE the database becomes unavailable.  The point of making
replication synchronous rather than asynchronous is that the slave
can't be behind AT ALL, and if it goes down the primary is immediately
stuck.  If the synchronous standby vanishes, the master can recover
if:

1. We turn off synchronous replication, or
2. TCP keepalives or some other mechanism kills the master-slave
connection after a suitable timeout, and we interpret (or configure)
no connected standbys = stop synchronous replication.

> In the event that the synch rep standby falls too far behind or becomes
> unavailable, or is deliberately taken offline, what are you envisioning
> as the process for the DBA resolving the situation?  Is there any
> ability to commit "stuck" transactions?

Again, it can't fall "too far" behind.  It can't be behind at all.
Any stuck transactions are necessarily already committed; the commit
just hasn't been acknowledged to the client yet.  Presumably, if
synchronous replication is disabled via (1) or (2) above, then any
outstanding committed-but-unacknowledged-to-the-client transactions
should notify the client of the commit and continue on.

> With a standby in "apply" mode, and a master failure at the wrong time,
> there is the possibility that the Standby will apply a transaction at
> the same time that the master crashes, causing the client to never
> receive a commit message.  Once the client reconnects to the standby,
> how will it know whether its transaction was committed or not?

If a client loses the connection after issuing a commit but before
receiving the acknowledgment, it can't know whether the commit
happened or not.  This is true regardless of whether there is a
standby and regardless of whether that standby is synchronous.
Clients that care need to implement their own mechanisms for resolving
this difficulty.

> As a lesser case, a standby in "apply" mode will show the results of
> committed transactions *before* they are visible on the master.  Is
> there any need to handle this?  If so, how?

It's theoretically impossible for the transaction to become visible
everywhere simultaneously.  It's already the case that transactions
become visible to other backends before the backend doing the commit
has received an acknowledgment.  Any client relying on any other
behavior is already broken.

> As with XA, synch rep has the potential to be so slow as to be unusable.
>  What optimizations to you make in your approach to synch rep to make it
> faster than two-phase commit?  What other performance optimizations have
> you added?

Sync rep is going to be slow, period.  Every implementation currently
on the table has to fsync on the master, and then send the commit xlog
record to the slave and wait for an acknowledgment from the slave.
Allowing those to happen in parallel is going to be Hard.  Also, the
interaction with max_standby_delay is going to be a big problem, I
suspect.

As for the specific optimizations in each patch, I believe the major
thing that differs between them is the exact timing of the
acknowledgments; but perhaps I should let the patch authors speak to
that question, if they wish to do so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Josh Berkus
Date:
Robert,

> I'll take a crack at answering these.  I don't think that the
> procedure for setting up a standby server is going to change much.
> The idea is presumably that you set up an async standby more or less
> as you do now and then make whatever configuration changes are
> necessary to flip it to synchronous.

What is the specific "flip" procedure, though?  For one thing, I want to
make sure that it's not necessary to restart the master or the standby
to "flip" it, since that would be a catch-22.

> This is a completely separate issue from making replication
> synchronous.  And, really?  Useless for running read queries?

Absolutely.  For a synch standby, you can't tolerate any standby delay
at all.  This means that anywhere from 1/4 to 3/4 of queries on the
standby would be cancelled on any high-traffic OLTP server.  Hence,
"useless".

>>  As such, any Synch Rep patch
>> must work together with attempts to simplify administration.  How does
>> your design do this?
> 
> This is also completely out of scope for sync rep.

It is not, given that I've seen several proposals for synch rep which
would make asynch rep even more complicated than it already is.  I'm
taking the stance that any sync rep design which *blocks* making asynch
rep easier to use is fundamentally flawed and can't be accepted.

> I don't think there's much hope of allowing administrators to take
> action BEFORE the database becomes unavailable. 

I'd swear that you were working as a DBA less than a year ago, but I
couldn't tell it from that statement.

There is every bit of value in allowing DBAs to view, and chart,
response times on the standby for ACK.  That way they can notice an
increase in response times and take action to improve the standby
*before* it locks up the system.

>   Presumably, if
> synchronous replication is disabled via (1) or (2) above, then any
> outstanding committed-but-unacknowledged-to-the-client transactions
> should notify the client of the commit and continue on.

That's what I was asking about.  I'm not "presuming" that any pending
patch covers any such eventuality until it's confirmed.

> If a client loses the connection after issuing a commit but before
> receiving the acknowledgment, it can't know whether the commit
> happened or not.  This is true regardless of whether there is a
> standby and regardless of whether that standby is synchronous.
> Clients that care need to implement their own mechanisms for resolving
> this difficulty.

That's a handwavy way of saying "go away, don't bother us with such
details".  For the client to resolve the situation, then *it* needs to
be able to tell whether or not the transaction was committed.  How would
it do this, exactly?

> It's theoretically impossible for the transaction to become visible
> everywhere simultaneously.  It's already the case that transactions
> become visible to other backends before the backend doing the commit
> has received an acknowledgment.  Any client relying on any other
> behavior is already broken.

So, your opinion is "it's out of scope to handle this issue" ?

> Sync rep is going to be slow, period.  Every implementation currently
> on the table has to fsync on the master, and then send the commit xlog
> record to the slave and wait for an acknowledgment from the slave.
> Allowing those to happen in parallel is going to be Hard.  

Yes, but it's something we need to address. XA is widely distrusted and
is seen as inadequate for high-traffic OLTP systems precisely because it
is SO slow.  If we want to create a synch rep system which people will
want to use, then it has to be faster than XA.  If it's not faster than
XA, why bother creating it?  We already have 2PC.

> Also, the
> interaction with max_standby_delay is going to be a big problem, I
> suspect.

Interaction?  My opinion is that the two are completely incompatible.
You can't have synch rep and also have standby_delay > 0.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 2:07 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I'll take a crack at answering these.  I don't think that the
>> procedure for setting up a standby server is going to change much.
>> The idea is presumably that you set up an async standby more or less
>> as you do now and then make whatever configuration changes are
>> necessary to flip it to synchronous.
>
> What is the specific "flip" procedure, though?  For one thing, I want to
> make sure that it's not necessary to restart the master or the standby
> to "flip" it, since that would be a catch-22.

Obviously.  I presume it'll be something like "update postgresql.conf
or recovery.conf and run pg_ctl reload", but I haven't (yet, anyway)
verified the actual behavior of the patches, but if the above isn't
feasible then we have a problem.

>> This is a completely separate issue from making replication
>> synchronous.  And, really?  Useless for running read queries?
>
> Absolutely.  For a synch standby, you can't tolerate any standby delay
> at all.  This means that anywhere from 1/4 to 3/4 of queries on the
> standby would be cancelled on any high-traffic OLTP server.  Hence,
> "useless".

What is your source for those numbers?  They could be right, but I
simply don't know.

At any rate, I don't disagree that we have a problem.  In fact, I
think we have a whole serious of problems.  The whole architecture of
replication as it exists in PG is pretty fundamentally limited right
now.   Right now, a pruning operation on the master (regardless of
whether it's a HOT prune or vacuum) can happen when there are still
snapshots on the slave that need that data.  Our only options are to
either wait for those snapshots to go away, or kill of the
queries/transactions that took them.  Adding an XID feedback from the
slave to the master "fixes" the problem by preventing the master from
pruning those tuples until the slave no longer needs them, but at the
expense of bloating the master and all other standbys.  That may,
indeed, be better for some use cases, but it's not really all that
good.  It would be far better if we could decouple master cleanup from
standby cleanup, so that only the machine that actually has the old
query gets bloated.  However, no one seems excited about writing that
code.

A further grump about our current architecture is that it doesn't seem
at all clear how to make it work for partial replication.  I have to
wonder whether we are going down the wrong path completely and need to
hit the reset button.  But neither this nor the pruning problem are
things that we can reasonably expect the sync rep patch to solve, if
we want it to get committed this release cycle.

>>>  As such, any Synch Rep patch
>>> must work together with attempts to simplify administration.  How does
>>> your design do this?
>>
>> This is also completely out of scope for sync rep.
>
> It is not, given that I've seen several proposals for synch rep which
> would make asynch rep even more complicated than it already is.

I'm not aware of any proposals on the table which would do that.

> I'm
> taking the stance that any sync rep design which *blocks* making asynch
> rep easier to use is fundamentally flawed and can't be accepted.

Do you have some ideas on how to simplify it?  How will we know
whether a particular design for sync rep does this?

>> I don't think there's much hope of allowing administrators to take
>> action BEFORE the database becomes unavailable.
>
> I'd swear that you were working as a DBA less than a year ago, but I
> couldn't tell it from that statement.

Your comment sounded to me like you were asking for a schedule of all
future unplanned outages.

> There is every bit of value in allowing DBAs to view, and chart,
> response times on the standby for ACK.  That way they can notice an
> increase in response times and take action to improve the standby
> *before* it locks up the system.

Sure, that would be nice to have, and it's a good idea.  But I don't
think that's going to be a common failure mode.  What I expect to
happen is the standby to hum along with no problem for a long time and
then either kick a disk or suffer a power outage.  There's very little
monitoring we can do within PG that will notice either of those things
coming.  There might be some external-to-PG monitoring that can be
done, but if there's a massive blackout or a terrorist attack or
somebody trips over the power cord, you're just going to get
surprised.

>>   Presumably, if
>> synchronous replication is disabled via (1) or (2) above, then any
>> outstanding committed-but-unacknowledged-to-the-client transactions
>> should notify the client of the commit and continue on.
>
> That's what I was asking about.  I'm not "presuming" that any pending
> patch covers any such eventuality until it's confirmed.

Yep, we need to confirm that.

>> If a client loses the connection after issuing a commit but before
>> receiving the acknowledgment, it can't know whether the commit
>> happened or not.  This is true regardless of whether there is a
>> standby and regardless of whether that standby is synchronous.
>> Clients that care need to implement their own mechanisms for resolving
>> this difficulty.
>
> That's a handwavy way of saying "go away, don't bother us with such
> details".  For the client to resolve the situation, then *it* needs to
> be able to tell whether or not the transaction was committed.  How would
> it do this, exactly?

No, it isn't at all.  What does your application do NOW if the master
goes down after you've sent a commit and before you get an
acknowledgment back?  Does it assume that the transaction is
committed, or does it assume the transaction was aborted by a crash on
the master?  Either is possible, right?

>> It's theoretically impossible for the transaction to become visible
>> everywhere simultaneously.  It's already the case that transactions
>> become visible to other backends before the backend doing the commit
>> has received an acknowledgment.  Any client relying on any other
>> behavior is already broken.
>
> So, your opinion is "it's out of scope to handle this issue" ?

What handling of it would you propose?  Consider the case where you
just have one server and no standbys.  A client connects, does some
work, and says COMMIT.  There is some finite amount of time after the
COMMIT happens and before the client gets the acknowledgment back that
the commit has succeeded.  During that time, another transaction that
starts up will see the effects of the COMMIT - BEFORE the transaction
itself knows that it is committed.  There's not much you can do about
this.  You have to do the commit on the server before sending the
response back to the client.

In the sync rep case, you're going to get the same behavior.  After
the client has asked for commit and before the commit has been
acknowledged, there's no guarantee whether another transaction that
starts up during that in-between time sees the transaction or not.
The only further anomaly that can happen as a result of sync rep is
that, in apply mode, the transaction's effects will become visible on
the standby before they are visible on the master, so if you fire off
a COMMIT, and then before receiving the acknowledgment start a
transaction on the standby, and then just after that start a
transaction on the master, and then just after that you get back an
acknowledgment that the COMMIT completed, you might have a snapshot on
the master that was taken afterwards chronologically but shows the
effects of fewer committed XIDs - i.e. time has gone backwards.
Unfortunately, short of a global transaction manager, this is an
unsolvable problem, and that's definitely more than is going to happen
for 9.1, I think.

>> Sync rep is going to be slow, period.  Every implementation currently
>> on the table has to fsync on the master, and then send the commit xlog
>> record to the slave and wait for an acknowledgment from the slave.
>> Allowing those to happen in parallel is going to be Hard.
>
> Yes, but it's something we need to address.

I agree, but it's not something we can address in the first patch,
which is hard enough without adding things that make it even harder.
We need to get something simple committed first and then build on it.

> XA is widely distrusted and
> is seen as inadequate for high-traffic OLTP systems precisely because it
> is SO slow.  If we want to create a synch rep system which people will
> want to use, then it has to be faster than XA.  If it's not faster than
> XA, why bother creating it?  We already have 2PC.

I don't know anything about XA so I can't comment on this.

>> Also, the
>> interaction with max_standby_delay is going to be a big problem, I
>> suspect.
>
> Interaction?  My opinion is that the two are completely incompatible.
> You can't have synch rep and also have standby_delay > 0.

We seem to be in violent agreement on this point.  I was saying the
same thing in a different way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Jeff Davis
Date:
On Mon, 2010-10-11 at 15:22 -0400, Robert Haas wrote:
> >> This is a completely separate issue from making replication
> >> synchronous.  And, really?  Useless for running read queries?
> >
> > Absolutely.  For a synch standby, you can't tolerate any standby delay
> > at all.  This means that anywhere from 1/4 to 3/4 of queries on the
> > standby would be cancelled on any high-traffic OLTP server.  Hence,
> > "useless".
> 
> What is your source for those numbers?  They could be right, but I
> simply don't know.

I was initially taken aback by the word "useless" as well. However, I
had trouble thinking of a use case that isn't better solved by sync rep
without HS, or async rep. I don't have the numbers either though, so
perhaps someone does have a use case.

> It would be far better if we could decouple master cleanup from
> standby cleanup, so that only the machine that actually has the old
> query gets bloated.  However, no one seems excited about writing that
> code.

That doesn't seem just a matter of code, it seems like a major design
conflict. If the master removes a tuple and then re-uses that space for
another tuple, you can't move those changes to the standby unless you
cancel queries dependent on the first tuple.

> A further grump about our current architecture is that it doesn't seem
> at all clear how to make it work for partial replication.  I have to
> wonder whether we are going down the wrong path completely and need to
> hit the reset button.  But neither this nor the pruning problem are
> things that we can reasonably expect the sync rep patch to solve, if
> we want it to get committed this release cycle.

What we have is physical replication, but you seem to be talking about
logical replication (evidence: Slony solves both the independent cleanup
problem and partial replication).

Both logical and physical replication have a place, and I don't believe
either is the "wrong" path. If we'd like to add logical replication, we
may be better of starting with Slony (or Londiste) and working from
there.

Regards,Jeff Davis



Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 4:01 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2010-10-11 at 15:22 -0400, Robert Haas wrote:
>> >> This is a completely separate issue from making replication
>> >> synchronous.  And, really?  Useless for running read queries?
>> >
>> > Absolutely.  For a synch standby, you can't tolerate any standby delay
>> > at all.  This means that anywhere from 1/4 to 3/4 of queries on the
>> > standby would be cancelled on any high-traffic OLTP server.  Hence,
>> > "useless".
>>
>> What is your source for those numbers?  They could be right, but I
>> simply don't know.
>
> I was initially taken aback by the word "useless" as well. However, I
> had trouble thinking of a use case that isn't better solved by sync rep
> without HS, or async rep. I don't have the numbers either though, so
> perhaps someone does have a use case.

The main use cases for synchronous replication seem to be (1) high
availability and (2) read scalability.  That is, if you have 99%
writes and 1% reads, you can round-robin the reads and do all the
writes on the master.  But I think we are quite a way from making (2)
work well enough to get excited about.

>> It would be far better if we could decouple master cleanup from
>> standby cleanup, so that only the machine that actually has the old
>> query gets bloated.  However, no one seems excited about writing that
>> code.
>
> That doesn't seem just a matter of code, it seems like a major design
> conflict.

Yes.  I had the idea of trying to fix this by allowing the standby to
retain old versions of entire pages that got cleaned up on the master,
until the transactions that might want to read the old pages were
gone.  But that may be prohibitively difficult, not sure.
Alternatively, you could, as you say, do logical rather than physical
replication.

>> A further grump about our current architecture is that it doesn't seem
>> at all clear how to make it work for partial replication.  I have to
>> wonder whether we are going down the wrong path completely and need to
>> hit the reset button.  But neither this nor the pruning problem are
>> things that we can reasonably expect the sync rep patch to solve, if
>> we want it to get committed this release cycle.
>
> What we have is physical replication, but you seem to be talking about
> logical replication (evidence: Slony solves both the independent cleanup
> problem and partial replication).
>
> Both logical and physical replication have a place, and I don't believe
> either is the "wrong" path. If we'd like to add logical replication, we
> may be better of starting with Slony (or Londiste) and working from
> there.

Yeah, that's possible.  Or Mammoth Replicator.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Josh Berkus
Date:
> Obviously.  I presume it'll be something like "update postgresql.conf
> or recovery.conf and run pg_ctl reload", but I haven't (yet, anyway)
> verified the actual behavior of the patches, but if the above isn't
> feasible then we have a problem.

Right.  That's why I asked the question.  Mind you, a superuser function
on the master would be even better ...

> What is your source for those numbers?  They could be right, but I
> simply don't know.

pg_bench tests with asynch rep and standby_delay = 0.  Not rigorous, but
enough to show that there is a problem there.  Doing pg_bench with a
small database

> It would be far better if we could decouple master cleanup from
> standby cleanup, so that only the machine that actually has the old
> query gets bloated.  However, no one seems excited about writing that
> code.

"not excited" == terrified of the amount of troubleshooting involved,
and likely believing it's impossible.

> A further grump about our current architecture is that it doesn't seem
> at all clear how to make it work for partial replication.  I have to
> wonder whether we are going down the wrong path completely and need to
> hit the reset button.  

The way to do partial replication is Slony, Londiste, Bucardo, etc.

> But neither this nor the pruning problem are
> things that we can reasonably expect the sync rep patch to solve, if
> we want it to get committed this release cycle.

>> It is not, given that I've seen several proposals for synch rep which
>> would make asynch rep even more complicated than it already is.
> 
> I'm not aware of any proposals on the table which would do that.

Standby registration?

> Do you have some ideas on how to simplify it?  How will we know
> whether a particular design for sync rep does this?

That's a good point, I'll have to think about this and do a write-up.

> Sure, that would be nice to have, and it's a good idea.  But I don't
> think that's going to be a common failure mode.  What I expect to
> happen is the standby to hum along with no problem for a long time and
> then either kick a disk or suffer a power outage.

That might be more common, but it's not an argument against monitoring
what we *can* monitor for.  More importantly, if monitoring ACK response
times -- and similar metrics -- is not available via core postgres, it
is impossible to find them out any other way.  We need to give DBAs the
tools to do their jobs, even if the tools are at a very low level.

> No, it isn't at all.  What does your application do NOW if the master
> goes down after you've sent a commit and before you get an
> acknowledgment back?  Does it assume that the transaction is
> committed, or does it assume the transaction was aborted by a crash on
> the master?  Either is possible, right?

This problem certainly exists with async, it's just less likely so
people are ignoring it.  With a high enough transaction rate, and a
standby in "apply" mode, it's *certain* to happen on synch rep.  So we
can't ignore it as a problem anymore.

I don't have any brilliant ideas on a solution for this one.

>> So, your opinion is "it's out of scope to handle this issue" ?
> 
> What handling of it would you propose?  Consider the case where you

I was asking a question.  My original question was "do we need to handle
this?"  I'm taking your viewpoint as "there's no reasonable way to
handle it, so we shouldn't."  That's a fine answer.  What I want is for
-hackers to make a *decision* about a very real problem, and not just
fail to discuss it.

> I agree, but it's not something we can address in the first patch,
> which is hard enough without adding things that make it even harder.
> We need to get something simple committed first and then build on it.

The reason I posted the start of this thread is that I know that both
Fujii and Simon have thought about some of these questions, and even if
they don't have code for them, they have ideas.  I want to read those
ideas explained.  Further, the answers to these questions may tell the
rest of us which parts of each patch are the most valuable.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Issues with two-server Synch Rep

From
Jeff Davis
Date:
On Mon, 2010-10-11 at 16:07 -0400, Robert Haas wrote:
> > I was initially taken aback by the word "useless" as well. However, I
> > had trouble thinking of a use case that isn't better solved by sync rep
> > without HS, or async rep. I don't have the numbers either though, so
> > perhaps someone does have a use case.
> 
> The main use cases for synchronous replication seem to be (1) high
> availability and (2) read scalability.  That is, if you have 99%
> writes and 1% reads, you can round-robin the reads and do all the
> writes on the master.  But I think we are quite a way from making (2)
> work well enough to get excited about.

[ I assume you meant "99% reads and 1% writes" ]

Wouldn't the snapshot publication (as Josh called it) back to the master
work better for that use case?

I'm not even sure that it's the ratio that matters, but rather how
constant the writes are. 1% writes does not necessarily mean that a
random 1% of read queries fail on the standby. I don't have the numbers,
but SR + query cancel seems like the standby system would effectively be
down during write activity. I wouldn't be surprised if SR + query cancel
resulted in some frustrated users; but perhaps "useless" is too strong a
word.

> >> It would be far better if we could decouple master cleanup from
> >> standby cleanup, so that only the machine that actually has the old
> >> query gets bloated.  However, no one seems excited about writing that
> >> code.
> >
> > That doesn't seem just a matter of code, it seems like a major design
> > conflict.
> 
> Yes.  I had the idea of trying to fix this by allowing the standby to
> retain old versions of entire pages that got cleaned up on the master,
> until the transactions that might want to read the old pages were
> gone.  But that may be prohibitively difficult, not sure.

I think you'd end up having a notion of a snapshot of block information
(like a FS with snapshots) inside of postgres.

Sounds like a lot of complexity to me, and the only benefit I see is
moving bloat from the primary to the standby. Granted, that would be
nice, but I would expect some costs aside from just the complexity.

Regards,Jeff Davis



Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 5:40 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Obviously.  I presume it'll be something like "update postgresql.conf
>> or recovery.conf and run pg_ctl reload", but I haven't (yet, anyway)
>> verified the actual behavior of the patches, but if the above isn't
>> feasible then we have a problem.
>
> Right.  That's why I asked the question.  Mind you, a superuser function
> on the master would be even better ...

That's probably not going to happen until we have a way to update
postgresql.conf via SQL.  Which, I maintain, as I have maintained
before, is not going to happen until we get rid of the comments,
because otherwise absolutely any implementation anyone proposes will
get criticized for failing to handle them sensibly (because it is not
possible to rewrite the file while handling the comments sensibly).

>> What is your source for those numbers?  They could be right, but I
>> simply don't know.
>
> pg_bench tests with asynch rep and standby_delay = 0.  Not rigorous, but
> enough to show that there is a problem there.  Doing pg_bench with a
> small database

Interesting.

>> It would be far better if we could decouple master cleanup from
>> standby cleanup, so that only the machine that actually has the old
>> query gets bloated.  However, no one seems excited about writing that
>> code.
>
> "not excited" == terrified of the amount of troubleshooting involved,
> and likely believing it's impossible.

Quitter!  :-)

>> A further grump about our current architecture is that it doesn't seem
>> at all clear how to make it work for partial replication.  I have to
>> wonder whether we are going down the wrong path completely and need to
>> hit the reset button.
>
> The way to do partial replication is Slony, Londiste, Bucardo, etc.

Sure.  But we can't forever ignore the fact that trigger-based
replication is not as performant as log-based replication.

>> But neither this nor the pruning problem are
>> things that we can reasonably expect the sync rep patch to solve, if
>> we want it to get committed this release cycle.
>
>>> It is not, given that I've seen several proposals for synch rep which
>>> would make asynch rep even more complicated than it already is.
>>
>> I'm not aware of any proposals on the table which would do that.
>
> Standby registration?

No one that I know of has proposed making that mandatory for async
standbys.  In fact, Heikki has said just the opposite, and I agree
with him.

>> Sure, that would be nice to have, and it's a good idea.  But I don't
>> think that's going to be a common failure mode.  What I expect to
>> happen is the standby to hum along with no problem for a long time and
>> then either kick a disk or suffer a power outage.
>
> That might be more common, but it's not an argument against monitoring
> what we *can* monitor for.  More importantly, if monitoring ACK response
> times -- and similar metrics -- is not available via core postgres, it
> is impossible to find them out any other way.  We need to give DBAs the
> tools to do their jobs, even if the tools are at a very low level.

No argument.

>> No, it isn't at all.  What does your application do NOW if the master
>> goes down after you've sent a commit and before you get an
>> acknowledgment back?  Does it assume that the transaction is
>> committed, or does it assume the transaction was aborted by a crash on
>> the master?  Either is possible, right?
>
> This problem certainly exists with async, it's just less likely so
> people are ignoring it.  With a high enough transaction rate, and a
> standby in "apply" mode, it's *certain* to happen on synch rep.  So we
> can't ignore it as a problem anymore.

It exists with no replication at all...

> I don't have any brilliant ideas on a solution for this one.

Right, well, the world is fundamentally asynchronous.  In practice,
it's not that hard to write application-dependent logic to handle this
when it matters.  Your transaction can, e.g. start by inserting a UUID
in a table somewhere.  Then if a crash occurs you probe and see if the
UUID associated with that transaction is there, or not.  Or depending
on what your transaction does, there may be some more natural
identifier you can use (e.g. if you are inserting an order into an
orders table, you can look and see if the order number you thought you
created is there).

>>> So, your opinion is "it's out of scope to handle this issue" ?
>>
>> What handling of it would you propose?  Consider the case where you
>
> I was asking a question.  My original question was "do we need to handle
> this?"  I'm taking your viewpoint as "there's no reasonable way to
> handle it, so we shouldn't."  That's a fine answer.  What I want is for
> -hackers to make a *decision* about a very real problem, and not just
> fail to discuss it.

Yes, that's my viewpoint.  I think both this an the previous item are
pretty basic database theory - and you can probably read about them in
any good database theory textbook.

>> I agree, but it's not something we can address in the first patch,
>> which is hard enough without adding things that make it even harder.
>> We need to get something simple committed first and then build on it.
>
> The reason I posted the start of this thread is that I know that both
> Fujii and Simon have thought about some of these questions, and even if
> they don't have code for them, they have ideas.  I want to read those
> ideas explained.  Further, the answers to these questions may tell the
> rest of us which parts of each patch are the most valuable.

Fair enough.  I was just replying because (1) nobody else had and (2)
I thought it might help to try to separate out which of the issues you
raised are most relevant to the patches at hand.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 7:16 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2010-10-11 at 16:07 -0400, Robert Haas wrote:
>> > I was initially taken aback by the word "useless" as well. However, I
>> > had trouble thinking of a use case that isn't better solved by sync rep
>> > without HS, or async rep. I don't have the numbers either though, so
>> > perhaps someone does have a use case.
>>
>> The main use cases for synchronous replication seem to be (1) high
>> availability and (2) read scalability.  That is, if you have 99%
>> writes and 1% reads, you can round-robin the reads and do all the
>> writes on the master.  But I think we are quite a way from making (2)
>> work well enough to get excited about.
>
> [ I assume you meant "99% reads and 1% writes" ]

Oops, yes.

> Wouldn't the snapshot publication (as Josh called it) back to the master
> work better for that use case?

Well, that would help make it more useful.  Of course then bloat on
any machine will bloat the entire cluster...

> I'm not even sure that it's the ratio that matters, but rather how
> constant the writes are. 1% writes does not necessarily mean that a
> random 1% of read queries fail on the standby. I don't have the numbers,
> but SR + query cancel seems like the standby system would effectively be
> down during write activity. I wouldn't be surprised if SR + query cancel
> resulted in some frustrated users; but perhaps "useless" is too strong a
> word.

Yeah.

>> >> It would be far better if we could decouple master cleanup from
>> >> standby cleanup, so that only the machine that actually has the old
>> >> query gets bloated.  However, no one seems excited about writing that
>> >> code.
>> >
>> > That doesn't seem just a matter of code, it seems like a major design
>> > conflict.
>>
>> Yes.  I had the idea of trying to fix this by allowing the standby to
>> retain old versions of entire pages that got cleaned up on the master,
>> until the transactions that might want to read the old pages were
>> gone.  But that may be prohibitively difficult, not sure.
>
> I think you'd end up having a notion of a snapshot of block information
> (like a FS with snapshots) inside of postgres.

Yep.

> Sounds like a lot of complexity to me, and the only benefit I see is
> moving bloat from the primary to the standby. Granted, that would be
> nice, but I would expect some costs aside from just the complexity.

The standby is bloated either way, but you avoid propagating that
bloat back to the master.  It's particularly pernicious if you have a
master and 17 standbys.  Now any single standby with a long running
query bloats all 18 machines.  Not awesome.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Josh Berkus
Date:
> That's probably not going to happen until we have a way to update
> postgresql.conf via SQL.  Which, I maintain, as I have maintained
> before, is not going to happen until we get rid of the comments,
> because otherwise absolutely any implementation anyone proposes will
> get criticized for failing to handle them sensibly (because it is not
> possible to rewrite the file while handling the comments sensibly).

Well, you know my viewpoint on that.  I'm still bitter about the .conf
directory patch.

>>> What is your source for those numbers?  They could be right, but I
>>> simply don't know.
>> pg_bench tests with asynch rep and standby_delay = 0.  Not rigorous, but
>> enough to show that there is a problem there.  Doing pg_bench with a
>> small database
> 
> Interesting.

Yeah, it occurs to me that we can "fix" this with cleanup_delay on the
master, but that's a much worse solution than XID publication from the
standby.  It means bloat *all* the time instead of just some of the time.

I think we have Yet Another Knob here: users whose standby is
essentially idle will NOT want XID publication, and users whose standby
is for load-balancing will.

Note that XID publication needn't mean extraordinary extra bloat
provided that the queries on the standby are short-lived.   Long-running
reporting queries, of course, should be run on an asynch standby.

> Sure.  But we can't forever ignore the fact that trigger-based
> replication is not as performant as log-based replication.

Watch me.  ;-)

>> This problem certainly exists with async, it's just less likely so
>> people are ignoring it.  With a high enough transaction rate, and a
>> standby in "apply" mode, it's *certain* to happen on synch rep.  So we
>> can't ignore it as a problem anymore.
> 
> It exists with no replication at all...

Oh, I see what you mean, now.  Point taken.

> Yes, that's my viewpoint.  I think both this an the previous item are
> pretty basic database theory - and you can probably read about them in
> any good database theory textbook.

I doubt anything we're doing here is covering new ground, really.
Binary log-based replication has been around for a while in proprietary
products.  We just don't necessarily want to make the *same* tradeoffs
other DBs have.

> Fair enough.  I was just replying because (1) nobody else had and (2)
> I thought it might help to try to separate out which of the issues you
> raised are most relevant to the patches at hand.

Sure.  Simon is apparently on vacation right now.  Not sure where Fujii is.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 9:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>>> What is your source for those numbers?  They could be right, but I
>>>> simply don't know.
>>> pg_bench tests with asynch rep and standby_delay = 0.  Not rigorous, but
>>> enough to show that there is a problem there.  Doing pg_bench with a
>>> small database
>>
>> Interesting.
>
> Yeah, it occurs to me that we can "fix" this with cleanup_delay on the
> master, but that's a much worse solution than XID publication from the
> standby.  It means bloat *all* the time instead of just some of the time.

Yeah, that's worse, I think.

> I think we have Yet Another Knob here: users whose standby is
> essentially idle will NOT want XID publication, and users whose standby
> is for load-balancing will.

There probably is a knob, but XID publication ought to be basically
free on an idle standby, so the real trade-off is between query
cancellation or replay delay on the standby, vs. cluster-wide bloat.

>> Sure.  But we can't forever ignore the fact that trigger-based
>> replication is not as performant as log-based replication.
>
> Watch me.  ;-)

s/can't/shouldn't/ ?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Greg Stark
Date:
On Mon, Oct 11, 2010 at 5:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> That's probably not going to happen until we have a way to update
> postgresql.conf via SQL.  Which, I maintain, as I have maintained
> before, is not going to happen until we get rid of the comments,
> because otherwise absolutely any implementation anyone proposes will
> get criticized for failing to handle them sensibly (because it is not
> possible to rewrite the file while handling the comments sensibly).
>

So we've been over this. All the pieces you need are already there:
you can handle this without any nasty comment grunging by just writing
the new setting to a postgresql.auto and including that from
postgresql.conf. Include a note in postgresql.auto warning users any
changes in this file will be thrown away when the file is rewritten.
This is the same method used in .emacs.custom or a million other
places people wanted automatically written config files.

Also incidentally pgadmin currently *does* rewrite postgresql.conf
while keeping the comments. I think that's not such a hot idea because
it interferes with things like debian configuration file management
and so on, but it's not impossible to do. It's just that separating
automatically written files from user-editable files is a better
long-term plan.

--
greg


Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 9:44 PM, Greg Stark <gsstark@mit.edu> wrote:
> On Mon, Oct 11, 2010 at 5:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> That's probably not going to happen until we have a way to update
>> postgresql.conf via SQL.  Which, I maintain, as I have maintained
>> before, is not going to happen until we get rid of the comments,
>> because otherwise absolutely any implementation anyone proposes will
>> get criticized for failing to handle them sensibly (because it is not
>> possible to rewrite the file while handling the comments sensibly).
>>
>
> So we've been over this. All the pieces you need are already there:
> you can handle this without any nasty comment grunging by just writing
> the new setting to a postgresql.auto and including that from
> postgresql.conf. Include a note in postgresql.auto warning users any
> changes in this file will be thrown away when the file is rewritten.
> This is the same method used in .emacs.custom or a million other
> places people wanted automatically written config files.

It seems that someone could have the following complaint: a setting
configured in whichever file gets read first could potentially be
ignored if it's also set in whichever file gets read second.  And that
might be confusing.

Still, maybe we should just ignore that problem and do it anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Gurjeet Singh
Date:
<div dir="ltr"><br /><div class="gmail_quote">On Mon, Oct 11, 2010 at 7:16 PM, Jeff Davis <span dir="ltr"><<a
href="mailto:pgsql@j-davis.com">pgsql@j-davis.com</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">On Mon,
2010-10-11at 16:07 -0400, Robert Haas wrote:<br /> > >> It would be far better if we could decouple master
cleanupfrom<br /> > >> standby cleanup, so that only the machine that actually has the old<br /> > >>
querygets bloated.  However, no one seems excited about writing that<br /> > >> code.<br /> > ><br />
>> That doesn't seem just a matter of code, it seems like a major design<br /> > > conflict.<br /> ><br
/>> Yes.  I had the idea of trying to fix this by allowing the standby to<br /> > retain old versions of entire
pagesthat got cleaned up on the master,<br /> > until the transactions that might want to read the old pages were<br
/>> gone.  But that may be prohibitively difficult, not sure.<br /><br /></div>I think you'd end up having a notion
ofa snapshot of block information<br /> (like a FS with snapshots) inside of postgres.<br /><br /> Sounds like a lot of
complexityto me, and the only benefit I see is<br /> moving bloat from the primary to the standby. Granted, that would
be<br/> nice, but I would expect some costs aside from just the complexity.<br /></blockquote></div><br clear="all" />I
hadthis idea when HS was being initially implemented, and I saw two problems with it.<br /><br />.) Identifying block
versionsin buffers. Our XID based MVCC like scheme would work, but we already have a block header which we would want
totrim.<br /><br />.) Effectively every block cleanup will have to create a copy of the block because more often than
notthere's at least one transaction in the system, and when it asks, we need to hand it the old version.<br /><br />
Regards,<br/> -- <br />gurjeet.singh<br />@ EnterpriseDB - The Enterprise Postgres Company<br /><a
href="http://www.EnterpriseDB.com">http://www.EnterpriseDB.com</a><br/><br />singh.gurjeet@{ gmail | yahoo }.com<br
/>Twitter/Skype:singh_gurjeet<br /><br />Mail sent from my BlackLaptop device<br /></div> 

Re: Issues with two-server Synch Rep

From
Robert Haas
Date:
On Mon, Oct 11, 2010 at 10:17 PM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> > Yes.  I had the idea of trying to fix this by allowing the standby to
>> > retain old versions of entire pages that got cleaned up on the master,
>> > until the transactions that might want to read the old pages were
>> > gone.  But that may be prohibitively difficult, not sure.
>>
>> I think you'd end up having a notion of a snapshot of block information
>> (like a FS with snapshots) inside of postgres.
>>
>> Sounds like a lot of complexity to me, and the only benefit I see is
>> moving bloat from the primary to the standby. Granted, that would be
>> nice, but I would expect some costs aside from just the complexity.
>
> I had this idea when HS was being initially implemented, and I saw two
> problems with it.
>
> .) Identifying block versions in buffers. Our XID based MVCC like scheme
> would work, but we already have a block header which we would want to trim.

I was thinking of stuffing the old versions of the blocks into a
relation fork... and then having some kind of mapping... thing...

> .) Effectively every block cleanup will have to create a copy of the block
> because more often than not there's at least one transaction in the system,
> and when it asks, we need to hand it the old version.

This is probably true.  And then, of course, you need the equivalent
of VACUUM to get rid of the old blocks, which kind of sucks.

Hence my earlier comments about being unsure that physical replication
is the way to go.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Robert Haas <robertmhaas@gmail.com> writes:
> That's probably not going to happen until we have a way to update
> postgresql.conf via SQL.  Which, I maintain, as I have maintained
> before, is not going to happen until we get rid of the comments,
> because otherwise absolutely any implementation anyone proposes will
> get criticized for failing to handle them sensibly (because it is not
> possible to rewrite the file while handling the comments sensibly).

I'm sorry but you don't need to get rid of the comments *at all* to have
the facility to update configuration from SQL.

What you need to do is make the choice that the setup is either human
friendly or SQL friendly. You will lose something that you now have to
be able to use SQL commands to edit the setup. Re-read. Ready now?

So, what you do is have a file per GUC, file name is the GUC name, first
line contains *ONLY* current value, the rest of the file is comments.

Note that it's still possible to hand-edit your setup. It's not even
hard to do. If you prefer a single file without comments, it's even
trivial to come up with a script that prepares the machine friendly
layout.

And it's even possible to offer DBAs a choice here: if postgresql.conf
exists, it's used in the old way and the SQL command only issues errors,
if there's a postgresql.conf.d and no postgresql.conf, the SQL commands
are working fine. Add an option for initdb to choose one layout or the
other.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Andrew Dunstan
Date:

On 10/12/2010 05:02 PM, Dimitri Fontaine wrote:
>   So, what you do is have a file per GUC, file name is the GUC name, first
> line contains *ONLY* current value, the rest of the file is comments.

You're joking, right?

cheers

andrew


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/12/2010 05:02 PM, Dimitri Fontaine wrote:
>>   So, what you do is have a file per GUC, file name is the GUC name, first
>> line contains *ONLY* current value, the rest of the file is comments.
>
> You're joking, right?

No. I just want both comments and SQL commands. If you refuse this
simple file scheme, keep your postgresql.conf and don't remote edit it.

That's my proposal, I'm happy that it comes with laughter :)
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Darren Duncan
Date:
Dimitri Fontaine wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 10/12/2010 05:02 PM, Dimitri Fontaine wrote:
>>>   So, what you do is have a file per GUC, file name is the GUC name, first
>>> line contains *ONLY* current value, the rest of the file is comments.
>> You're joking, right?
> 
> No. I just want both comments and SQL commands. If you refuse this
> simple file scheme, keep your postgresql.conf and don't remote edit it.
> 
> That's my proposal, I'm happy that it comes with laughter :)

Maybe I missed something important, but why is it not possible to retain the 
single existing postgres.conf file format (human writable) *and* have it 
machine/SQL-editable *and* maintain the comments?  I should think that it would 
be possible to do all of these without too much trouble.  All you would need is 
for the file parser to retain the comments as metadata, include them in the 
relations that the SQL commands see where the latter can also edit them as data, 
and then write out the updated file with comments.  The fact that Postgres 
already explicitly supports comment metadata in its system catalog means it must 
already know something about this.  If something is missing, then expand the 
catalog so it represents all the details you want to preserve. -- Darren Duncan


Re: SQL command to edit postgresql.conf, with comments

From
"Kevin Grittner"
Date:
Darren Duncan <darren@darrenduncan.net> wrote:
> why is it not possible to retain the single existing postgres.conf
> file format (human writable) *and* have it machine/SQL-editable
> *and* maintain the comments?
My recollection of the previous discussion wasn't that there was no
way to do it, but rather that each person could thing of a somewhat
different way to do it and no two people agreed on which was best,
so no consensus was reached.
-Kevin


Re: SQL command to edit postgresql.conf, with comments

From
Josh Berkus
Date:
Darren, All,

> All you would need is for the file parser to retain the
> comments as metadata, include them in the relations that the SQL
> commands see where the latter can also edit them as data, and then write
> out the updated file with comments.

"All you need to do" in order to trisect and angle is to divide it into
three equal parts using just a compass and straightedge.

Given the number of different ways people can comment up pg.conf files,
and the fact that they can have the same settings set multiple times in
the file, both commented and uncommented, and that both the settings and
the comments can be in any order, with user-created spacing, you've
added a *huge* burden of parsing code to any .conf updater.  One which
would, in terms of LOC, exceed the updating code by more than 300%.

>  The fact that Postgres already
> explicitly supports comment metadata in its system catalog means it must
> already know something about this.

We support what?

The solution to this is simple, and was previously discussed on this list:

(a) have a postgresql.conf.auto
(b) add a default include for postgresql.conf.auto at the beginning of
PostgreSQL.conf
(c) SQL updates go to postgresql.conf.auto, which consists only of
"setting = value" in alphabetical order.
(d) We document that settings which are changed manually in
postgresql.conf will override postgresql.conf.auto.
(e) $$profit$$!!

--Josh Berkus

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: SQL command to edit postgresql.conf, with comments

From
Darren Duncan
Date:
Josh Berkus wrote:
>>  The fact that Postgres already
>> explicitly supports comment metadata in its system catalog means it must
>> already know something about this.
> 
> We support what?

Postgres has "COMMENT ON ..." SQL for various database objects and I assumed 
that said comments would be stored in the system catalog.
  http://www.postgresql.org/docs/9.0/interactive/sql-comment.html

> The solution to this is simple, and was previously discussed on this list:
> 
> (a) have a postgresql.conf.auto
> (b) add a default include for postgresql.conf.auto at the beginning of
> PostgreSQL.conf
> (c) SQL updates go to postgresql.conf.auto, which consists only of
> "setting = value" in alphabetical order.
> (d) We document that settings which are changed manually in
> postgresql.conf will override postgresql.conf.auto.
> (e) $$profit$$!!

I agree that this looks like an effective solution.

-- Darren Duncan


Re: SQL command to edit postgresql.conf, with comments

From
Josh Berkus
Date:
> Postgres has "COMMENT ON ..." SQL for various database objects and I
> assumed that said comments would be stored in the system catalog.

Oh.  Now that's an interesting perspective ... you're suggesting that we
take the comments and apply them as COMMENTS on the specific pg_settings?

That wouldn't solve the issues of ordering, or of comments in the file
not associated with a setting, but might be a good 80% solution.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: SQL command to edit postgresql.conf, with comments

From
Richard Broersma
Date:
On Tue, Oct 12, 2010 at 3:54 PM, Josh Berkus <josh@agliodbs.com> wrote:

> Oh.  Now that's an interesting perspective ... you're suggesting that we
> take the comments and apply them as COMMENTS on the specific pg_settings?

On a side note regarding comments, I'd like to make a request for a
more comprehensive commenting mechanism.  The first though that comes
to my mind would allow for comments to be stored and annotated using
XML or sgml.  It'd be nice to be able to generate user documentation
from selected comments taken from application derived database
objects.

I don't know, maybe this is already possible.

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> Postgres has "COMMENT ON ..." SQL for various database objects and I
>> assumed that said comments would be stored in the system catalog.

> Oh.  Now that's an interesting perspective ... you're suggesting that we
> take the comments and apply them as COMMENTS on the specific pg_settings?

Well, if the settings were stored in a catalog ... which they are not
... that might be a useful idea.

The reason that COMMENT ON isn't a terribly helpful analogy is that it
puts the burden on the user to associate specific comment texts with
specific database objects.  The problem with parsing and modifying
freestyle comments as seen in postgresql.conf is exactly (or mostly)
that there's no reasonable way of mechanically determining which
comments are associated with which setting.  We could invent some
arbitrary rule or other, but it'd have little to do with the way
people perceive what's in such a file.

I agree with Josh's proposal: keep mechanically-generated settings in a
separate file, and don't even pretend to allow comments to be kept there.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Darren Duncan
Date:
Josh Berkus wrote first:
>> Postgres has "COMMENT ON ..." SQL for various database objects and I
>> assumed that said comments would be stored in the system catalog.
> 
> Oh.  Now that's an interesting perspective ... you're suggesting that we
> take the comments and apply them as COMMENTS on the specific pg_settings?

Yes, something along those lines.

I should point out that in general I consider a "COMMENT ON" to just be a less 
ambiguous form of a "-- " or "/* */" and that it would be a good idea for a code 
or config parser to preserve comments given in the latter formats in the same 
way it preserves the former when it can reasonably infer what to associate the 
latter comments with.

> That wouldn't solve the issues of ordering, or of comments in the file
> not associated with a setting, but might be a good 80% solution.

Yes, and for something like this an 80% solution is good.

As for ordering, that can be handled with more metadata, like a "visual order" 
number.

Tom Lane wrote second:
> Well, if the settings were stored in a catalog ... which they are not
> ... that might be a useful idea.

The catalog is where they *should* be stored, at least from the user's perspective.

> The reason that COMMENT ON isn't a terribly helpful analogy is that it
> puts the burden on the user to associate specific comment texts with
> specific database objects.  The problem with parsing and modifying
> freestyle comments as seen in postgresql.conf is exactly (or mostly)
> that there's no reasonable way of mechanically determining which
> comments are associated with which setting.  We could invent some
> arbitrary rule or other, but it'd have little to do with the way
> people perceive what's in such a file.

There's no reason that you can't have both kinds of comments.  You could have 
comments that are specially formatted such that the parser will then recognize 
they are associated with something specific and so put them in the system 
catalog.  And then you can have other comments not formatted that way and the 
parser will then pass over them like whitespace.

> I agree with Josh's proposal: keep mechanically-generated settings in a
> separate file, and don't even pretend to allow comments to be kept there.

I agree with the separate files approach as being the most practical in at least 
the short term.  However, I think that this file could support comments too, but 
they would just be limited to end-of-line comments on the same line as the 
setting, and it would be only these comments appearing in the system catalog by 
default.  Comments in the file for user editing would only appear in the catalog 
if specially formatted, which for now could just mean taking end-of-line comments.

-- Darren Duncan


Re: SQL command to edit postgresql.conf, with comments

From
Darren Duncan
Date:
Richard Broersma wrote:
> On Tue, Oct 12, 2010 at 3:54 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Oh.  Now that's an interesting perspective ... you're suggesting that we
>> take the comments and apply them as COMMENTS on the specific pg_settings?
> 
> On a side note regarding comments, I'd like to make a request for a
> more comprehensive commenting mechanism.  The first though that comes
> to my mind would allow for comments to be stored and annotated using
> XML or sgml.  It'd be nice to be able to generate user documentation
> from selected comments taken from application derived database
> objects.
> 
> I don't know, maybe this is already possible.

When you start going there, you have new issues to consider.  (For the record I 
also prefer plain text for comments.)

I should point out that the group making the Perl 6 language has already been 
looking into such matters extensively of essentially unifying code comments and 
code documentation into a common metadata both accessible in the source code and 
programmatically at runtime.

I think this approach goes beyond comments as we normally know them, which I 
just think of plain text strings associated with some code element.

But if you want to pursue bringing documentation into this, I suggest looking at 
what Perl 6, and other languages, have done.

While some of the results of the Perl 6 discussion may have just been in the 
forums, these urls at least are related to it:
 - http://perlcabal.org/syn/S02.html#Whitespace_and_Comments -
http://github.com/perl6/specs/raw/master/S26-documentation.podfor
 

I'm not proposing adopting their syntax, but some features or design may be 
useful to learn from.

I also want to point out that the FoxPro language constitutes some prior art 
about including comments as data fields in their runtime-accessible objects.

-- Darren Duncan


Re: Issues with two-server Synch Rep

From
Fujii Masao
Date:
On Fri, Oct 8, 2010 at 3:05 AM, Josh Berkus <josh@agliodbs.com> wrote:
> Adding a Synch Standby
> -----------------------
> What is the procedure for adding a new synchronous standby in your
> implementation?  That is, how do we go from having a standby server with
> an empty PGDATA to having a working synchronous standby?

In my patch, you still have to take a base backup from the master
and start the standby from that.

> Snapshot Publication
> ---------------------
> During 9.0 development discussion, one of the things we realized we
> needed for synch standby was publication of snapshots back to the master
> in order to prevent query cancel on the standby.  Without this, the
> synch standby is useless for running read queries.  Does your patch
> implement this?

No. I think that this has almost nothing to do with sync rep itself.

To solve this problem, I think that we should implement the mechanism
like UNDO segment on the standby instead of snapshot publication. That
is, the replay of the VACUUM operation copies the old version of tuple
to somewhere instead of removing it immediately, until all the transactions
which can see that have gone away. Though it would be difficult to
implement this.

> Management
> -----------
> One of the serious flaws currently in HS/SR is complexity of
> administration.  Setting up and configuring even a single master and
> single standby requires editing up to 6 configuration files in Postgres,
> as well as dealing with file permissions.  As such, any Synch Rep patch
> must work together with attempts to simplify administration.  How does
> your design do this?

No. What is worse is that my patch introduces new configuration file
standbys.conf.

Aside from the patch, I agree to specify the synchronous standbys in
the existing conf file like postgresql.conf on the master or recovery.conf
on the standby instead of adding new conf file.

> Monitoring
> -----------
> Synch rep offers severe penalties to availability if a synch standby
> gets behind or goes down.  What replication-specific monitoring tools
> and hooks are available to allow administators to take action before the
> database becomes unavailable?

Yeah, if you choose recv or fsync as synchronization level, recovery on
the standby might fall behind the master even in sync rep. This delay
would increase the failover time.

To monitor that, you can use pg_current_xlog_location on the master and
pg_last_xlog_replay_location on the standby. My patch hasn't provided
another monitoring mechanism.

> Degradation
> ------------
> In the event that the synch rep standby falls too far behind or becomes
> unavailable, or is deliberately taken offline, what are you envisioning
> as the process for the DBA resolving the situation?  Is there any
> ability to commit "stuck" transactions?

Since my patch hasn't provided wait-forever option, there is obviously
no capability to resume the transactions which are waiting until the
standby has caught up with.

> Client Consistency
> ---------------------
> With a standby in "apply" mode, and a master failure at the wrong time,
> there is the possibility that the Standby will apply a transaction at
> the same time that the master crashes, causing the client to never
> receive a commit message.  Once the client reconnects to the standby,
> how will it know whether its transaction was committed or not?

This problem can happen even if you don't use replication. So this
should be addressed separately from sync rep.

> As a lesser case, a standby in "apply" mode will show the results of
> committed transactions *before* they are visible on the master.  Is
> there any need to handle this?  If so, how?

The cluster-side snapshot would be required in order to ensure that
all the standbys return the same result. "Export snapshots to other sessions"
feature which was proposed in Cluster Developer Meeting is one step for
that, I think.
http://wiki.postgresql.org/wiki/ClusterFeatures

> Performance
> ------------
> As with XA, synch rep has the potential to be so slow as to be unusable.
>  What optimizations to you make in your approach to synch rep to make it
> faster than two-phase commit?  What other performance optimizations have
> you added?

To allow walsender to send the WAL records which have not been fsync'd yet
(i.e., WAL is written and sent in parallel) would increase the performance
significantly. Obviously my patch has not provided this improvement.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I agree with Josh's proposal: keep mechanically-generated settings in a
> separate file, and don't even pretend to allow comments to be kept there.

And then, when you SET PERMANENT knob TO value (or whatever the syntax
is), you never know what value is selected at next startup or SIGHUP.

I know I'm alone on this, but I much prefer the all-machine-friendly
proposal that still makes it possible to hand-edit the files. You get
remote editing, comments, and the code is *very easy* to write.

The only drawback is that we're not used to it so it might look odd, or
outright ugly. I mean, the directory containing the files that you're
not supposed to edit manually at all any more looks strange. How big a
problem is that, when it allows for implementing the feature easily?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Robert Haas
Date:
On Wed, Oct 13, 2010 at 3:41 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> I agree with Josh's proposal: keep mechanically-generated settings in a
>> separate file, and don't even pretend to allow comments to be kept there.
>
> And then, when you SET PERMANENT knob TO value (or whatever the syntax
> is), you never know what value is selected at next startup or SIGHUP.
>
> I know I'm alone on this, but I much prefer the all-machine-friendly
> proposal that still makes it possible to hand-edit the files.

You're not alone on this at all: I agree 100%.  I don't like your
proposed syntax, but I completely agree with your concern.  I don't
see what's wrong with having the initial contents of postgresql.conf
look like this (these are the settings that are uncommented by default
on my machine):

# type "man postgresql.conf" for help on editing this file
max_connections = 100
shared_buffers = 32MB
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'

When you type 'man postgresql.conf' it can tell you all of the things
that are currently in the file as comments.  It's just that they'll be
in a man page rather in the file itself.  I don't see what's bad about
that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Issues with two-server Synch Rep

From
Peter Eisentraut
Date:
On mån, 2010-10-11 at 18:44 -0700, Greg Stark wrote:
> So we've been over this. All the pieces you need are already there:
> you can handle this without any nasty comment grunging by just writing
> the new setting to a postgresql.auto and including that from
> postgresql.conf. Include a note in postgresql.auto warning users any
> changes in this file will be thrown away when the file is rewritten.
> This is the same method used in .emacs.custom or a million other
> places people wanted automatically written config files.

Seems like a reasonable solution, although I've never heard
of .emacs.custom; my emacsen have always written their custom settings
somewhere in the middle of the .emacs proper.



Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> You're not alone on this at all: I agree 100%.  I don't like your
> proposed syntax, but I completely agree with your concern.  I don't
> see what's wrong with having the initial contents of postgresql.conf
> look like this (these are the settings that are uncommented by default
> on my machine):

> # type "man postgresql.conf" for help on editing this file
> max_connections = 100
> shared_buffers = 32MB
> datestyle = 'iso, mdy'
> lc_messages = 'en_US.UTF-8'
> lc_monetary = 'en_US.UTF-8'
> lc_numeric = 'en_US.UTF-8'
> lc_time = 'en_US.UTF-8'
> default_text_search_config = 'pg_catalog.english'

I'm not sure if anybody is particularly against the initial contents
looking like that.  The big problem, which both you and Dimitri are
conveniently ignoring, is that if people are allowed to hand-edit
the file they are going to introduce comments that no mechanical
parser will do a nice job of preserving.  And they're not going to be
happy when SET PERMANENT has a side-effect of throwing away their
comments.

I don't see anything particularly wrong with Josh's proposal of keeping
machine-generated and person-generated text in separate files.  Dimitri
complains that the behavior will be confusing if there are conflicting
settings, but I think that's hogwash.  We already have the ability for
pg_settings to tell you which file, and even which line, set the active
value of the setting.  It's not going to be hard for people to figure
that out.
        regards, tom lane


Re: Issues with two-server Synch Rep

From
Simon Riggs
Date:
On Thu, 2010-10-07 at 11:05 -0700, Josh Berkus wrote:
> Simon, Fujii,
> 
> What follows are what I see as the major issues with making two-server
> synch replication work well.  I would like to have you each answer them,
> explaining how your patch and your design addresses each issue.  I
> believe this will go a long way towards helping the majority of the
> community understand the options we have from your code, as well as
> where help is still needed.

Happy to answer your questions. Please add me to the copy list if you
address me directly.

> Adding a Synch Standby
> -----------------------
> What is the procedure for adding a new synchronous standby in your
> implementation?  That is, how do we go from having a standby server with
> an empty PGDATA to having a working synchronous standby?

Same as adding a streaming standby.

Only difference is that *if* you don't want standby to be a synch
standby then you would set synchronous_replication_service = off

My understanding is that other approaches are significantly more complex
at this point, with required changes on the master, and also on the
standby should we wish the standby to be a failover target.

> Snapshot Publication
> ---------------------
> During 9.0 development discussion, one of the things we realized we
> needed for synch standby was publication of snapshots back to the master
> in order to prevent query cancel on the standby.  Without this, the
> synch standby is useless for running read queries.  

Don't see much difference there. 

This isn't isn't needed for sync rep. It can be added, as soon as we
have a channel to pass info back from standby to master. That is a small
commit that can be added after we commit something; I will handle that -
it is a requirement that will be addressed.

> Does your patch
> implement this?  Please describe.

No, but that isn't needed for sync rep.

> Management
> -----------
> One of the serious flaws currently in HS/SR is complexity of
> administration.  Setting up and configuring even a single master and
> single standby requires editing up to 6 configuration files in Postgres,
> as well as dealing with file permissions.  As such, any Synch Rep patch
> must work together with attempts to simplify administration.  How does
> your design do this?

Simplification of the existing framework is possible, though is not a
goal of sync rep. My proposed approach is to add as few mandatory
parameters as possible to avoid over-complexity.

Complexity of administration is very important, because getting it wrong
has a critical impact on availability and can lead to data loss.

In the two node case this post covers, my patch requires 1 parameter,
added to the existing postgresql.conf on the master. That parameter does
not need to be changed should failover occur. So no parameter changes
are required at failover, nor can mistakes happen because of
misconfiguration.

> Monitoring
> -----------
> Synch rep offers severe penalties to availability if a synch standby
> gets behind or goes down.  What replication-specific monitoring tools
> and hooks are available to allow administators to take action before the
> database becomes unavailable?

I don't see any differences here. It's easy to add an SRF that shows
current status of standbys.

> Degradation
> ------------
> In the event that the synch rep standby falls too far behind or becomes
> unavailable, or is deliberately taken offline, what are you envisioning
> as the process for the DBA resolving the situation?  

Add a new standby as quickly as possible. This only happens if the DBA
had not provided sufficient standbys in the first place.

> Is there any
> ability to commit "stuck" transactions?

Yes, an operator function.

> Client Consistency
> ---------------------
> With a standby in "apply" mode, and a master failure at the wrong time,
> there is the possibility that the Standby will apply a transaction at
> the same time that the master crashes, causing the client to never
> receive a commit message.  Once the client reconnects to the standby,
> how will it know whether its transaction was committed or not?

It wouldn't, but this situation already occurs even without sync rep.
Any user issuing COMMIT at time of server crash may that their
transaction was committed and they received no commit message. There is
no "tell me if the last thing I did worked" function, since the client
doesn't record the xid.

> As a lesser case, a standby in "apply" mode will show the results of
> committed transactions *before* they are visible on the master.  Is
> there any need to handle this?  If so, how?

No need to handle it. It's how it works. As long as there are more than
one clog then we will have commits happening at different times.

> Performance
> ------------
> As with XA, synch rep has the potential to be so slow as to be unusable.
>  What optimizations to you make in your approach to synch rep to make it
> faster than two-phase commit?  What other performance optimizations have
> you added?

Applications implementing sync rep will be very sensitive to the
performance we provide. Designed-in performance will be critical.
Providing both performance and application flexibility is a cornerstone
of my design. Sync rep does not have to be slow, nor do we need to make
it unusable.

* Master-side transaction controlled replication allows parameters to
control behaviour within applications.

* Bulk acknowledgement - the standby doesn't send back details of each
individual waiting transaction, nor does it know or care. This means the
response messages are very small and only sent when status changes.

* First reponder processing is faster than quorum_commit > 1.

* In a later patch, WAL writer will be active during recovery, so that
WALreceiver doesn't need to fsync. This is required to implement the
"recv" mode which is very important for performance. Heikki asked me to
remove this from my current patch, but its easy to add back in again
soon afterwards. It's very important, IMHO.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: SQL command to edit postgresql.conf, with comments

From
Andrew Dunstan
Date:

On 10/13/2010 10:25 AM, Tom Lane wrote:
> Robert Haas<robertmhaas@gmail.com>  writes:
>> You're not alone on this at all: I agree 100%.  I don't like your
>> proposed syntax, but I completely agree with your concern.  I don't
>> see what's wrong with having the initial contents of postgresql.conf
>> look like this (these are the settings that are uncommented by default
>> on my machine):
>> # type "man postgresql.conf" for help on editing this file
>> max_connections = 100
>> shared_buffers = 32MB
>> datestyle = 'iso, mdy'
>> lc_messages = 'en_US.UTF-8'
>> lc_monetary = 'en_US.UTF-8'
>> lc_numeric = 'en_US.UTF-8'
>> lc_time = 'en_US.UTF-8'
>> default_text_search_config = 'pg_catalog.english'
> I'm not sure if anybody is particularly against the initial contents
> looking like that.  The big problem, which both you and Dimitri are
> conveniently ignoring, is that if people are allowed to hand-edit
> the file they are going to introduce comments that no mechanical
> parser will do a nice job of preserving.  And they're not going to be
> happy when SET PERMANENT has a side-effect of throwing away their
> comments.
>
> I don't see anything particularly wrong with Josh's proposal of keeping
> machine-generated and person-generated text in separate files.  Dimitri
> complains that the behavior will be confusing if there are conflicting
> settings, but I think that's hogwash.  We already have the ability for
> pg_settings to tell you which file, and even which line, set the active
> value of the setting.  It's not going to be hard for people to figure
> that out.

+1. Preserving the comments when you change the value could make the 
comments totally bogus. Separating machine-generated values into a 
separate file makes plenty of sense to me.

Which one wins, though? I can see cases being made for both.

cheers

andrew


Re: Issues with two-server Synch Rep

From
Simon Riggs
Date:
On Mon, 2010-10-11 at 11:07 -0700, Josh Berkus wrote:

> Absolutely.  For a synch standby, you can't tolerate any standby delay
> at all.  This means that anywhere from 1/4 to 3/4 of queries on the
> standby would be cancelled on any high-traffic OLTP server.  Hence,
> "useless".

Don't agree with your numbers there and you seem to be assuming no
workarounds would be in use. A different discussion, I think.

> Interaction?  My opinion is that the two are completely incompatible.
> You can't have synch rep and also have standby_delay > 0.

I would agree that adding an "apply" mode only makes sense when we have
master feedback to ensure that standby delay is minimised. But that's
not the only use case for sync rep and it doesn't actually help that
much. 

Adding the feedback channel looks trivial to me, once we've got the main
sync rep patch in. I'll handle that.

For this reason, I've removed the "apply" mode from my patch, for now. I
want to get the simplest possible patch agreed and then add things
later.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> +1. Preserving the comments when you change the value could make the 
> comments totally bogus. Separating machine-generated values into a 
> separate file makes plenty of sense to me.

> Which one wins, though? I can see cases being made for both.

IIRC the proposal was that postgresql.conf (the people-editable file)
would have "include postgresql.auto" in it.  You could put that at
the top, bottom, or even middle to control how the priority works.
So it's user-configurable.  I think the factory default ought to
be to have it at the top, which would result in manually edited
settings (if any) overriding SET PERMANENT.

Basically the way I'd like to see this go is that SET PERMANENT
gets attached on the side of what's there now, and people who are
used to the old way don't have to change their habits at all.
If the new way is as much better as its advocates claim, use of
manual editing of postgresql.conf will soon die off; then at some
future date we could consider whether to remove that file or at
least delete all the comments it contains out-of-the-box.  About
the only change I want to make immediately is that initdb ought
to shove its settings into postgresql.auto instead of mucking with
postgresql.conf.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
David Christensen
Date:
On Oct 13, 2010, at 10:24 AM, Tom Lane wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
>> +1. Preserving the comments when you change the value could make the
>> comments totally bogus. Separating machine-generated values into a
>> separate file makes plenty of sense to me.
>
>> Which one wins, though? I can see cases being made for both.
>
> IIRC the proposal was that postgresql.conf (the people-editable file)
> would have "include postgresql.auto" in it.  You could put that at
> the top, bottom, or even middle to control how the priority works.
> So it's user-configurable.  I think the factory default ought to
> be to have it at the top, which would result in manually edited
> settings (if any) overriding SET PERMANENT.

Since this is just touching the local servers' postgresql.conf.auto (or whatever), any reason why SET PERMANENT
couldn'tbe used on a read-only standby?  Could this be to manage some of the failover scenarios (i.e., setting any
relevantconfig from a central clusterware|whatever)? 

Regards,

David
--
David Christensen
End Point Corporation
david@endpoint.com






Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm not sure if anybody is particularly against the initial contents
> looking like that.  The big problem, which both you and Dimitri are
> conveniently ignoring, is that if people are allowed to hand-edit
> the file they are going to introduce comments that no mechanical
> parser will do a nice job of preserving.

IMO the only reason why my proposal is sound is that is address the
point. Consider:

cat postgresql.conf.d/work_mem
16MB
This database needs at least such a value.
Note it's overridden in some ROLEs setup.


With such a format (name is filename, value is first line content's,
rest is comments), it's easy to preserve comments and have them machine
editable. What do I miss?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Robert Haas
Date:
On Wed, Oct 13, 2010 at 10:25 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> You're not alone on this at all: I agree 100%.  I don't like your
>> proposed syntax, but I completely agree with your concern.  I don't
>> see what's wrong with having the initial contents of postgresql.conf
>> look like this (these are the settings that are uncommented by default
>> on my machine):
>
>> # type "man postgresql.conf" for help on editing this file
>> max_connections = 100
>> shared_buffers = 32MB
>> datestyle = 'iso, mdy'
>> lc_messages = 'en_US.UTF-8'
>> lc_monetary = 'en_US.UTF-8'
>> lc_numeric = 'en_US.UTF-8'
>> lc_time = 'en_US.UTF-8'
>> default_text_search_config = 'pg_catalog.english'
>
> I'm not sure if anybody is particularly against the initial contents
> looking like that.  The big problem, which both you and Dimitri are
> conveniently ignoring, is that if people are allowed to hand-edit
> the file they are going to introduce comments that no mechanical
> parser will do a nice job of preserving.  And they're not going to be
> happy when SET PERMANENT has a side-effect of throwing away their
> comments.

But creating a separate file doesn't fix that problem.  It just moves
it around.  If people will expect comments in postgresql.conf to get
preserved, then why won't they also expect comments in
postgresql.conf.auto to get preserved?

If the answer is "because postgresql.conf has always worked that way
before", then add one more line to the proposed initial contents
saying it's not true any more.  It can be the same line you were going
to put in postgresql.conf.auto explaining the same fact.

# The SQL command SET PERMANENT rewrites this file and may move or
remove comment lines.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> But creating a separate file doesn't fix that problem.  It just moves
> it around.  If people will expect comments in postgresql.conf to get
> preserved, then why won't they also expect comments in
> postgresql.conf.auto to get preserved?

Because postgresql.conf.auto will have a nice leading comment telling
people (1) not to hand-edit the file, (2) if they do so anyway,
not to expect comments to be preserved, and (3) the place to do manual
editing of settings is postgresql.conf.

> If the answer is "because postgresql.conf has always worked that way
> before", then add one more line to the proposed initial contents
> saying it's not true any more.

Sorry, wrong answer.  The objection to this is not whether you tell
people that you're taking away the ability to keep useful comments
in postgresql.conf, it's that you're taking away the ability.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Josh Berkus
Date:
> IIRC the proposal was that postgresql.conf (the people-editable file)
> would have "include postgresql.auto" in it.  You could put that at
> the top, bottom, or even middle to control how the priority works.
> So it's user-configurable.  I think the factory default ought to
> be to have it at the top, which would result in manually edited
> settings (if any) overriding SET PERMANENT.

Right, I think that's the behavior which will result in the least newbie
confusion.

So what we'd add to postgresql.conf would look something like this:

# Link to auto-generated configuration file.
# Do not edit the auto-generated file or remove this link;
# instead, edit settings below in this file and they will
# override the auto-generated file.

include 'postgresql.conf.auto'

> the only change I want to make immediately is that initdb ought
> to shove its settings into postgresql.auto instead of mucking with
> postgresql.conf.

That would be good.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: SQL command to edit postgresql.conf, with comments

From
Garick Hamlin
Date:
On Wed, Oct 13, 2010 at 12:56:15PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > But creating a separate file doesn't fix that problem.  It just moves
> > it around.  If people will expect comments in postgresql.conf to get
> > preserved, then why won't they also expect comments in
> > postgresql.conf.auto to get preserved?
> 
> Because postgresql.conf.auto will have a nice leading comment telling
> people (1) not to hand-edit the file, (2) if they do so anyway,
> not to expect comments to be preserved, and (3) the place to do manual
> editing of settings is postgresql.conf.
> 
> > If the answer is "because postgresql.conf has always worked that way
> > before", then add one more line to the proposed initial contents
> > saying it's not true any more.
> 
> Sorry, wrong answer.  The objection to this is not whether you tell
> people that you're taking away the ability to keep useful comments
> in postgresql.conf, it's that you're taking away the ability.
> 
>             regards, tom lane

I like this approach.  I was just wondering if there is a simple 
tweak to this schema to make it work more easily with standbys.  If
there was a GUC that controlled the 'auto filename' and it could expand
something like %h to hostname (or name if we had something like standby 
registration).  This would allow each standby to store its local settings 
in a different location and have something like a unified set of config
files.  

I suppose something like symlinking postgresql.auto <hostname>.auto on each 
machine might achieve a similar effect...

Garick
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


Re: Issues with two-server Synch Rep

From
Dimitri Fontaine
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Seems like a reasonable solution, although I've never heard
> of .emacs.custom; my emacsen have always written their custom settings
> somewhere in the middle of the .emacs proper.

See (info "(emacs) Saving Customizations") and add the following to your
setup:    (setq custom-file "~/.emacs-custom.el")    (load custom-file)
 http://www.gnu.org/software/emacs/manual/html_mono/emacs.html#Saving-Customizations

I guess it's relevant enough here.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> cat postgresql.conf.d/work_mem
> 16MB
> This database needs at least such a value.
> Note it's overridden in some ROLEs setup.
>
>
> With such a format (name is filename, value is first line content's,
> rest is comments), it's easy to preserve comments and have them machine
> editable. What do I miss?

Allow me to insist on this some more, because there's something
important going on here. The other proposal (.auto) have a major failure
mode that I don't think is acceptable.
 SET PERMANENT work_mem TO '8 MB'; select pg_reload_conf();

There's simply no way after this sequence to guess the current active
value of work_mem, because it could be overridden in the non-automatic
file. How do you work with such a system?


So, back to my funny proposal. It does not have the problem above
because it's either postgresql.conf or postgresql.conf.d, not both. A
single source of authority.

I'm being told that we're talking about something over 200 files and
that's too many. That's a problem size we already know how to handle, I
would say:

dim ~/dev/PostgreSQL/postgresql-extension ls -C1 doc/src/sgml/ref/ |wc -l    166

Ok, now, some people want to still be able to edit the files by hand
because it's the way they do it now. Well, first, that's still possible
and easy to script if so you care, then again, open an editor a prepare
a SQL script:
set permanent foo to bar with comment 'blah';set …

Then you psql -f the script and you're done. So you use the "remote
edit" feature to be able to edit a single file rather than a bunch of
them if so you want.

Another option in the same spirit would be to prepare a file in the CSV
format (name;value) and have a script that COPY it into some table, then SELECT set_config(name, value, 'permanent')
FROMimport_setup_table; 

With a CSV foreign data wrapper at the door, you don't even have to use
an intermediate table where to direct your COPY.

Executive Summary: you can still edit some file the old way, you just
need a tool to then issue the SQL for you rather than reload.

Now, there are some comments that are not tied to any setting in
particular. We could extend my idea to include a .comments file in the
directory, and a command to edit it.

We could make it so that COMMENT ON SETTING would just amend the given
GUC's file, and that COMMENT ON CONFIG allow you to edit the main
.comment file.

Then, there's another feature here, which is having a single comment
entry for more than one parameter. Again, we could support it rather
easily by having directories to group files, and the common comment
would be the groupname/.comment file entry. Add some new commands to
manage that, and refuse to consider the setup at all as soon as there's
file name duplication in there.
 CREATE SETTING GROUP name WITH foo, bar, ... AND COMMENT 'foobar'; CREATE SETTING GROUP other WITH baz, foo; NOTICE:
foowas in the group name, it's been moved to other. 

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Ok, now, some people want to still be able to edit the files by hand

I wonder if providing a system function to load some configuration
option from a file, using the current parser, would do it:
 SELECT pg_load_conf('path/to/filename.conf', permament => true);

Of course comments are ignored. Settings in the directory would be
changed according to what's in your file, so that it'd be roughtly
equivalent as current steps:a. edit the fileb. use pg_load_conf() rather than pg_reload_conf()

Comments?
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Allow me to insist on this some more, because there's something
> important going on here. The other proposal (.auto) have a major failure
> mode that I don't think is acceptable.

>   SET PERMANENT work_mem TO '8 MB';
>   select pg_reload_conf();

> There's simply no way after this sequence to guess the current active
> value of work_mem,

Um, other than "show work_mem" or "select from pg_settings"?

The fact is that you cannot know the active value anyway without
checking, because what you did with SET PERMANENT might be overridden
in various session-local ways.  The proposal for hand-edited versus
machine-edited files just adds one more layer of possible overrides
to the existing half-dozen layers, all of which are widely considered
features not bugs.  So I see no merit in your argument.

> I'm being told that we're talking about something over 200 files and
> that's too many.

Yup, you're dead right about that.  Backup/restore of configurations
would become a real mess.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> I wonder if providing a system function to load some configuration
> option from a file, using the current parser, would do it:

>   SELECT pg_load_conf('path/to/filename.conf', permament => true);

That seems like a pretty bizarre idea.  The settings wouldn't be
persistent would they?  Or are you proposing this as a substitute
way of providing input for SET PERMANENT?  If so what's the value?
It seems to me that it just introduces unnecessary failure modes
(ie, server can't read file because of permissions) without any
really useful new functionality.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Um, other than "show work_mem" or "select from pg_settings"?
>
> The fact is that you cannot know the active value anyway without
> checking, because what you did with SET PERMANENT might be overridden
> in various session-local ways.  The proposal for hand-edited versus
> machine-edited files just adds one more layer of possible overrides
> to the existing half-dozen layers, all of which are widely considered
> features not bugs.  So I see no merit in your argument.

I understand that. I just think there are already too many sources for
GUCs and would welcome that there's a single possible source file with a
complete remote editing feature.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
>>   SELECT pg_load_conf('path/to/filename.conf', permament => true);
>
> That seems like a pretty bizarre idea.  The settings wouldn't be
> persistent would they?  Or are you proposing this as a substitute
> way of providing input for SET PERMANENT?  If so what's the value?

Yeah, it only has value if those two conditions are met:1. there's a single source file possible for the
configuration2.it's not in the format you'd like it to be for easy hand editing
 

As the premise aren't reached, I agree such a function would have no value.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> I understand that. I just think there are already too many sources for
> GUCs and would welcome that there's a single possible source file with a
> complete remote editing feature.

[ shrug... ]  So don't use the option of hand-editing postgresql.conf.
You're not being forced to do that, and on the other side of the coin,
you shouldn't think that you get to force people who'd rather hand-edit
to change their habits.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> [ shrug... ]  So don't use the option of hand-editing postgresql.conf.
> You're not being forced to do that, and on the other side of the coin,
> you shouldn't think that you get to force people who'd rather hand-edit
> to change their habits.

Sure. I just lose comments. I'll live with that.
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: SQL command to edit postgresql.conf, with comments

From
Josh Berkus
Date:
> Sure. I just lose comments. I'll live with that.

Actually, as part of this scheme, it would be nice if pg_settings had a 
"comment" column, which would be optionally set with SET PERMANENT.  Not 
required, but nice to have.

If we had that, I suspect that a lot fewer people would want a 
hand-edited file.

--                                   -- Josh Berkus                                     PostgreSQL Experts Inc.
                           http://www.pgexperts.com
 


Re: SQL command to edit postgresql.conf, with comments

From
Robert Haas
Date:
On Thu, Oct 14, 2010 at 12:40 PM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> Sure. I just lose comments. I'll live with that.
>
> Actually, as part of this scheme, it would be nice if pg_settings had a
> "comment" column, which would be optionally set with SET PERMANENT.  Not
> required, but nice to have.
>
> If we had that, I suspect that a lot fewer people would want a hand-edited
> file.

I have to say that I'm woefully unimpressed by the idea of trying to
do anything with comments other than ignore them, even something this
simple.  There's no obvious way to know which comments go with which
settings.   You can make up a rule, such as "on the same line", but
it's not necessarily going to be what people want.  I think it's
better to sidestep the issue entirely.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SQL command to edit postgresql.conf, with comments

From
Robert Haas
Date:
On Thu, Oct 14, 2010 at 9:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The fact is that you cannot know the active value anyway without
> checking, because what you did with SET PERMANENT might be overridden
> in various session-local ways.  The proposal for hand-edited versus
> machine-edited files just adds one more layer of possible overrides
> to the existing half-dozen layers, all of which are widely considered
> features not bugs.  So I see no merit in your argument.

You know, this is a good point.  I was really hoping for a single
file, but maybe two files is not so bad as I think it is.  However, I
kind of dislike SET PERMANENT as a command name, partly because I
think it sounds more certain than it really is, and partly because
it's asymmetric with the other, similar GUC-setting commands, which
are:

ALTER ROLE name [ IN DATABASE database_name ] SET
configuration_parameter { TO | = } { value | DEFAULT }
ALTER DATABASE name SET configuration_parameter { TO | = } { value | DEFAULT }

Perhaps ALTER SYSTEM SET configuration_parameter { TO | = } { value |
DEFAULT } ?

A similar syntax exists in Oracle:

http://stanford.edu/dept/itss/docs/oracle/10g/server.101/b10759/statements_2013.htm#i2282157

From what I gather this works out to:

ALTER SYSTEM SCOPE = { MEMORY | SPFILE | BOTH } SET
configuration_paramater = value

(I don't think we can the SCOPE clause because I believe the only way
we have of propagating a GUC through the system is to have all the
backends reread the file.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: SQL command to edit postgresql.conf, with comments

From
Andrew Dunstan
Date:

On 10/14/2010 01:10 PM, Robert Haas wrote:
> On Thu, Oct 14, 2010 at 12:40 PM, Josh Berkus<josh@agliodbs.com>  wrote:
>>> Sure. I just lose comments. I'll live with that.
>> Actually, as part of this scheme, it would be nice if pg_settings had a
>> "comment" column, which would be optionally set with SET PERMANENT.  Not
>> required, but nice to have.
>>
>> If we had that, I suspect that a lot fewer people would want a hand-edited
>> file.
> I have to say that I'm woefully unimpressed by the idea of trying to
> do anything with comments other than ignore them, even something this
> simple.  There's no obvious way to know which comments go with which
> settings.   You can make up a rule, such as "on the same line", but
> it's not necessarily going to be what people want.  I think it's
> better to sidestep the issue entirely.

If the file is completely machine-maintained, including the comments, it 
seems eminently doable. It's only if the file gets mangled by humans 
that there's a problem.

cheers

andrew


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> If the file is completely machine-maintained, including the comments, it 
> seems eminently doable. It's only if the file gets mangled by humans 
> that there's a problem.

Yeah.  We could have comments that were injected by some sort of COMMENT
ON command, stored in the file in some suitably strict format,
and then maintained mechanically.  What we don't want to do is promise
that hand editing of the file can coexist with machine updates.
        regards, tom lane


Re: SQL command to edit postgresql.conf, with comments

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I kind of dislike SET PERMANENT as a command name, partly because I
> think it sounds more certain than it really is, and partly because
> it's asymmetric with the other, similar GUC-setting commands, which
> are:

> ALTER ROLE name [ IN DATABASE database_name ] SET
> configuration_parameter { TO | = } { value | DEFAULT }
> ALTER DATABASE name SET configuration_parameter { TO | = } { value | DEFAULT }

> Perhaps ALTER SYSTEM SET configuration_parameter { TO | = } { value |
> DEFAULT } ?

That might be a good idea.  One argument for it is that you could
extend this syntax to include an optional comment, which would avoid
having to wedge an inherently non-transactional operation into
COMMENT ON.  Say
ALTER SYSTEM SET foo  = 'bar' COMMENT 'hacked by tgl 10/13/2010';
        regards, tom lane


Re: Issues with two-server Synch Rep

From
Josh Berkus
Date:
>> Absolutely.  For a synch standby, you can't tolerate any standby delay
>> at all.  This means that anywhere from 1/4 to 3/4 of queries on the
>> standby would be cancelled on any high-traffic OLTP server.  Hence,
>> "useless".
>
> Don't agree with your numbers there and you seem to be assuming no
> workarounds would be in use. A different discussion, I think.

The only viable workaround would be to delay vacuum on the master, no?

> Adding the feedback channel looks trivial to me, once we've got the main
> sync rep patch in. I'll handle that.

OK. I thought it would be a major challenge.  Ideally, we'd have some 
way to turn snapshot publication on or off; for a standby which was not 
receiving reads, we wouldn't need it.  Maybe make it contingent on the 
status of hot_standby GUC?  That would make sense.

> For this reason, I've removed the "apply" mode from my patch, for now. I
> want to get the simplest possible patch agreed and then add things
> later.

Um, ok.  "apply" mode is still useful for users who are not running 
queries against the standby.  Which many won't.

--                                   -- Josh Berkus                                     PostgreSQL Experts Inc.
                           http://www.pgexperts.com
 


Re: Issues with two-server Synch Rep

From
Simon Riggs
Date:
On Tue, 2010-10-19 at 09:36 -0700, Josh Berkus wrote:
> >> Absolutely.  For a synch standby, you can't tolerate any standby delay
> >> at all.  This means that anywhere from 1/4 to 3/4 of queries on the
> >> standby would be cancelled on any high-traffic OLTP server.  Hence,
> >> "useless".
> >
> > Don't agree with your numbers there and you seem to be assuming no
> > workarounds would be in use. A different discussion, I think.
> 
> The only viable workaround would be to delay vacuum on the master, no?

Sure. It's a documented workaround.

> > Adding the feedback channel looks trivial to me, once we've got the main
> > sync rep patch in. I'll handle that.
> 
> OK. I thought it would be a major challenge.  Ideally, we'd have some 
> way to turn snapshot publication on or off; for a standby which was not 
> receiving reads, we wouldn't need it.  Maybe make it contingent on the 
> status of hot_standby GUC?  That would make sense.

Yes, I thought to extend hot_standby parameter to have 3 modes:

off (default) | on | feedback

> > For this reason, I've removed the "apply" mode from my patch, for now. I
> > want to get the simplest possible patch agreed and then add things
> > later.
> 
> Um, ok.  "apply" mode is still useful for users who are not running 
> queries against the standby.  Which many won't.

I agree many people won't use the standby for reads.

Why then would they care about the difference between fsync and apply?

Anyway, that suggestion is mainly so we can reach agreement on a minimal
patch, not suggesting it as a limit on released functionality.

-- Simon Riggs           http://www.2ndquadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services