Re: Sync Rep: First Thoughts on Code - Mailing list pgsql-hackers

From	Mark Mielke
Subject	Re: Sync Rep: First Thoughts on Code
Date	December 13, 2008 19:35:50
Msg-id	49444544.4060009@mark.mielke.cc
In response to	Re: Sync Rep: First Thoughts on Code (Markus Wanner <markus@bluegap.ch>)
Responses	Re: Sync Rep: First Thoughts on Code
List	pgsql-hackers
Markus Wanner wrote:<br /><blockquote cite="mid:494436AE.2080207@bluegap.ch" type="cite"><blockquote type="cite"><pre
wrap="">I don't think synchronous replication guarantees that it will be
immediately visible. Even if it did push the change to the other
machine, and the other machine had committed it, that doesn't guarantee
that any reader sees it any more than if I commit to the same machine
(no replication), I am guaranteed to see the change from another
session.   </pre></blockquote><pre wrap="">
AFAIK every snapshot taken after a transaction has acknowledged its
commit is guaranteed to see changes from that transaction. Isn't that a
pretty frequent and obvious user expectation? </pre></blockquote><br /> Yes - but that's only really true while the
session continues. From another session? I've never assumed that I could reconnect and be guaranteed to get the latest
snapshot that includes absolutely everything that has been committed.<br /><br /> Any system that guaranteed this even
when involving multiple machines would be guaranteed to be inefficient and difficult to scale, in my opinion. How could
any system promise to have reasonable commit times while also guaranteeing that once a commit completes, any session to
any other server will be able to see the commit? I think this forces some sort of serialization between multiple
machines and defeats the purpose of having multiple machines. Where before it was indeterminate when the commit
would take effect at each replica, it's now indeterminate when my commit will succeed. That is, my commit cannot succeed
until every single server acknowledges that it has fully received and committed my transaction. What happens if there
are network problems, or what happens if I am replicating over a slower link? What if I am committing to 100 servers? Is
it reasonable to expect 100 server negotiations to complete in full before my own commit will return?<br /><br
/><blockquote cite="mid:494436AE.2080207@bluegap.ch" type="cite"><blockquote type="cite"><pre wrap="">Synchronous
replication only means that I can be assured that
my change has been saved permanently by the time my commit completes. It
doesn't mean anybody else can see my change or is guaranteed to see my
change if they query from another session.   </pre></blockquote><pre wrap="">So you wouldn't be surprised if a
transaction from two hours ago isn't
visible on another node, just because that node happens to be rather
busy with lots of other readers and maintenance tasks? </pre></blockquote><br /> Any system that is two hours behind
should fall out of the pool used to satisfy reads. So, if there was a surprise, it would be this. I don't believe
ACID requires that a commit on one server is immediately visible on another server. Any work I do on the "behind" server
would still be safe from a transaction and referential integrity perspective. However, if I executed 'commit' on this
"behind" server, I would expect the commit to wait until it catches up, or, in the case of a server 2 hours behind, I would
expect the commit to fail. Look at the alternative - all commits to any server in the pool would be locked up waiting
for this one machine to catch up on 2 hours of transactions. This emphasizes that the problem is that a server two hours
out of date is still in the pool, rather than the problem being keeping things up-to-date.<br /><br /><br /><blockquote
cite="mid:494436AE.2080207@bluegap.ch" type="cite"><blockquote type="cite"><pre wrap="">If my application assumes that
it can commit to one server, and then
read back the commit from another server, and my application breaks as a
result, it's because I didn't understand the problem.   </pre></blockquote><pre wrap="">Well, yeah, depends on user
expectations. I'm surprised to hear that you
have that understanding of synchronous replication. </pre></blockquote><br /> I've seen people face it in the past.
Most recently we had a presentation from the developer of digg.com, and he described how he had this problem with MySQL
and that he had to work around it.<br /><br /> On a smaller scale and slightly unrelated, I had this problem frequently
between memcache and PostgreSQL. That is, memcache would always be latest, but PostgreSQL might not be latest, because
the commit had not occurred.<br /><br /> It seems like a standard enough problem to me. I don't expect Postgres-R to do
the impossible. As with my previous paragraph, I don't expect Postgres-R to wait 2 hours to commit just because one
server is falling behind.<br /><br /><blockquote cite="mid:494436AE.2080207@bluegap.ch" type="cite"><blockquote
type="cite"><pre wrap="">Even if PostgreSQL
didn't use the word "synchronous replication", I could still be
confused. I need to understand the problem no matter what words are used.   </pre></blockquote><pre wrap="">
As said, it depends on what the common understanding of "synchronous
replication" is. I've so far been under the impression, that these
potential lags are unexpected and confusing. Several people pointed me
at that problem and I've thus "relabeled" Postgres-R as not being
synchronous. I'm at least surprised to suddenly get pushed into the
other direction. :-)

However, I absolutely agree that it's not that important how we name it.
What is important, is that users and developers understand the difference</pre></blockquote><br /> I agree they are
unexpected and confusing. I don't agree that they are unexpected or confusing to those knowledgeable in the domain. So,
the question becomes - whose expectation is wrong? Should the user learn more? Or should we push for a change in
terminology? Does it make sense for Postgres-R (which looks excellent to me BTW, at least in principle) to be marketed
differently, because a few users tie "synchronous replication" to "serialized access"?<br /><br /> Because that's really
what we're talking about - we're talking about transactions in all sessions being serialized between machines to provide
less surprise to users who don't understand the complexity of having multiple replicas.<br /><br /> Forget replication -
even for the exact same server - I don't expect that if I commit from one session, I will be able to see the change
immediately from my other session or a new session that I just opened. Perhaps it is often safe to rely on this, and
it is useful for the database server to minimize the window before the commit becomes visible to others, but I
think it's a false expectation from the start that it absolutely will be immediately visible to another session. I'm
thinking of situations where some part of the table is in cache. The only way the commit can communicate that the new
transaction is available is by communication between the processes or threads, or between the multiple CPUs on
the machine. Do I want every commit to force each session to become fully in alignment before my commit completes? Does
PostgreSQL make this guarantee today? I bet it doesn't if you look far enough into the guts. It might be very fast - I
don't think it is infinitely fast.<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature" cols="72">-- 
 
Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc">&lt;mark@mielke.cc&gt;</a>
</pre>