Re: Hot Standby b-tree delete records review - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Hot Standby b-tree delete records review |
Date | |
Msg-id | 1289304688.14131.7012.camel@ebony |
In response to | Re: Hot Standby b-tree delete records review (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
List | pgsql-hackers |
On Tue, 2010-11-09 at 13:34 +0200, Heikki Linnakangas wrote:
> (cleaning up my inbox, and bumped into this..)
>
> On 22.04.2010 12:31, Simon Riggs wrote:
> > On Thu, 2010-04-22 at 12:18 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> On Thu, 2010-04-22 at 11:56 +0300, Heikki Linnakangas wrote:
> >>>
> >>>>>>>> If none of the removed heap tuples were present anymore, we currently
> >>>>>>>> return InvalidTransactionId, which kills/waits out all read-only
> >>>>>>>> queries. But if none of the tuples were present anymore, the read-only
> >>>>>>>> queries wouldn't have seen them anyway, so ISTM that we should treat
> >>>>>>>> InvalidTransactionId return value as "we don't need to kill anyone".
> >>>>>>> That's not the point. The tuples were not themselves the sole focus,
> >>>>>> Yes, they were. We're replaying a b-tree deletion record, which removes
> >>>>>> pointers to some heap tuples, making them unreachable to any read-only
> >>>>>> queries. If any of them still need to be visible to read-only queries,
> >>>>>> we have a conflict. But if all of the heap tuples are gone already,
> >>>>>> removing the index pointers to them can't change the situation for any
> >>>>>> query. If any of them should've been visible to a query, the damage was
> >>>>>> done already by whoever pruned the heap tuples leaving just the
> >>>>>> tombstone LP_DEAD item pointers (in the heap) behind.
> >>>>> You're missing my point. Those tuples are indicators of what may lie
> >>>>> elsewhere in the database, completely unreferenced by this WAL record.
> >>>>> Just because these referenced tuples are gone doesn't imply that all
> >>>>> tuple versions written by the as yet-unknown-xids are also gone. We
> >>>>> can't infer anything about the whole database just from one small group
> >>>>> of records.
> >>>> Have you got an example of that?
> >>>
> >>> I don't need one, I have suggested the safe route. In order to infer
> >>> anything, and thereby further optimise things, we would need proof that
> >>> no cases can exist, which I don't have. Perhaps we can add "yet", not
> >>> sure about that either.
> >>
> >> It's good to be safe rather than sorry, but I'd still like to know
> >> because I'm quite surprised by that, and got me worried that I don't
> >> understand how hot standby works as well as I thought I did. I thought
> >> the point of stopping replay/killing queries at a b-tree deletion record
> >> is precisely that it makes some heap tuples invisible to running
> >> read-only queries. If it doesn't make any tuples invisible, why do any
> >> queries need to be killed? And why was it OK for them to be running just
> >> before replaying the b-tree deletion record?
> >
> > I'm sorry but I'm too busy to talk further on this today. Since we are
> > discussing a further optimisation rather than a bug, I hope it is OK to
> > come back to this again later.
>
> Would now be a good time to revisit this? I still don't see why a b-tree
> deletion record should conflict with anything, if all the removed index
> tuples point to just LP_DEAD tombstones in the heap.

I want what you say to be true. The question is: is it? We just need to
explain why that will never be a problem.

--
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
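
The disagreement in this thread comes down to how replay should treat an InvalidTransactionId result when every heap tuple referenced by the b-tree delete record is already an LP_DEAD tombstone. The following is a minimal Python sketch of the two policies, not PostgreSQL source. `HeapItem`, `Snapshot`, `latest_removed_xid`, `conflicting_queries` and the `invalid_means_no_conflict` flag are hypothetical names invented for illustration. Xids are modelled as plain integers with no wraparound handling, and the `xmin <= latest` conflict rule is a simplifying assumption. The sketch only shows what each policy would do; it does not settle whether the second one is safe, which is the open question above.

```python
from dataclasses import dataclass
from typing import List

INVALID_XID = 0  # stands in for InvalidTransactionId (assumption of this model)


@dataclass
class HeapItem:
    """Hypothetical view of a heap line pointer during replay."""
    dead_stub: bool            # True if only an LP_DEAD tombstone remains
    xmax: int = INVALID_XID    # deleting xid; meaningful only if the tuple is still present


@dataclass
class Snapshot:
    """Hypothetical read-only query running on the standby."""
    name: str
    xmin: int  # oldest xid whose effects the query may still need to see


def latest_removed_xid(items: List[HeapItem]) -> int:
    """Return the newest deleting xid among tuples that are still present.

    If every item is already a tombstone, nothing can be inferred from this
    record and INVALID_XID is returned.
    """
    latest = INVALID_XID
    for item in items:
        if not item.dead_stub and item.xmax > latest:
            latest = item.xmax
    return latest


def conflicting_queries(latest: int, snapshots: List[Snapshot],
                        invalid_means_no_conflict: bool) -> List[str]:
    """List the queries that replay would have to cancel or wait out."""
    if latest == INVALID_XID:
        if invalid_means_no_conflict:
            return []                           # proposed: nobody is affected
        return [s.name for s in snapshots]      # conservative: conflict with everyone
    # Simplified rule: a query conflicts if it might still need a removed tuple.
    return [s.name for s in snapshots if s.xmin <= latest]


if __name__ == "__main__":
    queries = [Snapshot("q1", 100), Snapshot("q2", 200)]
    tombstones = [HeapItem(dead_stub=True), HeapItem(dead_stub=True)]
    xid = latest_removed_xid(tombstones)
    print("conservative:", conflicting_queries(xid, queries, False))
    print("proposed:    ", conflicting_queries(xid, queries, True))
```

With only tombstones in the record, the conservative policy reports every running query as a conflict and the proposed policy reports none. When at least one tuple is still present, both policies behave identically.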