Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date:
Msg-id: 20180328181211.yilvi44y5rv6i3ud@alap3.anarazel.de
In response to: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hi,

On 2018-03-28 13:52:24 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Given, as explained nearby, we already do store transient data in the
> > ctid for speculative insertions (i.e. ON CONFLICT), and it hasn't caused
> > even a whiff of trouble, I'm currently not inclined to see a huge issue
> > here. It'd be great if you could expand on your concerns here a bit; we
> > gotta figure out a way forward.
>
> Just what I said. There's a lot of code that knows how to follow tuple
> update chains, probably not all of it in core, and this will break it.
> But only in seldom-exercised corner cases, which is the worst of all
> possible worlds from a reliability standpoint.

How will it break it? They'll see an invalid ctid and conclude that the
tuple is dead? Without any changes, that's already something that can
happen if a later tuple in the chain has been pruned away. Sure, that
code won't realize it should error out because the tuple is now in a
different partition, but neither would an infomask bit.

I think my big problem is that I just don't see what the worst that can
happen is. We'd potentially see a broken ctid chain, something that very
commonly happens, and consider the tuple to be invisible. That seems
pretty sane behaviour for unadapted code, and not any worse than other
potential solutions.

> (I don't think ON CONFLICT is a counterexample because, IIUC, it's not
> a persistent state.)

Hm, it can be persistent if we error out at the right moment. But it's
not super common to encounter that over a long time, I grant you that.
Not that this'd be super persistent either; vacuum/pruning would
normally remove the tuple as well, it's dead after all.

> >> I would've been happier about expending an infomask bit towards this
> >> purpose. Just eyeing what we've got, I can't help noticing that
> >> HEAP_MOVED_OFF/HEAP_MOVED_IN couldn't possibly be set in any tuple
> >> in a partitioned table. Perhaps making these tests depend on
> >> partitioned-ness would be unworkably messy, but it's worth thinking
> >> about.
>
> > They previously couldn't be set together IIRC, so we could just test
> > (mask & (HEAP_MOVED_OFF | HEAP_MOVED_IN)) == (HEAP_MOVED_OFF | HEAP_MOVED_IN),
> > but that'd be permanently eating two infomask bits.
>
> Hmm. That objection only matters if we have realistic intentions of
> reclaiming those bits in future, which I've not heard anyone making
> serious effort towards.

I plan to submit a patch early in v12 that keeps track of the last time
a table has been fully scanned (and when it was created), with part of
the goal being debuggability and part being able to reclaim things like
these bits.

Greetings,

Andres Freund