Thread: true serializability and predicate locking
I have a question regarding true serializability and predicate locking. There's some context on the wiki page:

http://wiki.postgresql.org/wiki/Serializable

under the heading "Predicate Locking".

If you have the following DDL:

create table mytable(mycircle circle);
create index mytable_mycircle_idx on mytable
  using gist (mycircle);

and two transactions:

T1:
BEGIN;
SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>';
-- if any rows are returned, ROLLBACK
INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>');
COMMIT;

T2:
BEGIN;
SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>';
-- if any rows are returned, ROLLBACK
INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>');
COMMIT;

Clearly one of those transactions should abort, because that will happen in either serialized order. But I don't see where any lock is stored, nor how the conflict is detected.

There has been a lot of theoretical discussion on this matter, but I'd like to know how it will work in this specific case. You can't merely lock a few index pages, because the INSERT might put the tuple in another page.

I'm still trying to catch up on this discussion as well as relevant papers, but this question has been on my mind.

One approach that might work for GiST is to get some kind of lock (SIREAD?) on the predicates for the pages that the search does not match. That way, the conflict can be detected if an INSERT tries to update the predicate of a page to something that the search may have matched.

If the index was GIN instead of GiST, I think the fastupdate feature would cause a problem, though (as Greg brought up). Fastupdate may need to be disabled when using truly serializable transactions.

Regards,
Jeff Davis
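To make the conflict concrete: the distance between the two centers is sqrt(50), about 7.07, which is less than the sum of the radii (15) that the && overlap test allows, so each transaction's INSERT satisfies the other transaction's search predicate. The standalone C sketch below (the Circle type and circles_overlap helper are invented for illustration, not PostgreSQL code) checks exactly that, and shows why plain snapshot isolation lets both transactions commit when both SELECTs run before either INSERT, which is the anomaly a truly serializable mode has to catch.

#include <math.h>
#include <stdio.h>
#include <stdbool.h>

typedef struct { double x, y, r; } Circle;

/* The circle overlap test behind &&: true when the distance between the
 * centers does not exceed the sum of the radii. */
static bool circles_overlap(Circle a, Circle b)
{
    return hypot(a.x - b.x, a.y - b.y) <= a.r + b.r;
}

int main(void)
{
    Circle t1_row = { 0, 0, 10 };  /* what T1 inserts, and what T1 searches for */
    Circle t2_row = { 5, 5, 5 };   /* what T2 inserts, and what T2 searches for */

    /* Each new row matches the other transaction's search predicate, so in
     * any serial order the second transaction's SELECT returns a row and it
     * rolls back.  Under snapshot isolation both SELECTs can run first, see
     * nothing, and both transactions commit. */
    printf("T2's row matches T1's search: %d\n", circles_overlap(t2_row, t1_row));
    printf("T1's row matches T2's search: %d\n", circles_overlap(t1_row, t2_row));
    return 0;
}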
On Tue, Jan 5, 2010 at 2:14 PM, Jeff Davis <pgsql@j-davis.com> wrote: > I have a question regarding true serializability and predicate locking. > There's some context on the wiki page: > > http://wiki.postgresql.org/wiki/Serializable > > under the heading "Predicate Locking". > > If you have the following DDL: > > create table mytable(mycircle circle); > create index mytable_mycircle_idx on mytable > using gist (mycircle); > > and two transactions: > > T1: > BEGIN; > SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>'; > -- if any rows are returned, ROLLBACK > INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>'); > COMMIT; > > T2: > BEGIN; > SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>'; > -- if any rows are returned, ROLLBACK > INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>'); > COMMIT; > > Clearly one of those transactions should abort, because that will happen > in either serialized order. But I don't see where any lock is stored, > nor how the conflict is detected. > > There has been a lot of theoretical discussion on this matter, but I'd > like to know how it will work in this specific case. You can't merely > lock a few index pages, because the INSERT might put the tuple in > another page. > > I'm still trying to catch up on this discussion as well as relevant > papers, but this question has been on my mind. > > One approach that might work for GiST is to get some kind of lock > (SIREAD?) on the predicates for the pages that the search does not > match. That way, the conflict can be detected if an INSERT tries to > update the predicate of a page to something that the search may have > matched. > > If the index was GIN instead of GiST, I think the fastupdate feature > would cause a problem, though (as Greg brought up). Fastupdate may need > to be disabled when using truly serializable transactions. It seems to me that we shouldn't pre-judge too much about how predicate locking will ultimately be implemented. Suppose the first query in your example were: SELECT * FROM mytable WHERE [some incredibly complex condition involving all sorts of strange magic] It seems to me that in a case like this our only realistic option is to lock the entire table until T1 commits. Now, in certain cases, we can optimize this, if we want. For example, if the first query were: SELECT * FROM mytable WHERE id = 42; ...and if further we know that there is a unique B-tree index on the id column, and if there is an existing record with id = 42, then we can lock just that record. If no such record exists, we can either fall back to locking the whole table, or we can take some sort of key-range lock that will block any future attempts to read or update a row with id = 42. With GIST and GIN and so on, the situation is similar. If the index AM can be modified to provide certain kinds of key-range locking then we can take weaker locks for queries that involve those types of conditions. If not, we have to fall back to a full-table lock. ...Robert
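As a rough sketch of the granularity decision Robert outlines for the id = 42 case (every name below is hypothetical; this is not proposed PostgreSQL code), one could imagine something like the following. The point is only that the fallback order degrades gracefully: tuple lock when the row exists, key-range lock when the index AM can provide one, whole-table lock otherwise.

#include <stdio.h>
#include <stdbool.h>

typedef enum {
    PREDLOCK_TUPLE,      /* lock just the matching record                  */
    PREDLOCK_KEY_RANGE,  /* lock the index gap where id = 42 would land    */
    PREDLOCK_RELATION    /* fall back to locking the whole table           */
} PredLockLevel;

static PredLockLevel
choose_predicate_lock(bool simple_equality_on_unique_index,
                      bool matching_row_exists,
                      bool am_supports_key_range_locks)
{
    /* "Strange magic" in the WHERE clause: only the whole table is safe. */
    if (!simple_equality_on_unique_index)
        return PREDLOCK_RELATION;

    /* id = 42 exists: lock just that record. */
    if (matching_row_exists)
        return PREDLOCK_TUPLE;

    /* No such row: block future reads and writes of id = 42 with a
     * key-range lock, or lock the table if the index AM cannot do that. */
    return am_supports_key_range_locks ? PREDLOCK_KEY_RANGE
                                       : PREDLOCK_RELATION;
}

int main(void)
{
    /* id = 42 absent, and the B-tree can lock key ranges. */
    printf("%d\n", choose_predicate_lock(true, false, true));  /* PREDLOCK_KEY_RANGE */
    return 0;
}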
Jeff Davis <pgsql@j-davis.com> wrote: > Clearly one of those transactions should abort, because that will > happen in either serialized order. But I don't see where any lock > is stored, nor how the conflict is detected. That depends on where in the development cycle of this feature you are. I'm anticipating that throughout, the locks to support SSI will be kept in RAM, probably in the existing lock heap table or something based on it. Near the beginning, all locking will be at the table level, as the fastest way to develop something which is "correct" in the sense of not allowing any of the snapshot anomalies. Later in development, we will try to optimize initial locks to smaller granularity and promote to coarser granularity only as needed to keep RAM usage reasonable. Behavior will be no more "correct" with such optimizations, but it should become more acceptable in terms of performance and rollback rates. I will not spend any significant amount of time looking at the specifics of any particular optimizations yet, because such premature optimization is certain to kill the whole project. > There has been a lot of theoretical discussion on this matter, but > I'd like to know how it will work in this specific case. You can't > merely lock a few index pages, because the INSERT might put the > tuple in another page. I don't yet know a lot about GiST indexes beyond the high-level theory (it's an area where I haven't yet delved into the code), but it's pretty easy to get to page level locks if (and only if) an index search is guaranteed to look at some page which a later conflicting INSERT or UPDATE would be required to modify, or one logically adjacent to it. My initial intuition is that a search can't decide that there are no matching rows unless it has looked at some page which would be different if a matching row existed. > One approach that might work for GiST is to get some kind of lock > (SIREAD?) on the predicates for the pages that the search does not > match. That way, the conflict can be detected if an INSERT tries > to update the predicate of a page to something that the search may > have matched. That sounds right to me. > If the index was GIN instead of GiST, I think the fastupdate > feature would cause a problem, though (as Greg brought up). > Fastupdate may need to be disabled when using truly serializable > transactions. Again, if I spent the time to evaluate all such details now, we would never get to the point where such ideas can be examined in context or quickly tested. I'm trying to keep this process as open as possible. If I hid in a corner and worked on this in isolation I could probably (eventually) present it with answers to all such questions "at the ready." I think there are obvious down-sides to such a strategy, so I'm forced into the position of saying, with regards to most potential optimizations, "we'll cross that bridge when we come to it" -- knowing full well that many optimizations will indeed be necessary before the patch is acceptable. I hope that helps. -Kevin
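For concreteness, an in-RAM SIREAD lock entry along the lines Kevin describes might look something like the sketch below; the type names and fields are invented for illustration and are not the existing PostgreSQL lock table. The granularity tag is what lets a first cut lock whole tables while leaving room for page and tuple locks later.

#include <stdint.h>

typedef uint32_t Oid;            /* stand-in for a relation's OID            */
typedef uint32_t BlockNumber;    /* stand-in for a heap or index page number */
typedef uint16_t OffsetNumber;   /* stand-in for a tuple's line pointer      */
typedef uint32_t TransactionId;

typedef enum {
    SIREAD_RELATION,    /* coarsest: an entire table or index */
    SIREAD_PAGE,        /* one heap or index page             */
    SIREAD_TUPLE        /* finest: a single tuple             */
} SIReadGranularity;

typedef struct SIReadLockTag {
    SIReadGranularity level;
    Oid               relation;
    BlockNumber       page;      /* ignored at SIREAD_RELATION level       */
    OffsetNumber      tuple;     /* used only at SIREAD_TUPLE level        */
} SIReadLockTag;

typedef struct SIReadLock {
    SIReadLockTag      tag;      /* what was read                          */
    TransactionId      reader;   /* which serializable transaction read it */
    struct SIReadLock *next;     /* chain within a shared hash bucket      */
} SIReadLock;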
On Tue, 2010-01-05 at 13:47 -0600, Kevin Grittner wrote: > I will not > spend any significant amount of time looking at the specifics of any > particular optimizations yet, because such premature optimization is > certain to kill the whole project. I'm certainly not trying to derail the project, I'm just trying to see some light at the end of the tunnel. Is a full table lock acceptable in the end? If so, then predicate locking is just optimization, and we should leave it until later. If not, then reasonably efficient predicate locking is a part of the design. We can still leave it until later, but we shouldn't call design issues "premature optimization". > I don't yet know a lot about GiST indexes beyond the high-level > theory (it's an area where I haven't yet delved into the code), but > it's pretty easy to get to page level locks if (and only if) an > index search is guaranteed to look at some page which a later > conflicting INSERT or UPDATE would be required to modify, or one > logically adjacent to it. My initial > intuition is that a search can't decide that there are no matching > rows unless it has looked at some page which would be different if a > matching row existed. Well, that's my concern. Technically, I think you're correct. But that might not be very helpful in the case of GIN fastupdate, for instance. Every insert will modify the fastupdate buffer, and every search will read it. > > One approach that might work for GiST is to get some kind of lock > > (SIREAD?) on the predicates for the pages that the search does not > > match. That way, the conflict can be detected if an INSERT tries > > to update the predicate of a page to something that the search may > > have matched. > > That sounds right to me. With GiST, the picture looks a little more promising. I'm still a little concerned that it will cause significant performance pitfalls, however. > I > think there are obvious down-sides to such a strategy, so I'm forced > into the position of saying, with regards to most potential > optimizations, "we'll cross that bridge when we come to it" -- > knowing full well that many optimizations will indeed be necessary > before the patch is acceptable. That's fine with me, and I'll hold off on issues like this one. Regards, Jeff Davis
Jeff Davis <pgsql@j-davis.com> wrote: > Is a full table lock acceptable in the end? If so, then predicate > locking is just optimization, and we should leave it until later. I think that the predicate locking will need to be RAM-based to provide acceptable performance, and that we will need a multi-granularity approach with granularity promotion which will include table locks in some situations. > If not, then reasonably efficient predicate locking is a part of > the design. We can still leave it until later, but we shouldn't > call design issues "premature optimization". Well, technically table locking can be used to provide predicate locking; it is just way too coarse-grained to be acceptable for production as the *only* technique. The optimization phase will involve many types of optimization, but a lot of it will be balancing the overhead of finer-grained locks against the higher rate of false positives with coarser-grained locks. I really think that the correct way to view this is to view backing off to finer-grained resolutions as optimization, albeit absolutely necessary optimization. I totally understand the impulse to work on these details up front -- I'm fighting against such an impulse myself. I've just convinced myself, rationally, that such work is better deferred. On the other hand, perhaps if I'm working the development path in the wiki page, it *could* make sense, if you're interested, to look at issues like this now and get them all documented and ready to go once I get far enough along -- we could "meet in the middle." Another interesting thing which crossed my mind (and I should probably add a section for such things in the wiki) is that, given the overhead and conflict implications of using table scans in serializable transactions, we should perhaps try to discourage table scans from being chosen for those using serializable transactions. I figure we can probably fudge this to a workable degree by adjusting tuple cost GUCs, but if you wanted to think about this issue in more depth, it might be useful. -Kevin
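A minimal sketch of the granularity promotion Kevin mentions, with invented thresholds and names, might look like the following; promotion is always safe for correctness and only trades RAM for a higher false-positive rollback rate.

#include <stdio.h>
#include <stddef.h>

#define MAX_TUPLE_LOCKS_PER_PAGE     8   /* hypothetical thresholds */
#define MAX_PAGE_LOCKS_PER_RELATION 64

typedef enum { KEEP_FINE, PROMOTE_TO_PAGE, PROMOTE_TO_RELATION } PromotionAction;

static PromotionAction
maybe_promote(size_t tuple_locks_on_page, size_t page_locks_on_relation)
{
    if (page_locks_on_relation >= MAX_PAGE_LOCKS_PER_RELATION)
        return PROMOTE_TO_RELATION;     /* collapse to one table-level lock   */
    if (tuple_locks_on_page >= MAX_TUPLE_LOCKS_PER_PAGE)
        return PROMOTE_TO_PAGE;         /* collapse this page's tuple locks   */
    return KEEP_FINE;
}

int main(void)
{
    printf("%d\n", maybe_promote(3, 10));   /* KEEP_FINE            */
    printf("%d\n", maybe_promote(9, 10));   /* PROMOTE_TO_PAGE      */
    printf("%d\n", maybe_promote(9, 70));   /* PROMOTE_TO_RELATION  */
    return 0;
}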
Kevin Grittner wrote: > Another interesting thing which crossed my mind (and I should > probably add a section for such things in the wiki) is that, given > the overhead and conflict implications of using table scans in > serializable transactions, we should perhaps try to discourage table > scans from being chosen for those using serializable transactions. > I figure we can probably fudge this to a workable degree by > adjusting tuple cost GUCs, but if you wanted to think about this > issue in more depth, it might be useful. I don't know if that's a good idea. It's an attempt to guess what the user really wanted, and one reason why I dislike Microsoft is that their software does exactly that. Messing with the optimizer might result in bad plans, much to the surprise of the user who "changed nothing". What have you won if you take out fewer locks, but your query runs forever? I'd opt for good documentation that tells you about the pitfalls of this serializable implementation and counsels you. That also helps to keep it simple. Yours, Laurenz Albe
Robert Haas wrote: > Jeff Davis wrote: > > I have a question regarding true serializability and predicate locking. > > There's some context on the wiki page: > > > > If you have the following DDL: > > > > create table mytable(mycircle circle); > > create index mytable_mycircle_idx on mytable > > using gist (mycircle); > > > > and two transactions: > > > > T1: > > BEGIN; > > SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>'; > > -- if any rows are returned, ROLLBACK > > INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>'); > > COMMIT; > > > > T2: > > BEGIN; > > SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>'; > > -- if any rows are returned, ROLLBACK > > INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>'); > > COMMIT; > > > > Clearly one of those transactions should abort, because that will happen > > in either serialized order. But I don't see where any lock is stored, > > nor how the conflict is detected. > > > > There has been a lot of theoretical discussion on this matter, but I'd > > like to know how it will work in this specific case. You can't merely > > lock a few index pages, because the INSERT might put the tuple in > > another page. > > > > I'm still trying to catch up on this discussion as well as relevant > > papers, but this question has been on my mind. > > > > One approach that might work for GiST is to get some kind of lock > > (SIREAD?) on the predicates for the pages that the search does not > > match. That way, the conflict can be detected if an INSERT tries to > > update the predicate of a page to something that the search may have > > matched. > > > > If the index was GIN instead of GiST, I think the fastupdate feature > > would cause a problem, though (as Greg brought up). Fastupdate may need > > to be disabled when using truly serializable transactions. > > It seems to me that we shouldn't pre-judge too much about how > predicate locking will ultimately be implemented. Suppose the first > query in your example were: > > SELECT * FROM mytable WHERE [some incredibly complex condition > involving all sorts of strange magic] > > It seems to me that in a case like this our only realistic option is > to lock the entire table until T1 commits. I don't know if such a thing would be easy to implement in PostgreSQL, but I had thought that the "standard" approach to implement predicate locking is like this: Whenever you touch (read) a data structure, you tag it with a lock that prevents anybody else from modifying the data structure in a way that would change your result if you perform the same operation again (or with SIREAD locks, it will not prevent the modification, but may lead to aborted transactions later). So no matter how complex the index condition is, it will boil down to accessing and hence locking certain parts of a table or index (in the case of a B*-tree, you'll probably end up locking gaps in the index). I'd say that the index should know what exactly should be locked if a certain operation must become repeatable. Yours, Laurenz Albe
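A toy sketch of the B-tree "lock the gap" idea Laurenz describes, with invented lock calls; real next-key locking would live inside the index AM and record entries in the shared lock table.

#include <stdio.h>
#include <stddef.h>

typedef struct LeafPage {
    const int *keys;    /* keys on the leaf page, in ascending order */
    size_t     nkeys;
} LeafPage;

/* Stand-ins for taking SIREAD locks. */
static void siread_lock_key(int key)            { printf("SIREAD key %d\n", key); }
static void siread_lock_page(const LeafPage *p) { (void) p; printf("SIREAD page\n"); }

/*
 * After a search for `probe` has descended to its leaf page: lock the match
 * if it exists, otherwise lock the key that bounds the gap where `probe`
 * would go (or the page, if the gap runs off the end), so that a later
 * INSERT of `probe` is seen as a conflict.
 */
static void
lock_search_result(const LeafPage *leaf, int probe)
{
    for (size_t i = 0; i < leaf->nkeys; i++)
    {
        if (leaf->keys[i] == probe) { siread_lock_key(probe); return; }
        if (leaf->keys[i] > probe)  { siread_lock_key(leaf->keys[i]); return; }
    }
    siread_lock_page(leaf);
}

int main(void)
{
    int keys[] = { 10, 20, 40 };
    LeafPage leaf = { keys, 3 };
    lock_search_result(&leaf, 20);  /* exact match: lock key 20          */
    lock_search_result(&leaf, 30);  /* gap: lock the next key, 40        */
    lock_search_result(&leaf, 50);  /* gap past the end: lock the page   */
    return 0;
}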
2010/1/7 Albe Laurenz <laurenz.albe@wien.gv.at>: > I don't know if such a thing would be easy to implement in > PostgreSQL, but I had thought that the "standard" approach to > implement predicate locking is like this: > > Whenever you touch (read) a data structure, you tag it with a lock > that prevents anybody else from modifying the data structure in a way > that would change your result if you perform the same operation again > (or with SIREAD locks, it will not prevent the modification, but > may lead to aborted transactions later). > > So no matter how complex the index condition is, it will boil down > to accessing and hence locking certain parts of a table or index > (in the case of a B*-tree, you'll probably end up locking gaps in > the index). That would be an "access layer based" version of predicate locking (of which a typical implementation is the already-mentioned "next-key locking"). The most "pure" version of predicate locking literally "locks predicates" (i.e., conditions on rows). Conflicts are detected by checking for "overlap" between predicates: is there a (possibly non-existent) row that matches both predicates? If so, and the locks are incompatible, then a conflict arises (this would be different in the SIREAD case, of course; I am talking about traditional 2PL + predicate locking). In such a pure implementation of predicate locking, the overlap testing is done using the algebraic properties of the conditions, which is of course extremely difficult (if not impossible) to implement perfectly in a system that allows conditions of arbitrary complexity. Therefore, the conditions are typically simplified in such a way that they become true for more rows, but are easier to check for overlap. Nicolas
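As a toy illustration of that overlap test, restricted to the simplest imaginable predicate language (a closed interval on a single integer column, with a whole-table escape hatch for any condition too complex to analyze); all names below are invented.

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

typedef struct Predicate {
    int  lo, hi;         /* matches rows with lo <= col <= hi      */
    bool whole_table;    /* widened: may match any row             */
} Predicate;

/* Two predicate locks conflict if some (possibly nonexistent) row could
 * satisfy both conditions; widening keeps the test conservative. */
static bool
predicates_overlap(const Predicate *a, const Predicate *b)
{
    if (a->whole_table || b->whole_table)
        return true;
    return a->lo <= b->hi && b->lo <= a->hi;
}

int main(void)
{
    Predicate p1  = { 0, 10, false };             /* col BETWEEN 0 AND 10    */
    Predicate p2  = { 5, 15, false };             /* col BETWEEN 5 AND 15    */
    Predicate any = { INT_MIN, INT_MAX, true };   /* unanalyzable condition  */

    printf("%d\n", predicates_overlap(&p1, &p2));    /* 1: intervals intersect */
    printf("%d\n", predicates_overlap(&p1, &any));   /* 1: conservative        */
    return 0;
}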
Nicolas Barbier wrote: >> I don't know if such a thing would be easy to implement in >> PostgreSQL, but I had thought that the "standard" approach to >> implement predicate locking is like this: >> >> Whenever you touch (read) a data structure, you tag it with a lock >> that prevents anybody else from modifying the data structure in a way >> that would change your result if you perform the same operation again >> (or with SIREAD locks, it will not prevent the modification, but >> may lead to aborted transactions later). >> >> So no matter how complex the index condition is, it will boil down >> to accessing and hence locking certain parts of a table or index >> (in the case of a B*-tree, you'll probably end up locking gaps in >> the index). > > That would be an "access layer based" version of predicate locking (of > which a typical implementation is the already-mentioned "next-key > locking"). > > The most "pure" version of predicate locking literally "locks > predicates" (i.e., conditions on rows). Conflicts are detected by > checking for "overlap" between predicates: is there a (possibly > non-existent) row that matches both predicates? If so, and the > locks are incompatible, then a conflict arises (this would be > different in the SIREAD case, of course; I am talking about > traditional 2PL + predicate locking). > > In such a pure implementation of predicate locking, the overlap > testing is done using the algebraic properties of the conditions, > which is of course extremely difficult (if not impossible) to > implement perfectly in a system that allows conditions of arbitrary > complexity. Therefore, the conditions are typically simplified in such > a way that they become true for more rows, but are easier to check for > overlap. That sounds like major AI to me. What do you do if the condition involves user defined functions? Are there any implementations taking such an approach? Yours, Laurenz Albe
2010/1/7 Albe Laurenz <laurenz.albe@wien.gv.at>: > Nicolas Barbier wrote: > >> In such a pure implementation of predicate locking, the overlap >> testing is done using the algebraic properties of the conditions, >> which is of course extremely difficult (if not impossible) to >> implement perfectly in a system that allows conditions of arbitrary >> complexity. Therefore, the conditions are typically simplified in such >> a way that they become true for more rows, but are easier to check for >> overlap. > > That sounds like major AI to me. It is, but only if you want to approach perfection, which is probably not the way to go. > What do you do if the condition involves user defined functions? Then you have to become conservative, I guess: if you don't know otherwise, you assume it *might* do the bad thing: the predicate might match any inputs (i.e., you could convert such a condition to a whole-table lock if there are no other restrictions ANDed to it). > Are there any implementations taking such an approach? I personally don't know of any production-ready DBMSs that use any predicate locking approach other than the "access layer based" one (i.e., locking index ranges and falling back to whole-table locks for table scans); but then, I am not an expert in this matter. I think that it is a good thing in itself to remove plan dependencies (which would be accomplished by locking on the algebraic level), but it seems that none of the other DBMSs were able to implement it like that. Nicolas
On Thu, 2010-01-07 at 10:57 +0100, Albe Laurenz wrote: > That sounds like major AI to me. > What do you do if the condition involves user defined functions? I don't see how it has anything to do with "user-defined" or not. If the predicate is pure (immutable), it's no problem (regardless of whether the function is user-defined or not) -- we test the predicate on a new tuple the same way as we would test an existing tuple against a WHERE clause in a scan. If the predicate is stable, it can be considered to be immutable if we keep the snapshot around and evaluate the predicate using the same snapshot. Volatile functions like random() don't make much sense in combination with true serializability anyway. Volatile functions that just have a side effect, but always return the same result, may still be useful, though. If so, we could have a new classification for those kinds of functions. So I don't see this as a major problem. My biggest worry is that a predicate locking scheme like this will be fairly difficult to implement and expensive to execute. Regards, Jeff Davis
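A small sketch of the classification Jeff outlines, with invented names; it is only meant to show how volatility would steer the decision between re-testing a retained predicate against newly written tuples and falling back to a relation-level lock.

#include <stdio.h>
#include <stdbool.h>

typedef enum { FUNC_IMMUTABLE, FUNC_STABLE, FUNC_VOLATILE } Volatility;

typedef struct PredicatePlan {
    bool retest_against_new_tuples;  /* evaluate the WHERE clause on new rows */
    bool keep_snapshot;              /* needed so STABLE stays repeatable     */
    bool fallback_to_relation_lock;  /* cannot re-test: lock the whole table  */
} PredicatePlan;

static PredicatePlan
classify_predicate(Volatility v)
{
    PredicatePlan p = { false, false, false };

    if (v == FUNC_IMMUTABLE)
        p.retest_against_new_tuples = true;
    else if (v == FUNC_STABLE)
    {
        /* Safe only if evaluated under the reader's original snapshot. */
        p.retest_against_new_tuples = true;
        p.keep_snapshot = true;
    }
    else
        /* random() and friends: no meaningful re-test is possible. */
        p.fallback_to_relation_lock = true;

    return p;
}

int main(void)
{
    PredicatePlan p = classify_predicate(FUNC_STABLE);
    printf("retest=%d snapshot=%d fallback=%d\n",
           p.retest_against_new_tuples, p.keep_snapshot,
           p.fallback_to_relation_lock);
    return 0;
}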
Jeff Davis <pgsql@j-davis.com> wrote: > [discussion of a "pure" predicate locking scheme where each > database modification is checked against a list of predicate > expressions] > My biggest worry is that a predicate locking scheme like this will > be fairly difficult to implement and expensive to execute. I couldn't agree more. I am not aware of any production-quality database with predicate locking which has used such a "pure" (purist?) implementation, and I have no ambition to attempt to be first on this front. For a page-and-a-half summary (with references to more exhaustive works) on how I think predicate locking should be done, see [1] -- in particular section: 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties Some here seem to be missing, to various degrees, the point I am trying to make: that the table, page, tuple, and index range locks I'm proposing *are* forms of predicate locking. If, after we have locking working that way, people feel that there would be a net gain to implement the "pure" form, it's open source -- if they can show reasonable benchmarks that that works better, cool. Personally, I think it's a nice abstract concept with zero chance of working well on an industrial scale without some compromises with reality, such as described in the referenced paper. If anyone else wants to say it in different words, to help get the idea across (since I seem to be doing it so poorly), feel free to jump in. -Kevin [1] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007. Architecture of a Database System. Foundations and Trends in Databases, Vol. 1, No. 2 (2007), 141-259. http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
"Albe Laurenz" <laurenz.albe@wien.gv.at> wrote: > Kevin Grittner wrote: >> Another interesting thing which crossed my mind (and I should >> probably add a section for such things in the wiki) is that, >> given the overhead and conflict implications of using table scans >> in serializable transactions, we should perhaps try to discourage >> table scans from being chosen for those using serializable >> transactions. I figure we can probably fudge this to a workable >> degree by adjusting tuple cost GUCs, but if you wanted to think >> about this issue in more depth, it might be useful. > > I don't know if that's a good idea. > It's an attempt to guess what the user really wanted, No, it's an attempt to reflect the difference in costs for true serializable transactions, so that the optimizer can choose a plan appropriate for that mode, versus some other. In serializable transaction isolation there is a higher cost per tuple read, both directly in locking and indirectly in increased rollbacks; so why lie to the optimizer about it and say it's the same? > Messing with the optimizer might result in bad plans, much to > the surprise of the user who "changed nothing". The transaction isolation level *is* something, and it's something which people should expect to affect performance. > What have you won if you take out fewer locks, but your query runs > forever? Well, if the load runs worse with an optimization, it's not one we should use. As I've mentioned before, I want to get to where it's working correctly (if slowly), and *then* start working on optimizations (such as this one), testing each against various workloads to determine effect. Please note that I threw this out "for the record" as a possible optimization to consider down the road when we hit the optimization phase. I hope we don't have to debate every such notation in a vacuum before we get to that phase. To forestall that in the future, perhaps I should keep these just to the wiki and off the list. Or would people rather see the bread crumbs drop? > I'd opt for good documentation that tells you about the pitfalls > of this serializable implementation and counsels you. It's a given that we need that. > That also helps to keep it simple. Ignoring optimizations might keep it simple, but might not allow it to become usable. -Kevin
On Thu, Jan 7, 2010 at 3:43 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > "Albe Laurenz" <laurenz.albe@wien.gv.at> wrote: >> Kevin Grittner wrote: >>> Another interesting thing which crossed my mind (and I should >>> probably add a section for such things in the wiki) is that, >>> given the overhead and conflict implications of using table scans >>> in serializable transactions, we should perhaps try to discourage >>> table scans from being chosen for those using serializable >>> transactions. I figure we can probably fudge this to a workable >>> degree by adjusting tuple cost GUCs, but if you wanted to think >>> about this issue in more depth, it might be useful. >> >> I don't know if that's a good idea. >> It's an attempt to guess what the user really wanted, > > No, it's an attempt to reflect the difference in costs for true > serializable transactions, so that the optimizer can choose a plan > appropriate for that mode, versus some other. In serializable > transaction isolation there is a higher cost per tuple read, both > directly in locking and indirectly in increased rollbacks; so why > lie to the optimizer about it and say it's the same? I don't think this necessarily is a bad idea, but thinking that fudging the tuple cost GUCs is going to work seems unrealistically optimistic. If you need the optimizer to know about this, you need the optimizer to REALLY know about this... ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > thinking that fudging the tuple cost GUCs is going to work seems > unrealistically optimistic. If you need the optimizer to know > about this, you need the optimizer to REALLY know about this... I rule nothing out. If you have a more refined idea, I welcome you to include in the wiki's "R&D Issues" section. Or describe it here and I'll add it. Frankly, it seemed no more of a hack than some aspects of our costing calculations, but it obviously pays to model it as well as we can. But I will take something which shows any improvement without getting too nasty, until we have something better. I see the optimization phase as lasting a while and trying out many ideas, some of which won't turn out to have been good ones. We don't use those. -Kevin
On Thu, Jan 7, 2010 at 8:43 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > No, it's an attempt to reflect the difference in costs for true > serializable transactions, so that the optimizer can choose a plan > appropriate for that mode, versus some other. In serializable > transaction isolation there is a higher cost per tuple read, both > directly in locking and indirectly in increased rollbacks; so why > lie to the optimizer about it and say it's the same? This depends on how you represent the predicates. If you represent the predicate by indicating that you might have read any record in the table -- i.e., a full table lock -- then you would have very low overhead per tuple read, effectively 0. The chances of a serialization failure would go up, but I don't see how to represent that as a planner cost. But this isn't directly related to the plan in any case. You could do a full table scan but record in the predicate lock that you were only interested in records with certain constraints. Or you could do an index scan but decide to represent the predicate lock as a full table lock anyways. -- greg
Greg Stark <gsstark@mit.edu> wrote: > On Thu, Jan 7, 2010 at 8:43 PM, Kevin Grittner > <Kevin.Grittner@wicourts.gov> wrote: >> No, it's an attempt to reflect the difference in costs for true >> serializable transactions, so that the optimizer can choose a >> plan appropriate for that mode, versus some other. In >> serializable transaction isolation there is a higher cost per >> tuple read, both directly in locking and indirectly in increased >> rollbacks; so why lie to the optimizer about it and say it's the >> same? > > This depends how you represent the predicates. If you represent > the predicate by indicating that you might have read any record in > the table -- i.e. a full table lock then you would have very low > overhead per-tuple read, effectively 0. The chances of a > serialization failure would go up but I don't see how to represent > that as a planner cost. > > But this isn't directly related to the plan in any case. You could > do a full table scan but record in the predicate lock that you > were only interested in records with certain constraints. Or you > could do an index scan but decide to represent the predicate lock > as a full table lock anyways. All valid points. I could try to make counter-arguments, but in my view the only thing that really matters is how any such attempt performs in a realistic workload. If, when we get to the optimization phase, such a technique shows a performance improvement in benchmarks which we believe realistically model workloads we believe to be reasonable candidates to use serializable transactions, then I'll argue that the burden of proof is on anyone who thinks it's a bad idea in spite of that. If it doesn't show an improvement, I'll be the first to either try to refine it or toss it. Fair enough? -Kevin
On Thu, Jan 7, 2010 at 9:28 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: > All valid points. I could try to make counter-arguments, but in my > view the only thing that really matters is how any such attempt > performs in a realistic workload. If, when we get to the > optimization phase, such a technique shows a performance improvement > in benchmarks which we believe realistically model workloads we > believe to be reasonable candidates to use serializable > transactions, then I'll argue that the burden of proof is on anyone > who thinks it's a bad idea in spite of that. If it doesn't show an > improvement, I'll be the first to either try to refine it or toss > it. Fair enough? My comment was in relation to the idea of representing the costs in the planner. I was a) saying you had to see how the implementation went before you try to come up with how to represent the costs and then b) speculating (hypocritically:) that you might have the direction of adjustment backward. From what I understand your first cut will just take full-table "locks" anyways so it won't matter what type of plan is used at all. Personally I can't see how that won't generate a serialization failure on basically every query on any moderately concurrent system but at least it would make an interesting test-bed for the SIREAD dependency detection logic. And I agree it's necessary code before we get into more fine-grained siread locks. -- greg
Greg Stark <gsstark@mit.edu> wrote: > My comment was in relation to the idea of representing the costs > in the planner. I was a) saying you had to see how the > implementation went before you try to come up with how to > represent the costs and then b) speculating (hypocritically:) that > you might have the direction of adjustment backward. I think we may be agreeing violently. I had the thought that costing may need to be adjusted, suggested the easiest hack that seemed like it might show an improvement, and said it needed more thought before we got to trying it out in the optimization phase. I don't think we actually disagree about much there. > From what I understand your first cut will just take full-table > "locks" anyways so it won't matter what type of plan is used at > all. Right. And it would be totally premature to try to test any optimizations at that phase, which is reflected in the development plan on the wiki. > Personally I can't see how that won't generate a serialization > failure on basically every query on any moderately concurrent > system but at least it would make an interesting test-bed for the > SIREAD dependency detection logic. Exactly. > And I agree it's necessary code before we get into > more fine-grained siread locks. Cool. Overall, it sounds like we've gotten to the same page. :-) -Kevin
On Fri, Jan 8, 2010 at 5:13 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote: >> From what I understand your first cut will just take full-table >> "locks" anyways so it won't matter what type of plan is used at >> all. > > Right. And it would be totally premature to try to test any > optimizations at that phase, which is reflected in the development > plan on the wiki. > >> Personally I can't see how that won't generate a serialization >> failure on basically every query on any moderately concurrent >> system but at least it would make an interesting test-bed for the >> SIREAD dependency detection logic. > > Exactly. > >> And I agree it's necessary code before we get into >> more fine-grained siread locks. > > Cool. Overall, it sounds like we've gotten to the same page. :-) Well, we disagree about whether we have any reasonable plan for adding the more fine-grained locks. AFAICT our options are either a) to add something clean and abstract which doesn't scale at all, or b) to turn Postgres into a clone of Sybase by adding something grotty with hooks all over the tree which exposes internal details as user-visible behaviour. I'm pretty unhappy with both options. The SIREAD stuff sounds cool but it's all based on these read locks, for which we have no plausible implementation that doesn't throw away the wonderful things about Postgres, like the fact that it does everything at the tuple level and doesn't have arbitrary limits on the size of transactions. -- greg
Greg Stark <gsstark@mit.edu> wrote: > Well we disagree with whether we have any reasonable plan for > adding the more fine-grained locks. We probably agree on that, too. Perhaps it's that I think we can develop one within a few months and you don't? > AFAICT we have either a) add something clean and abstract which > doesn't scale at all or b) turn Postgres into a clone of Sybase by > adding something grotty with hooks all over the tree which exposes > internal details as user-visible behaviour. Well, I sure hope we can avoid falling into either of those pits. It sounds like Robert has some ideas on a clean approach. I haven't looked at that aspect deeply enough to make detailed suggestions. > The SIREAD stuff sounds cool but it's all based on these read > locks that we have no plausible implementation which doesn't throw > away all the wonderful things about Postges like that it does > everything at the tuple level and doesn't have arbitrary limits on > the size of transactions. I don't plan to throw any of that away; all existing techniques will continue to be used for all transactions, and unless a transaction is serializable it will not use any new techniques. For a serializable transaction the granularity of information used to detect dangerous structures will need to be kept in RAM and will therefore need to support coarser granularity at times to avoid running out of space. Coarser granularities will cause a higher rollback rate for serializable transactions. The optimization phase is mostly about minimizing "false positive" rollbacks. We probably have different thresholds for how many serialization errors we'd be willing to tolerate to get the benefits of true serializability, but that doesn't seem like a very fundamental difference to me. -Kevin
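For reference, the dangerous-structure test that the SIREAD bookkeeping ultimately feeds can be stated very compactly; the sketch below uses invented field names and is only meant to show what the rw-conflict tracking has to support. Coarser SIREAD granularity can only add spurious rw edges, so it raises the false-positive rollback rate without ever letting an anomaly through.

#include <stdio.h>
#include <stdbool.h>

typedef struct SerializableXact {
    bool has_rw_conflict_in;   /* a concurrent xact read data this one wrote  */
    bool has_rw_conflict_out;  /* this xact read data a concurrent one wrote  */
} SerializableXact;

/* True when this transaction sits at the pivot of a potential cycle among
 * concurrent serializable transactions; one participant must be rolled back. */
static bool
is_dangerous_structure(const SerializableXact *x)
{
    return x->has_rw_conflict_in && x->has_rw_conflict_out;
}

int main(void)
{
    SerializableXact pivot       = { true, true };
    SerializableXact reader_only = { false, true };
    printf("%d %d\n", is_dangerous_structure(&pivot),
           is_dangerous_structure(&reader_only));   /* prints "1 0" */
    return 0;
}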