Thread: management of large patches
We're coming to the end of the 9.1 development cycle, and I think that
there is a serious danger of insufficient bandwidth to handle the large
patches we have outstanding. For my part, I am hoping to find the
bandwidth for two, MAYBE three major commits between now and the end of
9.1CF4, but I am not positive that I will be able to find even that
much time, and the number of major patches vying for attention is
considerably greater than that. Quick estimate:

- SQL/MED - probably needs >~3 large commits: foreign table scan, file
  FDW, postgresql FDW, plus whatever else gets submitted in the next
  two weeks (a usage sketch follows this message)
- MERGE
- checkpoint improvements
- SE-Linux integration
- extensions - may need 2 or more commits
- true serializability - not entirely sure of the status of this
- writeable CTEs (Tom has indicated he will look at this)
- PL/python patches (Peter has indicated he will look at this)
- snapshot taking inconsistencies (Tom has indicated he will look at this)
- per-column collation (Peter)
- synchronous replication (Simon, and, given the level of interest in
  and complexity of this feature, probably others as well)

I guess my basic question is - is it realistic to think that we're
going to get all of the above done in the next 45 days? Is there
anything we can do to make the process more efficient? If a few more
large patches drop into the queue in the next two weeks, will we have
bandwidth for those as well? If we don't think we can get everything
done in the time available, what's the best way to handle that? I
would hate to discourage people from continuing to hack away, but I
think it would be even worse to give people the impression that
there's a chance of getting work reviewed and committed if there
really isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
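To put the SQL/MED item in concrete terms, here is a minimal sketch of
the kind of usage the foreign table scan and file FDW commits are meant
to enable. All object names, columns, and options below are
illustrative, and the final syntax is whatever actually gets committed:

    -- Assumes the file_fdw wrapper from the pending SQL/MED patches is
    -- installed; server, table, and file path are hypothetical.
    CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

    CREATE FOREIGN TABLE measurements (
        ts      timestamptz,
        reading numeric
    ) SERVER csv_files
      OPTIONS (filename '/tmp/measurements.csv', format 'csv');

    -- The foreign table scan commit is what lets the executor read this:
    SELECT * FROM measurements WHERE reading > 100;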
On Sun, Jan 2, 2011 at 06:32, Robert Haas <robertmhaas@gmail.com> wrote:
> We're coming to the end of the 9.1 development cycle, and I think that
> there is a serious danger of insufficient bandwidth to handle the
> large patches we have outstanding. For my part, I am hoping to find
> the bandwidth for two, MAYBE three major commits between now and the
> end of 9.1CF4, but I am not positive that I will be able to find even
> that much time, and the number of major patches vying for attention
> is considerably greater than that. Quick estimate:
>
> - SQL/MED - probably needs >~3 large commits: foreign table scan,
>   file FDW, postgresql FDW, plus whatever else gets submitted in the
>   next two weeks
> - MERGE
> - checkpoint improvements
> - SE-Linux integration
> - extensions - may need 2 or more commits
> - true serializability - not entirely sure of the status of this
> - writeable CTEs (Tom has indicated he will look at this)
> - PL/python patches (Peter has indicated he will look at this)
> - snapshot taking inconsistencies (Tom has indicated he will look at this)
> - per-column collation (Peter)
> - synchronous replication (Simon, and, given the level of interest in
>   and complexity of this feature, probably others as well)
>
> I guess my basic question is - is it realistic to think that we're
> going to get all of the above done in the next 45 days? Is there
> anything we can do to make the process more efficient? If a few more
> large patches drop into the queue in the next two weeks, will we have
> bandwidth for those as well? If we don't think we can get everything
> done in the time available, what's the best way to handle that?

Well, we've always (well, since we had CFs) said that large patches
shouldn't be submitted for the last CF; they should be submitted for
one of the first. So if something *new* gets dumped on us for the last
one, giving priority to the existing ones in the queue seems like the
only fair option.

As for priority between those that *were* submitted earlier, and have
been reworked (which is how the system is supposed to work), it's a
lot harder. And TBH, I think we're going to have a problem getting all
those done. But the question is - are they all ready enough, or are a
couple going to need the "returned with feedback" status *regardless*
of whether this is the last CF or not?

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Sun, Jan 2, 2011 at 4:29 AM, Magnus Hagander <magnus@hagander.net> wrote:
> As for priority between those that *were* submitted earlier, and have
> been reworked (which is how the system is supposed to work), it's a
> lot harder. And TBH, I think we're going to have a problem getting
> all those done. But the question is - are they all ready enough, or
> are a couple going to need the "returned with feedback" status
> *regardless* of whether this is the last CF or not?

Well, that all depends on how much work people are willing to put into
reviewing and committing them, which I think is what we need to
determine. None of those patches are going to be as simple as "patch
-p1 < $F && git commit -a && git push". Having done a couple of these
now, I'd say that doing final review and commit of a patch of this
scope takes me ~20 hours of work, but it obviously varies a lot based
on how good the patch is to begin with and how much review has already
been done.

So I guess the question is - who is willing to step up to the plate,
either as reviewer or as final reviewer/committer?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
(2011/01/02 14:32), Robert Haas wrote:
> We're coming to the end of the 9.1 development cycle, and I think that
> there is a serious danger of insufficient bandwidth to handle the
> large patches we have outstanding. For my part, I am hoping to find
> the bandwidth for two, MAYBE three major commits between now and the
> end of 9.1CF4, but I am not positive that I will be able to find even
> that much time, and the number of major patches vying for attention
> is considerably greater than that. Quick estimate:
> :
> - SE-Linux integration

How feasible is it to commit this 3K-line patch in the remaining 45
days? At the least, the security provider design lets us maintain the
hooks and the access control decision logic independently of the core
code. I can host the sources for this module at git.postgresql.org, so
a working module will always be obtainable from there.

The worst scenario for us would be to make no progress at all despite
the large amount of man-power spent on review and discussion. It may
be more productive to keep the features to be committed in the last CF
as small as possible, such as hooks to support a subset of the DDL
permissions, or the pg_regress enhancement needed to run the
regression tests.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
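For context, the security provider approach layers on the SECURITY
LABEL command already committed to the tree. A minimal sketch of the
user-facing side, assuming a provider registered under the name
"selinux"; the table name and the label string are illustrative:

    -- Assigns an SE-Linux-style security context to a table; the
    -- provider name, table, and label below are hypothetical.
    SECURITY LABEL FOR selinux ON TABLE customer
        IS 'system_u:object_r:sepgsql_table_t:s0';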
Robert Haas wrote:
> - true serializability - not entirely sure of the status of this

I try to keep the status section of the Wiki page up-to-date. I have
just reviewed it and tweaked it for the latest events:

http://wiki.postgresql.org/wiki/Serializable#Current_Status

There are a number of pending R&D issues:

http://wiki.postgresql.org/wiki/Serializable#R.26D_Issues

Most of these can be deferred. The ones which really need at least
some attention before release relate to how to deal with serializable
transactions on replication targets, and whether we've been properly
careful about using a coding style which is safe for machines with
weak memory ordering. I've done my best to follow discussions on that
topic and do the right thing, but someone with a deeper understanding
of the issues should probably take a look.

Someone has joined the effort starting this weekend -- a consultant
who has done a lot of technical writing (John Okite) will be working
on doc changes related to the patch. (I assume that would best be
submitted as a separate patch.)

If you want a shorter version of the patch status: we expect to have
an updated patch before the CF, including docs and incorporating
feedback from previous CFs and Heikki's comments on interim work.

-Kevin
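To illustrate what the patch buys users: true serializability closes
anomalies such as write skew, which snapshot isolation permits. A
minimal sketch of the behavior SSI is expected to enforce, using a
hypothetical doctors(name text, on_call bool) table with two doctors
on call; the exact error text depends on the patch:

    -- session 1
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;               -- sees 2
    UPDATE doctors SET on_call = false WHERE name = 'alice';
    COMMIT;

    -- session 2, interleaved with session 1
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM doctors WHERE on_call;          -- also sees 2
    UPDATE doctors SET on_call = false WHERE name = 'bob';
    COMMIT;  -- under SSI, one of the two commits should fail with a
             -- serialization failure; plain snapshot isolation lets
             -- both succeed, leaving no doctor on call (write skew)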
Robert Haas wrote:
> - MERGE
> - checkpoint improvements

As far as these two go, the state of MERGE is still rougher than I
would like. The code itself isn't too hard to read, and the fact that
the errors popping up tend to be caught by assertions (rather than
just being mysterious crashes) makes me feel a little better that
there's some defensive coding in there. It's still a 3648-line patch
that touches grammar, planner, and executor bits though, and I've been
doing mainly functional and coding style review so far. I'm afraid
there are not too many committers in a good position to actually
consume the whole scope of this thing for a commit-level review. And
the way larger patches tend to work here, I'd be surprised to find it
passes through such a review without some as-yet-unidentified major
beef appearing. Will see what we can do to help move this forward more
before the CF starts.

The checkpoint changes I'm reworking are not really large from a code
complexity or size perspective--I estimate around 350 lines of diff,
with the rough version I submitted to CF2010-11 at 258. I suspect it
will actually be the least complicated patch to consume from that
list, from a committer's perspective. The complexity there is mainly
in the performance testing; I've been gearing up infrastructure over
the last couple of weeks to automate and easily publish all the
results I collect there. The main part that hasn't gone through any
serious testing yet, auto-tuning the spread interval, will also be
really easy to revert if a problem is found there. With Simon and I
both reviewing each other's work on this already, I hope we can keep
this one from clogging the committer critical path you're worried
about here.

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
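For reference, this is the shape of statement the MERGE patch has to
carry through the grammar, planner, and executor. It follows the
SQL-standard syntax; the patch under review may differ in detail, and
the table and column names are illustrative only:

    MERGE INTO account_balances b
    USING daily_deltas d ON b.account_id = d.account_id
    WHEN MATCHED THEN
        UPDATE SET balance = b.balance + d.delta
    WHEN NOT MATCHED THEN
        INSERT (account_id, balance) VALUES (d.account_id, d.delta);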
Robert Haas <robertmhaas@gmail.com> writes:
> - extensions - may need 2 or more commits

I'm now basically done with the coding; I'm writing the docs for the
upgrade patch and preparing the upgrade SQL files for pre-9.1 to 9.1
upgrades of the contrib modules. Doing that, I've been cleaning up and
reorganising some code, and I will backport some of those changes to
the main extension patch. So I expect to send both extension.v23.patch
and extension-upgrade.v1.patch this week.

As the main extension patch has already received lots of detailed
review (both user-level and code-level) from committers, I'm not
expecting big surprises in the last commitfest. The upgrade patch
design has been discussed in detail on-list too; the dust has settled
here.

Meanwhile, there's this bugfix for HEAD that I've sent:

http://archives.postgresql.org/pgsql-hackers/2011-01/msg00078.php

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
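A rough sketch of the user-facing commands the extension and
extension-upgrade patches aim to provide, per the on-list discussion;
the extension name and version string are illustrative, and the final
syntax is whatever the committed patches define:

    -- Install a contrib module packaged as an extension:
    CREATE EXTENSION hstore;

    -- The upgrade patch targets moving pre-9.1 installations and
    -- older extension versions forward in place (version string
    -- hypothetical):
    ALTER EXTENSION hstore UPDATE TO '1.1';

    -- Dropping the extension removes its objects as a unit:
    DROP EXTENSION hstore;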