Re: [BUGS] Breakage with VACUUM ANALYSE + partitions - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [BUGS] Breakage with VACUUM ANALYSE + partitions |
Date | |
Msg-id | CA+TgmobyCtD4xRY4Ee2Jv6-qznMDaLVrMshuirXbdPNjaYVQGA@mail.gmail.com |
In response to | Re: [BUGS] Breakage with VACUUM ANALYSE + partitions (Andres Freund <andres@anarazel.de>) |
Responses | Re: [BUGS] Breakage with VACUUM ANALYSE + partitions |
List | pgsql-hackers |

On Mon, Apr 25, 2016 at 11:56 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-25 08:55:54 -0400, Robert Haas wrote:
>> Andres, this issue has now been open for more than a month, which is
>> frankly kind of ridiculous given the schedule we're trying to hit for
>> beta. Do you think it's possible to commit something RSN without
>> compromising the quality of PostgreSQL 9.6?
>
> Frankly I'm getting a bit annoyed here too. I posted a patch Friday,
> directly after getting back from pgconf.us. Saturday I posted a patch
> for an issue which I didn't cause, but which caused issues when testing
> the patch in this thread. Sunday I just worked on some small patch - one
> you did want to see resolved too. It's now 8.30 Monday morning. What's
> the point of your message?

I think that the point of my message is exactly what I said in my message. This isn't really about the last couple of days. The issue was reported on March 20th. On March 31st, Noah asked you for a plan to get it fixed by April 7th. You never replied. On April 16th, the issue not having been fixed, he followed up. You said that you would fix it next week. That week is now over, and we're into the following week. We have a patch, and that's good, and I have reviewed it and Thom has tested it, and that's good, too. But it is not clear whether you feel confident to commit it or when you might be planning to do that, so I asked. Given that this is the open item of longest tenure and that we're hoping to ship beta soon, why is that out of line?

Fundamentally, the choices for each open item are (1) get it fixed before beta, (2) revert the commit that caused it, (3) decide it's OK to ship beta with that issue, or (4) slip beta. We initially had a theory that the commit that caused this issue merely revealed an underlying problem that had existed before, but I no longer really think that's the case. That commit introduced a new way to write to blocks that might have in the meantime been removed, and it failed to make that safe. That's not to say that md.c doesn't do some wonky things, but the pre-existing code in the upper layers coped with the wonkiness and your new code doesn't, so in effect it's a regression. And in fact I think it's a regression that can be expected to be a significant operational problem for people if not fixed, because the circumstances in which it can happen are not very obscure. You just need to hold some pending flush requests in your backend-local queue while some other process truncates the relation, and boom.

So I am disinclined to option #3. I also do not think that we should slip beta for an issue that was reported this far in advance of the planned beta date, so I am disinclined to option #4. That leaves #1 and #2. I assume you will be pretty darned unhappy if we end up at #2, so I am trying to figure out if we can achieve #1. OK?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company