Re: [PATCH] PREPARE TRANSACTION unexpected behavior with TEMP TABLE - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [PATCH] PREPARE TRANSACTION unexpected behavior with TEMP TABLE |
Date | |
Msg-id | CA+TgmoaQHd=Ln2_QoHLGpg8ibu5F3HjVOZTF3da3adSN6sEemQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] PREPARE TRANSACTION unexpected behavior with TEMP TABLE (Himanshu Upadhyaya <upadhyaya.himanshu@gmail.com>) |
List | pgsql-hackers |
On Wed, Apr 14, 2021 at 6:45 AM Himanshu Upadhyaya <upadhyaya.himanshu@gmail.com> wrote: > The purpose of this FIX is to mainly focus on getting consistent behavior > with PREPARE TRANSACTION. With the case that I had mentioned > previously, my expectation was either both PREPARE TRANSACTION should fail > or both should succeed but here second same "PREPARE TRANSACTION" was > successful however first one was failing with an error, which is kind of weird to me. I agree that it's weird, but that doesn't mean that your patch is an improvement, and I don't think it is. If we could make this thing more consistent without incurring any negatives, I'd be in favor of that. But the patch does have some negatives, which in my opinion are more substantial than the problem you're trying to fix. Namely, possible performance consequences, and undocumented and fragile assumptions that, as it seems to me, may easily get broken in the future. I see that you've repeatedly capitalized the word FIX in your reply, but it's just not that simple. If this had really bad consequences like corrupting data or crashing the server then it would be essential to do something about it, but so far the worst consequence you've indicated is that an obscure sequence of SQL commands that no real user is likely to issue produces a slightly surprising result. That's not great, but neither is it an emergency. > I have also tried to reproduce the behavior. Your test case isn't ideal for reproducing the problem that the comment is worrying about. Normally, when we take a lock on a table, we hold it until commit. But, that only applies when we run a query that mentions the table, like a SELECT or an UPDATE. In your case, we only end up opening the table to build a relcache entry for it, so that we can look at the metadata. And, catalog scans used to build syscache and relcache entries release locks immediately, rather than waiting until the end of the transaction. So it might be that if we failed to ever set XACT_FLAGS_ACCESSEDTEMPNAMESPACE in your test case, everything would be fine. That doesn't seem to be true in general though. I tried changing "cannot PREPARE a transaction that has operated on temporary objects" from an ERROR to a NOTICE and then ran 'make check'. It hung. I think this test case is the same problem as the regression tests hit; in any case, it also hangs: rhaas=# begin; BEGIN rhaas=*# create temp table first (); CREATE TABLE rhaas=*# prepare transaction 'whatever'; NOTICE: cannot PREPARE a transaction that has operated on temporary objects PREPARE TRANSACTION rhaas=# create temp table second (); [ hangs ] I haven't checked, but I think the problem here is that the first transaction had to create this backend's pg_temp schema and the second one can't see the results of the first one doing it so it wants to do the same thing and that results in waiting for a lock the prepared transaction already holds. I made a quick attempt to reproduce a hang at backend exit time, but couldn't find a case where that happened. That doesn't mean there isn't one, though. There's a very good chance that the person who wrote that comment knew that a real problem existed, and just didn't describe it well enough for you or I to immediately know what it is. It is also possible that they were completely wrong, or that things have changed since the comment was written, but we can't assume that without considerably more research and analysis than either of us has done so far. I think one point to take away here is that question of whether a temporary relation has been "accessed" is not all black and white. If I ran a SELECT statement against a relation, I think it's clear that I've accessed it. But, if I just used the name of that relation as a type name in some other SQL command, did I really access it? The current code's answer to that is that if we had to open and lock the relation to get its metadata, then we accessed it, and if we already had the details that we needed in cache, then we did not access it. Now, again, I agree that looks a little weird from a user point of view, but looking at the implementation, you can kind of see why it ends up like that. From a certain point of view, it would be more surprising if we never opened or locked the relation and yet ended up deciding that we'd accessed it. Now maybe we should further explore going the other direction and avoiding setting the flag at all in this case, since I think we're neither retaining a lock nor touching any relation buffers, but I think that needs more analysis. Even if we decide that's safe, there's still the problem of finding a better implementation that's not overly complex for what really feels like a very minor issue. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: