Re: Add sub-transaction overflow status in pg_stat_activity - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Add sub-transaction overflow status in pg_stat_activity |
Date | |
Msg-id | CA+TgmoZT0sB99dK2LWrS3txmiEDnOHNdAEi+eSYdUkK4qqqCow@mail.gmail.com |
In response to | Re: Add sub-transaction overflow status in pg_stat_activity (Andres Freund <andres@anarazel.de>) |
Responses | Re: Add sub-transaction overflow status in pg_stat_activity |
List | pgsql-hackers |
On Mon, Mar 21, 2022 at 7:45 PM Andres Freund <andres@anarazel.de> wrote:
> > It feels to me like far too much effort is being invested in fundamentally
> > the wrong direction here. If the subxact overflow business is causing
> > real-world performance problems, let's find a way to fix that, not put
> > effort into monitoring tools that do little to actually alleviate anyone's
> > pain.
>
> There seems to be some agreement on this (I certainly do agree). Thus it seems
> we should mark the CF entry as rejected?

I don't think I agree with this outcome, for two reasons.

First, we're just talking about an extra couple of columns in pg_stat_activity here, which does not seem like a heavy price to pay. I'm not even sure we need two columns; I think we could get down to one pretty easily. Rough idea: number of cached subtransaction XIDs if not overflowed, else NULL. Or, if that's likely to create 0/NULL confusion, then maybe just a Boolean, overflowed or not.

Second, the problem seems pretty fundamental to me. Shared memory is fixed size, so we cannot use it to store an unbounded number of subtransaction IDs. We could perhaps rejigger things to be more memory-efficient in some way, but no matter how many subtransaction XIDs you can keep in shared memory, the user can always consume that number plus one -- unless you allow for ~2^31 in shared memory, which seems unrealistic. To me, that means that overflowed snapshots are not going away. We could make them less painful by rewriting the SLRU stuff to be more efficient, and I bet that's possible, but I think it's probably hard, or someone would have gotten it done by now. This has been sucking for a long time and I see no evidence that progress is imminent. Even if it happens, it is unlikely that it will be a full solution. If it were possible to make SLRU lookups fast enough not to matter, we wouldn't need to have hint bits, but in reality we do have them, and attempts to get rid of them have not gone well up until now, and in my opinion probably never will.

The way that I view this problem is that it is relatively rare but hard for some users to troubleshoot. I think I've seen it come up multiple times, and judging from the earlier responses on this thread, several other people here have, too. In my experience, the problem is inevitably that someone has a DML statement inside a plpgsql EXCEPTION block inside a plpgsql loop. Concurrently with that, they are running a lot of queries that look at recently modified data, so that the overflowed snapshots trigger SLRU lookups often enough to matter.

How is a user supposed to identify which backend is causing the problem, as things stand today? I have generally given people the advice to go find the DML inside of a plpgsql EXCEPTION block inside of a loop, but some users have trouble doing that. The DBA who is observing the performance problem is not necessarily the developer who wrote all of the PL code, and the PL code may be large and badly formatted, there could be a bunch of EXCEPTION blocks, and it might not be clear which one is the problem. The exception block could be calling another function or procedure that does the actual DML rather than doing it directly, and the loop surrounding it might not be in the same function or procedure but in some other one that calls it, or it could be called repeatedly from the SQL level.
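To make that concrete, here is a minimal sketch of the pattern in question -- not taken from this thread, and assuming PostgreSQL's default per-backend cache of 64 subtransaction XIDs (PGPROC_MAX_CACHED_SUBXIDS); the table and loop count are invented for illustration:

    -- Illustration only: DML inside a plpgsql EXCEPTION block inside a loop.
    -- Each BEGIN ... EXCEPTION block opens a subtransaction, and the INSERT
    -- forces that subtransaction to be assigned an XID, so a single top-level
    -- transaction accumulates well over 64 subxids and the backend's subxid
    -- cache overflows.
    CREATE TABLE IF NOT EXISTS overflow_demo (id int PRIMARY KEY, note text);

    DO $$
    BEGIN
        FOR i IN 1 .. 100 LOOP                  -- more than 64 iterations
            BEGIN                               -- subtransaction starts here
                INSERT INTO overflow_demo VALUES (i, 'row ' || i);
            EXCEPTION WHEN unique_violation THEN
                NULL;                           -- swallow the error, keep looping
            END;
        END LOOP;
    END;
    $$;
    -- While that top-level transaction is in progress, snapshots taken by
    -- other backends are marked "suboverflowed", and visibility checks on
    -- recently modified rows can fall back to pg_subtrans (SLRU) lookups.

A column of the sort proposed here would let the DBA spot the backend doing this directly in pg_stat_activity, instead of having to audit all of the PL code.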
I think I fundamentally disagree with the idea that we should refuse to expose instrumentation data because some day the internals might change. If we accepted that argument categorically, we wouldn't have things like backend_xmin or backend_xid in pg_stat_activity, or wait events either, but we do have those things and users find them useful. They suck in the sense that you need to know quite a bit about how the internals work in order to use them to find problems, but people who want to support production PostgreSQL instances have to learn how those internals work one way or the other, because they demonstrably matter. It is absolutely stellar when we can say "hey, we don't need to have a way for users to see what's going on here internally because they don't ever need to care," but once it is established that they do need to care, we should let them see directly the data they need to care about, rather than forcing them to troubleshoot the problem in some more roundabout way like auditing all of the code and guessing which part is the problem, or writing custom dtrace scripts to run on their production instances.

If and when it happens that fields like backend_xmin or the new ones proposed here are no longer relevant, we can just remove them from the monitoring views. Yeah, that's a backward compatibility break, and there's some pain associated with that. But we have demonstrated that we are perfectly willing to incur the pain associated with adding new columns when there is new and valuable information to display, and that is equally a compatibility break, in the sense that it has about the same chance of making pg_upgrade fail.

In short, I think this is a good idea, and if somebody thinks that we should solve the underlying problem instead, I'd like to hear what people think a realistic solution might be. Because to me, it looks like we're refusing to commit a patch that probably took an hour to write because with 10 years of engineering effort we could *maybe* fix the root cause.

--
Robert Haas
EDB: http://www.enterprisedb.com