Re: What happened to the is_ family of functions proposal? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: What happened to the is_ |
Date | |
Msg-id | AANLkTinChAnFEQP_yQ1eFyCpM53XfsVFUBP=4sbM_ODP@mail.gmail.com |
In response to |
Re: What happened to the is_ |
List | pgsql-hackers |
On Sat, Sep 25, 2010 at 10:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This is all pretty much a dead end, because it offers no confidence
> whatsoever. Suppose that COPY calls type X's input function, which
> calls function Y, which calls function Z. Z doesn't like what it sees
> so it throws an error, which it marks "recoverable" since Z hasn't
> done anything dangerous. Unfortunately, Y *did* do something that
> requires cleanup. If COPY catches the longjmp and decides that it
> can skip doing a transaction abort, you're screwed.

Yep. Although it seems a bit pathological for Z to do that, because if Y is doing something like taking an LWLock then Z is some low-level internals function that is not in a good position to judge whether error recovery is feasible. The point is not to recover from as many errors as possible, but to recover specifically from *data validation* errors, which I would not expect to be the sort of thing thrown from someplace deep down in the call stack where we're deep in the middle of things. The toplevel typinput is a pretty good position to know whether it's done anything shady.

> What I'm wondering is whether we can fix this by reducing the overhead
> of subtransactions, enough so that we can afford to run each row's input
> function calls within a subxact. In the past that was dismissed because
> you'd run out of subxact XIDs at 4G rows. But we have "lazy" assignment
> of XIDs now, so a subxact that didn't actually try to modify the
> database shouldn't need to consume any permanent resources. Then we're
> just looking at the time needed to call all the per-module subxact start
> and subxact cleanup functions, which seems like something that might be
> optimizable for the typical case where nothing actually needs to be
> done.

Well, reducing the overhead of subtransaction cleanup would certainly be VERY nice, as it would benefit a FAR broader set of use cases than just typinput functions. It seems a bit tricky though, because AbortSubTransaction() calls a whole LOT of cleanup functions, and many of them already have fast-paths. Where do you anticipate getting a further large speed-up out of that? The problem seems particularly tricky because those functions are cleaning up different subsystems. Maybe you could group them in some way and figure out some method of skipping entire groups with some kind of super-duper fast path, but it's not obvious to me how to make that work.

And I think you'd need a pretty considerable speed-up, too. My gut says that even knocking 50% off, while it might be really nice for other reasons, is not going to be enough to make sticking it inside COPY workable. I bet you need an order-of-magnitude speed-up, maybe more.

It seems like a good slice of the problem here comes from the difficulties of being certain what the state is after a longjmp. It seems like you could get around all of these difficulties almost completely if the type input function were empowered to return either (1) a Datum which is the result of the conversion or (2) an SQLSTATE and error message indicating what went wrong. We're already willing to believe that cleanup isn't required when the function returns successfully, so we ought to also believe it when the function returns a failure result (as opposed to throwing an error indicating a failure). The conditions that require cleanup here are probably transient: take an LWLock, do something, release the LWLock. As long as you know that you haven't stopped somewhere in the middle of that sequence, it seems like it should be reasonably safe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
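For readers who want to picture the "return a failure result instead of throwing" idea from the message above, here is a rough standalone sketch in C. It is not PostgreSQL code: `InputResult`, `input_failure`, `int4_input_safe`, and `load_rows` are hypothetical names invented for illustration, a plain `long` stands in for a Datum, and only the two SQLSTATE values (22P02, invalid_text_representation, and 22003, numeric_value_out_of_range) are taken from PostgreSQL's error code list. The point is only that the caller never has to reason about state after a longjmp, because the input function always returns normally.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: either a converted value or an error report. */
typedef struct InputResult
{
    bool ok;            /* true: value is valid; false: see sqlstate/message */
    long value;         /* stand-in for a Datum */
    char sqlstate[6];   /* five-character SQLSTATE plus terminating NUL */
    char message[128];  /* human-readable description of the failure */
} InputResult;

/* Build a failure result; nothing is thrown, so no cleanup is pending. */
static InputResult
input_failure(const char *sqlstate, const char *what, const char *text)
{
    InputResult r = {0};

    r.ok = false;
    snprintf(r.sqlstate, sizeof r.sqlstate, "%s", sqlstate);
    snprintf(r.message, sizeof r.message, "%s: \"%s\"", what, text);
    return r;
}

/* Hypothetical integer input function that reports bad data by return value. */
static InputResult
int4_input_safe(const char *text)
{
    InputResult r = {0};
    char *end;
    long v;

    assert(text != NULL);
    errno = 0;
    v = strtol(text, &end, 10);

    /* No digits consumed, or trailing junk after the number. */
    if (end == text || *end != '\0')
        return input_failure("22P02", "invalid input syntax for integer", text);
    /* Overflowed long, or does not fit in a 32-bit integer. */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return input_failure("22003", "value out of range for integer", text);

    r.ok = true;
    r.value = v;
    return r;
}

/* COPY-like loop: skip rows that fail validation, keep going otherwise. */
static int
load_rows(const char *const *rows, int nrows, int *nskipped)
{
    int loaded = 0;

    *nskipped = 0;
    for (int i = 0; i < nrows; i++)
    {
        InputResult r = int4_input_safe(rows[i]);

        if (r.ok)
            loaded++;
        else
        {
            fprintf(stderr, "row %d skipped: %s (SQLSTATE %s)\n",
                    i + 1, r.message, r.sqlstate);
            (*nskipped)++;
        }
    }
    return loaded;
}
```

Calling `load_rows` on the three rows "1", "x", "3" would load two rows and skip one, with the skipped row reported on stderr. Whether real input functions could be converted to this style without deeper callees still throwing is exactly the open question in the thread; the sketch only covers the simple case where validation happens at the top level.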