domain_in performance considerations - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | domain_in performance considerations |
Date | |
Msg-id | 14867.1144298696@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: domain_in performance considerations
|
List | pgsql-hackers |
Neil Conway wrote some pretty nice things here: http://www.advogato.org/person/nconway/diary.html?start=26 but commented > It would be worthwhile to investigate whether this results in a > performance regression, though: there's no easy way to cache the > executor machinery needed to evaluate a CHECK constraint in this > design, whereas the prior design allowed each call-site to implement > its own EState caching. which echoes some performance worries I'd expressed in the original proposal back here: http://archives.postgresql.org/pgsql-hackers/2005-07/msg00320.php I wanted to mention a couple of things for the archives, before they fade from short-term memory: The patch-as-committed arranges to cache the results of GetDomainConstraints, which I think is the worst performance hit involved, since it requires at least one and possibly several indexscans of the pg_constraint system catalog. The cache has lifetime equal to the caller's caching of FmgrInfo for the domain_in function, which is ordinarily query-lifespan, which I think is about right. And the output of GetDomainConstraints is just a palloc'd node tree, so it'll go away when the memory context containing the FmgrInfo is freed. So I don't see any particular issues there. What is not so pleasant is that the domains.c code instantiates and destroys an EState and subsidiary structure for each call, if there are any CHECK constraints to be checked. This is normally only a few malloc's, so it's not *that* big a deal, but it could easily be a performance-limiting factor. It would be just a small change to make the code cache the EState across calls, saving a link to it in the FmgrInfo, but I am worried about that. If the EState's query context is made to be a child of the memory context containing the caller's FmgrInfo, then there is no problem as far as memory management goes --- destroying that parent context will make all the memory associated with the EState go away. The problem is that an EState might have other open resources, principally buffer pins, and there would not be any clean way to close down those resources when the EState is no longer needed. We don't have any sort of shutdown callback that domain_in could make use of in the general case. It's possible that caching the EState would be OK without any shutdown callback, because I suspect that an EState used only for legal CHECK constraint expressions (which may not contain sub-selects) would own no non-memory resources. But it'd take some effort to verify that, and it seems like a pretty fragile assumption over the long haul anyway. Another idea I had looked at seriously was to make callers of input functions pass in an EState if they had one available (this could be done via the "context" field we already have in FunctionCallInfoData). However, inspection of the existing callers of input functions showed that that'd only readily work for COPY. We could maybe have made it work for plpgsql, but I had a lot of questions as to whether the available EState would have the right lifespan relative to the FmgrInfo. Everybody else was out in the cold anyway, because they had no ready access to any suitable EState. So I felt that I should commit a version-zero of the patch that I was pretty confident would *work*, and not get into these dubious performance-enhancing tweaks. We can try to spin it up from here, if needed. It'd be worth first profiling some domain-intensive queries and seeing if there's any reason to worry or not. In any case, I'll fall back on the wisdom of the sage who said "I can make this program arbitrarily fast ... if it doesn't have to give the right answer". Domains are giving measurably more-right answers today than they were yesterday. Making 'em fast comes after. regards, tom lane
pgsql-hackers by date: