Pre-alloc ListCell's optimization - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Pre-alloc ListCell's optimization |
Date | |
Msg-id | 20110525025621.GH4548@tamriel.snowman.net Whole thread Raw |
Responses |
Re: Pre-alloc ListCell's optimization
(Stephen Frost <sfrost@snowman.net>)
Re: Pre-alloc ListCell's optimization (Alvaro Herrera <alvherre@commandprompt.com>) Re: Pre-alloc ListCell's optimization (Alvaro Herrera <alvherre@commandprompt.com>) Re: Pre-alloc ListCell's optimization (Robert Haas <robertmhaas@gmail.com>) Re: Pre-alloc ListCell's optimization (Stephen Frost <sfrost@snowman.net>) Re: Pre-alloc ListCell's optimization (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
Greetings, Someone (*cough*Haas*cough) made a claim over beers at PGCon that it would be very difficult to come up with a way to pre-allocate List entries and maintain the current List API. I'll admit that it wasn't quite as trivial as I had *hoped*, but attached is a proof-of-concept patch which does it. A couple of notes regarding the patch: First, it uses ffs(), which might not be fully portable.. We could certainly implement the same thing in userspace and use ffs() when it's available. Second, there are a couple of bugs (or at least, I'll characterize them that way) where we're pfree'ing a list which has been passed to list_concat(). That's not strictly legal as either argument passed to list_concat() could be returned and so one might end up free'ing the list pointer that was just returned. Those pfree's are commented out in this patch, but really should probably just be removed or replaced with 'right' code (if it's critical to free this stuff). Third, COPY_NODE_CELL() wasn't used anywhere other than _copyList and would be difficult to keep as a macro given the way allocations happen now for lists. It's no longer being used at all and therefore should really be removed. Fourth, the current implementation goes ahead and allocates 8 ListCell's, which quadruples the size of a List (40 bytes to 168 bytes, assuming 64bit ints). I don't see that as really being an issue, but perhaps others would disagree. Fifth, list_concat() has become more like list_concat_unique() and friends (and hence slower). Instead of being able to just tack one list on to the end of the other, we have to do an actual copy of the list. This is due to having to allow callers to list_free(), which will call pfree(), and we don't have any way to know which ListCell's from the *old* list were pre-allocated and which weren't, which can end up causing pfree() to crash when it's passed an invalid pointer. In general, I'd hope that not having to palloc() for a small list might make up for a lot of that slowdown, but you can't really argue that anything O(n) can compete with the previous O(1) approach. Finally, sorry it's kind of a fugly patch, it's just a proof of concept and I'd be happy to clean it up if others feel it's worthwhile and a reasonable approach, but I really need to get it out there and take a break from it (I've been a bit obsessive-compulsive about it since PGCon.. :D). Thanks! Stephen
Attachment
pgsql-hackers by date: