Re: Red-black tree for GIN - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Red-black tree for GIN
Date
Msg-id 603c8f071001251924h164b3d7fjced9e2b0142e5b5c@mail.gmail.com
Whole thread Raw
In response to Re: Red-black tree for GIN  (Teodor Sigaev <teodor@sigaev.ru>)
List pgsql-hackers
2010/1/24 Teodor Sigaev <teodor@sigaev.ru>:
>> Would it be OK if I went through here and hacked on the comments and
>> sent this back to you?
>
> Feel free to edit the patch.

Here's an edited version, which I've now reviewed more fully.  Some
more substantive review comments:

1. I'm pretty satisfied that the rbtree code is generally sane,
although I wonder if we should think about putting it in access/common
rather than utils/misc.  I'm not sure that I have a sufficiently
clearly defined notion of what each subdirectory is for to draw a
definitive conclusion on this point; hopefully someone else will weigh
in.

2. I'm a little concerned about the performance implications of this
patch.  Looking at http://www.sai.msu.su/~megera/wiki/2009-04-03, it's
clear that the patch is a huge win in some cases.  But I'm also
surprised by how much performance is lost in some of the other cases
that you tested.  I suspect, on balance, that it's probably still a
good idea to put this in, but I wonder if you've profiled this at all
to see where the extra time is going and/or explored possible ways of
squashing that overhead down a little more.

3. In ginInsertEntry(), the "else" branch of the if statement really
looks like magic when you first read it.  I wonder if it would make
sense to pull the contents of EAAllocate() directly into this
function, since there's only one call site anyhow.

There are a lot of whitespace changes in the version I'm attaching
compared to your last one - your pg_indent run didn't use a typedef
list that included the new typedefs, so some of the spacing came out
kind of wacky.  I also fleshed out the comments a bit, deleted one
comment that was no longer true, and changed some of the function
names in rbtree.c to make them more consistent, but the changes are
all pretty trivial.

I still have not done any performance testing of my own on this code,
and it probably needs that.  If anyone else has time to step up and
help with that, it would be much appreciated.  It would be useful to
have some plain old benchmarks as well as some profiling data as
mentioned above.

...Robert

Attachment

pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: plpython3 perf
Next
From: Greg Stark
Date:
Subject: Re: Fwd: Questions about connection clean-up and "invalid page header"