Red-black tree for GIN - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Red-black tree for GIN
Date
Msg-id 4B0A8DFA.7050009@sigaev.ru
Whole thread Raw
Responses Re: Red-black tree for GIN
List pgsql-hackers
Hi there,

attached is a patch, which contains implementation of a  red-black
tree,  a self-balanced implementation of binary tree.  The main goal of
this patch is to improve creation time of GIN index in the corner cases.
While creation, GIN collects data in memory in binary tree until it
reach some limit and then it flush tree to disk. Some data could
produces unbalanced binary tree,  for example, sorted data, so the tree
degenerates to the list with O(N^2) processing time (see thread
http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php)
), which cause very slow index creation.  Tom has fixed that by limiting
depth of tree  (committed to 8.3 and 8.4),  but we found it's not enough
and propose to use red-black tree, which is very good for skewed data and has
almost the same performance for unsorted data, see
http://www.sai.msu.su/~megera/wiki/2009-07-27 and
http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information.

Implementation of red-black tree has several currently unused  methods,
but they will be used in next patches.

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: forget patch win32.mak.
Next
From: Teodor Sigaev
Date:
Subject: point_ops for GiST