Re: avoiding tuple copying in btree index builds - Mailing list pgsql-hackers

From Robert Haas
Subject Re: avoiding tuple copying in btree index builds
Date
Msg-id CA+TgmoajZ3FZrYSx_0u14ymNBYQHMoJ332AFScTEpwJtCO5tYg@mail.gmail.com
In response to Re: avoiding tuple copying in btree index builds  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: avoiding tuple copying in btree index builds  (Amit Kapila <amit.kapila16@gmail.com>)
Re: avoiding tuple copying in btree index builds  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Jun 1, 2014 at 3:26 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, May 6, 2014 at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, May 5, 2014 at 2:13 PM, Andres Freund <andres@2ndquadrant.com>
>> wrote:
>> > On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
>> >> Today, I discovered that when building a btree index, the btree code
>> >> uses index_form_tuple() to create an index tuple from the heap tuple,
>> >> calls tuplesort_putindextuple() to copy that tuple into the sort's
>> >> memory context, and then frees the original one it built.  This seemed
>> >> inefficient, so I wrote a patch to eliminate the tuple copying.  It
>> >> works by adding a function tuplesort_putindextuplevalues(), which
>> >> builds the tuple in the sort's memory context and thus avoids the need
>> >> for a separate copy.  I'm not sure if that's the best approach, but
>> >> the optimization seems worthwhile.
>> >
>> > Hm. It looks like we could quite easily just get rid of
>> > tuplesort_putindextuple(). The hash usage doesn't look hard to convert.
>>
>> I glanced at that, but it wasn't obvious to me how to convert the hash
>> usage.  If you have an idea, I'm all ears.
>
> I also think it's possible to have a similar optimization for hash index
> builds in case the tuple has to be spooled for sorting.
>
> In function hashbuildCallback(), when buildstate->spool is true, we
> can avoid forming the index tuple. To check for nulls before calling
> _h_spool(), we can traverse the isnull array.

Hmm, that might work.  Arguably it's less efficient, but on the other
hand if it avoids forming the tuple sometimes it might be MORE
efficient.  And anyway the difference might not be enough to matter.
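
To make the suggestion concrete, here is a rough, standalone sketch of the idea: scan the isnull array first, and only hand the values to the spool when no key column is NULL, so that no intermediate tuple is formed just to test it for nulls. Everything here (DemoSpool, any_key_is_null(), demo_build_callback()) is a hypothetical stand-in invented for illustration, not the actual hash AM code or a proposed patch, and it assumes NULL keys are simply skipped, as the null check in the quoted suggestion implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the spool: it only counts what happened. */
typedef struct DemoSpool
{
    int spooled;   /* rows passed on to the sort */
    int skipped;   /* rows rejected because a key column was NULL */
} DemoSpool;

/* Return true if any of the natts key columns is NULL. */
static bool
any_key_is_null(const bool *isnull, int natts)
{
    assert(natts >= 0);
    for (int i = 0; i < natts; i++)
    {
        if (isnull[i])
            return true;
    }
    return false;
}

/*
 * Hypothetical build callback for the spooling case.  The null check is
 * done on the isnull array directly, before any tuple would be formed.
 */
static void
demo_build_callback(DemoSpool *spool, const long *values,
                    const bool *isnull, int natts)
{
    if (any_key_is_null(isnull, natts))
    {
        spool->skipped++;
        return;
    }

    /* A real implementation would pass values/isnull to the sort here. */
    (void) values;
    spool->spooled++;
}
```

Whether this wins anything depends on the trade-off above: the array scan is extra work for rows without nulls, but rows with nulls never pay for forming a tuple.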

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


