Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY? - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY?
Date
Msg-id CAM3SWZTcZJazecjq_TDhuvYLmdrdES_yWMXxG_mgDdpdDKaWsg@mail.gmail.com
Whole thread Raw
In response to Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY?  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY?
Re: Should TIDs be typbyval = FLOAT8PASSBYVAL to speed up CREATE INDEX CONCURRENTLY?
List pgsql-hackers
On Mon, Sep 7, 2015 at 9:20 PM, Peter Geoghegan <pg@heroku.com> wrote:
>>> This matters because a major cost during CREATE INDEX CONCURRENTLY is
>>> a TID-based datum sort (this is probably most of the cost over and
>>> above a conventional CREATE INDEX).
>>
>> Might be better to hack a special case right there (ie, embed TIDs into
>> int8s and sort the int8s) rather than try to change the type's SQL
>> declaration.
>
> That sounds at least as effective as what I originally sketched.

> I hope someone picks this up soon.

I suggested to someone else that he take a look at this as a project,
but I guess he was busy too. I decided to just do it myself. Patch is
attached. This has the bulkdelete callback that gathers TIDs from the
index during CREATE INDEX CONCURRENTLY encode them as int8 values
ahead of the sort, while sorting the values as int8 values. They're
later decoded as needed.

This turns out to be a relatively simple patch. Testing shows it
removes *most* of the overhead of CREATE INDEX CONCURRENTLY over
CREATE INDEX. In absolute terms, a test case involving a CREATE INDEX
CONCURRENTLY on a single integer column takes about 30% - 40% less
time with the patch applied. The TID sort itself is about 3 times
faster, and that's without the benefit of the patch making the TID
sort an internal sort where it would otherwise have been an external
sort. Sorting item pointers as TIDs naively currently wastes a lot of
memory (not just memory bandwidth) -- a pass-by-value datum sort only
ever allocates memory for SortTuples, avoiding allocating any memory
for a "tuple proper". Clearly just having the sort be pass-by-value is
*much* faster. As comments above process_ordered_aggregate_single()
put it:

 * The one-input case is handled separately from the multi-input case
 * for performance reasons: for single by-value inputs, such as the
 * common case of count(distinct id), the tuplesort_getdatum code path
 * is around 300% faster.  (The speedup for by-reference types is less
 * but still noticeable.)

Another way of stating how things are improved here is: my CREATE
INDEX CONCURRENTLY test case had about a 2.1x overhead over an
equivalent CREATE INDEX on master, but with the patch applied the
CREATE INDEX CONCURRENTLY has an overhead of only about 1.3x. The
extra logical I/O for CONCURRENTLY's second scan, and for the
physical-order index scan (to "bulk delete", to gather TIDs to sort)
is a surprisingly small cost.

I'll add the patch to the open commitfest. Think of all these patches
of mine as giving reviewers a choice of what to review. This patch
does seem like a slam dunk, even if I do say so myself, so at least
it's relatively easy to review. The external sort work remains my most
interesting pending work, though.

--
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Marko Tiikkaja
Date:
Subject: Re: proposal: function parse_ident
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Support for N synchronous standby servers - take 2