Thread: SortSupport for UUID type

SortSupport for UUID type

From

Peter Geoghegan

Date:

04 October 2015, 06:21:09

Attached is SortSupport for UUID type patch. This accelerates all UUID
related sort-intense operations, making them about twice as fast with
millions of tuples. I know that Heroku has plenty of tables used by
various applications with a UUID primary key, so it will be nice to
have CREATE INDEX go so much faster in those cases. The SortSupport
routine uses abbreviation, and relies on unsigned integer comparisons.
It has a dependency on the new abbreviated key for text patch series
[1].

For full details, see commit message. It's a bit unfortunate that I
have yet another sort patch here, without being much closer to getting
every other such patch committed, but I happened to have written this
patch a while ago. I wanted to shape-up the common API that this patch
and the related text patch use, so it just made sense to submit the
UUID patch now.

This patch is not all that exciting -- the techniques used here are
very simple, and it's all familiar territory for us. I expect that
this patch will not be at all contentious when someone eventually gets
around to reviewing it.

[1] http://www.postgresql.org/message-id/CAM3SWZSTKmOnosidBYeCY9pJT=hhGUwwmxxpWSEJB7Kf2oGgJA@mail.gmail.com
--
Peter Geoghegan

Attachment

0004-Add-SortSupport-routine-for-UUID-data-type.patch

Re: SortSupport for UUID type

From

Robert Haas

Date:

06 October 2015, 20:15:34

On Sun, Oct 4, 2015 at 2:21 AM, Peter Geoghegan <pg@heroku.com> wrote:
> Attached is SortSupport for UUID type patch. This accelerates all UUID
> related sort-intense operations, making them about twice as fast with
> millions of tuples. I know that Heroku has plenty of tables used by
> various applications with a UUID primary key, so it will be nice to
> have CREATE INDEX go so much faster in those cases. The SortSupport
> routine uses abbreviation, and relies on unsigned integer comparisons.
> It has a dependency on the new abbreviated key for text patch series
> [1].
>
> For full details, see commit message. It's a bit unfortunate that I
> have yet another sort patch here, without being much closer to getting
> every other such patch committed, but I happened to have written this
> patch a while ago. I wanted to shape-up the common API that this patch
> and the related text patch use, so it just made sense to submit the
> UUID patch now.

+        tmp = ((uint32) res ^ (uint32) ((uint64) res >> 32));

The outer set of parentheses here seems pretty worthless.  Perhaps one
of the parentheses at the end of the statement should be moved to just
after "res".  That seems like it would add considerably clarity.

> This patch is not all that exciting -- the techniques used here are
> very simple, and it's all familiar territory for us. I expect that
> this patch will not be at all contentious when someone eventually gets
> around to reviewing it.

There is no doubt that we need more reviewers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SortSupport for UUID type

From

Peter Geoghegan

Date:

09 October 2015, 00:28:00

On Tue, Oct 6, 2015 at 1:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> +        tmp = ((uint32) res ^ (uint32) ((uint64) res >> 32));
>
> The outer set of parentheses here seems pretty worthless.  Perhaps one
> of the parentheses at the end of the statement should be moved to just
> after "res".  That seems like it would add considerably clarity.

This is more or less lifted from numeric_abbrev_convert_var(). Perhaps
you should change it there too. The extra set of parenthesis are
removed in the attached patch. The patch also mechanically updates
things to be consistent with the text changes on the text thread [1]
-- I had to rebase.

>> This patch is not all that exciting -- the techniques used here are
>> very simple, and it's all familiar territory for us. I expect that
>> this patch will not be at all contentious when someone eventually gets
>> around to reviewing it.
>
> There is no doubt that we need more reviewers.

I'm trying to do review following my burst of productivity on sorting
(especially external sorting) -- I should manage to do a bunch of
patch review when I return from vacation in about a month (I leave in
a couple of weeks). I'm currently privately helping a new contributor
with a large project in its early stages, so that is something.
Perhaps you'll hear more about that before too long.

[1] http://www.postgresql.org/message-id/CAM3SWZTaVFBwtHF87OpNGN2r2_he-wsmN53HmqyWYPM=K51rEQ@mail.gmail.com
--
Peter Geoghegan

Attachment

0003-Add-SortSupport-routine-for-UUID-data-type.patch

Re: SortSupport for UUID type

From

Peter Geoghegan

Date:

06 November 2015, 00:10:27

On Thu, Oct 8, 2015 at 5:27 PM, Peter Geoghegan <pg@heroku.com> wrote:
> This is more or less lifted from numeric_abbrev_convert_var(). Perhaps
> you should change it there too. The extra set of parenthesis are
> removed in the attached patch. The patch also mechanically updates
> things to be consistent with the text changes on the text thread [1]
> -- I had to rebase.

Attached is almost the same patch, but rebased. This was required
because the name of our new macro was changed to
DatumBigEndianToNative() at the last minute.


--
Peter Geoghegan

Attachment

0001-Add-SortSupport-routine-for-UUID-data-type.patch

Re: SortSupport for UUID type

From

Kyotaro HORIGUCHI

Date:

06 November 2015, 08:35:50

Hello, I tried to look on this as far as I can referring to
numeric.c..

1. Patch application
 This patch applies on the current master cleanly. And looks to be work as expected.

2. uuid.c
 pg_bswap.h is included under hash.h so it is not needed to be included but I don't object if you include it
explicitly.

3.  uuid_sortsupport()
 The comment for PrepareSortSupportFromIndexRel says that,
> * Caller must previously have zeroed the SortSupportData structure and then> * filled in ssup_cxt, ssup_attno,
ssup_collation,and ssup_nulls_first.  This

And all callers comply with this order, and numeric_sortsupportrelies on this contract and does not initialize
ssup->extrabutuuid_sortsupport does. I think these are better to be in uniformstyle. I think removing ssup_extra = NULL
andadding a commentthat ssup_extra has been initialized by the caller instead.

4. uuid_cmp_abbrev()
Although I don't know it is right or not, 3-way comparisonbetween two Datums surely works.

5. uuid_abbrev_abort()
I don't know the validity of every thresholds, but they work asexpected.

6. uuid_abbrev_convert()
> memcpy((char *) &res, authoritative->data, sizeof(Datum));
memcpy's prototype is "memcpy(void *dest..." so the cast to(char *) is not necessary.

7. system catalogs
uuid_sortsupport looks to be properly registered so thatPrepareSortSupportFrom(OrderingOp|IndexRel)() can find and it.

regards,

At Thu, 5 Nov 2015 16:10:21 -0800, Peter Geoghegan <pg@heroku.com> wrote in
<CAM3SWZR7xv_9p4V2xpatk2xK7Jxmcz8B6BjAbKtAwhxBEmAK=A@mail.gmail.com>
> On Thu, Oct 8, 2015 at 5:27 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > This is more or less lifted from numeric_abbrev_convert_var(). Perhaps
> > you should change it there too. The extra set of parenthesis are
> > removed in the attached patch. The patch also mechanically updates
> > things to be consistent with the text changes on the text thread [1]
> > -- I had to rebase.
> 
> Attached is almost the same patch, but rebased. This was required
> because the name of our new macro was changed to
> DatumBigEndianToNative() at the last minute.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: SortSupport for UUID type

From

Robert Haas

Date:

06 November 2015, 17:18:34

On Thu, Nov 5, 2015 at 7:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Oct 8, 2015 at 5:27 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> This is more or less lifted from numeric_abbrev_convert_var(). Perhaps
>> you should change it there too. The extra set of parenthesis are
>> removed in the attached patch. The patch also mechanically updates
>> things to be consistent with the text changes on the text thread [1]
>> -- I had to rebase.
>
> Attached is almost the same patch, but rebased. This was required
> because the name of our new macro was changed to
> DatumBigEndianToNative() at the last minute.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SortSupport for UUID type

From

Robert Haas

Date:

06 November 2015, 17:19:17

On Fri, Nov 6, 2015 at 3:35 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, I tried to look on this as far as I can referring to
> numeric.c..

Oops, I didn't see this review before committing.

> 6. uuid_abbrev_convert()
>
>  > memcpy((char *) &res, authoritative->data, sizeof(Datum));
>
>  memcpy's prototype is "memcpy(void *dest..." so the cast to
>  (char *) is not necessary.

This is a good catch, so I pushed a fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: SortSupport for UUID type

From

Peter Geoghegan

Date:

06 November 2015, 19:32:58

On Fri, Nov 6, 2015 at 9:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> This is a good catch, so I pushed a fix.

Thanks for your help.


-- 
Peter Geoghegan

Re: SortSupport for UUID type

From

Peter Geoghegan

Date:

06 November 2015, 19:34:00

On Fri, Nov 6, 2015 at 12:35 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, I tried to look on this as far as I can referring to
> numeric.c..

Thank you for the review, Horiguchi-san.


-- 
Peter Geoghegan