Re: Yet another fast GiST build - Mailing list pgsql-hackers

From Emre Hasegeli
Subject Re: Yet another fast GiST build
Date
Msg-id CAE2gYzywrGtuhMcx06L+NthXtNncz_8nfOS15_OUCKPtqjs5yw@mail.gmail.com
Whole thread Raw
In response to Re: Yet another fast GiST build  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: Yet another fast GiST build  (Daniel Gustafsson <daniel@yesql.se>)
Re: Yet another fast GiST build  (Andrey Borodin <x4mmm@yandex-team.ru>)
List pgsql-hackers
I tried reviewing the remaining patches.  It seems to work correctly,
and passes the tests on my laptop.

> In this pattern I flipped PointerGetDatum(a) to PointerGetDatum(ra.lower), because it seems to me correct. I've
followedrule of thumb: every sort function must extract and use "lower" somehow. Though I suspect numeric a bit. Is it
regularvarlena?
 

As far as I understand, we cannot use the sortsupport functions from
the btree operator classes because the btree_gist extension handles
things differently.  This is unfortunate and a source of bugs [1], but
we cannot do anything about it.

Given that the lower and upper datums must be the same for the leaf
nodes, it makes sense to me to compare one of them.

Using numeric_cmp() for numeric in line with using bttextcmp() for text.

> +   /*
> +    * Numeric has abbreviation routines in numeric.c, but we don't try to use
> +    * them here. Maybe later.
> +    */

This is also true for text.  Perhaps we should also add a comment there.

> PFA patchset with v6 intact + two fixes of discovered issues.

> +   /* Use byteacmp(), like gbt_bitcmp() does */

We can improve this comment by incorporating Heikki's previous email:

> Ok, I think I understand that now. In btree_gist, the *_cmp() function
> operates on non-leaf values, and *_lt(), *_gt() et al operate on leaf
> values. For all other datatypes, the leaf and non-leaf representation is
> the same, but for bit/varbit, the non-leaf representation is different.
> The leaf representation is VarBit, and non-leaf is just the bits without
> the 'bit_len' field. That's why it is indeed correct for gbt_bitcmp() to
> just use byteacmp(), whereas gbt_bitlt() et al compares the 'bit_len'
> field separately. That's subtle, and 100% uncommented.

I think patch number 3 should be squashed to patch number 1.

I couldn't understand patch number 2 "Remove DEBUG1 verification".  It
seems like something rather useful.

[1] https://www.postgresql.org/message-id/flat/201010112055.o9BKtZf7011251%40wwwmaster.postgresql.org



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: "debug_invalidate_system_caches_always" is too long
Next
From: Masahiro Ikeda
Date:
Subject: Re: Transactions involving multiple postgres foreign servers, take 2