Re: PATCH: Using BRIN indexes for sorted output - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: PATCH: Using BRIN indexes for sorted output
Msg-id CAEze2Wj_UjWdpMfDSpYWXFnr2K2yX9zpihsD4iEZ9+JFMLOknw@mail.gmail.com
In response to Re: PATCH: Using BRIN indexes for sorted output  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PATCH: Using BRIN indexes for sorted output
List pgsql-hackers
On Mon, 10 Jul 2023 at 13:43, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> On 7/10/23 12:22, Matthias van de Meent wrote:
>> On Fri, 7 Jul 2023 at 20:26, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>> However, what's not quite clear to me is what you mean by the weight- and
>>> compare-operators? That is, what are
>>>
>>>  - brin_minmax_minorder(PG_FUNCTION_ARGS=brintuple) -> range.min
>>>  - brin_minmax_maxorder(PG_FUNCTION_ARGS=brintuple) -> range.max
>>>  - brin_minmax_compare(order, order) -> int
>>>
>>> supposed to do? Or what does "PG_FUNCTION_ARGS=brintuple" mean?
>>
>> _minorder/_maxorder is for extracting the minimum/maximum relative
>> order of each range, used for ASC/DESC sorting of operator results
>> (e.g. to support ORDER BY <->(box_column, '(1,2)'::point) DESC).
>> PG_FUNCTION_ARGS is mentioned because of PG calling conventions;
>> though I did forget to describe the second operator argument for the
>> distance function. We might also want to use only one such "order
>> extraction function" with DESC/ASC indicated by an argument.
>>
>
> I'm still not sure I understand what "minimum/maximum relative order"
> is. Isn't it the same as returning min/max value that can appear in the
> range? Although, that wouldn't work for points/boxes.

Kind of. For single-dimensional opclasses (minmax, minmax_multi) we
only need to extract the normal min/max values for ASC/DESC sorts,
which are readily available in the summary. But for multi-dimensional
and distance searches (nearest neighbour) we need to calculate the
distance between the indexed value(s) in the summary and the origin
value that the summary is compared against. The order would thus be
ASC/DESC on that distance. Such a distance may not be precisely
representable by float types, hence 'relative order' with its own
order operation.
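
To make that concrete, here is a rough sketch (not code from the
patch; every type and function name in it is hypothetical) of what a
min/max order extraction could look like for a box summary and a point
origin. It assumes integer coordinates with magnitude below 2^30, and
it uses the squared distance as the order key: that sorts the same way
as the true distance but stays exact, which is the reason for having a
separate compare function instead of going through floats.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a BRIN range summary and a query origin. */
typedef struct { int32_t xlo, ylo, xhi, yhi; } BoxSummary;
typedef struct { int32_t x, y; } Point;

/* Order key: squared Euclidean distance, kept as an exact integer. */
typedef uint64_t OrderKey;

/* Square a non-negative gap; the bound keeps the sum of two squares in range. */
static uint64_t sq(int64_t gap)
{
    assert(gap >= 0 && gap < ((int64_t) 1 << 31));
    return (uint64_t) gap * (uint64_t) gap;
}

/* Gap along one axis from c to the nearest point of [lo, hi]; 0 if inside. */
static int64_t axis_min_gap(int64_t c, int64_t lo, int64_t hi)
{
    if (c < lo) return lo - c;
    if (c > hi) return c - hi;
    return 0;
}

/* Gap along one axis from c to the farther end of [lo, hi]. */
static int64_t axis_max_gap(int64_t c, int64_t lo, int64_t hi)
{
    int64_t a = c > lo ? c - lo : lo - c;
    int64_t b = c > hi ? c - hi : hi - c;
    return a > b ? a : b;
}

/* Smallest order key any value inside the summarized box can have. */
OrderKey sketch_minorder(const BoxSummary *s, Point o)
{
    return sq(axis_min_gap(o.x, s->xlo, s->xhi)) +
           sq(axis_min_gap(o.y, s->ylo, s->yhi));
}

/* Largest order key any value inside the summarized box can have. */
OrderKey sketch_maxorder(const BoxSummary *s, Point o)
{
    return sq(axis_max_gap(o.x, s->xlo, s->xhi)) +
           sq(axis_max_gap(o.y, s->ylo, s->yhi));
}

/* Three-way comparison of order keys: -1, 0 or 1. */
int sketch_compare(OrderKey a, OrderKey b)
{
    return (a > b) - (a < b);
}
```

For a box (0,0)-(10,10) and origin (13,14) this gives a minimum key of
25 and a maximum key of 365; an origin inside the box gives a minimum
key of 0.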

>>> In principle we just need a procedure that tells us min/max for a given
>>> page range - I guess that's what the minorder/maxorder functions do? But
>>> why would we need the compare one? We're comparing by the known data
>>> type, so we can just delegate the comparison to that, no?
>>
>> Is there a comparison function for any custom orderable type that we
>> can just use? GIST distance ordering uses floats, and I don't quite
>> like that from a user perspective, as it makes ordering operations
>> imprecise. I'd rather allow (but discourage) any type with its own
>> compare function.
>>
>
> I haven't really thought about geometric types, just about minmax and
> minmax-multi. It's not clear to me what the benefit for these types would be.
> I mean, we can probably sort points lexicographically, but is anyone
> doing that in queries? It seems useless for order by distance.

Yes, that's why you would sort them by distance. The opclass generates
that distance as the min/max distance between the summary and the
distance's origin, and it is then inserted into the tuplesort.
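
As a simplified illustration of that last step (again not code from
the patch: plain qsort() stands in for the tuplesort, and the struct
and function names are made up), ordering the page ranges by their
opclass-generated minimum order key for an ASC scan could look like
this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-range entry; the keys are whatever the opclass produced. */
typedef struct
{
    uint32_t first_block;   /* first heap block of the page range */
    uint64_t min_order;     /* minimum order key within the range */
    uint64_t max_order;     /* maximum order key within the range */
} RangeOrder;

/* Compare by min_order, then max_order, then block number to break ties. */
static int range_order_cmp(const void *pa, const void *pb)
{
    const RangeOrder *a = pa;
    const RangeOrder *b = pb;

    if (a->min_order != b->min_order)
        return (a->min_order > b->min_order) - (a->min_order < b->min_order);
    if (a->max_order != b->max_order)
        return (a->max_order > b->max_order) - (a->max_order < b->max_order);
    return (a->first_block > b->first_block) - (a->first_block < b->first_block);
}

/* Sort ranges ascending, so the range with the nearest values comes first. */
void sketch_sort_ranges_asc(RangeOrder *ranges, size_t n)
{
    qsort(ranges, n, sizeof(RangeOrder), range_order_cmp);
}
```

A DESC scan would sort on max_order in reverse instead; the tie-breaks
are only there to make the result deterministic, since qsort() is not
a stable sort.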

(previously)
>>> I finally had time to look at this patch again. There's a bit of bitrot,
>>> so here's a rebased version (no other changes).

It seems like you forgot to attach the rebased patch, so unless you're
actively working on updating the patchset right now, could you send
the rebase to make CFBot happy?


Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)


