Re: PATCH: Using BRIN indexes for sorted output - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: PATCH: Using BRIN indexes for sorted output
Date
Msg-id CAEze2Wg00Pry5KsNEqJps-uX+T6f=YTm-jyLZPt=yDNO5+3v0Q@mail.gmail.com
Whole thread Raw
In response to Re: PATCH: Using BRIN indexes for sorted output  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PATCH: Using BRIN indexes for sorted output  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Mon, 10 Jul 2023 at 22:04, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> On 7/10/23 18:18, Matthias van de Meent wrote:
>> On Mon, 10 Jul 2023 at 17:09, Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>> On 7/10/23 14:38, Matthias van de Meent wrote:
>>>>> I haven't really thought about geometric types, just about minmax and
>>>>> minmax-multi. It's not clear to me what the benefit for these types be.
>>>>> I mean, we can probably sort points lexicographically, but is anyone
>>>>> doing that in queries? It seems useless for order by distance.
>>>>
>>>> Yes, that's why you would sort them by distance, where the distance is
>>>> generated by the opclass as min/max distance between the summary and
>>>> the distance's origin, and then inserted into the tuplesort.
>>>>
>>>
>>> OK, so the query says "order by distance from point X" and we calculate
>>> the min/max distance of values in a given page range.
>>
>> Yes, and because it's BRIN that's an approximation, which should
>> generally be fine.
>>
>
> Approximation in what sense? My understanding was we'd get a range of
> distances that we know covers all rows in that range. So the results
> should be accurate, no?

The distance is going to be accurate only to the degree that the
summary can produce accurate distances for the datapoints it
represents. That can be quite imprecise due to the nature of the
contained datapoints: a summary of the points (-1, -1) and (1, 1) will
have a minimum distance of 0 to the origin, where the summary (-1, 0)
and (-1, 0.5) would have a much more accurate distance of 1. The point
I was making is that the summary can only approximate the distance,
and that approximation is fine w.r.t. the BRIN sort algoritm.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)



pgsql-hackers by date:

Previous
From: "Zhijie Hou (Fujitsu)"
Date:
Subject: RE: Support logical replication of DDLs
Next
From: Aleksander Alekseev
Date:
Subject: Re: SLRUs in the main buffer pool, redux