Re: PATCH: Using BRIN indexes for sorted output - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: PATCH: Using BRIN indexes for sorted output |
Date | |
Msg-id | 825f648f-2a31-1870-942f-03fba2b6b3cf@enterprisedb.com Whole thread Raw |
In response to | Re: PATCH: Using BRIN indexes for sorted output (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: PATCH: Using BRIN indexes for sorted output
|
List | pgsql-hackers |
Hi, I finally had time to look at this patch again. There's a bit of bitrot, so here's a rebased version (no other changes). [more comments inline] On 2/27/23 16:40, Matthias van de Meent wrote: > Hi, > > On Thu, 23 Feb 2023 at 17:44, Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: >> >> I'll see to further reviewing 0004 and 0005 when I have additional time. > > Some initial comments on 0004: > >> +/* >> + * brin_minmax_ranges >> + * Load the BRIN ranges and sort them. >> + */ >> +Datum >> +brin_minmax_ranges(PG_FUNCTION_ARGS) >> +{ > > Like in 0001, this seems to focus on only single columns. Can't we put > the scan and sort infrastructure in brin, and put the weight- and > compare-operators in the opclasses? I.e. > brin_minmax_minorder(PG_FUNCTION_ARGS=brintuple) -> range.min and > brin_minmax_maxorder(PG_FUNCTION_ARGS=brintuple) -> range.max, and a > brin_minmax_compare(order, order) -> int? I'm thinking of something > similar to GIST's distance operators, which would make implementing > ordering by e.g. `pointcolumn <-> (1, 2)::point` implementable in the > brin infrastructure. > > Note: One big reason I don't really like the current > brin_minmax_ranges (and the analyze code in 0001) is because it breaks > the operatorclass-vs-index abstraction layer. Operator classes don't > (shouldn't) know or care about which attribute number they have, nor > what the index does with the data. > Scanning the index is not something that I expect the operator class > to do, I expect that the index code organizes the scan, and forwards > the data to the relevant operator classes. > Scanning the index N times for N attributes can be excused if there > are good reasons, but I'd rather have that decision made in the > index's core code rather than at the design level. > I think you're right. It'd be more appropriate to have most of the core scanning logic in brin.c, and then delegate only some minor decisions to the opclass. Like, comparisons, extraction of min/max from ranges etc. However, it's not quite clear to me is what you mean by the weight- and compare-operators? That is, what are - brin_minmax_minorder(PG_FUNCTION_ARGS=brintuple) -> range.min - brin_minmax_maxorder(PG_FUNCTION_ARGS=brintuple) -> range.max - brin_minmax_compare(order, order) -> int supposed to do? Or what does "PG_FUNCTION_ARGS=brintuple" mean? In principle we just need a procedure that tells us min/max for a given page range - I guess that's what the minorder/maxorder functions do? But why would we need the compare one? We're comparing by the known data type, so we can just delegate the comparison to that, no? Also, the existence of these opclass procedures should be enough to identify the opclasses which can support this. >> +/* >> + * XXX Does it make sense (is it possible) to have a sort by more than one >> + * column, using a BRIN index? >> + */ > > Yes, even if only one prefix column is included in the BRIN index > (e.g. `company` in `ORDER BY company, date`, the tuplesort with table > tuples can add additional sorting without first reading the whole > table, potentially (likely) reducing the total resource usage of the > query. That utilizes the same idea as incremental sorts, but with the > caveat that the input sort order is approximately likely but not at > all guaranteed. So, even if the range sort is on a single index > column, we can still do the table's tuplesort on all ORDER BY > attributes, as long as a prefix of ORDER BY columns are included in > the BRIN index. > That's now quite what I meant, though. When I mentioned sorting by more than one column, I meant using a multi-column BRIN index on those columns. Something like this: CREATE TABLE t (a int, b int); INSERT INTO t ... CREATE INDEX ON t USING brin (a,b); SELECT * FROM t ORDER BY a, b; Now, what I think you described is using BRIN to sort by "a", and then do incremental sort for "b". What I had in mind is whether it's possible to use BRIN to sort by "b" too. I was suspecting it might be made to work, but now that I think about it again it probably can't - BRIN pretty much sorts the columns separately, it's not like having an ordering by (a,b) - first by "a", then "b". >> + /* >> + * XXX We can be a bit smarter for LIMIT queries - once we >> + * know we have more rows in the tuplesort than we need to >> + * output, we can stop spilling - those rows are not going >> + * to be needed. We can discard the tuplesort (no need to >> + * respill) and stop spilling. >> + */ > > Shouldn't that be "discard the tuplestore"? > Yeah, definitely. >> +#define BRIN_PROCNUM_RANGES 12 /* optional */ > > It would be useful to add documentation for this in this patch. > Right, this should be documented in doc/src/sgml/brin.sgml. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: