Re: PATCH: Using BRIN indexes for sorted output - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: PATCH: Using BRIN indexes for sorted output
Date
Msg-id ab61b00f-17f4-7b0e-a527-7182cb79d1ab@enterprisedb.com
Whole thread Raw
In response to Re: PATCH: Using BRIN indexes for sorted output  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers

On 10/24/22 06:32, Justin Pryzby wrote:
> On Sat, Oct 15, 2022 at 02:33:50PM +0200, Tomas Vondra wrote:
>> Of course, if there are e.g. BTREE indexes this is going to be slower,
>> but people are unlikely to have both index types on the same column.
> 
> On Sun, Oct 16, 2022 at 05:48:31PM +0200, Tomas Vondra wrote:
>> I don't think it's all that unfair. How likely is it to have both a BRIN
>> and btree index on the same column? And even if you do have such indexes
> 
> Note that we (at my work) use unique, btree indexes on multiple columns
> for INSERT ON CONFLICT into the most-recent tables: UNIQUE(a,b,c,...),
> plus a separate set of indexes on all tables, used for searching:
> BRIN(a) and BTREE(b).  I'd hope that the costing is accurate enough to
> prefer the btree index for searching the most-recent table, if that's
> what's faster (for example, if columns b and c are specified).
> 

Well, the costing is very crude at the moment - at the moment it's
pretty much just a copy of the existing BRIN costing. And the cost is
likely going to increase, because brinsort needs to do regular BRIN
bitmap scan (more or less) and then also a sort (which is an extra cost,
of course). So if it works now, I don't see why would brinsort break it.
Moreover, if you don't have ORDER BY in the query, I don't see why would
we create a brinsort at all.

But if you could test this once the costing gets improved, that'd be
very valuable.

>> +    /* There must not be any TID scan in progress yet. */
>> +    Assert(node->ss.ss_currentScanDesc == NULL);
>> +
>> +    /* Initialize the TID range scan, for the provided block range. */
>> +    if (node->ss.ss_currentScanDesc == NULL)
>> +    {
> 
> Why is this conditional on the condition that was just Assert()ed ?
> 

Yeah, that's a mistake, due to how the code evolved.

>>  
>> +void
>> +cost_brinsort(BrinSortPath *path, PlannerInfo *root, double loop_count,
>> +           bool partial_path)
> 
> It's be nice to refactor existing code to avoid this part being so
> duplicitive.
> 
>> +     * In some situations (particularly with OR'd index conditions) we may
>> +     * have scan_clauses that are not equal to, but are logically implied by,
>> +     * the index quals; so we also try a predicate_implied_by() check to see
> 
> Isn't that somewhat expensive ?
> 
> If that's known, then it'd be good to say that in the documentation.
> 

Some of this is probably a residue from create_indexscan_path and may
not be needed for this new node.

>> +    {
>> +        {"enable_brinsort", PGC_USERSET, QUERY_TUNING_METHOD,
>> +            gettext_noop("Enables the planner's use of BRIN sort plans."),
>> +            NULL,
>> +            GUC_EXPLAIN
>> +        },
>> +        &enable_brinsort,
>> +        false,
> 
> I think new GUCs should be enabled during patch development.
> Maybe in a separate 0002 patch "for CI only not for commit".
> That way "make check" at least has a chance to hit that new code paths.
> 
> Also, note that indxpath.c had the var initialized to true.
> 

Good point.

>> +            attno = (i + 1);
>> +       nranges = (nblocks / pagesPerRange);
>> +                               node->bs_phase = (nullsFirst) ? BRINSORT_LOAD_NULLS : BRINSORT_LOAD_RANGE;
> 
> I'm curious why you have parenthesis these places ?
> 

Not sure, it seemed more readable when writing the code I guess.

>> +#ifndef NODEBrinSort_H
>> +#define NODEBrinSort_H
> 
> NODEBRIN_SORT would be more consistent with NODEINCREMENTALSORT.
> But I'd prefer NODE_* - otherwise it looks like NO DEBRIN.
> 

Yeah, stupid search/replace on the indescan code, which was used as a
starting point.

> This needed a bunch of work needed to pass any of the regression tests -
> even with the feature set to off.
> 
>  . meson.build needs the same change as the corresponding ./Makefile.
>  . guc missing from postgresql.conf.sample
>  . brin_validate.c is missing support for the opr function.
>    I gather you're planning on changing this part (?) but this allows to
>    pass tests for now.
>  . mingw is warning about OidIsValid(pointer) in nodeBrinSort.c.
>    https://cirrus-ci.com/task/5771227447951360?logs=mingw_cross_warning#L969
>  . Uninitialized catalog attribute.
>  . Some typos in your other patches: "heuristics heuristics".  ste.
>    lest (least).
> 

Thanks, I'll get this fixed. I've posted the patch as a PoC to showcase
it and gather some feedback, I should have mentioned it's incomplete in
these ways.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Next
From: Alvaro Herrera
Date:
Subject: Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering