Re: PATCH: Using BRIN indexes for sorted output - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: PATCH: Using BRIN indexes for sorted output
Date
Msg-id 20221024043237.GA16921@telsasoft.com
Whole thread Raw
In response to PATCH: Using BRIN indexes for sorted output  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PATCH: Using BRIN indexes for sorted output  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Sat, Oct 15, 2022 at 02:33:50PM +0200, Tomas Vondra wrote:
> Of course, if there are e.g. BTREE indexes this is going to be slower,
> but people are unlikely to have both index types on the same column.

On Sun, Oct 16, 2022 at 05:48:31PM +0200, Tomas Vondra wrote:
> I don't think it's all that unfair. How likely is it to have both a BRIN
> and btree index on the same column? And even if you do have such indexes

Note that we (at my work) use unique, btree indexes on multiple columns
for INSERT ON CONFLICT into the most-recent tables: UNIQUE(a,b,c,...),
plus a separate set of indexes on all tables, used for searching:
BRIN(a) and BTREE(b).  I'd hope that the costing is accurate enough to
prefer the btree index for searching the most-recent table, if that's
what's faster (for example, if columns b and c are specified).

> +    /* There must not be any TID scan in progress yet. */
> +    Assert(node->ss.ss_currentScanDesc == NULL);
> +
> +    /* Initialize the TID range scan, for the provided block range. */
> +    if (node->ss.ss_currentScanDesc == NULL)
> +    {

Why is this conditional on the condition that was just Assert()ed ?

>  
> +void
> +cost_brinsort(BrinSortPath *path, PlannerInfo *root, double loop_count,
> +           bool partial_path)

It's be nice to refactor existing code to avoid this part being so
duplicitive.

> +     * In some situations (particularly with OR'd index conditions) we may
> +     * have scan_clauses that are not equal to, but are logically implied by,
> +     * the index quals; so we also try a predicate_implied_by() check to see

Isn't that somewhat expensive ?

If that's known, then it'd be good to say that in the documentation.

> +    {
> +        {"enable_brinsort", PGC_USERSET, QUERY_TUNING_METHOD,
> +            gettext_noop("Enables the planner's use of BRIN sort plans."),
> +            NULL,
> +            GUC_EXPLAIN
> +        },
> +        &enable_brinsort,
> +        false,

I think new GUCs should be enabled during patch development.
Maybe in a separate 0002 patch "for CI only not for commit".
That way "make check" at least has a chance to hit that new code paths.

Also, note that indxpath.c had the var initialized to true.

> +            attno = (i + 1);
> +       nranges = (nblocks / pagesPerRange);
> +                               node->bs_phase = (nullsFirst) ? BRINSORT_LOAD_NULLS : BRINSORT_LOAD_RANGE;

I'm curious why you have parenthesis these places ?

> +#ifndef NODEBrinSort_H
> +#define NODEBrinSort_H

NODEBRIN_SORT would be more consistent with NODEINCREMENTALSORT.
But I'd prefer NODE_* - otherwise it looks like NO DEBRIN.

This needed a bunch of work needed to pass any of the regression tests -
even with the feature set to off.

 . meson.build needs the same change as the corresponding ./Makefile.
 . guc missing from postgresql.conf.sample
 . brin_validate.c is missing support for the opr function.
   I gather you're planning on changing this part (?) but this allows to
   pass tests for now.
 . mingw is warning about OidIsValid(pointer) in nodeBrinSort.c.
   https://cirrus-ci.com/task/5771227447951360?logs=mingw_cross_warning#L969
 . Uninitialized catalog attribute.
 . Some typos in your other patches: "heuristics heuristics".  ste.
   lest (least).

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Question about savepoint level?
Next
From: Jeff Davis
Date:
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation