Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe_FM2jxQVooEY65hYd8p_OFpsVptSEBx0ia+iFgpVCP+Q@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Thu, Mar 12, 2020 at 7:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> Thanks for working on this.  I have some minor comments.
>
> In 0005:
>
> +                               /* Restore the input path (we might have addes Sort on top). */
>
> => added?  There's at least two more of the same typo.

Fixed.

> +                               /* also ignore already sorted paths */
>
> => You say that in a couple places, but I don't think "also" makes sense since
> there's nothing preceding it ?

Updated.

> In 0004:
>
> +                        * end up resorting the entire data set.  So, unless we can push
>
> => re-sorting

Fixed in this patch; that also shows up in
contrib/postgres_fdw/postgres_fdw.c, but I'll leave that alone.

> + * Unlike generate_gather_paths, this does not look just as pathkeys of the
>
> => look just AT ?

Fixed.

> +                       /* now we know is_sorted == false */
>
> => I would just spell that "Assert", as I think you already do elsewhere.
>
> +                               /* continue */
>
> => Please consider saying "fall through", since "continue" means exactly the
> opposite.

Updated.

> +generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows)
> ...
> +                       /* finally, consider incremental sort */
> ...
> +                               /* Also consider incremental sort. */
>
> => I think it's more confusing than useful with two comments - one is adequate.

Also fixed.

> In 0002:
>
> + * If it's EXPLAIN ANALYZE, show tuplesort stats for a incremental sort node
> ...
> + * make_incrementalsort --- basic routine to build a IncrementalSort plan node
>
> => AN incremental

Fixed.

> + * Initial size of memtuples array.  We're trying to select this size so that
> + * array don't exceed ALLOCSET_SEPARATE_THRESHOLD and overhead of allocation
> + * be possible less.  However, we don't cosider array sizes less than 1024
>
> Four typos (?)
> that array DOESN'T
> and THE overhead
> CONSIDER
> I'm not sure, but "be possible less" should maybe say "possibly be less" ?

Fixed.

> +       bool            maxSpaceOnDisk; /* true when maxSpace is value for on-disk
>
> I suggest to call it IsMaxSpaceDisk

Changed, though with lowercase 'I' (let me know if using uppercase is
standard here).

> +       MemoryContext maincontext;      /* memory context for tuple sort metadata
> +                                          that persist across multiple batches */
>
> persists

Fixed.

> + *     a new sort.  It allows evade recreation of tuple sort (and save resources)
> + *     when sorting multiple small batches.
>
> allows to avoid?  Or allows avoiding?

Fixed.

> + *      When performing sorting by multiple keys input dataset could be already
> + *      presorted by some prefix of these keys.  We call them "presorted keys".
>
> "already presorted" sounds redundant

Reworded.

> +       int64           fullsort_group_count;   /* number of groups with equal presorted keys */
> +       int64           prefixsort_group_count; /* number of groups with equal presorted keys */
>
> I guess these should have different comments

The structure of that changed in my patch from a fews days ago, I
believe, so there aren't two fields anymore. Are you reviewing the
current patch?

Thanks,
James

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Next
From: Jeff Davis
Date:
Subject: Re: explain HashAggregate to report bucket and memory stats