Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Msg-id 20200407010137.2bk2ljqk7gmlcfsi@development
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Mon, Apr 06, 2020 at 07:27:19PM -0400, James Coleman wrote:
>On Mon, Apr 6, 2020 at 7:09 PM James Coleman <jtc331@gmail.com> wrote:
>>
>> On Mon, Apr 6, 2020 at 6:13 PM Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>> >
>> > On Mon, Apr 06, 2020 at 05:47:48PM -0400, James Coleman wrote:
>> > >On Mon, Apr 6, 2020 at 5:40 PM Tomas Vondra
>> > ><tomas.vondra@2ndquadrant.com> wrote:
>> > >>
>> > >> On Mon, Apr 06, 2020 at 11:12:32PM +0200, Tomas Vondra wrote:
>> > >> >On Mon, Apr 06, 2020 at 04:54:38PM -0400, Alvaro Herrera wrote:
>> > >> >>On 2020-Apr-06, Tom Lane wrote:
>> > >> >>
>> > >> >>>Locally, things pass without force_parallel_mode, but turning it on
>> > >> >>>produces failures that look similar to rhinoceros's (didn't examine
>> > >> >>>other BF members).
>> > >> >>
>> > >> >>FWIW I looked at the eight failures there were about fifteen minutes ago
>> > >> >>and they were all identical.  I can confirm that, in my laptop, the
>> > >> >>tests work without that GUC, and fail in exactly that way with it.
>> > >> >>
>> > >> >
>> > >> >Yes, there's a thinko in show_incremental_sort_info() and it returns too
>> > >> >soon. I'll push a fix in a minute.
>> > >> >
>> > >>
>> > >> OK, I've pushed a fix - this should make the buildfarm happy again.
>> > >>
>> > >> It however seems to me a bit more needs to be done. The fix makes
>> > >> show_incremental_sort_info closer to show_sort_info, but not entirely
>> > >> because IncrementalSortState does not have sort_Done flag so it still
>> > >> depends on (fullsortGroupInfo->groupCount > 0). I haven't noticed that
>> > >> before, but not having that flag seems a bit weird to me.
>> > >>
>> > >> It also seems possibly incorrect - we may end up with
>> > >>
>> > >>    fullsortGroupInfo->groupCount == 0
>> > >>    prefixsortGroupInfo->groupCount > 0
>> > >>
>> > >> but we won't print anything.
>> > >
>> > >This shouldn't ever be possible, because the only way we get any
>> > >prefix groups at all is if we've already sorted a full sort group
>> > >during the mode transition.
>> > >
>> > >> James, any opinion on this? I'd say we should restore the sort_Done flag
>> > >> and make it work as in plain Sort. Or some comment explaining why
>> > >> depending on the counts is OK (assuming it is).
>> > >
>> > >There's previous email traffic on this thread about that (I can look
>> > >it up later this evening), but the short of it is that I believe that
>> > >relying on the group count is actually more correct than a sort_Done
>> > >flag in the case of incremental sort (in contrast to regular sort).
>> > >
>> >
>> > OK. Maybe we should add a comment to explain.c saying it's OK.
>> >
>> > I've pushed a fix for failures due to different planned workers (in the
>> > test I added to show changes due to add_partial_path tweaks).
>> >
>> > It seems we're not out of the woods yet, though. rhinoceros and
>> > sidewinder failed with something like this:
>> >
>> >                  Sort Method: quicksort  Memory: NNkB
>> > +               Sort Method: unknown  Disk: NNkB
>> >
>> > Would you mind investigating it?
>>
>> I assume that means those build farm members run with very low
>> work_mem? Is it an acceptable fix to adjust work_mem up a bit just for
>> these tests? Or is that bad practice and these are to expose issues
>> with changing into disk sort mode?
>
>On rhinoceros I see:
>
>================== pgsql.build/src/test/regress/regression.diffs
>===================
>diff -U3 /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out
>/opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out
>--- /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out
>2020-03-14 10:37:49.156761104 -0700
>+++ /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out
>2020-04-06 16:01:13.766798059 -0700
>@@ -1328,8 +1328,9 @@
>          ->  Sort (actual rows=3 loops=1)
>                Sort Key: sq_limit.c1, sq_limit.pk
>                Sort Method: top-N heapsort  Memory: xxx
>+               Sort Method: unknown  Disk: 0kB
>                ->  Seq Scan on sq_limit (actual rows=8 loops=1)
>-(6 rows)
>+(7 rows)
>
>Same on sidewinder.
>
>Given the 0kB I'm not sure this is *just* a work_mem thing, though
>that's still something I'm curious to know about, and it's still part
>of the "problem" here.
>
>I'm investigating further.
>

I don't see how this could be caused by a low work_mem value, really.
What I think is happening is that when executed with

     force_parallel_mode = regress

we run this as if in a parallel worker, but then it somehow gets
confused and prints either (leader + worker) stats or two sets of worker
stats, or something like that. I don't know why it's happening, though,
or why it would be triggered by the incremental sort patch ...

Actually, I just managed to trigger exactly this - the trick is that we
plan for a certain number of workers, but then fail to start some of
them. For example, like this:

     create table t (a int, b int, c int);

     insert into t select mod(i,10), mod(i,10), i
       from generate_series(1,1000000) s(i);

     set force_parallel_mode = regress;

     set max_parallel_workers = 0;

     explain (costs off, analyze) select * from t order by a,b;

                           QUERY PLAN
---------------------------------------------------------------
  Sort (actual time=0.010..0.010 rows=0 loops=1)
    Sort Key: a, b
    Sort Method: quicksort  Memory: 25kB
    Sort Method: unknown  Disk: 0kB
    ->  Seq Scan on t (actual time=0.003..0.004 rows=0 loops=1)
  Planning Time: 0.040 ms
  Execution Time: 0.229 ms
(7 rows)

So we actually do have to print two lines, because without any workers
the leader ends up doing the work. But we can't tell this is happening,
because the number of workers launched is reported in the Gather node,
and force_parallel_mode=regress hides that node.
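
To make the hypothesis concrete, here is a toy model in C. It is not the
explain.c code: ToySortStats, toy_show_sort and the skip_unused flag are
names made up for illustration, and the assumption that a never-filled
slot renders as "unknown" / "Disk" / 0kB is taken only from the output
above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-process sort stats; NOT the PostgreSQL struct. */
typedef struct ToySortStats
{
    const char *method;     /* NULL: slot never filled in (worker not started) */
    const char *space;      /* "Memory" or "Disk"; NULL when never filled in */
    long        kb;
} ToySortStats;

/*
 * Write one "Sort Method" line per slot into out and return the number of
 * lines written. With skip_unused = 0 a never-filled slot is still printed,
 * using the assumed zero-value rendering (unknown / Disk / 0kB).
 */
static int
toy_show_sort(const ToySortStats *slots, int nslots, int skip_unused,
              char *out, size_t outlen)
{
    size_t      used = 0;
    int         lines = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (int i = 0; i < nslots; i++)
    {
        int         n;

        if (skip_unused && slots[i].method == NULL)
            continue;

        n = snprintf(out + used, outlen - used,
                     "Sort Method: %s  %s: %ldkB\n",
                     slots[i].method ? slots[i].method : "unknown",
                     slots[i].space ? slots[i].space : "Disk",
                     slots[i].kb);
        if (n < 0 || (size_t) n >= outlen - used)
            break;              /* out of buffer space, stop early */

        used += (size_t) n;
        lines++;
    }
    return lines;
}
```

Feeding it one filled slot (quicksort, Memory, 25) and one empty slot
gives two lines with skip_unused = 0 and a single line with
skip_unused = 1, which is the difference seen in the regression diffs.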

The question is why these failures are correlated with incremental sort.
I don't think we've tweaked Sort at all, have we?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services