Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe_cVOS3akUic2PSaEW0noB=9bBxMkkDuj0HBZJqHHh5yQ@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Friday, March 13, 2020, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Fri, Mar 13, 2020 at 04:31:16PM -0400, James Coleman wrote:
On Fri, Mar 13, 2020 at 2:23 PM James Coleman <jtc331@gmail.com> wrote:

On Tue, Mar 10, 2020 at 10:44 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> 3) Most of the execution plans look reasonable, except that some of the
> plans look like this:
>
>
>                           QUERY PLAN
>    ---------------------------------------------------------
>     Limit
>       ->  GroupAggregate
>             Group Key: t.a, t.b, t.c, t.d
>             ->  Incremental Sort
>                   Sort Key: t.a, t.b, t.c, t.d
>                   Presorted Key: t.a, t.b, t.c
>                   ->  Incremental Sort
>                         Sort Key: t.a, t.b, t.c
>                         Presorted Key: t.a, t.b
>                         ->  Index Scan using t_a_b_idx on t
>    (10 rows)
>
> i.e. there are two incremental sorts on top of each other, with
> different prefixes. But this this is not a new issue - it happens with
> queries like this:
>
>    SELECT a, b, c, d, count(*) FROM (
>      SELECT * FROM t ORDER BY a, b, c
>    ) foo GROUP BY a, b, c, d limit 1000;
>
> i.e. there's a subquery with a subset of pathkeys. Without incremental
> sort the plan looks like this:
>
>                     QUERY PLAN
>    ---------------------------------------------
>     Limit
>       ->  GroupAggregate
>             Group Key: t.a, t.b, t.c, t.d
>             ->  Sort
>                   Sort Key: t.a, t.b, t.c, t.d
>                   ->  Sort
>                         Sort Key: t.a, t.b, t.c
>                         ->  Seq Scan on t
>    (8 rows)
>
> so essentially the same plan shape. What bugs me though is that there
> seems to be some sort of memory leak, so that this query consumes
> gigabytes os RAM before it gets killed by OOM. But the memory seems not
> to be allocated in any memory context (at least MemoryContextStats don't
> show anything like that), so I'm not sure what's going on.
>
> Reproducing it is fairly simple:
>
>    CREATE TABLE t (a bigint, b bigint, c bigint, d bigint);
>    INSERT INTO t SELECT
>      1000*random(), 1000*random(), 1000*random(), 1000*random()
>    FROM generate_series(1,10000000) s(i);
>    CREATE INDEX idx ON t(a,b);
>    ANALYZE t;
>
>    EXPLAIN ANALYZE SELECT a, b, c, d, count(*)
>    FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d
>    LIMIT 100;

While trying to reproduce this, instead of lots of memory usage, I got
the attached assertion failure instead.

And, without the EXPLAIN ANALYZE was able to get this one, which will
probably be a lot more helpful.


Hmmm, I'll try reproducing it, but can you investigate the values in the
Assert? I mean, it fails on this:

  Assert(total_allocated == context->mem_allocated);

so can you get a core or attach to the process using gdb, and see what's
the expected / total value?

BTW, I might have copied the wrong query - can you try with a higher
value in the LIMIT clause? For example:

    EXPLAIN ANALYZE SELECT a, b, c, d, count(*)
    FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d
    LIMIT 1000000;

I think this might be the differenc ewhy you don't see the memory leak.
Or maybe it was because of asserts? I'm not sure if I had enabled them
in the build ...

I’m not at my laptop right now, but I’ve started looking at it, but I haven’t figured it out yet. Going from memory, it had allocated 16384 but expected 8192 (I think I have the order of that right). 

It’s very consistently reproducible, thankfully, but doesn’t always happen on the first query; IIRC always the 2nd with LIMIT 100, and I could get it to happen with first at 96 and second at 97, but repeating 96 many times didn’t seem to trigger it. 

I’m hoping it’s the same root cause as the memory leak, but unsure. 

I’ll try a higher number when I get a chance. 

James

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Next
From: Pengzhou Tang
Date:
Subject: Re: Additional size of hash table is alway zero for hash aggregates