Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | 20200314001254.gdusqeanl7wrxmr7@development Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
List | pgsql-hackers |
On Fri, Mar 13, 2020 at 04:31:16PM -0400, James Coleman wrote: >On Fri, Mar 13, 2020 at 2:23 PM James Coleman <jtc331@gmail.com> wrote: >> >> On Tue, Mar 10, 2020 at 10:44 PM Tomas Vondra >> <tomas.vondra@2ndquadrant.com> wrote: >> > 3) Most of the execution plans look reasonable, except that some of the >> > plans look like this: >> > >> > >> > QUERY PLAN >> > --------------------------------------------------------- >> > Limit >> > -> GroupAggregate >> > Group Key: t.a, t.b, t.c, t.d >> > -> Incremental Sort >> > Sort Key: t.a, t.b, t.c, t.d >> > Presorted Key: t.a, t.b, t.c >> > -> Incremental Sort >> > Sort Key: t.a, t.b, t.c >> > Presorted Key: t.a, t.b >> > -> Index Scan using t_a_b_idx on t >> > (10 rows) >> > >> > i.e. there are two incremental sorts on top of each other, with >> > different prefixes. But this this is not a new issue - it happens with >> > queries like this: >> > >> > SELECT a, b, c, d, count(*) FROM ( >> > SELECT * FROM t ORDER BY a, b, c >> > ) foo GROUP BY a, b, c, d limit 1000; >> > >> > i.e. there's a subquery with a subset of pathkeys. Without incremental >> > sort the plan looks like this: >> > >> > QUERY PLAN >> > --------------------------------------------- >> > Limit >> > -> GroupAggregate >> > Group Key: t.a, t.b, t.c, t.d >> > -> Sort >> > Sort Key: t.a, t.b, t.c, t.d >> > -> Sort >> > Sort Key: t.a, t.b, t.c >> > -> Seq Scan on t >> > (8 rows) >> > >> > so essentially the same plan shape. What bugs me though is that there >> > seems to be some sort of memory leak, so that this query consumes >> > gigabytes os RAM before it gets killed by OOM. But the memory seems not >> > to be allocated in any memory context (at least MemoryContextStats don't >> > show anything like that), so I'm not sure what's going on. >> > >> > Reproducing it is fairly simple: >> > >> > CREATE TABLE t (a bigint, b bigint, c bigint, d bigint); >> > INSERT INTO t SELECT >> > 1000*random(), 1000*random(), 1000*random(), 1000*random() >> > FROM generate_series(1,10000000) s(i); >> > CREATE INDEX idx ON t(a,b); >> > ANALYZE t; >> > >> > EXPLAIN ANALYZE SELECT a, b, c, d, count(*) >> > FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d >> > LIMIT 100; >> >> While trying to reproduce this, instead of lots of memory usage, I got >> the attached assertion failure instead. > >And, without the EXPLAIN ANALYZE was able to get this one, which will >probably be a lot more helpful. > Hmmm, I'll try reproducing it, but can you investigate the values in the Assert? I mean, it fails on this: Assert(total_allocated == context->mem_allocated); so can you get a core or attach to the process using gdb, and see what's the expected / total value? BTW, I might have copied the wrong query - can you try with a higher value in the LIMIT clause? For example: EXPLAIN ANALYZE SELECT a, b, c, d, count(*) FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d LIMIT 1000000; I think this might be the differenc ewhy you don't see the memory leak. Or maybe it was because of asserts? I'm not sure if I had enabled them in the build ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: