Thread: work_mem = 900MB but Sort Method: external merge Disk: 304008kB
What am I missing that causes this to resort to sorting on disk?

obc=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
(1 row)

Time: 43.920 ms

 Limit  (cost=1063641.92..1063643.47 rows=20 width=13) (actual time=422710.147..422711.328 rows=20 loops=1)
   ->  Unique  (cost=1063641.92..1064133.33 rows=6320 width=13) (actual time=422710.022..422711.127 rows=20 loops=1)
         ->  Sort  (cost=1063641.92..1063887.62 rows=98282 width=13) (actual time=422710.014..422710.696 rows=172 loops=1)
               Sort Key: cpn.value
               Sort Method:  external merge  Disk: 304008kB
               ->  Nested Loop  (cost=647.20..1061026.67 rows=98282 width=13) (actual time=61.029..71867.921 rows=9627373 loops=1)
                     ->  HashAggregate  (cost=647.20..648.15 rows=95 width=4) (actual time=60.950..64.350 rows=596 loops=1)
                           ->  Hash Join  (cost=4.59..646.96 rows=95 width=4) (actual time=0.352..57.210 rows=596 loops=1)
                                 Hash Cond: (cb.client_id = c.id)
                                 ->  Seq Scan on contact_block cb  (cost=0.00..596.31 rows=12031 width=8) (actual time=0.015..26.757 rows=10323 loops=1)
                                 ->  Hash  (cost=4.58..4.58 rows=1 width=4) (actual time=0.064..0.064 rows=1 loops=1)
                                       ->  Seq Scan on clients c  (cost=0.00..4.58 rows=1 width=4) (actual time=0.021..0.055 rows=1 loops=1)
                                             Filter: ((name)::text = 'Kmart Pharmacies, Inc.'::text)
                     ->  Index Scan using extra_import_param_blk_item_tag on extra_import_param cpn  (cost=0.00..11039.67 rows=9777 width=17) (actual time=0.057..61.769 rows=16153 loops=596)
                           Index Cond: ((cpn.block_id = cb.id) AND ((cpn.tag)::text = 'PATNAME'::text))
 Total runtime: 422920.026 ms
(16 rows)

Time: 422924.289 ms

obc=# show sort_mem;
 work_mem
----------
 900MB
(1 row)
Reid Thompson <Reid.Thompson@ateb.com> wrote:
> What am I missing that causes this to resort to sorting on disk?
>
> obc=# select version();
>  PostgreSQL 8.3.7 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
> (1 row)
>
>  Limit  (cost=1063641.92..1063643.47 rows=20 width=13) (actual time=422710.147..422711.328 rows=20 loops=1)
>    ->  Unique  (cost=1063641.92..1064133.33 rows=6320 width=13) (actual time=422710.022..422711.127 rows=20 loops=1)
>          ->  Sort  (cost=1063641.92..1063887.62 rows=98282 width=13) (actual time=422710.014..422710.696 rows=172 loops=1)
>                Sort Key: cpn.value
>                Sort Method:  external merge  Disk: 304008kB

Bad estimation: rows=98282, actual rows=172.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
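If the row-count misestimate itself is the thing to chase, the usual first steps on 8.3 are a fresh ANALYZE and, if that is not enough, a larger per-column statistics target. A sketch only, using table and column names from the posted plan; the target of 200 is an arbitrary example (8.3's default_statistics_target is 10):

```sql
-- Sketch: refresh planner statistics for the tables in the plan.
ANALYZE contact_block;
ANALYZE extra_import_param;

-- If estimates are still far off, widen the statistics on the filtered
-- column (200 is an arbitrary example target, up from 8.3's default of 10):
ALTER TABLE extra_import_param ALTER COLUMN tag SET STATISTICS 200;
ANALYZE extra_import_param;
```

Whether better statistics would change this particular plan is not established in the thread; it only addresses the estimation gap Andreas points out.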
Reid Thompson <Reid.Thompson@ateb.com> writes:
> What am I missing that causes this to resort to sorting on disk?

The in-memory space required to sort N tuples can be significantly
larger than the on-disk space, because the latter representation is
optimized to be small and the in-memory representation not so much.
I haven't seen a 3X differential before, but it's not outside the realm
of reason, especially for narrow rows like these where it's all about
the overhead.  I suspect if you crank work_mem up still more, you'll
see it switch over.  It flips to on-disk sort when the in-memory
representation exceeds the limit ...

			regards, tom lane
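Tom's suggestion can be tried per session, without editing postgresql.conf. A minimal sketch follows; the 2GB figure is an assumption, not a value from the thread, and may need to go higher given the 3X-plus overhead he describes:

```sql
-- Session-local experiment (sketch): raise work_mem and re-check the plan.
-- '2GB' is an assumed value; keep increasing until EXPLAIN ANALYZE reports
-- "Sort Method: quicksort  Memory: ...kB" instead of "external merge".
SET work_mem = '2GB';
-- ... re-run the original query under EXPLAIN ANALYZE here ...
RESET work_mem;  -- restore the configured 900MB default for this session
```

SET only affects the current session, so this is a safe way to probe the crossover point before committing a server-wide change.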
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The in-memory space required to sort N tuples can be significantly
> larger than the on-disk space, because the latter representation is
> optimized to be small and the in-memory representation not so much.
> I suspect if you crank work_mem up still more, you'll see it switch
> over.  It flips to on-disk sort when the in-memory representation
> exceeds the limit ...

Question: when does the planner make the decision between in-memory and
on-disk: at planning time, or at execution time, with knowledge of the
real number of tuples?

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On 02/22/2011 12:06 PM, Tom Lane wrote:
> The in-memory space required to sort N tuples can be significantly
> larger than the on-disk space, because the latter representation is
> optimized to be small and the in-memory representation not so much.
> I suspect if you crank work_mem up still more, you'll see it switch
> over.  It flips to on-disk sort when the in-memory representation
> exceeds the limit ...

Ahh, ok.  The underlying table behind cpn.value is 11 GB, so I can see
how even a slightly less optimized in-memory representation could push
the ~300 MB of sorted data on disk well past the 900 MB work_mem limit.

Thanks,
reid
Andreas Kretschmer <akretschmer@spamfence.net> writes:
> Question: when is the planner making the decision between in-memory and
> on-disk, at planning-time or at execution time with the knowledge about
> the real amount of tuples?

The planner doesn't make that decision.  tuplesort.c always starts in
in-memory mode, and flips to on-disk when the actual amount of data in
its care exceeds work_mem.  The planner guesses whether that will happen
while making cost estimates, but it's only an estimate.

			regards, tom lane
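The runtime flip Tom describes can also be observed directly with the trace_sort developer option. A sketch only: TRACE_SORT is compiled in by default in standard builds, but that is worth verifying for a given installation before relying on it:

```sql
-- Sketch: log tuplesort's runtime switch from in-memory to on-disk sorting.
SET trace_sort = on;
SET client_min_messages = log;  -- surface the LOG lines in the psql session
-- Re-run the query; a log line like "switching to external sort with N tapes"
-- marks the moment the in-memory data exceeded work_mem.
```

This confirms at execution time exactly what tuplesort.c decided, independent of what the planner estimated.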