Thread: [PERFORMANCE] work_mem vs temp files issue

[PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
Hi,

AFAIUI, work_mem is used for some operations (sort, hash, etc) for
avoiding the use of temp files on disk...

In a client server i'm monitoring (pg 8.3.7, 32GB of ram) work_mem is
set to 8MB, however i'm seeing a lot of temp files (>30000 in 4 hours)
with small sizes (ie: 2021520 obviously lower than 8MB). so, why?
maybe we use work_mem until we find isn't enough and we send just the
difference to a temp file?

i'm not thinking in raising work_mem until i understand this well,
what's the point if we still create temp files that could fit in
work_mem...

PS: i have max_connections to 1024, i know i need a pool but the app
is still opening persistent conecctions to the db, so is not like i
could raise work_mem just easy until the app gets fixed

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: [PERFORMANCE] work_mem vs temp files issue

From
decibel
Date:
On Aug 19, 2009, at 7:45 PM, Jaime Casanova wrote:
> AFAIUI, work_mem is used for some operations (sort, hash, etc) for
> avoiding the use of temp files on disk...
>
> In a client server i'm monitoring (pg 8.3.7, 32GB of ram) work_mem is
> set to 8MB, however i'm seeing a lot of temp files (>30000 in 4 hours)
> with small sizes (ie: 2021520 obviously lower than 8MB). so, why?
> maybe we use work_mem until we find isn't enough and we send just the
> difference to a temp file?
>
> i'm not thinking in raising work_mem until i understand this well,
> what's the point if we still create temp files that could fit in
> work_mem...


Are you using temp tables? Those end up in pgsql_tmp as well.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
decibel <decibel@decibel.org> writes:
> On Aug 19, 2009, at 7:45 PM, Jaime Casanova wrote:
>> AFAIUI, work_mem is used for some operations (sort, hash, etc) for
>> avoiding the use of temp files on disk...
>>
>> In a client server i'm monitoring (pg 8.3.7, 32GB of ram) work_mem is
>> set to 8MB, however i'm seeing a lot of temp files (>30000 in 4 hours)
>> with small sizes (ie: 2021520 obviously lower than 8MB). so, why?
>> maybe we use work_mem until we find isn't enough and we send just the
>> difference to a temp file?
>>
>> i'm not thinking in raising work_mem until i understand this well,
>> what's the point if we still create temp files that could fit in
>> work_mem...

> Are you using temp tables? Those end up in pgsql_tmp as well.

Uh, no, they don't.

It might be useful to turn on trace_sort to see if the small files
are coming from sorts.  If they're from hashes I'm afraid there's
no handy instrumentation ...

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> It might be useful to turn on trace_sort to see if the small files
> are coming from sorts.  If they're from hashes I'm afraid there's
> no handy instrumentation ...
>

yes they are, this is the log (i deleted the STATEMENT lines because
they were redundant), seems like all the temp files are used to
execute the same sentence...

BTW, this is my laptop no the server.

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: [PERFORMANCE] work_mem vs temp files issue

From
Robert Haas
Date:
On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> It might be useful to turn on trace_sort to see if the small files
>> are coming from sorts.  If they're from hashes I'm afraid there's
>> no handy instrumentation ...
>>
>
> yes they are, this is the log (i deleted the STATEMENT lines because
> they were redundant), seems like all the temp files are used to
> execute the same sentence...
>
> BTW, this is my laptop no the server.

I think maybe there was supposed to be an attachment here?

...Robert

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Mon, Jan 11, 2010 at 2:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova
> <jcasanov@systemguards.com.ec> wrote:
>> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>
>>> It might be useful to turn on trace_sort to see if the small files
>>> are coming from sorts.  If they're from hashes I'm afraid there's
>>> no handy instrumentation ...
>>>
>>
>> yes they are, this is the log (i deleted the STATEMENT lines because
>> they were redundant), seems like all the temp files are used to
>> execute the same sentence...
>>
>> BTW, this is my laptop no the server.
>
> I think maybe there was supposed to be an attachment here?
>

i knew i was forgotting something ;)
ah! and this is in 8.5dev but it's the same in 8.3

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Mon, Jan 11, 2010 at 2:14 PM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Mon, Jan 11, 2010 at 2:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Jan 11, 2010 at 1:15 PM, Jaime Casanova
>> <jcasanov@systemguards.com.ec> wrote:
>>> On Sun, Sep 13, 2009 at 5:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>>
>>>> It might be useful to turn on trace_sort to see if the small files
>>>> are coming from sorts.  If they're from hashes I'm afraid there's
>>>> no handy instrumentation ...
>>>>
>>>
>>> yes they are, this is the log (i deleted the STATEMENT lines because
>>> they were redundant), seems like all the temp files are used to
>>> execute the same sentence...
>>>
>>> BTW, this is my laptop no the server.
>>
>> I think maybe there was supposed to be an attachment here?
>>
>
> i knew i was forgotting something ;)
> ah! and this is in 8.5dev but it's the same in 8.3
>

oh! boy this can't be happen!
attaching again

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Attachment

Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
Jaime Casanova <jcasanov@systemguards.com.ec> writes:
> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
> LOG:  switching to bounded heapsort at 641 tuples: CPU 0.08s/0.13u sec elapsed 0.25 sec
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.5", size 471010
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.10", size 81096
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.0", size 467373
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.9", size 110200
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.3", size 470011
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.8", size 157192
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.4", size 468681
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.12", size 101624
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.1", size 472285
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.11", size 100744
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.6", size 467173
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.7", size 141888
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.2", size 476227
> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.13", size 89072
> LOG:  performsort starting: CPU 0.10s/0.19u sec elapsed 0.33 sec
> LOG:  performsort done: CPU 0.10s/0.19u sec elapsed 0.33 sec
> LOG:  internal sort ended, 118 KB used: CPU 0.10s/0.19u sec elapsed 0.33 sec

Hmm.  Not clear where the temp files are coming from, but it's *not* the
sort --- the "internal sort ended" line shows that that sort never went
to disk.  What kind of plan is feeding the sort node?

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Mon, Jan 11, 2010 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jaime Casanova <jcasanov@systemguards.com.ec> writes:
>> LOG:  begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
>> LOG:  switching to bounded heapsort at 641 tuples: CPU 0.08s/0.13u sec elapsed 0.25 sec
>> LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp8507.5", size 471010
[... some more temp files logged ...]
>> LOG:  internal sort ended, 118 KB used: CPU 0.10s/0.19u sec elapsed 0.33 sec
>
> Hmm.  Not clear where the temp files are coming from, but it's *not* the
> sort --- the "internal sort ended" line shows that that sort never went
> to disk.  What kind of plan is feeding the sort node?
>

i'm sure i have seen on disk sorts even when the files are small, but
still i see a problem here...

the temp files shoul be coming from hash operations but AFAICS the
files are small and every hash operation should be using until
work_mem memory, right?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Attachment

Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
Jaime Casanova <jcasanov@systemguards.com.ec> writes:
> the temp files shoul be coming from hash operations but AFAICS the
> files are small and every hash operation should be using until
> work_mem memory, right?

No, when a hash spills to disk the code has to guess the partition sizes
(number of buckets per partition) in advance.  So it wouldn't be at all
surprising if the actual sizes come out substantially different from
work_mem.

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Mon, Jan 11, 2010 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Hmm.  Not clear where the temp files are coming from, but it's *not* the
> sort --- the "internal sort ended" line shows that that sort never went
> to disk.  What kind of plan is feeding the sort node?
>

some time ago, you said:
"""
It might be useful to turn on trace_sort to see if the small files
are coming from sorts.  If they're from hashes I'm afraid there's
no handy instrumentation ...
"""

and is clearly what was bother me... because most of all temp files
are coming from hash...

why we don't show some of that info in explain? for example: we can
show memory used, no? or if the hash goes to disk... if i remove
#ifdef HJDEBUG seems like we even know how many batchs the hash
used...

the reason i say "most of the temp files" is that when i removed
#ifdef HJDEBUG it says that in total i was using 10 batchs but there
were 14 temp files created (i guess we use 1 file per batch, no?)

"""
nbatch = 1, nbuckets = 1024
nbatch = 1, nbuckets = 1024
nbatch = 8, nbuckets = 2048
"""

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
Jaime Casanova <jcasanov@systemguards.com.ec> writes:
> why we don't show some of that info in explain?

Lack of round tuits; plus concern about breaking programs that read
EXPLAIN output, which I guess will be alleviated in 8.5.

> the reason i say "most of the temp files" is that when i removed
> #ifdef HJDEBUG it says that in total i was using 10 batchs but there
> were 14 temp files created (i guess we use 1 file per batch, no?)

Two files per batch, in general --- I suppose some of the buckets
were empty.

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Robert Haas
Date:
On Wed, Jan 13, 2010 at 1:31 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Mon, Jan 11, 2010 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Hmm.  Not clear where the temp files are coming from, but it's *not* the
>> sort --- the "internal sort ended" line shows that that sort never went
>> to disk.  What kind of plan is feeding the sort node?
>>
>
> some time ago, you said:
> """
> It might be useful to turn on trace_sort to see if the small files
> are coming from sorts.  If they're from hashes I'm afraid there's
> no handy instrumentation ...
> """
>
> and is clearly what was bother me... because most of all temp files
> are coming from hash...
>
> why we don't show some of that info in explain? for example: we can
> show memory used, no? or if the hash goes to disk... if i remove
> #ifdef HJDEBUG seems like we even know how many batchs the hash
> used...

I had an idea at one point of making explain show the planned and
actual # of batches for each hash join.  I believe that "actual # of
batches > 1" is isomorphic to "hash join went to disk".  The code is
actually pretty easy; the hard part is figuring out what to do about
the UI.  The choices seem to be:

1. Create a new EXPLAIN option just for this - what would we call it?
2. Think of some more, similar things and come up with a new EXPLAIN
option covering all of them - what else would go along with?
3. Sandwhich it into an existing EXPLAIN option, most likely VERBOSE.
4. Display it by default.

...Robert

Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I had an idea at one point of making explain show the planned and
> actual # of batches for each hash join.  I believe that "actual # of
> batches > 1" is isomorphic to "hash join went to disk".  The code is
> actually pretty easy; the hard part is figuring out what to do about
> the UI.  The choices seem to be:

> 1. Create a new EXPLAIN option just for this - what would we call it?
> 2. Think of some more, similar things and come up with a new EXPLAIN
> option covering all of them - what else would go along with?
> 3. Sandwhich it into an existing EXPLAIN option, most likely VERBOSE.
> 4. Display it by default.

Treat it the same as the Sort-node actual usage information.  We did not
add a special option when we added that.

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Robert Haas
Date:
On Wed, Jan 13, 2010 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I had an idea at one point of making explain show the planned and
>> actual # of batches for each hash join.  I believe that "actual # of
>> batches > 1" is isomorphic to "hash join went to disk".  The code is
>> actually pretty easy; the hard part is figuring out what to do about
>> the UI.  The choices seem to be:
>
>> 1. Create a new EXPLAIN option just for this - what would we call it?
>> 2. Think of some more, similar things and come up with a new EXPLAIN
>> option covering all of them - what else would go along with?
>> 3. Sandwhich it into an existing EXPLAIN option, most likely VERBOSE.
>> 4. Display it by default.
>
> Treat it the same as the Sort-node actual usage information.  We did not
> add a special option when we added that.

Well, what about when we're just doing EXPLAIN, not EXPLAIN ANALYZE?
It'll add another line to the output for the expected number of
batches.

...Robert

Re: [PERFORMANCE] work_mem vs temp files issue

From
Jaime Casanova
Date:
On Wed, Jan 13, 2010 at 11:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Well, what about when we're just doing EXPLAIN, not EXPLAIN ANALYZE?
> It'll add another line to the output for the expected number of
> batches.
>

and when we are in EXPLAIN ANALYZE the real number as well?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Re: [PERFORMANCE] work_mem vs temp files issue

From
Robert Haas
Date:
On Wed, Jan 13, 2010 at 11:14 AM, Jaime Casanova
<jcasanov@systemguards.com.ec> wrote:
> On Wed, Jan 13, 2010 at 11:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Well, what about when we're just doing EXPLAIN, not EXPLAIN ANALYZE?
>> It'll add another line to the output for the expected number of
>> batches.
>
> and when we are in EXPLAIN ANALYZE the real number as well?

Yeah.  My question is whether it's acceptable to add an extra line to
the EXPLAIN output for every hash join, even w/o ANALYZE.

...Robert

Re: [PERFORMANCE] work_mem vs temp files issue

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Yeah.  My question is whether it's acceptable to add an extra line to
> the EXPLAIN output for every hash join, even w/o ANALYZE.

We could add it if either VERBOSE or ANALYZE appears.  Not sure if
that's just too much concern for backwards compatibility, though.

            regards, tom lane

Re: [PERFORMANCE] work_mem vs temp files issue

From
Robert Haas
Date:
On Wed, Jan 13, 2010 at 11:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah.  My question is whether it's acceptable to add an extra line to
>> the EXPLAIN output for every hash join, even w/o ANALYZE.
>
> We could add it if either VERBOSE or ANALYZE appears.  Not sure if
> that's just too much concern for backwards compatibility, though.

I think having it controlled by either of two options is to weird.
I'm not worried so much about backward compatibility as I am about
cluttering the output.  Maybe making it controlled by VERBOSE is the
right thing to do, although I'm sort of tempted to figure out if there
is more useful instrumentation that could be done and put it all under
a new option called, say, HASH_DETAILS.  Not sure what else we could
show though.

...Robert