Thread: WIP - Add ability to constrain backend temporary file space

WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
Recently two systems here have suffered severely with excessive
temporary file creation during query execution. In one case it could
have been avoided by more stringent QA before application code release,
whereas the other is an ad-hoc system, and err...yes.

In both cases it would have been great to be able to constrain the
amount of temporary file space a query could use. In theory you can sort
of do this with the various ulimits, but it seems pretty impractical as
at that level all files look the same and you'd be just as likely to
unexpectedly cripple the entire db a few weeks later when a table grows...

I got to wondering how hard this would be to do in Postgres, and attached
is my (WIP) attempt. It provides a guc (max_temp_files_size) to limit
the size of all temp files for a backend and amends fd.c to cancel
execution if the total size of temporary files exceeds this.

This is WIP, it does seem to work ok, but some areas/choices I'm not
entirely clear about are mentioned in the patch itself. Mainly:

- name of the guc... better suggestions welcome
- datatype for the guc - real would be good, but at the moment the nice
parse KB/MB/GB business only works for int

regards

Mark

Attachment

Re: WIP - Add ability to constrain backend temporary file space

From
Robert Haas
Date:
On Thu, Feb 17, 2011 at 10:17 PM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> This is WIP, it does seem to work ok, but some areas/choices I'm not
> entirely clear about are mentioned in the patch itself. Mainly:
>
> - name of the guc... better suggestions welcome
> - datatype for the guc - real would be good, but at the moment the nice
> parse KB/MB/GB business only works for int

Please add this to the next CommitFest:

https://commitfest.postgresql.org/action/commitfest_view/open

With respect to the datatype of the GUC, int seems clearly correct.
Why would you want to use a float?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP - Add ability to constrain backend temporary file space

From
Josh Berkus
Date:
Mark,

> I got to wondering how hard this would be to do in Postgres, and attached
> is my (WIP) attempt. It provides a guc (max_temp_files_size) to limit
> the size of all temp files for a backend and amends fd.c to cancel
> execution if the total size of temporary files exceeds this.

First, are we just talking about pgsql_tmp here, or the pg_temp
tablespace?  That is, just sort/hash files, or temporary tables as well?

Second, the main issue with these sorts of macro-counters has generally
been their locking effect on concurrent activity.  Have you been able to
run any tests which try to run lots of small externally-sorted queries
at once on a multi-core machine, and checked the effect on throughput?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: WIP - Add ability to constrain backend temporary file space

From
Robert Haas
Date:
On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Second, the main issue with these sorts of macro-counters has generally
> been their locking effect on concurrent activity.  Have you been able to
> run any tests which try to run lots of small externally-sorted queries
> at once on a multi-core machine, and checked the effect on throughput?

Since it's apparently a per-backend limit, that doesn't seem relevant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP - Add ability to constrain backend temporary file space

From
Josh Berkus
Date:
On 2/18/11 11:44 AM, Robert Haas wrote:
> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> Second, the main issue with these sorts of macro-counters has generally
>> been their locking effect on concurrent activity.  Have you been able to
>> run any tests which try to run lots of small externally-sorted queries
>> at once on a multi-core machine, and checked the effect on throughput?
> 
> Since it's apparently a per-backend limit, that doesn't seem relevant.

Oh!  I missed that.

What good would a per-backend limit do, though?

And what happens with queries which exceed the limit?  Error message?  Wait?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: WIP - Add ability to constrain backend temporary file space

From
Robert Haas
Date:
On Fri, Feb 18, 2011 at 2:48 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 2/18/11 11:44 AM, Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> Second, the main issue with these sorts of macro-counters has generally
>>> been their locking effect on concurrent activity.  Have you been able to
>>> run any tests which try to run lots of small externally-sorted queries
>>> at once on a multi-core machine, and checked the effect on throughput?
>>
>> Since it's apparently a per-backend limit, that doesn't seem relevant.
>
> Oh!  I missed that.
>
> What good would a per-backend limit do, though?
>
> And what happens with queries which exceed the limit?  Error message?  Wait?

Well I have not RTFP, but I assume it'd throw an error.  Waiting isn't
going to accomplish anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
On 19/02/11 02:34, Robert Haas wrote:
>
> Please add this to the next CommitFest:
>
> https://commitfest.postgresql.org/action/commitfest_view/open
>
> With respect to the datatype of the GUC, int seems clearly correct.
> Why would you want to use a float?
>

Added. With respect to the datatype, using int with KB units means the 
largest temp size is approx 2047GB - I know that seems like a lot now... 
but maybe someone out there wants (say) their temp files limited to 
4096GB :-)
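
The 2047GB figure can be checked with a small sketch. This assumes the
guc is stored as a signed 32-bit integer counting kB, as described above;
the function names are made up for illustration and are not from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Largest limit, in whole GB, that a signed 32-bit kB counter can hold. */
int64_t max_limit_gb(void)
{
    int64_t max_kb = INT32_MAX;        /* 2147483647 kB */
    return max_kb / (1024 * 1024);     /* kB -> GB, truncating */
}

/* Returns 1 if a limit of `gb` gigabytes fits in a signed 32-bit kB counter. */
int limit_fits(int64_t gb)
{
    assert(gb >= 0);
    return gb * 1024 * 1024 <= INT32_MAX;
}
```

With these definitions max_limit_gb() gives 2047, and limit_fits(4096) is
false, which is the case a wider datatype would be needed for.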

Cheers

Mark


Re: WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
On 19/02/11 08:48, Josh Berkus wrote:
> On 2/18/11 11:44 AM, Robert Haas wrote:
>> On Fri, Feb 18, 2011 at 2:41 PM, Josh Berkus<josh@agliodbs.com>  wrote:
>>> Second, the main issue with these sorts of macro-counters has generally
>>> been their locking effect on concurrent activity.  Have you been able to
>>> run any tests which try to run lots of small externally-sorted queries
>>> at once on a multi-core machine, and checked the effect on throughput?
>> Since it's apparently a per-backend limit, that doesn't seem relevant.
> Oh!  I missed that.
>
> What good would a per-backend limit do, though?
>
> And what happens with queries which exceed the limit?  Error message?  Wait?
>
>

By "temp files" I mean those in pgsql_tmp. LOL - A backend limit will 
have the same sort of usefulness as work_mem does - i.e. stop a query 
eating all your filesystem space or bringing a server to its knees with 
I/O load. We have had this happen twice - I know of other folks who have too.

Obviously you need to do the same sort of arithmetic as you do with 
work_mem to decide on a reasonable limit to cope with multiple users 
creating temp files. Conservative dbas might want to set it to (free 
disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more 
challenging - but having a per-backend limit is way better than having 
what we have now, which is ... errr... nothing.
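
As a rough sketch of the conservative (free disk)/max_connections
arithmetic: the helper below is hypothetical (not part of the patch) and
assumes free space is measured in kB, the same unit as the guc.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Conservative per-backend temp file limit in kB: split the free disk
 * space evenly across every possible connection.
 */
int64_t conservative_limit_kb(int64_t free_disk_kb, int max_connections)
{
    assert(free_disk_kb >= 0);
    assert(max_connections > 0);
    return free_disk_kb / max_connections;   /* truncating division */
}
```

For example, 500GB free (524288000 kB) with max_connections = 100 works
out to 5242880 kB, i.e. 5GB per backend. As with work_mem, this is a
worst-case split and most sites would probably pick something looser.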

As an example I'd find it useful to avoid badly written queries causing 
too much I/O load on the db backend of (say) a web system (i.e. such a 
system should not *have* queries that want to use that much resource).

To answer the other question, what happens when the limit is exceeded is 
modeled on statement timeout, i.e. the query is canceled and a message 
says why (exceeded temp files size).

Cheers

Mark


Re: WIP - Add ability to constrain backend temporary file space

From
Josh Berkus
Date:
> Obviously you need to do the same sort of arithmetic as you do with
> work_mem to decide on a reasonable limit to cope with multiple users
> creating temp files. Conservative dbas might want to set it to (free
> disk)/max_connections etc. Obviously for ad-hoc systems it is a bit more
> challenging - but having a per-backend limit is way better than having
> what we have now, which is ... errr... nothing.

Agreed.

> To answer the other question, what happens when the limit is exceeded is
> modeled on statement timeout, i.e query is canceled and a message says
> why (exceeded temp files size).

When does this happen?  When you try to allocate the file, or when it
does the original tape sort estimate?

The disadvantage of the former is that the user waited for minutes in
order to have their query cancelled.  The disadvantage of the latter is
that the estimate isn't remotely accurate.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: WIP - Add ability to constrain backend temporary file space

From
Tom Lane
Date:
Mark Kirkwood <mark.kirkwood@catalyst.net.nz> writes:
> Added. With respect to the datatype, using int with KB units means the 
> largest temp size is approx 2047GB - I know that seems like a lot now... 
> but maybe someone out there wants (say) their temp files limited to 
> 4096GB :-)

[ shrug... ]  Sorry, I can't imagine a use case for this parameter where
the value isn't a *lot* less than that.  Maybe if it were global, but
not if it's per-backend.
        regards, tom lane


Re: WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
On 19/02/11 10:38, Josh Berkus wrote:
>
>> To answer the other question, what happens when the limit is exceeded is
>> modeled on statement timeout, i.e query is canceled and a message says
>> why (exceeded temp files size).
> When does this happen?  When you try to allocate the file, or when it
> does the original tape sort estimate?
>
> The disadvantage of the former is that the user waited for minutes in
> order to have their query cancelled.  The disadvantage of the latter is
> that the estimate isn't remotely accurate.
>

Neither - it checks each write (I think this is pretty cheap - it adds 
a couple of int and double additions, plus one division and one 
comparison, to FileWrite). If the check shows you've written more than 
the limit, you get canceled. So you can exceed the limit by 1 buffer size.
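
A minimal sketch of that kind of per-write check, for illustration only.
The struct and function names are hypothetical, it uses integer
arithmetic rather than the double used in the patch, treating -1 as
"no limit" is an assumption, and it ignores temp files being truncated
or deleted.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-backend accounting state (not the patch's own names). */
typedef struct TempSpaceTracker
{
    int64_t limit_kb;       /* limit in kB; -1 means no limit (assumption) */
    int64_t written_bytes;  /* running total of temp file bytes written */
} TempSpaceTracker;

/*
 * Account for a write of nbytes that has just happened.
 * Returns 1 if the backend is now over its limit and the query
 * should be canceled, 0 otherwise.
 */
int temp_write_exceeds_limit(TempSpaceTracker *t, int64_t nbytes)
{
    assert(t != NULL && nbytes >= 0);
    t->written_bytes += nbytes;              /* one addition per write */
    if (t->limit_kb < 0)
        return 0;                            /* limit disabled */
    return t->written_bytes / 1024 > t->limit_kb;  /* one / and one > */
}
```

Because the check runs after the write is counted, a limit of 16 kB with
8192-byte writes only trips on the third write, which is the "exceed the
limit by 1 buffer size" behaviour described above.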

Yeah, the disadvantage is that (like statement timeout) it is a 'bottom 
of the cliff' type of protection. The advantage is there are no false 
positives...

Cheers

Mark


Re: WIP - Add ability to constrain backend temporary file space

From
Josh Berkus
Date:
> Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
> of the cliff' type of protection. The advantage is there are no false
> positives...

Yeah, just trying to get a handle on the proposed feature.  I have no
objections; it seems like a harmless limit for most people, and useful
to a few.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
On 19/02/11 11:30, Josh Berkus wrote:
>> Yeah, the disadvantage is that (like statement timeout) it is a 'bottom
>> of the cliff' type of protection. The advantage is there are no false
>> positives...
>
> Yeah, just trying to get a handle on the proposed feature.  I have no
> objections; it seems like a harmless limit for most people, and useful
> to a few.
>

No worries and sorry, I should have used the "per backend" phrase
in the title to help clarify what was intended.

Cheers

Mark


Re: WIP - Add ability to constrain backend temporary file space

From
Mark Kirkwood
Date:
New version:

- adds documentation
- adds category RESOURCES_DISK

Attachment