Greetings,
* Simon Riggs (simon@2ndquadrant.com) wrote:
> work_mem= 1 GB benefit at 8 TB
> work_mem= 256MB benefit at 0.5 TB
> (based upon runs on average twice size of memory, and each logical tape
> requiring 256KB memory, i.e. min(work_mem/4, 6) * work_mem * 2, which
> for work_mem > 2 MB gives 0.5 * work_mem^2)
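(Restating that math to check my own understanding: with each logical
tape needing 256KB, the crossover point looks like roughly the number
of tapes (work_mem / 256KB) times the average run size (about
2 * work_mem), which does match the numbers above:

    1GB work_mem:    (1GB / 256KB) * 2GB     = 4096 * 2GB   = 8TB
    256MB work_mem:  (256MB / 256KB) * 512MB = 1024 * 512MB = 0.5TB

Apologies if I've mangled that; it's just how I read it.)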
Seeing this reminded me of an issue I ran into recently. In 8.1, on a
database that's only about 16GB, I ran a query that chewed up all of
the available disk space (about 250GB, yes, 0.25TB) on the partition
and then failed. Of course, this took many hours on a rather speedy
box (and the disk array is a pretty nice IBM SAN, so it's no slacker
either) and produced nothing for me.
I'd like to think that Postgres often has some idea, before actually
running a query, of how much disk space the query is going to use,
rather than only finding out by running the whole thing and seeing how
much space it took at its peak. If that estimate can be made with some
confidence, it would be neat if Postgres could either check whether
enough disk space is available and bail out if not (I know, that's
difficult to do cross-platform, and there are tablespaces and whatnot
to consider), OR if there were a parameter along the lines of
"max_temp_disk_space" which would fail the query if the query would
exceed it. The latter could even be two GUC variables: one set by the
administrator and unchangeable by the user (a 'hard' limit), and one
settable by the user with a sane default (a 'soft' limit), plus
perhaps a HINT in the error message indicating how to raise the limit
when it is hit.
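Just to make that concrete (all of this is hypothetical, of course;
none of these GUCs exist and the names and syntax are only placeholders
for illustration), I'm imagining something roughly like:

    # postgresql.conf, set by the administrator ('hard' limit)
    max_temp_disk_space_hard = 100GB

    # settable per-session by the user, sane default ('soft' limit)
    max_temp_disk_space = 10GB

    -- then, in a session:
    SET max_temp_disk_space = '50GB';
    SELECT ...;
    ERROR:  query would exceed max_temp_disk_space (50GB)
    HINT:  Raise max_temp_disk_space, up to the administrator's
           hard limit.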
I suppose I could put quotas in place or something, but I don't really
have a problem with the database as a whole using up a bunch of disk
space (which is why it's got a lot of room to grow into). I just would
have liked a "this will chew up more disk space than you have and then
fail" message up front, instead of what ended up happening with this
query.
Thanks!
Stephen