* Jim C. Nasby (jnasby@pervasive.com) wrote:
> On Thu, Feb 09, 2006 at 02:03:41PM -0500, Mark Woodward wrote:
> > If it is not something that can be fixed, it should be clearly documented.
>
> work_mem (integer)
>
> Specifies the amount of memory to be used by internal sort
> operations and hash tables before switching to temporary disk files.
> The value is specified in kilobytes, and defaults to 1024 kilobytes
> (1 MB). Note that for a complex query, several sort or hash
> operations might be running in parallel; each one will be allowed to
> use as much memory as this value specifies before it starts to put
> data into temporary files. Also, several running sessions could be
> doing such operations concurrently. So the total memory used could
> be many times the value of work_mem; it is necessary to keep this
> fact in mind when choosing the value. Sort operations are used for
> ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash
> joins, hash-based aggregation, and hash-based processing of IN
> subqueries.
>
> So it says right there that it's very easy to exceed work_mem by a very
> large amount. Granted, this is a very painful problem to deal with and
> will hopefully be changed at some point, but it's pretty clear as to how
> this works.
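
To make the "many times the value of work_mem" warning concrete, here's a
back-of-the-envelope sketch (all numbers invented):

```sql
-- work_mem bounds each sort/hash operation, not the server or session.
SET work_mem = '4096';  -- 4 MB; the value is specified in kilobytes

-- A single query whose plan contains 3 sort/hash nodes can use up to
--   3 * 4 MB = 12 MB,
-- and 100 sessions running such queries concurrently can use up to
--   100 * 12 MB = 1200 MB,
-- even though work_mem itself is only 4 MB.
```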
It also says that when it goes over, it'll spill to disk. Additionally,
we're talking about one hash here, not multiple ones. The wording seems
at least misleading because, if I understand correctly, Postgres isn't
actually actively checking whether the memory used by an in-progress
hash build has gone over the limit; rather, it guesses at planning time
how much memory will be used, in order to decide whether a hash plan is
possible at all. That guess can certainly be wrong, but there's nothing
in place to handle the situation where it is...
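
As a hypothetical illustration (table names and the query are invented),
the plan-time nature of the decision can be seen with EXPLAIN:

```sql
-- Sketch: the hash-vs-other-plan decision is made at plan time from
-- row estimates, not enforced while the hash table is being built.
SET work_mem = '1024';  -- 1 MB, in kilobytes

EXPLAIN
SELECT o.customer_id, count(*)
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY o.customer_id;

-- If the planner's row estimate for "customers" is badly low, it may
-- choose a Hash Join expecting the hash table to fit within work_mem;
-- the build then proceeds regardless of how large the input actually
-- turns out to be.
```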
Thanks,
Stephen