Re: Disk buffering of resultsets - Mailing list pgsql-jdbc

From Craig Ringer
Subject Re: Disk buffering of resultsets
Date
Msg-id 541E7BD1.5090107@2ndquadrant.com
Whole thread Raw
In response to Re: Disk buffering of resultsets  ("Lussier, Denis" <denisl@openscg.com>)
List pgsql-jdbc
On 09/21/2014 11:24 AM, Lussier, Denis wrote:
> This does seem very worthwhile.  Can someone please sketch out a
> mini-design and see if it makes sense to the pgjdbc core?   I'd be
> willing to hack some code, but, I'd want the design to be pre-vetted.
>
> Here's my first quick strawman:
>
> 1.) Get the TEMP directory (may be OS specific).

Use Java's tempfile handling, let it use the default temporary
directory, but give the user the option of overriding it with a JDBC
driver parameter.

Java 7, use the Path and Files classes for tempfile handling.

Java 6 and below, use File.createTempFile(...).

Both let you override the target directory.

> 2.) After reading more than 100 rows, determine the average size per row
> and check the available JVM memory.

I'm not sure you can portably check available JVM memory.

I'd prefer to see if we can use an existing, mature library that
transparently serializes resultsets to disk under memory pressure.
Possibly including use of soft references to load results from the
on-disk store and let them be discarded by the JVM under pressure.

Rather than reinventing the wheel, lets see what already exists in this
problem space.

> 3.) Flush the rows to disk once more than 65% of JVM memory is exhausted
> OR according to a parameter that can be set.

I'd rather have low-cost spill to disk that we can use earlier and more
aggressively when we're reading whole tuple sets at once, rather than
use up lots of JVM memory and possibly unnecessarily evict soft
reference based caches from applications.

On Linux, if it wasn't Java code I'd want to just create an mmap()ed
file and start writing to it for result sets of almost any size. Let the
OS deal with whether to actually flush it to disk or not. I'm not sure
if that's practical from Java though, and IIRC Windows has more
aggressive writing of memory mapped files that could cause performance
issues.




--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-jdbc by date:

Previous
From: Thomas Kellerer
Date:
Subject: Re: Disk buffering of resultsets
Next
From: Craig Ringer
Date:
Subject: Re: Disk buffering of resultsets