Re: Synchronized Scan benchmark results - Mailing list pgsql-hackers
From | Jeff Davis
---|---
Subject | Re: Synchronized Scan benchmark results
Date |
Msg-id | 1175707429.4152.64.camel@dogma.v10.wvs
In response to | Re: Synchronized Scan benchmark results ("Simon Riggs" <simon@2ndquadrant.com>)
Responses | Re: Synchronized Scan benchmark results
List | pgsql-hackers
On Wed, 2007-04-04 at 10:40 +0100, Simon Riggs wrote:
> > That makes no sense to me, so it's probably a fluke (by which I mean
> > some other activity on the system, perhaps swapping some large
> > applications). The second two tests are consistent with all the other
> > numbers I got, but the first one took 40 seconds longer than I would
> > expect. I'll do a simple re-test tonight.
>
> What did you set scan_recycle_buffers to?

The default was zero.

> I think v2 of the patch interpreted that setting as meaning attempt to
> reuse the same buffer again immediately, which probably wouldn't be
> optimal. Which is why I issued v3... I think you'll need to set
> scan_recycle_buffers = 0 (==off in v3) and scan_recycle_buffers = 32 to
> get sensible comparison figures.

I used v2 with the default in those tests, so I think that means it used
the same buffer.

By the way, on another test I did, the results came out at 165s, which is
consistent with the other results. I think the time I ran that test, the
machine must have been swapping out applications or something... who knows.

> So please can you use v3 for any further testing. Thanks.

I'll use v3 of the patch as located here:

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00709.php

By the way, it might be easier to find the right one if the archives
contained filenames for the attachments. Am I missing something obvious?

> > > I would like to see some tests with different queries that have varying
> > > I/O and CPU requirements to see if they stay together too. That won't
> > > block the patch, but it will help everybody understand what the range of
> > > real world applicability there is in this. I'd guess this can benefit us
> > > sufficiently frequently in most cases that its worth it.
> >
> > I'll do some more varied tests. The best idea I've come up with so far
> > is to do something that requires random seeking going concurrently with
> > the scans.
>
> No, what I mean is different kinds of scans:
> - a simple scan like count(*)

Will use my same "scan.rb" benchmark.

> - a more complex one that does buckets of cycles per tuple

I'll use a modified "scan.rb" that does a computation in the select list
(I'll call the function volatile so that it recomputes with each tuple).

> - a hash join

This is where I got stuck.

* If it's one big ( > NBuffers/2 ) table and one small table, the small
  table will only serve to occupy some shared_buffers (right?)
* If it's two big tables, a join would be a major operation. I don't
  think it would even choose a hash join in that situation, right?

To summarize, in the next round of testing, I will:

* disable sync_seqscan_offset completely
* use recycle_buffers=0 and 32
* still test against 8.2.3 for consistency, in case you suggest otherwise

Regards,
    Jeff Davis
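For the comparison runs Simon asks for above, a minimal sketch of the setup might look like the following, assuming his v3 patch exposes scan_recycle_buffers as an ordinary session-settable GUC (that is an assumption; it may instead need to be set in postgresql.conf and reloaded):

```sql
-- Baseline run: buffer recycling off (v3 treats 0 as off).
SET scan_recycle_buffers = 0;
-- ... run the scan.rb benchmark here ...

-- Recycling run: use the 32-buffer setting Simon suggested.
SET scan_recycle_buffers = 32;
-- ... run the scan.rb benchmark again ...
```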
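The three kinds of scans discussed above might be exercised with queries roughly like the sketch below. The table names (bigscan, bigscan2), the id column, and the burn_cycles() function are hypothetical placeholders for illustration, not part of scan.rb:

```sql
-- 1. Simple scan: a plain count(*), which is what scan.rb already measures.
SELECT count(*) FROM bigscan;

-- 2. CPU-heavy scan: a VOLATILE function in the select list forces
--    re-evaluation for every tuple instead of constant-folding.
CREATE FUNCTION burn_cycles(i integer) RETURNS double precision AS $$
    SELECT sum(random()) FROM generate_series(1, 1000)
$$ LANGUAGE sql VOLATILE;

SELECT sum(burn_cycles(id)) FROM bigscan;

-- 3. Join between two large ( > NBuffers/2 ) tables; whether the planner
--    actually picks a hash join here is exactly the open question above.
SELECT count(*)
FROM bigscan b
JOIN bigscan2 c ON b.id = c.id;
```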