Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin) - Mailing list pgsql-performance
From | Gurpreet Aulakh |
---|---|
Subject | Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin) |
Date | |
Msg-id | IGEGINLEEDLANNHAAKCMKEDJCCAA.gaulakh@ecmarket.com Whole thread Raw |
In response to | Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Query slower on 8.0.3 (Windows) vs 7.3 (cygwin)
|
List | pgsql-performance |
Thanks for your help Tom. While testing 8.1, I found that simple joins take longer in 8.1 than 8.0. For example the sub query SELECT doc.doc_documentid FROM document AS doc LEFT JOIN folder_document ON doc.doc_documentid = folder_document.doc_documentId LEFT JOIN document as root ON doc.doc_internalRootXref = root.doc_documentId is actually slower on 8.1 than 8.0. However, the full query that I will be running is much faster. In my evaluation I found the same pattern. That simple joins were slower but complex joins were faster. Overall though, 8.1 is faster and we will probably be moving to it when it's officially released. -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: September 23, 2005 2:13 PM To: Gurpreet Aulakh Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Query slower on 8.0.3 (Windows) vs 7.3 (cygwin) "Gurpreet Aulakh" <gaulakh@ecmarket.com> writes: > After further investigation I have found that the reason why the query is > slower on 8.0.3 is that the hash and hash joins are slower on the 8.0.3. > So the question comes down to : Why are hash and hash joins slower? I looked into this a bit and determined that the problem seems to have been introduced here: 2002-12-30 10:21 tgl * src/: backend/executor/nodeHash.c, backend/executor/nodeHashjoin.c, backend/optimizer/path/costsize.c, include/executor/nodeHash.h: Better solution to integer overflow problem in hash batch-number computation: reduce the bucket number mod nbatch. This changes the association between original bucket numbers and batches, but that doesn't matter. Minor other cleanups in hashjoin code to help centralize decisions. (which means it's present in 7.4 as well as 8.0). The code now groups tuples into hash batches according to (hashvalue % totalbuckets) % nbatch When a tuple that is not in the first batch is reloaded, it is placed into a bucket according to (hashvalue % nbuckets) This means that if totalbuckets, nbatch, and nbuckets have a common factor F, the buckets won't be evenly used; in fact, only one in every F buckets will be used at all, the rest remaining empty. The ones that are used accordingly will contain about F times more tuples than intended. The slowdown comes from having to compare these extra tuples against the outer-relation tuples. 7.3 uses a different algorithm for grouping tuples that avoids this problem, but it has performance issues of its own (in particular, to avoid integer overflow we have to limit the number of batches we can have). So just reverting this patch doesn't seem very attractive. The problem no longer exists in 8.1 because of rewrites undertaken for another purpose, so I'm sort of tempted to do nothing. To fix this in the back branches we'd have to develop new code that won't ever go into CVS tip and thus will never get beta-tested. The risk of breaking things seems higher than I'd like. If we did want to fix it, my first idea is to increment nbatch looking for a value that has no common factor with nbuckets. regards, tom lane
pgsql-performance by date: