Re: huge difference in performance between MS SQL and pg 8.3 on UPDATE with full text search - Mailing list pgsql-general
From | Ivan Sergio Borgonovo |
---|---|
Subject | Re: huge difference in performance between MS SQL and pg 8.3 on UPDATE with full text search |
Date | |
Msg-id | 20100928234816.03e18905@dawn.webthatworks.it Whole thread Raw |
In response to | Re: huge difference in performance between MS SQL and pg 8.3 on UPDATE with full text search (Alban Hertroys <dalroi@solfertje.student.utwente.nl>) |
List | pgsql-general |
On Tue, 28 Sep 2010 20:19:10 +0200 Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote: > On 28 Sep 2010, at 12:49, Ivan Sergio Borgonovo wrote: > > The hardware on the 2 machines is a bit different. > > MS SQL 2005 runs on a box that is 2 years old, 2 SATA drives on > > RAID 1 hw, 2 Xeon dual core (I can't check details right now) > > PG runs on a box that has more than 5 years, 3 SCSI drives on > > RAID 5 hw, 2 Xeon single core HT (Intel(R) Xeon(TM) CPU 3.20GHz, > > cpu fam 14, model 4) > > In both tables I've to update price. > > VACUUM FULL was performed just before updating the prices. > VACUUM FULL? Was that really necessary? You did REINDEX after > that, didn't you? If not, your indexes became bloated. If the > table wasn't empty before you probably meant to do a VACUUM > ANALYSE, but if it was, just ANALYSE would have been sufficient. ... > Did you ANALYSE between loading the data and updating? I thought VACUUM FULL was more "magical" and implied a REINDEX. Am I wrong? The index that should be reindexed is the one on the pk, a simple btree, that's not going to be as slow as rebuilding a gin... still I'd really thought that VACUUM FULL implied a lot of things (including ANALYZE as well). > Also, executing the trigger on each copied line is likely to be a > little slow, due to the overhead of calling a stored procedure > (especially if it's plpgsql). It's probably quite a bit faster to > disable the trigger and create the gin-index after loading the > data. I'll try to drop the trigger. I'm not expecting it the biggest factor still adding something here and something there may end up in the huge difference between the 2. Anyway MS SQL seems to overcome all this nuisances auto-magically. > An EXPLAIN ANALYSE of that statement would tell what it's doing > and what's taking so long. A simple EXPLAIN would probably be > sufficient to see what query plan it thinks it needs though. I'll post the EXPLAIN. Before I run EXPLAIN ANALYZE I've to take some precaution the DB doesn't explode. > Did you tune that database? Several options (work_mem for example) > could significantly improve your performance if you can set them > higher (or reduce it if you set them too high). You can do that > per session too. pg is generally faster than the other MS SQL box on what's normally done on a daily basis. Just large updates to the product page seems to be a pain. Other INSERT/UPDATE operations are seldom performed, they involve smaller tables with no gin index. > > If HW can justify such huge difference I'll devote my time to > > other problems. > Partially, yes, but not that much I think. That's my worry... but still in many circumstances pg performs better than the MS SQL box... yeah... on pretty different workload... but while on other workloads pg is a bit faster (20% to 100% faster) even if it is on an older box, on this one is very slow. > > I'd say that a potential culprit could be the gin index. No > > matter if the tsvector is updated or not, if the row is changed > > I think the index is going to be updated anyway. > gin indexes require relatively much RAM. If you didn't assign much > in your settings then it's quite possible that the database can't > keep the index in memory or that things have to spill to disk. > Leave enough room for the OS's disk cache though, Postgres > benefits from that as well. > Is there any particular reason you went with a gin index and not a > gist one? Gin can be faster, but consumes (much) more memory, but > gist is also quite good with text searches and doesn't require > quite as much memory. gin index is doing a very good work and well full text searches are the typical workload of that box and the one that is more important to be fast. I'd say if gin was occupying so much memory performances wouldn't be so good on a daily basis. I'd post excerpt of my postgres.conf (what's important here?) and see if anything can be improved for *this* workload and temporary tune the DB for this exceptional update still I'm going to continue to be a bit surprised of such a huge difference even if it will come out that it was actually a .conf not suited for this workload. BTW the box is running Apache and php. thanks -- Ivan Sergio Borgonovo http://www.webthatworks.it
pgsql-general by date: