Thread: Few questions about my slow query
Hi, I'm returning to postgresql performance problem, while I'm trying a lot to understand way pg is performing queries I need your help... So I have two tables: refs (...some fields... , key is id_tit char(7) - indexed with default method); temp_kat (...some fields..., key is id_ref char(7) - indexed with default method); Refs contains 80000 recs, temp_kat contains 5500 recs and all records from temp_kat have a matching key value in refs... When I need "mark" records in table refs (set the field exist for records existing also in temp_kat) I do following query: update refs set exist='true' where temp_kat.id_ref=refs.id_tit; This query takes a lot of time - I don't know exactly how long because I had to leave my office after 1.5 hour and the query was still running. I think it's too long... Before executing I increased shared buffers to 1024 and both tables are vacuumized. And I'm running the RedHat5.1 box on Celeron 266 with 64MB ram. So the questions are: 1. Did I reach the maximum possible performance or should I rewrite query some other way? 2. Could I start postgres with some diff. options for ex. with the feature of delayed flushing? My pg is 6.4.2 and I saw here there is a possibility to run it at some diff. mode but I don't know how (no docs about it) and don't know if it's only pg 6.5 feature 3. Should I change the indexing method from default one to some other? Many thanks for helping me; believe me that I'm not lazy to study but there is not so much docs about the internal pg behaviour so I can only ask you... -- Michal Samek, Tony distribuce s.r.o. webmaster@tony.cz (++420659/321350) ICQ: 38607210
>Many thanks for helping me; believe me that I'm not lazy to study but there is >not so much docs about the internal pg behaviour so I can only ask you... OK, after posting this message I downloaded the 6.5 manual and I can see the whole new chapter about pg internals, it's great :) I'm sorry I should read it before I ask you. But still thanks if you want to help me. -- Michal Samek, Tony distribuce s.r.o. webmaster@tony.cz (++420659/321350) ICQ: 38607210
webmaster <webmaster@tony.cz> writes: > Refs contains 80000 recs, temp_kat contains 5500 recs and all records from > temp_kat have a matching key value in refs... When I need "mark" records in > table refs (set the field exist for records existing also in temp_kat) I do > following query: > update refs set exist='true' where temp_kat.id_ref=refs.id_tit; > This query takes a lot of time Hmm. Are the id fields unique, or are there duplicates? If there were, say, 10 records in temp_kat with the same ID, then the corresponding records in refs would each get updated 10 times. It would help to see what EXPLAIN says about this query. Since you have indexes, a merge join would probably be the best strategy, but we can't tell what's actually being used. > 2. Could I start postgres with some diff. options for ex. with the feature of > delayed flushing? My pg is 6.4.2 and I saw here there is a possibility to I think you should update to 6.5. You might also want to start the postmaster with "-o -F" to turn off fsync, but this assumes that you have confidence in your OS, hardware, and power supply --- in the event of a system crash, you have a greater risk of losing data with -F. regards, tom lane