Thread: Big problem with SQL UPDATE operation
Recently I wrote a few PL/pgSQL procedures that generate invoices and store them in a Postgres table. A small test showed that there is a performance problem. I thought that string operations in PL/pgSQL were the weak point, but it turned out that the program spends 90% of its time on a very simple UPDATE. Below is my simplified test case:

CREATE TABLE group_fin_account_tst (
    group_fin_account_tst_id BIGSERIAL PRIMARY KEY,
    credit NUMERIC(8,2) DEFAULT 0.00 NOT NULL
);
ALTER TABLE group_fin_account_tst OWNER TO freeconetadm;

INSERT INTO group_fin_account_tst (credit) VALUES (4);

CREATE OR REPLACE FUNCTION test() RETURNS void AS
$BODY$
DECLARE
BEGIN
    FOR v_i IN 1..4000 LOOP
        UPDATE group_fin_account_tst
           SET credit = v_i
         WHERE group_fin_account_tst_id = 1;  -- in the real procedure I update different rows
    END LOOP;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;
ALTER FUNCTION test() OWNER TO freeconetadm;

SELECT test();

The strange thing is how the program behaves when I increase the number of iterations. Below are my results (where u/s is the number of updates per second).

On Windows:
  500   -   0.3s  (1666 u/s)
  1000  -   0.7s  (1428 u/s)
  2000  -   2.3s  (869 u/s)
  4000  -   9s    (444 u/s)
  8000  -  29s    (275 u/s)
  16000 - 114s    (140 u/s)

On Linux:
  500   -   0.5s  (1000 u/s)
  1000  -   1.8s  (555 u/s)
  2000  -   7.0s  (285 u/s)
  4000  -  26s    (153 u/s)
  8000  - 101s    (79 u/s)
  16000 - 400s    (40 u/s)

On both systems the relation between the number of iterations and the elapsed time is strongly nonlinear! Do you know what the problem is? Is it possible to commit a transaction inside a PL/pgSQL procedure? I ask because I think the transaction may simply be too long.

Regards
Michal Szymanski
http://blog.szymanskich.net
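As a reference point, here is a hypothetical instrumented variant of test() (not part of the original post; the function name and parameter are made up for illustration). It reports elapsed time every 1000 updates using clock_timestamp(), which keeps advancing inside a transaction, so the progressive slowdown becomes visible directly:

CREATE OR REPLACE FUNCTION test_timed(p_iterations integer) RETURNS void AS
$BODY$
DECLARE
    v_start timestamptz := clock_timestamp();  -- advances during the transaction, unlike now()
BEGIN
    FOR v_i IN 1..p_iterations LOOP
        UPDATE group_fin_account_tst
           SET credit = v_i
         WHERE group_fin_account_tst_id = 1;
        IF v_i % 1000 = 0 THEN
            RAISE NOTICE '% updates done, elapsed: %', v_i, clock_timestamp() - v_start;
        END IF;
    END LOOP;
END;
$BODY$ LANGUAGE plpgsql VOLATILE;

SELECT test_timed(16000);

If the later batches of 1000 take noticeably longer than the first ones, the slowdown is happening inside the single long transaction rather than in the client.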
Michal Szymanski <szymanskim@datera.pl> writes:
> CREATE OR REPLACE FUNCTION test() RETURNS void AS
> $BODY$
> DECLARE
> BEGIN
>     FOR v_i IN 1..4000 LOOP
>         UPDATE group_fin_account_tst
>            SET credit = v_i
>          WHERE group_fin_account_tst_id = 1;  -- in the real procedure I update different rows

Does updating the *same* record 4000 times per transaction reflect the real behavior of your application? If not, this is not a good benchmark. If so, consider redesigning your app to avoid so many redundant updates.

(For the record, the reason you see nonlinear degradation is the accumulation of tentatively-dead versions of the row, each of which has to be rechecked by each later update.)

			regards, tom lane
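One way to watch that accumulation (a sketch, not part of the reply above; it assumes the table from the test case and that the statistics collector is enabled) is to query the per-table statistics view before a run, after a run, and again after a VACUUM:

SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
 WHERE relname = 'group_fin_account_tst';

n_dead_tup should climb by roughly one per redundant UPDATE and drop back after the table is vacuumed.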
Tom Lane wrote:
> Michal Szymanski <szymanskim@datera.pl> writes:
>> CREATE OR REPLACE FUNCTION test() RETURNS void AS
>> $BODY$
>> DECLARE
>> BEGIN
>>     FOR v_i IN 1..4000 LOOP
>>         UPDATE group_fin_account_tst
>>            SET credit = v_i
>>          WHERE group_fin_account_tst_id = 1;  -- in the real procedure I update different rows
>
> Does updating the *same* record 4000 times per transaction reflect the
> real behavior of your application? If not, this is not a good
> benchmark. If so, consider redesigning your app to avoid so many
> redundant updates.

The real application modifies a different row each time.

> (For the record, the reason you see nonlinear degradation is the
> accumulation of tentatively-dead versions of the row, each of which has
> to be rechecked by each later update.)

There is another strange thing. We have two versions of our test environment, one with a copy of the production DB and one generated with a minimal data set, and it is odd that the update presented above executes in 170ms on the production copy but takes 6s on the small DB!

Michal Szymanski
http://blog.szymanskich.net
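One way to compare the two environments (a sketch, not from the original messages) is to check the table's physical size in each database; a copy that is carrying a lot of dead space from earlier test runs would explain this kind of gap:

SELECT pg_size_pretty(pg_relation_size('group_fin_account_tst'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('group_fin_account_tst')) AS total_size;

If the "slow" database shows a table far larger than a single row justifies, the difference is bloat rather than data volume.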
Michal Szymanski wrote:
> Tom Lane wrote:
>> (For the record, the reason you see nonlinear degradation is the
>> accumulation of tentatively-dead versions of the row, each of which has
>> to be rechecked by each later update.)
>
> There is another strange thing. We have two versions of our test
> environment, one with a copy of the production DB and one generated with
> a minimal data set, and it is odd that the update presented above
> executes in 170ms on the production copy but takes 6s on the small DB!

How are you vacuuming the tables?

--
Alvaro Herrera                       http://www.advogato.org/person/alvherre
"El conflicto es el camino real hacia la unión" (Conflict is the royal road to union)
Michal Szymanski wrote:
>>> There is another strange thing. We have two versions of our test
>>> environment, one with a copy of the production DB and one generated with
>>> a minimal data set, and it is odd that the update presented above
>>> executes in 170ms on the production copy but takes 6s on the small DB!
>>
>> How are you vacuuming the tables?
>
> Using pgAdmin (the DB is installed on my laptop), and I use this tool for
> vacuuming. I do not think that vacuuming can help, because I've tested on
> both databases just after importing.

I think you are misunderstanding the importance of vacuuming the table. Try this: on a different terminal from the one running the test, run a VACUUM on the updated table with vacuum_cost_delay set to 20, in an infinite loop. Keep this running while you do your update test. Vary the vacuum_cost_delay and measure the average/min/max UPDATE times. Also try putting a short sleep in the infinite VACUUM loop and see how its length affects the UPDATE times.

One thing not clear to me is whether your table is in a clean state. Before running this test, do a TRUNCATE and import the data again. That will get rid of any dead space that may be hurting your measurements.

--
Alvaro Herrera                       http://www.advogato.org/person/alvherre
"The PostgreSQL hackers have what I call a 'NASA space shot' mentality.
 Quite refreshing in a world of 'weekend drag racer' developers."  (Scott Marlowe)
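A minimal sketch of the suggested setup (the database name, the shell loop, and the 1-second sleep are illustrative, not from the thread; the table name comes from the test case above). The VACUUM has to be issued from a separate session, since it cannot run inside a function:

-- Session 2, run alongside the update test:
SET vacuum_cost_delay = 20;   -- milliseconds; vary this value as suggested above
VACUUM group_fin_account_tst;
-- Repeat the VACUUM in a loop, e.g. from a shell (the SET above only applies
-- within one session, so a per-invocation setting would be needed here):
--   while true; do psql -d testdb -c "VACUUM group_fin_account_tst"; sleep 1; done

-- Before each measured run, reset the table to a clean state:
TRUNCATE group_fin_account_tst;
INSERT INTO group_fin_account_tst (credit) VALUES (4);

Comparing UPDATE timings with and without the concurrent vacuum loop should show whether the dead row versions are what is slowing the test down.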