Thread: Big problem with SQL update operation

Big problem with SQL update operation

From: Michal Szymanski
Recently I wrote a few PL/pgSQL procedures that generate invoices and
store them in a Postgres table. A small test has shown that there is a
performance problem. I thought that string operations in PL/pgSQL were
the weak point, but it turned out that the program spends 90% of its
time on one very simple update.
Below is my simplified procedure:

CREATE TABLE group_fin_account_tst (
       group_fin_account_tst_id BIGSERIAL PRIMARY KEY,
       credit                   NUMERIC(8,2) DEFAULT 0.00 NOT NULL
);
ALTER TABLE group_fin_account_tst OWNER TO freeconetadm;

INSERT INTO group_fin_account_tst
(credit) VALUES (4);

CREATE OR REPLACE FUNCTION test()
RETURNS void AS
$BODY$
DECLARE
BEGIN
FOR v_i IN 1..4000 LOOP
    UPDATE group_fin_account_tst SET
            credit = v_i
         WHERE group_fin_account_tst_id = 1;   -- in the real procedure I update different rows
END LOOP;
END; $BODY$ LANGUAGE plpgsql VOLATILE;
ALTER FUNCTION test() OWNER TO freeconetadm;
SELECT test();

The strange thing is how the program behaves when I increase the number
of iterations. Below are my results (where u/s is the number of updates
per second):

On Windows:
500   - 0.3s  (1666 u/s)
1000  - 0.7s  (1428 u/s)
2000  - 2.3s  (869 u/s)
4000  - 9s    (444 u/s)
8000  - 29s   (275 u/s)
16000 - 114s  (140 u/s)

On Linux:
500   - 0.5s  (1000 u/s)
1000  - 1.8s  (555 u/s)
2000  - 7.0s  (285 u/s)
4000  - 26s   (153 u/s)
8000  - 101s  (79 u/s)
16000 - 400s  (40 u/s)

On both systems the relation between the number of iterations and the
time is strongly nonlinear! Do you know what the problem is? Is it
possible to commit a transaction inside a PL/pgSQL procedure? I think
that maybe the transaction is too long.
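
A minimal sketch of one answer, assuming PostgreSQL 11 or later, where a
PROCEDURE invoked with CALL may COMMIT mid-loop (the procedure name and
batch size below are illustrative, not part of the original test):

CREATE OR REPLACE PROCEDURE test_batched()
LANGUAGE plpgsql AS
$BODY$
BEGIN
FOR v_i IN 1..4000 LOOP
    UPDATE group_fin_account_tst SET
            credit = v_i
         WHERE group_fin_account_tst_id = 1;
    IF v_i % 500 = 0 THEN
        COMMIT;   -- illustrative batch size; ending the transaction lets
                  -- the accumulated old row versions be removed by VACUUM
    END IF;
END LOOP;
END; $BODY$;
CALL test_batched();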

Regards
Michal Szymanski
http://blog.szymanskich.net

Re: Big problem with SQL update operation

From: Tom Lane
Michal Szymanski <szymanskim@datera.pl> writes:
> CREATE OR REPLACE FUNCTION test()
> RETURNS void AS
> $BODY$
> DECLARE
> BEGIN
> FOR v_i IN 1..4000 LOOP
>     UPDATE group_fin_account_tst SET
>             credit     = v_i
>          WHERE group_fin_account_tst_id = 1;   -- in the real procedure I update different rows

Does updating the *same* record 4000 times per transaction reflect the
real behavior of your application?  If not, this is not a good
benchmark.  If so, consider redesigning your app to avoid so many
redundant updates.
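
For instance, a minimal sketch of that redesign, assuming only the
final value matters: the entire loop collapses to a single statement.

UPDATE group_fin_account_tst SET credit = 4000
 WHERE group_fin_account_tst_id = 1;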

(For the record, the reason you see nonlinear degradation is the
accumulation of tentatively-dead versions of the row, each of which has
to be rechecked by each later update.)
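
One way to observe this while test() runs, a minimal sketch assuming
PostgreSQL 8.3+ for the n_dead_tup column: query the statistics views
from a second session and watch the dead-row count climb.

-- run in a second session during the test; n_dead_tup grows with each update
SELECT n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'group_fin_account_tst';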

            regards, tom lane

Re: Big problem with SQL update operation

From: Michal Szymanski
Tom Lane wrote:
> Michal Szymanski <szymanskim@datera.pl> writes:
>
>> CREATE OR REPLACE FUNCTION test()
>> RETURNS void AS
>> $BODY$
>> DECLARE
>> BEGIN
>> FOR v_i IN 1..4000 LOOP
>>     UPDATE group_fin_account_tst SET
>>             credit     = v_i
>>          WHERE group_fin_account_tst_id = 1;   -- in the real procedure I update different rows
>>
>
> Does updating the *same* record 4000 times per transaction reflect the
> real behavior of your application?  If not, this is not a good
> benchmark.  If so, consider redesigning your app to avoid so many
> redundant updates.
>
>
The real application modifies a different row each time.

> (For the record, the reason you see nonlinear degradation is the
> accumulation of tentatively-dead versions of the row, each of which has
> to be rechecked by each later update.)
>
>
There is another strange thing. We have two versions of our test
environment, one with a copy of the production DB and a second generated
with a minimal data set, and it is odd that the update presented above
executes in 170ms on the copy of production but takes 6s on the small DB!
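
A quick way to compare the physical state of the table in the two
databases, a sketch using the system catalog and the test table name
from above: a table carrying a lot of dead space shows far more pages
than its live row count warrants.

-- run in both databases and compare; relpages is the on-disk size in 8kB pages
SELECT relpages, reltuples
  FROM pg_class
 WHERE relname = 'group_fin_account_tst';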

Michal Szymanski
http://blog.szymanskich.net

Re: Big problem with SQL update operation

From: Alvaro Herrera
Michal Szymanski wrote:
> Tom Lane wrote:

> >(For the record, the reason you see nonlinear degradation is the
> >accumulation of tentatively-dead versions of the row, each of which has
> >to be rechecked by each later update.)
> >
> There is another strange thing. We have two versions of our test
> environment, one with a copy of the production DB and a second generated
> with a minimal data set, and it is odd that the update presented above
> executes in 170ms on the copy of production but takes 6s on the small DB!

How are you vacuuming the tables?

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"El conflicto es el camino real hacia la unión"

Re: Big problem with SQL update operation

From: Alvaro Herrera
Michal Szymanski wrote:
> >>There is another strange thing. We have two versions of our test
> >>environment, one with a copy of the production DB and a second generated
> >>with a minimal data set, and it is odd that the update presented above
> >>executes in 170ms on the copy of production but takes 6s on the small DB!
> >
> >How are you vacuuming the tables?
> >
> Using pgAdmin (the DB is installed on my laptop) and I use this tool for
> vacuuming. I do not think that vacuuming can help, because I tested on
> both databases just after importing.

I think you are misunderstanding the importance of vacuuming the table.
Try this: on a different terminal from the one running the test, run a
VACUUM on the updated table with vacuum_cost_delay set to 20, on an
infinite loop.  Keep this running while you do your update test.  Vary
the vacuum_cost_delay and measure the average/min/max UPDATE times.
Also try putting a short sleep on the infinite VACUUM loop and see how
its length affects the UPDATE times.
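
A minimal sketch of that setup, assuming psql 9.3 or later for the
\watch meta-command (the interval is one knob to vary, alongside
vacuum_cost_delay as suggested above):

-- in a second psql session, while the UPDATE test runs:
SET vacuum_cost_delay = 20;   -- per-cost-unit sleep, in milliseconds
VACUUM group_fin_account_tst;
\watch 1                      -- re-run the VACUUM every second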

One thing not clear to me is if your table is in a clean state.  Before
running this test, do a TRUNCATE and import the data again.  This will
get rid of any dead space that may be hurting your measurements.
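
That is, a sketch of the reset, reusing the test data from the first
message:

TRUNCATE group_fin_account_tst;
INSERT INTO group_fin_account_tst (credit) VALUES (4);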

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)