Re: DB Performance decreases due to often written/accessed - Mailing list pgsql-performance

From Richard Huxton
Subject Re: DB Performance decreases due to often written/accessed
Date
Msg-id 4537B394.9000205@archonet.com
In response to Re: DB Performance decreases due to often written/accessed table  ("Jens Schipkowski" <jens.schipkowski@apus.co.at>)
Responses Re: DB Performance decreases due to often written/accessed  ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-performance
Jens Schipkowski wrote:
>> And running vacuum every 30 seconds does what?
> Not yet fully tested. It seems to reduce the slowdown.
> But merely reducing the slowdown doesn't solve the problem. The
> problem is the drop in execution speed of DELETE and SELECT
> statements once the table reaches roughly 150k - 200k rows. The table
> only starts growing after DELETE statements fail due to the timeout.
>>
>>> It may not sound professional, but the database seems to age as
>>> the number of executed statements increases.
>>
>> It sounds very likely if you aren't vacuuming enough or the tables are
>> growing rapidly.
> Vacuuming once a minute is not enough? We execute about 3k
> statements per minute (at the start of testing); 1/4 of them are INSERTs
> and DELETEs. After 5 minutes a DELETE takes about 50 seconds,
> compared to 300 - 600ms at startup.

You want to vacuum often enough that the deletes don't leave
ever-increasing gaps in your table. If you run VACUUM VERBOSE it will
tell you what it did - if the number of (non-)removable rows keeps
increasing, you're not vacuuming enough. The trick is to vacuum often,
so that each run only has a small amount of work to do. The autovacuum
tool in recent versions tries to estimate this for you, but might not
cope in your case.
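
Roughly, something like this (a sketch only - the autovacuum lines are the
standard postgresql.conf settings from 8.1 onwards, and the values are
illustrative, not tuned for your workload):

  -- run by hand (or from cron) and watch the removable/nonremovable row counts
  VACUUM VERBOSE ANALYZE tbl_reg;

and, in postgresql.conf (8.1 or later):

  # wake the autovacuum daemon every 30 seconds
  autovacuum = on
  autovacuum_naptime = 30
  # a table is vacuumed once updates+deletes exceed
  # threshold + scale_factor * (rows in table)
  autovacuum_vacuum_threshold = 500
  autovacuum_vacuum_scale_factor = 0.05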

>>> The Statements
>>> ---------------
>>> -- normal insert - does not seem to be the problem - runtime is ok
>>> INSERT INTO
>>> tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
>>> VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);
>>> -- the select finds out which rows do not have a twin
>>> -- a twin is defined as a record with the same attr* values
>>> -- gets slower over time until it hits the timeout in PostgreSQL
>>> SELECT *
>>> FROM tbl_reg reg
>>> WHERE register <> loc1 AND
>>>     idreg NOT IN
>>>         (
>>>         SELECT reg.idreg
>>>         FROM tbl_reg reg, tbl_reg regtwin
>>>         WHERE regtwin.register = 1 AND
>>>             regtwin.type <> 20 AND
>>>             reg.attr1 = regtwin.attr1 AND
>>>             reg.attr2 = regtwin.attr2 AND
>>>             reg.attr3 = regtwin.attr3 AND
>>>             reg.attr4 = regtwin.attr4 AND
>>>             reg.attr5 = regtwin.attr5 AND
>>>             reg.attr6 = regtwin.attr6 AND
>>>             reg.idreg <> regtwin.idreg AND
>>>             reg.register = 2
>>>         );
>>> I tried to optimize the SELECT statement, but the GROUP BY ... HAVING
>>> count(*) > 1 solution is only half as fast as this statement, according
>>> to the query plan from EXPLAIN ANALYZE.
>>
>> And what did EXPLAIN ANALYSE show here? I'm guessing you're getting
>> time increasing as the square of the number of rows in tbl_reg. So, if
>> 50 rows takes 2500s then 100 rows will take 10000s. Now, if you had
>> enough RAM (or an index) I'd expect the planner to process the table
>> in a sorted manner so the query-time would increase linearly.
>
> EXPLAIN ANALYZE shows at startup
>    QUERY PLAN
> -------------------------------------------------------------------------------------------------
>
>  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6)
> (actual time=673.877..673.877 rows=0 loops=1)
>    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
> 2::smallint)))
>    SubPlan
>      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8)
> (actual time=285.624..285.624 rows=0 loops=1)
>            Merge Cond: (("outer".attr1 = "inner".attr1) AND
> ("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
> "inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
> AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
> "inner".attr6))
>            Join Filter: ("outer".idreg <> "inner".idreg)
>            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
> time=285.622..285.622 rows=0 loops=1)
>                  Sort Key: reg.attr1, (reg.attr2)::text,
> (reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
>                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
> rows=1835 width=56) (actual time=285.551..285.551 rows=0 loops=1)
>                        Filter: (register = 2)
>            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56)
> (never executed)
>                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
> (regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
> regtwin.attr6
>                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
> rows=140247 width=56) (never executed)
>                        Filter: ((register = 1) AND ("type" <> 20))
>  Total runtime: 604.410 ms
> (15 rows)
>
> EXPLAIN ANALYZE shows after 10 minutes load and 1x vacuuming
>       QUERY PLAN
> ----------------------------------------------------------------------------------------------------
>
>  Seq Scan on tbl_reg  (cost=25841.35..31433.17 rows=72424 width=6)
> (actual time=43261.910..43261.910 rows=0 loops=1)
>    Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
> 2::smallint)))
>    SubPlan
>      ->  Merge Join  (cost=22186.21..25646.57 rows=77913 width=8)
> (actual time=43132.063..43132.063 rows=0 loops=1)
>            Merge Cond: (("outer".attr1 = "inner".attr1) AND
> ("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
> "inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
> AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
> "inner".attr6))
>            Join Filter: ("outer".idreg <> "inner".idreg)
>            ->  Sort  (cost=4967.06..4971.65 rows=1835 width=56) (actual
> time=387.071..387.872 rows=1552 loops=1)
>                  Sort Key: reg.attr1, (reg.attr2)::text,
> (reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
>                  ->  Seq Scan on tbl_reg reg  (cost=0.00..4867.59
> rows=1835 width=56) (actual time=303.966..325.614 rows=1552 loops=1)
>                        Filter: (register = 2)
>            ->  Sort  (cost=17219.15..17569.77 rows=140247 width=56)
> (actual time=42368.934..42530.986 rows=145324 loops=1)
>                  Sort Key: regtwin.attr1, (regtwin.attr2)::text,
> (regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
> regtwin.attr6
>                  ->  Seq Scan on tbl_reg regtwin  (cost=0.00..5229.70
> rows=140247 width=56) (actual time=0.015..1159.515 rows=145453 loops=1)
>                        Filter: ((register = 1) AND ("type" <> 20))
>  Total runtime: 44073.127 ms
> (15 rows)

OK - these plans look about the same, but the runtime is vastly different.
Both estimate rows=140247 for the seq scan of tbl_reg as regtwin, but in the
first run that branch was never executed, while in the second it scans and
sorts about 145k rows and the sort alone takes ~42 seconds. Either you have
many more rows in the second case (in which case you're not running ANALYSE
enough) or you have lots of gaps in the table (you're not running VACUUM
enough).
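
A quick (and only approximate) way to check for that bloat is to watch the
planner's page and row counts for the table before and after a test run, e.g.:

  -- relpages/reltuples are as of the last VACUUM or ANALYZE; if relpages keeps
  -- climbing while reltuples stays roughly level, the table is mostly dead rows
  SELECT relname, relpages, reltuples
  FROM pg_class
  WHERE relname = 'tbl_reg';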

I'd then try putting an index on (attr1,attr2,attr3...attr6) and see if
that helps reduce time.
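
Something along these lines (untested - the column list is a guess from your
query, and a partial index matching the regtwin conditions might work even
better):

  -- plain multi-column index on the twin-matching attributes
  CREATE INDEX tbl_reg_attr_idx
      ON tbl_reg (attr1, attr2, attr3, attr4, attr5, attr6);

  -- or, if it is only ever matched against register = 1 rows, a partial index
  CREATE INDEX tbl_reg_attr_reg1_idx
      ON tbl_reg (attr1, attr2, attr3, attr4, attr5, attr6)
      WHERE register = 1;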

--
   Richard Huxton
   Archonet Ltd
