Thread: INSERT waiting under heavy load

INSERT waiting under heavy load

From

alex-lists-pgsql@yuriev.com

Date:

06 January 2006, 15:58:57

After digging through all the discussions of "INSERT waiting" problems I am
still not clear about the concensus about solving it.



I am running ration 6:1 SELECT:INSERT (insert fires up an UPDATE trigger
that hits a column in a table holding keys used by SELECT). I am looking at
doing about 2,000 INSERT/UPDATE per second, with possible peaks at 10,000
INSERT/UPDATE per second (i.e. 60,000 SELECTs).


(table 1)

The table holding primary keys is expected to grow to around 10,000 rows.
This is the table that gets 50% of SELECTs and 100% of UPDATES. This is the
owner status table. It is optimized so with a single SELECT against this
table all information needed for real-time clients would be accessible.

(table 2)

The 2nd number of rows in the second table is expected to be around 100
times the number of rows in the 1st table. Each entry in this table has uses
first table's column as a foreign key to avoid unlinked entries. It also has
foreign key dependecies to some other tables that for the purpose of the
application are never updated. This table gets the other 50% of SELECTs.

(table 3)

Finally, the 3rd table (audit log) is expected to have arbitraty number of
entries (measured in millions). It gets virtually no SELECT activity in the
mornal operations. If the data from this table is needed, a snapshot of this
table gets pulled into a different table (most likely on a different
database) and gets processed there. The table gets cleaned up at specific
intervals using DROP TABLE/CREATE TABLE sequence. It is guaraneed that when
the management application (non-real time) performs DROP TABLE/CREATE table
combination. The only thing that I do not particulary like is that every INSERT
into this table has to adjust a counter column in a corresponding row of the
(table1) via (table3->table2->table1) path.



The server is configured to accept about 200 connections from clients. The
problem is that after first couple of hours of working normally, as the
table (3) grows, the backend indicates that more and more INSERTs into table
3 are held up in the "INSERT waiting" state.

It happens even when table 1 contains only one row, table 2 contains 4 rows.


Is there anything that can be done to diagnose why "INSERT waiting" state
becomes so prevalent?

Would pulling the counter from table 1 into a table
(4) that contains only reference to appropriate table (1) row and counter
value make it better? 


Thanks,
Alex

Re: INSERT waiting under heavy load

From

Tom Lane

Date:

06 January 2006, 16:52:10

alex-lists-pgsql@yuriev.com writes:
> After digging through all the discussions of "INSERT waiting" problems I am
> still not clear about the concensus about solving it.
> ...
> The only thing that I do not particulary like is that every INSERT
> into this table has to adjust a counter column in a corresponding row of the
> (table1) via (table3->table2->table1) path.

Well, if there are only a few rows in table1, then this design is
inherently going to lose big.  Any two transactions trying to update the
same table1 row are going to conflict and one will have to wait for the
other to complete.  Rethink the need for those counters.
        regards, tom lane

Re: INSERT waiting under heavy load

From

alex-lists-pgsql@yuriev.com

Date:

06 January 2006, 17:26:47

> > After digging through all the discussions of "INSERT waiting" problems I am
> > still not clear about the concensus about solving it.
> > ...
> > The only thing that I do not particulary like is that every INSERT
> > into this table has to adjust a counter column in a corresponding row of the
> > (table1) via (table3->table2->table1) path.
> 
> Well, if there are only a few rows in table1, then this design is
> inherently going to lose big.  Any two transactions trying to update the
> same table1 row are going to conflict and one will have to wait for the
> other to complete.  Rethink the need for those counters.

I appreciate that it is most likely not the best design though i expect
reasonable distribution of UPDATE hits against the first table when the
number of rows increases. 

What I do not understand is this: 

if the problem is caused by the the acquire lock->modify column->release
lock on the table 1, then why does it increase significantly increase as the
number of entries in the table 3 grows? The simulation maintains pretty much
constant rate of new requests coming to table 3.

Alex

Re: INSERT waiting under heavy load

From

Tom Lane

Date:

06 January 2006, 17:49:11

alex-lists-pgsql@yuriev.com writes:
> if the problem is caused by the the acquire lock->modify column->release
> lock on the table 1, then why does it increase significantly increase as the
> number of entries in the table 3 grows? The simulation maintains pretty much
> constant rate of new requests coming to table 3.

Are you vacuuming table 1 on a regular basis while you do this?
Accumulation of dead tuple versions in table 1 is a pretty severe
performance risk in the sort of scenario you are describing.
(I'm not sure why it would manifest as transactions showing "INSERT
waiting" state though.)
        regards, tom lane

Re: INSERT waiting under heavy load

From

Andrew Sullivan

Date:

08 January 2006, 14:21:29

On Fri, Jan 06, 2006 at 04:49:09PM -0500, Tom Lane wrote:
> performance risk in the sort of scenario you are describing.
> (I'm not sure why it would manifest as transactions showing "INSERT
> waiting" state though.)

It's because of I/O.  When you have a large number of updates, the
planner always assumes an indexscan (correctly), but you end up
scanning megabytes of dead tuples.  With a large number of open
transactions, most of the time VACUUM can't recover the space.  I
agree with what Tom said earlier in this thread: the design is
guaranteed to lose.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.    --H.W. Fowler