Thread: concurrency performance degradation

concurrency performance degradation

From: Sheer El-Showk
Date:
Hi,

I have an application that uses foreign keys rather heavily in postgres.
It supports multiple concurrent sessions with high transaction volumes.
Ideally the server should be able to have 20-30 connections open to the
database each partaking in similar operations.  Each transaction may
involve 20+ writes and maybe 100 reads.

When run single-threaded, I can perform one transaction per second (30 or
so writes, 100 or so reads, plus time spent in the application), but if I
increase concurrency to about 10 active threads all doing similar
operations (on the same set of tables) I gain very little from it.  I
notice in the process listing that the postgres processes are almost all in
"INSERT WAITING..." or "UPDATE WAITING..." except for one which is in a
"SELECT" or an "INSERT".  My confusion here is that I thought postgres
used record-level locking, so I should be able to run multiple concurrent
inserts on a table without locking.

The reason I mention foreign keys above is that I imagine that foreign
keys might complicate locking issues and I use them a _lot_.  I also
noticed that the database slows down significantly as it gets populated.
When I get to about 8000 transactions (~8000 records in my primary table)
it is much slower, and by 19k it takes up to 5-15 seconds to do an insert.
A "vacuum analyze" didn't help here.

I tried this using postgres 7.0.2 and postgres 7.2b1 without any
significant differences.  Does anybody have some related metrics to give
me, or some tuning suggestions?

While the server being used is certainly not high-end (single processor,
IDE disks) I'm still somewhat surprised at the results considering the
small data set (~20k entries).  My experience with Oracle running a very
similar application (albeit on much higher-end machines) suggests that when
the indices are created properly, insert time is not significantly affected
by the size of the data set (because the indices are b-trees).

I can provide the rate of increase of transaction time as well as a stack
trace of the relevant blocking method if any postgres developer is
interested.  The transaction times seem to be increasing linearly rather
than logarithmically (which is what I would expect if the indices were all
b-trees and were the limiting factor).

Any information would be helpful here.

Thanks,
Sheer
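
For what it's worth, PostgreSQL indexes a primary key automatically but does
not create an index on the referencing column of a foreign key, so that
column is a common place for a missing index to hide.  A minimal sketch,
with made-up table and column names:

CREATE TABLE small (id integer PRIMARY KEY, label text);
CREATE TABLE big (id serial PRIMARY KEY,
                  ref integer REFERENCES small (id),
                  payload text);

-- small.id gets an index from the PRIMARY KEY, but big.ref gets nothing,
-- so queries and RI checks that filter on big.ref have to scan all of big
CREATE INDEX big_ref_idx ON big (ref);

-- refresh planner statistics after loading data
VACUUM ANALYZE big;
VACUUM ANALYZE small;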


Re: concurrency performance degradation

From: Sheer El-Showk
Date:
I tried disabling foreign keys to test this and I didn't see a real
performance increase.  Does setting pgenable in the pg_trigger table to
false disable all the foreign keys?

Is there any way to get a good trace of the locks or lock contention?

Thanks,
Sheer

On Fri, 26 Oct 2001, Stephan Szabo wrote:

> On Fri, 26 Oct 2001, Sheer El-Showk wrote:
>
> > Hi,
> >
> > I have an application that uses foreign keys rather heavily in postgres.
> > It supports multiple concurrent sessions with high transaction volumes.
> > Ideally the server should be able to have 20-30 connections open to the
> > database each partaking in similar operations.  Each transaction may
> > involve 20+ writes and maybe 100 reads.
> >
> > When run single-threaded, I can perform one transaction per second (30 or
> > so writes, 100 or so reads, plus time spent in the application), but if I
> > increase concurrency to about 10 active threads all doing similar
> > operations (on the same set of tables) I gain very little from it.  I
> > notice in the process listing that the postgres processes are almost all in
> > "INSERT WAITING..." or "UPDATE WAITING..." except for one which is in a
> > "SELECT" or an "INSERT".  My confusion here is that I thought postgres
> > used record-level locking, so I should be able to run multiple concurrent
> > inserts on a table without locking.
> >
> > The reason I mention foreign keys above is that I imagine that foreign
> > keys might complicate locking issues and I use them a _lot_.  I also
>
> If you're doing a lot of inserts or updates to rows that match the same row
> in the primary key table, this may be explained by the foreign key locking,
> as I believe a row lock is grabbed on the primary key row to prevent
> it from being removed by another transaction.
>


Re: concurrency performance degradation

From: Stephan Szabo
Date:
On Fri, 26 Oct 2001, Sheer El-Showk wrote:

> Hi,
>
> I have an application that uses foreign keys rather heavily in postgres.
> It supports multiple concurrent sessions with high transaction volumes.
> Ideally the server should be able to have 20-30 connections open to the
> database each partaking in similar operations.  Each transaction may
> involve 20+ writes and maybe 100 reads.
>
> When run single-threaded, I can perform one transaction per second (30 or
> so writes, 100 or so reads, plus time spent in the application), but if I
> increase concurrency to about 10 active threads all doing similar
> operations (on the same set of tables) I gain very little from it.  I
> notice in the process listing that the postgres processes are almost all in
> "INSERT WAITING..." or "UPDATE WAITING..." except for one which is in a
> "SELECT" or an "INSERT".  My confusion here is that I thought postgres
> used record-level locking, so I should be able to run multiple concurrent
> inserts on a table without locking.
>
> The reason I mention foreign keys above is that I imagine that foreign
> keys might complicate locking issues and I use them a _lot_.  I also

If you're doing a lot of inserts or updates to rows that match the same row
in the primary key table, this may be explained by the foreign key locking,
as I believe a row lock is grabbed on the primary key row to prevent
it from being removed by another transaction.
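
As far as I know, the insert-side check effectively runs something like the
following against the referenced table (a rough sketch; "small" and its key
column are placeholder names, not the exact query the trigger issues):

-- roughly what the RI trigger does for each INSERT INTO big ... with ref = 1
SELECT 1 FROM small WHERE id = 1 FOR UPDATE;
-- FOR UPDATE row locks conflict with one another, so two transactions
-- inserting rows that reference the same small row serialize here until
-- the first one commits or rolls back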


Re: concurrency performance degradation

From: Stephan Szabo
Date:
On Sat, 27 Oct 2001, Sheer El-Showk wrote:

> I tried disabling foreign keys to test this and I didn't see a real
> performance increase.  Does setting pgenable in the pg_trigger table to
> false disable all the foreign keys?

I don't believe so.  I think the only real ways to do that are to delete
them or temporarily set reltriggers to 0 in pg_class for those tables.
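
Something along these lines is what old pg_dump output does to achieve that;
it needs superuser access, and "big" below is only a placeholder table name:

-- stop all triggers (including the RI triggers) on "big" from firing
UPDATE pg_class SET reltriggers = 0 WHERE relname = 'big';

-- ... rerun the test ...

-- restore the real trigger count afterwards
UPDATE pg_class SET reltriggers = (
    SELECT count(*) FROM pg_trigger WHERE pg_class.oid = tgrelid
) WHERE relname = 'big';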

> Is there any way to get a good trace of the locks or lock contention?
There probably is, but I don't know of one.
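
For what it's worth, later releases (7.3 and up) do expose lock state through
a pg_locks system view; a quick sketch, only meaningful on such a server:

-- show lock requests that are still waiting, and the relations involved
SELECT l.pid, l.mode, l.granted, c.relname
FROM pg_locks l LEFT JOIN pg_class c ON c.oid = l.relation
WHERE NOT l.granted;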



Re: concurrency performance degradation

From: Sheer El-Showk
Date:
Unfortunately the application I was using had so many queries it was
difficult to EXPLAIN each of them.  I ended up measuring and logging the
execution time of each query in the application itself and then just
sorting by time and going in and fixing indices on the slow queries.

I am having another related issue though (and I'm trying to use EXPLAIN
VERBOSE to figure it out).  I have a large table with a foreign key into a
smaller table; I do very frequent inserts into the large table.  I think
the query optimizer is electing to use a sequential scan on the foreign key
in the smaller table since it's small, but I think this is degrading
performance as locking is worse on concurrent sequential scans (I'm not
really sure of any of this stuff, by the way).  Does anybody know how to
check if this is the case (understanding the output of EXPLAIN VERBOSE
is non-trivial) and how to force or suggest an index lookup rather than a
sequential scan?
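
One rough way to check it by hand, with placeholder table and column names,
might be to compare plans with sequential scans discouraged (assuming
enable_seqscan is the right planner toggle for this):

-- what plan does the lookup on the small table get?
EXPLAIN SELECT * FROM small WHERE id = 42;

-- discourage sequential scans and compare; this is only a planner
-- preference for the current session, not a hard rule
SET enable_seqscan TO off;
EXPLAIN SELECT * FROM small WHERE id = 42;
SET enable_seqscan TO on;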

Thanks,
Sheer

On Mon, 29 Oct 2001, Thomas Lockhart wrote:

> ...
> > The reason I mention foreign keys above is that I imagine that foreign
> > keys might complicate locking issues and I use them a _lot_.  I also
> > noticed that the database slows down significantly as it gets populated.
> > When I get to about 8000 transactions (~8000 records in my primary table)
> > it is much slower, and by 19k it takes up to 5-15 seconds to do an insert.
> > A "vacuum analyze" didn't help here.
>
> But appropriate indices should help. Updates which require sequential
> scans are expensive as tables get bigger, and the scaling problem you
> describe is the usual symptom of missing indices.
>
> Use "explain" and examine your queries to tune the performance.
>
>                      - Thomas
>


Re: concurrency performance degradation

From: Thomas Lockhart
Date:
...
> The reason I mention foreign keys above is that I imagine that foreign
> keys might complicate locking issues and I use them a _lot_.  I also
> noticed that the database slows down significantly as it gets populated.
> When I get to about 8000 transactions (~8000 records in my primary table)
> it is much slower, and by 19k it takes up to 5-15 seconds to do an insert.
> A "vacuum analyze" didn't help here.

But appropriate indices should help. Updates which require sequential
scans are expensive as tables get bigger, and the scaling problem you
describe is the usual symptom of missing indices.

Use "explain" and examine your queries to tune the performance.
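
For example, with the made-up names used earlier in the thread, the thing to
look for is the scan type EXPLAIN reports for the filtered table:

EXPLAIN SELECT * FROM big WHERE ref = 1;
-- before an index exists on big.ref the planner has no choice:
--   Seq Scan on big (cost=...)          -- reads every row; slows as big grows
-- after CREATE INDEX big_ref_idx ON big (ref) and VACUUM ANALYZE big:
--   Index Scan using big_ref_idx on big (cost=...)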

                     - Thomas

Re: concurrency performance degradation

From: Stephan Szabo
Date:
On Mon, 29 Oct 2001, Sheer El-Showk wrote:

> Unfortunately the application I was using had so many queries it was
> difficult to EXPLAIN each of them.  I ended up measuring and logging the
> execution time of each query in the application itself and then just
> sorting by time and going in and fixing indices on the slow queries.
>
> I am having another related issue though (and I'm trying to use EXPLAIN
> VERBOSE to figure it out).  I have a large table with a foreign key into a
> smaller table; I do very frequent inserts into the large table.  I think
> the query optimizer is electing to use a sequential scan on the foreign key
> in the smaller table since it's small, but I think this is degrading
> performance as locking is worse on concurrent sequential scans (I'm not
> really sure of any of this stuff, by the way).  Does anybody know how to
> check if this is the case (understanding the output of EXPLAIN VERBOSE
> is non-trivial) and how to force or suggest an index lookup rather than a
> sequential scan?

If you're inserting rows in the large table that reference the same row(s)
in the smaller table a lot, your problem is probably not the plan of the
foreign key checks, but instead the row locks taken on the matching rows.
If you can look at the ps output during these problems and you see a lot
of backends waiting, that's probably what it is.

--------------------------------------
AFAIK, if you do something like
T1: begin
T2: begin
T1: insert into big (ref) values (1);
T2: insert into big (ref) values (1);
right now the locking is such that T2 is going to wait until T1
commits or rolls back.  The lock gotten is stronger than is necessary
(it's the same as a SELECT ... FOR UPDATE which conflicts with itself,
where we really want a lock which conflicts with updates/deletes but
not itself).
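
For reference, that weaker lock did eventually appear: releases 8.1 and later
have SELECT ... FOR SHARE, and the RI triggers there use it, so concurrent
inserts referencing the same parent row no longer block one another.  Roughly:

-- shared row lock: conflicts with UPDATE/DELETE/FOR UPDATE on the row,
-- but not with other FOR SHARE locks (8.1 and later only)
SELECT 1 FROM small WHERE id = 1 FOR SHARE;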