
From: Gunther
Subject: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Msg-id: 33bae6e3-1c5a-77f0-ee99-408fbf0fbb83@gusw.net
List: pgsql-performance

Hi,

I am using an SQL queue for distributing work to massively parallel workers. Workers come in servers with 12 parallel threads each. One such worker set handles 7 transactions per second. If I add a second one, for 24 parallel workers, it scales to 14/s. I can even add a third, for 36 parallel workers, and reach 21/s. If I try a fourth set, 48 workers, I end up in trouble. But that isn't even my main problem; the real problem is that in a short time the performance deteriorates, and it looks like that is because the queue index deteriorates and needs a REINDEX. The queue table is essentially this:

CREATE TABLE Queue (
  jobId   bigint,
  action  text,
  pending boolean,
  result  text
);

The dequeue operation is essentially this:

BEGIN;

SELECT jobId, action
  FROM Queue
 WHERE pending
   FOR UPDATE SKIP LOCKED;

which is a wonderful concept, thanks to SKIP LOCKED.
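
Spelled out a bit more, each thread claims one job at a time, roughly like this (the ORDER BY and LIMIT 1 here are just illustrative, not necessarily the exact production query):

BEGIN;

SELECT jobId, action
  FROM Queue
 WHERE pending
 ORDER BY jobId
 LIMIT 1
   FOR UPDATE SKIP LOCKED;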

Then I perform the action and finally:

UPDATE Queue
   SET pending = false,
       result = ?
 WHERE jobId = ?;

COMMIT;

I thought that, to keep my index tight, I would define it like this:

CREATE UNIQUE INDEX Queue_idx_pending ON Queue(jobId) WHERE pending;

so that only pending jobs are in that index.

When a job is done, follow-up work is often inserted into the Queue as pending, thus adding to that index.
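
That follow-up insertion is just an ordinary INSERT of a new pending row, something like this (the sequence for jobId is illustrative, not my actual generator):

-- hypothetical follow-up insertion; the sequence name is illustrative
INSERT INTO Queue (jobId, action, pending)
VALUES (nextval('queue_jobid_seq'), 'follow-up-action', true);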

Below is the performance chart (attached as an image). The blue line at the bottom is the db server.

You can see the orange line is the first worker server with 12 threads. It settled into a steady state of 7/s and ran at 90% CPU for some 30 min; then the database CPU% started climbing. I tried to rebuild the indexes on the queue and got stuck: no jobs were processing, yet the exclusive lock that REINDEX needs was never obtained, however long I waited. So I shut down the worker server; with the database quiet I could resolve the messed-up indexes and restart.

Soon after, around 19:15, I added a second worker server (green line). Once settled in, the two were pulling 14/s together, but you can see that in just 15 min the db server CPU% climbed again to over 40% and the performance of the workers dropped, their load falling to 30%.

At around 19:30 I stopped them all, REINDEXed the queue table, and then started 3 worker servers simultaneously. They settled in at 21/s, but in just 10 min the deterioration happened again.

Again I stopped them all, REINDEXed, and started 4 worker servers (48 threads). This time 5 min was not enough to ever see them settle into a decent 28/s transaction rate; I guess they might have reached it for a minute or two, only for the index to deteriorate again. I did one more stop, then started only 2 servers, and again the index soon deteriorated.

Clearly that index is deteriorating quickly, in about 10,000 transactions.

BTW: when I said 7/s, it is in reality about 4 times as many transactions, because of the follow-up jobs that also get added to this queue. So those 10,000 transactions may really be 30 or 40 k transactions before the index deteriorates.

Do you have any suggestion for how I can smoothly avoid this index deterioration problem?

I figured I might just pause all workers briefly to schedule the REINDEX Queue command, but the problem with this is that, while the transaction volume is large, some jobs may take minutes to process. In that case we would need to wait minutes to quiet the database, with 47 workers sitting as idle capacity while waiting for the 48th to finish, just so the index can be rebuilt!

Of course I tried to resolve the issue with vacuumdb --analyze (just in case the autovacuum doesn't act in time), and that doesn't help. vacuumdb --full --analyze would probably help, but can't work because it requires an exclusive table lock. I tried to just create a new index with the same definition and then drop the old one:

CREATE UNIQUE INDEX Queue_idx2_pending ON Queue(jobId) WHERE pending;
DROP INDEX Queue_idx_pending;
ANALYZE Queue; 

but with that I got completely stuck: I ended up with two indexes and could not drop either of them because of those locking issues. And REINDEX gives me a deadlock error outright.
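
Perhaps the CONCURRENTLY variants would sidestep those locks; I have not verified this under my workload, but the swap would look roughly like this (note that neither statement can run inside a transaction block):

-- build the replacement without taking an exclusive lock;
-- if this fails it leaves an INVALID index that must be dropped
CREATE UNIQUE INDEX CONCURRENTLY Queue_idx2_pending
    ON Queue(jobId) WHERE pending;

-- drop the bloated original without blocking readers or writers
DROP INDEX CONCURRENTLY Queue_idx_pending;

ANALYZE Queue;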

I am looking for a way to manage that index so that it does not deteriorate.

Maybe if I did not define it with

... WHERE pending;

then it would only grow, but never shrink. Maybe that helps somehow? I doubt it though: adding to an index also causes deterioration, and most of the rows would be irrelevant because they would be past work. That non-partial variant is sketched below, but it would be nicer if there were another smooth way.
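
For concreteness, this is roughly what I mean (the composite index is just one possibility, not something I have tried):

-- plain, non-partial unique index: keeps every jobId ever queued
CREATE UNIQUE INDEX Queue_idx_pending ON Queue(jobId);

-- or a composite that still lets scans find the pending rows quickly
CREATE INDEX Queue_idx_pending_scan ON Queue(pending, jobId);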

regards,
-Gunther


