Re: maintaining a reference to a fetched row - Mailing list pgsql-performance

From Brian Karlak
Subject Re: maintaining a reference to a fetched row
Date
Msg-id 82545852-CE9E-4B0D-ABAB-B628C888D086@metaweb.com
Whole thread Raw
In response to Re: maintaining a reference to a fetched row  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: maintaining a reference to a fetched row
Re: maintaining a reference to a fetched row
List pgsql-performance

On Nov 4, 2009, at 8:47 AM, Jeff Janes wrote:

Worker daemons do a bounded, ordered, limited SELECT to grab a row, which
they lock by setting a value in the queue.status column.

So you do a select, and then an update?

I do a select for update in a stored proc:

FOR queue_item IN  
  SELECT *  FROM queue
   WHERE status IS NULL AND id >= low_bound_id
   ORDER BY id LIMIT batch_size
     FOR UPDATE
LOOP
  UPDATE queue_proc set status = 'proc' where id = queue_item.id ;

The daemons keep track of their last position in the queue with low_bound_id.  Also, as you probably notice, I also fetch a batch of (100) items at a time.  In practice, it's pretty fast.  The job I'm running now is showing an average fetch time of 30ms per 100 actions, which ain't bad.

However, the writing of results back to the row takes ~5ms, which is slower
than I'd like.

It seems you have an select, and update, and another update.  Where in
this process do you commit?  Are you using fsync=off or
synchronous_commit=off?

First commit occurs after the stored proc to select/update a batch of items is complete.  Second commit occurs on the writing of results back for each particular action.  Two commits are required because the time it takes to complete the intervening action can vary wildly: anywhere between 20ms and 45min.

It seems that this is because I need to to do an index scan
on the queue table to find the row I just fetched.

Why would the index scan take 1 ms two of the times it is done but 5ms
the third time?  Isn't it the same index scan each time?  Or does the
change in queue.status change the plan?

The final update is a different query -- just a plain old update by ID:

UPDATE queue_proc set status = 'proc' where id = %s ;

This update by ID takes ~2.5ms, which means it's where the framework is spending most of its overhead.

Brian 

pgsql-performance by date:

Previous
From: Brian Karlak
Date:
Subject: Re: maintaining a reference to a fetched row
Next
From: Tom Lane
Date:
Subject: Re: maintaining a reference to a fetched row