Thread: Message queue table..

Message queue table..

From
Jesper Krogh
Date:
Hi.

I have this "message queue" table.. currently with 8m+ records. Picking
the top priority messages seem to take quite long.. it is just a matter
of searching the index.. (just as explain analyze tells me it does).

Can anyone digest further optimizations out of this output? (All records
have funcid=4)

# explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey,
job.insert_time, job.run_after, job.grabbed_until, job.priority,
job.coalesce FROM workqueue.job  WHERE (job.funcid = 4) AND
(job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND
(job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1
;

    QUERY PLAN


--------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.09 rows=1 width=106) (actual
time=245.273..245.274 rows=1 loops=1)
    ->  Index Scan using workqueue_job_funcid_priority_idx on job
(cost=0.00..695291.80 rows=8049405 width=106) (actual
time=245.268..245.268 rows=1 loops=1)
          Index Cond: (funcid = 4)
          Filter: ((run_after <= 1208442668) AND (grabbed_until <=
1208442668) AND ("coalesce" = 'Efam'::text))
  Total runtime: 245.330 ms
(5 rows)

--
Jesper

Re: Message queue table..

From
Craig Ringer
Date:
Jesper Krogh wrote:
>
> Hi.
>
> I have this "message queue" table.. currently with 8m+ records. Picking
> the top priority messages seem to take quite long.. it is just a matter
> of searching the index.. (just as explain analyze tells me it does).
>
> Can anyone digest further optimizations out of this output? (All records
> have funcid=4)

You mean all records of interest, right, not all records in the table?

What indexes do you have in place? What's the schema? Can you post a "\d
tablename" from psql?

> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey,
> job.insert_time, job.run_after, job.grabbed_until, job.priority,
> job.coalesce FROM workqueue.job  WHERE (job.funcid = 4) AND
> (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND
> (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1
> ;
>
>    QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------

>
>  Limit  (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274
> rows=1 loops=1)
>    ->  Index Scan using workqueue_job_funcid_priority_idx on job
> (cost=0.00..695291.80 rows=8049405 width=106) (actual
> time=245.268..245.268 rows=1 loops=1)
>          Index Cond: (funcid = 4)
>          Filter: ((run_after <= 1208442668) AND (grabbed_until <=
> 1208442668) AND ("coalesce" = 'Efam'::text))
>  Total runtime: 245.330 ms
> (5 rows)

Without seeing the schema and index definitions ... maybe you'd benefit
from a multiple column index. I'd experiment with an index on
(funcid,priority) first.

--
Craig Ringer

Re: Message queue table..

From
Jesper Krogh
Date:
Craig Ringer wrote:
> Jesper Krogh wrote:
>>
>> Hi.
>>
>> I have this "message queue" table.. currently with 8m+ records.
>> Picking the top priority messages seem to take quite long.. it is just
>> a matter of searching the index.. (just as explain analyze tells me it
>> does).
>>
>> Can anyone digest further optimizations out of this output? (All
>> records have funcid=4)
>
> You mean all records of interest, right, not all records in the table?

Actually all the records.. since all the other virtual queues currently
are empty.

> What indexes do you have in place? What's the schema? Can you post a "\d
> tablename" from psql?
>
>> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey,
>> job.insert_time, job.run_after, job.grabbed_until, job.priority,
>> job.coalesce FROM workqueue.job  WHERE (job.funcid = 4) AND
>> (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668)
>> AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1

I found that removing the funcid from the order by made it use a better
index. (priority, run_after, grabbed_until)  that probably makes sense
since the funcid doesnt give any value in the index at all.

thanks for leading me back on track.

Jesper

--
Jesper

Re: Message queue table..

From
Tom Lane
Date:
Jesper Krogh <jesper@krogh.cc> writes:
> I have this "message queue" table.. currently with 8m+ records. Picking
> the top priority messages seem to take quite long.. it is just a matter
> of searching the index.. (just as explain analyze tells me it does).

>   Limit  (cost=0.00..0.09 rows=1 width=106) (actual
> time=245.273..245.274 rows=1 loops=1)
>     ->  Index Scan using workqueue_job_funcid_priority_idx on job
> (cost=0.00..695291.80 rows=8049405 width=106) (actual
> time=245.268..245.268 rows=1 loops=1)
>           Index Cond: (funcid = 4)
>           Filter: ((run_after <= 1208442668) AND (grabbed_until <=
> 1208442668) AND ("coalesce" = 'Efam'::text))
>   Total runtime: 245.330 ms

Well, what that's doing in English is: scan all the rows with funcid =
4, in priority order, until we hit the first one satisfying the filter
conditions.  Apparently there are a lot of low-priority rows that have
funcid = 4 but not the other conditions.

If it's the "coalesce" condition that's the problem, an index on
(funcid, coalesce, priority) --- or (coalesce, funcid, priority) ---
would probably help.  I'm not sure there's a simple fix if it's
the other conditions that are really selective.

            regards, tom lane

Re: Message queue table..

From
Chris Browne
Date:
jesper@krogh.cc (Jesper Krogh) writes:
> I have this "message queue" table.. currently with 8m+
> records. Picking the top priority messages seem to take quite
> long.. it is just a matter of searching the index.. (just as explain
> analyze tells me it does).
>
> Can anyone digest further optimizations out of this output? (All
> records have funcid=4)
>
> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey,
> job.insert_time, job.run_after, job.grabbed_until, job.priority,
> job.coalesce FROM workqueue.job  WHERE (job.funcid = 4) AND
> (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668)
> AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1
> ;

There might be value in having one or more extra indices...

Here are *plausible* candidates:

1.  If "funcid = 4" is highly significant (e.g. - you are always
running this query, and funcid often <> 4), then you might add a
functional index such as:

  create index job_funcid_run_after on workqueue.job (run_after) where funcid = 4;
  create index job_funcid_grabbeduntil on workqueue.job (grabbed_until) where funcid = 4;

2.  Straight indices like the following:

   create index job_run_after on workqueue.job(run_after);
   create index job_grabbed_until on workqueue.job(grabbed_until);
   create index job_funcid on workqueue.job(funcid);
   create index job_coalesce on workqueue.job(coalesce);

Note that it is _possible_ (though by no means guaranteed) that all
three might prove useful, if you're running 8.1+ where PostgreSQL
supports bitmap index scans.

Another possibility...

3.  You might change your process to process multiple records in a
"run" so that you might instead run the query (perhaps via a cursor?)

with LIMIT [Something Bigger than 1].

It does seem mighty expensive to run a 245ms query to find just one
record.  It seems quite likely that you could return the top 100 rows
(LIMIT 100) without necessarily finding it runs in any more time.

Returning 100 tuples in 245ms seems rather more acceptable, no?  :-)
--
(format nil "~S@~S" "cbbrowne" "linuxfinances.info")
http://linuxdatabases.info/info/linuxdistributions.html
Rules of the Evil Overlord #32. "I will not fly into a rage and kill a
messenger who brings me bad news  just to illustrate how evil I really
am. Good messengers are hard to come by."
<http://www.eviloverlord.com/>