Thread: Message queue table..
Hi. I have this "message queue" table.. currently with 8m+ records. Picking the top priority messages seem to take quite long.. it is just a matter of searching the index.. (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records have funcid=4) # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, job.insert_time, job.run_after, job.grabbed_until, job.priority, job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1 ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274 rows=1 loops=1) -> Index Scan using workqueue_job_funcid_priority_idx on job (cost=0.00..695291.80 rows=8049405 width=106) (actual time=245.268..245.268 rows=1 loops=1) Index Cond: (funcid = 4) Filter: ((run_after <= 1208442668) AND (grabbed_until <= 1208442668) AND ("coalesce" = 'Efam'::text)) Total runtime: 245.330 ms (5 rows) -- Jesper
Jesper Krogh wrote: > > Hi. > > I have this "message queue" table.. currently with 8m+ records. Picking > the top priority messages seem to take quite long.. it is just a matter > of searching the index.. (just as explain analyze tells me it does). > > Can anyone digest further optimizations out of this output? (All records > have funcid=4) You mean all records of interest, right, not all records in the table? What indexes do you have in place? What's the schema? Can you post a "\d tablename" from psql? > # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, > job.insert_time, job.run_after, job.grabbed_until, job.priority, > job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND > (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) AND > (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1 > ; > > QUERY PLAN > -------------------------------------------------------------------------------------------------------------------------------------------------------------- > > Limit (cost=0.00..0.09 rows=1 width=106) (actual time=245.273..245.274 > rows=1 loops=1) > -> Index Scan using workqueue_job_funcid_priority_idx on job > (cost=0.00..695291.80 rows=8049405 width=106) (actual > time=245.268..245.268 rows=1 loops=1) > Index Cond: (funcid = 4) > Filter: ((run_after <= 1208442668) AND (grabbed_until <= > 1208442668) AND ("coalesce" = 'Efam'::text)) > Total runtime: 245.330 ms > (5 rows) Without seeing the schema and index definitions ... maybe you'd benefit from a multiple column index. I'd experiment with an index on (funcid,priority) first. -- Craig Ringer
Craig Ringer wrote: > Jesper Krogh wrote: >> >> Hi. >> >> I have this "message queue" table.. currently with 8m+ records. >> Picking the top priority messages seem to take quite long.. it is just >> a matter of searching the index.. (just as explain analyze tells me it >> does). >> >> Can anyone digest further optimizations out of this output? (All >> records have funcid=4) > > You mean all records of interest, right, not all records in the table? Actually all the records.. since all the other virtual queues currently are empty. > What indexes do you have in place? What's the schema? Can you post a "\d > tablename" from psql? > >> # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, >> job.insert_time, job.run_after, job.grabbed_until, job.priority, >> job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND >> (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) >> AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1 I found that removing the funcid from the order by made it use a better index. (priority, run_after, grabbed_until) that probably makes sense since the funcid doesnt give any value in the index at all. thanks for leading me back on track. Jesper -- Jesper
Jesper Krogh <jesper@krogh.cc> writes: > I have this "message queue" table.. currently with 8m+ records. Picking > the top priority messages seem to take quite long.. it is just a matter > of searching the index.. (just as explain analyze tells me it does). > Limit (cost=0.00..0.09 rows=1 width=106) (actual > time=245.273..245.274 rows=1 loops=1) > -> Index Scan using workqueue_job_funcid_priority_idx on job > (cost=0.00..695291.80 rows=8049405 width=106) (actual > time=245.268..245.268 rows=1 loops=1) > Index Cond: (funcid = 4) > Filter: ((run_after <= 1208442668) AND (grabbed_until <= > 1208442668) AND ("coalesce" = 'Efam'::text)) > Total runtime: 245.330 ms Well, what that's doing in English is: scan all the rows with funcid = 4, in priority order, until we hit the first one satisfying the filter conditions. Apparently there are a lot of low-priority rows that have funcid = 4 but not the other conditions. If it's the "coalesce" condition that's the problem, an index on (funcid, coalesce, priority) --- or (coalesce, funcid, priority) --- would probably help. I'm not sure there's a simple fix if it's the other conditions that are really selective. regards, tom lane
jesper@krogh.cc (Jesper Krogh) writes: > I have this "message queue" table.. currently with 8m+ > records. Picking the top priority messages seem to take quite > long.. it is just a matter of searching the index.. (just as explain > analyze tells me it does). > > Can anyone digest further optimizations out of this output? (All > records have funcid=4) > > # explain analyze SELECT job.jobid, job.funcid, job.arg, job.uniqkey, > job.insert_time, job.run_after, job.grabbed_until, job.priority, > job.coalesce FROM workqueue.job WHERE (job.funcid = 4) AND > (job.run_after <= 1208442668) AND (job.grabbed_until <= 1208442668) > AND (job.coalesce = 'Efam') ORDER BY funcid, priority ASC LIMIT 1 > ; There might be value in having one or more extra indices... Here are *plausible* candidates: 1. If "funcid = 4" is highly significant (e.g. - you are always running this query, and funcid often <> 4), then you might add a functional index such as: create index job_funcid_run_after on workqueue.job (run_after) where funcid = 4; create index job_funcid_grabbeduntil on workqueue.job (grabbed_until) where funcid = 4; 2. Straight indices like the following: create index job_run_after on workqueue.job(run_after); create index job_grabbed_until on workqueue.job(grabbed_until); create index job_funcid on workqueue.job(funcid); create index job_coalesce on workqueue.job(coalesce); Note that it is _possible_ (though by no means guaranteed) that all three might prove useful, if you're running 8.1+ where PostgreSQL supports bitmap index scans. Another possibility... 3. You might change your process to process multiple records in a "run" so that you might instead run the query (perhaps via a cursor?) with LIMIT [Something Bigger than 1]. It does seem mighty expensive to run a 245ms query to find just one record. It seems quite likely that you could return the top 100 rows (LIMIT 100) without necessarily finding it runs in any more time. Returning 100 tuples in 245ms seems rather more acceptable, no? :-) -- (format nil "~S@~S" "cbbrowne" "linuxfinances.info") http://linuxdatabases.info/info/linuxdistributions.html Rules of the Evil Overlord #32. "I will not fly into a rage and kill a messenger who brings me bad news just to illustrate how evil I really am. Good messengers are hard to come by." <http://www.eviloverlord.com/>