Re: Postgresql the right tool (queue using advisory_locks + long transactions) - Mailing list pgsql-general

From David G Johnston
Subject Re: Postgresql the right tool (queue using advisory_locks + long transactions)
Date
Msg-id 1398629225770-5801670.post@n5.nabble.com
In response to Postgresql the right tool (queue using advisory_locks + long transactions)  (Dorian Hoxha <dorian.hoxha@gmail.com>)
Responses Re: Re: Postgresql the right tool (queue using advisory_locks + long transactions)
List pgsql-general
Dorian Hoxha wrote
> Hi list,
>
> I am trying to use postgresql as a queue for long-jobs (max ~4 hours)
> using
> advisory_locks. I can't separate the long-job into sub-jobs.
>
>
>    1. In the ultimate best-case scenario there will be ~100 workers, so no
>    web-scale performance is required.
>    Is there a problem with 100 sessions (behind a connection pooler?)
>    staying open for hours, as in my case?
>    The session will commit the progress of the job, but will stay open
>    and hold the advisory_lock until the job is done or it expires (see 2).
>    2. Is it possible to set a time limit to auto-close an open session
>    that hasn't made a query in a while?
>    So if a worker fails and the session times out, postgresql/pgbouncer
>    will close the session and release the lock?
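
For reference, the pattern being proposed is roughly the following (a
sketch; using the job id as the lock key is my illustration, not from the
original post):

    -- worker tries to claim job 17 without blocking; an advisory lock
    -- taken this way is tied to the session, not to any transaction
    SELECT pg_try_advisory_lock(17);   -- returns true if the claim succeeded

    -- ... work for up to ~4 hours, committing progress along the way ...

    -- release explicitly when done (it is also released automatically
    -- if the session disconnects)
    SELECT pg_advisory_unlock(17);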

Do you understand the difference between a session and a transaction?  Your
subject and body suggest either that you don't, or that there is a
thinko/typo somewhere.  Long-running transactions (idle or otherwise) are a
major problem, while long-running idle sessions are merely marginally
wasteful (in moderation).

Why not just mark the job as dispatched when a worker picks it up, and as
completed when it finishes?  You would still need to handle partial
completion somehow, but this way you don't tie up live resources during the
long wait simply to monitor a lock.  Though at your volume this is probably
not a big issue.
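
A minimal sketch of that status-column approach, assuming a hypothetical
jobs table (all names here are illustrative, not from the original post):

    -- hypothetical jobs table
    CREATE TABLE jobs (
        id         bigserial PRIMARY KEY,
        payload    text NOT NULL,
        status     text NOT NULL DEFAULT 'pending',  -- pending | dispatched | completed
        worker_id  text,
        updated_at timestamptz NOT NULL DEFAULT now()
    );

    -- worker records that it has picked up job 17
    UPDATE jobs
       SET status = 'dispatched', worker_id = 'worker-42', updated_at = now()
     WHERE id = 17 AND status = 'pending';

    -- worker records completion; anything left in 'dispatched' too long
    -- is a candidate for partial-completion recovery
    UPDATE jobs
       SET status = 'completed', updated_at = now()
     WHERE id = 17;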

PostgreSQL proper does not have session timeouts that I am aware of.  If a
worker process dies, its connection is closed and all of its advisory locks
are released...
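
For workers that hang while keeping their connection alive, one workaround
is an external watchdog run periodically (e.g. from cron); a sketch, with
the five-hour cutoff chosen arbitrarily:

    -- terminate sessions that have been idle for more than five hours;
    -- terminating a backend releases its advisory locks
    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE state = 'idle'
       AND state_change < now() - interval '5 hours'
       AND pid <> pg_backend_pid();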

Your reference to a connection pooler above doesn't make sense to me; you'd
need persistent connections for this to work (they can come from a pool, but
the pool size would have to be 100+).

The main decision is whether job-processing state is part of your data
model or simply an artifact.  I'd suggest that it should be part of the
model, so state should be managed directly through row-level locking and
status fields rather than indirectly through transient system advisory
locks.
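
A sketch of claiming a job with row-level locking rather than an advisory
lock, reusing the hypothetical jobs table from above:

    -- claim the oldest pending job; FOR UPDATE in the subquery locks the
    -- row so concurrent workers cannot claim it at the same time
    UPDATE jobs
       SET status = 'dispatched', worker_id = 'worker-42', updated_at = now()
     WHERE id = (SELECT id FROM jobs
                  WHERE status = 'pending'
                  ORDER BY id
                  LIMIT 1
                    FOR UPDATE)
    RETURNING id, payload;

    -- if another worker claimed the row first, this returns no rows and
    -- the worker should simply retry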

If you want to keep track of active workers you should set up some kind of
heartbeat update query; and, depending on how important this is, maybe
attach the advisory-lock info to that heartbeat record so a monitoring
process can check both the pulse table and the system advisory lock for a
two-data-point confirmation of activity.
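
A sketch of such a heartbeat, assuming a hypothetical worker_heartbeat
table and advisory-lock keys small enough to fit in 32 bits (larger 64-bit
keys are split across pg_locks.classid and pg_locks.objid):

    CREATE TABLE worker_heartbeat (
        worker_id    text PRIMARY KEY,
        advisory_key bigint,             -- key passed to pg_advisory_lock()
        beat_at      timestamptz NOT NULL DEFAULT now()
    );

    -- worker refreshes its pulse every minute or so
    UPDATE worker_heartbeat
       SET beat_at = now(), advisory_key = 17
     WHERE worker_id = 'worker-42';

    -- monitor cross-checks the pulse against the lock table
    SELECT h.worker_id,
           h.beat_at > now() - interval '5 minutes' AS pulse_ok,
           l.pid IS NOT NULL                        AS lock_held
      FROM worker_heartbeat h
      LEFT JOIN pg_locks l
        ON l.locktype = 'advisory'
       AND l.objid = h.advisory_key;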

David J.
