On Wed, Sep 28, 2011 at 2:16 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> (sorry for the late reply, this fell through the cracks..)
>
> On 10.08.2011 11:48, Dave Page wrote:
>>
>> On Thu, Aug 4, 2011 at 8:30 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>>
>>> On Thu, Aug 4, 2011 at 2:19 PM, Heikki Linnakangas
>>> <heikki.linnakangas@enterprisedb.com> wrote:
>>>>
>>>> I created 100 identical pgagent jobs, with one step that simply does
>>>> "SELECT pg_sleep(10)". I then forced them all to run immediately, with
>>>> "UPDATE pgagent.pga_job SET jobnextrun=now();". pgagent crashed.
>>>>
>>>> What happened is that when all those jobs were launched at the same
>>>> time, the server ran into the max_connections limit, and pgagent didn't
>>>> handle that well. The JobThread::JobThread constructor does not check
>>>> for a NULL result from DBConn::Get(), and passes a NULL connection to
>>>> Job::Job, which tries to dereference it, leading to a segfault.
>>>>
>>>> I propose the attached patch.
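
(For anyone skimming the thread: the guard being described would look
roughly like the sketch below. The class and member names are simplified
stand-ins, not the actual pgagent sources or Heikki's attached patch.)

#include <cstdio>
#include <string>

// Stand-in for pgagent's connection factory; here it simulates the failure
// mode from the report (server already at max_connections).
struct DBConn
{
    static DBConn *Get(const std::string &) { return NULL; }
};

class JobThread
{
public:
    JobThread(const std::string &connstr)
        : m_conn(DBConn::Get(connstr)), m_runnable(false)
    {
        // The proposed guard: don't hand a NULL connection to the Job
        // constructor, which would dereference it and segfault.
        if (m_conn == NULL)
        {
            fprintf(stderr, "could not open a connection for the job thread\n");
            return;
        }
        m_runnable = true;
        // ... construct the Job with a valid connection here ...
    }

    bool runnable() const { return m_runnable; }

private:
    DBConn *m_conn;
    bool m_runnable;
};

int main()
{
    JobThread t("host=localhost dbname=postgres");
    // The launcher can check this flag and skip or retry the job instead of
    // taking the whole daemon down.
    return t.runnable() ? 0 : 1;
}

The point is just that the caller ends up with something it can act on
instead of a crash.
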
>>>
>>> Hm, in the event that happens, is that logged on the client side somehow?
>>> Wouldn't you want to throw an exception or something like that?
>>
>> I think the most straightforward way to handle this is to dump an
>> error into pgagent.pga_joblog when deleting the thread. Might be a
>> little ugly to pass the original error message back rather than a
>> generic one though. Can you take a look, Heikki?
>
> You mean something like the attached? Works for me, but inserting an entry
> in joblog for each failed attempt might create a lot of entries there, if
> the problem persists.
Is it correct behavior to throw an unhandled SQL exception if the logging
query itself fails?
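
Something along the lines of the sketch below is what I'd expect: catch an
error from the logging query and downgrade it, rather than letting it escape
as an unhandled exception. (Sketch only: ExecQuery is a stand-in for whatever
the connection wrapper exposes, and the pga_joblog column list and the 'i'
status code are from memory, not taken from the attached patch, so they'd
need checking against the real pgagent schema.)

#include <cstdio>
#include <sstream>
#include <stdexcept>
#include <string>

// Stand-in executor; here it simulates the logging query failing as well
// (for example because we still can't get a working connection).
static void ExecQuery(const std::string &sql)
{
    (void) sql;
    throw std::runtime_error("could not execute query");
}

// Record the failed attempt in the job history, but never let a failure of
// the logging query itself escape as an unhandled exception.
static void LogConnectionFailure(long jobid)
{
    // NOTE: column names and the 'i' (internal error) status are
    // assumptions for illustration, not the actual patch.
    std::ostringstream sql;
    sql << "INSERT INTO pgagent.pga_joblog (jlgjobid, jlgstatus) "
        << "VALUES (" << jobid << ", 'i')";
    try
    {
        ExecQuery(sql.str());
    }
    catch (const std::exception &e)
    {
        // Report it and carry on; the daemon keeps running either way.
        fprintf(stderr, "failed to write to pga_joblog: %s\n", e.what());
    }
}

int main()
{
    LogConnectionFailure(42);
    return 0;
}
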
merlin