Re: [pgAgent][Patch] Fixing connection pool leak - Mailing list pgadmin-hackers

From Rob Emery
Subject Re: [pgAgent][Patch] Fixing connection pool leak
Date
Msg-id CAPCETpvGF31F74TwE=VdnqDY-JwTyF4pdQYftMs1G8HVSb_y1w@mail.gmail.com
Whole thread Raw
In response to [pgAgent][Patch] Fixing connection pool leak  (Rob Emery <re-pgsql@codeweavers.net>)
List pgadmin-hackers
Hello again,

I can't see any other way of solving this problem without either exposing a method on JobThread to delete the Job owned by the thread which is called by https://github.com/postgres/pgagent/blob/master/pgAgent.cpp#L139 or another method that only gives up the connection which is called in the same place as the delete jt.

Is there a problem with chaining the ~JobThread() and ~Job() together as I suggested previously? This does solve the leak.

Many Thanks,
Rob

On 23 October 2017 at 20:09, Rob Emery <re-pgsql@codeweavers.net> wrote:
Hiya,

I've just confirmed neither of these patches resolve the issue, it appears in the error case I'm experiencing the JobThread::Entry() never executes.

To clarify I'm talking about the path entering the else in pgAgent.cpp:125 which will save into pga_joblog with jlgstatus = "i"

This path explicitly deletes jt however deleting jt doesn't clean up the connection, hence why I cascaded the delete in the original patch.

I hope that makes sense

Thanks,
Rob

On 23 October 2017 at 08:18, Neel Patel <neel.patel@enterprisedb.com> wrote:
Hi Ashesh,

I just added condition before delete the job otherwise it looks good. Correct me if I am wrong.

if (job != NULL)
{
    delete job;
    job = NULL;
}

I have created two instances of pgagent on database cluster. As of now not able to see any leak and keep you updated if found anything.

@Robert - Can you also test at your environment and keep us updated for any leak ?

Thanks,
Neel Patel

On Mon, Oct 23, 2017 at 10:30 AM, Ashesh Vashi <ashesh.vashi@enterprisedb.com> wrote:
Hi Rob,

How about this?


--

Thanks & Regards,

Ashesh Vashi
EnterpriseDB INDIA: Enterprise PostgreSQL Company


http://www.linkedin.com/in/asheshvashi


On Sat, Oct 21, 2017 at 8:36 PM, Rob Emery <re-pgsql@codeweavers.net> wrote:
Hi,

Following on from https://www.postgresql.org/message-id/CA%2BOCxoz4tONxSpd1rdU-9SPKRzucz8Bar2CXkEDnCwV6H77ZyA%40mail.gmail.com

I think I've identified and fixed the issue, please see the patch attached.

As I understand it when there are multiple pgagent instances and they clash executing a job (i.e rc != 1 on job.cpp:38), the loser of the conflict's thread will never be executed (i.e. job.cpp:418 JobThread::Entry), which is responsible for deleting the job owned by the thread, meaning that the connection is never returned to the pool. By moving the delete of the job into the destructor, we can assure that the connection is tidied up in both cases as the thread is deleted in the error case explicitly in pgAgent.cpp:185.

The only possibly unintended difference that I can see with doing this is that the log "Completed job: %s" is now output when before it wasn't, however I think this new behaviour is actually correct as the job object is completed at that time.

Thanks,
Rob






Phone: 0800 021 0888   Email: contactus@codeweavers.net
Codeweavers Ltd | Barn 4 | Dunston Business Village | Dunston | ST18 9AB
Registered in England and Wales No. 04092394 | VAT registration no. 974 9705 63

--
Robert Emery
Infrastructure Director

E: robertemery@codeweavers.net | T: 01785 711633 | W: www.codeweavers.net



--






Phone: 0800 021 0888   Email: contactus@codeweavers.net
Codeweavers Ltd | Barn 4 | Dunston Business Village | Dunston | ST18 9AB
Registered in England and Wales No. 04092394 | VAT registration no. 974 9705 63 


    https://plus.google.com/b/105942302039373248738/+CodeweaversNet  https://twitter.com/CodeweaversTeam

pgadmin-hackers by date:

Previous
From: pgAdmin 4 Jenkins
Date:
Subject: Build failed in Jenkins: pgadmin4-master-python27 #360
Next
From: Ashesh Vashi
Date:
Subject: Re: [pgAgent][Patch] Fixing connection pool leak