Thread: Problem with psycopg2 and asyncio
Hi all
I know that psycopg2 has some support for asyncio, and that there is a package ‘aiopg’ to assist with this, but my question has nothing to do with either of these.
I am writing a server-side program in a client-server environment. I have chosen to support three databases – sqlite3 using Python’s built-in module, SQL Server using pyodbc, and PostgreSQL using psycopg2.
I have been using asyncio for some time on the network side, and it is working well. I have recently turned my attention to ensuring that database calls do not block the event loop. I want to keep my code as generic as possible across the three databases, so I am looking for a solution that will work with all three.
The recommended approach is to use ‘run_in_executor()’, but I did not want to do that because, AFAICT, you would have to use cur.fetchall(), and I would prefer to iterate over the cursor. I came up with a solution that seems to work well.
I run each database connection in its own thread, with its own queue.Queue() as a request queue. When I want to issue a command, I create an instance of an asyncio.Queue() as a return queue, and make a tuple of the command and the return queue. I ‘put’ the tuple on the request queue, and ‘await’ the return queue. The connection ‘gets’ the tuple, executes the command, iterates over the cursor, and ‘puts’ the rows retrieved on the return queue in blocks of 50.
The theory is that the issuer of the command will block while it awaits the response, but the main event loop will not be blocked, so no other users should experience any delays.
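In outline, the pattern looks something like this (a trimmed sketch with my own names, not the actual program; note that asyncio.Queue() is not thread-safe, so the worker hands each block back to the loop via call_soon_threadsafe()):

    import asyncio
    import queue
    import threading

    BLOCK = 50  # rows per block placed on the return queue

    def db_worker(conn, request_q, loop):
        # one of these runs per connection, in its own thread; conn is any
        # DB-API connection, loop is the event loop in the main thread
        while True:
            sql, return_q = request_q.get()
            if sql is None:      # shutdown sentinel
                break
            cur = conn.cursor()
            cur.execute(sql)
            block = []
            for row in cur:
                block.append(row)
                if len(block) == BLOCK:
                    # asyncio.Queue is not thread-safe, so hand the block
                    # over to the event loop's own thread
                    loop.call_soon_threadsafe(return_q.put_nowait, block)
                    block = []
            loop.call_soon_threadsafe(return_q.put_nowait, block)  # final partial block
            loop.call_soon_threadsafe(return_q.put_nowait, None)   # end-of-rows marker

    async def fetch_all_rows(request_q, sql):
        # runs on the event loop: issue the command, then consume blocks
        # of rows as the worker produces them
        return_q = asyncio.Queue()
        request_q.put((sql, return_q))
        rows = []
        while True:
            block = await return_q.get()
            if block is None:
                return rows
            rows.extend(block)

    # setup (sketch):
    # request_q = queue.Queue()
    # threading.Thread(target=db_worker,
    #                  args=(conn, request_q, asyncio.get_event_loop())).start()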
The theory works with sqlite3 and with pyodbc, but for some reason it does not work with psycopg2. The main event loop experiences noticeable delays while the connection is retrieving the rows, even though it is running in a different thread.
I have written a program that demonstrates this, but it is 135 lines long. I can post it if required, but I thought I would ask the question first to see if this is a known issue.
I am using psycopg2 2.6.1 and python 3.5 on Fedora 22.
Thanks
Frank Millman
As far as I know, iterating over the cursor will still "fetchall" the rows (or worse, fetchone?), unless you're working with server-side cursors.
Does the same delay happen even when doing fetchall?

On Sun, Feb 28, 2016 at 8:53 AM, Frank Millman <frank@chagford.com> wrote:
[...]
On 28/02/16 08:53, Frank Millman wrote:
> I know that psycopg2 has some support for asyncio, and that there is a
> package ‘aiopg’ to assist with this, but my question has nothing to do
> with either of these.
[...]
> I have written a program that demonstrates this, but it is 135 lines
> long. I can post it if required, but I thought I would ask the question
> first to see if this is a known issue.

If the fetch runs in a different thread, the only reason for the main loop to experience noticeable delays is that you're fetching a lot of data, hogging CPU and memory. Try using a server-side cursor:

http://initd.org/psycopg/docs/usage.html#server-side-cursors

federico

--
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it

The devil speaks truth much oftener than he's deemed. He has an
ignorant audience. -- Byron (suggested by Alice Fontana)
On 29/02/16 10:17, federico wrote:
> On 28/02/16 08:53, Frank Millman wrote:
[...]
> > I have written a program that demonstrates this, but it is 135 lines
> > long. I can post it if required, but I thought I would ask the question
> > first to see if this is a known issue.
>
> If the fetch runs in a different thread, the only reason for the main
> loop to experience noticeable delays is that you're fetching a lot of
> data, hogging CPU and memory. Try using a server-side cursor:
Thanks for the replies, Federico and Dorian.
I have done some more tests, and I show the results below, but I think the bottom line is that it is just the way that psycopg2 works.
This is how my test program works. It starts up an asyncio event loop, and it starts a background task that simply prints a counter every second. That way I can confirm that the loop is running, and I can see if it is being held up.
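A minimal version of that counter task looks like this (a sketch, not the actual code from my program):

    import asyncio
    import time

    async def counter():
        # prints once a second; if the printed lag climbs well above
        # zero, something is blocking the event loop
        count = 0
        prev = time.monotonic()
        while True:
            await asyncio.sleep(1)
            now = time.monotonic()
            print(count, 'lag {:+.3f}s'.format(now - prev - 1))
            prev = now
            count += 1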
Then I start another task that runs every 10 seconds. It tries to simulate heavy database usage. The table I am using only has about 8000 rows, so I launch 25 connections, each in its own thread, to SELECT the table and count the rows. psycopg2 executes the task quite a bit quicker than pyodbc and sqlite3, but that is not the point of the exercise. The point is to monitor the background counter to see how it is affected. With the other two, it barely budges – there is a delay of no more than 0.02 of a second while the database threads are running. psycopg2, on the other hand, creates very noticeable delays.
I tried four ways of selecting and counting the rows -

1. cur.execute('SELECT * FROM table')
   tot = 0
   for row in cur:
       tot += 1

2. cur.execute('SELECT * FROM table')
   rows = cur.fetchall()
   tot = len(rows)

3. cur.execute('DECLARE x SCROLL CURSOR FOR SELECT * FROM table')
   cur.execute('MOVE FORWARD ALL IN x')
   no_rows = cur.rowcount
   cur.execute('MOVE ABSOLUTE 0 IN x')
   tot = 0
   while no_rows > 50:
       cur.execute('FETCH FORWARD 50 FROM x')
       rows = cur.fetchall()
       tot += len(rows)
       no_rows -= len(rows)
   cur.execute('FETCH FORWARD {} FROM x'.format(no_rows))
   rows = cur.fetchall()
   tot += len(rows)

4. cur.execute('DECLARE x SCROLL CURSOR FOR SELECT * FROM table')
   tot = 0
   while True:
       cur.execute('FETCH FORWARD 50 FROM x')
       rows = cur.fetchall()
       tot += len(rows)
       if len(rows) < 50:
           break
They all held up the event loop by between 2 and 4 seconds. Method 1 was the best, as it spread the load over 3-4 seconds, so it was less noticeable. The others held it up for 2-3 seconds at a time, which would be very disruptive in practice.
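Incidentally, methods 3 and 4 hand-roll what psycopg2 can apparently manage itself: a named cursor is a server-side cursor, and iterating over it fetches cursor.itersize rows per network round trip. Something like this (a sketch, with a placeholder query) should be roughly equivalent:

    cur = conn.cursor(name='cur1')  # named cursor = server-side cursor
    cur.itersize = 50               # rows fetched per round trip
    cur.execute('SELECT * FROM table')
    tot = 0
    for row in cur:                 # FETCHes happen behind the scenes
        tot += 1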
It looks as if I may have to look into aiopg after all – I was hoping to avoid that.
Of course I may just be doing something silly, in which case I would be delighted if someone pointed it out.
Frank
My guess is that the GIL is causing your problem here. The other thing you probably want to do is run the profiler on your code and see where the time is actually being spent.
Remember that the Python GIL will only allow one thread at a time to run Python code, and underlying C libraries must release the GIL at appropriate times if they want to allow other things to happen while they are waiting on C code. Of course, this is the whole reason that asyncio was written in the first place: to allow a single thread to do packets of work based on when data arrives asynchronously.

If you are using asyncio and want to actually run things asynchronously, you will need to use aiopg or something you roll yourself. Remember, asyncio is a wrapper around select and poll. There is documentation in psycopg2 about making things asynchronous.
On Mon, Feb 29, 2016 at 9:52 AM, Frank Millman <frank@chagford.com> wrote:
[...]
Michael Robellard
(216)288-2745
On Mon, Feb 29, 2016 at 2:52 PM, Frank Millman <frank@chagford.com> wrote:
> Of course I may just be doing something silly, in which case I would be
> delighted if someone pointed it out.

Are you running psycopg in async mode or in green mode?

-- Daniele
On 29/02/16 17:26, Michael Robellard wrote:
> My guess is that the GIL is causing your problem here. The other thing
> you probably want to do is run the profiler on your code and see where
> the time is actually being spent.
>
> Remember that the Python GIL will only allow one thread at a time to
> run Python code, and underlying C libraries must release the GIL at
> appropriate times if they want to allow other things to happen while
> they are waiting on C code. Of course, this is the whole reason that
> asyncio was written in the first place: to allow a single thread to do
> packets of work based on when data arrives asynchronously.
>
> If you are using asyncio and want to actually run things asynchronously,
> you will need to use aiopg or something you roll yourself. Remember,
> asyncio is a wrapper around select and poll. There is documentation in
> psycopg2 about making things asynchronous.
Thanks for the reply, Michael.
I suspect that the GIL is causing the problem. I am wondering if psycopg2 is not releasing the GIL at the appropriate time, while pyodbc and sqlite3 are. The symptoms seem to suggest that.
Regarding profiling, I use exactly the same program while testing with pyodbc and sqlite3 – the only difference is setting up the database connection. So I don’t think that anything in my code is causing the holdup.
I was trying to avoid aiopg, simply because I am trying to maintain generic code to support all three databases as much as possible, and the other two do not seem to have asynchronous support at all.
I still think that my theory holds water. Run the database in its own thread, pass requests to it using a queue.Queue(), and pass results back using an asyncio.Queue(). It works brilliantly for the other two databases, and I cannot see any reason why it should not work with psycopg2.
Frank
On 29/02/16 18:07, Daniele Varrazzo wrote:
> On Mon, Feb 29, 2016 at 2:52 PM, Frank Millman <frank@chagford.com> wrote:
> > Of course I may just be doing something silly, in which case I would be
> > delighted if someone pointed it out.
>
> Are you running psycopg in async mode or in green mode?
To be honest, I don’t know what that means, so I am probably running in green mode.
I am aware that psycopg2 has some extensions that enable asynchronous communication, but I am not using any of those.
To repeat what I replied to Michael,
I still think that my theory holds water. Run the database in its own thread, pass requests to it using a queue.Queue(), and pass results back using an asyncio.Queue(). It works brilliantly for the other two databases, and I cannot see any reason why it should not work with psycopg2.

Frank
On Tue, Mar 1, 2016 at 5:24 AM, Frank Millman <frank@chagford.com> wrote:
> On 29/02/16 18:07, Daniele Varrazzo wrote:
> > Are you running psycopg in async mode or in green mode?
>
> To be honest, I don’t know what that means, so I am probably running in
> green mode.

I think you should have that under control if you want psycopg to perform OK in an asynchronous environment.

Call psycopg2.extensions.get_wait_callback(). If it returns None, then psycopg is simply blocking.

-- Daniele
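In code, the check (and the switch to green mode) looks something like this (a minimal sketch; wait_select is the simple select()-based callback that ships in psycopg2.extras):

    import psycopg2.extensions
    import psycopg2.extras

    # None means no callback is registered, i.e. psycopg blocks in libpq
    print(psycopg2.extensions.get_wait_callback())

    # registering a callback switches psycopg into green mode: libpq
    # waits are performed by the callback instead of blocking
    psycopg2.extensions.set_wait_callback(psycopg2.extras.wait_select)
    print(psycopg2.extensions.get_wait_callback())  # <function wait_select ...>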
On 01/03/16 17:24, Daniele Varrazzo wrote:
>>> Are you running psycopg in async mode or in green mode?
>>
>> To be honest, I don’t know what that means, so I am probably running in
>> green mode.
>
> I think you should have that under control if you want psycopg to
> perform OK in an asynchronous environment.
>
> Call psycopg2.extensions.get_wait_callback(). If it returns None, then
> psycopg is simply blocking.
Either I am misunderstanding you, or you are misunderstanding me – I am not sure which.
I am not running psycopg ‘in an asynchronous environment’. It is running in its own thread, and is unaware that there is an event loop running in a different thread.
It boils down to the fact that psycopg running in one thread is blocking another thread from executing. Running pyodbc and sqlite3 under exactly the same conditions does not have this effect. Therefore my (quite possibly wrong) suspicion is that the GIL is not being released timeously, or at least that psycopg is doing something different from the others.
If I have misunderstood, I would appreciate it if you would clarify how you see the problem. That should give me a clearer idea of the best way forward from here.
Thanks
Frank
On 02/03/16 09:48, Frank Millman wrote:
> It boils down to the fact that psycopg running in one thread is blocking
> another thread from executing. Running pyodbc and sqlite3 under exactly
> the same conditions does not have this effect. Therefore my (quite
> possibly wrong) suspicion is that the GIL is not being released
> timeously, or at least that psycopg is doing something different from
> the others.
It occurs to me that there is one difference which could be significant. As far as I know, pyodbc and sqlite do not use sockets to communicate with the database, but psycopg does. Whether that could interfere with the socket-handling in the asyncio thread I have no idea, but it is a possibility.
Frank
On 02/03/16 09:19, Frank Millman wrote:
> It occurs to me that there is one difference which could be significant.
> As far as I know, pyodbc and sqlite do not use sockets to communicate
> with the database, but psycopg does. Whether that could interfere with
> the socket-handling in the asyncio thread I have no idea, but it is a
> possibility.

I think pyodbc does use sockets.

We try hard to release the GIL anytime we're about to call libpq functions and, looking at the code, I don't see any obvious place where the GIL is held during a time-consuming call.

Can you please send us your code (the shortest possible version that actually can be run) so that we can investigate? Also, I understand your reasons to not use one of the async modes, but you should really look into that, because psycopg has very good support for async loops.

federico

--
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it

I am not romantic. The little candle on the table sees me and goes out.
-- sisterconfusion
On 02/03/16 10:44, Federico Di Gregorio wrote:
> We try hard to release the GIL anytime we're about to call libpq
> functions and, looking at the code, I don't see any obvious place where
> the GIL is held during a time-consuming call.
>
> Can you please send us your code (the shortest possible version that
> actually can be run) so that we can investigate?
>
Thanks, Federico. I have attached my program.
A few points -
1. Instead of using my home-grown threading/queue model, I have used loop.run_in_executor(), as this is more standard. It also shows the slowdown. (A trimmed sketch of this pattern appears after this list.)
2. I test on two machines – Windows and Linux. To save me commenting/uncommenting the connection details, I use ‘try import pyodbc’. If it succeeds, I assume Windows. If it fails, I assume Linux and import psycopg2. In both cases I have an alternative connection that uses sqlite3, but then I do have to comment/uncomment to switch databases.
3. The program uses loop.run_forever(). To stop it, just press Enter. I use a separate thread to listen for that and shut everything down. I do that because I could never get KeyboardInterrupt working properly on Windows.
4. I use a home-grown connection pool – hope it looks ok! The first iteration of the test opens 10 connections, so best to ignore the first set of timings.
5. You will have to substitute your own database details and table name to test it.
6. I commented out some lines that show more information – uncomment them if required.
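For reference, the executor pattern from point 1 looks roughly like this (a trimmed sketch with placeholder connection details and query, not the attached program itself):

    import asyncio
    import concurrent.futures

    import psycopg2

    executor = concurrent.futures.ThreadPoolExecutor(max_workers=25)

    def count_rows(dsn, sql):
        # blocking DB-API work: runs in an executor thread, never on
        # the event loop itself
        conn = psycopg2.connect(dsn)
        try:
            cur = conn.cursor()
            cur.execute(sql)
            return len(cur.fetchall())
        finally:
            conn.close()

    async def timed_count(loop, dsn, sql):
        # hand the blocking call to the executor and await the result
        return await loop.run_in_executor(executor, count_rows, dsn, sql)

    # usage (sketch):
    # loop = asyncio.get_event_loop()
    # print(loop.run_until_complete(
    #     timed_count(loop, 'dbname=test', 'SELECT * FROM table')))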
That is all I can think of. If you have any other problems, let me know.
Thanks
Frank