Re: Problem with psycopg2 and asyncio - Mailing list psycopg

From Frank Millman
Subject Re: Problem with psycopg2 and asyncio
Date
Msg-id 0E4E4D0E71294F6DA2AC11CEA4017F55@FrankLaptop
Whole thread Raw
In response to Re: Problem with psycopg2 and asyncio  (Federico Di Gregorio <fog@dndg.it>)
Responses Re: Problem with psycopg2 and asyncio
Re: Problem with psycopg2 and asyncio
List psycopg
On 29/02/16 10:17, federico wrote:
>
> On 28/02/16 08:53, Frank Millman wrote:
[...]
>
> I have written a program that demonstrates this, but it is 135 lines
>
> long. I can post it if required, but I thought I would ask the question
>
> first to see if this is a known issue.
>

 
>
If the fetch runs in a different thread the only reason for the main
>
loop to experiences noticeable delays is that you're fetching a lot of
>
data, hogging CPU and memory. Try using a server-side cursor:
Thanks for the replies, Federico and Dorian.
 
I have done some more tests, and I show the results below, but I think the bottom line is that it is just the way that psycopg2 works.
 
This is how my test program works. It starts up an asyncio event loop, and it starts a background task that simply prints a counter every second. That way I can confirm that the loop is running, and I can see if it is being held up.
 
Then I start another task that runs every 10 seconds. It tries to simulate heavy database usage. The table I am using only has about 8000 rows, so I Iaunch 25 connections, each in their own thread, to SELECT the table and count the rows. psycopg2 executes the task quite a bit quicker than pyodbc and sqlite3, but that is not the point of the exercise. The point is to monitor the background counter, to see how it is affected. Using the other two, it barely budges – there is a delay of no more than .02 of a second while the database threads are running. psycopg2 on the other hand creates very noticeable delays.
 
I tried four ways of selecting and counting the rows -
 
1. cur.execute(‘SELECT table’)
    tot = 0
    for row in cur:
        tot += 1
 
2. cur.execute(‘SELECT table’)
    rows = cur.execute(fetchall())
    tot = len(rows)
 
3. cur.execute(‘DECLARE xxx SCROLL CURSOR FOR SELECT table’)
    cur.execute(‘MOVE FORWARD ALL IN x’)
    no_rows = conn.rowcount
    cur.execute(‘MOVE ABSOLUTE 0 IN x’)
    tot = 0
    while no_rows > 50:
        rows = cur.execute(‘FETCH FORWARD 50 FROM x’).fetchall()
        tot += len(rows)
        no_rows –= len(rows)
    rows = cur.execute(‘FETCH FORWARD {} FROM x’.format(no_rows)).fetchall()
    tot += len(rows)
 
4. cur.execute(‘DECLARE xxx SCROLL CURSOR FOR SELECT table’)
    tot = 0
    while True:
        rows = cur.execute(‘FETCH FORWARD 50 FROM x’).fetchall()
        tot += len(rows)
        if len(rows) < 50:
            break
    rows = cur.execute(‘FETCH FORWARD 50 FROM x’).fetchall()
    tot += len(rows)
 
They all held up the event loop by between 2 and 4 seconds. 1 was the best, as it spread the load over 3-4 seconds, so was less noticeable. The others held it up for 2-3 seconds at a time, which would be very disruptive in practice.
 
It looks as if I may have to look into aiopg after all – I was hoping to avoid that.
 
Of course I may just be doing something silly, in which case I would be delighted if someone pointed it out.
 
Frank
 

psycopg by date:

Previous
From: Federico Di Gregorio
Date:
Subject: Re: Problem with psycopg2 and asyncio
Next
From: "Robellard, Michael"
Date:
Subject: Re: Problem with psycopg2 and asyncio