Re: [psycopg] speed concerns with executemany() - Mailing list psycopg

From Christophe Pettus
Subject Re: [psycopg] speed concerns with executemany()
Date
Msg-id 79BCF488-D1FB-42E1-B0E9-D4A54E7341A4@thebuild.com
Whole thread Raw
In response to [psycopg] speed concerns with executemany()  (mike bayer <mike_mp@zzzcomputing.com>)
Responses Re: [psycopg] speed concerns with executemany()  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
List psycopg
Are you running with the transaction isolation level set to ISOLATION_LEVEL_AUTOCOMMIT?  If so, each of those INSERTs
willbe in its own transaction, and thus will go through the COMMIT overhead.  That by itself wouldn't explain a jump
thatlarge (in most environments), but it will definitely be *much* slower. 

> On Dec 23, 2016, at 16:05, mike bayer <mike_mp@zzzcomputing.com> wrote:
>
> I'm getting more and more regular complaints among users of SQLAlchemy of the relatively slow speed of the
cursor.executemany()call in psycopg2.   In almost all cases, these users have discovered that Postgresql is entirely
capableof running an INSERT or UPDATE of many values with a high degree of speed using a single statement with a form
likethis: 
>
> INSERT INTO table (a, b, c)
> VALUES  (1, 2, 3), (4, 5, 6), (7, 8, 9), ...
>
> whereas if they run the same form using a single VALUES insert and sending the parameters as a sequence via
executemany(),they will observe hyperbolically slow speeds - today, someone claims they can run approximately 200 sets
ofthree integers each using the multiple VALUES approach in approximately .02 seconds, whereas running 200 values into
asingle executemany() call of the otherwise identical INSERT statement, they are claiming takes 20 seconds; that is,
100000%slower.   I'm not really sure how that's even possible, considering the single INSERT with many VALUES is a much
largerstring to send over the network and be parsed by the server, if the overhead of a single INSERT is .02 seconds,
wewould think an executemany() of 200 INSERT statements each with a single parameter set would be at most four seconds. 
>
> Obviously something is wrong with these users' environment, although I will note that the relative speed of psycopg2
executemany()over a 1G network is still pretty bad, compared to both sending a single INSERT with a large VALUES clause
aswell as compared to the executemany() speed of DBAPIs (even pure Python) for other databases like MySQL, I can
providesome quick benchmarks if that's helpful. 
>
> I understand that psycopg2 does not use prepared statements, and I have dim recollections that internal use of
preparedstatements for executemany is not on the roadmap for psycopg2.  However, I'm still not sure what I should be
tellingmy users when I get reports of these vastly slower results with executemany(). 
>
> I'm not asking that psycopg2 change anything, I'm just looking to understand what the heck is going on when people
arereporting this. Should I: 
>
> 1. tell them they have a network issue that is causing executemany() to have a problem?  (even though I can also
observeexecutemany() is kind of slow, though not as slow as these people are reporting) 
>
> 2. tell them there's some known issue, vacuuming / indexes/ or something that is known to have this effect?
>
> 3. tell them that yes, they should use multiple-VALUES within a single INSERT (which would eventually lead to strong
pressureon me to reinvent executemany() within SQLAlchemy's psycopg2 dialect to use this form) ? (also if so, why is
thisthe case?   can this claimed 100000% slowdown be real?) 
>
> 4. use a hack to actually make my own prepared statements within executemany() (I vaguely recall some recipe that you
canget a prepared statement going with psycopg2 by rolling it on the outside) ? 
>
> 5. Other reasons that executemany() is known to sometimes be extremely slow?
>
> I'm purposely trying to stay out of the realm of picking apart the libpq internals, assuming psycopg2 devs can shed
somelight what's going on here.   Thanks for any guidance you can offer! 
>
>
>
>
>
> --
> Sent via psycopg mailing list (psycopg@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/psycopg

--
-- Christophe Pettus
   xof@thebuild.com



psycopg by date:

Previous
From: mike bayer
Date:
Subject: [psycopg] speed concerns with executemany()
Next
From: Daniele Varrazzo
Date:
Subject: Re: [psycopg] speed concerns with executemany()