Re: [psycopg] speed concerns with executemany() - Mailing list psycopg

From Adrian Klaver
Subject Re: [psycopg] speed concerns with executemany()
Date
Msg-id 9e943705-466b-c317-87fd-9e87e66f0338@aklaver.com
Whole thread Raw
In response to [psycopg] speed concerns with executemany()  (mike bayer <mike_mp@zzzcomputing.com>)
Responses Re: [psycopg] speed concerns with executemany()  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
List psycopg
On 12/23/2016 04:05 PM, mike bayer wrote:
> I'm getting more and more regular complaints among users of SQLAlchemy
> of the relatively slow speed of the cursor.executemany() call in
> psycopg2.   In almost all cases, these users have discovered that
> Postgresql is entirely capable of running an INSERT or UPDATE of many
> values with a high degree of speed using a single statement with a form
> like this:
>
> INSERT INTO table (a, b, c)
> VALUES  (1, 2, 3), (4, 5, 6), (7, 8, 9), ...
>
> whereas if they run the same form using a single VALUES insert and
> sending the parameters as a sequence via executemany(), they will
> observe hyperbolically slow speeds - today, someone claims they can run
> approximately 200 sets of three integers each using the multiple VALUES
> approach in approximately .02 seconds, whereas running 200 values into a
> single executemany() call of the otherwise identical INSERT statement,
> they are claiming takes 20 seconds; that is, 100000% slower.   I'm not
> really sure how that's even possible, considering the single INSERT with
> many VALUES is a much larger string to send over the network and be
> parsed by the server, if the overhead of a single INSERT is .02 seconds,
> we would think an executemany() of 200 INSERT statements each with a
> single parameter set would be at most four seconds.
>
> Obviously something is wrong with these users' environment, although I
> will note that the relative speed of psycopg2 executemany() over a 1G
> network is still pretty bad, compared to both sending a single INSERT
> with a large VALUES clause as well as compared to the executemany()
> speed of DBAPIs (even pure Python) for other databases like MySQL, I can
> provide some quick benchmarks if that's helpful.
>
> I understand that psycopg2 does not use prepared statements, and I have
> dim recollections that internal use of prepared statements for
> executemany is not on the roadmap for psycopg2.  However, I'm still not
> sure what I should be telling my users when I get reports of these
> vastly slower results with executemany().
>
> I'm not asking that psycopg2 change anything, I'm just looking to
> understand what the heck is going on when people are reporting this.
> Should I:
>
> 1. tell them they have a network issue that is causing executemany() to
> have a problem?  (even though I can also observe executemany() is kind
> of slow, though not as slow as these people are reporting)
>
> 2. tell them there's some known issue, vacuuming / indexes/ or something
> that is known to have this effect?
>
> 3. tell them that yes, they should use multiple-VALUES within a single
> INSERT (which would eventually lead to strong pressure on me to reinvent
> executemany() within SQLAlchemy's psycopg2 dialect to use this form) ?
> (also if so, why is this the case?   can this claimed 100000% slowdown
> be real?)

I have to go with Christophe's explanation. They are seeing the effects
of 200 separate transactions, though like he stated later this more an
assumption then something I have tested.

>
> 4. use a hack to actually make my own prepared statements within
> executemany() (I vaguely recall some recipe that you can get a prepared
> statement going with psycopg2 by rolling it on the outside) ?
>
> 5. Other reasons that executemany() is known to sometimes be extremely
> slow?
>
> I'm purposely trying to stay out of the realm of picking apart the libpq
> internals, assuming psycopg2 devs can shed some light what's going on
> here.   Thanks for any guidance you can offer!
>
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


psycopg by date:

Previous
From: Daniele Varrazzo
Date:
Subject: Re: [psycopg] speed concerns with executemany()
Next
From: Daniele Varrazzo
Date:
Subject: Re: [psycopg] speed concerns with executemany()