Re: [psycopg] speed concerns with executemany() - Mailing list psycopg

From Daniele Varrazzo
Subject Re: [psycopg] speed concerns with executemany()
Date
Msg-id CA+mi_8bVHi_wkZBhDS-Wib9-5kkLjFR09d7BHtHBO7-MStcX=Q@mail.gmail.com
In response to Re: [psycopg] speed concerns with executemany()  (Federico Di Gregorio <fog@dndg.it>)
Responses Re: [psycopg] speed concerns with executemany()  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: [psycopg] speed concerns with executemany()  (Federico Di Gregorio <fog@dndg.it>)
Re: [psycopg] speed concerns with executemany()  (mike bayer <mike_mp@zzzcomputing.com>)
List psycopg
On Thu, Jan 5, 2017 at 5:32 PM, Federico Di Gregorio <fog@dndg.it> wrote:
> On 02/01/17 17:07, Daniele Varrazzo wrote:
>>
>> On Mon, Jan 2, 2017 at 4:35 PM, Adrian Klaver <adrian.klaver@aklaver.com>
>> wrote:
>>>
>>> With NRECS=10000 and page size=100:
>>>
>>> aklaver@tito:~> python psycopg_executemany.py -p 100
>>> classic: 427.618795156 sec
>>> joined: 7.55754685402 sec
>>
>> Ugh! :D
>
>
> That's great. Just a minor point: I won't overload executemany() with this
> feature but add a new method UNLESS the semantics are exactly the same
> especially regarding session isolation. Also, right now psycopg keeps track
> of the number of affected rows over executemany() calls: I'd like to not
> lose that because it is a breaking change to the API.

It seems to me that the semantics would stay the same, even in the
presence of volatile functions. Unfortunately, however, rowcount would
break. That's just sad.

We could add, with no problem, an extra argument to executemany():
page_size, defaulting to 1 (the previous behaviour), which could be
bumped. It's sad the default cannot be 100.
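
To make the page_size idea concrete, here is a minimal sketch of how
paging could work, not the actual psycopg implementation. The names
paginate(), executemany_paged() and StubCursor are made up for
illustration; the only cursor methods assumed are mogrify() and
execute(). The stub stands in for a real cursor so the sketch runs
without a database, and its quoting is NOT safe for real use.

```python
def paginate(seq, page_size):
    """Yield successive lists of at most page_size items from seq."""
    page = []
    for item in seq:
        page.append(item)
        if len(page) >= page_size:
            yield page
            page = []
    if page:
        yield page


def executemany_paged(cur, sql, argslist, page_size=1):
    """Hypothetical helper: send page_size statements per round trip.

    With page_size=1 every statement is sent on its own, which matches
    the classic one-execute-per-item behaviour.
    """
    for page in paginate(argslist, page_size):
        # mogrify() binds the arguments client-side and returns bytes.
        statements = [cur.mogrify(sql, args) for args in page]
        cur.execute(b";".join(statements))


class StubCursor:
    """Illustration-only stand-in for a database cursor.

    It records what would be sent to the server. The repr()-based
    quoting is a placeholder, not real SQL escaping.
    """

    def __init__(self):
        self.executed = []

    def mogrify(self, sql, args):
        return (sql % tuple(repr(a) for a in args)).encode("utf-8")

    def execute(self, query):
        self.executed.append(query)
```

With ten argument tuples and page_size=4 the stub sees three execute()
calls instead of ten. Each call carries several statements, so a
per-call rowcount can no longer be summed up one item at a time as it
is today.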

Mike Bayer reported (https://github.com/psycopg/psycopg2/issues/491)
that SQLAlchemy actually uses the aggregated rowcount for concurrency
control.

So, how much of a deal-breaker is it? Can we afford to lose the
aggregated rowcount to obtain a juicy speedup in default usage, or
would we rather leave the behaviour untouched and have people "opt in
for speed"?

ponder, ponder...

Pondered: as the feature has had little testing and I don't want to
delay releasing 2.7 further, I'd rather release the feature with a
page_size default of 1. People could use it and report any failures if
they use a page_size > 1. If tests confirm that the database behaves
ok we could think about changing the default in the future. We may
want to drop the aggregated rowcount in the future but with better
planning, e.g. to allow SQLAlchemy to ignore aggregated rowcount from
psycopg >= 2.8...
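
On the client side, such a version gate could look roughly like the
sketch below. It only assumes that the version string starts with a
dotted number, as in "2.7 (dt dec pq3 ext lo64)". The function names
are made up, and the (2, 8) cutoff is just the tentative number
mentioned above, not a decision.

```python
import re


def parse_version(version_string):
    """Return the leading dotted number of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError("no version number in %r" % (version_string,))
    return tuple(int(part) for part in match.group(0).split("."))


def trust_aggregated_rowcount(version_string, cutoff=(2, 8)):
    """Hypothetical check: rely on the aggregated rowcount only below the cutoff."""
    return parse_version(version_string) < cutoff
```

A caller would pass in psycopg2.__version__ and fall back to some
other concurrency check when the function returns False.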

How does it sound?

-- Daniele

