Re: [psycopg] speed concerns with executemany() - Mailing list psycopg

From Federico Di Gregorio
Subject Re: [psycopg] speed concerns with executemany()
Msg-id 2b88cb87-7ff6-b801-2a7c-b8e6c3f78183@dndg.it
In response to Re: [psycopg] speed concerns with executemany()  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
List psycopg
On 05/01/17 20:00, Daniele Varrazzo wrote:
> On Thu, Jan 5, 2017 at 5:32 PM, Federico Di Gregorio <fog@dndg.it> wrote:
>> On 02/01/17 17:07, Daniele Varrazzo wrote:
>>> On Mon, Jan 2, 2017 at 4:35 PM, Adrian Klaver <adrian.klaver@aklaver.com>
>>> wrote:
>>>> With NRECS=10000 and page size=100:
>>>>
>>>> aklaver@tito:~> python psycopg_executemany.py -p 100
>>>> classic: 427.618795156 sec
>>>> joined: 7.55754685402 sec
>>> Ugh! :D
>>
>> That's great. Just a minor point: I won't overload executemany() with this
>> feature but add a new method UNLESS the semantics are exactly the same
>> especially regarding session isolation. Also, right now psycopg keeps track
>> of the number of affected rows over executemany() calls: I'd like to not
>> lose that because it is a breaking change to the API.
> It seems to me that the semantics would stay the same, even in the
> presence of volatile functions. Unfortunately, however, rowcount
> would break. That's just sad.
>
> We could, with no problem, add an extra argument to executemany:
> page_size, defaulting to 1 (the previous behaviour), which could be
> bumped. It's sad the default cannot be 100.
>
> Mike Bayer reported (https://github.com/psycopg/psycopg2/issues/491)
> that SQLAlchemy actually uses the aggregated rowcount for concurrency
> control.
>
> So, how much of a deal-breaker is it? Can we afford to lose the
> aggregated rowcount to obtain a juicy speedup in default usage, or
> would we rather leave the behaviour untouched and have people
> "opt in for speed"?
>
> ponder, ponder...
>
> Pondered: as the feature has had little testing and I don't want to
> delay releasing 2.7 further, I'd rather release the feature with a
> page_size default of 1. People could use it and report any failures
> they see with a page_size > 1. If testing shows that the database
> behaves ok, we could think about changing the default in the future.
> We may want to drop the aggregated rowcount in the future, but with
> better planning, e.g. to allow SQLAlchemy to ignore the aggregated
> rowcount from psycopg >= 2.8...
>
> How does it sound?

Fine for me.

federico
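
For readers following the thread, the paging idea discussed above can be sketched in plain Python. This is only an illustration under stated assumptions, not psycopg's implementation: the helper names (paginate, count_round_trips, joined_statements) are hypothetical, and the SQL strings are assumed to have been rendered already by the driver's own parameter binding, never by hand formatting.

```python
from itertools import islice


def paginate(seq, page_size):
    """Yield successive lists of at most page_size items from seq."""
    it = iter(seq)
    while True:
        page = list(islice(it, page_size))
        if not page:
            return
        yield page


def count_round_trips(n_records, page_size):
    """Number of batches sent to the server for n_records argument sets."""
    return sum(1 for _ in paginate(range(n_records), page_size))


def joined_statements(rendered_sqls, page_size):
    """Join already-rendered SQL strings into one ';'-separated batch per page.

    Hypothetical helper: assumes every string is a complete, safely
    rendered statement.
    """
    return ["; ".join(page) for page in paginate(rendered_sqls, page_size)]


# With the numbers quoted in the thread (NRECS=10000):
trips_classic = count_round_trips(10000, 1)    # one statement per round trip
trips_paged = count_round_trips(10000, 100)    # 100 statements per round trip
```

With page_size=1 every statement is its own batch, so a per-statement rowcount can still be summed. With a larger page the statements of a page travel together, which is where the speedup comes from and also why an aggregated rowcount is harder to keep.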


--
Federico Di Gregorio                         federico.digregorio@dndg.it
DNDG srl                                                  http://dndg.it
            Purtroppo i creazionisti non si sono ancora estinti. -- vodka

