Re: [psycopg] speed concerns with executemany() - Mailing list psycopg

From Adrian Klaver
Subject Re: [psycopg] speed concerns with executemany()
Date
Msg-id b08cf4b4-f588-e36e-0cd7-2543abc57809@aklaver.com
In response to Re: [psycopg] speed concerns with executemany()  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
List psycopg
On 01/05/2017 11:00 AM, Daniele Varrazzo wrote:
> On Thu, Jan 5, 2017 at 5:32 PM, Federico Di Gregorio <fog@dndg.it> wrote:
>> On 02/01/17 17:07, Daniele Varrazzo wrote:
>>>
>>> On Mon, Jan 2, 2017 at 4:35 PM, Adrian Klaver <adrian.klaver@aklaver.com>
>>> wrote:
>>>>
>>>> With NRECS=10000 and page size=100:
>>>>
>>>> aklaver@tito:~> python psycopg_executemany.py -p 100
>>>> classic: 427.618795156 sec
>>>> joined: 7.55754685402 sec
>>>
>>> Ugh! :D
>>
>>
>> That's great. Just a minor point: I won't overload executemany() with this
>> feature but add a new method UNLESS the semantics are exactly the same
>> especially regarding session isolation. Also, right now psycopg keeps track
>> of the number of affected rows over executemany() calls: I'd like to not
>> lose that because it is a breaking change to the API.
>
> It seems to me that the semantics would stay the same, even in
> presence of volatile functions. However unfortunately rowcount would
> break. That's just sad.
>
> We could, without problems, add an extra argument to executemany: page_size,
> defaulting to 1 (the previous behaviour), which could be bumped. It's sad
> the default cannot be 100.
>
> Mike Bayer reported (https://github.com/psycopg/psycopg2/issues/491)
> that SQLAlchemy actually uses the aggregated rowcount for concurrency
> control.
>
> So, how much of a deal-breaker is it? Can we afford to lose the aggregated
> rowcount to obtain a juicy speedup in default usage, or would we rather
> leave the behaviour untouched and have people "opt in for speed"?
>
> ponder, ponder...
>
> Pondered: as the feature has had little testing and I don't want to delay
> releasing 2.7 further, I'd rather release the feature with a page_size
> default of 1. People could use it and report any failures if they
> use a page_size > 1. If tests show that the
> database behaves ok we could think about changing the default in the
> future. We may want to drop the aggregated rowcount in the future but
> with better planning, e.g. to allow SQLAlchemy to ignore aggregated
> rowcount from psycopg >= 2.8...
>
> How does it sound?

Works for me.
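
For anyone following along, here is a minimal sketch of the paging idea as I understand it from this thread: render each parameter set with mogrify(), join a page of statements with ";", and send each page in one execute(). The names paginate() and execute_paged() are made up for illustration and are not the psycopg API; the sketch only assumes a psycopg2-style cursor exposing mogrify() and execute(). With page_size=1 it sends one statement per round trip, like the classic behaviour. With a larger page, rowcount after each execute() no longer aggregates over all parameter sets, which is the breakage discussed above.

```python
from itertools import islice


def paginate(seq, page_size):
    """Yield successive lists of at most page_size items from seq."""
    it = iter(seq)
    while True:
        page = list(islice(it, page_size))
        if not page:
            return
        yield page


def execute_paged(cur, sql, argslist, page_size=1):
    """Run sql once per parameter set, page_size statements per round trip.

    Hypothetical helper: cur is assumed to be a psycopg2-style cursor
    whose mogrify() returns the rendered statement as bytes.
    """
    for page in paginate(argslist, page_size):
        # Render every statement of the page client-side.
        statements = [cur.mogrify(sql, args) for args in page]
        # Send the whole page to the server in a single execute() call.
        cur.execute(b";".join(statements))
```

Usage would be along the lines of execute_paged(cur, "INSERT INTO tbl VALUES (%s, %s)", rows, page_size=100), where tbl and rows are placeholders.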
>
> -- Daniele
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

