Re: pgbench and timestamps (bounced) - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: pgbench and timestamps (bounced)
Date
Msg-id 6c6af173-85b9-f0b3-4764-b0df415d2736@postgrespro.ru
In response to Re: pgbench and timestamps (bounced)  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: pgbench and timestamps (bounced)  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
On 11.09.2020 16:59, Fabien COELHO wrote:
>
> Hello Tom,
>
>>> It requires a mutex around the commands, I tried to do some windows
>>> implementation which may or may not work.
>>
>> Ugh, I'd really rather not do that.  Even disregarding the effects
>> of a mutex, though, my initial idea for fixing this has a big problem:
>> if we postpone PREPARE of the query until first execution, then it's
>> happening during timed execution of the benchmark scenario and thus
>> distorting the timing figures.  (Maybe if we'd always done it like
>> that, it'd be okay, but I'm quite against changing the behavior now
>> that it's stood for a long time.)
>
> Hmmm.
>
> Prepare is done *once* per client; ISTM the impact on any
> statistically significant benchmark is nil in practice, or it would
> mean that the benchmark settings are too low.
>
> Second, the mutex is only used when absolutely necessary, only for the 
> substitution part of the query (replacing :stuff by ?), because 
> scripts are shared between threads. This happens just once, in an unlikely
> case occurring at the beginning.
>
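A minimal Python sketch of the once-only, mutex-guarded substitution described above. Everything here is an assumption for illustration: `SharedCommand`, `ensure_substituted`, and the regular expression are hypothetical names and are not pgbench's code, and numbered `$n` placeholders are assumed as the replacement for `:name` references.

```python
import re
import threading

# Hypothetical pattern: ":name" not preceded by another ":" (so "::cast" is skipped).
_VAR_REF = re.compile(r"(?<!:):([A-Za-z_][A-Za-z0-9_]*)")


class SharedCommand:
    """One SQL command of a script, shared by all client threads."""

    def __init__(self, sql):
        self.sql = sql
        self.prepared_sql = None  # filled in on first use
        self.params = None        # variable names, in placeholder order
        self._lock = threading.Lock()

    def ensure_substituted(self, known_vars):
        # Fast path: no locking once the substitution has been published.
        if self.prepared_sql is not None:
            return
        with self._lock:
            if self.prepared_sql is not None:  # another thread got here first
                return
            params = []

            def repl(match):
                name = match.group(1)
                if name not in known_vars:
                    return match.group(0)      # unknown name: keep as plain text
                if name not in params:
                    params.append(name)
                return "$%d" % (params.index(name) + 1)

            text = _VAR_REF.sub(repl, self.sql)
            self.params = params
            self.prepared_sql = text           # publish last
```

The lock is taken only while the shared text has not been rewritten yet, so in this sketch the cost is paid once per command rather than once per transaction.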
>> However, perhaps there's more than one way to fix this.  Once we've
>> scanned all of the script and seen all the \set commands, we know
>> (in principle) the set of all variable names that are in use.
>> So maybe we could fix this by
>>
>> (1) During the initial scan of the script, make variable-table
>> entries for every \set argument, with the values shown as undefined
>> for the moment.  Do not try to parse SQL commands in this scan,
>> just collect them.
>
> The issue with this approach is
>
>   SELECT 1 AS one \gset pref_
>
> which will generate a "pref_one" variable, and these names cannot be 
> guessed without SQL parsing and possibly execution. That is why the
> preparation is delayed to when the variables are actually known.
>
>> (2) Make another scan in which we identify variable references
>> in the SQL commands and issue PREPAREs (if enabled).
>
>> (3) Perform the timed run.
>>
>> This avoids any impact of this bug fix on the semantics or timing
>> of the benchmark proper.  I'm not sure offhand whether this
>> approach makes any difference for the concerns you had about
>> identifying/suppressing variable references inside quotes.
>
> I do not think this plan is workable, because of the \gset issue.
>
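To illustrate the \gset objection, here is a small Python sketch. It assumes two hypothetical helpers, `names_from_static_scan` and `names_after_gset`, which are not part of pgbench: the first looks only at \set lines, the second needs the column names of an executed query.

```python
import re


def names_from_static_scan(script):
    """Collect the variable names defined by \\set lines, without running SQL."""
    return set(re.findall(r"^\\set\s+(\w+)", script, flags=re.MULTILINE))


def names_after_gset(prefix, result_columns):
    """Names that \\gset would define, given the columns of an executed query."""
    return {prefix + column for column in result_columns}


script = (
    "\\set scale 10\n"
    "SELECT 1 AS one \\gset pref_\n"
    "SELECT :pref_one + :scale;\n"
)

static_names = names_from_static_scan(script)         # only what \set declares
runtime_names = names_after_gset("pref_", ["one"])    # needs the query's columns
```

In this sketch the static scan finds `scale` but not `pref_one`; that name only becomes known once the column list of the first query is available.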
> I do not see that the conditional mutex and delayed PREPARE would have 
> any significant (measurable) impact on an actual (reasonable) 
> benchmark run.
>
> A workable solution would be that each client actually execute each 
> script once before starting the actual benchmark. It would still need 
> a mutex and also a sync barrier (which I'm proposing in some other 
> thread). However, this may raise some other issues because then some
> operations would be triggered outside the benchmarking run, which may or
> may not be desirable.
>
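The "execute once, then synchronize" alternative could be sketched as follows in Python. This is an assumption-laden illustration, not pgbench's design: `run_clients`, `warm_up`, and `transaction` are hypothetical names, and `threading.Barrier` stands in for the sync barrier mentioned above.

```python
import threading
import time


def run_clients(n_clients, warm_up, transaction, n_transactions):
    """Run warm_up(i) untimed in every client, then time the transactions."""
    barrier = threading.Barrier(n_clients)
    elapsed = [0.0] * n_clients

    def client(i):
        warm_up(i)        # untimed: e.g. one pass over each script
        barrier.wait()    # no client starts the timed part early
        start = time.perf_counter()
        for _ in range(n_transactions):
            transaction(i)
        elapsed[i] = time.perf_counter() - start

    threads = [threading.Thread(target=client, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return elapsed
```

The sketch also shows the drawback raised above: whatever `warm_up` does happens outside the measured interval.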
> So I'm not too keen to go that way, and I think the proposed solution 
> is reasonable from a benchmarking point of view as the impact is 
> minimal, although not zero.
>
CFM reminder.

Hi, this entry is "Waiting on Author" and the thread has been inactive for a
while. I see that the discussion still has some open questions. Are you
going to continue working on it, or should I mark it as "Returned with
Feedback" until a better time?

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



