Thread: Retry in pgbench

Retry in pgbench

From
Tatsuo Ishii
Date:
Currently the standard pgbench scenario produces serialization errors
("could not serialize access due to concurrent update") if PostgreSQL
runs at the REPEATABLE READ or SERIALIZABLE isolation level, and the
session aborts. In order to obtain meaningful results even at these
isolation levels, I would like to propose an automatic retry feature
for transactions that fail with this error.

Probably just adding a switch to enable retries is not enough; a retry
method (random interval, etc.) and a maximum retry count may need to be
added as well.
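The retry loop being proposed (a maximum retry count plus a random interval between attempts) could be sketched client-side as follows. This is a hand-rolled illustration in Python, not pgbench code: `SerializationError`, `run_with_retry`, `max_tries`, and `base_delay` are invented names standing in for the server's SQLSTATE 40001 failure and for whatever options such a patch would add.

```python
import random
import time

class SerializationError(Exception):
    """Stand-in for the server's 'could not serialize access' error (SQLSTATE 40001)."""

def run_with_retry(txn, max_tries=5, base_delay=0.01):
    """Run txn(), retrying on SerializationError with a random backoff.

    Returns (result, attempts). Re-raises after max_tries failed attempts.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return txn(), attempt
        except SerializationError:
            if attempt == max_tries:
                raise
            # Random interval: jitter keeps concurrent clients from
            # retrying in lock-step and colliding again immediately.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

A real implementation inside pgbench would also need to decide what the retried attempts mean for latency and TPS accounting, which is part of why a simple on/off switch is not enough.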

I would like to hear your thoughts,

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Retry in pgbench

From
Thomas Munro
Date:
On Tue, Apr 13, 2021 at 5:51 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Currently the standard pgbench scenario produces serialization errors
> ("could not serialize access due to concurrent update") if PostgreSQL
> runs at the REPEATABLE READ or SERIALIZABLE isolation level, and the
> session aborts. In order to obtain meaningful results even at these
> isolation levels, I would like to propose an automatic retry feature
> for transactions that fail with this error.
>
> Probably just adding a switch to enable retries is not enough; a retry
> method (random interval, etc.) and a maximum retry count may need to be
> added as well.
>
> I would like to hear your thoughts,

See also:

https://www.postgresql.org/message-id/flat/72a0d590d6ba06f242d75c2e641820ec%40postgrespro.ru



Re: Retry in pgbench

From
Tatsuo Ishii
Date:
> On Tue, Apr 13, 2021 at 5:51 PM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
>> Currently the standard pgbench scenario produces serialization errors
>> ("could not serialize access due to concurrent update") if PostgreSQL
>> runs at the REPEATABLE READ or SERIALIZABLE isolation level, and the
>> session aborts. In order to obtain meaningful results even at these
>> isolation levels, I would like to propose an automatic retry feature
>> for transactions that fail with this error.
>>
>> Probably just adding a switch to enable retries is not enough; a retry
>> method (random interval, etc.) and a maximum retry count may need to be
>> added as well.
>>
>> I would like to hear your thoughts,
> 
> See also:
> 
> https://www.postgresql.org/message-id/flat/72a0d590d6ba06f242d75c2e641820ec%40postgrespro.ru

Thanks for the pointer. It seems we need to resume the discussion.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Retry in pgbench

From
Jehan-Guillaume de Rorthais
Date:
Hi,

On Tue, 13 Apr 2021 16:12:59 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

>  [...]
> 
> Thanks for the pointer. It seems we need to resume the discussion.

By the way, I've been playing with the idea of failing gracefully and
retrying indefinitely (or until the given -T expires) on SQL errors AND
connection issues.

It would be useful to test replicating clusters with a (switch|fail)over
procedure.
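A minimal sketch of this "retry until -T expires" behavior, written in Python rather than pgbench's C; `ConnectionLost` is an invented stand-in for a dropped connection during (switch|fail)over, and `run_until_deadline`, `reconnect`, and `retry_delay` are illustrative names, not actual options:

```python
import time

class ConnectionLost(Exception):
    """Stand-in for a dropped connection during (switch|fail)over."""

def run_until_deadline(txn, duration, reconnect, retry_delay=0.5):
    """Keep issuing txn() until `duration` seconds elapse (the analogue of
    pgbench's -T), reconnecting and retrying on failure instead of aborting.

    Returns (successes, failures)."""
    deadline = time.monotonic() + duration
    successes = failures = 0
    while time.monotonic() < deadline:
        try:
            txn()
            successes += 1
        except ConnectionLost:
            failures += 1
            reconnect()
            time.sleep(retry_delay)  # a real client would likely cap or back off
    return successes, failures
```

The failure count and the timestamps of the failed window are exactly the raw data one would want for the downtime statistics discussed below in the thread.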

Regards,



Re: Retry in pgbench

From
Tatsuo Ishii
Date:
> By the way, I've been playing with the idea of failing gracefully and
> retrying indefinitely (or until the given -T expires) on SQL errors AND
> connection issues.
> 
> It would be useful to test replicating clusters with a (switch|fail)over
> procedure.

Interesting idea, but in general a failover takes some time (like a few
minutes), and that will strongly affect TPS. I think in the end it would
just be comparing failover times.

Or are you suggesting to ignore the time spent in failover?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: Retry in pgbench

From
Fabien COELHO
Date:
>> It would be useful to test replicating clusters with a (switch|fail)over
>> procedure.
>
> Interesting idea, but in general a failover takes some time (like a few
> minutes), and that will strongly affect TPS. I think in the end it would
> just be comparing failover times.
>
> Or are you suggesting to ignore the time spent in failover?

Or simply to be able to measure it from a client perspective? How much 
delay is introduced, how long it takes to get back to the previous tps 
level…
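One way to get such a client-side measurement is to log the timestamp of each successful transaction and look for the longest gap; a minimal sketch, where `downtime_window` and `gap_threshold` are illustrative names:

```python
def downtime_window(events, gap_threshold=1.0):
    """Given a sorted list of timestamps of successful transactions,
    return the (start, end) of the longest gap exceeding gap_threshold
    seconds, or None if service never paused that long."""
    worst = None
    for prev, cur in zip(events, events[1:]):
        gap = cur - prev
        if gap > gap_threshold and (worst is None or gap > worst[1] - worst[0]):
            worst = (prev, cur)
    return worst
```

Recovery back to the previous tps level could be measured the same way, by comparing per-interval transaction counts before and after the returned window.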

My recollection of Marina's patch is that it was non-trivial; adding such 
a new and interesting feature suggests a set of patches, not just one patch.

-- 
Fabien.

Re: Retry in pgbench

From
Jehan-Guillaume de Rorthais
Date:
On Fri, 16 Apr 2021 10:28:48 +0900 (JST)
Tatsuo Ishii <ishii@sraoss.co.jp> wrote:

> > By the way, I've been playing with the idea of failing gracefully and
> > retrying indefinitely (or until the given -T expires) on SQL errors AND
> > connection issues.
> > 
> > It would be useful to test replicating clusters with a (switch|fail)over
> > procedure.  
> 
> Interesting idea, but in general a failover takes some time (like a few
> minutes), and that will strongly affect TPS. I think in the end it would
> just be comparing failover times.

This use case is not about benchmarking. It's about generating constant
traffic to be able to practice/train some [auto]switchover procedures
while staying close to production activity.

In this context, the max-saturated TPS of one node is not relevant. But
being able to add some stats about downtime might be a good addition.

Regards,



Re: Retry in pgbench

From
Tatsuo Ishii
Date:
> This use case is not about benchmarking. It's about generating constant
> traffic to be able to practice/train some [auto]switchover procedures
> while staying close to production activity.
> 
> In this context, the max-saturated TPS of one node is not relevant. But
> being able to add some stats about downtime might be a good addition.

Oh I see. That makes sense.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp