Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date
Msg-id alpine.DEB.2.21.1806090810090.5307@lancre
Whole thread Raw
In response to Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors  (Marina Polyakova <m.polyakova@postgrespro.ru>)
Responses Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
List pgsql-hackers
Hello Marina,

> v9-0001-Pgbench-errors-use-the-RandomState-structure-for-.patch
> - a patch for the RandomState structure (this is used to reset a client's 
> random seed during the repeating of transactions after serialization/deadlock 
> failures).

A few comments about this first patch.

Patch applies cleanly, compiles, global & pgbench "make check" ok.

I'm mostly ok with the changes, which cleanly separate the different use 
of random between threads (script choice, throttle delay, sampling...) and 
client (random*() calls).

This change is necessary so that a client can restart a transaction 
deterministically (at the client level at least), which is the ultimate 
aim of the patch series.

A few remarks:

The RandomState struct is 6 bytes, which will induce some padding when 
used. This is life and pre-existing. No problem.

ISTM that the struct itself does not need a name, ie. "typedef struct { 
... } RandomState" is enough.

There could be clear comments, say in the TState and CState structs, about 
what randomness is impacted (i.e. script choices, etc.).

getZipfianRand, computeHarmonicZipfian: The "thread" parameter was 
justified because it was used for two fieds. As the random state is 
separated, I'd suggest that the other argument should be a zipfcache 
pointer.

While reading your patch, it occurs to me that a run is not deterministic 
at the thread level under throttling and sampling, because the random 
state is sollicited differently depending on when transaction ends. This 
suggest that maybe each thread random_state use should have its own random 
state.

In passing, and totally unrelated to this patch:

I've always been a little puzzled about why a quite small 48-bit internal 
state random generator is used. I understand the need for pg to have a 
portable & state-controlled thread-safe random generator, but why this 
particular small one fails me. The source code (src/port/erand48.c, 
copyright in 1993...) looks optimized for 16 bits architectures, which is 
probably pretty inefficent to run on 64 bits architectures. Maybe this 
could be updated with something more consistent with today's processors, 
providing more quality at a lower cost.

-- 
Fabien.


pgsql-hackers by date:

Previous
From: Thomas Kellerer
Date:
Subject: Re: Compromised postgresql instances
Next
From: Andrew Gierth
Date:
Subject: Re: Compromised postgresql instances