Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Date
Msg-id alpine.DEB.2.20.1708271745550.32012@lancre
Whole thread Raw
In response to Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors  (Marina Polyakova <m.polyakova@postgrespro.ru>)
Responses Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
List pgsql-hackers
Hello,

> Here is the third version of the patch for pgbench thanks to Fabien Coelho 
> comments. As in the previous one, transactions with serialization and 
> deadlock failures are rolled back and retried until they end successfully or 
> their number of tries reaches maximum.

Here is some partial review.

Patch applies cleanly.

It compiles with warnings, please fix them:

   pgbench.c:2624:28: warning: ‘failure_status’ may be used uninitialized in this function
   pgbench.c:2697:34: warning: ‘command’ may be used uninitialized in this function

I do not think that the error handling feature needs preeminence in the
final report, compare to scale, number of clients and so. The number
of tries should be put further on.

I would spell "number of tries" instead of "tries number" which seems to
suggest that each try is attributed a number. "sql" -> "SQL".

For the per statement latency final report, I do not think it is worth 
distinguishing the kind of retry at this level, because ISTM that 
serialization & deadlocks are unlikely to appear simultaneously. I would 
just report total failures and total tries on this report. We only have 2 
errors now, but if more are added I'm pretty sure that we would not want 
to have more columns... Moreover the 25 characters alignment is ugly, 
better use a much smaller alignment.

I'm okay with having details shown in the "log to file" group report.
The documentation does not seem consistent. It discusses "the very last fields"
and seem to suggest that there are two, but the example trace below just
adds one field.

If you want a paragraph you should add <para>, skipping a line does not
work (around "All values are computed for ...").

I do not understand the second note of the --max-tries documentation.
It seems to suggest that some script may not end their own transaction...
which should be an error in my opinion? Some explanations would be welcome.

I'm not sure that "Retries" deserves a type of its own for two counters.
The "retries" in RetriesState may be redundant with these.
The failures are counted on simple counters while retries have a type,
this is not consistent. I suggest to just use simple counters everywhere.

I'm ok with having the detail report tell about failures & retries only
when some occured.

typo: sucessufully -> successfully

If a native English speaker could provide an opinion on that, and more
generally review the whole documentation, it would be great.

I think that the rand functions should really take a random_state pointer
argument, not a Thread or Client.

I'm at odds that FailureStatus does not have a clean NO_FAILURE state,
and that it is merged with misc failures.

I'm not sure that initRetries, mergeRetries, getAllRetries really
deserve a function.

I do not thing that there should be two accum Functions. Just extend
the existing one, and adding zero to zero is not a problem.

I guess that in the end pgbench & psql variables will have to be merged
if pgbench expression engine is to be used by psql as well, but this is
not linked to this patch.

The tap tests seems over-complicated and heavy with two pgbench run in
parallel... I'm not sure we really want all that complexity for this
somehow small feature. Moreover pgbench can run several scripts, I'm not
sure why two pgbench would need to be invoked. Could something much
simpler and lighter be proposed instead to test the feature?

The added code does not conform to Pg C style. For instance, if brace
should be aligned to the if. Please conform the project style.

The is_transaction_block_end seems simplistic. ISTM that it would not
work with compound commands. It should be clearly documented somewhere.

Also find attached two scripts I used for some testing:

   psql < dl_init.sql
   pgbench -f dl_trans.sql -c 8 -T 10 -P 1

-- 
Fabien.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: Re: [HACKERS] Re: Poor cost estimate with interaction between tablecorrelation and partial indexes
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90