Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors |
Date | |
Msg-id | alpine.DEB.2.21.1808170917510.20841@lancre Whole thread Raw |
In response to | Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors (Marina Polyakova <m.polyakova@postgrespro.ru>) |
Responses |
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
|
List | pgsql-hackers |
Hello Marina, >> Detailed -r report. I understand from the doc that the retry number on >> the detailed per-statement report is to identify at what point errors >> occur? Probably this is more or less always at the same point on a >> given script, so that the most interesting feature is to report the >> number of retries at the script level. > > This may depend on various factors.. for example: > [...] > 21.239 5 36 UPDATE xy3 SET y = y + :delta WHERE x > = :x3; > 21.360 5 39 UPDATE xy2 SET y = y + :delta WHERE x > = :x2; Ok, not always the same point, and you confirm that it identifies where the error is raised which leads to a retry. > And you can always get the number of retries at the script level from the > main report (if only one script is used) or from the report for each script > (if multiple scripts are used). Ok. >> ISTM that "skipped" transactions are NOT "successful" so there are a >> problem with comments. I believe that your formula are probably right, >> it has more to do with what is "success". For cnt decomposition, ISTM >> that "other transactions" are really "directly successful >> transactions". > > I agree with you, but I also think that skipped transactions should not be > considered errors. I'm ok with having a special category for them in the explanations, which is neither success nor error. > So we can write something like this: > All the transactions are divided into several types depending on their > execution. Firstly, they can be divided into transactions that we started to > execute, and transactions which were skipped (it was too late to execute > them). Secondly, running transactions fall into 2 main types: is there any > command that got a failure during the last execution of the transaction > script or not? Thus Here is an attempt at having a more precise and shorter version, not sure it is much better than yours, though: """ Transactions are counted depending on their execution and outcome. First a transaction may have started or not: skipped transactions occur under --rate and --latency-limit when the client is too late to execute them. Secondly, a started transaction may ultimately succeed or fail on some error, possibly after some retries when --max-tries is not one. Thus """ > the number of all transactions = > skipped (it was too late to execute them) > cnt (the number of successful transactions) + > ecnt (the number of failed transactions). > > A successful transaction can have several unsuccessful tries before a > successfull run. Thus > > cnt (the number of successful transactions) = > retried (they got a serialization or a deadlock failure(s), but were > successfully retried from the very beginning) + > directly successfull transactions (they were successfully completed on > the first try). These above description is clearer for me. >> I'd suggest to put "ANOTHER_SQL_FAILURE" as the last option, otherwise >> "another" does not make sense yet. > > Maybe firstly put a general group, and then special cases?... I understand it more as a catch all default "none of the above" case. >> In TState, field "uint32 retries": maybe it would be simpler to count >> "tries", which can be compared directly to max tries set in the option? > > If you mean retries in CState - on the one hand, yes, on the other hand, > statistics always use the number of retries... Ok. >> The automaton skips to FAILURE on every possible error. I'm wondering >> whether it could do so only on SQL errors, because other fails will >> lead to ABORTED anyway? If there is no good reason to skip to FAILURE >> from some errors, I'd suggest to keep the previous behavior. Maybe the >> good reason is to do some counting, but this means that on eg >> metacommand errors now the script would loop over instead of aborting, >> which does not look like a desirable change of behavior. > > Even in the case of meta command errors we must prepare for CSTATE_END_TX and > the execution of the next script: if necessary, clear the conditional stack > and rollback the current transaction block. Seems ok. >> commandFailed: I'm not thrilled by the added boolean, which is partially >> redundant with the second argument. > > Do you mean that it is partially redundant with the argument "cmd" and, for > example, the meta commands errors always do not cause the abortions of the > client? Yes. And also I'm not sure we should want this boolean at all. > [...] > If in such cases one command is placed on several lines, ISTM that the code > is more understandable if curly brackets are used... Hmmm. Such basic style changes are avoided because they break backpatching, so we try to avoid gratuitous changes unless there is a strong added value, which does not seem to be the case here. -- Fabien.
pgsql-hackers by date: