Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors - Mailing list pgsql-hackers
From: Fabien COELHO
Subject: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Msg-id: alpine.DEB.2.22.394.2106231053090.1117163@pseudo
In response to: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors (Yugo NAGATA <nagata@sraoss.co.jp>)
List: pgsql-hackers
Hello Yugo-san,

# About v12.2

## Compilation

The patch seems to apply cleanly with "git apply", but does not compile on my host: "undefined reference to `conditional_stack_reset'". However it works better when using "patch". I'm wondering why git apply fails silently…

When compiling there are warnings about "pg_log_fatal", which does not expect a FILE* argument, at pgbench.c:4453. Remove the "stderr" argument.

Global and local checks ok.

> number of transactions failed: 324 (3.240%)
> ...
> number of transactions retried: 5629 (56.290%)
> number of total retries: 103299

I'd suggest "number of failed transactions", and "total number of retries" or just "number of retries".

## Feature

The overall code structure changes to implement the feature seem reasonable to me, as we are at the 12th iteration of the patch. Comments below are mostly about details, asking questions about choices, and commenting…

## Documentation

There is a lot of documentation, which is good. I'll review these separately. It looks good, but having a native English speaker/writer would really help!

Some output examples do not correspond to actual output for the current version. In particular, there is always one TPS figure given now, instead of the confusing two shown before.

## Comments

transactinos -> transactions.

## Code

By default max_tries = 0. Should not the initialization be 1, as the documentation says that it is the default?

Counter comments: there is a missing "+" in the formula on the skipped line.

Given that we manage errors, ISTM that we should not necessarily stop on other non-retried errors, but rather count/report them and possibly proceed. Eg with something like:

-- server side random fail
DO LANGUAGE plpgsql $$
BEGIN
  IF RANDOM() < 0.1 THEN
    RAISE EXCEPTION 'unlucky!';
  END IF;
END;
$$;

Or:

-- client side random fail
BEGIN;
\if random(1, 10) <= 1
SELECT 1 +;
\else
SELECT 2;
\endif
COMMIT;

We could count the failure, rollback if necessary, and go on. What do you think? Maybe such behavior would deserve an option.

--report-latencies -> --report-per-command: should we keep supporting the previous option?

--failures-detailed: if we bother to run with handling failures, should it always be on?

--debug-errors: I'm not sure we should want a special debug mode for that; I'd consider integrating it with the standard debug, or keeping it just for development. Also, should it use pg_log_debug?

doRetry: I'd separate the 3 no-retry cases instead of mixing max_tries and timer_exceeded, for clarity.

Tries vs retries: I'm at odds with having tries & retries and a "+ 1" here and there to handle that, which is a little bit confusing. I'm wondering whether we could only count "tries" and adjust to report what we want later.

advanceConnectionState: ISTM that ERROR should logically be before the other states which lead to it.

Variables management: it looks expensive, with copying and freeing variable arrays. I'm wondering whether we should think of something more clever. Well, that would be for some other patch.

"Accumulate the retries" -> "Count (re)tries"?

Currently, ISTM that the retry-on-error mode is implicitly always on. Do we want that? I'd say yes, but maybe people could disagree.

## Tests

There are tests, good!

I'm wondering whether something simpler could be devised to trigger serialization or deadlock errors, eg with a SEQUENCE and an \if. See the attached files for generating deadlocks reliably (start with 2 clients). What do you think? The PL/pgSQL is minimal; it is really client-code oriented.
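For illustration only, here is the kind of minimal sketch I have in mind (this is not the attached scripts; the "pair" table name and the use of :client_id and \sleep are my own choices). Assuming a two-row table created beforehand with "CREATE TABLE pair(id INT PRIMARY KEY); INSERT INTO pair VALUES (1), (2);", and the script run with at least 2 clients, eg "pgbench -n -c 2 -T 10 -f deadlock.sql", even and odd clients lock the two rows in opposite order, so deadlocks should show up regularly:

-- deadlock.sql: hypothetical sketch for triggering deadlocks
BEGIN;
\if :client_id % 2 = 0
-- even-numbered clients lock row 1 first, then row 2
UPDATE pair SET id = id WHERE id = 1;
\sleep 10 ms
UPDATE pair SET id = id WHERE id = 2;
\else
-- odd-numbered clients lock row 2 first, then row 1
UPDATE pair SET id = id WHERE id = 2;
\sleep 10 ms
UPDATE pair SET id = id WHERE id = 1;
\endif
COMMIT;

On most rounds one of the two clients should get a deadlock error, which the retry logic can then pick up.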
Given that deadlocks are detected about every second, the test runs would take some time. Let it be for now.
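(A side thought, which I have not tried in the tests: since deadlock detection frequency is governed by deadlock_timeout, which defaults to 1s, the test server could lower it to shorten the runs, eg something like:

-- hypothetical: make the test server report deadlocks faster
ALTER SYSTEM SET deadlock_timeout = '100ms';
SELECT pg_reload_conf();

or the equivalent line in the test server's postgresql.conf.)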
--
Fabien.