Thread: Re: [HACKERS] psql - add special variable to reflect the last query status
Hi
2017-04-04 23:01 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
2017-04-04 22:05 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
After some discussions about what could be useful since psql scripts now accepts tests, this patch sets a few variables which can be used by psql after a "front door" (i.e. actually typed by the user) query:
- RESULT_STATUS: the status of the query
- ERROR: whether the query failed
- ERROR_MESSAGE: ...
- ROW_COUNT: #rows affected
SELECT * FROM ;
\if :ERROR
\echo oops
\q
\endif
I'm not sure that the names are right. Maybe STATUS would be better than RESULT_STATUS.
I am sending review of this patch:
1. I agree so STATUS is better name, than RESULT status. Currently it returns values with prefix PGRES (like PGRES_FATAL_ERROR, PGRES_TUPLES_OK). Maybe we should to cut this prefix. FATAL_ERROR, TUPLES_OK looks better for custom level. The PGRES prefix has not sense in psql.
2. I propose availability to read ERROR_CODE - sometimes it can be more practical than parsing error possible translated message
3. The fields ERROR_MESSAGE and ROW_COUNT are set only when it has sense. This behave is maybe too strict for psql and the processing needs more nesting \if command. What do you think about -1 or 0 for ROW_COUNT (for DDL) and "" for ERROR_MESSAGE when there are not any error? It will be consistent with already implemented LASTOID variable (and other state psql variables). Using default values are not strict clean, but it can reduce complexity of psql scripts.
4. all regress tests passed
5. there are not any problem with doc building
Regards
Pavel
good ideasRegardsPavel
--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello Pavel, >>> After some discussions about what could be useful since psql scripts now >>> accepts tests, this patch sets a few variables which can be used by psql >>> after a "front door" (i.e. actually typed by the user) query: >>> >>> - RESULT_STATUS: the status of the query >>> - ERROR: whether the query failed >>> - ERROR_MESSAGE: ... >>> - ROW_COUNT: #rows affected >>> >>> SELECT * FROM ; >>> \if :ERROR >>> \echo oops >>> \q >>> \endif >>> >>> I'm not sure that the names are right. Maybe STATUS would be better than >>> RESULT_STATUS. >> > I am sending review of this patch: > > 1. I agree so STATUS is better name, than RESULT status. Ok, looks simpler. > Currently it returns values with prefix PGRES (like PGRES_FATAL_ERROR, > PGRES_TUPLES_OK). Maybe we should to cut this prefix. FATAL_ERROR, > TUPLES_OK looks better for custom level. The PGRES prefix has not sense > in psql. Indeed. I skipped "PGRES_". > 2. I propose availability to read ERROR_CODE - sometimes it can be more > practical than parsing error possible translated message Ok. > 3. The fields ERROR_MESSAGE and ROW_COUNT are set only when it has sense. > This behave is maybe too strict for psql and the processing needs more > nesting \if command. What do you think about -1 or 0 for ROW_COUNT (for > DDL) and "" for ERROR_MESSAGE when there are not any error? It will be > consistent with already implemented LASTOID variable (and other state psql > variables). Using default values are not strict clean, but it can reduce > complexity of psql scripts. My intention was that it could be tested with the "is defined" syntax, which is yet to be agreed upon and implemented, so maybe generating empty string is a better option. For ROW_COUNT, I think that it should be consistent with what PL/pgSQL does, so it think that 0 should be the default. > 4. all regress tests passed > 5. there are not any problem with doc building Please find attached a v2 which hopefully takes into account all your points above. Open question: should it gather more PQerrorResultField, or the two selected one are enough? If more, which should be included? -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
2017-05-22 9:40 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,I am sending review of this patch:After some discussions about what could be useful since psql scripts now
accepts tests, this patch sets a few variables which can be used by psql
after a "front door" (i.e. actually typed by the user) query:
- RESULT_STATUS: the status of the query
- ERROR: whether the query failed
- ERROR_MESSAGE: ...
- ROW_COUNT: #rows affected
SELECT * FROM ;
\if :ERROR
\echo oops
\q
\endif
I'm not sure that the names are right. Maybe STATUS would be better than
RESULT_STATUS.
1. I agree so STATUS is better name, than RESULT status.
Ok, looks simpler.Currently it returns values with prefix PGRES (like PGRES_FATAL_ERROR, PGRES_TUPLES_OK). Maybe we should to cut this prefix. FATAL_ERROR, TUPLES_OK looks better for custom level. The PGRES prefix has not sense in psql.
Indeed. I skipped "PGRES_".2. I propose availability to read ERROR_CODE - sometimes it can be more
practical than parsing error possible translated message
Ok.3. The fields ERROR_MESSAGE and ROW_COUNT are set only when it has sense.
This behave is maybe too strict for psql and the processing needs more
nesting \if command. What do you think about -1 or 0 for ROW_COUNT (for
DDL) and "" for ERROR_MESSAGE when there are not any error? It will be
consistent with already implemented LASTOID variable (and other state psql
variables). Using default values are not strict clean, but it can reduce
complexity of psql scripts.
My intention was that it could be tested with the "is defined" syntax, which is yet to be agreed upon and implemented, so maybe generating empty string is a better option.
For ROW_COUNT, I think that it should be consistent with what PL/pgSQL does, so it think that 0 should be the default.4. all regress tests passed
5. there are not any problem with doc building
Please find attached a v2 which hopefully takes into account all your points above.
Open question: should it gather more PQerrorResultField, or the two selected one are enough? If more, which should be included?
I don't think so it is necessary. No in this moment. ERROR_CODE and ERROR_MESSAGE are fundamental - and if we add other, then we should to add all. Has not sense to add only some.
Regards
Pavel
--
Fabien.
>> Please find attached a v2 which hopefully takes into account all your >> points above. >> >> Open question: should it gather more PQerrorResultField, or the two >> selected one are enough? If more, which should be included? > > > I don't think so it is necessary. No in this moment. ERROR_CODE and > ERROR_MESSAGE are fundamental - and if we add other, then we should to add > all. Has not sense to add only some. Ok. I'm fine with stopping at CODE & MESSAGE. -- Fabien.
2017-05-22 21:33 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Please find attached a v2 which hopefully takes into account all your
points above.
Open question: should it gather more PQerrorResultField, or the two
selected one are enough? If more, which should be included?
I don't think so it is necessary. No in this moment. ERROR_CODE and
ERROR_MESSAGE are fundamental - and if we add other, then we should to add
all. Has not sense to add only some.
Ok. I'm fine with stopping at CODE & MESSAGE.
I have not any other comments. The implementation is trivial. I rerun all tests and tests passed.
I'll mark this patch as ready for commiters.
Regards
Pavel
--
Fabien.
Hello Pavel, > I have not any other comments. The implementation is trivial. [...] Indeed. > I'll mark this patch as ready for commiters. Thanks for the review. -- Fabien.
> I have not any other comments. The implementation is trivial. I rerun all > tests and tests passed. > > I'll mark this patch as ready for commiters. Oops, I just noticed a stupid confusion on my part which got through, I was setting "ERROR" as "success", inverting the expected boolean value. Here is a fixed version. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
2017-06-17 7:58 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
I have not any other comments. The implementation is trivial. I rerun all
tests and tests passed.
I'll mark this patch as ready for commiters.
Oops, I just noticed a stupid confusion on my part which got through, I was setting "ERROR" as "success", inverting the expected boolean value.
Here is a fixed version.
I missed it too.
We can introduce macro SetVariableBool(vars, varname, bool) instead
SetVariable(pset.vars, "ERROR", "FALSE");
Regards
Pavel
--
Fabien.
Hi
2017-06-19 5:55 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
2017-06-17 7:58 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:I have not any other comments. The implementation is trivial. I rerun all
tests and tests passed.
I'll mark this patch as ready for commiters.
Oops, I just noticed a stupid confusion on my part which got through, I was setting "ERROR" as "success", inverting the expected boolean value.
Here is a fixed version.I missed it too.We can introduce macro SetVariableBool(vars, varname, bool) insteadSetVariable(pset.vars, "ERROR", "FALSE");
I checked source code, and it requires little bit more harder refactoring because now we have SetVariableBool - what is unhappy name, because it initialize variable to ON value. It is question what is better name?
I found more interesting issue - the code of SetResultVariables is partially redundant with AcceptResult - maybe the switch there can be shared.
Regards
Pavel
RegardsPavel
--
Fabien.
Hello Pavel, >> We can introduce macro SetVariableBool(vars, varname, bool) instead >> >> SetVariable(pset.vars, "ERROR", "FALSE"); > > I checked source code, and it requires little bit more harder refactoring > because now we have SetVariableBool - what is unhappy name, because it > initialize variable to ON value. It is question what is better name? The boolean values (on/off 1/0 true/false...) accepted for pg settings is probably convenient but also somehow fuzzy. From a programming point of view, I like booleans to have either true or false values, and nothing else. I agree that the existing "SetVariableBool" function is a misnommer, it should be "SetVariableOn" given what it does, and it is not what we need. Here is a v4 which attempts to extend & reuse the function. People might be surprised that TRUE is used where ON was used before, so I'm not sure. > I found more interesting issue - the code of SetResultVariables is > partially redundant with AcceptResult - maybe the switch there can be > shared. I agree that there is some common structure, but ISTM that the AcceptResult function is called in a variety of situation where variables are not to be set (eg "internal" queries, not user provided queries), so I thought it best to keep the two apart. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
2017-06-27 17:30 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,We can introduce macro SetVariableBool(vars, varname, bool) instead
SetVariable(pset.vars, "ERROR", "FALSE");
I checked source code, and it requires little bit more harder refactoring
because now we have SetVariableBool - what is unhappy name, because it
initialize variable to ON value. It is question what is better name?
The boolean values (on/off 1/0 true/false...) accepted for pg settings is probably convenient but also somehow fuzzy.
From a programming point of view, I like booleans to have either true or false values, and nothing else.
I agree that the existing "SetVariableBool" function is a misnommer, it should be "SetVariableOn" given what it does, and it is not what we need.
switching default setting from ON to TRUE requires wider discussion - in this moment I like to have special function "SetVariableON".
Here is a v4 which attempts to extend & reuse the function. People might be surprised that TRUE is used where ON was used before, so I'm not sure.I found more interesting issue - the code of SetResultVariables is
partially redundant with AcceptResult - maybe the switch there can be
shared.
I agree that there is some common structure, but ISTM that the AcceptResult function is called in a variety of situation where variables are not to be set (eg "internal" queries, not user provided queries), so I thought it best to keep the two apart.
I understand, but It is not nice, really - maybe only switch can be moved to some inlining function like IsSuccess() - more .. with this function, the SetResultVariables function will be more cleaner
Regards
Pavel
--
Fabien.
Hello Pavel, >> I agree that the existing "SetVariableBool" function is a misnommer, it >> should be "SetVariableOn" given what it does, and it is not what we >> need. > > switching default setting from ON to TRUE requires wider discussion - Yep. > in this moment I like to have special function "SetVariableON". I'm fine with this, but this make it a change totally unrelated to this patch as it would not use the function... Moreover, this function would not use an hypothetical "set var bool" function because of the debatable on/off vs true/false change. Also, a "set var bool" function would be called only twice, which is not very beneficial for a oneliner, so I left it out. >> I agree that there is some common structure, but ISTM that the >> AcceptResult function is called in a variety of situation where variables >> are not to be set (eg "internal" queries, not user provided queries), so I >> thought it best to keep the two apart. > > I understand, but It is not nice, really - maybe only switch can be moved > to some inlining function like IsSuccess() - more .. with this function, > the SetResultVariables function will be more cleaner Indeed. Attached v5 does that. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
2017-06-28 9:25 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,I agree that the existing "SetVariableBool" function is a misnommer, it should be "SetVariableOn" given what it does, and it is not what we need.
switching default setting from ON to TRUE requires wider discussion -
Yep.in this moment I like to have special function "SetVariableON".
I'm fine with this, but this make it a change totally unrelated to this patch as it would not use the function... Moreover, this function would not use an hypothetical "set var bool" function because of the debatable on/off vs true/false change.
Also, a "set var bool" function would be called only twice, which is not very beneficial for a oneliner, so I left it out.I agree that there is some common structure, but ISTM that the
AcceptResult function is called in a variety of situation where variables
are not to be set (eg "internal" queries, not user provided queries), so I
thought it best to keep the two apart.
I understand, but It is not nice, really - maybe only switch can be moved
to some inlining function like IsSuccess() - more .. with this function,
the SetResultVariables function will be more cleaner
Indeed. Attached v5 does that.
juju - something like this
+ if (success)
+ {
+ char *ntuples = PQcmdTuples(results);
+ SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0");
+ SetVariable(pset.vars, "ERROR", "FALSE");
+ }
+ else
+ {
+ SetVariable(pset.vars, "ROW_COUNT", "0");
+ SetVariable(pset.vars, "ERROR", "TRUE");
+ }
+}
It can be simplified
SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0);
SetVariable(pset.vars, "ERROR", success ? "FALSE" : "TRUE");
Regards
Pavel
--
Fabien.
Hello Pavel, > + if (success) > + { > + char *ntuples = PQcmdTuples(results); > + SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0"); > + SetVariable(pset.vars, "ERROR", "FALSE"); > + } > + else > + { > + SetVariable(pset.vars, "ROW_COUNT", "0"); > + SetVariable(pset.vars, "ERROR", "TRUE"); > + } > +} > > It can be simplified > > SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0); According to the documentation, PQcmdTuples returns "" in some cases and ISTM we want "0" instead for consistency, so that it is always a number. I rejected calling PQcmdTuples twice: ..., success && *PQcmdTuples(results) ? PQcmdTuples(results) : "0") Thus it makes the "if (success)" necessary for ROW_COUNT, and then it looked simpler to handle ERROR the same way. Now if the semantics is changed to put as row count whatever comes out of the function, even if not a count, then the code could indeed be simplified as you suggest. -- Fabien.
2017-06-28 10:04 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Hello Pavel,+ if (success)
+ {
+ char *ntuples = PQcmdTuples(results);
+ SetVariable(pset.vars, "ROW_COUNT", *ntuples ? ntuples : "0");
+ SetVariable(pset.vars, "ERROR", "FALSE");
+ }
+ else
+ {
+ SetVariable(pset.vars, "ROW_COUNT", "0");
+ SetVariable(pset.vars, "ERROR", "TRUE");
+ }
+}
It can be simplified
SetVariable(pset.vars, "ROW_COUNT", success ? PQcmdTuples(results) : 0);
According to the documentation, PQcmdTuples returns "" in some cases and ISTM we want "0" instead for consistency, so that it is always a number. I rejected calling PQcmdTuples twice:
..., success && *PQcmdTuples(results) ? PQcmdTuples(results) : "0")
Thus it makes the "if (success)" necessary for ROW_COUNT, and then it looked simpler to handle ERROR the same way.
Now if the semantics is changed to put as row count whatever comes out of the function, even if not a count, then the code could indeed be simplified as you suggest.
Understand
Pavel
--
Fabien.
A few thoughts about this patch: * I think the ERROR_CODE variable should instead be named SQLSTATE. That is what the SQL standard calls this string, and it's also what just about all our documentation calls it; see PG_DIAG_SQLSTATE in libpq, or the SQLSTATE 'xxxxx' construct in pl/pgsql, or the sqlstate attribute of an exception object in plpython, etc etc. * I'm not exactly convinced that there's a use-case for STATUS that's not covered as well or better by ERROR. Client code that looks at PQresStatus for anything beyond error/not-error is usually doing that because it's library code that doesn't know what kind of query it's working on. It seems like a stretch that a psql script would not know that. Also, PQresultStatus memorializes some legacy distinctions, like "fatal" vs "nonfatal" error, that I think we'd be better off not exposing in psql scripting. * It might be better if SQLSTATE and ERROR_MESSAGE were left unchanged by a non-error query. That would reduce the need to copy them into other variables just because you needed to do something else before printing them. It'd save a few cycles too. * Speaking of which, has anyone tried to quantify the performance impact of this patch? It might well be negligible, but I do not think we should assume that without evidence. * I wonder why you didn't merge this processing into ProcessResult, instead of inventing an extra function (whose call site seems rather poorly chosen anyhow --- what's the justification for not counting this overhead as part of the query runtime?). You could probably eliminate the refactoring you did, since it wouldn't be necessary to recompute AcceptResult's result that way. regards, tom lane
Hello Tom, > * I think the ERROR_CODE variable should instead be named SQLSTATE. > That is what the SQL standard calls this string, and it's also what > just about all our documentation calls it; see PG_DIAG_SQLSTATE > in libpq, or the SQLSTATE 'xxxxx' construct in pl/pgsql, or the > sqlstate attribute of an exception object in plpython, etc etc. I choose ERROR_CODE because it matched the ERROR boolean. But is may be a misnomer if the status is that all is well. I'm okay with SQLSTATE. > * I'm not exactly convinced that there's a use-case for STATUS that's > not covered as well or better by ERROR. Client code that looks at > PQresStatus for anything beyond error/not-error is usually doing that > because it's library code that doesn't know what kind of query it's > working on. It seems like a stretch that a psql script would not know > that. Also, PQresultStatus memorializes some legacy distinctions, like > "fatal" vs "nonfatal" error, that I think we'd be better off not > exposing in psql scripting. Ok. > * It might be better if SQLSTATE and ERROR_MESSAGE were left > unchanged by a non-error query. Hmmm. I'm not sure. If so, ERROR_MESSAGE should be LAST_ERROR_MESSAGE maybe? Then what about LAST_ERROR_SQLSTATE to go with it, and let SQLSTATE & ERROR_MESSAGE reflect the last command, and ERROR is just the boolean to test if it occured? > That would reduce the need to copy them into other variables just > because you needed to do something else before printing them. It'd save > a few cycles too. Well, my suggestion would mean that they would be copied when an error occurs, but only when it occurs, which would not be often. If you would like them, I'm not sure how these variable should be initialized. Undefined? Empty? > * Speaking of which, has anyone tried to quantify the performance > impact of this patch? It might well be negligible, but I do not > think we should assume that without evidence. I think it should be negligible compared to network connections, aborting an ongoing transaction, reading the script... But I do not know libpq internals so I may be quite naive. > * I wonder why you didn't merge this processing into ProcessResult, > instead of inventing an extra function (whose call site seems rather > poorly chosen anyhow --- what's the justification for not counting this > overhead as part of the query runtime?). You could probably eliminate > the refactoring you did, since it wouldn't be necessary to recompute > AcceptResult's result that way. Hmmm. I assume that you are unhappy about ResultIsSuccess. The refactoring is because the function is used twice. I choose to do that because the functionality is clear and could be made as a function which improved readability. Ok, PQresultStatus is thus called twice, I assumed that it is just reading a field in a struct... it could be returned to the caller with an additional pointer to avoid that. The SendResult & ProcessResult functions are already quite heavy to my taste, I did not want to add significantly to that. The ProcessResult switch does not test all states cleanly, it is really about checking about copy, and not so clear, so I do not think that it would mix well to add the variable stuff in the middle of that. SendQuery is also pretty complex, including gotos everywhere. So I did want to add to these two functions beyond the minimum. Now, I can also inline everything coldly in ProcessResult, no problem. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> * It might be better if SQLSTATE and ERROR_MESSAGE were left >> unchanged by a non-error query. > Hmmm. I'm not sure. If so, ERROR_MESSAGE should be LAST_ERROR_MESSAGE > maybe? Then what about LAST_ERROR_SQLSTATE to go with it, and let SQLSTATE > & ERROR_MESSAGE reflect the last command, and ERROR is just the boolean to > test if it occured? >> That would reduce the need to copy them into other variables just >> because you needed to do something else before printing them. It'd save >> a few cycles too. > Well, my suggestion would mean that they would be copied when an error > occurs, but only when it occurs, which would not be often. Uh ... what? regards, tom lane
>>> * It might be better if SQLSTATE and ERROR_MESSAGE were left >>> unchanged by a non-error query. > >> Hmmm. I'm not sure. If so, ERROR_MESSAGE should be LAST_ERROR_MESSAGE >> maybe? Then what about LAST_ERROR_SQLSTATE to go with it, and let SQLSTATE >> & ERROR_MESSAGE reflect the last command, and ERROR is just the boolean to >> test if it occured? > >>> That would reduce the need to copy them into other variables just >>> because you needed to do something else before printing them. It'd save >>> a few cycles too. > >> Well, my suggestion would mean that they would be copied when an error >> occurs, but only when it occurs, which would not be often. > > Uh ... what? Sorry if my sentence was not very clear. Time to go do bed:-) I just mean that a LAST_ERROR_* would be set when an error occurs. When there is no error, it is expected to remain the same, and it does not cost anything to let it as is. If an error occured then you had a problem, a transaction aborted, paying to set a few variables when it occurs does not look like a big performance issue. Script usually expect to run without error, errors are rare events. -- Fabien.
Hello Tom, Here is a version 6. > A few thoughts about this patch: > > * I think the ERROR_CODE variable should instead be named SQLSTATE. > That is what the SQL standard calls this string, and it's also what > just about all our documentation calls it; see PG_DIAG_SQLSTATE > in libpq, or the SQLSTATE 'xxxxx' construct in pl/pgsql, or the > sqlstate attribute of an exception object in plpython, etc etc. ERROR_CODE -> SQLSTATE. > * I'm not exactly convinced that there's a use-case for STATUS Removed, but I think it was nice to have, it is easier to interpret than error codes and their classes that I have not memorized yet:-) > * It might be better if SQLSTATE and ERROR_MESSAGE were left > unchanged by a non-error query. That would reduce the need to > copy them into other variables just because you needed to do > something else before printing them. It'd save a few cycles too. Added LAST_ERROR_SQLSTATE & MESSAGE, only reset when an error occured. > * Speaking of which, has anyone tried to quantify the performance > impact of this patch? It might well be negligible, but I do not > think we should assume that without evidence. My guess is negligible. Not sure how to measure this negligible, as many very fast query should be executed to have something significant. Maybe 100,000 "SELECT 1;" in a script? > * I wonder why you didn't merge this processing into ProcessResult, > instead of inventing an extra function (whose call site seems rather > poorly chosen anyhow --- what's the justification for not counting this > overhead as part of the query runtime?). You could probably eliminate > the refactoring you did, since it wouldn't be necessary to recompute > AcceptResult's result that way. Variable setting moved at then end of ProcessResult, no new functions, result is clean, so I should have done it like that in the beginning. Forgotten help stuff added. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
> Here is a version 6. Small v7 update, sorry for the noise. Add testing the initial state of all variables. Fix typos in a comment in tests. Fix the documentation wrt the current implementation behavior. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hi
2017-09-06 11:14 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
Here is a version 6.
Small v7 update, sorry for the noise.
Add testing the initial state of all variables.
Fix typos in a comment in tests.
Fix the documentation wrt the current implementation behavior.
I rechecked last patch
There is not any issue with patching. All regress tests passed
I checked performance - the most fast queries are execution of simple prepared statement
prepare x as select 1;
-- 1000000 x
execute x;
execute x;
execute x;
execute x;
## patched
[pavel@nemesis postgresql]$ time psql -At -1 postgres -f ~/xxx.sql > /dev/null
real 0m44,887s
user 0m11,703s
sys 0m6,942s
This is probably the most worst case, what is possible and see some slowdown - in this case there is about 10% slowdown -
but it is pretty untypical - the one query was less than 50 microsec. When there will be any IO activity or network usage, than this patch doesn't create any significant overhead.
I'll mark this patch as ready for commiter
Regards
Pavel
--
Fabien.
Hello Pavel, > I checked performance - the most fast queries are execution of simple > prepared statement > > prepare x as select 1; > -- 1000000 x > execute x; > execute x; > execute x; > execute x; > > ## patched > [pavel@nemesis postgresql]$ time psql -At -1 postgres -f ~/xxx.sql > > /dev/null > > real 0m44,887s > user 0m11,703s > sys 0m6,942s > > This is probably the most worst case, what is possible and see some > slowdown - in this case there is about 10% slowdown - > > but it is pretty untypical - the one query was less than 50 microsec. When > there will be any IO activity or network usage, than this patch doesn't > create any significant overhead. Interesting. Thanks for the test. I tried to replicate with a variant without any output: "SELECT;" SELECT NOW() AS start \gset BEGIN; SELECT; -- 2^19 times END; SELECT NOW() - :'start'; The run time is about 19 µs per SELECT on my laptop. Over 33 runs each alternating master with and without the patch, I got the following stats on the measured time in seconds (average, stddev [min Q1 median Q3 max]): - with : 9.898 ± 0.158 [9.564, 9.762, 9.936, 10.037, 10.108] - without: 9.419 ± 0.294 [8.670, 9.226, 9.533, 9.625, 9.845] This seems consistent and significant. It suggests a 0.40-0.50 s difference, that is about 5%, i.e. about (under) 1 µs overhead per statement in pretty defavorable circumstances. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: > Small v7 update, sorry for the noise. Hm. Looking closer at this, I see that it doesn't work so well after all to put the variable-setting code in ProcessResult: that fails to cover the ExecQueryUsingCursor code path. And it also fails to cover DescribeQuery, which arguably should set these variables as well -- certainly so if it gets a failure. Maybe you could create a small subroutine along the lines of SetResultVariables(PGresult *result, bool success) for all three places to call. (ProcessResult certainly has already decided whether it's got a success, and I think the other paths would know that as well, so no need to re-extract it from the PGresult.) I think you're overly optimistic to believe that every failure will have a SQLSTATE; I don't think that's true for libpq-reported errors, such as connection loss. Using upper-case TRUE/FALSE for the values of ERROR seems a bit ugly to me; we generally use lower case for other variable values, so I'd go with true/false. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hello Tom, > Hm. Looking closer at this, I see that it doesn't work so well after all > to put the variable-setting code in ProcessResult: > that fails to cover the ExecQueryUsingCursor code path. Ok, I'll investigate this path. > And it also fails to cover DescribeQuery, which arguably should set > these variables as well And this one. > -- certainly so if it gets a failure. Maybe you > could create a small subroutine along the lines of > SetResultVariables(PGresult *result, bool success) for all three places > to call. (ProcessResult certainly has already decided whether it's got > a success, and I think the other paths would know that as well, so no > need to re-extract it from the PGresult.) Ok. > I think you're overly optimistic to believe that every failure will > have a SQLSTATE; I don't think that's true for libpq-reported errors, > such as connection loss. Yep, I thought I was optimistic:-) Can I add a special SQLSTATE for that situation where libpq did not report an error? > Using upper-case TRUE/FALSE for the values of ERROR seems a bit > ugly to me; we generally use lower case for other variable values, > so I'd go with true/false. Ok. The choice is not aesthetic but systematic: I use upper-case for all SQL keywords, and lower-case or capitalized for anything user land. I can put lower-case if you want. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> I think you're overly optimistic to believe that every failure will >> have a SQLSTATE; I don't think that's true for libpq-reported errors, >> such as connection loss. > Yep, I thought I was optimistic:-) Can I add a special SQLSTATE for that > situation where libpq did not report an error? Meh. If we're going to do that I think it might be better to hack libpq itself to do so, ie, force PQresultErrorField(..., PG_DIAG_SQLSTATE) to always return something. But it seems like a hack either way. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
>>> I think you're overly optimistic to believe that every failure will >>> have a SQLSTATE; I don't think that's true for libpq-reported errors, >>> such as connection loss. > >> Yep, I thought I was optimistic:-) Can I add a special SQLSTATE for that >> situation where libpq did not report an error? > > Meh. If we're going to do that I think it might be better to hack > libpq itself to do so, ie, force PQresultErrorField(..., PG_DIAG_SQLSTATE) > to always return something. But it seems like a hack either way. I would not have took the liberty to hack into libpq internals for such a small front-end feature. However I agree that having libpq always return some diagnostic, even if it means "something unclear happened, sorry not to be very precise", would be better. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
2017-09-11 20:46 GMT+02:00 Fabien COELHO <coelho@cri.ensmp.fr>:
I think you're overly optimistic to believe that every failure will
have a SQLSTATE; I don't think that's true for libpq-reported errors,
such as connection loss.Yep, I thought I was optimistic:-) Can I add a special SQLSTATE for that
situation where libpq did not report an error?
Meh. If we're going to do that I think it might be better to hack
libpq itself to do so, ie, force PQresultErrorField(..., PG_DIAG_SQLSTATE)
to always return something. But it seems like a hack either way.
I would not have took the liberty to hack into libpq internals for such a small front-end feature. However I agree that having libpq always return some diagnostic, even if it means "something unclear happened, sorry not to be very precise", would be better.
probably better don't do it before somebody implements this are correctly .. some temporary solution can introduce possible compatibility issues.
If SQLSTATE has not know value, then it should be NULL or maybe empty string.
Pavel
--
Fabien.
Hello Tom, >>> Yep, I thought I was optimistic:-) Can I add a special SQLSTATE for that >>> situation where libpq did not report an error? >> >> Meh. If we're going to do that I think it might be better to hack >> libpq itself to do so, ie, force PQresultErrorField(..., PG_DIAG_SQLSTATE) >> to always return something. But it seems like a hack either way. > > I would not have took the liberty to hack into libpq internals for such a > small front-end feature. However I agree that having libpq always return some > diagnostic, even if it means "something unclear happened, sorry not to be > very precise", would be better. Here is an attempt at implementing your suggestions. I added two error codes, which is debatable. One is used hardcoded by libpq if no diagnostic is found, and the other by psql if libpq returned something empty, which might happen if psql is linked with an older libpq, maybe. I do not know how to trigger such errors anyway, so this is rather academic. I put back SetResultVariables function which is called twice, for SQL queries and the new descriptions. It worked out of the box with DECLARE which is just another SQL statement, so maybe I did not understood the cursor issue you were signaling... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Tue, Sep 12, 2017 at 1:23 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > I added two error codes, which is debatable. One is used hardcoded by libpq > if no diagnostic is found, and the other by psql if libpq returned something > empty, which might happen if psql is linked with an older libpq, maybe. I do > not know how to trigger such errors anyway, so this is rather academic. I think this is a bad plan. Right now, libpq sets no SQLSTATE for internally generated errors; it is almost certain that there are applications testing for an empty SQLSTATE to notice when they're getting an error from libpq. EnterpriseDB had a support ticket quite recently where this precise behavior was at issue. Changing it will break stuff, so we shouldn't do it unless there's a really compelling benefit. Universally returning PQ000 is not a sufficient improvement over universally returning the empty string to justify the risk of application breakage. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
2017-09-12 20:43 GMT+02:00 Robert Haas <robertmhaas@gmail.com>:
On Tue, Sep 12, 2017 at 1:23 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> I added two error codes, which is debatable. One is used hardcoded by libpq
> if no diagnostic is found, and the other by psql if libpq returned something
> empty, which might happen if psql is linked with an older libpq, maybe. I do
> not know how to trigger such errors anyway, so this is rather academic.
I think this is a bad plan. Right now, libpq sets no SQLSTATE for
internally generated errors; it is almost certain that there are
applications testing for an empty SQLSTATE to notice when they're
getting an error from libpq. EnterpriseDB had a support ticket quite
recently where this precise behavior was at issue. Changing it will
break stuff, so we shouldn't do it unless there's a really compelling
benefit. Universally returning PQ000 is not a sufficient improvement
over universally returning the empty string to justify the risk of
application breakage.
+1
Pavel
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Sep 12, 2017 at 1:23 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: >> I added two error codes, which is debatable. One is used hardcoded by libpq >> if no diagnostic is found, and the other by psql if libpq returned something >> empty, which might happen if psql is linked with an older libpq, maybe. I do >> not know how to trigger such errors anyway, so this is rather academic. > I think this is a bad plan. Right now, libpq sets no SQLSTATE for > internally generated errors; it is almost certain that there are > applications testing for an empty SQLSTATE to notice when they're > getting an error from libpq. EnterpriseDB had a support ticket quite > recently where this precise behavior was at issue. Changing it will > break stuff, so we shouldn't do it unless there's a really compelling > benefit. Universally returning PQ000 is not a sufficient improvement > over universally returning the empty string to justify the risk of > application breakage. I don't think I want to buy this argument, because the logical conclusion of it is that we can never fix libpq to offer proper SQLSTATEs for client-side errors. Admittedly, the fact that nobody's bothered to do so in ~15 years may indicate that nobody cares ... but I would think that at least it'd be useful to distinguish, say, ENOMEM from connection loss. Saying we can't do it for compatibility reasons doesn't sound great to me. Especially when you've not provided any hard evidence as to why the current lack-of-information is useful. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 12, 2017 at 3:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think this is a bad plan. Right now, libpq sets no SQLSTATE for >> internally generated errors; it is almost certain that there are >> applications testing for an empty SQLSTATE to notice when they're >> getting an error from libpq. EnterpriseDB had a support ticket quite >> recently where this precise behavior was at issue. Changing it will >> break stuff, so we shouldn't do it unless there's a really compelling >> benefit. Universally returning PQ000 is not a sufficient improvement >> over universally returning the empty string to justify the risk of >> application breakage. > > I don't think I want to buy this argument, because the logical conclusion > of it is that we can never fix libpq to offer proper SQLSTATEs for > client-side errors. Admittedly, the fact that nobody's bothered to do so > in ~15 years may indicate that nobody cares ... but I would think that > at least it'd be useful to distinguish, say, ENOMEM from connection loss. > Saying we can't do it for compatibility reasons doesn't sound great > to me. Especially when you've not provided any hard evidence as to why > the current lack-of-information is useful. Well, if we provided a different SQLSTATE for each qualitatively different type of libpq error, that might well be useful enough to justify some risk of application breakage. But replacing a constant string that we've had for ~15 years with a different constraint string isn't doing anything about the lack-of-information problem you're complaining about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes: > Well, if we provided a different SQLSTATE for each qualitatively > different type of libpq error, that might well be useful enough to > justify some risk of application breakage. But replacing a constant > string that we've had for ~15 years with a different constraint string > isn't doing anything about the lack-of-information problem you're > complaining about. True. Well, the original point here was whether psql ought to be doing something to mask libpq's (mis) behavior. I'm inclined to think not: if it doesn't get a SQLSTATE from the PGresult, it should just set the sqlstate variables to empty strings. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
>> Well, if we provided a different SQLSTATE for each qualitatively >> different type of libpq error, that might well be useful enough to >> justify some risk of application breakage. But replacing a constant >> string that we've had for ~15 years with a different constraint string >> isn't doing anything about the lack-of-information problem you're >> complaining about. > > True. Well, the original point here was whether psql ought to be doing > something to mask libpq's (mis) behavior. I'm inclined to think not: > if it doesn't get a SQLSTATE from the PGresult, it should just set the > sqlstate variables to empty strings. See v9 attached. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Fabien COELHO <coelho@cri.ensmp.fr> writes: > See v9 attached. I've pushed this with some editorialization. > I put back SetResultVariables function which is called twice, for SQL > queries and the new descriptions. It worked out of the box with DECLARE > which is just another SQL statement, so maybe I did not understood the > cursor issue you were signaling... No, I was concerned about ExecQueryUsingCursor(), which is used when FETCH_COUNT is enabled. It's sort of a pain because you have to accumulate the row count across multiple PGresults. If you don't, then FETCH_COUNT mode isn't transparent, which it's supposed to be. I did some performance testing of my own, based on this possibly-silly test case: perl -e 'for($i=0;$i<9999999;$i++) {print "set enable_seqscan=0;\n";}' | psql -q The idea was to run a trivial query and minimize all other psql overhead, particularly results-printing. With this, "perf" told me that SetResultVariables and its called functions accounted for 1.5% of total CPU (including the server processes). That's kind of high, but it's probably tolerable considering that any real application would involve both far more server work per query and far more psql work (at least for SELECTs). One thing we could think about if this seems too high is to drop ROW_COUNT. I'm unconvinced that it has a real use-case, and it seems to be taking more than its share of the work in non-error cases, because it turns out that PQcmdTuples() is not an amazingly cheap function. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hello Tom, >> I put back SetResultVariables function which is called twice, for SQL >> queries and the new descriptions. It worked out of the box with DECLARE >> which is just another SQL statement, so maybe I did not understood the >> cursor issue you were signaling... > > No, I was concerned about ExecQueryUsingCursor(), which is used when > FETCH_COUNT is enabled. It's sort of a pain because you have to > accumulate the row count across multiple PGresults. If you don't, > then FETCH_COUNT mode isn't transparent, which it's supposed to be. Please allow me to disagree a little with this semantics. ISTM that the semantics of the simple previous implementation was fine, "number of rows returned by previous statement". If you do "FETCH 3 ...", then you get between 0 and 3 rows... Good. If you do it again, same... I'm not sure having an accumulation semantics helps a lot, because it creates an exception, and moreover I can think of legitimate use case where counting only last statement rows would be useful, eg to check that we are done with a cursor and it can be closed. If someone really wants to accumulate, it can be done by hand quite simply, currently as: SELECT :ROW_COUNT + :accum AS accum \gset or client side: \set accum `expr :ROW_COUNT + :accum` and maybe some day something like: \let accum :ROW_COUNT + :accum > I did some performance testing of my own, based on this possibly-silly > test case: [...] > > The idea was to run a trivial query and minimize all other psql overhead, > particularly results-printing. With this, "perf" told me that > SetResultVariables and its called functions accounted for 1.5% of total > CPU (including the server processes). > That's kind of high, but it's probably tolerable considering that any > real application would involve both far more server work per query and > far more psql work (at least for SELECTs). This seems pretty reasonable to me, and is consistent with my 1% elapsed time measure on a silent "SELECT;". > One thing we could think about if this seems too high is to drop > ROW_COUNT. I'm unconvinced that it has a real use-case, and it seems > to be taking more than its share of the work in non-error cases, because > it turns out that PQcmdTuples() is not an amazingly cheap function. I do think that a small overhead on a contrived case is worth removing the feature, as it is really insignificant on any realistic case. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
>> One thing we could think about if this seems too high is to drop >> ROW_COUNT. I'm unconvinced that it has a real use-case, and it seems >> to be taking more than its share of the work in non-error cases, because >> it turns out that PQcmdTuples() is not an amazingly cheap function. > > I do think that a small overhead on a contrived case is worth removing the > feature, as it is really insignificant on any realistic case. Please read: I do NOT think that... -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers