Thread: Statement timeout in pg_rewind
Hi, It is quite common to set a global statement_timeout to a few seconds (or minutes) in postgresql.conf in order to avoid hitting a production server with slow/bad queries. This value is applied to all connections in the system unless it is redefined per database, user, or explicitly changed in the connection. Pg_rewind runs quite a few queries on the primary and if statement_timeout hits one of them it breaks the whole process. I can't tell for sure if this is an unrecoverable error or not and maybe the second run of pg_rewind would be able to finish the process. Even in case if retry is possible it makes it hard to use it for reliable automation. There are a few workarounds to this problem: 1. ALTER DATABASE postgres SET statement_timeout = 0; 2. ALTER rewind_username SET statement_timeout = 0; 3. Run export PGOPTIONS="-c statement_timeout=0" before calling pg_rwind. All of them have certain pros and cons. The third approach works good for automation, but IMHO we should simply fix pg_rewind itself and SET statement_timeout after establishing a connection, so everybody will benefit from it. Patch attached. Regards, -- Alexander Kukushkin
Attachment
On Fri, Aug 23, 2019 at 10:05:02AM +0200, Alexander Kukushkin wrote: > Hi, > > It is quite common to set a global statement_timeout to a few seconds > (or minutes) in postgresql.conf in order to avoid hitting a production > server with slow/bad queries. True! Is pg_rewind the only thing that this hits? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > Is pg_rewind the only thing that this hits? pg_dump has forced statement_timeout to 0 for ages. If pg_rewind is also likely to have a long-running transaction, I don't see any good reason for it not to do likewise. regards, tom lane
On Sun, Aug 25, 2019 at 04:30:38PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > Is pg_rewind the only thing that this hits? > > pg_dump has forced statement_timeout to 0 for ages. If pg_rewind > is also likely to have a long-running transaction, I don't see any > good reason for it not to do likewise. My mistake. I meant to ask whether, in addition to pg_dump and pg_rewind, there are other things that should ignore statement_timeout settings. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sun, Aug 25, 2019 at 10:34:29PM +0200, David Fetter wrote: > I meant to ask whether, in addition to pg_dump and pg_rewind, there > are other things that should ignore statement_timeout settings. Sure. Please note that I am not sure if it is worth bothering about all the code paths which emit SQL queries as one thing to consider is that your query "SET statement_timeout = 0" could be cancelled by the system's default, defeating its purpose. (For example, just enforce statement_timeout = 1 into PostgresNode.pm::init and enjoy the show). So any tool that we consider worth fixing should be able to act with timeout values that are realistic. On top of pg_rewind, there is a point to raise about src/bin/scripts/ (just enforce the parameters in common.c), and vacuumlo which creates a temporary table potentially large. Alexander, it seems to me that we should also consider lock_timeout and idle_in_transaction_session_timeout (new as of 9.6), no? We could also group the PQexec/PQresultStatus into a simple wrapper which gets also called by run_simple_query(). -- Michael
Attachment
Hi, On Mon, 26 Aug 2019 at 06:28, Michael Paquier <michael@paquier.xyz> wrote: > Alexander, it seems to me that we should also consider lock_timeout > and idle_in_transaction_session_timeout (new as of 9.6), no? We could Well, I was thinking about it and came to the conclusion that we are neither taking heavy locks nor explicitly opening a transaction and therefore we can avoid changing them. But maybe you are right, having them set to the safe value shouldn't hurt. > also group the PQexec/PQresultStatus into a simple wrapper which gets > also called by run_simple_query(). I don't think we can use the same wrapper for run_simple_query() and for places where we call a SET, because PQresultStatus() returns PGRES_TUPLES_OK and PGRES_COMMAND_OK respectively. Passing expected ExecStatusType to the wrapper for comparison is looking a bit ugly to me. Regards, -- Alexander Kukushkin
On Mon, Aug 26, 2019 at 03:42:46PM +0200, Alexander Kukushkin wrote: > Well, I was thinking about it and came to the conclusion that we are > neither taking heavy locks nor explicitly opening a transaction and > therefore we can avoid changing them. > But maybe you are right, having them set to the safe value shouldn't > hurt. I'd rather be on the safe side and as we are looking at this at this area.. Who knows if this logic is going to change in the future and how it will change. > I don't think we can use the same wrapper for run_simple_query() and > for places where we call a SET, because PQresultStatus() returns > PGRES_TUPLES_OK and PGRES_COMMAND_OK respectively. > Passing expected ExecStatusType to the wrapper for comparison is > looking a bit ugly to me. Oops, I misread this part. What about a simple wrapper run_simple_command which checks after PGRES_COMMAND_OK, and frees the result then? This could be used for the temporary table creation and when setting synchronous_commit. -- Michael
Attachment
On Tue, 27 Aug 2019 at 08:36, Michael Paquier <michael@paquier.xyz> wrote: > I'd rather be on the safe side and as we are looking at this at this > area.. Who knows if this logic is going to change in the future and > how it will change. Agree. > Oops, I misread this part. What about a simple wrapper > run_simple_command which checks after PGRES_COMMAND_OK, and frees the > result then? This could be used for the temporary table creation and > when setting synchronous_commit. Done, please see the next version attached. Regards, -- Alexander Kukushkin
Attachment
On Tue, Aug 27, 2019 at 10:45:27AM +0200, Alexander Kukushkin wrote: > Done, please see the next version attached. I have made the new error message consistent with run_simple_query to avoid more work to translators and because it is possible to know immediately the code path involved thanks to the SQL query, then applied the fix down to 9.5 where pg_rewind has been added. Please note that your patch had a warning as "result" is not needed in run_simple_command(). idle_in_transaction_session_timeout only applies to 9.6 and newer versions. lock_timeout (imagine a concurrent lock on pg_class for example) and statement_timeout can cause issues, but the full set gets disabled as your patch did and as mentioned upthread. -- Michael
Attachment
Hi, Thank you, Michael, for committing it! On Wed, 28 Aug 2019 at 04:51, Michael Paquier <michael@paquier.xyz> wrote: > note that your patch had a warning as "result" is not needed in > run_simple_command(). Ohh, sorry about that. I compiled the whole source tree and the warning was buried down among other output :( Regards, -- Alexander Kukushkin
Alexander Kukushkin <cyberdemn@gmail.com> writes: > On Wed, 28 Aug 2019 at 04:51, Michael Paquier <michael@paquier.xyz> wrote: >> note that your patch had a warning as "result" is not needed in >> run_simple_command(). > Ohh, sorry about that. I compiled the whole source tree and the > warning was buried down among other output :( "make -s" is your friend ... regards, tom lane