Thread: BUG #15074: psql client never returns when creating index (longrunning operation)
BUG #15074: psql client never returns when creating index (longrunning operation)
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 15074 Logged by: Lars Vonk Email address: lars.vonk@gmail.com PostgreSQL version: 9.6.7 Operating system: ubuntu 14.04 LTS Description: When creating an index on a table containing approx. 100M records the psql client never returns when the index is eventually created on the server. This happens both in a psql client (9.6.6) on mac osx and using the libpq client via the ruby pg gem when using active records migrations. When running the command directly op the postgres database server using the psql client it works as expected. The logs from the server: 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: LOG: duration: 999618.165 ms 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: LOG: could not receive data from client: Connection reset by peer The psql client just hangs, when trying to cancel the query client side (ctrl-c) the server logs says: 2018-02-16 16:13:53 CET:82.161.205.254(59961):[unknown]@[unknown]:[25738]: LOG: PID 23141 in cancel request did not match any process
Re: BUG #15074: psql client never returns when creating index (longrunning operation)
From
Michael Paquier
Date:
On Mon, Feb 19, 2018 at 08:26:48AM +0000, PG Bug reporting form wrote: > When creating an index on a table containing approx. 100M records the psql > client never returns when the index is eventually created on the server. > This happens both in a psql client (9.6.6) on mac osx and using the libpq > client via the ruby pg gem when using active records migrations. Are you sure that the process got stuck? Perhaps it was spending a lot of time processing and you did not wait for it to finish. > When running the command directly op the postgres database server using the > psql client it works as expected. > > The logs from the server: > 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: > LOG: duration: 999618.165 ms > 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: > LOG: could not receive data from client: Connection reset by peer > > The psql client just hangs, when trying to cancel the query client side > (ctrl-c) the server logs says: > 2018-02-16 16:13:53 CET:82.161.205.254(59961):[unknown]@[unknown]:[25738]: > LOG: PID 23141 in cancel request did not match any process How is defined your schema? It could be possible that a CHECK_FOR_INTERRUPTS() added in the code paths where the type of the index you are creating is involved could help in making the system more responsive to cancel requests. -- Michael
Attachment
Re: BUG #15074: psql client never returns when creating index (long running operation)
From
Andrew Gierth
Date:
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes: PG> When creating an index on a table containing approx. 100M records PG> the psql client never returns when the index is eventually created PG> on the server. This happens both in a psql client (9.6.6) on mac PG> osx and using the libpq client via the ruby pg gem when using PG> active records migrations. PG> When running the command directly op the postgres database server PG> using the psql client it works as expected. PG> The logs from the server: PG> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: PG> LOG: duration: 999618.165 ms PG> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]: PG> LOG: could not receive data from client: Connection reset by peer This looks like a network issue; most likely you have a NAT or stateful firewall in between the client and server which is dropping idle connections, with a timeout set shorter than your keepalive interval on either client or server. The server gets "connection reset" because it tried to send the result of the statement, plus the "ready for query" message, and the NAT/firewall sent back a reset because it no longer recognized the connection as valid. The psql client itself continues to wait for input (and will do so until the keepalive timeout is reached if keepalive is enabled, otherwise indefinitely) because the NAT/firewall doesn't typically send any proactive reset messages. -- Andrew (irc:RhodiumToad)
Re: BUG #15074: psql client never returns when creating index (longrunning operation)
From
Lars Vonk
Date:
Hi,
Thanks for replying. I will double check the network settings, but couldn't find anything.
Are there any psql / postgres settings I can check, I couldn't find any that might be affecting this.
Lars
Op ma 19 feb. 2018 om 13:27 schreef Andrew Gierth <andrew@tao11.riddles.org.uk>:
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> When creating an index on a table containing approx. 100M records
PG> the psql client never returns when the index is eventually created
PG> on the server. This happens both in a psql client (9.6.6) on mac
PG> osx and using the libpq client via the ruby pg gem when using
PG> active records migrations.
PG> When running the command directly op the postgres database server
PG> using the psql client it works as expected.
PG> The logs from the server:
PG> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]:
PG> LOG: duration: 999618.165 ms
PG> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]:
PG> LOG: could not receive data from client: Connection reset by peer
This looks like a network issue; most likely you have a NAT or stateful
firewall in between the client and server which is dropping idle
connections, with a timeout set shorter than your keepalive interval on
either client or server.
The server gets "connection reset" because it tried to send the result
of the statement, plus the "ready for query" message, and the
NAT/firewall sent back a reset because it no longer recognized the
connection as valid. The psql client itself continues to wait for input
(and will do so until the keepalive timeout is reached if keepalive is
enabled, otherwise indefinitely) because the NAT/firewall doesn't
typically send any proactive reset messages.
--
Andrew (irc:RhodiumToad)
Re: BUG #15074: psql client never returns when creating index (longrunning operation)
From
Lars Vonk
Date:
Hi Michael,
Thanks for replying. We waited as least as long as it took to create the index. It is reproducible when creating this particular index. To be clear, the index was created, the client just did not get notified.
> How is defined your schema?
> How is defined your schema?
What schema are you referring to?
> It could be possible that a CHECK_FOR_INTERRUPTS() added in the code paths where the type of the
index you are creating is involved could help in making the system more
responsive to cancel requests.
index you are creating is involved could help in making the system more
responsive to cancel requests.
I am not sure I understand what this means? It is just psql that we are using connecting to a remote database.
Op ma 19 feb. 2018 om 10:21 schreef Michael Paquier <michael@paquier.xyz>:
On Mon, Feb 19, 2018 at 08:26:48AM +0000, PG Bug reporting form wrote:
> When creating an index on a table containing approx. 100M records the psql
> client never returns when the index is eventually created on the server.
> This happens both in a psql client (9.6.6) on mac osx and using the libpq
> client via the ruby pg gem when using active records migrations.
Are you sure that the process got stuck? Perhaps it was spending a lot
of time processing and you did not wait for it to finish.
> When running the command directly op the postgres database server using the
> psql client it works as expected.
>
> The logs from the server:
> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]:
> LOG: duration: 999618.165 ms
> 2018-02-16 16:11:05 CET:82.161.205.254(59584):REDACTED@REDACTED_DB:[23141]:
> LOG: could not receive data from client: Connection reset by peer
>
> The psql client just hangs, when trying to cancel the query client side
> (ctrl-c) the server logs says:
> 2018-02-16 16:13:53 CET:82.161.205.254(59961):[unknown]@[unknown]:[25738]:
> LOG: PID 23141 in cancel request did not match any process
How is defined your schema? It could be possible that a
CHECK_FOR_INTERRUPTS() added in the code paths where the type of the
index you are creating is involved could help in making the system more
responsive to cancel requests.
--
Michael
Re: BUG #15074: psql client never returns when creating index (long running operation)
From
Andrew Gierth
Date:
>>>>> "Lars" == Lars Vonk <lars.vonk@gmail.com> writes: Lars> Hi, Lars> Thanks for replying. I will double check the network settings, Lars> but couldn't find anything. I've seen this so many times that I'd bet a fair sum on it. Lars> Are there any psql / postgres settings I can check, I couldn't Lars> find any that might be affecting this. The only relevant pg settings are the tcp_keepalives_* ones, which (on operating systems that support this, i.e. not Windows) cause pg to request a different keepalive timeout from the OS's default. Note that OS defaults themselves default to a bit over 2 hours unless you explicitly configure something else. Likewise on the client side, if you are using an OS that supports it, you can use keepalives_idle= in the connection string. -- Andrew (irc:RhodiumToad)