Thread: Multi-second pauses blocking even trivial activity

Multi-second pauses blocking even trivial activity

From
Patrick Molgaard
Date:
Hi folks,

I've been seeing some curious behaviour on a postgres server I administer.

Intermittently (one or two times a week), all queries on that host are simultaneously blocked for extended periods (10s of seconds).

The blocked queries are trivial & not related to locking - I'm seeing slowlogs of the form:

`LOG: duration: 22627.299 ms statement: SET client_encoding='''utf-8''';` 

where this is the first statement on a fresh connection.

It happens even for connections from the same host - so it doesn't appear to be e.g. network slowness, if that is even counted in query duration.

Does anyone have any hints for where to look for a cause?

Thanks,

Patrick

-----

Set up information:

Postgres version:  PostgreSQL 9.6.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit

Full Table and Index Schema: not applicable, I think

EXPLAIN ANALYSE: n/a

Hardware: AWS i3.2xlarge

Maintenance Setup: autovacuum yes, but the times it runs don't correlate to the incidences of slow queries

WAL settings: shipped to S3 with wal-e, stored on same disk for interim period




Re: Multi-second pauses blocking even trivial activity

From
Jeff Janes
Date:
On Fri, Sep 7, 2018 at 8:00 AM Patrick Molgaard <draaglom@gmail.com> wrote:
Hi folks,

I've been seeing some curious behaviour on a postgres server I administer.

Intermittently (one or two times a week), all queries on that host are simultaneously blocked for extended periods (10s of seconds).

The blocked queries are trivial & not related to locking - I'm seeing slowlogs of the form:

`LOG: duration: 22627.299 ms statement: SET client_encoding='''utf-8''';` 


Do you have log_lock_waits set to on?  If not, you might want to turn it on.

Cheers,

Jeff

Re: Multi-second pauses blocking even trivial activity

From
Patrick Molgaard
Date:

Hi Jeff,

Thanks for your reply. Are locks relevant in this case, though? 

To be clear, the slow statements are the first thing happening on the connection and don't look like they should be acquiring any kind of lock - eg. 'select version();' also seems to be paused when it occurs.

Or are there some system level locks that a trivial query, touching no relations, might be contending for?

Best
Patrick

On Fri, 7 Sep 2018, 15:32 Jeff Janes, <jeff.janes@gmail.com> wrote:
On Fri, Sep 7, 2018 at 8:00 AM Patrick Molgaard <draaglom@gmail.com> wrote:
Hi folks,

I've been seeing some curious behaviour on a postgres server I administer.

Intermittently (one or two times a week), all queries on that host are simultaneously blocked for extended periods (10s of seconds).

The blocked queries are trivial & not related to locking - I'm seeing slowlogs of the form:

`LOG: duration: 22627.299 ms statement: SET client_encoding='''utf-8''';` 


Do you have log_lock_waits set to on?  If not, you might want to turn it on.

Cheers,

Jeff

Re: Multi-second pauses blocking even trivial activity

From
Andreas Kretschmer
Date:
>
>Intermittently (one or two times a week), all queries on that host are
>simultaneously blocked for extended periods (10s of seconds).
>
>The blocked queries are trivial & not related to locking - I'm seeing
>slowlogs of the form:
>


please check if THP are enabled.


Regards, Andreas



--
2ndQuadrant - The PostgreSQL Support Company


Re: Multi-second pauses blocking even trivial activity

From
"Joshua D. Drake"
Date:
On 09/07/2018 11:12 AM, Andreas Kretschmer wrote:
>> Intermittently (one or two times a week), all queries on that host are
>> simultaneously blocked for extended periods (10s of seconds).
>>
>> The blocked queries are trivial & not related to locking - I'm seeing
>> slowlogs of the form:
>>
>
> please check if THP are enabled.

Just to help out those who don't know I believe that Andreas is 
referring to Transparent Huge Pages.

JD

>
>
> Regards, Andreas
>
>
>

-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc
***  A fault and talent of mine is to tell it exactly how it is.  ***
PostgreSQL centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://postgresconf.org
*****     Unless otherwise stated, opinions are my own.   *****



Re: Multi-second pauses blocking even trivial activity

From
Jeff Janes
Date:
On Fri, Sep 7, 2018 at 2:03 PM Patrick Molgaard <draaglom@gmail.com> wrote:

Hi Jeff,

Thanks for your reply. Are locks relevant in this case, though? 

I don't know, but why theorize when we can know for sure?  It at least invokes VirtualXactLockTableInsert.  I don't see how that could block on a heavyweight lock, though. But again, why theorize when logging it is simple?

Is it always the first statement in a connection which is blocking, or will established connections also block at the same time the new ones start to block?
 
Cheers,

Jeff

Re: Multi-second pauses blocking even trivial activity

From
Patrick Molgaard
Date:
Oh, to be clear - I'll be implementing your suggestion regardless, it seems valuable whether or not it gets me closer to the root cause this time :)

I was just trying to dig into why it may be relevant -- I want to really get a good grip on the mechanism behind this phenomenon.

Cheers
Patrick
On Fri, 7 Sep 2018, 20:20 Jeff Janes, <jeff.janes@gmail.com> wrote:
On Fri, Sep 7, 2018 at 2:03 PM Patrick Molgaard <draaglom@gmail.com> wrote:

Hi Jeff,

Thanks for your reply. Are locks relevant in this case, though? 

I don't know, but why theorize when we can know for sure?  It at least invokes VirtualXactLockTableInsert.  I don't see how that could block on a heavyweight lock, though. But again, why theorize when logging it is simple?

Is it always the first statement in a connection which is blocking, or will established connections also block at the same time the new ones start to block?
 
Cheers,

Jeff

Re: Multi-second pauses blocking even trivial activity

From
Patrick Molgaard
Date:
This sounds extremely plausible -- thanks for the tip, Andreas.

Best,
Patrick

On Fri, 7 Sep 2018, 19:20 Andreas Kretschmer, <andreas@a-kretschmer.de> wrote:

>
>Intermittently (one or two times a week), all queries on that host are
>simultaneously blocked for extended periods (10s of seconds).
>
>The blocked queries are trivial & not related to locking - I'm seeing
>slowlogs of the form:
>


please check if THP are enabled.


Regards, Andreas



--
2ndQuadrant - The PostgreSQL Support Company

Re: Multi-second pauses blocking even trivial activity

From
Patrick Molgaard
Date:
Andreas -- just following up to say that this was indeed the root cause. Thanks again.

Patrick

On Sat, 8 Sep 2018, 01:34 Patrick Molgaard, <draaglom@gmail.com> wrote:
This sounds extremely plausible -- thanks for the tip, Andreas.

Best,
Patrick


On Fri, 7 Sep 2018, 19:20 Andreas Kretschmer, <andreas@a-kretschmer.de> wrote:

>
>Intermittently (one or two times a week), all queries on that host are
>simultaneously blocked for extended periods (10s of seconds).
>
>The blocked queries are trivial & not related to locking - I'm seeing
>slowlogs of the form:
>


please check if THP are enabled.


Regards, Andreas



--
2ndQuadrant - The PostgreSQL Support Company

Re: Multi-second pauses blocking even trivial activity

From
Andreas Kretschmer
Date:

Am 21.09.2018 um 21:07 schrieb Patrick Molgaard:
> Andreas -- just following up to say that this was indeed the root 
> cause. Thanks again.
>

glad i could help you.


Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com