Thread: debugging intermittent slow updates under higher load
Hi All,

This is on postgres 9.4.16, same table as the last question I asked; here's an abbreviated description:

# \d alerts_alert
       Table "public.alerts_alert"
     Column      |           Type           | Modifiers
-----------------+--------------------------+-----------
 tags            | jsonb                    | not null
 id              | character varying(86)    | not null
 ...
Indexes:
    "alerts_alert_pkey" PRIMARY KEY, btree (id)

The table has around 1.5M rows, which have been updated/inserted around 121M times. The distribution of updates per row in alerts_alert is quite uneven, ranging from 1 insert with no updates up to 1 insert followed by 0.5M updates.

Under high load (200-300 inserts/updates per second) we see occasional (~10 per hour) updates taking excessively long (2-10s). These updates are always of the form:

UPDATE "alerts_alert" SET ...bunch of fields... WHERE
"alerts_alert"."id" = '...sha1 hash...';

Here's a sample explain:

https://explain.depesz.com/s/Fjq8

What could be causing this? What could we do to debug it? What config changes could we make to alleviate it?

cheers,

Chris
Hello Chris,

One of the reasons could be that the row is already locked by another
backend, doing the same kind of update or something different.

Are these updates performed in longer transactions?
Can they hit the same row from two clients at the same time?
Is there any other write or select-for-update/share load on the table?
Have you tried periodic logging of non-granted locks?

Try querying pg_stat_activity and pg_locks (possibly joined, and maybe
repeatedly self-joined; google for it) to get the backends that wait for
one another while competing to lock the same row or object.

Best,
Alex
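[As an illustration of the pg_locks self-join suggested above, here is a
minimal sketch that should work on 9.4, where pg_blocking_pids() is not
yet available (it only arrived in 9.6). It pairs each waiting lock with
a granted lock on the same object and pulls in the query text from
pg_stat_activity:]

-- show blocked/blocker pairs; run repeatedly while the problem occurs
SELECT blocked.pid          AS blocked_pid,
       blocked_act.query    AS blocked_query,
       blocking.pid         AS blocking_pid,
       blocking_act.query   AS blocking_query
FROM pg_locks blocked
JOIN pg_locks blocking
  ON blocking.granted
 AND NOT blocked.granted
 AND blocking.pid <> blocked.pid
 AND blocking.locktype = blocked.locktype
 AND blocking.database      IS NOT DISTINCT FROM blocked.database
 AND blocking.relation      IS NOT DISTINCT FROM blocked.relation
 AND blocking.page          IS NOT DISTINCT FROM blocked.page
 AND blocking.tuple         IS NOT DISTINCT FROM blocked.tuple
 AND blocking.virtualxid    IS NOT DISTINCT FROM blocked.virtualxid
 AND blocking.transactionid IS NOT DISTINCT FROM blocked.transactionid
 AND blocking.classid       IS NOT DISTINCT FROM blocked.classid
 AND blocking.objid         IS NOT DISTINCT FROM blocked.objid
 AND blocking.objsubid      IS NOT DISTINCT FROM blocked.objsubid
JOIN pg_stat_activity blocked_act  ON blocked_act.pid  = blocked.pid
JOIN pg_stat_activity blocking_act ON blocking_act.pid = blocking.pid;

[Anything this returns while one of the slow updates is in flight is a
blocked/blocker pair.]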
Also read about HOT updates and the storage parameter named "fillfactor": leaving free space in each block lets updated rows be rewritten within the same block instead of new blocks being allocated, as long as the update doesn't touch any indexed columns.
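[To see whether HOT is already kicking in for this table, a quick sketch
against the statistics views; n_tup_hot_upd is available in
pg_stat_user_tables on 9.4:]

-- fraction of updates that were HOT (index entries not rewritten)
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE relname = 'alerts_alert';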
The fillfactor parameter can be set on a per-table basis.
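[For example; 90 is only an illustrative value, the right figure depends
on row width and update rate:]

ALTER TABLE alerts_alert SET (fillfactor = 90);
-- only affects pages written from now on; existing pages keep their
-- layout until rewritten (e.g. by VACUUM FULL or CLUSTER).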
On 05/12/2018 15:40, Alexey Bashtanov wrote:
> One of the reasons could be that the row is already locked by another
> backend, doing the same kind of update or something different.
> Are these updates performed in longer transactions?

Nope, the transaction will just be updating one row at a time.

> Can they hit the same row from two clients at the same time?

I've looked for evidence of this, but can't find any. Certainly nothing
running for 2-10s; queries against this table are normally a few hundred ms.

> Is there any other write or select-for-update/share load on the table?

Not that I'm aware of. How would I go about getting metrics on problems
like these?

> Have you tried periodic logging of non-granted locks?
> Try querying pg_stat_activity and pg_locks (possibly joined, and maybe
> repeatedly self-joined; google for it) to get the backends that wait
> for one another while competing to lock the same row or object.

Is there any existing tooling that does this? I'm loath to start hacking
something up when I'd hope others have done a better job already...

Chris
On 05/12/2018 15:47, Rene Romero Benavides wrote:
> Also read about HOT updates and the storage parameter named
> "fillfactor": leaving free space in each block lets updated rows be
> rewritten within the same block instead of new blocks being allocated,
> as long as the update doesn't touch any indexed columns.

I have read about these, but I'd prefer not to make opportunistic
guesses here. How can I collect metrics/logging/etc. as evidence to
confirm what the problem actually is?

cheers,

Chris
> Is there any existing tooling that does this?

There must be some; google for queries involving pg_locks.

> I'm loath to start hacking something up when I'd hope others have done
> a better job already...

If you log all queries that take more than a second to complete, is your
update the only one logged, or does something (the would-be blocker) get
logged together with it?
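[In postgresql.conf terms that suggestion is something like the
following; the thresholds are illustrative. log_lock_waits additionally
logs any session that waits longer than deadlock_timeout for a lock,
which would point straight at a blocker if there were one:]

log_min_duration_statement = 1000   # log statements taking >= 1s (in ms)
log_lock_waits = on                 # log waits exceeding deadlock_timeout
deadlock_timeout = 1s               # threshold for the above (the default)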
On 06/12/2018 11:00, Alexey Bashtanov wrote:
>> I'm loath to start hacking something up when I'd hope others have done
>> a better job already...
> If you log all queries that take more than a second to complete, is your
> update the only one logged, or does something (the would-be blocker) get
> logged together with it?

Nope, the only ones logged are these updates.

Chris
Hi
Can you check the latency of your file system? Latency spikes can be caused by an overloaded file system, for example due to a misconfigured file system cache.
Regards
Pavel
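[As an illustration of Pavel's suggestion, one way to measure I/O time
from inside Postgres; a sketch, assuming you can enable track_io_timing,
which adds some timing overhead and only covers I/O that Postgres itself
issues. OS tools such as iostat would show device-level latency:]

-- once, as superuser (ALTER SYSTEM is new in 9.4), then reload:
ALTER SYSTEM SET track_io_timing = on;
SELECT pg_reload_conf();

-- then watch cumulative time spent in data-file reads/writes (in ms):
SELECT datname, blk_read_time, blk_write_time
FROM pg_stat_database
WHERE datname = current_database();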