Re: recently and selectively slow, but very simple, update query.... - Mailing list pgsql-performance

From David G Johnston
Subject Re: recently and selectively slow, but very simple, update query....
Date
Msg-id 1399334079374-5802579.post@n5.nabble.com
Whole thread Raw
In response to Re: Re: recently and selectively slow, but very simple, update query....  (Stelios Mavromichalis <mstelios@cytech.gr>)
Responses Re: Re: recently and selectively slow, but very simple, update query....  (Stelios Mavromichalis <mstelios@cytech.gr>)
List pgsql-performance
Stelios Mavromichalis wrote
> as a prior step to dump/restore i am thinking of deleting and re-inserting
> that particular row. that might share some light you think?

I still dislike the randomness of the unresponsiveness...

Every time you perform an update you "delete and insert" that row - that is
how an update works in MVCC - so doing so explicitly is unlikely to provide
any benefit.  Since that row is continually being inserted, and no other
rows are having this issue, I'm seriously doubting that a dump/restore is
going to have any effect either.  Note that the index scan took twice as
long in the bad case - but still reasonable and you didn't notice any
benefit from a REINDEX.  This is what I would expect.

The only other difference, if concurrency has been ruled out, is the 4 vs 18
buffers that had to be read.  I cannot imagine that, since all 22 were in
cache, that simply reading that much more data would account for the
difference (we're talking a 10,000-fold increase, not 2to4-fold).  The
reason for this particular difference, IIUC, is how may candidate tuples are
present whose visibility has to be accounted for (assuming 1 buffer per
tuple, you needed to scan 4 vs 18 for visibility in the two queries).

Is there any log file information you can share?  Especially if you can set
log_min_statement_duration (or whatever that GUC is named) so that whenever
one of these gets adversely delayed it appears in the log along with
whatever other system messages are being sent.  Checkpoints are a typical
culprit though that should be affecting a great deal more than what you
indicate you are seeing.

I'm pretty certain you are seeing this here largely because of the frequency
of activity on this particular user; not because the data itself is
corrupted.  It could be some kind of symptom of internal concurrency that
you just haven't observed yet but it could also be I/O or other system
contention that you also haven't properly instrumented.  Unfortunately that
is beyond my current help-providing skill-set.

A dump-restore likely would not make anything worse though I'd be surprised
if it were to improve matters.  It also doesn't seem like hardware - unless
the RAM is bad.  Software bugs are unlikely if this had been working well
before 5 days ago.   So, you need to observe the system during both periods
(good and bad) and observe something that is different - probably not within
PostgreSQL if indeed you've minimized concurrency.  And also see if you can
see if any other queries, executed during both these times, exhibit a
performance decrease.  Logging all statements would help matters greatly if
you can afford it in your production environment - it would make looking for
internal concurrency much easier.

David J.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Re-recently-and-selectively-slow-but-very-simple-update-query-tp5802553p5802579.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


pgsql-performance by date:

Previous
From: Stelios Mavromichalis
Date:
Subject: Re: Re: recently and selectively slow, but very simple, update query....
Next
From: Johann Spies
Date:
Subject: Specifications for a new server