Re: Auto-vacuum is not running in 9.1.12 - Mailing list pgsql-hackers

From Prakash Itnal
Subject Re: Auto-vacuum is not running in 9.1.12
Date
Msg-id CAHC5u78Bi1n=MjS-3kuUar3WU0bpgS3UrevsH10ibbs=DodH5w@mail.gmail.com
In response to Re: Auto-vacuum is not running in 9.1.12  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Auto-vacuum is not running in 9.1.12  (Prakash Itnal <prakash074@gmail.com>)
List pgsql-hackers
Hi,

As far as I understand, it will probably not open the door to worse situations. Please correct me if my understanding below is incorrect.

The latch wakes up in the following three situations:
a) socket error (=> result is set to a negative number)
b) timeout (=> result is set to TIMEOUT)
c) some event arrived on the socket (=> result is set to a non-zero value if the caller registered for the event that arrived; otherwise no value is set)

Given the above conditions, the result can be zero only if an event the caller did not register for wakes the latch (*). In that case, the current implementation recomputes the remaining sleep time. This calculation makes the situation worse if the clock goes backwards.

The difference between cur_time (the current time) and start_time (the time when the latch wait started) should never be negative, because cur_time is later than start_time under all normal conditions.

    delta_timeout = cur_time - start_time;

The difference can be negative only if the clock shifts into the past, so such a shift can be detected. And if it can be detected, can it also be corrected? I think we can correct it and prevent long sleeps caused by time shifts.

Currently I treat it as TIMEOUT, though conceptually it is not one. The ideal solution would be to leave this decision to the caller of WaitLatch(). With my limited knowledge of the postgres code, I think TIMEOUT would be fine!
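
To make the idea concrete, here is a minimal standalone sketch of the remaining-sleep calculation described above. It is not the actual unix_latch.c code: the names RemainingWait, naive_remaining_ms and recompute_wait are hypothetical and used only for illustration, and all times are assumed to be plain millisecond counts. The first function mirrors the unguarded calculation, the second applies the "treat a backward shift as TIMEOUT" policy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct
{
    int64_t remaining_ms;    /* time still left to sleep */
    bool    clock_went_back; /* true if cur_ms is earlier than start_ms */
    bool    timed_out;       /* true if the wait should end now */
} RemainingWait;

/*
 * Unguarded calculation: if the clock moved backwards, the elapsed time is
 * negative and the remaining sleep grows beyond the original timeout.
 */
static int64_t
naive_remaining_ms(int64_t start_ms, int64_t cur_ms, int64_t timeout_ms)
{
    return timeout_ms - (cur_ms - start_ms);
}

/*
 * Guarded calculation: a negative delta can only come from a backward time
 * shift, so it is detected and (as a policy assumption) reported as a
 * timeout instead of extending the sleep.
 */
static RemainingWait
recompute_wait(int64_t start_ms, int64_t cur_ms, int64_t timeout_ms)
{
    RemainingWait r = {0, false, false};
    int64_t       delta_timeout = cur_ms - start_ms;

    if (delta_timeout < 0)
    {
        r.clock_went_back = true;
        r.timed_out = true;
        return r;
    }

    if (delta_timeout >= timeout_ms)
    {
        r.timed_out = true;
        return r;
    }

    r.remaining_ms = timeout_ms - delta_timeout;
    return r;
}
```

For example, with start_ms = 1000, timeout_ms = 1000 and a clock that jumps back to cur_ms = 400, the unguarded version asks for a 1600 ms sleep, while the guarded version reports a timeout. A caller that prefers a different reaction could look at clock_went_back instead of timed_out.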


(*) The above description holds only for a timed wait. If the latch wait is started as a blocking wait (no timeout), the above logic does not apply.

On Sat, Jun 20, 2015 at 10:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Prakash Itnal <prakash074@gmail.com> writes:
> Sorry for the late response. The current patch only fixes the scenario-1
> listed below. It will not address the scenario-2. Also we need a fix in
> unix_latch.c where the remaining sleep time is evaluated, if latch is woken
> by other events (or result=0). Here too it is possible the latch might go into a
> long sleep if time shifts to past time.

Forcing WL_TIMEOUT if the clock goes backwards seems like quite a bad
idea to me.  That seems like a great way to make a bad situation worse,
ie it induces failures where there were none before.

                        regards, tom lane



--
Cheers,
Prakash
