RE: error "can only drop stats once" brings down database - Mailing list pgsql-bugs

From Floris Van Nee
Subject RE: error "can only drop stats once" brings down database
Date
Msg-id 47ca6fcc687e408287871f7eaaaf3c3b@Optiver.com
Whole thread Raw
In response to Re: error "can only drop stats once" brings down database  (Andres Freund <andres@anarazel.de>)
Responses Re: error "can only drop stats once" brings down database
List pgsql-bugs
> I suspect that the "high value" oids are related to the issue. I bet you're
> running into OID wraparound for objects at a much higher frequency than
> most users.  I suspect the reason this causes problems is the problem I just
> found and described here:

I definitely suspect oid wraparound has something to do with it. However, given what I found on the other thread (
https://www.postgresql.org/message-id/flat/17947-b9554521ad963c9c%40postgresql.org ), I don't think the two have the
exactsame root cause. The other thread seems a clear case of 'forgot to call gc' when it should. 

>
> The fact that we're not triggering "stats object garbage collection" for most
> drops also explains why you're much more likely to see this on a standby
> than on a primary. An a primary it's going to be rare to have a single backend
> live long enough to observe an oid wraparound leading to one backend
> accessing stats for the same object type with the same oid after that object
> previously having been dropped. But most stats accesses on a standby are
> going to be by the same process, the startup process.  You'd still need some
> other accesses to prevent the object from being dropped "immediately", but
> that could be due to shorter lived processes.
>

As mentioned on other thread, I do think garbage collection gets called as part of pgstat_execute_transactional_drops
inrecovery. 

There's still the case of the "skip gc of entry if pending" though, however I struggle to understand how this can lead
tosuch an error in the recovery process after wraparound. I'd expect the gc to happen relatively frequently (at least
muchmore frequently than a wraparound) as tables get created/dropped quite frequently. Next to that, because the
recoveryprocess is the only one creating/dropping relations on standby, it should be impossible to get in a state where
therecovery "local" stats cache has a non-gc'd entry that should be dropped? Because when dropping, at least it removes
thelocal cache entry (even if it cannot remove the shared one). So later, when creating it again in
pgstat_get_entry_ref,it should always hit the pgstat_reinit_entry path? 

Would it make sense to at least commit your patch to enhance the error message a bit?


-Floris




pgsql-bugs by date:

Previous
From: Sjors Gielen
Date:
Subject: Re: BUG #18365: Inconsistent cost function between materialized and non-materialized CTE
Next
From: Tom Lane
Date:
Subject: Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error