Jerry Sievers <gsievers19@comcast.net> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Oh, hmm, yeah it could be ye olde get_actual_variable_range() issue.
>> When this happens, are there perhaps a lot of recently-dead rows at either
>> extreme of the range of table1.source_id or table2.id?
> We noticed the cluster of interest had a rogue physical replication slot
> holding 71k WAL segments.
> Dropping that slot seemed to correlate with the problem going away.
> Does that sound like a plausible explanation for the observed slow
> planning times?
I believe the slot would hold back global xmin and thereby prevent
"recently-dead" rows from becoming just plain "dead", so yeah, this
observation does seem to square with the get_actual_variable_range
theory. You'd still need to posit that something had recently deleted
a lot of rows at the end of the range of one of those columns, though.
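
To illustrate the mechanism, here is a toy model in Python. It is only a sketch and not PostgreSQL's actual code: the names classify() and xmin_horizon() are made up for this example, XIDs are treated as plain integers with no wraparound, every deleting transaction is assumed to have committed, and the slot is assumed to advertise an xmin. The idea it shows is that a deleted row only counts as plain "dead" once its deleting XID is older than the oldest xmin anyone still holds.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Row:
    # XID of the (assumed committed) deleting transaction, or None if never deleted.
    xmax: Optional[int]


def xmin_horizon(next_xid: int,
                 snapshot_xmins: Iterable[int] = (),
                 slot_xmins: Iterable[int] = ()) -> int:
    """Oldest XID that some session or slot might still need to see.

    Simplification: plain integer comparison, no XID wraparound handling.
    """
    return min([next_xid, *snapshot_xmins, *slot_xmins])


def classify(row: Row, horizon: int) -> str:
    """Classify a row the way the theory above needs, in simplified form."""
    if row.xmax is None:
        return "live"
    if row.xmax < horizon:
        return "dead"            # nobody can see it any more
    return "recently-dead"       # deleted, but possibly still visible to someone


# Hypothetical numbers: rows deleted by XIDs 900..904, next XID is 1000.
rows = [Row(xmax=x) for x in range(900, 905)]

# A stale slot pinning xmin at 500 keeps every deleted row recently-dead.
with_slot = [classify(r, xmin_horizon(1000, slot_xmins=[500])) for r in rows]

# Once the slot is dropped, the horizon advances and the same rows are dead.
without_slot = [classify(r, xmin_horizon(1000)) for r in rows]
```

In this model, dropping the slot is what lets the horizon move past the deleting XIDs. It says nothing about how many such rows sit at the end of an index, which is the part that would still have to be true for planning to get slow.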
regards, tom lane