I think my previous message wasn't clear enough. I do NOT think that the LVM snapshot is the culprit.
However, I cannot rule it out as one of the possibilities. But I have no evidence in either /var/log/messages or dmesg that the LVM snapshot went into a bad state, AND we have been using this method for a long time.
The only thing that is new is that we took the snapshot from the streaming replica. So again, my best guess as of now is that if the database crashes while it is in streaming standby, an invalid disk state can result during the following startup (in rare and as yet unclear circumstances).
You seem to be quite convinced that it must be LVM. Can you elaborate on why?
That's one possible explanation. It's worth noting that we hadn't seen this before moving to streaming replication, and we have been using that method for a long time.