Mark <mwchambers@gmail.com> writes:
> Am I correct in my understanding that any row that has been modified (i.e.
> UPDATE) is in state HEAPTUPLE_INSERT_IN_PROGRESS so it will not be included
> in the sample?

An update will mark the existing tuple as delete-in-progress and then
insert a new tuple (row version) that's insert-in-progress.

A concurrent ANALYZE scan will definitely see the old tuple (modulo
sampling considerations) but it's timing-dependent which state it sees it
in --- it could still be "live" when we see it, or already
delete-in-progress. ANALYZE might or might not see the new tuple at all,
depending on timing and where the new tuple gets placed. So "count/sample
delete-in-progress but not insert-in-progress" seems like a good rule to
minimize the timing sensitivity of the results. It's not completely
bulletproof, but I think it's better than what we're doing now.
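
To illustrate why that rule is less timing-sensitive, here is a toy model
in Python. It is only a sketch and not the actual ANALYZE code: the names
TupleState, counted, and rows_counted are made up for illustration, and it
covers only the window in which the updating transaction is still
uncommitted. It enumerates what a concurrent scan might observe for a
single row being updated and checks that the row is counted once each time.

```python
from enum import Enum, auto


class TupleState(Enum):
    # Hypothetical stand-ins for the tuple states discussed above.
    LIVE = auto()
    INSERT_IN_PROGRESS = auto()
    DELETE_IN_PROGRESS = auto()


def counted(state):
    """Proposed rule: count live and delete-in-progress tuples,
    but skip insert-in-progress ones."""
    return state in (TupleState.LIVE, TupleState.DELETE_IN_PROGRESS)


def rows_counted(observed):
    """Number of tuple versions the scan would count for one logical row."""
    return sum(1 for state in observed if counted(state))


# What a concurrent scan might see for one row under an uncommitted UPDATE,
# depending on timing and on where the new tuple gets placed.
scenarios = {
    "old still live, new not seen": [TupleState.LIVE],
    "old still live, new seen": [
        TupleState.LIVE,
        TupleState.INSERT_IN_PROGRESS,
    ],
    "old delete-in-progress, new not seen": [TupleState.DELETE_IN_PROGRESS],
    "old delete-in-progress, new seen": [
        TupleState.DELETE_IN_PROGRESS,
        TupleState.INSERT_IN_PROGRESS,
    ],
}

for name, observed in scenarios.items():
    print(f"{name}: counted {rows_counted(observed)} time(s)")
```

In this model the row is counted exactly once in all four cases, which is
the point of the rule. It says nothing about cases the model leaves out,
such as the updating transaction committing or aborting partway through
the scan.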
> I'm going to rework the application so there is less time between the
> DELETE and the COMMIT so I will only see the problem if ANALYZE runs during
> this smaller time window.

Yeah, that's about the best you can do from the application side.

			regards, tom lane