Tom Lane wrote:
> But note that barring backend crash, once all the scans are done it is
> guaranteed that the hint will be removed --- somebody will be last to
> update the hint, and therefore will remove it when they do heap_endscan,
> even if others are not quite done. This is good in the sense that
> later-starting backends won't be fooled into starting at what is
> guaranteed to be the most pessimal spot, but it's got a downside too,
> which is that there will be windows where seqscans are in process but
> a newly started scan won't see them. Maybe that's a killer objection.

I think the way the patch is now is better than trying to remove the
hints, but I don't feel strongly either way.

However, I don't think we should try hard to mask the issue. It just
means people are more likely to miss it in testing, and run into it in
production. It's better to find out sooner rather than later.

It might be a good idea to preserve the order within a transaction,
though that means more code.

> When exactly is the hint updated? I gathered from something Heikki said
> that it's set after processing X amount of data, but I think it might be
> better to set it *before* processing X amount of data. That is, the
> hint means "I'm going to be scanning at least <threshold> blocks
> starting here", not "I have scanned <threshold> blocks ending here",
> which seems like the interpretation that's being used at the moment.
> What that would mean is that successive "LIMIT 1000" calls would in fact
> all start at the same place, barring interference from other backends.

I don't see how it makes any difference whether you update the hint
before or after processing. Running a LIMIT 1000 query repeatedly will
start from the same place in any case, assuming 1000 tuples fit in the
"report interval", which is currently 128 KB.
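
To illustrate the point, here is a toy model of the behaviour, not the
patch itself. The names hint_block, toy_scan and REPORT_INTERVAL_PAGES
are made up for this sketch, and it assumes 8 KB blocks, so that 128 KB
is 16 pages. A scan that stops before reading a full report interval
never touches the hint, so the next scan starts at the same block.

```c
#include <assert.h>

#define BLCKSZ 8192                                   /* assumed block size */
#define REPORT_INTERVAL_PAGES (128 * 1024 / BLCKSZ)   /* 128 KB => 16 pages */

/* Hypothetical shared hint: the block where a new scan should start. */
static unsigned hint_block = 0;

/*
 * Toy scan: start at the current hint, read npages pages of a table
 * that has nblocks blocks (wrapping around at the end), and publish the
 * current position after every REPORT_INTERVAL_PAGES pages read.
 * Returns the block the scan started at.
 */
static unsigned
toy_scan(unsigned npages, unsigned nblocks)
{
    unsigned start, blk, i;

    assert(nblocks > 0);
    start = hint_block % nblocks;
    blk = start;
    for (i = 1; i <= npages; i++)
    {
        blk = (blk + 1) % nblocks;          /* finished reading one page */
        if (i % REPORT_INTERVAL_PAGES == 0)
            hint_block = blk;               /* report our position */
    }
    return start;
}
```

With the hint at 0, two successive toy_scan(10, 1000) calls both start
at block 0, because neither reads 16 pages. A toy_scan(40, 1000) call
reports twice and leaves the hint at block 32, so only a scan that runs
past the report interval moves the starting point for later scans.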
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com