Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > Other than that case is there any other case the reindex could deadlock with?
>
> Only SELECT, INSERT, UPDATE, and DELETE.
How does that happen? What exclusive locks do these take that reindex would
conflict with? I guess I'm missing some basic piece of the puzzle here.
I've also started wondering if the other approach proposed doesn't have more
merit than it was granted, namely the plan to create an "incomplete index"
that other transactions update appropriately as they make changes and then
insert the records one by one.
The main objection given was that the newly created index wouldn't be as
cleanly structured as a real reindex. But it seems like that's something that
can be overcome. There's plenty of research on how to build trees that
withstand sequential inserts for example. In the worst case Postgres could
sort and build the entire index temporarily to decide where the page splits
ought to be and then arrange to insert the new records in the real index such
that the page splits end up in those places.
The unstated objection that I think is more severe here is that performance
would suck. Repeatedly inserting individual records would be much slower than
a real reindex. And worse, it would slow down regular operation since every
update and insert would need to content with this batch job to get their
chance to insert their entries.
But I'm not sure performance sucking is really that big of a problem here.
There's always the option of an offline REINDEX if you can withstand the
downtime. To do an online reindex you'll just need a faster system. Nobody
ever said 24x7 operation wouldn't have costs.
--
greg