Re: Weird indices - Mailing list pgsql-general
From | Ian Lance Taylor |
---|---|
Subject | Re: Weird indices |
Date | |
Msg-id | sig0h8ijk5.fsf@daffy.airs.com Whole thread Raw |
In response to | Re[2]: Weird indices (Jean-Christophe Boggio <cat@thefreecat.org>) |
List | pgsql-general |
Joseph Shraibman <jks@selectacast.net> writes: > > Note that this all implies that when walking through the index to find > > heap tuples, you must check the current validity of each heap tuple. > > It is normal for an index tuple to point to a heap tuple which has > > been deleted. > > <snip> > > > > > > I'm talking about indices. The index should be updated to only point at > > > valid rows. > > > > When should the index be updated to only point at valid rows? That is > > only possible when a heap tuple is finally and completely removed. > > But currently that is only known at the time of a VACUUM. > > > You just said above 'It is normal for an index tuple to point to a heap > tuple which has been deleted.' I'm not sure what your point is here. I can see that there is an ambiguity in what I said. I said it is normal for an index tuple to point to a heap tuple which has been deleted. However, although the heap tuple has been deleted in present time, there may still be transactions which do not see that heap tuple as having been deleted. So the index tuple is still meaningful until all those transactions have completed. When all such transactions have completed is the case I meant when I described the heap tuple as finally and completely removed. > > Consider a transaction which sits around for a while and then aborts. > > At the moment that the transaction aborts, Postgres may become able to > > remove some heap tuples and some index tuples, and it might be invalid > > for Postgres to remove those tuples before the transaction aborts. > > > > But the transaction might never have looked at those tuples. So, > > given the Postgres data structures, the only way to keep an index > > fully up to date would be to effectively run a VACUUM over the entire > > database every time a transaction aborts. > > Why? There is a mechanism for keeping track of which heap tuples are > valid, why not index tuples? It is the nature of indices to be updated > on inserts, why not deletes? I would think that the advantage of being > able to use the index in the planner would outweigh the immediate cost > of doing the update. You're right. The mechanism used to preserve multiple versions of heap tuples could be extended to index tuples as well. Based on the heap tuple implementation, this would require adding two transaction ID's and a few flags to each index tuple. That's not insignificant. In a B-tree, right now, I think there are 8 bytes plus the key for each item in the tree. This would require adding another 10 bytes or so. That's a lot. Also, more work would be required for every update. Right now an update requires a B-tree insert for each index. With this change, every update would require an additional B-tree lookup and write for each index. That would require on average a bit less than one additional block write per index. That's a lot. In exchange, certain queries would become faster. Specifically, any query which only needed the information found in an index would become faster. Each such query would save on average a bit less than one additional block read per value found in the index. But since the indices would be less efficient, some of the advantage would be lost due to extra block reads of the index. What you are suggesting seems possible, but it does not seem to be obviously better. If you feel strongly about this, the most reasonable thing would be for you to implement it, and test the results. Since as far as I can see what you are suggesting is not clearly better, it's unlikely that anybody else is going to go to the considerable effort of implement it on your behalf. > > > But if the index isn't used by the planner then the point > > > is moot. > > > > As far as I know the index itself isn't used by the planner. > > > But could be. As I understand it the reason the index isn't used by the > planner is because the index could point at non-visible rows (row = heap > tuple). If the index could be used, many things now that are seq scans > could be converted to faster index scans. Some things could, sure. It's not obvious to me that many things could. The planner can't spend a lot of time looking at an index to decide whether or not to use it. If it's going to do that, it's better off to just decide to use the index in the first place. Index examination is not free. It requires disk reads just like everything else. > > I don't think there is any way to do that today. It would be possible > > to implement something along the lines I suggest above. I have no > > idea if the Postgres maintainers have any plans along these lines. > > > At the end of a transaction, when it sets the bit that this tuple isn't > valid, couldn't it at the same time also remove it if was no longer > visible to any transaction? It wouldn't remove the need for vacuum > because there may be another transaction that prevents it from being > removed right then and there. Yes, this could be done. It wouldn't speed things up, though. In fact, it would slow them down. The only advantage would be that VACUUM would be required less often--an advantage which is not insignificant. I would guess that in the average multi-user database less than half of the tuples could be deleted at that point. It would be easy to instrument Postgres to test this--why don't you try that? Ian
pgsql-general by date: