Re: [PATCHES] Including Snapshot Info with Indexes - Mailing list pgsql-hackers
From | Hannu Krosing |
---|---|
Subject | Re: [PATCHES] Including Snapshot Info with Indexes |
Date | |
Msg-id | 1193145947.17735.33.camel@hannu-laptop Whole thread Raw |
In response to | Re: [PATCHES] Including Snapshot Info with Indexes ("Gokulakannan Somasundaram" <gokul007@gmail.com>) |
Responses |
Re: [PATCHES] Including Snapshot Info with Indexes
|
List | pgsql-hackers |
Ühel kenal päeval, T, 2007-10-23 kell 18:36, kirjutas Gokulakannan Somasundaram: > > There are several advantages to keeping a separate visibility > heap: > > 1) it is usually higly compressible, at least you can throw > away > cmin/cmax quite soon, usually also FREEZE and RLE encode the > rest. > > 2) faster access, more tightly packed data pages. > > 3) index-only scans > > 4) superfast VACUUM FREEZE > > 5) makes VACUUM faster even for worst cases (interleaving live > and dead > tuples) > > 6) any index scan will be faster due to fetching only visible > rows from > main heap. > > if you have to store the visibility fields of all the tuples of each > table, then you may not be able to accomodate in the cache. Say if a > table is of 1 million rows, we would need 22 MB of visibility > space(since visibility info takes 16 bytes. I think if we have to link > it with say tuple-id(6 Bytes). You can keep the visibility info small, by first dropping cmin/cmax and then FREEZ'ing the tuples (setting xmin to special value), after that you can replace a lot of visibility info tuples with single RLE encoded tuple, which simply states, that tuples N:A to M:B are visible. If that 1 million row table is mostly static, the static parts will soon have (al lot) less than 1 bit in visibility heap. For example, after vacuum there will be just one visibility info which say that whole table is visible. I envision HOT-like on-the-fly VACUUM FREEZE manipulations of visibility info so it won't grow very big at all. > I think we may need to link it with indexes with one more id. i am not > counting that now). why ? we will keep visibility info for ctids (PAGE:NR) and if we need to see, if any ctid pointe from index points to a visible tuple we check it based on that ctid. > If we have 10 tables, then we will have 220 MB. Keeping them pinned > in memory may not be advisable in some circumstances. no no! no pinning, the "mostly in cache" will happen automatically (and I mean mostly in processors _internal_ L1 or L2 cache, not just in RAM) > If it is not going to be in memory, then that is no different from > referring a table. But i accept that is a concept worth trying out. I > think the advantage with thick indexes comes with the fact, that it is > optional. If we can make this also as optional, that would be better. > But if we are going to suggest it as a replacement of DSM, then it > loses the advantage of being small. I agree that a single-purpose DSM can be made smaller than multi-purpose visibility heap. ------------------ Hannu
pgsql-hackers by date: