Thread: AW: pg_index.indislossy
> > > Can someone tell me what we use indislossy for? Ok, so the interpretation of this field is:A match in the index needs to be reevaluated in the heap tuple data,since a matchin the index does not necessarily mean, that the heap tuplematches.If the heap tuple data matches, the index must alwaysmatch. A very typical example for such an index is a hash index. This might explain the fact, that the ODBC driver misinterpreted that field as meaning that the index is a hash. The field has nothing to do with partial index. Andreas
Added to pg_index.h file as a comment. > > > > > Can someone tell me what we use indislossy for? > > Ok, so the interpretation of this field is: > A match in the index needs to be reevaluated in the heap tuple data, > since a match in the index does not necessarily mean, that the heap tuple > matches. > If the heap tuple data matches, the index must always match. > > A very typical example for such an index is a hash index. This might explain the > fact, that the ODBC driver misinterpreted that field as meaning that the index is a hash. > The field has nothing to do with partial index. > > Andreas > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian writes: > > > > > Can someone tell me what we use indislossy for? > > > > Ok, so the interpretation of this field is: > > A match in the index needs to be reevaluated in the heap tuple data, > > since a match in the index does not necessarily mean, that the heap tuple > > matches. > > If the heap tuple data matches, the index must always match. AFAIK, this is true for all indexes in PostgreSQL, because index rows don't store the transactions status. Of course those are two different underlying reasons why a heap lookup is always necessary, but there shouldn't be any functional difference in the current implementation. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
> Bruce Momjian writes: > > > > > > > Can someone tell me what we use indislossy for? > > > > > > Ok, so the interpretation of this field is: > > > A match in the index needs to be reevaluated in the heap tuple data, > > > since a match in the index does not necessarily mean, that the heap tuple > > > matches. > > > If the heap tuple data matches, the index must always match. > > AFAIK, this is true for all indexes in PostgreSQL, because index rows > don't store the transactions status. Of course those are two different > underlying reasons why a heap lookup is always necessary, but there > shouldn't be any functional difference in the current implementation. Seems it is something they added for the index abstraction and not for practical use by PostgreSQL. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Peter Eisentraut <peter_e@gmx.net> writes: > Bruce Momjian writes: > A match in the index needs to be reevaluated in the heap tuple data, > since a match in the index does not necessarily mean, that the heap tuple > matches. > AFAIK, this is true for all indexes in PostgreSQL, because index rows > don't store the transactions status. Not true at all. The tuple commit status needs to be rechecked, yes, but with a normal index it is not necessary to recheck whether the index key field actually satisfies the index qual conditions. With a lossy index it *is* necessary to recheck --- the index may return more tuples than the ones that match the given qual. For example, an r-tree index applied to a "distance from point X <= D" query might return all the tuples lying within a bounding box of the circle actually wanted. The LIKE index optimization can also be thought of as using an index as a lossy index: the index scan gives you all the tuples you want, plus some you don't, so you have to evaluate the LIKE operator over again at each returned tuple. Basically, what this is good for is using an index for more kinds of WHERE conditions than it could otherwise support. It is *not* a useless abstraction. It occurs to me though that marking the index itself as lossy is the wrong way to think about it --- the right way is to associate the "lossy" flag with use of a particular operator with an index. So maybe the flag should be in pg_amop or pg_amproc, instead. Someday I'd also like to see those tables extended so that the LIKE index optimization is described by the tables, rather than being hard-wired into the planner as it is now. regards, tom lane
Tom Lane writes: > Not true at all. The tuple commit status needs to be rechecked, yes, > but with a normal index it is not necessary to recheck whether the index > key field actually satisfies the index qual conditions. With a lossy > index it *is* necessary to recheck --- the index may return more tuples > than the ones that match the given qual. Okay, this is not surprising. I agree that storing this in the index might be suboptimal. But why is this called lossy? Shouldn't it be called "exceedy"? -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
> Peter Eisentraut <peter_e@gmx.net> writes: > > But why is this called lossy? Shouldn't it be called "exceedy"? > > Good point ;-). "lossy" does sound like the index might "lose" tuples, > which is exactly what it's not allowed to do; it must find all the > tuples that match the query. > > The terminology is correct by analogy to "lossy compression" --- the > index loses information, in the sense that its result isn't quite the > result you wanted. But I can see where it'd confuse the unwary. > Perhaps we should consult the literature and see if there is another > term for this concept. Seeing how our ODBC driver refrences it in previous releases, we are going to have trouble changing it. I always thought it was "lossy" in terms of compression too. I don't see it mentioned now in ODBC, but I think it used to be there. I changed it recently to check for word "hash" instead. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Peter Eisentraut <peter_e@gmx.net> writes: > But why is this called lossy? Shouldn't it be called "exceedy"? Good point ;-). "lossy" does sound like the index might "lose" tuples, which is exactly what it's not allowed to do; it must find all the tuples that match the query. The terminology is correct by analogy to "lossy compression" --- the index loses information, in the sense that its result isn't quite the result you wanted. But I can see where it'd confuse the unwary. Perhaps we should consult the literature and see if there is another term for this concept. regards, tom lane
On Tue, Jul 10, 2001 at 01:36:33PM -0400, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > But why is this called lossy? Shouldn't it be called "exceedy"? > > Good point ;-). "lossy" does sound like the index might "lose" tuples, > which is exactly what it's not allowed to do; it must find all the > tuples that match the query. > > The terminology is correct by analogy to "lossy compression" --- the > index loses information, in the sense that its result isn't quite the > result you wanted. But I can see where it'd confuse the unwary. > Perhaps we should consult the literature and see if there is another > term for this concept. How about "hinty"? :-) Seriously, "indislossy" is a singularly poor name for a predicate. Also, are we so poor that we can't afford whole words, or even word breaks? I propose "index_is_hint". Actually, is the "ind[ex]" part even necessary? How about "must_check_heap"? Nathan Myers ncm@zembu.com
ncm@zembu.com (Nathan Myers) writes: > Seriously, "indislossy" is a singularly poor name for a predicate. Perhaps, but it fits with the existing naming conventions for Postgres catalog columns. Unless we want to indulge in wholesale renaming of the system's catalog columns (and break an awful lot of applications) I'd resist any name for a pg_index column that's not of the form "indFOO". regards, tom lane