Re: int8 primary keys still not using index without manual - Mailing list pgsql-general
From | Bruce Momjian |
---|---|
Subject | Re: int8 primary keys still not using index without manual |
Date | |
Msg-id | 200311071827.hA7IR5503637@candle.pha.pa.us Whole thread Raw |
In response to | Re: int8 primary keys still not using index without manual (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: int8 primary keys still not using index without manual
|
List | pgsql-general |
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > We can try removing most int2 functions and see if that makes such > > conversions much easier, or try casting incoming constants to int2 to > > see what happens. > > I already did that --- that was exactly the substance of the tests I was > reporting in that message. > > I have been thinking lately that the fundamental approach is wrong > anyway. Basically the idea was to make the world safe for > single-datatype index handling by removing all the cross-type comparison > operators. The reason > > SELECT ... WHERE int8col = 42 > > isn't indexable is that the = operator is int84eq, which is not to be > found in the set of operators associated with an index on int8. What > we were thinking was that if we didn't have int84eq then the parser > would be forced to promote the 42 to int8, and then the comparison using > int8eq would be recognized as indexable. Could we not always promote int4 to int8 for indexing purposes? I realize OID has issues, though, as you mention in that URL: This algorithm was wrong on both practical and theoretical levels; in the first place it's not very helpful to only be able to handle binary-compatible transformations, and in the second place there isn't any good guarantee that it's not changing the semantics when it replaces the operator. For instance int4 < and oid < do not act the same. Depending on equality of operator names was a bad idea even then, and would be quite unworkable now in the world of schema search paths. Could we just promote int4 constants to int8 always? I just checked and 2^32-1 is already promoted to int8: select 4294967295; so the funny thing is that: SELECT ... WHERE int8col = 4294967296; already uses the int8 index. I think the complex case you mentioned was oid. Let's look at the various possible constant comparisons against an oid column: -1 This constant would match no oid, so we could just allow the sequential scan. I don't think anyone would expect good behavior for such a comparison. 1 This could be promoted to oid cleanly. 2^31+1 This will come in a int8, so we can just downcast to oid automatically. I know your case was "<" comparison. It would be: SELECT ... WHERE oidcol < -1; SELECT ... WHERE oidcol < 1; SELECT ... WHERE oidcol < 2147483649; 2^31+1 These all seem to work, I think. -1 will not use an index, which is OK. I am concerned about having to add catalog maintenance for every index case, which seems it could be a lot. Here is my logic. I am having trouble getting the big picture on this: int2 fits in the int4 range int4 fits in the int8 range oid fits only in the int8 range, some oids fit in int4 This means a valid oid could come in as int4 or int8. I realize this requires hard-coded comparisons to C include defines to get the maxium for each type. I know this breaks our type-neutral style, but in this case, it seems it might be the cleanest way --- abstracting this out into a table seems too hard. Now for int2-based indexes. Can't we just downcast constants to int2 if they fit in the int2 valid range? > I think this might actually be workable for int8, but it's not going to > work for int2 without changing the initial typing of small integer > constants, and we already know that that opens a Pandora's box of other > problems. > > But quite aside from the semantic difficulties of rejiggering all that > stuff, it's going to break other parts of the optimizer if we do it. > In particular it will interfere with handling of mergejoins and > recognizing transitive equality. For example consider > > SELECT ... WHERE a.int8col = b.int4col AND b.int4col = 42; > > Currently we are able to deduce a.int8col = 42 (where the operator > is int84eq). If we remove int84eq then the output of the parser for > this example will look like > > SELECT ... WHERE a.int8col = b.int4col::int8 AND b.int4col = 42; > > and the transitive equality will not be recognized because > b.int4col::int8 is not the same expression as b.int4col. Seems we should keep those cross-type comparisons around for col op col comparisons, at least, as well as internal optimizer use as you described. > So I'm currently thinking we'd be better off not to try to eliminate > the cross-type comparison operators. Instead we need some solution > that is narrowly focused on the problem of making a non-indexable > comparison indexable, by converting a comparison value of the wrong > datatype into the right datatype locally to the indexscan plan > generation code. I posted some speculation about that here: > http://archives.postgresql.org/pgsql-hackers/2003-09/msg00983.php Agreed. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-general by date: