Thread: Number of index fields configurable
I have moved INDEX_MAX_KEYS to postgres.h, and have removed the hard-coded limits that it is 8 fields. I hope I got all of them. The default is still 8. There were only a few places left that had the 8 hard-coded. I haven't tested non-8 values but they should work. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> -----Original Message----- > From: owner-pgsql-hackers@postgreSQL.org > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian > > I have moved INDEX_MAX_KEYS to postgres.h, and have removed the > hard-coded limits that it is 8 fields. I hope I got all of them. The > default is still 8. > > There were only a few places left that had the 8 hard-coded. > > I haven't tested non-8 values but they should work. > Shouldn't the following catalog be changed ? CATALOG(pg_index) { ....int28 indkey;^^^^^oid8 indclass;^^^^^ Regards. Hiroshi Inoue Inoue@tpf.co.jp
[Charset iso-8859-1 unsupported, filtering to ASCII...] > > -----Original Message----- > > From: owner-pgsql-hackers@postgreSQL.org > > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian > > > > I have moved INDEX_MAX_KEYS to postgres.h, and have removed the > > hard-coded limits that it is 8 fields. I hope I got all of them. The > > default is still 8. > > > > There were only a few places left that had the 8 hard-coded. > > > > I haven't tested non-8 values but they should work. > > > > Shouldn't the following catalog be changed ? > > CATALOG(pg_index) > { > .... > int28 indkey; > ^^^^^ > oid8 indclass; > ^^^^^ The underlying definitions of the types are now based in the #define parameter. Not sure if this is going to work so I have not change the actual type names yet. I have a few more changes to commit now. Also, what should the new names be? Can't call it int16. Does anyone outside the source tree rely on those type names? -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> Shouldn't the following catalog be changed ? >> >> CATALOG(pg_index) >> { >> .... >> int28 indkey; >> ^^^^^ >> oid8 indclass; >> ^^^^^ > The underlying definitions of the types are now based in the #define > parameter. Not sure if this is going to work so I have not change the > actual type names yet. I have a few more changes to commit now. If we think the parameter works, then we should test it by changing the value ;-) > Also, what should the new names be? Can't call it int16. I like oidN and int2N, or oidn and int2n if you object to uppercase names. > Does anyone outside the source tree rely on those type names? I was worried about that at first --- we couldn't change the names if it would break pg_dump files. But the system catalogs themselves don't get dumped as such, so it shouldn't be a problem. There might be a few folks out there who are using oid8 or int28 as column types in user tables, but surely not many. What it comes down to is that a few people might have to tweak their code or dump files, but not very many compared to the number of people who will be glad of the improvement. But if these types are to have parameterizable sizes, I think it's critical that oidNin() and int2Nin() be robust about the number of input values they see. I suggest that they ought to work like this: * if the number of supplied values is less than the currently configured value of N, silently fill in zeroes for the extra places. * if the number of supplied values is more than N, check the extra values to see if any are not 0. Complain if any are not 0, but if they are all 0 then silently accept it. This will allow interoperability of pg_dump files across different values of N, and raise an error only if there's really a problem. You have the first behavior but not the second... regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> Shouldn't the following catalog be changed ? > >> > >> CATALOG(pg_index) > >> { > >> .... > >> int28 indkey; > >> ^^^^^ > >> oid8 indclass; > >> ^^^^^ > > > The underlying definitions of the types are now based in the #define > > parameter. Not sure if this is going to work so I have not change the > > actual type names yet. I have a few more changes to commit now. > > If we think the parameter works, then we should test it by changing > the value ;-) > > > Also, what should the new names be? Can't call it int16. > > I like oidN and int2N, or oidn and int2n if you object to uppercase > names. How about oidvector and int2vector. A vector is a 1-dimmensional array. Calling it an array is too confusing. > > > Does anyone outside the source tree rely on those type names? > > I was worried about that at first --- we couldn't change the names > if it would break pg_dump files. But the system catalogs themselves > don't get dumped as such, so it shouldn't be a problem. There might > be a few folks out there who are using oid8 or int28 as column types > in user tables, but surely not many. What it comes down to is that > a few people might have to tweak their code or dump files, but not > very many compared to the number of people who will be glad of the > improvement. > > But if these types are to have parameterizable sizes, I think it's > critical that oidNin() and int2Nin() be robust about the number of > input values they see. I suggest that they ought to work like this: > > * if the number of supplied values is less than the currently configured > value of N, silently fill in zeroes for the extra places. > > * if the number of supplied values is more than N, check the extra > values to see if any are not 0. Complain if any are not 0, but > if they are all 0 then silently accept it. > > This will allow interoperability of pg_dump files across different > values of N, and raise an error only if there's really a problem. I will tweek the code to properly check for trailing numbers. Right now multiple spaces cause problems, and trailing numbers are ignored. With oidn, we can get away with trailing zeros because an oid of 0 is invalid, but with int2n, a zero is valid, so I think we can't just ignore extra trailing zeros. We can pad with zeros, however. Comments? -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I will tweek the code to properly check for trailing numbers. Right now > multiple spaces cause problems, and trailing numbers are ignored. With > oidn, we can get away with trailing zeros because an oid of 0 is > invalid, but with int2n, a zero is valid, so I think we can't just ignore > extra trailing zeros. We can pad with zeros, however. Comments? For the primary use of these things, which is attribute numbers in pg_index, padding or dropping zeroes is correct behavior --- unused positions in the vector will have zero values, same as for the oid vector. I think it's OK to define the type's behavior suitably for the system's use, because it's not intended as a general-purpose user type; users oughta be using int2[]. (Really, the only reason we have these types at all is that we depend on having compile-time-constant field sizes in the system catalogs that are accessed via include/catalog/'s struct declarations...) regards, tom lane
> But if these types are to have parameterizable sizes, I think it's > critical that oidNin() and int2Nin() be robust about the number of > input values they see. I suggest that they ought to work like this: > > * if the number of supplied values is less than the currently configured > value of N, silently fill in zeroes for the extra places. > > * if the number of supplied values is more than N, check the extra > values to see if any are not 0. Complain if any are not 0, but > if they are all 0 then silently accept it. > > This will allow interoperability of pg_dump files across different > values of N, and raise an error only if there's really a problem. OK, different solution. I decided there is no need to be dumping out zeros to pad the type. New code does the following. This looks very clean to me: test=> create table x (y int28);CREATEtest=> insert into x values ('1 2 3');INSERT 18697 1test=> select * from x; y -------1 2 3(1 row)test=> insert into x values ('1 2 3 4 5 6 7 8');INSERT 18699 1test=> select * from x; y ----------------- 1 2 3 1 2 3 4 5 6 7 8(3 rows)test=> insert into x values ('1 2 3 4 5 6 7 8 9');ERROR: int28 value hastoo many valuestest=> insert into x values ('1 2 3 4 5 6 7 8 0');ERROR: int28 value has too many values Notice the trailing zero is treated as an error. Because we trim trailing zeros, we can affort do handle things this way. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > OK, different solution. I decided there is no need to be dumping out > zeros to pad the type. Oh, that's a thought. You haven't really gained anything in generality, since the code is still treating zero as a special case; but I agree it looks nicer (and is easier to check for too many values). Only worry I have is whether it will interoperate comfortably with the old code. Let's see: * old dump to new: no problem, unless you've reduced MAX_INDEX_KEYS below 8 (doesn't seem likely). * new to old: fails for every case except where there's exactly 8 non zero entries. The latter is a bit bothersome, but may not be a big deal --- in reality we don't dump and reload pg_index this way. BTW, be sure you are only suppressing *trailing* zeroes not *embedded* zeroes. I know that oid8 has to deal with embedded zeroes (some of the pg_proc entries look like that); int28 might not, but the code should probably act the same for both. regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > > OK, different solution. I decided there is no need to be dumping out > > zeros to pad the type. > > Oh, that's a thought. You haven't really gained anything in generality, > since the code is still treating zero as a special case; but I agree it > looks nicer (and is easier to check for too many values). > > Only worry I have is whether it will interoperate comfortably with the > old code. Let's see: > > * old dump to new: no problem, unless you've reduced MAX_INDEX_KEYS > below 8 (doesn't seem likely). > > * new to old: fails for every case except where there's exactly 8 > non zero entries. Not sure about this. Old code did sscanf on 8 entries, but if it returned fewer, it padded with zeros, so new->old should work. > > The latter is a bit bothersome, but may not be a big deal --- in reality > we don't dump and reload pg_index this way. > > BTW, be sure you are only suppressing *trailing* zeroes not *embedded* > zeroes. I know that oid8 has to deal with embedded zeroes (some of > the pg_proc entries look like that); int28 might not, but the code > should probably act the same for both. Yes, only trailing. New code walks from end to beginning until it finds a non-zero. If the entry is all zeros, you get a zero-length string output. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> * new to old: fails for every case except where there's exactly 8 >> non zero entries. > Not sure about this. Old code did sscanf on 8 entries, but if it > returned fewer, it padded with zeros, so new->old should work. Oh, OK. Nevermind then... regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I will tweek the code to properly check for trailing numbers. Right now > > multiple spaces cause problems, and trailing numbers are ignored. With > > oidn, we can get away with trailing zeros because an oid of 0 is > > invalid, but with int2n, a zero is valid, so I think we can't just ignore > > extra trailing zeros. We can pad with zeros, however. Comments? > > For the primary use of these things, which is attribute numbers in > pg_index, padding or dropping zeroes is correct behavior --- unused > positions in the vector will have zero values, same as for the oid > vector. I think it's OK to define the type's behavior suitably for > the system's use, because it's not intended as a general-purpose user > type; users oughta be using int2[]. (Really, the only reason we have > these types at all is that we depend on having compile-time-constant > field sizes in the system catalogs that are accessed via > include/catalog/'s struct declarations...) Renamed oid8 ->oidvector and int28->int2vector. initdb everyone. New type names require it. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026