Thread: Number of index fields configurable

Number of index fields configurable

From
Bruce Momjian
Date:
I have moved INDEX_MAX_KEYS to postgres.h, and have removed the
hard-coded limits that it is 8 fields.  I hope I got all of them.  The
default is still 8.

There were only a few places left that had the 8 hard-coded.

I haven't tested non-8 values but they should work.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: [HACKERS] Number of index fields configurable

From
"Hiroshi Inoue"
Date:
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
> 
> I have moved INDEX_MAX_KEYS to postgres.h, and have removed the
> hard-coded limits that it is 8 fields.  I hope I got all of them.  The
> default is still 8.
> 
> There were only a few places left that had the 8 hard-coded.
> 
> I haven't tested non-8 values but they should work.
>

Shouldn't the following catalog be changed ?

CATALOG(pg_index)
{
....int28        indkey;^^^^^oid8        indclass;^^^^^

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp 


Re: [HACKERS] Number of index fields configurable

From
Bruce Momjian
Date:
[Charset iso-8859-1 unsupported, filtering to ASCII...]
> > -----Original Message-----
> > From: owner-pgsql-hackers@postgreSQL.org
> > [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Bruce Momjian
> > 
> > I have moved INDEX_MAX_KEYS to postgres.h, and have removed the
> > hard-coded limits that it is 8 fields.  I hope I got all of them.  The
> > default is still 8.
> > 
> > There were only a few places left that had the 8 hard-coded.
> > 
> > I haven't tested non-8 values but they should work.
> >
> 
> Shouldn't the following catalog be changed ?
> 
> CATALOG(pg_index)
> {
> ....
>     int28        indkey;
>     ^^^^^
>     oid8        indclass;
>     ^^^^^

The underlying definitions of the types are now based in the #define
parameter.  Not sure if this is going to work so I have not change the
actual type names yet.  I have a few more changes to commit now.

Also, what should the new names be?  Can't call it int16.  Does anyone
outside the source tree rely on those type names?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Number of index fields configurable

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Shouldn't the following catalog be changed ?
>> 
>> CATALOG(pg_index)
>> {
>> ....
>> int28        indkey;
>> ^^^^^
>> oid8        indclass;
>> ^^^^^

> The underlying definitions of the types are now based in the #define
> parameter.  Not sure if this is going to work so I have not change the
> actual type names yet.  I have a few more changes to commit now.

If we think the parameter works, then we should test it by changing
the value ;-)

> Also, what should the new names be?  Can't call it int16.

I like oidN and int2N, or oidn and int2n if you object to uppercase
names.

> Does anyone outside the source tree rely on those type names?

I was worried about that at first --- we couldn't change the names
if it would break pg_dump files.  But the system catalogs themselves
don't get dumped as such, so it shouldn't be a problem.  There might
be a few folks out there who are using oid8 or int28 as column types
in user tables, but surely not many.  What it comes down to is that
a few people might have to tweak their code or dump files, but not
very many compared to the number of people who will be glad of the
improvement.

But if these types are to have parameterizable sizes, I think it's
critical that oidNin() and int2Nin() be robust about the number of
input values they see.  I suggest that they ought to work like this:

* if the number of supplied values is less than the currently configured
value of N, silently fill in zeroes for the extra places.

* if the number of supplied values is more than N, check the extra
values to see if any are not 0.  Complain if any are not 0, but
if they are all 0 then silently accept it.

This will allow interoperability of pg_dump files across different
values of N, and raise an error only if there's really a problem.

You have the first behavior but not the second...
        regards, tom lane


Re: [HACKERS] Number of index fields configurable

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Shouldn't the following catalog be changed ?
> >> 
> >> CATALOG(pg_index)
> >> {
> >> ....
> >> int28        indkey;
> >> ^^^^^
> >> oid8        indclass;
> >> ^^^^^
> 
> > The underlying definitions of the types are now based in the #define
> > parameter.  Not sure if this is going to work so I have not change the
> > actual type names yet.  I have a few more changes to commit now.
> 
> If we think the parameter works, then we should test it by changing
> the value ;-)
> 
> > Also, what should the new names be?  Can't call it int16.
> 
> I like oidN and int2N, or oidn and int2n if you object to uppercase
> names.

How about oidvector and int2vector.  A vector is a 1-dimmensional array.
Calling it an array is too confusing.

> 
> > Does anyone outside the source tree rely on those type names?
> 
> I was worried about that at first --- we couldn't change the names
> if it would break pg_dump files.  But the system catalogs themselves
> don't get dumped as such, so it shouldn't be a problem.  There might
> be a few folks out there who are using oid8 or int28 as column types
> in user tables, but surely not many.  What it comes down to is that
> a few people might have to tweak their code or dump files, but not
> very many compared to the number of people who will be glad of the
> improvement.

> 
> But if these types are to have parameterizable sizes, I think it's
> critical that oidNin() and int2Nin() be robust about the number of
> input values they see.  I suggest that they ought to work like this:
> 
> * if the number of supplied values is less than the currently configured
> value of N, silently fill in zeroes for the extra places.
> 
> * if the number of supplied values is more than N, check the extra
> values to see if any are not 0.  Complain if any are not 0, but
> if they are all 0 then silently accept it.
> 
> This will allow interoperability of pg_dump files across different
> values of N, and raise an error only if there's really a problem.

I will tweek the code to properly check for trailing numbers.  Right now
multiple spaces cause problems, and trailing numbers are ignored.  With
oidn, we can get away with trailing zeros because an oid of 0 is
invalid, but with int2n, a zero is valid, so I think we can't just ignore
extra trailing zeros.  We can pad with zeros, however.  Comments?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Number of index fields configurable

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I will tweek the code to properly check for trailing numbers.  Right now
> multiple spaces cause problems, and trailing numbers are ignored.  With
> oidn, we can get away with trailing zeros because an oid of 0 is
> invalid, but with int2n, a zero is valid, so I think we can't just ignore
> extra trailing zeros.  We can pad with zeros, however.  Comments?

For the primary use of these things, which is attribute numbers in
pg_index, padding or dropping zeroes is correct behavior --- unused
positions in the vector will have zero values, same as for the oid
vector.  I think it's OK to define the type's behavior suitably for
the system's use, because it's not intended as a general-purpose user
type; users oughta be using int2[].  (Really, the only reason we have
these types at all is that we depend on having compile-time-constant
field sizes in the system catalogs that are accessed via
include/catalog/'s struct declarations...)
        regards, tom lane


Re: [HACKERS] Number of index fields configurable

From
Bruce Momjian
Date:
> But if these types are to have parameterizable sizes, I think it's
> critical that oidNin() and int2Nin() be robust about the number of
> input values they see.  I suggest that they ought to work like this:
> 
> * if the number of supplied values is less than the currently configured
> value of N, silently fill in zeroes for the extra places.
> 
> * if the number of supplied values is more than N, check the extra
> values to see if any are not 0.  Complain if any are not 0, but
> if they are all 0 then silently accept it.
> 
> This will allow interoperability of pg_dump files across different
> values of N, and raise an error only if there's really a problem.

OK, different solution.  I decided there is no need to be dumping out
zeros to pad the type.  New code does the following.  This looks very
clean to me:
test=> create table x (y int28);CREATEtest=> insert into x values ('1 2 3');INSERT 18697 1test=> select * from x;   y
-------1 2 3(1 row)test=> insert into x values ('1 2 3 4 5 6 7 8');INSERT 18699 1test=> select * from x;        y
----------------- 1 2 3 1 2 3 4 5 6 7 8(3 rows)test=> insert into x values ('1 2 3 4 5 6 7 8 9');ERROR:  int28 value
hastoo many valuestest=> insert into x values ('1 2 3 4 5 6 7 8 0');ERROR:  int28 value has too many values
 

Notice the trailing zero is treated as an error.  Because we trim
trailing zeros, we can affort do handle things this way.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Number of index fields configurable

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, different solution.  I decided there is no need to be dumping out
> zeros to pad the type.

Oh, that's a thought.  You haven't really gained anything in generality,
since the code is still treating zero as a special case; but I agree it
looks nicer (and is easier to check for too many values).

Only worry I have is whether it will interoperate comfortably with the
old code.  Let's see:

* old dump to new: no problem, unless you've reduced MAX_INDEX_KEYS below 8 (doesn't seem likely).

* new to old: fails for every case except where there's exactly 8 non zero entries.

The latter is a bit bothersome, but may not be a big deal --- in reality
we don't dump and reload pg_index this way.

BTW, be sure you are only suppressing *trailing* zeroes not *embedded*
zeroes.  I know that oid8 has to deal with embedded zeroes (some of
the pg_proc entries look like that); int28 might not, but the code
should probably act the same for both.
        regards, tom lane


Re: [HACKERS] Number of index fields configurable

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, different solution.  I decided there is no need to be dumping out
> > zeros to pad the type.
> 
> Oh, that's a thought.  You haven't really gained anything in generality,
> since the code is still treating zero as a special case; but I agree it
> looks nicer (and is easier to check for too many values).
> 
> Only worry I have is whether it will interoperate comfortably with the
> old code.  Let's see:
> 
> * old dump to new: no problem, unless you've reduced MAX_INDEX_KEYS
>   below 8 (doesn't seem likely).
> 
> * new to old: fails for every case except where there's exactly 8
>   non zero entries.

Not sure about this.  Old code did sscanf on 8 entries, but if it
returned fewer, it padded with zeros, so new->old should work.

> 
> The latter is a bit bothersome, but may not be a big deal --- in reality
> we don't dump and reload pg_index this way.
> 
> BTW, be sure you are only suppressing *trailing* zeroes not *embedded*
> zeroes.  I know that oid8 has to deal with embedded zeroes (some of
> the pg_proc entries look like that); int28 might not, but the code
> should probably act the same for both.

Yes, only trailing.  New code walks from end to beginning until it finds
a non-zero.  If the entry is all zeros, you get a zero-length string
output.


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Number of index fields configurable

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> * new to old: fails for every case except where there's exactly 8
>> non zero entries.

> Not sure about this.  Old code did sscanf on 8 entries, but if it
> returned fewer, it padded with zeros, so new->old should work.

Oh, OK.  Nevermind then...
        regards, tom lane


Re: [HACKERS] Number of index fields configurable

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I will tweek the code to properly check for trailing numbers.  Right now
> > multiple spaces cause problems, and trailing numbers are ignored.  With
> > oidn, we can get away with trailing zeros because an oid of 0 is
> > invalid, but with int2n, a zero is valid, so I think we can't just ignore
> > extra trailing zeros.  We can pad with zeros, however.  Comments?
> 
> For the primary use of these things, which is attribute numbers in
> pg_index, padding or dropping zeroes is correct behavior --- unused
> positions in the vector will have zero values, same as for the oid
> vector.  I think it's OK to define the type's behavior suitably for
> the system's use, because it's not intended as a general-purpose user
> type; users oughta be using int2[].  (Really, the only reason we have
> these types at all is that we depend on having compile-time-constant
> field sizes in the system catalogs that are accessed via
> include/catalog/'s struct declarations...)

Renamed oid8 ->oidvector and int28->int2vector.  initdb everyone.  New
type names require it.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026