Thread: Planning to force reindex of hash indexes

Planning to force reindex of hash indexes

From: Tom Lane
Date:
I've found a number of infelicities in the hash index code that can't be
fixed without an on-disk format change.  The biggest one is that the 
hashm_ntuples field in hash meta pages is only uint32, meaning that
hash index space management will become confused if the number of
entries exceeds 4G.  I'd like to change it to a "double", and clean up
a couple other uglinesses at the same time.
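[A sketch of the wraparound Tom describes. The struct below is a hypothetical simplification, not the actual PostgreSQL metapage layout; it only illustrates why a uint32 counter silently wraps past 2^32 entries while a double keeps counting exactly up to 2^53.]

```c
#include <stdint.h>

/* Hypothetical simplification of the field in question: the old uint32
 * hashm_ntuples wraps modulo 2^32, while the proposed double does not
 * (and stays exact well past any plausible tuple count). */
typedef struct HashMetaSketch
{
    uint32_t ntuples_old;   /* pre-change layout: wraps at 4G entries */
    double   ntuples_new;   /* proposed replacement: no wrap in practice */
} HashMetaSketch;

/* Simulate recording n inserted tuples in both representations. */
static HashMetaSketch
count_tuples(uint64_t n)
{
    HashMetaSketch meta;
    meta.ntuples_old = (uint32_t) n;  /* truncates: same effect as wrapping */
    meta.ntuples_new = (double) n;
    return meta;
}
```

[Past 4G entries the uint32 counter reports a tiny tuple count, so the space-management code would badly misjudge how full the index is.]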

Ordinarily I'd just force an initdb for such a change, but at this late
stage of the 7.4 cycle it seems better to avoid requiring initdb,
especially since many beta testers wouldn't be using hash indexes anyway
and shouldn't need to reload.  What I intend to do instead is increment
the version number that already exists in the hash metapage, and add
code to spit out a "please reindex this index" error if the version
number isn't right.  A REINDEX command would be sufficient to
reconstruct the index in the new format.
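[The check Tom proposes might look roughly like this. Names and the version constant are illustrative, not the actual PostgreSQL source; a real implementation would raise the "please reindex" message via ereport() rather than return a flag.]

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical: the bumped metapage version for the new on-disk format. */
#define HASH_VERSION 2

typedef struct HashMetaSketch2
{
    uint32_t hashm_version;   /* format version stored in the metapage */
} HashMetaSketch2;

/* Returns true if the on-disk format matches what this server expects;
 * a false result is where the "please REINDEX this index" error fires. */
static bool
hash_version_ok(const HashMetaSketch2 *metap)
{
    return metap->hashm_version == HASH_VERSION;
}
```

[Since the metapage already carries a version number, old-format indexes are detected on first access rather than corrupting silently, and REINDEX rebuilds them in the new format.]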

Any objections?
        regards, tom lane


Re: Planning to force reindex of hash indexes

From: Bruce Momjian
Date:
Tom Lane wrote:
> I've found a number of infelicities in the hash index code that can't be
> fixed without an on-disk format change.  The biggest one is that the 
> hashm_ntuples field in hash meta pages is only uint32, meaning that
> hash index space management will become confused if the number of
> entries exceeds 4G.  I'd like to change it to a "double", and clean up
> a couple other uglinesses at the same time.
> 
> Ordinarily I'd just force an initdb for such a change, but at this late
> stage of the 7.4 cycle it seems better to avoid requiring initdb,
> especially since many beta testers wouldn't be using hash indexes anyway
> and shouldn't need to reload.  What I intend to do instead is increment
> the version number that already exists in the hash metapage, and add
> code to spit out a "please reindex this index" error if the version
> number isn't right.  A REINDEX command would be sufficient to
> reconstruct the index in the new format.
> 
> Any objections?

Good plan.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Planning to force reindex of hash indexes

From: "Mendola Gaetano"
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> I've found a number of infelicities in the hash index code that can't be
> fixed without an on-disk format change.  The biggest one is that the 
> hashm_ntuples field in hash meta pages is only uint32, meaning that
> hash index space management will become confused if the number of
> entries exceeds 4G.  I'd like to change it to a "double", and clean up
> a couple other uglinesses at the same time.

How can we avoid this kind of mess for the future ?

Regards
Gaetano Mendola



Re: Planning to force reindex of hash indexes

From: Tom Lane
Date:
"Mendola Gaetano" <mendola@bigfoot.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> I've found a number of infelicities in the hash index code that can't be
>> fixed without an on-disk format change.

> How can we avoid this kind of mess for the future ?

Build a time machine, go back fifteen years, wave a magic wand to
increase the IQ levels of the Berkeley grad students?  Sometimes
we just have to change bad decisions, that's all.
        regards, tom lane


Re: Planning to force reindex of hash indexes

From: "Mendola Gaetano"
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> "Mendola Gaetano" <mendola@bigfoot.com> writes:
> > "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> >> I've found a number of infelicities in the hash index code that can't
> >> be fixed without an on-disk format change.
>
> > How can we avoid this kind of mess for the future ?
>
> Build a time machine, go back fifteen years, wave a magic wand to
> increase the IQ levels of the Berkeley grad students?

:-)

> Sometimes we just have to change bad decisions, that's all.

I don't know how old the incriminated code is, but what I mean is:
is there no way to improve code once it has been approved?

Regards
Gaetano Mendola





Re: Planning to force reindex of hash indexes

From: Bruce Momjian
Date:
Mendola Gaetano wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> > "Mendola Gaetano" <mendola@bigfoot.com> writes:
> > > "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> > >> I've found a number of infelicities in the hash index code that can't
> > >> be fixed without an on-disk format change.
> >
> > > How can we avoid this kind of mess for the future ?
> >
> > Build a time machine, go back fifteen years, wave a magic wand to
> > increase the IQ levels of the Berkeley grad students?
> 
> :-)
> 
> > Sometimes we just have to change bad decisions, that's all.
> 
> I don't know how old the incriminated code is, but what I mean is:
> is there no way to improve code once it has been approved?

I am not sure what you are asking, but the bug existed long before we
got the code from Berkeley, and even then, it might have slipped by
without our noticing anyway --- we are never going to be perfect.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Planning to force reindex of hash indexes

From: jearl@bullysports.com
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> "Mendola Gaetano" <mendola@bigfoot.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> I've found a number of infelicities in the hash index code that
>>> can't be fixed without an on-disk format change.
>
>> How can we avoid this kind of mess for the future ?
>
> Build a time machine, go back fifteen years, wave a magic wand to
> increase the IQ levels of the Berkeley grad students?  Sometimes we
> just have to change bad decisions, that's all.
>
>             regards, tom lane

Actually, I think that your time would be better spent on the time
machine.  After all, with working time travel you wouldn't have to
worry about optimizing PostgreSQL.  Queries could take as long as they
needed to take and then you could send the response back in time to
right before the query was issued.

It's too bad you don't have the time machine finished already.  You
could use the time machine to go back in time and work on the time
machine :).  You'd be finished in no time.

Jason