Thread: Planning to force reindex of hash indexes
I've found a number of infelicities in the hash index code that can't be fixed without an on-disk format change. The biggest one is that the hashm_ntuples field in hash meta pages is only uint32, meaning that hash index space management will become confused if the number of entries exceeds 4G. I'd like to change it to a "double", and clean up a couple other uglinesses at the same time.

Ordinarily I'd just force an initdb for such a change, but at this late stage of the 7.4 cycle it seems better to avoid requiring initdb, especially since many beta testers wouldn't be using hash indexes anyway and shouldn't need to reload. What I intend to do instead is increment the version number that already exists in the hash metapage, and add code to spit out a "please reindex this index" error if the version number isn't right. A REINDEX command would be sufficient to reconstruct the index in the new format.

Any objections?

regards, tom lane
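For concreteness, here is a minimal sketch in PostgreSQL-style C of the change Tom describes: widening hashm_ntuples and refusing to use a metapage whose version number doesn't match. The names used (HashMetaPageData, hashm_version, HASH_VERSION, ERRCODE_INDEX_CORRUPTED, _hash_check_version) follow the conventions of the hash access method headers but are assumptions here, not quotes from the eventual patch; the version value shown is illustrative.

    /*
     * Sketch only (assumed names, in the style of src/include/access/hash.h);
     * not the committed patch.
     */
    #include "postgres.h"
    #include "access/hash.h"
    #include "utils/rel.h"

    /*
     * In HashMetaPageData, the tuple count is widened so it cannot wrap
     * past 4G entries:
     *
     *     uint32  hashm_ntuples;   -- old: wraps around at 2^32
     *     double  hashm_ntuples;   -- new: stays exact well past 2^32
     *
     * and the on-disk version constant is incremented (value illustrative):
     *
     *     #define HASH_VERSION    2
     */

    /*
     * Called when the metapage is read: if the on-disk version does not
     * match what this server expects, refuse to touch the index and tell
     * the user how to fix it, rather than misinterpreting the page.
     */
    static void
    _hash_check_version(Relation rel, HashMetaPage metap)
    {
        if (metap->hashm_version != HASH_VERSION)
            ereport(ERROR,
                    (errcode(ERRCODE_INDEX_CORRUPTED),
                     errmsg("index \"%s\" has wrong hash version",
                            RelationGetRelationName(rel)),
                     errhint("Please REINDEX it.")));
    }

With this in place, an old-format index raises the error the first time it is used, and running REINDEX on that index rebuilds it with the new metapage layout, after which the error goes away.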
Tom Lane wrote:
> I've found a number of infelicities in the hash index code that can't be
> fixed without an on-disk format change. The biggest one is that the
> hashm_ntuples field in hash meta pages is only uint32, meaning that
> hash index space management will become confused if the number of
> entries exceeds 4G. I'd like to change it to a "double", and clean up
> a couple other uglinesses at the same time.
>
> Ordinarily I'd just force an initdb for such a change, but at this late
> stage of the 7.4 cycle it seems better to avoid requiring initdb,
> especially since many beta testers wouldn't be using hash indexes anyway
> and shouldn't need to reload. What I intend to do instead is increment
> the version number that already exists in the hash metapage, and add
> code to spit out a "please reindex this index" error if the version
> number isn't right. A REINDEX command would be sufficient to
> reconstruct the index in the new format.
>
> Any objections?

Good plan.

--
Bruce Momjian                       |  http://candle.pha.pa.us
pgman@candle.pha.pa.us              |  (610) 359-1001
+  If your life is a hard drive,    |  13 Roberts Road
+  Christ can be your backup.       |  Newtown Square, Pennsylvania 19073
"Tom Lane" <tgl@sss.pgh.pa.us> wrote: > I've found a number of infelicities in the hash index code that can't be > fixed without an on-disk format change. The biggest one is that the > hashm_ntuples field in hash meta pages is only uint32, meaning that > hash index space management will become confused if the number of > entries exceeds 4G. I'd like to change it to a "double", and clean up > a couple other uglinesses at the same time. How can we avoid this kind of mess for the future ? Regards Gaetano Mendola
"Mendola Gaetano" <mendola@bigfoot.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >> I've found a number of infelicities in the hash index code that can't be >> fixed without an on-disk format change. > How can we avoid this kind of mess for the future ? Build a time machine, go back fifteen years, wave a magic wand to increase the IQ levels of the Berkeley grad students? Sometimes we just have to change bad decisions, that's all. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> wrote: > "Mendola Gaetano" <mendola@bigfoot.com> writes: > > "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > >> I've found a number of infelicities in the hash index code that can't be > >> fixed without an on-disk format change. > > > How can we avoid this kind of mess for the future ? > > Build a time machine, go back fifteen years, wave a magic wand to > increase the IQ levels of the Berkeley grad students? :-) > Sometimes we just have to change bad decisions, that's all. I don't know how much old is the code incriminated but I mean there is no way to improve the code approved ? Regards Gaetano Mendola
Mendola Gaetano wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> "Mendola Gaetano" <mendola@bigfoot.com> writes:
>>> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>>> I've found a number of infelicities in the hash index code that
>>>> can't be fixed without an on-disk format change.
>>
>>> How can we avoid this kind of mess in the future?
>>
>> Build a time machine, go back fifteen years, wave a magic wand to
>> increase the IQ levels of the Berkeley grad students?
>
> :-)
>
>> Sometimes we just have to change bad decisions, that's all.
>
> I don't know how old the code in question is, but what I mean is:
> is there no way to improve how code gets approved?

I am not sure what you are asking, but the bug existed long before we got the code from Berkeley, and even then it might have slipped by without our noticing anyway --- we are never going to be perfect.

--
Bruce Momjian                       |  http://candle.pha.pa.us
pgman@candle.pha.pa.us              |  (610) 359-1001
+  If your life is a hard drive,    |  13 Roberts Road
+  Christ can be your backup.       |  Newtown Square, Pennsylvania 19073
Tom Lane <tgl@sss.pgh.pa.us> writes:
> "Mendola Gaetano" <mendola@bigfoot.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>> I've found a number of infelicities in the hash index code that
>>> can't be fixed without an on-disk format change.
>
>> How can we avoid this kind of mess in the future?
>
> Build a time machine, go back fifteen years, wave a magic wand to
> increase the IQ levels of the Berkeley grad students? Sometimes we
> just have to change bad decisions, that's all.
>
> regards, tom lane

Actually, I think that your time would be better spent on the time machine. After all, with working time travel you wouldn't have to worry about optimizing PostgreSQL. Queries could take as long as they needed to take and then you could send the response back in time to right before the query was issued.

It's too bad you don't have the time machine finished already. You could use the time machine to go back in time and work on the time machine :). You'd be finished in no time.

Jason