Thread: possible patch to increase number of hash overflow pages?
Hello all,

I hope it is OK to post to this mailing list regarding this issue (if not, please let me know). I was attempting to index an int4 column on a table with 6x10^7 rows using the "hash" index algorithm under PostgreSQL 7.1 on Debian Linux, and received the following error message:

    nubs=# create index throughput_index_service_fk on throughput_datum
           using hash (service_fk);
    ERROR:  HASH: Out of overflow pages.  Out of luck.

Looking into the source code a bit, it looked (to my untrained eye) as if it might be possible to increase the number of overflow pages, with a patch to src/include/access/hash.h to use a 32-bit "overflow page address" data type rather than a 16-bit one. However, I really don't know much about the internals of Postgres or its hash algorithm, so I thought I should ask whether this change would be at all workable. If doing this is just a *bad* idea, please let me know; I'm new to Postgres and am guessing this issue must have come up before.

Here is the patch:

    sramsey.ocp$ diff hash.h hash.h.new
    42a43,45
    >
    > /* SAR hacking this to see if the overflow page addresses can be increased in size to 32 bits */
    >
    44c47,48
    < typedef bits16 OverflowPageAddress;
    ---
    > /*typedef bits16 OverflowPageAddress;*/
    > typedef bits32 OverflowPageAddress;
    51,52c55,56
    < #define SPLITSHIFT 11
    < #define SPLITMASK 0x7FF
    ---
    > #define SPLITSHIFT 24
    > #define SPLITMASK 0xFFFFFF
    138c142
    < #define NCACHED 32
    ---
    > #define NCACHED 256

Will this work? Please CC "sramsey@internap.com" for any replies, because I'm not on the mailing list.

Thanks in advance,
Steve Ramsey

---------------------------------------------
Stephen Ramsey
Software Engineer
Core Software Development
Internap Network Services
e-mail sramsey@internap.com
telephone 206.504.5361
facsimile 206.447.1870
---------------------------------------------
Stephen Ramsey <sramsey@internap.com> writes:
> I was attempting to index an int4 column on a table with 6x10^7 rows using
> the "hash" index algorithm under PostgreSQL 7.1 on Debian Linux, and
> received the following error message:
> nubs=# create index throughput_index_service_fk on throughput_datum using
> hash (service_fk);
> ERROR: HASH: Out of overflow pages. Out of luck.

Just out of curiosity, what's the reason for using a hash index at all? The btree index type is much better supported and will do everything that a hash index could do (and more).

> Looking into the source code a bit, it looked (to my untrained eye) as if
> it might be possible to increase the number of overflow pages, with a
> patch to src/include/access/hash.h to use a 32-bit "overflow page address"
> data type rather than a 16-bit "overflow page address" data type.

I haven't looked much at hash either, but am I right to guess that overflow pages are used when an individual hash bucket fills up? If so, overrunning a 16-bit field would suggest that you've got more than 64K index pages in a single hash bucket ... which does not bode well at all for performance. Seems like the answer is to get the thing to use more hash buckets, not to make it possible to support linear searches over chains exceeding 64K pages...

			regards, tom lane
Hello Tom,

Thanks for your reply.

> Just out of curiosity, what's the reason for using a hash index at all?
> The btree index type is much better supported and will do everything
> that a hash index could do (and more).

Good point. I'm using a btree index currently, and it is working great, even for tables with 10^8 rows. I pursued the overflow pages with the hash access method only because I wanted to compare it with btree to see which performed better.

> > Looking into the source code a bit, it looked (to my untrained eye) as if
> > it might be possible to increase the number of overflow pages, with a
> > patch to src/include/access/hash.h to use a 32-bit "overflow page address"
> > data type rather than a 16-bit "overflow page address" data type.
>
> I haven't looked much at hash either, but am I right to guess that
> overflow pages are used when an individual hash bucket fills up?
> If so, overrunning a 16-bit field would suggest that you've got more
> than 64K index pages in a single hash bucket ... which does not bode
> well at all for performance. Seems like the answer is to get the thing
> to use more hash buckets, not to make it possible to support linear
> searches over chains exceeding 64K pages...

I believe that the algorithm uses only 5 of the 16 bits in the OverflowPageAddress type to identify the "split number" part of the overflow page address. These five bits in turn dictate the size (2^5 = 32) of the hashm_spares[] and hashm_mapp[] arrays in the HashMetaPageData structure. From the comments in hash.h:

 * The reason that the size is restricted to NCACHED (32) is because
 * the bitmaps are 16 bits: upper 5 represent the splitpoint, lower 11
 * indicate the page number within the splitpoint. Since there are
 * only 5 bits to store the splitpoint, there can only be 32 splitpoints.
 * Both spares[] and bitmaps[] use splitpoints as their indices, so there
 * can only be 32 of them.
 */

It looks (from the hash algorithm code) as if the system is possibly needing more splitpoints than can be accommodated by the HashMetaPageData structure, rather than truly running out of overflow pages: the error is raised when the "splitnum" variable reaches NCACHED, the latter being the array bound for the hashm_spares[] element of the HashMetaPageData structure. From src/backend/access/hash/hashovfl.c:

    #define OVMSG "HASH: Out of overflow pages.  Out of luck.\n"

    if (offset > SPLITMASK)
    {
        if (++splitnum >= NCACHED)
            elog(ERROR, OVMSG);
        metap->OVFL_POINT = splitnum;
        metap->SPARES[splitnum] = metap->SPARES[splitnum - 1];
        metap->SPARES[splitnum - 1]--;
        offset = 0;
    }

So that's why I bumped the number of bits (in the OverflowPageAddress type) assigned to keep track of splitpoints to 8 bits.

Cheers,
Steve Ramsey
> It looks (from the hash algorithm code) as if the system is possibly
> needing more splitpoints than can be accommodated by the HashMetaPageData
> structure, rather than running out of overflow pages, because the error
> message that I'm getting is when the "splitnum" variable reaches
> NCACHED, the latter being the array bound for the hashm_spares[] element
> of the HashMetaPageData structure. From src/backend/access/hash/hashovfl.c:
>
>     #define OVMSG "HASH: Out of overflow pages.  Out of luck.\n"
>
>     if (offset > SPLITMASK)
>     {
>         if (++splitnum >= NCACHED)
>             elog(ERROR, OVMSG);
>         metap->OVFL_POINT = splitnum;
>         metap->SPARES[splitnum] = metap->SPARES[splitnum - 1];
>         metap->SPARES[splitnum - 1]--;
>         offset = 0;
>     }
>
> So that's why I bumped the number of bits (in the OverflowPageAddress
> type) assigned to keep track of splitpoints to 8 bits.

If you want to work on the hash stuff, let us know. We are looking for someone to do performance testing and debugging on hash indexes.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce,

Sure, I would be happy to look into this. I'm a Postgres beginner, so it may take me a while to get up to speed, but I am happy to develop and run some tests, identify any potential or actual issues, and attempt to remedy those I find.

Cheers,
Steve Ramsey

On Tue, 19 Jun 2001, Bruce Momjian wrote:
> If you want to work on the hash stuff, let us know. We are looking for
> someone to do performance testing and debugging on hash indexes.
> Bruce,
>
> Sure, I would be happy to look into this. I'm a Postgres beginner, so it
> may take me a while to get up to speed, but I am happy to develop and run
> some tests, identify any potential or actual issues, and attempt to
> remedy those I find.

Great. First, can you run tests to see whether hash is faster or slower than btree in various cases? Second, if you could understand the hash code, maybe there are some improvements that can be made.

--
  Bruce Momjian