Thread: why is gist index taking so much space on the disc
Hi folks

My work on a GiST index for a custom type is nearly finished. It is working as it stands, but there are a few problems here and there. One of them is the amount of disk space the index itself takes. The type structure itself is 160 bytes. After adding 100,000 rows into a table - CREATE TABLE blah (a serial, b customType); - my GiST index takes around 2GB on disk!

100,000 is a large number, but the purpose of having GiST in the first place is defeated if the machine can't handle fast I/O or doesn't have at least 3GB of RAM: first to hold the index in the cache, and second for postgres's own caching (shared memory).

Is it normal for the index to be so huge, even though my type has built-in masks (an element that can match several different values) and a "%." prefix for strings (which behaves just like the SQL % in b ~ '%.something')? Both are used to build "unions" for picksplit and the other operations. Is it because of picksplit itself? It does a good job of splitting a set of elements into two separate ones: it sorts them first, then creates a common "mask" for each of L and P, then scans the whole set again and puts each element matching into L or P. The L and P masks sometimes overlap, but so far I can't find a better solution. Having to iterate 10 or 20 times using k-means (the type holds a tree-like structure) isn't going to boost efficiency either.

This index works, and it is very fast, but still large. So, final question: what should I do to make the index much smaller on disk?

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would it?" - AE
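A quick way to quantify the bloat is to read the page count straight from pg_class. A minimal check, assuming (hypothetically) that the index is named blah_idx; note that relpages is only refreshed by VACUUM or ANALYZE:

    select relname, relpages, reltuples
    from pg_class
    where relname = 'blah_idx';

At the default 8KB page size, 2GB is roughly 262,000 pages for only 100,000 indexed keys - more than two pages per key, so the tree is almost entirely empty space.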
On 2005-11-21, at 19:32, Martijn van Oosterhout wrote:

> On Mon, Nov 21, 2005 at 04:58:25PM +0100, Grzegorz Jaskiewicz wrote:
>> My work on a GiST index for a custom type is nearly finished. It
>> is working as it stands, but there are a few problems here and there.
>> One of them is the amount of disk space the index itself takes. The
>> type structure itself is 160 bytes. After adding 100,000 rows into a
>> table - CREATE TABLE blah (a serial, b customType);
>
> Let's see, 160 bytes means you'll get about 50 keys per page. So you
> would expect 2000 leaf pages and 40 level-1 pages. This should be no
> more than 20-30MB.

yep;

> You mean you sometimes put the same elements in the two halves? You
> shouldn't do that. The whole point is that the search will descend any
> node that matches consistent, but any single key should only appear
> once in each index.
>
> picksplit should *split* the set, not return two sets about the same
> size as you started...

Nope, I mean that the 'masks' created to match either 'half' sometimes match elements in the other one. This shouldn't be a big deal: just one more level to go down on a query, to a much more specific result set. I have fixed that with somewhat of a hack. Here's some pseudo code:

sort_data(input);
find_split_point(input);          /* divides input into halves input[0] and input[1] */
mask1 = generate_two_masks(input[0]);
mask2 = generate_two_masks(input[1]);

foreach (input) {
    bool a = matches1(input);
    bool b = matches2(input);

    if (a && b) {
        /* the element matches both masks */
        if (left_index == 0) {
            left[left_index++] = input;      /* never leave the left half empty */
        } else if (right_index == 0) {
            right[right_index++] = input;    /* never leave the right half empty */
        }
        /* this part is new code, and helped a lot; now the GiST index takes
           much less space and is much faster, because of lower I/O consumption */
        else if (loop_index % 2) {
            right[right_index++] = input;
        } else {
            left[left_index++] = input;
        }
    } else {
        if (a) left[left_index++] = input;
        if (b) right[right_index++] = input;
    }
}

mask1 = generate(left);
mask2 = generate(right);
return (left, right, blah, blih, others);

Ok, so the part with loop_index % 2 helped a lot: it distributes elements matching both masks evenly. Thanks guys. I will play with k-means and see if it works better without hacks. Either way, I have to have some code that handles the "matches both" case.

Thanks again.

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would it?" - AE
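For reference, here is a minimal sketch of how two halves like these, and their regenerated masks, map onto PostgreSQL's picksplit interface. The type name gtype and the helpers decide_side() and union_of() are hypothetical stand-ins for the actual implementation; only the GistEntryVector/GIST_SPLITVEC plumbing is the real API:

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/gist.h"

    PG_FUNCTION_INFO_V1(gtype_picksplit);

    Datum
    gtype_picksplit(PG_FUNCTION_ARGS)
    {
        GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0);
        GIST_SPLITVEC   *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1);
        OffsetNumber     i;
        OffsetNumber     maxoff = entryvec->n - 1;
        int              nleft = 0;
        int              nright = 0;

        v->spl_left = (OffsetNumber *) palloc((maxoff + 1) * sizeof(OffsetNumber));
        v->spl_right = (OffsetNumber *) palloc((maxoff + 1) * sizeof(OffsetNumber));

        for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
        {
            /* decide_side() is a hypothetical helper standing in for the
             * mask matching plus the alternating assignment shown above */
            if (decide_side(entryvec->vector[i].key, i))
                v->spl_left[nleft++] = i;
            else
                v->spl_right[nright++] = i;
        }

        v->spl_nleft = nleft;
        v->spl_nright = nright;

        /* regenerate the union mask of each half, as in the pseudo code;
         * union_of() is likewise hypothetical */
        v->spl_ldatum = union_of(entryvec, v->spl_left, nleft);
        v->spl_rdatum = union_of(entryvec, v->spl_right, nright);

        PG_RETURN_POINTER(v);
    }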
Take this query:

select a, b from dupa where b::text in (select b::text from dupa group by b::text having count(b) > 2);

This is acceptable for checking a unique constraint; however, we cannot mark the column unique without defining btree operators, which are clearly not possible since the type has no sensible sort order. Is there any way to base the operators on the text representation of the type for strict equality (not to be confused with same or equivalent), and thus use that not as an ordering method but as a simple equality test for uniqueness?

Kevin McArthur

----- Original Message -----
From: "Grzegorz Jaskiewicz" <gj@pointblue.com.pl>
To: <pgsql-hackers@postgresql.org>
Sent: Monday, November 21, 2005 7:58 AM
Subject: [HACKERS] why is gist index taking so much space on the disc

> Hi folks
>
> My work on a GiST index for a custom type is nearly finished. It is
> working as it stands, but there are a few problems here and there.
> One of them is the amount of disk space the index itself takes.
<snip>
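For what it's worth, if the b::text cast in the query above works and is immutable for this type, one way to get this today without any new operators is a unique index on the expression itself (the index name here is made up):

    create unique index dupa_b_text_uniq on dupa ((b::text));

This won't let you write UNIQUE in the column definition, but it enforces exactly the "distinct text representation" rule, using the ordinary btree operators of text rather than any ordering on the custom type.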
> So final question, what should I do to make that index much smaller on
> the disc.

Tune your penalty and picksplit functions. The Gevel module can help you look inside the index ( http://www.sai.msu.su/~megera/postgres/gist/gevel ).

Usually an index becomes big when picksplit works badly: during a split it places one key on one page and all the other keys on another page, so you get a huge number of pages each holding a single key.

--
Teodor Sigaev                       E-mail: teodor@sigaev.ru
                                    WWW: http://www.sigaev.ru/
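A sketch of what that inspection might look like, going by the functions described on the Gevel page above (the index name blah_idx is hypothetical; check Gevel's README for the exact signatures):

    -- summary statistics: levels, tuple counts, space usage
    select gist_stat('blah_idx');

    -- per-page view of the tree structure
    select gist_tree('blah_idx');

If most pages turn out to hold only one or two tuples, that confirms the bad-picksplit diagnosis.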
On Mon, Nov 21, 2005 at 04:58:25PM +0100, Grzegorz Jaskiewicz wrote:

> My work on a GiST index for a custom type is nearly finished. It
> is working as it stands, but there are a few problems here and there.
> One of them is the amount of disk space the index itself takes. The
> type structure itself is 160 bytes. After adding 100,000 rows into a
> table - CREATE TABLE blah (a serial, b customType);

Let's see, 160 bytes means you'll get about 50 keys per page. So you would expect 2000 leaf pages and 40 level-1 pages. This should be no more than 20-30MB.

<snip>

> Is it normal for the index to be so huge? [...] Is it because of
> picksplit itself? It does a good job of splitting a set of elements
> into two separate ones: it sorts them first, then creates a common
> "mask" for each of L and P, then scans the whole set again and puts
> each element matching into L or P. The L and P masks sometimes
> overlap, but so far I can't find a better solution.

You mean you sometimes put the same elements in the two halves? You shouldn't do that. The whole point is that the search will descend any node that matches consistent, but any single key should only appear once in each index.

picksplit should *split* the set, not return two sets about the same size as you started...

Hope this helps,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
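Spelling that estimate out (assuming the default 8KB page size and ignoring per-page and per-tuple overhead): 8192 / 160 gives about 51 keys per page; 100,000 keys at ~50 per page is 2,000 leaf pages; 2,000 / 50 is 40 level-1 pages; and (2,000 + 40 + 1) * 8KB comes to roughly 16MB. Even with a generous allowance for half-full pages the index should land in the 20-30MB range, so 2GB is off by about two orders of magnitude.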
On Mon, Nov 21, 2005 at 08:14:44PM +0100, Grzegorz Jaskiewicz wrote:

> >You mean you sometimes put the same elements in the two halves? You
> >shouldn't do that. The whole point is that the search will descend any
> >node that matches consistent, but any single key should only appear
> >once in each index.
> >
> >picksplit should *split* the set, not return two sets about the same
> >size as you started...
>
> Nope, I mean that the 'masks' created to match either 'half' sometimes
> match elements in the other one.
> This shouldn't be a big deal: just one more level to go down on a query,
> to a much more specific result set.
> I have fixed that with somewhat of a hack.

It's not a hack; that's how it's supposed to work. An entry should only appear once in the index, but it could legitimately go in more than one place. Like you say, some entries can go into either half.

B-trees are the rather special case where you can always split a set of values into two non-overlapping sets. With geometric types (like your bitmasks) you can't always avoid overlap, so you have to follow multiple branches to check whether an element is there or not.

Your pseudo code is good and will work fine. However, ideally you want to divide the overlapping entries in such a way that later splits work better - maybe by trying to decide which mask is "closer". The better the splitting, the more efficient your tree will become. Of course, perfect splitting may be expensive, but it depends on how many inserts vs how many selects you do. If you do a lot of searches it may be worth the time.

BTW, I'm glad you're making progress, and hopefully you'll be able to publish some code. PostgreSQL could do with some example GiST indexes on bitmaps.

Have a nice day,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
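To make "closer" concrete: a common way to decide (Guttman's original R-tree insertion heuristic, not anything specific to this type) is to put an ambiguous entry on the side whose union mask grows the least by absorbing it. A sketch in C, where Mask, mask_size() and mask_union_size() are hypothetical helpers over the custom type:

    /*
     * Return true if the entry should go into the left half.
     * mask_size() measures how "wide" a mask is (how many values it can
     * match); mask_union_size() measures the union of a mask and a key.
     * Both are hypothetical helpers for the custom type.
     */
    static bool
    goes_left(const Mask *key, const Mask *left_union, const Mask *right_union)
    {
        long grow_left = mask_union_size(left_union, key) - mask_size(left_union);
        long grow_right = mask_union_size(right_union, key) - mask_size(right_union);

        if (grow_left != grow_right)
            return grow_left < grow_right;

        /* tie-break on the current sizes to keep the halves balanced */
        return mask_size(left_union) < mask_size(right_union);
    }

This is the same quantity a GiST penalty function computes, so the picksplit and penalty support functions can share the measurement code.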
On Mon, 21 Nov 2005, Martijn van Oosterhout wrote:

<snip>

> Your pseudo code is good and will work fine. However, ideally you want
> to divide the overlapping entries in such a way that later splits work
> better - maybe by trying to decide which mask is "closer".
>
> The better the splitting, the more efficient your tree will become.
> Of course, perfect splitting may be expensive, but it depends on how
> many inserts vs how many selects you do. If you do a lot of searches
> it may be worth the time.

Martijn is perfectly right here. You probably need to read some of the classical papers, for example "R-trees: A Dynamic Index Structure for Spatial Searching" by Antonin Guttman.

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83