Thread: Dept of ugly hacks: eliminating padding space in system indexes
I was thinking a bit about how we pad columns of type NAME to fixed-width, even though they're semantically equivalent to C strings. The reason for wasting that space is that it makes it possible to overlay a C struct onto the leading columns of most system catalogs. I don't wish to propose changing that (at least not today), but it struck me that there is no reason to overlay a C struct onto index entries, and that getting rid of the padding space would be even more useful in an index than in the catalog itself. It turns out to be dead easy to implement this: effectively, we just decree that the index column storage type for NAME is always CSTRING. Because the two types are effectively binary-compatible as long as you don't look at the padding, the attached ugly-but-impressively-short patch seems to accomplish this. It passes the regression tests anyway. Here are some numbers about the space savings in a virgin database: CVS HEAD w/patch savings pg_database_size('postgres') 4439752 4071112 8.3% pg_relation_size('pg_class_relname_nsp_index') 57344 40960 28% pg_relation_size('pg_proc_proname_args_nsp_index') 319488 204800 35% Cutting a third off the size of a system index has got to be worth something, but is it worth a hack as ugly as this one? regards, tom lane Index: src/backend/catalog/index.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v retrieving revision 1.300 diff -c -r1.300 index.c *** src/backend/catalog/index.c 19 Jun 2008 00:46:04 -0000 1.300 --- src/backend/catalog/index.c 23 Jun 2008 19:34:54 -0000 *************** *** 262,267 **** --- 262,278 ---- ReleaseSysCache(tuple); } + + /* + * For an index on NAME, force the index storage to be CSTRING, + * rather than padded to fixed length. + */ + if (to->atttypid == NAMEOID) + { + to->atttypid = CSTRINGOID; + to->attlen = -2; + to->attalign = 'c'; + } } return indexTupDesc;
I would mention in the C comment that we are doing this for space savings, but other than that, it seems fine. --------------------------------------------------------------------------- Tom Lane wrote: > I was thinking a bit about how we pad columns of type NAME to > fixed-width, even though they're semantically equivalent to C strings. > The reason for wasting that space is that it makes it possible to > overlay a C struct onto the leading columns of most system catalogs. > I don't wish to propose changing that (at least not today), but it > struck me that there is no reason to overlay a C struct onto index > entries, and that getting rid of the padding space would be even more > useful in an index than in the catalog itself. It turns out to be > dead easy to implement this: effectively, we just decree that the > index column storage type for NAME is always CSTRING. Because the > two types are effectively binary-compatible as long as you don't > look at the padding, the attached ugly-but-impressively-short patch > seems to accomplish this. It passes the regression tests anyway. > Here are some numbers about the space savings in a virgin database: > > CVS HEAD w/patch savings > > pg_database_size('postgres') 4439752 4071112 8.3% > pg_relation_size('pg_class_relname_nsp_index') 57344 40960 28% > pg_relation_size('pg_proc_proname_args_nsp_index') 319488 204800 35% > > Cutting a third off the size of a system index has got to be worth > something, but is it worth a hack as ugly as this one? > > regards, tom lane > > Content-Description: index-name-as-cstring.patch > Index: src/backend/catalog/index.c > =================================================================== > RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v > retrieving revision 1.300 > diff -c -r1.300 index.c > *** src/backend/catalog/index.c 19 Jun 2008 00:46:04 -0000 1.300 > --- src/backend/catalog/index.c 23 Jun 2008 19:34:54 -0000 > *************** > *** 262,267 **** > --- 262,278 ---- > > ReleaseSysCache(tuple); > } > + > + /* > + * For an index on NAME, force the index storage to be CSTRING, > + * rather than padded to fixed length. > + */ > + if (to->atttypid == NAMEOID) > + { > + to->atttypid = CSTRINGOID; > + to->attlen = -2; > + to->attalign = 'c'; > + } > } > > return indexTupDesc; > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote: > Cutting a third off the size of a system index has got to be worth > something, but is it worth a hack as ugly as this one? > > I think so. cheers andrew
On Mon, 2008-06-23 at 15:52 -0400, Tom Lane wrote: > CVS HEAD w/patch savings > > pg_database_size('postgres') 4439752 4071112 8.3% > pg_relation_size('pg_class_relname_nsp_index') 57344 40960 28% > pg_relation_size('pg_proc_proname_args_nsp_index') 319488 204800 35% > > Cutting a third off the size of a system index has got to be worth > something, but is it worth a hack as ugly as this one? Not doing it would be more ugly, unless there is some negative side-effect? -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Andrew Dunstan wrote: > Tom Lane wrote: >> Cutting a third off the size of a system index has got to be worth >> something, but is it worth a hack as ugly as this one? >> > > I think so. Were you able to time any speedup? Is this something that would benefit installations with a lot of metadata? I presume most of this information normally easily fits in cache most of the time? I am trying to understand what exactly it is worth... :-) Cheers, mark -- Mark Mielke <mark@mielke.cc>
Mark Mielke <mark@mark.mielke.cc> writes: >> Tom Lane wrote: >>> Cutting a third off the size of a system index has got to be worth >>> something, but is it worth a hack as ugly as this one? > Were you able to time any speedup? I didn't try; can you suggest any suitable benchmark? The performance impact is probably going to be limited by our extensive use of catalog caches --- once a desired row is in a backend's catcache, it doesn't take a btree search to fetch it again. Still, the system indexes are probably "hot" enough to stay in shared buffers most of the time, and the smaller they are the more space will be left for other stuff, so I think there should be a distributed benefit. regards, tom lane
Simon Riggs <simon@2ndquadrant.com> writes: > On Mon, 2008-06-23 at 15:52 -0400, Tom Lane wrote: >> Cutting a third off the size of a system index has got to be worth >> something, but is it worth a hack as ugly as this one? > Not doing it would be more ugly, unless there is some negative > side-effect? I thought some more about why this seems ugly to me, and realized that a lot of it has to do with the change in typalign. Currently, a compiler is entitled to assume that a pointer to Name is 4-byte aligned; thus for instance it could generate word-wide instructions for copying a Name from one place to another. A "Name" that is stored as just CSTRING might break that. We are already at risk of this, really, because of all the places where we gaily pass plain old C strings to syscache and index searches on Name columns. I think the only reason we've not been burnt is that it's hard to optimize strcmp() into word-wide operations. However the solution to that seems fairly obvious: let's downgrade Name to typalign 1 instead of 4. regards, tom lane
Tom Lane wrote: <blockquote cite="mid:4440.1214262551@sss.pgh.pa.us" type="cite"><blockquote type="cite"><pre wrap="">Wereyou able to time any speedup? </pre></blockquote><pre wrap=""> I didn't try; can you suggest any suitable benchmark? </pre></blockquote><br /> Unfortunately - no. I kind of think it won't benefit any of my databases in any noticeable way.My numbers are similar to yours:<br /><br /><blockquote type="cite">pccyber=# select pg_database_size('postgres');<br/> 4468332<br /><br /> pccyber=# select pg_relation_size('pg_class_relname_nsp_index');<br/> 90112<br /><br /> pccyber=# select pg_relation_size('pg_proc_proname_args_nsp_index');<br/> 294912<br /></blockquote><br /> Not that I disagree withyour change, but < 5 Mbytes in 4 Gbytes of RAM for my main PostgreSQL system that I manage seems like a drop in thebucket. Even if 40% of pg_class_relname and pg_proc_proname indices was saved - we're talking about 154 Kbytes saved onboth those indices combined. Minor? Major? I bet I wouldn't notice unless my database requirements used up all RAM, andeven then I'm suspecting it wouldn't matter except for border line cases (like all pages required for everything elsehappened to equal 4 Gbytes near exactly).<br /><br /><blockquote cite="mid:4440.1214262551@sss.pgh.pa.us" type="cite"><prewrap="">The performance impact is probably going to be limited by our extensive use of catalog caches --- once a desired row is in a backend's catcache, it doesn't take a btree search to fetch it again. Still, the system indexes are probably "hot" enough to stay in shared buffers most of the time, and the smaller they are the more space will be left for other stuff, so I think there should be a distributed benefit. </pre></blockquote><br /> In my opinion it is 'do the right thing',rather than a performance question. It seems to me that an index keeping tracking of space characters at the end ofa name, char, varchar, or text does not make sense, and the right thing may be to do a more generic version of your patch?In the few cases that space at the end matters, couldn't that be determined by re-checking the table row after queryingit?<br /><br /> Cheers,<br /> mark<br /><br /><pre class="moz-signature" cols="72">-- Mark Mielke <a class="moz-txt-link-rfc2396E" href="mailto:mark@mielke.cc"><mark@mielke.cc></a> </pre>
> dead easy to implement this: effectively, we just decree that the > index column storage type for NAME is always CSTRING. Because the Isn't it a reason to add STORAGE option of CREATE OPERATOR CLASS to BTree? as it's done for GiST and GIN indexes. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
Mark Mielke wrote: > Not that I disagree with your change, but < 5 Mbytes in 4 Gbytes of RAM > for my main PostgreSQL system that I manage seems like a drop in the > bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was > saved - we're talking about 154 Kbytes saved on both those indices > combined. Minor? Major? I bet I wouldn't notice unless my database > requirements used up all RAM, and even then I'm suspecting it wouldn't > matter except for border line cases (like all pages required for > everything else happened to equal 4 Gbytes near exactly). Guess the mileage will vary depending on the complexity of the db structure. Shorter names will also benefit more than longer ones. >> The performance impact is probably going to be limited by our extensive >> use of catalog caches --- once a desired row is in a backend's catcache, >> it doesn't take a btree search to fetch it again. Still, the system >> indexes are probably "hot" enough to stay in shared buffers most of the >> time, and the smaller they are the more space will be left for other >> stuff, so I think there should be a distributed benefit. >> My question is whether this is limited to system catalogs? or will this benefit char() index used on any table? The second would make it more worthwhile. -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz
Shane Ambler wrote: > My question is whether this is limited to system catalogs? or will this > benefit char() index used on any table? The second would make it more > worthwhile. char(n) fields are already stored as variable-length on disk. This isn't applicable to them. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Teodor Sigaev <teodor@sigaev.ru> writes: >> dead easy to implement this: effectively, we just decree that the >> index column storage type for NAME is always CSTRING. Because the > Isn't it a reason to add STORAGE option of CREATE OPERATOR CLASS to BTree? as > it's done for GiST and GIN indexes. Hmm ... I don't see a point in exposing that as a user-level facility, unless you can point to other use-cases besides NAME. But it would be cute to implement the hack by changing the initial contents of pg_opclass instead of inserting code in the backend. I'll give that a try. regards, tom lane
Shane Ambler wrote: > Mark Mielke wrote: > >> Not that I disagree with your change, but < 5 Mbytes in 4 Gbytes of >> RAM for my main PostgreSQL system that I manage seems like a drop in >> the bucket. Even if 40% of pg_class_relname and pg_proc_proname >> indices was saved - we're talking about 154 Kbytes saved on both those >> indices combined. Minor? Major? I bet I wouldn't notice unless my >> database requirements used up all RAM, and even then I'm suspecting it >> wouldn't matter except for border line cases (like all pages required >> for everything else happened to equal 4 Gbytes near exactly). > > Guess the mileage will vary depending on the complexity of the db > structure. Shorter names will also benefit more than longer ones. There are PostgreSQL users out there with more than 100,000 tables per server instance. This will make more of a difference to them. --Josh
Josh Berkus wrote: > Shane Ambler wrote: >> Mark Mielke wrote: >> >>> Not that I disagree with your change, but < 5 Mbytes in 4 Gbytes of >>> RAM for my main PostgreSQL system that I manage seems like a drop in >>> the bucket. Even if 40% of pg_class_relname and pg_proc_proname >>> indices was saved - we're talking about 154 Kbytes saved on both >>> those indices combined. Minor? Major? I bet I wouldn't notice unless >>> my database requirements used up all RAM, and even then I'm >>> suspecting it wouldn't matter except for border line cases (like all >>> pages required for everything else happened to equal 4 Gbytes near >>> exactly). >> >> Guess the mileage will vary depending on the complexity of the db >> structure. Shorter names will also benefit more than longer ones. > > There are PostgreSQL users out there with more than 100,000 tables per > server instance. This will make more of a difference to them. More than I think people realize. Joshua D. Drake > > --Josh > >
Re: Dept of ugly hacks: eliminating padding space in system indexes
From
"Stephen R. van den Berg"
Date:
Mark Mielke wrote: >saved - we're talking about 154 Kbytes saved on both those indices >combined. Minor? Major? I bet I wouldn't notice unless my database >requirements used up all RAM, and even then I'm suspecting it wouldn't >matter except for border line cases (like all pages required for >everything else happened to equal 4 Gbytes near exactly). There is always only so much of 1st level and 2nd level cache; for those the savings might well make a difference, even on multigigabyte databases. -- Sincerely, Stephen R. van den Berg. Life is that brief interlude between nothingness and eternity.
Mark, > Not that I disagree with your change, but < 5 Mbytes in 4 Gbytes of RAM > for my main PostgreSQL system that I manage seems like a drop in the > bucket. Even if 40% of pg_class_relname and pg_proc_proname indices was > saved - we're talking about 154 Kbytes saved on both those indices > combined. Minor? Major? I bet I wouldn't notice unless my database > requirements used up all RAM, and even then I'm suspecting it wouldn't > matter except for border line cases (like all pages required for > everything else happened to equal 4 Gbytes near exactly). Again, I think the best way to test this would be to create an installation with more than 100,000 tables & views. That's not hypothetical; I've encountered it already twice in production users. --Josh Berkus
Tom Lane napsal(a): > Cutting a third off the size of a system index has got to be worth > something, but is it worth a hack as ugly as this one? > The problem what I see there is how to fit with in-place-upgrade. Catalog should be generate from scratch, but if somebody uses name in regular table it invokes request for reindex. Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql
Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes: > Tom Lane napsal(a): >> Cutting a third off the size of a system index has got to be worth >> something, but is it worth a hack as ugly as this one? > The problem what I see there is how to fit with in-place-upgrade. Catalog should > be generate from scratch, but if somebody uses name in regular table it invokes > request for reindex. Actually, an existing index stored as "name" would continue to work fine, so I think this could be worked around. But in any case name is deprecated for user use, so anyone who suffers a reindex has only themselves to blame. regards, tom lane
Tom Lane napsal(a): > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes: >> Tom Lane napsal(a): >>> Cutting a third off the size of a system index has got to be worth >>> something, but is it worth a hack as ugly as this one? > >> The problem what I see there is how to fit with in-place-upgrade. Catalog should >> be generate from scratch, but if somebody uses name in regular table it invokes >> request for reindex. > > Actually, an existing index stored as "name" would continue to work > fine, so I think this could be worked around. But in any case name is > deprecated for user use, so anyone who suffers a reindex has only > themselves to blame. Yes, it is deprecated by you know a user. Give him a loaded shot-gun and he start play a golf :-). However, reindex is acceptable penalty for user who uses deprecated things. Zdenek