Thread: How to REINDEX in high volume environments?
Hi all, Am experimenting to find out what kind of performance gain are achieved from moving indexes to a different scsi drives than the WAL files, than the data itself, etc. Have come across an interesting problem. Have moved the indexes to another drive, then created symlinks to them. Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL ANALYZE'd, prepared to re-run the benchmark again and guess what? The indexes were back on the original drive. The process of REINDEX-ing obviously creates another file then drops the original. Is there a way to allow REINDEX to work without having this side affect? Pre-creating a bunch of dangling symlinks doesn't work (tried that, it gives a "ERROR: cannot create accounts_pkey: File exists" on FreeBSD 4.6.2 when using the REINDEX). Any suggestions? :-) Regards and best wishes, Justin Clift -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On 28 Sep 2002 at 17:08, Justin Clift wrote: > Have moved the indexes to another drive, then created symlinks to them. > Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL > ANALYZE'd, prepared to re-run the benchmark again and guess what? > > The indexes were back on the original drive. > Is there a way to allow REINDEX to work without having this side affect? > > Pre-creating a bunch of dangling symlinks doesn't work (tried that, it > gives a "ERROR: cannot create accounts_pkey: File exists" on FreeBSD > 4.6.2 when using the REINDEX). Looks like we should have a subdirectory in database directory which stores index. May be transaction logs, indexes goes in separte directory which can be symlinked. Linking a directory is much simpler solution than linking a file. I suggest we have per database transaction log and indexes created in separate subdirectories for each database. Furhter given that large tables are segmented after one GB size, a table should have it's own subdirectory optionally.. At the cost of few inodes, postgresql would gain much more flexibility and hence tunability.. May be TODO for 7.4? Anyone? ByeShridhar -- Software, n.: Formal evening attire for female computer analysts.
Shridhar Daithankar wrote: <snip> > Looks like we should have a subdirectory in database directory which stores > index. That was my first thought also, but an alternative/additional approach would be this (not sure if it's workable): - As each index already has a bunch of information stored stored for it, would it be possible to have an additional column added called 'idxpath' or something? - This would mean that the index location would be stable per index, and would allow for *really* high volume environments to keep different indexes on different drives. Not sure what the default value would be, maybe the PGDATA directory, maybe something as a GUC variable, etc, but that's the concept. :-) Regards and best wishes, Justin Clift > May be transaction logs, indexes goes in separte directory which can be > symlinked. Linking a directory is much simpler solution than linking a file. > > I suggest we have per database transaction log and indexes created in separate > subdirectories for each database. Furhter given that large tables are segmented > after one GB size, a table should have it's own subdirectory optionally.. > > At the cost of few inodes, postgresql would gain much more flexibility and > hence tunability.. > > May be TODO for 7.4? Anyone? > > Bye > Shridhar > > -- > Software, n.: Formal evening attire for female computer analysts. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On 28 Sep 2002 at 17:51, Justin Clift wrote: > Shridhar Daithankar wrote: > <snip> > > Looks like we should have a subdirectory in database directory which stores > > index. > > That was my first thought also, but an alternative/additional approach > would be this (not sure if it's workable): > > - As each index already has a bunch of information stored stored for > it, would it be possible to have an additional column added called > 'idxpath' or something? > > - This would mean that the index location would be stable per index, > and would allow for *really* high volume environments to keep different > indexes on different drives. I have to disagree.. Completely.. This is like turning PG-Metadata into registry... And what happens when index starts splitting when it grows beyond 1GB in size? Putting indexes into a separate subdirectoy and mount/link that directory on a device that is on a separate SCSI channel is what I can think of as last drop of performance out of it.. Just a thought, as usual.. I don't know how much efforts it would take but if we have pg_xlog in separte configurable dir. now, putting indexes as well and having per database pg_xlog should be on the same line. The later aspect is also important IMO.. ByeShridhar -- VMS, n.: The world's foremost multi-user adventure game.
Shridhar Daithankar wrote: <snip> > And what happens when index starts splitting when it grows beyond 1GB in size? Having an index directory: i.e. $PGDATA/data/<oid>/indexes/ (that's the kind of thing you mean isn't it?) Sounds workable, and sounds better than the present approach. The reason that I was thinking of having a different path per index would be for high volume situations like this: /dev/dsk1 : /pgdata <- data here /dev/dsk2 : /pgindexes1 <- some indexes here /dev/dsk3 : /pgindexes2 <- some ultra-high volume activity here Let's say that there's a bunch of data on /dev/dsk1, and for performance reasons it's been decided to move the indexes to another drive /dev/dsk2. Now, if just one of those indexes is getting *a lot* of the drive activity, it would make sense to move it to it's own dedicated drive. Having an um... PGINDEX (that's just an identifier for this example, not an environment variable suggestion) directory location defined would mean that each time a REINDEX operation occurs, then all new indexes would be created in the same spot. That sounds better than the present approach thus far, but wouldn't work for situations where indexes are spread across multiple disk drives. The suggestion of having some kind of path info for each index is merely a thought of how to meet that potential future need, not necessarily the best method anyone has ever thought of. Like someone might pipe up and say "Nah, it could be done better XYZ way", etc. :-) Regards and best wishes, Justin Clift > Putting indexes into a separate subdirectoy and mount/link that directory on a > device that is on a separate SCSI channel is what I can think of as last drop > of performance out of it.. > > Just a thought, as usual.. > > I don't know how much efforts it would take but if we have pg_xlog in separte > configurable dir. now, putting indexes as well and having per database pg_xlog > should be on the same line. The later aspect is also important IMO.. > > Bye > Shridhar > > -- > VMS, n.: The world's foremost multi-user adventure game. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
Justin Clift dijo: Hi, > Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL > ANALYZE'd, prepared to re-run the benchmark again and guess what? > > The indexes were back on the original drive. Yes, this is expected. Same for CLUSTER. They create a different filenode and point the relation (table or index) at it. I think the separate space for indexes is a good idea. However, and this is orthogonal, I feel the way REINDEX works now is not the best, because it precludes you from using the index while you are doing it. I'm trying to implement a way to concurrently compact the indexes. I hope to have it for 7.4. -- Alvaro Herrera (<alvherre[a]atentus.com>) Y una voz del caos me hablo y me dijo "Sonrie y se feliz, podria ser peor". Y sonrei. Y fui feliz. Y fue peor.
On Sat, 2002-09-28 at 02:16, Shridhar Daithankar wrote: > On 28 Sep 2002 at 17:08, Justin Clift wrote: > > > Have moved the indexes to another drive, then created symlinks to them. > > Ran a benchmark against the database, REINDEX'd the tables, VACUUM FULL > > ANALYZE'd, prepared to re-run the benchmark again and guess what? > > > > The indexes were back on the original drive. > > Is there a way to allow REINDEX to work without having this side affect? > > > > Pre-creating a bunch of dangling symlinks doesn't work (tried that, it > > gives a "ERROR: cannot create accounts_pkey: File exists" on FreeBSD > > 4.6.2 when using the REINDEX). > > Looks like we should have a subdirectory in database directory which stores > index. > > May be transaction logs, indexes goes in separte directory which can be > symlinked. Linking a directory is much simpler solution than linking a file. > > I suggest we have per database transaction log and indexes created in separate > subdirectories for each database. Furhter given that large tables are segmented > after one GB size, a table should have it's own subdirectory optionally.. > > At the cost of few inodes, postgresql would gain much more flexibility and > hence tunability.. > > May be TODO for 7.4? Anyone? Very neat idea! Sounds like an excellent way of gaining lots of granularity! I can't even think of a reason not to use the directory per table scheme all the time. Perhaps simply allowing for a script/tool that will automatically perform such a physical table migration to a distinct directory would be in order too. Either way, sounds like a good idea. Greg
Justin Clift <justin@postgresql.org> writes: > Shridhar Daithankar wrote: >> Looks like we should have a subdirectory in database directory which stores >> index. > That was my first thought also, but an alternative/additional approach > would be this (not sure if it's workable): See the tablespaces TODO item. I'm not excited about building half-baked versions of tablespaces before we get around to doing the real thing ... regards, tom lane
On 28 Sep 2002 at 12:18, Tom Lane wrote: > Justin Clift <justin@postgresql.org> writes: > > Shridhar Daithankar wrote: > >> Looks like we should have a subdirectory in database directory which stores > >> index. > > > That was my first thought also, but an alternative/additional approach > > would be this (not sure if it's workable): > > See the tablespaces TODO item. I'm not excited about building > half-baked versions of tablespaces before we get around to doing the > real thing ... I wen thr. the messages posted regarding tablespaces. It looks like Tablesspaces should provide 1. Managability 2. Performance tuning 3. Better Administration.. Creating a directory for each object or object type would allow to do same thing. Why directory? 1. You can mount it someplace else. 2. You can symlink it without worrying about postgresql creating new files instead of symlink while drop/recreate. Whether to choose directory or tablespaces? I say directory. Why? 1. PostgreSQL philosphy has always been using facilities provided by OS and not to duplicate that work. Tablespaces directly violates that. Directory mounting does not. 2. Tablespaces combines objects on them, adding a layer of abstraction. and then come ideas like vacuuming a tablespace. Frankly given what vacuum does, I can't imagine what vacuuming tablespace would exactly do. 3. Tablespace would be a single file or structure of directories? How do we configure it? What tuning option do we provide? Basically table spaces I feel is a layer of abstraction that can be avoided if we layout the DB in a directory tree with sufficient levels. That would be easy to deal with as configuration and maitainance delegated to OS and it would be flexible enough to. Anyway if we have a directory per object/object type, how much different it's going to be from a table space? Frankly I am wary of table spaces because I have seen them in oracle and not eaxctly convinced that's the best way of doing things. If we introdude word tablespace, users will be expecting all those idiocies like taking a table space offline/online, adding data files aka pre-claiming space etc. All these are responsibilities of OS. Let OS handle it. PostgreSQL should just create a file structure which would grow as and when required. The issue looks similimar to having raw disk I/O. Oracle might have good reason to do it but are we sure postgresql needs this? Just another policy decision waiting.. Here are some links I found in archive. Would like to know more about this issue.. http://candle.pha.pa.us/mhonarc/todo.detail/tablespaces/msg00006.html http://candle.pha.pa.us/mhonarc/todo.detail/tablespaces/msg00007.html Just a thought.. ByeShridhar -- The sooner our happiness together begins, the longer it will last. -- Miramanee, "The Paradise Syndrome", stardate 4842.6
On 29 Sep 2002 at 0:43, Justin Clift wrote: > Shridhar Daithankar wrote: > The reason that I was thinking of having a different path per index > would be for high volume situations like this: > > /dev/dsk1 : /pgdata <- data here > /dev/dsk2 : /pgindexes1 <- some indexes here > /dev/dsk3 : /pgindexes2 <- some ultra-high volume activity here I would say this would look better.. /pgdata -indexes --index1 ---indexfiles --index2 ---indexfiles Where index1 and index2 are two different indexes. Just like each table gets it's own directory, each index gets it's own directory as well. So the admin would/can tune on per object basis rather than worrying about creating right group of objects and then tuning about that group. If required throwing per database transaction log there as well might prove a good idea. It would insulate one db from load of other, as far as I/O is concerned.. This possiblity is not lost with this scheme but it just gets something simpler IMO.. Just illustration of my another post on hackers on this topic.. ByeShridhar -- You're too beautiful to ignore. Too much woman. -- Kirk to Yeoman Rand, "The Enemy Within", stardate unknown
Am Samstag, 28. September 2002 10:17 schrieb Shridhar Daithankar: (snip) > I have to disagree.. Completely.. This is like turning PG-Metadata into > registry... > > And what happens when index starts splitting when it grows beyond 1GB in > size? > > Putting indexes into a separate subdirectoy and mount/link that directory > on a device that is on a separate SCSI channel is what I can think of as > last drop of performance out of it.. (snip) I think a good approach would be the introduction of tablespaces like oracle has, and assigning locations to that tablespace. Best regards,Mario Weilguni
Just wanted to pipe in here. I am still very interested in tablespaces ( I have many database systems that are over 500GB and growing) and am willing to port my tablespace patch to 7.4. I have everything (but only tested here) working in 7.2 but the patch was not accepted. I didn't see a great speed improvement but the patch helps with storage management. Recap. the patch would enable the following a database to have a default data tablespace and index tablespace a user to have a default data and index tablespace a table to have a specific tablespace an index to have a specfic tablespace I would like to also add namespace (schema) to have a default data and index tablespaces Jim > Justin Clift <justin@postgresql.org> writes: > > Shridhar Daithankar wrote: > >> Looks like we should have a subdirectory in database directory which stores > >> index. > > > That was my first thought also, but an alternative/additional approach > > would be this (not sure if it's workable): > > See the tablespaces TODO item. I'm not excited about building > half-baked versions of tablespaces before we get around to doing the > real thing ... > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html
Jim, glad you are still around. Yes, we would love to get tablespaces in 7.4. I think we need to think bigger and get something where we can name tablespaces and place tables/indexes into these named spaces. I can reread the TODO.detail stuff and give you an outline. How does that sound? Thomas Lockhart is also interested in this feature. --------------------------------------------------------------------------- Jim Buttafuoco wrote: > Just wanted to pipe in here. I am still very interested in tablespaces ( I have many database systems that are over > 500GB and growing) and am willing to port my tablespace patch to 7.4. I have everything (but only tested here) working > in 7.2 but the patch was not accepted. I didn't see a great speed improvement but the patch helps with storage management. > > Recap. the patch would enable the following > > a database to have a default data tablespace and index tablespace > a user to have a default data and index tablespace > a table to have a specific tablespace > an index to have a specfic tablespace > > I would like to also add namespace (schema) to have a default data and index tablespaces > > Jim > > > > > > Justin Clift <justin@postgresql.org> writes: > > > Shridhar Daithankar wrote: > > >> Looks like we should have a subdirectory in database directory which stores > > >> index. > > > > > That was my first thought also, but an alternative/additional approach > > > would be this (not sure if it's workable): > > > > See the tablespaces TODO item. I'm not excited about building > > half-baked versions of tablespaces before we get around to doing the > > real thing ... > > > > regards, tom lane > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073