Thread: Creating indexes in the background
Hey folks, It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? --David
On Oct 27, 2012, at 19:20, David Lee <davidomundo@gmail.com> wrote: > Hey folks, > > It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? > > --David > No there is not. David J.
<p>Thanks. Is this something viable as a feature request? <div class="gmail_quote">On Oct 28, 2012 7:48 AM, "David Johnston"<<a href="mailto:polobo@yahoo.com">polobo@yahoo.com</a>> wrote:<br type="attribution" /><blockquote class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On Oct 27, 2012, at 19:20, DavidLee <<a href="mailto:davidomundo@gmail.com">davidomundo@gmail.com</a>> wrote:<br /><br /> > Hey folks,<br />><br /> > It seems like right now when you want to create an index concurrently, the index creation will get canceledif you cancel the statement (i.e. you must keep your statement open).<br /> ><br /> > Is there a way to "launch"an index creation in the background so that the statement doesn't need to be kept open?<br /> ><br /> > --David<br/> ><br /><br /> No there is not.<br /><br /> David J.</blockquote></div>
On Sat, Oct 27, 2012 at 4:20 PM, David Lee <davidomundo@gmail.com> wrote: > Hey folks, > > It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? Launch the session itself in the background. For example, on linux: psql -c 'create index concurrently ....' & At some point, you will probably want to verify that the creation succeeded. Cheers, Jeff
On Sat, Oct 27, 2012 at 7:20 PM, David Lee <davidomundo@gmail.com> wrote: > It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? Nope. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thanks for all the responses. I forgot to ask in my initial post: If not already available, is background indexing a viable feature request? (This waswhy I sent to pgsql-hackers). My use case is a multi-tenant CMS where indexes can be created by a web front-end. Since web requests should return quickly,it would be optimal to fire off an index creation and return immediately. Since this is not available natively inpostgresql, I have to have a separate worker process that handles long-running connections, and manage communication betweenthe worker process and the web server process. Since index creation--especially concurrent index creation on tables with lots of writes--takes a long time, it would begreat to be able to trigger index creation in the background, without having to dedicate a process+connection that keepsa statement open. Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within triggerfunctions. Why is there this disparity? If you can't tell, my ideal goal is to be able to launch background concurrent index creation via triggers. --David On Oct 28, 2012, at 2:35 PM, Kevin Grittner <kgrittn@mail.com> wrote: > David Lee wrote: > >> It seems like right now when you want to create an index >> concurrently, the index creation will get canceled if you cancel >> the statement (i.e. you must keep your statement open). >> >> Is there a way to "launch" an index creation in the background so >> that the statement doesn't need to be kept open? > > You are asking on the wrong list; this list is for discussions > related to developing the PostgreSQL product itself. To ask questions > about how to *use* PostgreSQL, please use one of the other lists. > This sort of question belongs on the pgsql-general or pgsql-admin > list. > > As long as I'm sending an email -- I usually run such queries on the > server, either through a crontab job (if it's a recurring need), an > "at" job (if it is a one-time off-hours need), or a background job. > The specifics will depend on your operating system, which is > something you should generally include in a question. > > http://wiki.postgresql.org/wiki/Guide_to_reporting_problems > > -Kevin
David Lee wrote: > It seems like right now when you want to create an index > concurrently, the index creation will get canceled if you cancel > the statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so > that the statement doesn't need to be kept open? You are asking on the wrong list; this list is for discussions related to developing the PostgreSQL product itself. To ask questions about how to *use* PostgreSQL, please use one of the other lists. This sort of question belongs on the pgsql-general or pgsql-admin list. As long as I'm sending an email -- I usually run such queries on the server, either through a crontab job (if it's a recurring need), an "at" job (if it is a one-time off-hours need), or a background job. The specifics will depend on your operating system, which is something you should generally include in a question. http://wiki.postgresql.org/wiki/Guide_to_reporting_problems -Kevin
On Sat, Oct 27, 2012 at 6:20 PM, David Lee <davidomundo@gmail.com> wrote: > Hey folks, > > It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? > do it in the command line: psql -c "CREATE INDEX CONCURRENTLY aname ON atable(acolumn)" adb & -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación Phone: +593 4 5107566 Cell: +593 987171157
On Sun, Oct 28, 2012 at 8:20 AM, David Lee <davidomundo@gmail.com> wrote:
-- Hey folks,
It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancel the statement (i.e. you must keep your statement open).
Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?
If I understood your question, the answer is no, you need to maintain the server session alive when doing a transaction on a PG server, cutting the server connection automatically aborts the transaction it is running on server side.
You can still use a batch processing for doing such operations like:
psql -c "create index concurrently foo on tab(a)" -p $port -h $host $dbname &
But this needs to be done on the client application side that will maintain alive a session on server.
You can still use a batch processing for doing such operations like:
psql -c "create index concurrently foo on tab(a)" -p $port -h $host $dbname &
But this needs to be done on the client application side that will maintain alive a session on server.
Michael Paquier
http://michael.otacoo.com
David Lee escribió: > Thanks for all the responses. > > I forgot to ask in my initial post: If not already available, is background indexing a viable feature request? (This waswhy I sent to pgsql-hackers). > > My use case is a multi-tenant CMS where indexes can be created by a web front-end. Since web requests should return quickly,it would be optimal to fire off an index creation and return immediately. Since this is not available natively inpostgresql, I have to have a separate worker process that handles long-running connections, and manage communication betweenthe worker process and the web server process. I think the easiest way to handle this would be to have the web frontend record the request to create the index somewhere, and a separate process (maybe one doing LISTEN permanently) does the actual creation. > Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within triggerfunctions. Why is there this disparity? Because concurrent index creation needs to open and commit transactions underneath, which regular index creation does not. Functions cannot run across transaction boundaries. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 10/28/12 10:22 AM, David Lee wrote: > Thanks. Is this something viable as a feature request? Possibly, but it's not terribly high on the list. In the meantime, we've built a user-space index daemon at $WORK that generally solves this issue. We intend to release itat some point, but if you need it now we could provide you with code. -- Jim "Decibel!" Nasby jnasby@enova.com Primary: 512-569-9461 Backup: 512-579-9024
On Sun, Oct 28, 2012 at 8:22 AM, David Lee <davidomundo@gmail.com> wrote: > Thanks. Is this something viable as a feature request? Just to contribute a tiny amount of data: I also get this request from users on a semi-regular basis. It's definitely below the pains of pg_dump/restore or fork-and-reuse-of-connections of libpq in the space of caveats that come to mind, but it's worth noting that regular people do notice this is a gap (disconnected session support) reasonably frequently. However, the workaround is fairly clear and network connections are pretty reliable, and that blunts its severity. it's also useful for other use cases besides CREATE INDEX, like incremental data rewrites. I think there's some indirect overlap here with the daemon facility under discussion that is active in another thread. -- fdr
On 28 October 2012 01:20, David Lee <davidomundo@gmail.com> wrote: > It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancelthe statement (i.e. you must keep your statement open). > > Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open? There is some work in 9.3 that will give low-level infrastructure support to allow the concept of a "job/task scheduler" in Postgres, which is the generic description of the facility you're looking for. It is early days, but that will emerge sometime in 9.3 or beyond. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
>> Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within triggerfunctions. Why is there this disparity? > > Because concurrent index creation needs to open and commit transactions > underneath, which regular index creation does not. Functions cannot run > across transaction boundaries. Would having background index creation allow for this? For example, launching a background concurrent index creation couldrecord just the "actuation" of the index creation in the trigger's transaction, and postgresql would actually populatethe index once the trigger's transaction is committed. --David