Thread: Creating indexes in the background

Creating indexes in the background

From
David Lee
Date:
Hey folks,

It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open). 

Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?

--David


Re: Creating indexes in the background

From
David Johnston
Date:
On Oct 27, 2012, at 19:20, David Lee <davidomundo@gmail.com> wrote:

> Hey folks,
>
> It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open). 
>
> Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?
>
> --David
>

No there is not.

David J.


Re: Creating indexes in the background

From
David Lee
Date:
<p>Thanks. Is this something viable as a feature request? <div class="gmail_quote">On Oct 28, 2012 7:48 AM, "David
Johnston"<<a href="mailto:polobo@yahoo.com">polobo@yahoo.com</a>> wrote:<br type="attribution" /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On Oct 27, 2012, at 19:20,
DavidLee <<a href="mailto:davidomundo@gmail.com">davidomundo@gmail.com</a>> wrote:<br /><br /> > Hey folks,<br
/>><br /> > It seems like right now when you want to create an index concurrently, the index creation will get
canceledif you cancel the statement (i.e. you must keep your statement open).<br /> ><br /> > Is there a way to
"launch"an index creation in the background so that the statement doesn't need to be kept open?<br /> ><br /> >
--David<br/> ><br /><br /> No there is not.<br /><br /> David J.</blockquote></div> 

Re: Creating indexes in the background

From
Jeff Janes
Date:
On Sat, Oct 27, 2012 at 4:20 PM, David Lee <davidomundo@gmail.com> wrote:
> Hey folks,
>
> It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open).
 
>
> Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?

Launch the session itself in the background.

For example, on linux:

psql -c 'create index concurrently ....' &

At some point, you will probably want to verify that the creation succeeded.

Cheers,

Jeff



Re: Creating indexes in the background

From
Robert Haas
Date:
On Sat, Oct 27, 2012 at 7:20 PM, David Lee <davidomundo@gmail.com> wrote:
> It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open).
 
>
> Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?

Nope.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Creating indexes in the background

From
David Lee
Date:
Thanks for all the responses.

I forgot to ask in my initial post: If not already available, is background indexing a viable feature request? (This
waswhy I sent to pgsql-hackers). 

My use case is a multi-tenant CMS where indexes can be created by a web front-end. Since web requests should return
quickly,it would be optimal to fire off an index creation and return immediately. Since this is not available natively
inpostgresql, I have to have a separate worker process that handles long-running connections, and manage communication
betweenthe worker process and the web server process. 

Since index creation--especially concurrent index creation on tables with lots of writes--takes a long time, it would
begreat to be able to trigger index creation in the background, without having to dedicate a process+connection that
keepsa statement open. 

Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent
indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within
triggerfunctions. Why is there this disparity? 

If you can't tell, my ideal goal is to be able to launch background concurrent index creation via triggers.

--David

On Oct 28, 2012, at 2:35 PM, Kevin Grittner <kgrittn@mail.com> wrote:

> David Lee wrote:
>
>> It seems like right now when you want to create an index
>> concurrently, the index creation will get canceled if you cancel
>> the statement (i.e. you must keep your statement open).
>>
>> Is there a way to "launch" an index creation in the background so
>> that the statement doesn't need to be kept open?
>
> You are asking on the wrong list; this list is for discussions
> related to developing the PostgreSQL product itself. To ask questions
> about how to *use* PostgreSQL, please use one of the other lists.
> This sort of question belongs on the pgsql-general or pgsql-admin
> list.
>
> As long as I'm sending an email -- I usually run such queries on the
> server, either through a crontab job (if it's a recurring need), an
> "at" job (if it is a one-time off-hours need), or a background job.
> The specifics will depend on your operating system, which is
> something you should generally include in a question.
>
> http://wiki.postgresql.org/wiki/Guide_to_reporting_problems
>
> -Kevin




Re: Creating indexes in the background

From
"Kevin Grittner"
Date:
David Lee wrote:

> It seems like right now when you want to create an index
> concurrently, the index creation will get canceled if you cancel
> the statement (i.e. you must keep your statement open).
> 
> Is there a way to "launch" an index creation in the background so
> that the statement doesn't need to be kept open?

You are asking on the wrong list; this list is for discussions
related to developing the PostgreSQL product itself. To ask questions
about how to *use* PostgreSQL, please use one of the other lists.
This sort of question belongs on the pgsql-general or pgsql-admin
list.

As long as I'm sending an email -- I usually run such queries on the
server, either through a crontab job (if it's a recurring need), an
"at" job (if it is a one-time off-hours need), or a background job.
The specifics will depend on your operating system, which is
something you should generally include in a question.

http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-Kevin



Re: Creating indexes in the background

From
Jaime Casanova
Date:
On Sat, Oct 27, 2012 at 6:20 PM, David Lee <davidomundo@gmail.com> wrote:
> Hey folks,
>
> It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open). 
>
> Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?
>

do it in the command line:

psql -c "CREATE INDEX CONCURRENTLY aname ON atable(acolumn)" adb &

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566         Cell: +593 987171157



Re: Creating indexes in the background

From
Michael Paquier
Date:


On Sun, Oct 28, 2012 at 8:20 AM, David Lee <davidomundo@gmail.com> wrote:
Hey folks,

It seems like right now when you want to create an index concurrently, the index creation will get canceled if you cancel the statement (i.e. you must keep your statement open).

Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?
If I understood your question, the answer is no, you need to maintain the server session alive when doing a transaction on a PG server, cutting the server connection automatically aborts the transaction it is running on server side.
You can still use a batch processing for doing such operations like:
psql -c "create index concurrently foo on tab(a)" -p $port -h $host $dbname &
But this needs to be done on the client application side that will maintain alive a session on server.
--
Michael Paquier
http://michael.otacoo.com

Re: Creating indexes in the background

From
Alvaro Herrera
Date:
David Lee escribió:
> Thanks for all the responses.
>
> I forgot to ask in my initial post: If not already available, is background indexing a viable feature request? (This
waswhy I sent to pgsql-hackers). 
>
> My use case is a multi-tenant CMS where indexes can be created by a web front-end. Since web requests should return
quickly,it would be optimal to fire off an index creation and return immediately. Since this is not available natively
inpostgresql, I have to have a separate worker process that handles long-running connections, and manage communication
betweenthe worker process and the web server process. 

I think the easiest way to handle this would be to have the web frontend
record the request to create the index somewhere, and a separate process
(maybe one doing LISTEN permanently) does the actual creation.

> Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent
indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within
triggerfunctions. Why is there this disparity? 

Because concurrent index creation needs to open and commit transactions
underneath, which regular index creation does not.  Functions cannot run
across transaction boundaries.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Creating indexes in the background

From
Jim Nasby
Date:
On 10/28/12 10:22 AM, David Lee wrote:
> Thanks. Is this something viable as a feature request?

Possibly, but it's not terribly high on the list.

In the meantime, we've built a user-space index daemon at $WORK that generally solves this issue. We intend to release
itat some point, but if you need it now we could provide you with code.
 
-- 
Jim "Decibel!" Nasby          jnasby@enova.com
Primary: 512-569-9461     Backup: 512-579-9024



Re: Creating indexes in the background

From
Daniel Farina
Date:
On Sun, Oct 28, 2012 at 8:22 AM, David Lee <davidomundo@gmail.com> wrote:
> Thanks. Is this something viable as a feature request?

Just to contribute a tiny amount of data: I also get this request from
users on a semi-regular basis. It's definitely below the pains of
pg_dump/restore or fork-and-reuse-of-connections of libpq in the space
of caveats that come to mind, but it's worth noting that regular
people do notice this is a gap (disconnected session support)
reasonably frequently.  However, the workaround is fairly clear and
network connections are pretty reliable, and that blunts its severity.

it's also useful for other use cases besides CREATE INDEX, like
incremental data rewrites.  I think there's some indirect overlap here
with the daemon facility under discussion that is active in another
thread.

--
fdr



Re: Creating indexes in the background

From
Simon Riggs
Date:
On 28 October 2012 01:20, David Lee <davidomundo@gmail.com> wrote:

> It seems like right now when you want to create an index concurrently, the index creation will get canceled if you
cancelthe statement (i.e. you must keep your statement open).
 
>
> Is there a way to "launch" an index creation in the background so that the statement doesn't need to be kept open?

There is some work in 9.3 that will give low-level infrastructure
support to allow the concept of a "job/task scheduler" in Postgres,
which is the generic description of the facility you're looking for.
It is early days, but that will emerge sometime in 9.3 or beyond.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Creating indexes in the background

From
David Lee
Date:
>> Since I've mentioned my use case, I might as well mention another issue I stumbled across, which is that concurrent
indexcreation cannot happen from within trigger functions. I'm able to non-concurrently create indexes from within
triggerfunctions. Why is there this disparity? 
>
> Because concurrent index creation needs to open and commit transactions
> underneath, which regular index creation does not.  Functions cannot run
> across transaction boundaries.

Would having background index creation allow for this? For example, launching a background concurrent index creation
couldrecord just the "actuation" of the index creation in the trigger's transaction, and postgresql would actually
populatethe index once the trigger's transaction is committed. 

--David