Thread: 'DROP INDEX' kills stored procedures

'DROP INDEX' kills stored procedures

From
Vlad Krupin
Date:
this is my first post to this mailing list, so if this is the wrong
list, direct me to the right one please.
also, please cc: me because I am not (yet) subscribed.

1. I understand that working with indexes is outside the scope of a
transaction. That is, if you start a transaction and then drop an index,
at that very moment, before the transaction is committed, the index will
be unavailable to any other concurrently running queries. I didn't find
that in the documentation, but a small experiment showed it to be true.

2. If (1) is correct, then I have found a problem. I have a stored
procedure that takes a few seconds to run. If, while it's running, an
index is dropped, it bails with 'Relation [OID of the index] does not
exist'. I do not know if the same is true for a long-running query, or
just stored procedures.
I can understand that postgres created a plan for a query that relied on
an index, and then halfway through its execution the index disappeared.
But in my understanding it shouldn't bail like that.

It should either
(1) 'lock index' - do not allow the index to be dropped until all
queries that are already planned and need that index are done executing.
For all the new queries that are planned while the index is 'locked'
like this, the index is unavailable, and they'll have to do a sequential
scan/whatever.
(2) make the query execution 'smarter' - if an index we were relying on
disappears, instead of throwing 'Relation 12345 does not exist', re-plan
the query and make it do a sequential scan or whatever else is available
instead of the index. I don't know if that's feasible.
(3) make statements like 'DROP/CREATE INDEX' obey a transaction scope -
e.g.
BEGIN;
DROP INDEX "blah";
...
[do a lot of work here]
...
CREATE INDEX "blah"...
COMMIT;

will never leave a concurrently running query without an index. I do not
know if this goes against some principles deep within postgres'
foundation, though.

My complaint is that I can't drop an index without risking bombing a
long-running stored procedure (option (2) above wouldn't be very helpful
either, but at least we wouldn't bomb!); on the other hand, I can't do
massive inserts and do them fast (faster than a couple of hours!)
without dropping the index. At least not without an ugly hack.

Can someone tell me if this request of mine makes sense?

Vlad

--
Vlad Krupin
Software Engineer
echospace.com


Re: 'DROP INDEX' kills stored procedures

From
"scott.marlowe"
Date:
On Wed, 2 Apr 2003, Vlad Krupin wrote:

> this is my first post to this mailing list, so if this is the wrong
> list, direct me to the right one please.
> also, please cc: to me because I am not (yet) subscribed.
>
> 1. I understand that working with indexes is outside the scope of
> transaction. That is, if you start a transaction, then drop an index, at
> that very moment, before the transaction is committed, the index will be
> unavailable to any other concurrently running queries. I didn't find
> that in documentation, but a small experiment showed that to be true.

Don't make assumptions like that.  In postgresql, DDL is transactionable.

begin;
drop index test;
create index test on table (field);
commit;

will work just fine.


Re: 'DROP INDEX' kills stored procedures

From
Tom Lane
Date:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> Don't make assumptions like that.  In postgresql, DDL is transactionable.

I think his actual problem is that plpgsql cached a query plan for some
command while the index was in place, and then on a later re-execution
of the function, it tried to re-use the cached plan.  This is a general
problem that we know we need to fix sooner or later --- cached plans
need to be invalidated when there are DDL changes to the underlying
tables, views, etc.
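
For example (a sketch with made-up names, in 7.3-era plpgsql syntax):

```sql
-- Hypothetical: table foo(bar text) with an index bar_idx on (bar).
CREATE FUNCTION get_foo(text) RETURNS refcursor AS '
DECLARE
    c refcursor;
BEGIN
    -- plpgsql saves the plan for this query on first execution;
    -- the saved plan refers to bar_idx by OID.
    OPEN c FOR SELECT * FROM foo WHERE bar = $1;
    RETURN c;
END;
' LANGUAGE plpgsql;

-- Session A: SELECT get_foo('x');  -- plan cached, may use bar_idx
-- Session B: DROP INDEX bar_idx; CREATE INDEX bar_idx ON foo (bar);
-- Session A again: SELECT get_foo('x');
--   can now fail with: Relation <old OID of bar_idx> does not exist
```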

            regards, tom lane


Re: 'DROP INDEX' kills stored procedures

From
Tom Lane
Date:
Vlad Krupin <vlad@echospace.com> writes:
> So, if I first re-create indexes and then do a 'VACUUM ANALYZE' on the
> table where those indexes were dropped, it should discard the old cached
> plans and create a new one, right?

All you need do is start a new database session --- plans are never
cached for longer than the life of one backend process.

            regards, tom lane


Re: 'DROP INDEX' kills stored procedures

From
"scott.marlowe"
Date:
On Thu, 3 Apr 2003, Vlad Krupin wrote:

> Thanks for response, Scott
>
> scott.marlowe wrote:
> [snip]
>
> >>1. I understand that working with indexes is outside the scope of
> >>transaction. That is, if you start a transaction, then drop an index, at
> >>that very moment, before the transaction is committed, the index will be
> >>unavailable to any other concurrently running queries. I didn't find
> >>that in documentation, but a small experiment showed that to be true.
> >>
> >>
> >
> >Don't make assumptions like that.  In postgresql, DDL is transactionable.
> >
> >begin;
> >drop index test;
> >create index test on table (field);
> >commit;
> >
> >will work just fine.
> >
> That's not really an assumption on my part. That's what I have observed
> by doing a small experiment. Consider two clients: #1 and #2 that are
> connected at the same time. Table "foo" is indexed on "bar". Now,
> consider this sequence of commands:
>
> #1 BEGIN;
> #2 BEGIN;
> #2 DROP INDEX "bar_idx";
> #1 EXPLAIN ANALYZE SELECT * FROM "foo" WHERE "bar"='hello';
>
> This performs a sequential scan for me, even though I have not committed
> the transaction on client #2 yet! If I do not drop the index (no #2
> statements), it performs an indexed scan.

On my 7.2.x box, this results in #1 waiting for #2 to commit.  It just
pauses #1 indefinitely.  Are you running 7.3.x?  Might explain the
differences.

> Does that seem to make sense?  Why does it behave like that?
>
> Also, I am still trying to figure out why I see my stored procedure
> bailing with 'Relation [OID of index dropped] does not exist' error. Is
> that because the planner somehow remembers that there used to be an
> index with that OID, but, since I dropped and re-created it, it's not
> there anymore and I need to tell the planner to re-analyze how to
> execute that query, e.g. 'VACUUM ANALYZE'? Or am I totally off track here?

Not sure.  Might be one of those things you just can't do - run both the
index change and your stored proc.  Can we see the source of your stored
procedure?


Re: 'DROP INDEX' kills stored procedures

From
Tom Lane
Date:
"scott.marlowe" <scott.marlowe@ihs.com> writes:
> On Thu, 3 Apr 2003, Vlad Krupin wrote:
>> consider this sequence of commands:
>>
>> #1 BEGIN;
>> #2 BEGIN;
>> #2 DROP INDEX "bar_idx";
>> #1 EXPLAIN ANALYZE SELECT * FROM "foo" WHERE "bar"='hello';
>>
>> This performs a sequential scan for me, even though I have not committed
>> the transaction on client #2 yet! If I do not drop the index (no #2
>> statements), it performs an indexed scan.

> On my 7.2.x box, this results in #1 waiting for #2 to commit.  It just
> pauses #1 indefinitely.  Are you running 7.3.x?  Might explain the
> differences.

I overlooked this part of Vlad's message.  AFAIK the above should cause #1
to wait for #2's commit in *any* version of Postgres; certainly anything
released in the last several years.  DROP INDEX will take an exclusive
lock on the table owning the index, and that will prevent EXPLAIN from
accessing the table even just to plan a query on it.
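
Assuming bar_idx really is an index on foo, the expected behavior is
(a sketch):

```sql
-- Session #2:
BEGIN;
DROP INDEX bar_idx;   -- takes an exclusive lock on table foo

-- Session #1 now blocks here until #2 commits or rolls back:
EXPLAIN ANALYZE SELECT * FROM "foo" WHERE "bar" = 'hello';
```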

[ thinks... ]  Um, Scott and I are both assuming that bar_idx is indeed
an index on table foo.  Perhaps this was just pilot error about what
index belonged to what table?

            regards, tom lane


Re: 'DROP INDEX' kills stored procedures

From
Tom Lane
Date:
Vlad Krupin <vlad@echospace.com> writes:
> - Also, it does not seem to affect regular SELECT queries - only ones
> wrapped in my stored procedure that returns a refcursor.

Queries in plpgsql functions and PREPAREd queries are the only ones that
save plans, AFAIR.
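
For instance, a PREPAREd statement keeps its plan for the life of the
session (names made up):

```sql
PREPARE getfoo AS SELECT * FROM foo WHERE bar = 'hello';  -- plan saved now
EXECUTE getfoo;  -- reuses the saved plan (and any index OID inside it)
-- If another session drops and re-creates bar_idx, a later
-- EXECUTE getfoo can fail the same way the plpgsql function does.
```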

> An even better way to do it is to tell all planners to discard their
> cached plans if the index they are relying on suddenly disappears.

When we get around to implementing this, it will be automatic, not based
on the user having to do something special.
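
In the meantime, if your plpgsql supports OPEN FOR EXECUTE, one possible
workaround (a sketch, untested) is to build the query string at run
time, so no plan is saved across calls:

```sql
CREATE FUNCTION get_foo(text) RETURNS refcursor AS '
DECLARE
    c refcursor;
BEGIN
    -- EXECUTE re-plans on every call, so a dropped and re-created
    -- index cannot leave a stale OID in a saved plan.
    OPEN c FOR EXECUTE ''SELECT * FROM foo WHERE bar = ''
        || quote_literal($1);
    RETURN c;
END;
' LANGUAGE plpgsql;
```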

            regards, tom lane


Re: 'DROP INDEX' kills stored procedures

From
Vlad Krupin
Date:
Tom Lane wrote:

>"scott.marlowe" <scott.marlowe@ihs.com> writes:
>
>
>>On Thu, 3 Apr 2003, Vlad Krupin wrote:
>>
>>
>>>consider this sequence of commands:
>>>
>>>#1 BEGIN;
>>>#2 BEGIN;
>>>#2 DROP INDEX "bar_idx";
>>>#1 EXPLAIN ANALYZE SELECT * FROM "foo" WHERE "bar"='hello';
>>>
>>>This performs a sequential scan for me, even though I have not committed
>>>the transaction on client #2 yet! If I do not drop the index (no #2
>>>statements), it performs an indexed scan.
>>>
>>>
>
>
>
>>On my 7.2.x box, this results in #1 waiting for #2 to commit.  It just
>>pauses #1 indefinitely.  Are you running 7.3.x?  Might explain the
>>differences.
>>
>>
>
>I overlooked this part of Vlad's message.  AFAIK the above should cause #1
>to wait for #2's commit in *any* version of Postgres; certainly anything
>released in the last several years.  DROP INDEX will take an exclusive
>lock on the table owning the index, and that will prevent EXPLAIN from
>accessing the table even just to plan a query on it.
>
You are right. I must have assumed the SELECT was taking a long time to
run due to a sequential scan, and typed 'COMMIT' in the transaction that
was dropping indexes. That would make the other transaction do a
sequential scan, and that's what I saw. Oops... I tried to reproduce
what I claimed to have seen in the previous email and was unable to do
so - indeed it does lock the table indefinitely. Sorry for the confusion.

That was my mistake, but the problem I was trying to solve still stands.
Based on what Tom said, it's due to the plan being cached.
I do something like this:
1. Start a connection
2. Execute the stored procedure (it succeeds and caches a plan that
includes the OID of an index it used).
3. In a different connection I drop and re-create the index (it gets a
new OID).
4. Try to repeat step (2) and get a 'Relation [OID of index dropped]
does not exist'.
- If I close and re-open the connection somewhere between steps 2 and 4,
everything is good.
- Also, it does not seem to affect regular SELECT queries - only ones
wrapped in my stored procedure that returns a refcursor. I do not know
why - I imagine the planner would cache those plans too, right?

While closing connections solves the problem, it's not a good fix
because connections are pooled on the client, and I have no control over
them. What would be super-nice is if, after doing CREATE INDEX but
before COMMIT, I could do something to tell the planner to discard
cached plans for that table, including planners being used in other
concurrent connections. Is there a way to do that?

An even better way to do it would be to tell all planners to discard
their cached plans if the index they are relying on suddenly disappears.
Otherwise a seemingly innocent operation (DROP INDEX) results not in
slower execution, but in a stored procedure failure.

Is that possible, or is closing the connection my only way of working
around that problem?

Thanks for your help guys,

Vlad

--
Vlad Krupin
Software Engineer
echospace.com


Re: 'DROP INDEX' kills stored procedures

From
Vlad Krupin
Date:
Thanks for response, Scott

scott.marlowe wrote:
[snip]

>>1. I understand that working with indexes is outside the scope of
>>transaction. That is, if you start a transaction, then drop an index, at
>>that very moment, before the transaction is committed, the index will be
>>unavailable to any other concurrently running queries. I didn't find
>>that in documentation, but a small experiment showed that to be true.
>>
>>
>
>Don't make assumptions like that.  In postgresql, DDL is transactionable.
>
>begin;
>drop index test;
>create index test on table (field);
>commit;
>
>will work just fine.
>
That's not really an assumption on my part. That's what I have observed
by doing a small experiment. Consider two clients: #1 and #2 that are
connected at the same time. Table "foo" is indexed on "bar". Now,
consider this sequence of commands:

#1 BEGIN;
#2 BEGIN;
#2 DROP INDEX "bar_idx";
#1 EXPLAIN ANALYZE SELECT * FROM "foo" WHERE "bar"='hello';
...

This performs a sequential scan for me, even though I have not committed
the transaction on client #2 yet! If I do not drop the index (no #2
statements), it performs an indexed scan.

Does that seem to make sense?  Why does it behave like that?

Also, I am still trying to figure out why I see my stored procedure
bailing with 'Relation [OID of index dropped] does not exist' error. Is
that because the planner somehow remembers that there used to be an
index with that OID, but, since I dropped and re-created it, it's not
there anymore and I need to tell the planner to re-analyze how to
execute that query, e.g. 'VACUUM ANALYZE'? Or am I totally off track here?


Any hints?

Vlad

--
Vlad Krupin
Software Engineer
echospace.com


Re: 'DROP INDEX' kills stored procedures

From
Vlad Krupin
Date:
Tom Lane wrote:

>"scott.marlowe" <scott.marlowe@ihs.com> writes:
>
>
>>Don't make assumptions like that.  In postgresql, DDL is transactionable.
>>
>>
>
>I think his actual problem is that plpgsql cached a query plan for some
>command while the index was in place, and then on a later re-execution
>of the function, it tried to re-use the cached plan.  This is a general
>problem that we know we need to fix sooner or later --- cached plans
>need to be invalidated when there are DDL changes to the underlying
>tables, views, etc.
>
That totally makes sense (in fact, I sent another message to the list 15
minutes ago guessing that this could be the case; you can ignore it now).

So, if I first re-create indexes and then do a 'VACUUM ANALYZE' on the
table where those indexes were dropped, it should discard the old cached
plans and create a new one, right?

Vlad

--
Vlad Krupin
Software Engineer
echospace.com