Thread: How about using dirty snapshots to locate dependent objects?
Hello everyone,
At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
Session 1:
- create table t1(a int);
- insert into t1 select i from generate_series(1, 10000000) i;
- create extension btree_gist;
- create index i1 on t1 using gist( a );
Session 2: (While the index creation in session 1 is in progress, drop the btree_gist extension)
- drop extension btree_gist;
The above command succeeds, and so does the create index command running in session 1. After that, if we try running anything on table t1 or index i1, it fails with an error: "cache lookup failed for opclass ..."
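For illustration, the leftover inconsistent state can also be seen in pg_depend once both sessions have finished: the index keeps a recorded dependency on an operator class row that no longer exists. A query along these lines (illustrative only; exact OIDs will differ) shows the dangling reference:

    SELECT d.classid::regclass AS dep_catalog, d.objid AS dep_oid,
           d.refclassid::regclass AS ref_catalog, d.refobjid AS ref_oid
    FROM pg_depend d
    LEFT JOIN pg_opclass o ON o.oid = d.refobjid
    WHERE d.classid = 'pg_class'::regclass
      AND d.objid = 'i1'::regclass
      AND d.refclassid = 'pg_opclass'::regclass
      AND o.oid IS NULL;  -- any row returned here references a dropped opclass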
Attached is the patch that I have tried, which seems to be working for me. It's not polished or thoroughly tested; I'm just sharing it here to clarify what I am trying to suggest. Please have a look and let me know your thoughts.
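The core of the idea is small: in the code that searches pg_depend for dependent objects (for example findDependentObjects() in catalog/dependency.c), run the scan with a dirty snapshot instead of the usual MVCC/catalog snapshot, so that rows inserted by still-open transactions become visible to the DROP. The sketch below is only an illustration of that idea, not the attached patch itself, and the exact placement inside findDependentObjects() is an assumption:

    /* needs access/genam.h, access/stratnum.h, catalog/pg_depend.h,
     * utils/fmgroids.h, utils/snapmgr.h */

    /* inside findDependentObjects(), where depRel is the open pg_depend
     * relation and object is the object being dropped */
    SnapshotData DirtySnapshot;
    SysScanDesc scan;
    ScanKeyData key[2];

    ScanKeyInit(&key[0],
                Anum_pg_depend_refclassid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(object->classId));
    ScanKeyInit(&key[1],
                Anum_pg_depend_refobjid,
                BTEqualStrategyNumber, F_OIDEQ,
                ObjectIdGetDatum(object->objectId));

    /*
     * Use a dirty snapshot so that pg_depend rows inserted by in-progress
     * transactions (such as a concurrent CREATE INDEX) are seen as well.
     */
    InitDirtySnapshot(DirtySnapshot);
    scan = systable_beginscan(depRel, DependReferenceIndexId, true,
                              &DirtySnapshot, 2, key);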
--
With Regards,
Ashutosh Sharma.
On Thu, Jun 6, 2024 at 5:59 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hello everyone,
>
> At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:

I don't think it's correct to allow the index to be dropped while a transaction is creating it. Instead, the right solution should be for the create index operation to protect the object it is using from being dropped. Specifically, the create index operation should acquire a shared lock on the Access Method (AM) to ensure it doesn't get dropped concurrently while the transaction is still in progress.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
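As an illustration of the suggestion above (a hedged sketch only, not code from any posted patch; the helper name and the DefineIndex() call site are assumptions): the backend already provides object-level locking via LockDatabaseObject(), and DROP acquires AccessExclusiveLock on each object it deletes, so the index-creation path could take a conflicting shared lock on the access method and operator class it is about to use:

    /* needs catalog/pg_am.h, catalog/pg_opclass.h, storage/lmgr.h,
     * storage/lockdefs.h */

    /*
     * Hypothetical helper for DefineIndex(): keep the access method and
     * operator class from being dropped until the creating transaction
     * ends.  A concurrent DROP takes AccessExclusiveLock on the object,
     * so AccessShareLock here is enough to make it wait.
     */
    static void
    lock_index_dependencies(Oid accessMethodId, Oid opclassId)
    {
        LockDatabaseObject(AccessMethodRelationId, accessMethodId, 0,
                           AccessShareLock);
        LockDatabaseObject(OperatorClassRelationId, opclassId, 0,
                           AccessShareLock);
    }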
On Thu, Jun 06, 2024 at 05:59:00PM +0530, Ashutosh Sharma wrote:
> Hello everyone,
>
> At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
>
> Session 1:
> - create table t1(a int);
> - insert into t1 select i from generate_series(1, 10000000) i;
> - create extension btree_gist;
> - create index i1 on t1 using gist( a );
>
> Session 2: (While the index creation in session 1 is in progress, drop the btree_gist extension)
> - drop extension btree_gist;
>
> Above command succeeds and so does the create index command running in session 1, post this, if we try running anything on table t1, i1, it fails with an error: "cache lookup failed for opclass ..."
>
> Attached is the patch that I have tried, which seems to be working for me. It's not polished and thoroughly tested, but just sharing here to clarify what I am trying to suggest. Please have a look and let me know your thoughts.

Thanks for the patch proposal!

The patch does not fix the other way around:

- session 1: BEGIN; DROP extension btree_gist;
- session 2: create index i1 on t1 using gist( a );
- session 1: commits while session 2 is creating the index

and does not address all the possible orphaned dependencies cases.

There is an ongoing thread (see [1]) to fix the orphaned dependencies issue. v9 attached in [1] fixes the case you describe here.

[1]: https://www.postgresql.org/message-id/flat/ZiYjn0eVc7pxVY45%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Jun 6, 2024 at 6:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jun 6, 2024 at 5:59 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hello everyone,
>
> At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
I don't think it's correct to allow the index to be dropped while a
transaction is creating it. Instead, the right solution should be for
the create index operation to protect the object it is using from
being dropped. Specifically, the create index operation should acquire
a shared lock on the Access Method (AM) to ensure it doesn't get
dropped concurrently while the transaction is still in progress.
If I'm following you correctly, that's exactly what the patch is trying to do: while the index creation is in progress, if someone tries to drop the object referenced by the index under creation, the DROP is able to see the dependent object (in this case the index being created) through the dirty snapshot. It is therefore unable to acquire the lock on that dependent object, and as a result the referenced object cannot be dropped.
--
With Regards,
Ashutosh Sharma.
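Spelling out the mechanism described above as a sketch (again an illustration, not the attached patch): once the dirty-snapshot scan in the DROP path returns the pg_depend row inserted by the still-running CREATE INDEX, the snapshot's xmin identifies that transaction, and the per-object locking that findDependentObjects() already performs then blocks behind the lock CREATE INDEX holds on the index it is building:

    /* after systable_getnext() returned a tuple under the dirty snapshot;
     * otherObject is the dependent object built from that tuple */
    if (TransactionIdIsValid(DirtySnapshot.xmin))
        elog(DEBUG1, "dependent object %u is being created by xid %u",
             otherObject.objectId, DirtySnapshot.xmin);

    /*
     * findDependentObjects() already locks each dependent object it finds;
     * with the in-progress row now visible, this lock request waits for the
     * CREATE INDEX transaction (which holds AccessExclusiveLock on the new
     * index), so the referenced object cannot be dropped out from under it.
     */
    AcquireDeletionLock(&otherObject, 0);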
On Thu, Jun 6, 2024 at 6:57 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
On Thu, Jun 06, 2024 at 05:59:00PM +0530, Ashutosh Sharma wrote:
> Hello everyone,
>
> At present, we use MVCC snapshots to identify dependent objects. This
> implies that if a new dependent object is inserted within a transaction
> that is still ongoing, our search for dependent objects won't include this
> recently added one. Consequently, if someone attempts to drop the
> referenced object, it will be dropped, and when the ongoing transaction
> completes, we will end up having an entry for a referenced object that has
> already been dropped. This situation can lead to an inconsistent state.
> Below is an example illustrating this scenario:
>
> Session 1:
> - create table t1(a int);
> - insert into t1 select i from generate_series(1, 10000000) i;
> - create extension btree_gist;
> - create index i1 on t1 using gist( a );
>
> Session 2: (While the index creation in session 1 is in progress, drop the
> btree_gist extension)
> - drop extension btree_gist;
>
> Above command succeeds and so does the create index command running in
> session 1, post this, if we try running anything on table t1, i1, it fails
> with an error: "cache lookup failed for opclass ..."
>
> Attached is the patch that I have tried, which seems to be working for me.
> It's not polished and thoroughly tested, but just sharing here to clarify
> what I am trying to suggest. Please have a look and let me know your
> thoughts.
Thanks for the patch proposal!
The patch does not fix the other way around:
- session 1: BEGIN; DROP extension btree_gist;
- session 2: create index i1 on t1 using gist( a );
- session 1: commits while session 2 is creating the index
and does not address all the possible orphaned dependencies cases.
There is an ongoing thread (see [1]) to fix the orphaned dependencies issue.
v9 attached in [1] fixes the case you describe here.
[1]: https://www.postgresql.org/message-id/flat/ZiYjn0eVc7pxVY45%40ip-10-97-1-34.eu-west-3.compute.internal
I see. Thanks for sharing this. I can take a look at this and help in whatever way I can.
With Regards,
Ashutosh Sharma.
On Thu, Jun 6, 2024 at 7:39 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 6:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>>
>> On Thu, Jun 6, 2024 at 5:59 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> >
>> > Hello everyone,
>> >
>> > At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
>>
>> I don't think it's correct to allow the index to be dropped while a transaction is creating it. Instead, the right solution should be for the create index operation to protect the object it is using from being dropped. Specifically, the create index operation should acquire a shared lock on the Access Method (AM) to ensure it doesn't get dropped concurrently while the transaction is still in progress.
>
> If I'm following you correctly, that's exactly what the patch is trying to do; while the index creation is in progress, if someone tries to drop the object referenced by the index under creation, the referenced object being dropped is able to know about the dependent object (in this case the index being created) using the dirty snapshot and hence, it is unable to acquire the lock on the dependent object, and as a result of that, it is unable to drop it.

You are aiming for the same outcome, but not in the conventional way. In my opinion, the correct approach is not to find objects being created using a dirty snapshot. Instead, when creating an object, you should acquire a proper lock on any dependent objects to prevent them from being dropped during the creation process. For instance, when creating an index that depends on the btree_gist access method, the create index operation should protect btree_gist from being dropped by acquiring the appropriate lock. It is not the responsibility of the drop extension to identify in-progress index creations.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
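For what it's worth, one can check from a third session which object-level locks the backend running CREATE INDEX actually holds; today it holds none on the operator class or the access method, which is exactly the gap being discussed here (illustrative query only):

    SELECT l.locktype, l.classid::regclass AS catalog, l.objid, l.mode, l.granted
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.locktype = 'object'
      AND a.query ILIKE 'create index%';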
On Fri, Jun 7, 2024 at 10:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 7:39 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >
> > On Thu, Jun 6, 2024 at 6:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >>
> >> On Thu, Jun 6, 2024 at 5:59 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> >> >
> >> > Hello everyone,
> >> >
> >> > At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
> >>
> >> I don't think it's correct to allow the index to be dropped while a transaction is creating it. Instead, the right solution should be for the create index operation to protect the object it is using from being dropped. Specifically, the create index operation should acquire a shared lock on the Access Method (AM) to ensure it doesn't get dropped concurrently while the transaction is still in progress.
> >
> > If I'm following you correctly, that's exactly what the patch is trying to do; while the index creation is in progress, if someone tries to drop the object referenced by the index under creation, the referenced object being dropped is able to know about the dependent object (in this case the index being created) using the dirty snapshot and hence, it is unable to acquire the lock on the dependent object, and as a result of that, it is unable to drop it.
>
> You are aiming for the same outcome, but not in the conventional way. In my opinion, the correct approach is not to find objects being created using a dirty snapshot. Instead, when creating an object, you should acquire a proper lock on any dependent objects to prevent them from being dropped during the creation process. For instance, when creating an index that depends on the btree_gist access method, the create index operation should protect btree_gist from being dropped by acquiring the appropriate lock. It is not the responsibility of the drop extension to identify in-progress index creations.

Thanks for sharing your thoughts, I appreciate your inputs and completely understand your perspective, but I wonder if that is feasible? For example, if an object (an index in this case) has a dependency on, let's say, 'n' number of objects, and those 'n' objects belong to, say, 'n' different catalog tables, should we acquire locks on each of them until the create index command succeeds, or should we just check for the presence of the dependent objects and record their dependency inside the pg_depend table? Talking about this particular case, we are trying to create a gist index that has a dependency on the gist_int4 opclass, which is one of the tuples inside the pg_opclass catalog table, so should we acquire a lock on this tuple/table until the create index command succeeds, and is that the thing to be done for all the dependent objects?

--
With Regards,
Ashutosh Sharma.
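For reference, the specific catalog row being talked about here can be looked up directly; btree_gist installs gist_int4_ops as the default gist operator class for int4, and it is that single pg_opclass row (or, more broadly, every object the index will reference) that the locking approach would have to cover:

    SELECT oc.oid, oc.opcname, am.amname, oc.opcdefault
    FROM pg_opclass oc
    JOIN pg_am am ON am.oid = oc.opcmethod
    WHERE am.amname = 'gist'
      AND oc.opcintype = 'int4'::regtype;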
On Fri, Jun 7, 2024 at 11:53 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> On Fri, Jun 7, 2024 at 10:06 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Jun 6, 2024 at 7:39 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >
> > > On Thu, Jun 6, 2024 at 6:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >>
> > >> On Thu, Jun 6, 2024 at 5:59 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> > >> >
> > >> > Hello everyone,
> > >> >
> > >> > At present, we use MVCC snapshots to identify dependent objects. This implies that if a new dependent object is inserted within a transaction that is still ongoing, our search for dependent objects won't include this recently added one. Consequently, if someone attempts to drop the referenced object, it will be dropped, and when the ongoing transaction completes, we will end up having an entry for a referenced object that has already been dropped. This situation can lead to an inconsistent state. Below is an example illustrating this scenario:
> > >>
> > >> I don't think it's correct to allow the index to be dropped while a transaction is creating it. Instead, the right solution should be for the create index operation to protect the object it is using from being dropped. Specifically, the create index operation should acquire a shared lock on the Access Method (AM) to ensure it doesn't get dropped concurrently while the transaction is still in progress.
> > >
> > > If I'm following you correctly, that's exactly what the patch is trying to do; while the index creation is in progress, if someone tries to drop the object referenced by the index under creation, the referenced object being dropped is able to know about the dependent object (in this case the index being created) using the dirty snapshot and hence, it is unable to acquire the lock on the dependent object, and as a result of that, it is unable to drop it.
> >
> > You are aiming for the same outcome, but not in the conventional way. In my opinion, the correct approach is not to find objects being created using a dirty snapshot. Instead, when creating an object, you should acquire a proper lock on any dependent objects to prevent them from being dropped during the creation process. For instance, when creating an index that depends on the btree_gist access method, the create index operation should protect btree_gist from being dropped by acquiring the appropriate lock. It is not the responsibility of the drop extension to identify in-progress index creations.
>
> Thanks for sharing your thoughts, I appreciate your inputs and completely understand your perspective, but I wonder if that is feasible? For example, if an object (an index in this case) has a dependency on, let's say, 'n' number of objects, and those 'n' objects belong to, say, 'n' different catalog tables, should we acquire locks on each of them until the create index command succeeds, or should we just check for the presence of the dependent objects and record their dependency inside the pg_depend table? Talking about this particular case, we are trying to create a gist index that has a dependency on the gist_int4 opclass, which is one of the tuples inside the pg_opclass catalog table, so should we acquire a lock on this tuple/table until the create index command succeeds, and is that the thing to be done for all the dependent objects?

I am not sure what the best way to do it is, but if you are creating an object which is dependent on other objects, then you need to check the existence of those objects, record dependencies on those objects, and also lock them so that they don't get dropped while you are creating your object. I haven't looked into the patch, but something similar is being achieved in the thread Bertrand has pointed out, by locking the database object while recording the dependency on it.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
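To make that concrete, here is a hedged sketch of the general idea only (the wrapper name is invented; the patch on the thread referenced above should be consulted for the real shape): take an object-level lock on the referenced object before recording the dependency, and re-verify that it still exists once the lock is held:

    /* needs catalog/dependency.h, catalog/objectaddress.h, storage/lmgr.h,
     * storage/lockdefs.h */

    /*
     * Hypothetical wrapper around recordDependencyOn(): lock the referenced
     * object first, so that a concurrent DROP (which takes
     * AccessExclusiveLock on whatever it deletes) either has to wait or has
     * already removed the object.
     */
    static void
    recordLockedDependencyOn(const ObjectAddress *depender,
                             const ObjectAddress *referenced,
                             DependencyType behavior)
    {
        /* sub-objects (columns) cannot be locked separately, hence subid 0 */
        LockDatabaseObject(referenced->classId, referenced->objectId, 0,
                           AccessShareLock);

        /*
         * The referenced object may have been dropped before the lock was
         * obtained, so its existence must be re-checked here (via the
         * appropriate syscache for its catalog) before recording the
         * dependency.
         */
        recordDependencyOn(depender, referenced, behavior);
    }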