Thread: Re: [Pgsphere-dev] GIST index concurrency concern

Re: [Pgsphere-dev] GIST index concurrency concern

From
Oleg Bartunov
Date:
Daniel,

concurrency is a big issue of current implementation of GiST.
But it should don't bite you for READ ops ! 
-hackers mailing list is a very relevant mailing list for GiST
discussions. It's pity we several times claimed to work on GiST
concurrency and recovery, but never got a chance :)
I see Neil become interested in GiST concurrency, though.

    Oleg
On Tue, 9 Nov 2004, Daniel Ceregatti wrote:

> Hi,
>
> It's recently come to my attention that GIST indices suffer from
> concurrency issues. I have already developed a dating sites using GIST
> for use with attributes using the intarray contrib, and for Earth
> distance/radius calculations using pg_sphere.
>
> I'm wondering if I haven't shot myself in the foot here. So far, I
> understand that a GIST index will be locked by a backend for any DML.
> Basically I'm concerned that my database will not scale in the manner
> that I was hoping, because the sites that access the database are to be
> used by many multiple concurrent users, doing  some DML.
>
> I expect my site to sustain something around 1000-3000 new user
> acquisitions per day, all of which will account for an insert into 3
> GIST indices. Additionally there will be people that will be updating
> their attributes and locations as well, but this will probably only
> account for a small fraction of the DML. We don't allow people to delete
> stuff.
>
> My concern now is this concurrency issue. My question is: Is there
> anyone out there using a GIST index on a database where there's a lot of
> DML? Should I be concerned with this issue at all?
>
> If so, what can be done to minimize the impact of heavy DML on a GIST
> index? I've pondered rolling all DML into queues via triggers and then
> de-queuing them in one transaction every so often, like 15 minutes, via
> cron. Any other suggestions?
>
> I'm posting to this list because I understand that both Oleg and Teodor
> read it, and I found no other relevant list. If I've misposted, please
> accept my apology and please direct me to the appropriate list.
>
> Thanks,
>
> Daniel
>
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [Pgsphere-dev] GIST index concurrency concern

From
Oleg Bartunov
Date:
Patrick,

you didn't say us about your setup. Have you proved you've seen
locking issue for reading ? Are you sure you have no any locks in
your code ? Any tests demonstrated your problem would be great.
    Oleg
On Tue, 9 Nov 2004, Patrick Clery wrote:

> Oleg,
> Daniel and I have both been collaborating on this structure for a while now.
> We are aware that GiST reads work very fast. But won't they be "paralyzed"
> when there are writes? Both of us are working on dating sites, and the main
> problem that concerns us is a very heavy traffic load. At this point I am
> planning to queue all changes to a GiST index and commit them every 10-15
> minutes. Is that really necessary? It's realistic to assume here that if
> there is a problem with locking the table for writes, it will be a problem in
> this situation because this structure is going to be hit VERY hard (and
> Daniel's situation is on an even larger scale). We hope that we can alleviate
> that with a "transaction queue", but this is not a simple fix. Have you seen
> any projects that were under a heavy load using a GiST index, and were they
> able to avoid being "paralyzed" somehow?
>
> Thanks in advance,
> Patrick
>
> On Tuesday 09 November 2004 22:08, Oleg Bartunov wrote:
>> Oleg Bartunov <oleg@sai.msu.su>
>
>> Daniel,
>>
>> concurrency is a big issue of current implementation of GiST.
>> But it should don't bite you for READ ops !
>> -hackers mailing list is a very relevant mailing list for GiST
>> discussions. It's pity we several times claimed to work on GiST
>> concurrency and recovery, but never got a chance :)
>> I see Neil become interested in GiST concurrency, though.
>>
>>
>>         Oleg
>> On Tue, 9 Nov 2004, Daniel Ceregatti wrote:
>>
>>> Hi,
>>>
>>> It's recently come to my attention that GIST indices suffer from
>>> concurrency issues. I have already developed a dating sites using GIST
>>> for use with attributes using the intarray contrib, and for Earth
>>> distance/radius calculations using pg_sphere.
>>>
>>> I'm wondering if I haven't shot myself in the foot here. So far, I
>>> understand that a GIST index will be locked by a backend for any DML.
>>> Basically I'm concerned that my database will not scale in the manner
>>> that I was hoping, because the sites that access the database are to be
>>> used by many multiple concurrent users, doing  some DML.
>>>
>>> I expect my site to sustain something around 1000-3000 new user
>>> acquisitions per day, all of which will account for an insert into 3
>>> GIST indices. Additionally there will be people that will be updating
>>> their attributes and locations as well, but this will probably only
>>> account for a small fraction of the DML. We don't allow people to delete
>>> stuff.
>>>
>>> My concern now is this concurrency issue. My question is: Is there
>>> anyone out there using a GIST index on a database where there's a lot of
>>> DML? Should I be concerned with this issue at all?
>>>
>>> If so, what can be done to minimize the impact of heavy DML on a GIST
>>> index? I've pondered rolling all DML into queues via triggers and then
>>> de-queuing them in one transaction every so often, like 15 minutes, via
>>> cron. Any other suggestions?
>>>
>>> I'm posting to this list because I understand that both Oleg and Teodor
>>> read it, and I found no other relevant list. If I've misposted, please
>>> accept my apology and please direct me to the appropriate list.
>>>
>>> Thanks,
>>>
>>> Daniel
>>>
>>
> _______________________________________________
> Pgsphere-dev mailing list
> Pgsphere-dev@gborg.postgresql.org
> http://gborg.postgresql.org/mailman/listinfo/pgsphere-dev
>
    Regards,        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [Pgsphere-dev] GIST index concurrency concern

From
Patrick Clery
Date:
Oleg,Daniel and I have both been collaborating on this structure for a while now. 
We are aware that GiST reads work very fast. But won't they be "paralyzed" 
when there are writes? Both of us are working on dating sites, and the main 
problem that concerns us is a very heavy traffic load. At this point I am 
planning to queue all changes to a GiST index and commit them every 10-15 
minutes. Is that really necessary? It's realistic to assume here that if 
there is a problem with locking the table for writes, it will be a problem in 
this situation because this structure is going to be hit VERY hard (and 
Daniel's situation is on an even larger scale). We hope that we can alleviate 
that with a "transaction queue", but this is not a simple fix. Have you seen 
any projects that were under a heavy load using a GiST index, and were they 
able to avoid being "paralyzed" somehow?

Thanks in advance,
Patrick

On Tuesday 09 November 2004 22:08, Oleg Bartunov wrote:
> Oleg Bartunov <oleg@sai.msu.su>

> Daniel,
> 
> concurrency is a big issue of current implementation of GiST.
> But it should don't bite you for READ ops ! 
> -hackers mailing list is a very relevant mailing list for GiST
> discussions. It's pity we several times claimed to work on GiST
> concurrency and recovery, but never got a chance :)
> I see Neil become interested in GiST concurrency, though.
> 
> 
>         Oleg
> On Tue, 9 Nov 2004, Daniel Ceregatti wrote:
> 
> > Hi,
> >
> > It's recently come to my attention that GIST indices suffer from
> > concurrency issues. I have already developed a dating sites using GIST
> > for use with attributes using the intarray contrib, and for Earth
> > distance/radius calculations using pg_sphere.
> >
> > I'm wondering if I haven't shot myself in the foot here. So far, I
> > understand that a GIST index will be locked by a backend for any DML.
> > Basically I'm concerned that my database will not scale in the manner
> > that I was hoping, because the sites that access the database are to be
> > used by many multiple concurrent users, doing  some DML.
> >
> > I expect my site to sustain something around 1000-3000 new user
> > acquisitions per day, all of which will account for an insert into 3
> > GIST indices. Additionally there will be people that will be updating
> > their attributes and locations as well, but this will probably only
> > account for a small fraction of the DML. We don't allow people to delete
> > stuff.
> >
> > My concern now is this concurrency issue. My question is: Is there
> > anyone out there using a GIST index on a database where there's a lot of
> > DML? Should I be concerned with this issue at all?
> >
> > If so, what can be done to minimize the impact of heavy DML on a GIST
> > index? I've pondered rolling all DML into queues via triggers and then
> > de-queuing them in one transaction every so often, like 15 minutes, via
> > cron. Any other suggestions?
> >
> > I'm posting to this list because I understand that both Oleg and Teodor
> > read it, and I found no other relevant list. If I've misposted, please
> > accept my apology and please direct me to the appropriate list.
> >
> > Thanks,
> >
> > Daniel
> >
>