Thread: partitioning and locking problems

partitioning and locking problems

From
"Marc Morin"
Date:
We have a large database system designed around partitioning.  Our
application is characterized with

- terabytes of data
- billions of rows in dozens of base tables (and 100s of paritions)
- 24x7 insert load of new data that cannot be stopped, data is time
sensitive.
- periodic reports that can have long running queries with query times
measured in hours

We have 2 classes of "maintenance" activities that are causing us
problems:
- periodically we need to change an insert rule on a view to point to a
different partition.
- periodically we need to delete data that is no longer needed.
Performed via truncate.

Under both these circumstances (truncate and create / replace rule) the
locking behaviour of these commands can cause locking problems for us.
The scenario is best illustrated as a series of steps:


    1- long running report is running on view
    2- continuous inserters into view into a table via a rule
    3- truncate or rule change occurs, taking an exclusive lock.
Must wait for #1 to finish.
    4- new reports and inserters must now wait for #3.
    5- now everyone is waiting for a single query in #1.   Results
in loss of insert data granularity (important for our application).


Would like to understand the implications of changing postgres'
code/locking for rule changes and truncate to not require locking out
select statements?

The following is a simplified schema to help illustrate the problem.


    create table a_1
    (
        pkey int primary key
    );
    create table a_2
    (
        pkey int primary key
    );

    create view a as select * from a_1 union all select * from a_2;

    create function change_rule(int) returns void as
    '
    begin
        execute ''create or replace rule insert as on insert to a do
instead insert into a_''||$1||''(pkey) values(NEW.pkey)'';
    end;
    ' language plpgsql;

    -- change rule, execute something like the following
periodically
    select change_rule(1);


We've looked at the code and the rule changes appear "easy" but we are
concerned about the required changes for truncate.

Thanks
Marc


Re: partitioning and locking problems

From
Tom Lane
Date:
"Marc Morin" <marc@sandvine.com> writes:
> Would like to understand the implications of changing postgres'
> code/locking for rule changes and truncate to not require locking out
> select statements?

It won't work...

            regards, tom lane

Re: partitioning and locking problems

From
Richard Huxton
Date:
Marc Morin wrote:
> Under both these circumstances (truncate and create / replace rule) the
> locking behaviour of these commands can cause locking problems for us.
> The scenario is best illustrated as a series of steps:
>
>
>     1- long running report is running on view
>     2- continuous inserters into view into a table via a rule
>     3- truncate or rule change occurs, taking an exclusive lock.
> Must wait for #1 to finish.
>     4- new reports and inserters must now wait for #3.
>     5- now everyone is waiting for a single query in #1.   Results
> in loss of insert data granularity (important for our application).

How much would you get from splitting the view into two: reporting and
inserting?

--
   Richard Huxton
   Archonet Ltd

Re: partitioning and locking problems

From
"Marc Morin"
Date:
Tom,

Do you mean it would be impossible to change the code so that existing
selects continue to use the pre-truncated table until they commit? Or
just require a more extensive change?

The update/insert rule change appears to be more more doable? No?

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Wednesday, February 01, 2006 12:50 AM
> To: Marc Morin
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] partitioning and locking problems
>
> "Marc Morin" <marc@sandvine.com> writes:
> > Would like to understand the implications of changing postgres'
> > code/locking for rule changes and truncate to not require
> locking out
> > select statements?
>
> It won't work...
>
>             regards, tom lane
>
>

Re: partitioning and locking problems

From
Tom Lane
Date:
"Marc Morin" <marc@sandvine.com> writes:
> Do you mean it would be impossible to change the code so that existing
> selects continue to use the pre-truncated table until they commit?

Yes, because that table won't exist any more (as in the file's been
unlinked) once the TRUNCATE commits.

> The update/insert rule change appears to be more more doable? No?

You've still got race conditions there: do onlooker transactions see the
old set of rules, or the new set, or some unholy mixture?  Removing the
lock as you suggest would make it possible for the rule rewriter to pick
up non-self-consistent data from the system catalogs, leading to
arbitrarily bad behavior ... if you're lucky, it'll just crash, if
you're not lucky the incorrect rule will do a fandango on your data.

            regards, tom lane

Re: partitioning and locking problems

From
Markus Schaber
Date:
Hi, Marc,

Marc Morin wrote:

>     1- long running report is running on view
>     2- continuous inserters into view into a table via a rule
>     3- truncate or rule change occurs, taking an exclusive lock.
> Must wait for #1 to finish.
>     4- new reports and inserters must now wait for #3.
>     5- now everyone is waiting for a single query in #1.   Results
> in loss of insert data granularity (important for our application).

Apart from having two separate views (one for report, one for insert) as
Richard suggested:

If you have fixed times for #3, don't start any #1 that won't finish
before it's time for #3.

You could also use the LOCK command on an empty lock table at the
beginning of each #1 or #3 transaction to prevent #3 from getting the
view lock before #1 is finished.


HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

Re: partitioning and locking problems

From
"Marc Morin"
Date:
Using a separate lock table is what we've decided to do in this
particular case to serialize #1 and #3.  Inserters don't take this lock
and as such will not be stalled.

> -----Original Message-----
> From: Markus Schaber [mailto:schabi@logix-tt.com]
> Sent: Thursday, February 02, 2006 7:44 AM
> To: Marc Morin
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] partitioning and locking problems
>
> Hi, Marc,
>
> Marc Morin wrote:
>
> >     1- long running report is running on view
> >     2- continuous inserters into view into a table via a rule
> >     3- truncate or rule change occurs, taking an exclusive lock.
> > Must wait for #1 to finish.
> >     4- new reports and inserters must now wait for #3.
> >     5- now everyone is waiting for a single query in #1.   Results
> > in loss of insert data granularity (important for our application).
>
> Apart from having two separate views (one for report, one for
> insert) as Richard suggested:
>
> If you have fixed times for #3, don't start any #1 that won't
> finish before it's time for #3.
>
> You could also use the LOCK command on an empty lock table at
> the beginning of each #1 or #3 transaction to prevent #3 from
> getting the view lock before #1 is finished.
>
>
> HTH,
> Markus
>
> --
> Markus Schaber | Logical Tracking&Tracing International AG
> Dipl. Inf.     | Software Development GIS
>
> Fight against software patents in EU! www.ffii.org
> www.nosoftwarepatents.org
>
>

Re: partitioning and locking problems

From
"Jim C. Nasby"
Date:
On Wed, Feb 01, 2006 at 10:20:21AM -0500, Tom Lane wrote:
> "Marc Morin" <marc@sandvine.com> writes:
> > Do you mean it would be impossible to change the code so that existing
> > selects continue to use the pre-truncated table until they commit?
>
> Yes, because that table won't exist any more (as in the file's been
> unlinked) once the TRUNCATE commits.

Is there a reason the truncate must happen in 'real time'? If TRUNCATE
marked a table as "truncated as of tid, cid" and created a new set of
empty objects to be used by all transactions after that, then it should
be possible to truncate without waiting on existing selects.
Unfortunately, I can't think of any way to avoid blocking new inserters,
but in the partitioning case that shouldn't matter.

> > The update/insert rule change appears to be more more doable? No?
>
> You've still got race conditions there: do onlooker transactions see the
> old set of rules, or the new set, or some unholy mixture?  Removing the
> lock as you suggest would make it possible for the rule rewriter to pick
> up non-self-consistent data from the system catalogs, leading to
> arbitrarily bad behavior ... if you're lucky, it'll just crash, if
> you're not lucky the incorrect rule will do a fandango on your data.

Where can one read about why the catalogs can't/don't use MVCC (I'm
assuming that's why this won't work...)
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: partitioning and locking problems

From
Simon Riggs
Date:
On Thu, 2006-02-02 at 11:27 -0500, Marc Morin wrote:

> > >     1- long running report is running on view
> > >     2- continuous inserters into view into a table via a rule
> > >     3- truncate or rule change occurs, taking an exclusive lock.
> > > Must wait for #1 to finish.
> > >     4- new reports and inserters must now wait for #3.
> > >     5- now everyone is waiting for a single query in #1.   Results
> > > in loss of insert data granularity (important for our application).

> Using a separate lock table is what we've decided to do in this
> particular case to serialize #1 and #3.  Inserters don't take this lock
> and as such will not be stalled.

Would it not be simpler to have the Inserters change from one table to
another either upon command, on a fixed timing cycle or even better
based upon one of the inserted values (Logdate?) (or all 3?). (Requires
changes in the application layer: 3GL or db functions).

The truncates can wait until the data has stopped being used.

I'd be disinclined to using the locking system as a scheduling tool.

Best Regards, Simon Riggs



Re: partitioning and locking problems

From
Ron
Date:
At 05:09 PM 2/7/2006, Simon Riggs wrote:

>I'd be disinclined to using the locking system as a scheduling tool.
I Agree with Simon.  Using the locking system for scheduling feels
like a form of Programming by Side Effect.

Ron



Re: partitioning and locking problems

From
"Marc Morin"
Date:
All good ideas, unfortunately, we can't change the inserting applicatin
code easily.

> -----Original Message-----
> From: Simon Riggs [mailto:simon@2ndquadrant.com]
> Sent: Tuesday, February 07, 2006 5:09 PM
> To: Marc Morin
> Cc: Markus Schaber; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] partitioning and locking problems
>
> On Thu, 2006-02-02 at 11:27 -0500, Marc Morin wrote:
>
> > > >     1- long running report is running on view
> > > >     2- continuous inserters into view into a table
> via a rule
> > > >     3- truncate or rule change occurs, taking an
> exclusive lock.
> > > > Must wait for #1 to finish.
> > > >     4- new reports and inserters must now wait for #3.
> > > >     5- now everyone is waiting for a single query
> in #1.   Results
> > > > in loss of insert data granularity (important for our
> application).
>
> > Using a separate lock table is what we've decided to do in this
> > particular case to serialize #1 and #3.  Inserters don't take this
> > lock and as such will not be stalled.
>
> Would it not be simpler to have the Inserters change from one
> table to another either upon command, on a fixed timing cycle
> or even better based upon one of the inserted values
> (Logdate?) (or all 3?). (Requires changes in the application
> layer: 3GL or db functions).
>
> The truncates can wait until the data has stopped being used.
>
> I'd be disinclined to using the locking system as a scheduling tool.
>
> Best Regards, Simon Riggs
>
>
>

Re: partitioning and locking problems

From
"Jim C. Nasby"
Date:
On Tue, Feb 07, 2006 at 10:09:02PM +0000, Simon Riggs wrote:
> On Thu, 2006-02-02 at 11:27 -0500, Marc Morin wrote:
>
> > > >     1- long running report is running on view
> > > >     2- continuous inserters into view into a table via a rule
> > > >     3- truncate or rule change occurs, taking an exclusive lock.
> > > > Must wait for #1 to finish.
> > > >     4- new reports and inserters must now wait for #3.
> > > >     5- now everyone is waiting for a single query in #1.   Results
> > > > in loss of insert data granularity (important for our application).
>
> > Using a separate lock table is what we've decided to do in this
> > particular case to serialize #1 and #3.  Inserters don't take this lock
> > and as such will not be stalled.
>
> Would it not be simpler to have the Inserters change from one table to
> another either upon command, on a fixed timing cycle or even better
> based upon one of the inserted values (Logdate?) (or all 3?). (Requires
> changes in the application layer: 3GL or db functions).

Unfortunately, AFAIK rule changes would suffer from the exact same
problem, which will be a serious issue for table partitioning. If you
try and add a new partition while a long report is running you'll end up
blocking everything.

ALso, IIRC the OP was trying *not* to have the locking system impose
scheduling. I believe the intention is that either 1 not block 3 or 3
not block 4.

I'm honestly somewhat surprised someone hasn't run into this problem
with partitioning yet; or maybe everyone who needs to do long
transactions just shoves those off to slony slaves...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: partitioning and locking problems

From
Simon Riggs
Date:
On Tue, 2006-02-07 at 18:59 -0600, Jim C. Nasby wrote:

> I'm honestly somewhat surprised someone hasn't run into this problem
> with partitioning yet; or maybe everyone who needs to do long
> transactions just shoves those off to slony slaves...

All DDL takes locks, on all DBMS.

Best Regards, Simon Riggs