Thread: Manual sys catalog constraint setup to avoid downtime?

Manual sys catalog constraint setup to avoid downtime?

From
Jerry Sievers
Date:
Admins; this is (I assume) a not uncommon scenario).

Suppose you are managing a BIG ugly DB that's in 24x7 production and
desperately in need of constraints of all types.

(Of course, you inherited this piece of crap and the irresponsible
designers are long gone :-)

You have verified that the application now honors the new constraints
because on your R&D DBs, the constraints are in place and no
constraint violations are occuring.

You can't however get business to sign off on proper constraint
addition via ALTER TABLE because the tables are huge and this locks
out concurrent activity for several hours.

My question;

Have any of you setup the constraints by doing manual catalog updates
and later (or maybe never),  gone back and cleaned up the
non-conforming data using small batch updates or deletes?

Frobing of pg_constraint and pg_depend (at least) would be required.

I'm wondering what non-obvious pitfalls exist here?

Comments?

--
-------------------------------------------------------------------------------
Jerry Sievers   305 854-3001 (home)     Production Database Administrator
                305 321-1144 (mobil    WWW E-Commerce Consultant

Re: Manual sys catalog constraint setup to avoid downtime?

From
"Jim C. Nasby"
Date:
On Tue, Nov 07, 2006 at 11:05:38AM -0500, Jerry Sievers wrote:
> Admins; this is (I assume) a not uncommon scenario).
>
> Suppose you are managing a BIG ugly DB that's in 24x7 production and
> desperately in need of constraints of all types.
>
> (Of course, you inherited this piece of crap and the irresponsible
> designers are long gone :-)
>
> You have verified that the application now honors the new constraints
> because on your R&D DBs, the constraints are in place and no
> constraint violations are occuring.
>
> You can't however get business to sign off on proper constraint
> addition via ALTER TABLE because the tables are huge and this locks
> out concurrent activity for several hours.
>
> My question;
>
> Have any of you setup the constraints by doing manual catalog updates
> and later (or maybe never),  gone back and cleaned up the
> non-conforming data using small batch updates or deletes?
>
> Frobing of pg_constraint and pg_depend (at least) would be required.
>
> I'm wondering what non-obvious pitfalls exist here?
>
> Comments?

Use the source, luke!

Hmmm... interesting problem. For some constraints, I think you could
probably handle things pretty well once we have non-blocking index
creation (did that get into 8.2?). I guess the real key there would be
if the constraint creation code will use indexes when checking the
constraints. This would be especially useful for generic CHECK
constraints... if you build an index that has a WHERE clause that
matches the proposed constraint, the backend could just look to see if
there are any entries in the index. If there aren't, you're good to go.

In the meantime, I suggest using triggers to enforce your constraints,
since you can create one without verifying the underlying data.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: Manual sys catalog constraint setup to avoid downtime?

From
Jerry Sievers
Date:
"Jim C. Nasby" <jim@nasby.net> writes:

> On Tue, Nov 07, 2006 at 11:05:38AM -0500, Jerry Sievers wrote:
>
> > Admins; this is (I assume) a not uncommon scenario).
> >
> > Suppose you are managing a BIG ugly DB that's in 24x7 production and
> > desperately in need of constraints of all types.
> >
> > (Of course, you inherited this piece of crap and the irresponsible
> > designers are long gone :-)
> >
> > You have verified that the application now honors the new constraints
> > because on your R&D DBs, the constraints are in place and no
> > constraint violations are occuring.
> >
> > You can't however get business to sign off on proper constraint
> > addition via ALTER TABLE because the tables are huge and this locks
> > out concurrent activity for several hours.
> >
> > My question;
> >
> > Have any of you setup the constraints by doing manual catalog updates
> > and later (or maybe never),  gone back and cleaned up the
> > non-conforming data using small batch updates or deletes?
> >
> > Frobing of pg_constraint and pg_depend (at least) would be required.
> >
> > I'm wondering what non-obvious pitfalls exist here?
> >
> > Comments?
>
> Use the source, luke!
>
> Hmmm... interesting problem. For some constraints, I think you could
> probably handle things pretty well once we have non-blocking index
> creation (did that get into 8.2?). I guess the real key there would be
> if the constraint creation code will use indexes when checking the
> constraints. This would be especially useful for generic CHECK
> constraints... if you build an index that has a WHERE clause that
> matches the proposed constraint, the backend could just look to see if
> there are any entries in the index. If there aren't, you're good to
> go.

Right.  The problem is that during production operation, we can't
afford the bulk updates/deletes to bring the data into conformance
with the to be added constraints.  We also can't afford the exclusive
locks taken on both sides of the new FK relationship either due to the
time involved.

Just the updates/deletes alone are likely to bump against client
activity due to the length of time some of these transactions may run.

We'll need to tiptoe around and make such that none of our
administrative activity results in any transaction that takes more
than a few seconds.

Here's the plan that I believe would solve the problem.

1. Clean illegal records from table using small group transactions or
   single row deletes/updates.  We want short running transactions
   so's not to block client activity that may still be able to see and
   update these rows.  (App has been verified not to put illegal rows
   in the DB with respect to nulls etc. but we're not sure the bad
   rows can't still be managed by the app through updates and such.

2. Create new NOT NULL, CHECK and/or FK constraints by inserting them
   into the catalogs directly.  NOT NULL is the simplest and requires
   only an update to pg_attribute table.  CHECK constraints a bit more
   tricky and requires figuring out what goes in the conbin field of
   pg_constraint table.  FK the most difficult as it requires adding
   recs to all of pg_constraint, pg_trigger and pg_depend.  The
   trigger I guess can be created using standard CREATE TRIGGER
   operation.

Having done this, we are NOT absolutely guaranteed that the new
constraints are expressed in the DB due to possible client activity
that ran during or after our bulk delete or updates in step #1.  We
are however insured that new activity as of the constraint additions
is conforming.

3. Let the DB age beyond any transactions that may have been open
   during our initial cleanup pass and repeat cleanup steps again.
   Now, we should be up to snuff and fine going forward.

There remains still at least one "fly in the ointment"... new indexes
that may be needed on the child side of FK to allow fast validation on
parent side update/delete still going to require long running
operation that can't concur with live production.

Agreed that in 8.2 the concurrent index functionality can help here.

Finally, as this memo went out early this week and drew only 1 reply
(yours), I intuit that this sort of unorthodox tactics gets no
emotional support from the community!

(Best to don the politician hat and get the downtime authorized to get
this done properly.)

Phew!

> In the mean time, I suggest using triggers to enforce your
> constraints, > since you can create one without verifying the
> underlying data.  > -- > Jim Nasby jim@nasby.net > EnterpriseDB
> http://enterprisedb.com 512.569.9461 (cell)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>

--
-------------------------------------------------------------------------------
Jerry Sievers   305 854-3001 (home)     Production Database Administrator
                305 321-1144 (mobil    WWW E-Commerce Consultant

Re: Manual sys catalog constraint setup to avoid downtime?

From
"Jim C. Nasby"
Date:
On Thu, Nov 09, 2006 at 08:46:42AM -0500, Jerry Sievers wrote:
> 2. Create new NOT NULL, CHECK and/or FK constraints by inserting them
>    into the catalogs directly.  NOT NULL is the simplest and requires
>    only an update to pg_attribute table.  CHECK constraints a bit more
>    tricky and requires figuring out what goes in the conbin field of
>    pg_constraint table.  FK the most difficult as it requires adding
>    recs to all of pg_constraint, pg_trigger and pg_depend.  The
>    trigger I guess can be created using standard CREATE TRIGGER
>    operation.
>
> Having done this, we are NOT absolutely guaranteed that the new
> constraints are expressed in the DB due to possible client activity
> that ran during or after our bulk delete or updates in step #1.  We
> are however insured that new activity as of the constraint additions
> is conforming.
>
> 3. Let the DB age beyond any transactions that may have been open
>    during our initial cleanup pass and repeat cleanup steps again.
>    Now, we should be up to snuff and fine going forward.

You should check into whether other backends will pick those catalog
changes up automagically or not... you may have to restart all
connections for those changes to take effect.

BTW, it'd probably be worthwhile looking at the code that creates the
different constraints to see exactly what it's doing.

And you're right... this is not something the community supports. :)
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)