Thread: 24x7x365 high-volume ops ideas

24x7x365 high-volume ops ideas

From
"Ed L."
Date:
I have a few high-volume, fairly large clusters that I'm struggling to keep
up 24x7x365.  I want to ask for advice from anyone with similar experience
or hard-won wisdom.

Generally these are clusters with 100-200 queries/second, maybe 10GB-30GB of
data (always increasing), and maybe 10% writes.  A little regular routine
downtime for maintenance would do wonders for these systems, but
unfortunately, the requirement is 100% uptime all the time, and any
downtime at all is a liability.  Here are some of the issues:

1)  Big tables.  When the tables grow large enough, it takes too long to
vacuum them.  In some cases there's just too much data.  In other cases,
it's dead space, but both reindex and vacuum full block production queries
(a lesser version of downtime).  In the past, we have taken a PR hit for
downtime to dump/reload (we've found it to be faster than vacuum full).
Async replication helps with cluster moves from one server to another, but
we still don't have a low-to-zero-downtime solution for regular maintenance.

2)  Big tables, part 2.  Of course, customers want all data that ever
existed online and quickly available via sub-second queries.  I assume at
some point this data is going to be too much for one table (how much is too
much?).  This is a little vague, I know, but what sorts of segmenting
strategies do folks employ to deal with data that cannot be retired but
gets too expensive to vacuum, etc.?

3)  Simple restarts for configuration changes (shared_buffers,
max_connections, etc).  When we have to break client connections, we have
to notify the customer and take a PR hit.  Maybe pgpool is a possible
solution?

Are these issues for Oracle, DB2, etc as well?

TIA.

Ed


Re: 24x7x365 high-volume ops ideas

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Ed L.
> Sent: Wednesday, November 03, 2004 5:10 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] 24x7x365 high-volume ops ideas
>
>
>
> I have a few high-volume, fairly large clusters that I'm
> struggling to keep
> up 24x7x365.  I want to ask for advice from anyone with
> similar experience
> or hard-won wisdom.
>
> Generally these are clusters with 100-200 queries/second,
> maybe 10GB-30GB of
> data (always increasing), and maybe 10% writes.  A little
> regular routine
> downtime for maintenance would do wonders for these systems, but
> unfortunately, the requirement is 100% uptime all the time, and any
> downtime at all is a liability.  Here are some of the issues:
>
> 1)  Big tables.  When the tables grow large enough, it takes
> too long to
> vacuum them.  In some cases there's just too much data.  In
> other cases,
> it's dead space, but both reindex and vacuum full block
> production queries
> (a lesser version of downtime).  In the past, we have taken a
> PR hit for
> downtime to dump/reload (we've found it to be faster than
> vacuum full).
> Async replication helps with cluster moves from one server to
> another, but
> still don't have a low-to-zero downtime solution for regular maint.

This is a tough problem.  Related to those below, of course.

> 2)  Big tables, part 2.  Of course, customers want all data that ever
> existed online and quickly available via sub-second queries.
> I assume at
> some point this data is going to be too much for one table
> (how much is too
> much?).  This is a little vague, I know, but what sorts of segmenting
> strategies to folks employ to deal with data that cannot be
> retired but
> gets too expensive to vacuum, etc.

This is an opportunity for savings.  Create a history table that
contains everything more than about 2 years old.
For people who want access to everything, create a view with UNION ALL
against the current and history tables.
Once they see how fast the current data is, and how rarely they really need
the old stuff, they will shy away from using it.

E.g.
Table inventory has current data
Table inventory_hist has old data
View  inventory_all has "SELECT * FROM inventory UNION ALL SELECT * FROM
inventory_hist"

> 3)  Simple restarts for configuration changes (shared_buffers,
> max_connections, etc).  When we have to break client
> connections, we have
> to notify the customer and take a PR hit.  Maybe pgpool is a possible
> solution?

There's the time-tested method of throwing more hardware at it.  Would
an 8-CPU machine with 64 GB of RAM and a striped Ultra320 RAID array help?
The hardware is always less expensive than the software and the data --
usually by orders of magnitude.  If a $25,000-$50,000 machine will save
one man-year of effort over its lifetime, then it is a splendid idea.

> Are these issues for Oracle, DB2, etc as well?

Yes, but not as pronounced.  For instance, with SQL Server you do "UPDATE
STATISTICS" to get the equivalent of a vacuum full.

Re: 24x7x365 high-volume ops ideas

From
Karim Nassar
Date:
On Wed, 2004-11-03 at 18:10, Ed L. wrote:
> unfortunately, the requirement is 100% uptime all the time, and any
> downtime at all is a liability.  Here are some of the issues:

Seems like 100% uptime is always the requirement, but it's not even close to
reality.  I think it's unreasonable to expect a single piece of software
to NEVER be restarted.  Never is a really long time.

For this case, isn't replication sufficient? (FWIW, in 1 month I have to
answer this same question). Would this work?

* 'Main' db server up 99.78% of time
* 'Replicant' up 99.78% of time (using slony, dbmirror)
* When Main goes down (crisis, maintenance), Replicant answers for Main,
  in a read-only fashion.
* When Main comes back up, any waiting writes can now happen.
* Likewise, Replicant can be taken down for maint, then Main syncs to it
  when going back online.

Is this how it's done?

\<.


Re: 24x7x365 high-volume ops ideas

From
Christopher Browne
Date:
A long time ago, in a galaxy far, far away, Karim.Nassar@NAU.EDU (Karim Nassar) wrote:
> On Wed, 2004-11-03 at 18:10, Ed L. wrote:
>> unfortunately, the requirement is 100% uptime all the time, and any
>> downtime at all is a liability.  Here are some of the issues:
>
> Seems like 100% uptime is always an issue, but not even close to
> reality. I think it's unreasonable to expect a single piece of
> software that NEVER to be restarted. Never is a really long time.
>
> For this case, isn't replication sufficient? (FWIW, in 1 month I
> have to answer this same question). Would this work?
>
> * 'Main' db server up 99.78% of time
> * 'Replicant' up 99.78% of time (using slony, dbmirror)
> * When Main goes down (crisis, maintenance), Replicant answers for Main,
>   in a read-only fashion.
> * When Main comes back up, any waiting writes can now happen.
> * Likewise, Replicant can be taken down for maint, then Main syncs to it
>   when going back online.
>
> Is this how it's done?

The challenge lies in two places:

1.  You need some mechanism to detect that the "replica" should take
over, and to actually perform that takeover.

That "takeover" requires having some way for your application to
become aware of the new IP address of the DB host.

2.  Some changes need to take place in order to prepare the "replica"
to be treated as "master."

For instance, in the case of Slony-I, you can do a full-scale FAILOVER,
where you tell it to treat the "main" database as dead.  At that point,
the replica becomes the master, and the former master is essentially
discarded.

Alternatively, there's MOVE SET, which is suitable for predictable
maintenance; it shifts the "master" role from one node to another.  You
can take MAIN out of service for a while, then add it back, perhaps
making it the "master" again.

None of these systems _directly_ address how apps would get pointed to
the shifting servers.

A neat approach would involve making pgpool, a C-based connection-pool
manager, Slony-I-aware.  If pgpool were the one submitting MOVE SET or
FAILOVER, it would know which DB to point things at, so applications that
pass requests through pgpool would not need to be aware of the change
beyond perhaps seeing some transactions terminated.  That won't be ready
tomorrow...

Something needs to be "smart enough" to point apps to the right place;
that's something to think about...
--
let name="cbbrowne" and tld="linuxfinances.info" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/advocacy.html
"XFS might  (or might not)  come out before  the year 3000.  As far as
kernel patches go,  SGI are brilliant.  As far as graphics, especially
OpenGL,  go,  SGI is  untouchable.  As  far as   filing  systems go, a
concussed doormouse in a tarpit would move faster."  -- jd on Slashdot

Re: 24x7x365 high-volume ops ideas

From
Karim Nassar
Date:
On Sun, 2004-11-07 at 21:16, Christopher Browne wrote:
> None of these systems _directly_ address how apps would get pointed to
> the shifting servers.
<snip>
> Something needs to be "smart enough" to point apps to the right place;
> that's something to think about...

Seems like it would be pretty easy to be smart in PHP:

function db_connect() {
  $conn = pg_connect("dbname='foo' user='dawg' password='HI!'
                      host='master'");
  if (!($conn AND (pg_connection_status($conn) == 0))) {
    // problem with master
    $conn = pg_connect("dbname='foo' user='dawg' password='HI!'
                      host='replica'");
    if ($conn AND (pg_connection_status($conn) == 0)) {
      return $conn;
    }
  } else {
    return $conn;
  }
  return NULL;
}

Whatever client-side language one uses, the technique is the same
(though the coding style might differ :P ), can be used for persistent
connections (eg: with pg_pconnect in PHP), and seems like it could be
extended to any reasonable number of database servers.

What is the problem with this? The only issue I can see is that
"replica" might be behind. Depending on the application, this might not
be bad. If the app MUST have the very most accurate DB, you could remove
the logic that connects to the replica, but then that nullifies this
whole conversation...

\<.




Re: 24x7x365 high-volume ops ideas

From
Chris Browne
Date:
Karim.Nassar@NAU.EDU (Karim Nassar) writes:
> On Sun, 2004-11-07 at 21:16, Christopher Browne wrote:
>> None of these systems _directly_ address how apps would get pointed to
>> the shifting servers.
> <snip>
>> Something needs to be "smart enough" to point apps to the right place;
>> that's something to think about...
>
> Seems like it would be pretty easy to be smart in PHP:
>
> function db_connect() {
>   $conn = pg_connect("dbname='foo' user='dawg' password='HI!'
>                       host='master'");
>   if (!($conn AND (pg_connection_status($conn) == 0))) {
>     // problem with master
>     $conn = pg_connect("dbname='foo' user='dawg' password='HI!'
>                       host='replica'");
>     if ($conn AND (pg_connection_status($conn) == 0)) {
>       return $conn;
>     }
>   } else {
>     return $conn;
>   }
>   return NULL;
> }
>
> Whatever client-side language one uses, the technique is the same
> (though the coding style might differ :P ), can be used for
> persistent connections (eg: with pg_pconnect in PHP), and seems like
> it could be extended to any reasonable number of database servers.
>
> What is the problem with this? The only issue I can see is that
> "replica" might be behind. Depending on the application, this might
> not be bad. If the app MUST have the very most accurate DB, you
> could remove the logic that connects to the replica, but then that
> nullifies this whole conversation...

The "problem" is that this requires modifications to the application,
and communicating configuration changes gets that bit more
complicated.

Supposing, for instance, the code that accesses connections has
already gotten wrapped in some more-or-less arcane object class
specific to the application, it may be somewhat troublesome to make
the modification.

It would be attractive to allow the configuration change to take place
outside the application in a manner that allows the application to be
completely ignorant about it.

By the way, your db_connect() suggestion doesn't cope with the problem
where a connection is broken and the application continues to use that
broken connection.  There may be a need to cope with that...
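
One way to cope, at least for persistent connections, is to validate the
handle before using it and reconnect if it has gone bad.  A rough PHP
sketch (the function name is made up, and the connection strings are the
same placeholders as above):

function db_get_valid_conn() {
  // try the (possibly pooled) master connection first
  $conn = pg_pconnect("dbname='foo' user='dawg' password='HI!' host='master'");
  if ($conn && pg_connection_status($conn) != PGSQL_CONNECTION_OK) {
    // stale pooled handle; try to reset it in place
    if (!pg_connection_reset($conn)) {
      $conn = FALSE;
    }
  }
  if (!$conn) {
    // master unreachable; fall back to the (possibly lagging) replica
    $conn = pg_pconnect("dbname='foo' user='dawg' password='HI!' host='replica'");
  }
  return ($conn && pg_connection_status($conn) == PGSQL_CONNECTION_OK) ? $conn : NULL;
}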
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.

Re: 24x7x365 high-volume ops ideas

From
Richard_D_Levine@raytheon.com
Date:
Hi Chris and Karim,

I haven't been following this thread, so excuse me if I suggest something
that has already been tossed out.

Solaris allows multiple IP addresses to be assigned to a single NIC.  I
just looked at the man page for Linux ifconfig but didn't see quickly how
to do this.  If Linux doesn't allow this, the same thing can be
accomplished using multiple NICs per server.

We reserve a special IP for the DB server.  This IP can be assigned to the
NIC of the machine currently hosting the database.  If you want apps to
connect to a different server, remove the IP from one machine and reassign
it to the other.  This special DB IP is assigned on top of the regular IP
for the machine.

Newly connecting apps are never the wiser, but existing connections must be
terminated.

Machine A: (current DB server)

ifconfig hme0 192.1.1.1 up
ifconfig hme0:1 192.1.1.100 up   # special DB IP

To change the DB server from machine A to machine B:

Machine B: (new DB server)
ifconfig hme0 192.1.1.2 up
ifconfig hme0:1 192.1.1.100 up

Machine A:
ifconfig hme0:1 down

Hope this helps,

Rick

Re: 24x7x365 high-volume ops ideas

From
Tino Wildenhain
Date:
Hi,

On Tue, 2004-11-16 at 14:17, Richard_D_Levine@raytheon.com wrote:
> Hi Chris and Karim,
>
> I haven't been following this thread, so excuse me if I suggest something
> that has already been tossed out.
>
> Solaris allows multiple IP addresses to be assigned to a single NIC.  I
> just looked at the man page for Linux ifconfig but didn't see quickly how
> to do this.  If Linux doesn't allow this, the same thing can be
> accomplished using multiple NICs per server.
>
> We reserve a special IP for the DB server.  This IP can be assigned to the
> NIC of the machine currently hosting the database.  If you want apps to
> connect to a different server, remove the IP from one machine and reassign
> it to the other.  This special DB IP is assigned on top of the regular IP
> for the machine.
>
> Newly connecting apps are never the wiser, but existing connections must be
> terminated.

Yes, Linux can do it as well.  But in either case, beware the ARP cache :-)
There is also sqlrelay, which could do the switching without forcing the
apps to reconnect.

Regards
Tino


Re: 24x7x365 high-volume ops ideas

From
"Joshua D. Drake"
Date:
Tino Wildenhain wrote:

>Hi,
>
>On Tue, 2004-11-16 at 14:17, Richard_D_Levine@raytheon.com wrote:
>
>
>>Hi Chris and Karim,
>>
>>I haven't been following this thread, so excuse me if I suggest something
>>that has already been tossed out.
>>
>>Solaris allows multiple IP addresses to be assigned to a single NIC.  I
>>just looked at the man page for Linux ifconfig but didn't see quickly how
>>to do this.  If Linux doesn't allow this, the same thing can be
>>accomplished using multiple NICs per server.
>>
>>
ifconfig <device>:<alias> <ipaddress> up

For Linux.

Sincerely,

Joshua D. Drake



>>We reserve a special IP for the DB server.  This IP can be assigned to the
>>NIC of the machine currently hosting the database.  If you want apps to
>>connect to a different server, remove the IP from one machine and reassign
>>it to the other.  This special DB IP is assigned on top of the regular IP
>>for the machine.
>>
>>Newly connecting apps are never the wiser, but existing connections must be
>>terminated.
>>
>>
>
>Yes, linux can do it as well. But either case beware the arp cache :-)
>There is sqlrelay which could do the switching as well without
>forcing the apps to reconnect.
>
>Regards
>Tino


--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL



Re: 24x7x365 high-volume ops ideas

From
Richard_D_Levine@raytheon.com
Date:
Thanks Joshua,

I just played with it and it is more flexible than Solaris.  Linux allows
<alias> to be four characters, but Solaris only allows low numbers.  I like
this:

ifconfig eth0:dbms 192.1.1.100 up

Tino's right about the ARP tables.  Gotta watch that one, especially with
lower grade switches.
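
For what it's worth, the takeover step on Linux can also nudge those ARP
caches along with a few gratuitous ARPs.  A rough sketch, assuming the
alias and addresses from above and that arping (from iputils) is installed:

# on the old DB server: release the service IP
ifconfig eth0:dbms down

# on the new DB server: claim the service IP and announce it
ifconfig eth0:dbms 192.1.1.100 up
arping -U -I eth0 -c 3 192.1.1.100   # unsolicited ARP so switches/neighbors update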

Rick