Re: sustained update load of 1-2k/sec - Mailing list pgsql-performance

From Bob Ippolito
Subject Re: sustained update load of 1-2k/sec
Msg-id 67788A4A-4011-49AB-B329-683FD9532661@redivi.com
In response to sustained update load of 1-2k/sec  (Mark Cotner <mcotner@yahoo.com>)
List pgsql-performance
On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:

> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device.  This requires a very heavy amount of
> updates in place on a reasonably large table(100k-500k
> rows, ~7 columns mostly integers/bigint).  Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP.  I hope I can tune the
> DB to keep up.
>
> The app is threaded and will likely have well over 100
> concurrent db connections.  Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.

Somewhat OT, but..

The easiest way to speed that up is to use fewer threads.  With that
many threads you're adding a whole ton of overhead that you just
don't want or need.  You should probably be using something
event-driven to solve this problem, with just a few database threads
to store all that state.  Less is definitely more in this case.  See
<http://www.kegel.com/c10k.html> (and there's plenty of other
literature out there saying that an event-driven design is an
extremely good way to do this sort of thing).

Here are some frameworks to look at for this kind of network code:
(Python) Twisted - <http://twistedmatrix.com/>
(Perl) POE - <http://poe.perl.org/>
(Java) java.nio (I'm not familiar enough with Java to know whether
or not there's a high-level wrapper)
(C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
(Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
(C) libevent - <http://monkey.org/~provos/libevent/>

.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
that you could use directly, if you wanted to roll your own solution,
but don't do that.
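
To give a rough idea of what event-driven polling looks like, here is
a minimal sketch using Python's standard asyncio module as a stand-in
for the frameworks listed above (asyncio is my substitution, not one
of them).  Everything named here is hypothetical: poll_device() only
simulates an SNMP request with a short sleep, and max_in_flight is an
arbitrary cap on outstanding polls.  One thread drives all of the
polls.

```python
import asyncio


async def poll_device(device_id):
    """Hypothetical stand-in for a non-blocking SNMP poll of one modem."""
    await asyncio.sleep(0.01)  # simulates network latency only
    return device_id, {"status": "ok"}


async def poll_all(device_ids, max_in_flight=200):
    """Poll every device from one thread, capping the polls in flight."""
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(device_id):
        # Wait for a free slot, then run the poll without blocking the loop.
        async with limit:
            return await poll_device(device_id)

    # gather() returns results in the same order as device_ids.
    return await asyncio.gather(*(bounded(d) for d in device_ids))


def run_poll(device_ids):
    """Run one polling sweep to completion and return the results."""
    return asyncio.run(poll_all(device_ids))
```

Calling run_poll(range(1000)) sweeps a thousand simulated devices
without creating a thread per device; a real poller would replace
poll_device() with an actual asynchronous SNMP call.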

If you don't want to optimize the whole application, I'd at least
just push the DB operations down to a very small number of
connections (*one* might even be optimal!), waiting on some kind of
thread-safe queue for updates from the rest of the system.  This way
you can easily batch those updates into transactions and you won't be
putting so much unnecessary synchronization overhead into your
application and the database.
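
A minimal sketch of that queue-plus-single-writer arrangement, in
Python and using only the standard library.  The names are all
hypothetical: db_writer() is the one thread that owns the database
connection, and flush_batch(rows) is a callback you would supply to
apply a list of rows inside a single transaction (for example one
executemany followed by one commit).  The max_batch and max_wait
values are placeholders to tune, not recommendations.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the writer to shut down


def db_writer(updates, flush_batch, max_batch=500, max_wait=0.5):
    """Drain `updates` and hand rows to `flush_batch` in batches.

    `flush_batch(rows)` is a hypothetical callback; it is expected to
    apply all rows within one database transaction.
    """
    stopping = False
    while not stopping:
        item = updates.get()  # block until there is work
        if item is _STOP:
            break
        batch = [item]
        # Keep collecting until the batch is full or the queue has been
        # empty for max_wait seconds.
        while len(batch) < max_batch:
            try:
                item = updates.get(timeout=max_wait)
            except queue.Empty:
                break
            if item is _STOP:
                stopping = True
                break
            batch.append(item)
        flush_batch(batch)  # one transaction per batch
```

The polling side just calls updates.put(row) on a shared queue.Queue
and never touches the database; put _STOP on the queue to make the
writer flush what it has and exit.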

Generally, once you have more worker threads (or processes) than
CPUs, you're going to get diminishing returns in a bad way, assuming
those threads are making good use of their time.

-bob

