Thread: question about frequency of updates/reads

question about frequency of updates/reads

From: Andrew Smith <laconical@gmail.com>
Date: Tue, Jun 23, 2009 at 10:12 PM
Hi all,

I'm working on a problem at the moment where I have some data that I
need to get from a proprietary system into a web page.  I was thinking
of using PostgreSQL as a middleman to store the data, e.g.:

- C++ app reads data from proprietary system and writes it into temp
table in PostgreSQL
- ASP.NET web service reads data from temp table in PostgreSQL and
generates HTML

I already have a database that I'm using for other parts of the web
site, so thought I'd just add an extra table that looks like this:

CREATE TABLE "DataExchange"
(
    "DataExchangeID" serial NOT NULL PRIMARY KEY,
    "Name" text NOT NULL UNIQUE,
    "Value" integer NOT NULL,
    "UpdateTime" timestamp without time zone
);

This temp table will probably contain up to 10000 records, each of
which could be changing every second (data is coming from a real-time
monitoring system).  On top of this, I've then got the ASP.NET app
reading the updated data values every second or so (the operators want
to see the data as soon as it changes).  I was going to do some
performance testing to see how well it would work, but thought I'd ask
the question here first: I know that the number of records isn't a
problem, but how about the frequency of updates/reads?  Is 10000
updates/reads a second considered a lot in the PostgreSQL world, or
will it do it easily?
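For concreteness, the per-second traffic would look something like this
(the tag name, value and poll timestamp are made up):

-- C++ app, once per second per changed point (illustrative name/value):
UPDATE "DataExchange"
SET "Value" = 42, "UpdateTime" = now()
WHERE "Name" = 'pump_1_pressure';

-- ASP.NET app, once per second or so:
SELECT "Name", "Value", "UpdateTime"
FROM "DataExchange"
WHERE "UpdateTime" > '2009-06-23 22:11:00';  -- rows changed since last poll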

Regards,

Andrew

Re: question about frequency of updates/reads

From: Scott Marlowe <scott.marlowe@gmail.com>
Date: Wed, Jun 24, 2009 at 12:48 AM
On Tue, Jun 23, 2009 at 10:12 PM, Andrew Smith<laconical@gmail.com> wrote:

> This temp table will probably contain up to 10000 records, each of
> which could be changing every second (data is coming from a real-time
> monitoring system).  On top of this, I've then got the ASP.NET app
> reading the updated data values every second or so (the operators want
> to see the data as soon as it changes).  I was going to do some
> performance testing to see how well it would work, but thought I'd ask
> the question here first: I know that the number of records isn't a
> problem, but how about the frequency of updates/reads?  Is 10000
> updates/reads a second considered a lot in the PostgreSQL world, or
> will it do it easily?

Maybe.  Rows that are updated often are NOT generally pgsql's strong
suit, but IF you're running 8.3 or above, and IF you have a low enough
fill factor that there's empty space for the updates and IF the fields
you are updating are not indexed and IF you have aggressive enough
vacuuming and IF you restrict your updates to JUST real updates (i.e.
update ... set a=1 where a<>1) and IF your IO subsystem has enough raw
horsepower, you can make this work.  But only benchmarking will tell
you if you can do it with your current hardware and setup.
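A rough sketch of the first couple of points, using your table (the
fillfactor value here is a guess; you'd tune it from your benchmarking,
and the name/value in the update are just examples):

-- leave free space in each page so updates can stay on the same page:
ALTER TABLE "DataExchange" SET (fillfactor = 50);

-- and only write rows that actually changed:
UPDATE "DataExchange"
SET "Value" = 42, "UpdateTime" = now()
WHERE "Name" = 'pump_1_pressure'  -- illustrative tag name/value
  AND "Value" <> 42;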

Re: question about frequency of updates/reads

From: David Wilson <david.t.wilson@gmail.com>
On Wed, Jun 24, 2009 at 12:48 AM, Scott Marlowe<scott.marlowe@gmail.com> wrote:

> Maybe.  Rows that are updated often are NOT generally pgsql's strong
> suit, but IF you're running 8.3 or above, and IF you have a low enough
> fill factor that there's empty space for the updates and IF the fields
> you are updating are not indexed and IF you have aggressive enough
> vacuuming and IF you restrict your updates to JUST real updates (i.e.
> update ... set a=1 where a<>1) and IF your IO subsystem has enough raw
> horsepower, you can make this work.  But only benchmarking will tell
> you if you can do it with your current hardware and setup.

One option, though not nearly as nice as a simple additional table, is
to initdb on a ramdisk and run a separate server instance on it. It
doesn't sound like a) you've got a lot of data in there, or b) you
really care about the data sticking around after a crash (since it's
in the proprietary system already). You'd get the benefit of easy sql
to do the storage and retrieval, though you'd need a separate
connection to the separate server instance on the web page which would
be annoying, and you'd probably want to script the ramdisk and db
setup since you'll have to recreate after a crash. (Alternately,
create on disk somewhere; then put a copy on the ramdisk and start a
postgres instance pointed at it. Then after a restart you just need to
copy over from disk and start up the postgres instance.)


--
- David T. Wilson
david.t.wilson@gmail.com

Re: question about frequency of updates/reads

From: Craig Ringer
Andrew Smith wrote:

> - C++ app reads data from proprietary system and writes it into temp
> table in PostgreSQL
> - ASP.NET web service reads data from temp table in PostgreSQL and
> generates HTML
[snip]
> This temp table will probably contain up to 10000 records, each of
> which could be changing every second (data is coming from a real-time
> monitoring system).  On top of this, I've then got the ASP.NET app
> reading the updated data values every second or so (the operators want
> to see the data as soon as it changes).

PostgreSQL - or, in fact, any relational database - isn't really a
great choice for this particular role. You don't care about retaining
the data, you're not that bothered by long-term data integrity, and
you're really just using the DB as a communications tool.

(If you *DO* care about the history, then that's different, but you're
also talking serious hardware).

Personally I think that for your real-time monitoring you might be much
better off using something like memcached for your intermediary. Or, for
that matter, a text file. It does depend on the complexity of the
queries you want to run on the data, though.

If you do want to use SQL, but you don't care about the history and only
want it as a communication intermediary, I'd actually suggest MySQL with
MyISAM tables for this one isolated role. While horrifyingly unsafe,
with MyISAM tables it's also blazingly fast. Do NOT store anything you
care about keeping that way, though.

If you're using Pg anyway for other things, if you do also intend to
store the history of the changes (say, as a change log in another table
on safe storage), etc, then you might want to look at temp tables in a
RAM-disk based tablespace. Do some testing and see how you go.
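Roughly along these lines (the tablespace name, table name and path are
examples only; it assumes a RAM disk is already mounted there, and
remember that everything in that tablespace is gone after a reboot):

-- path and names are examples only:
CREATE TABLESPACE ramspace LOCATION '/mnt/ramdisk/pg';

-- either put the exchange table there directly:
CREATE TABLE "DataExchangeLive" ( LIKE "DataExchange" ) TABLESPACE ramspace;

-- or point this session's temp tables at it:
SET temp_tablespaces = 'ramspace';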

--
Craig Ringer