Thread: question about frequency of updates/reads
Hi all,

I'm working on a problem at the moment where I have some data that I need
to get from a proprietary system into a web page. I was thinking of using
PostgreSQL as a middle man to store the data. E.g.:

- C++ app reads data from proprietary system and writes it into temp table
  in PostgreSQL
- ASP.NET web service reads data from temp table in PostgreSQL and
  generates HTML

I already have a database that I'm using for other parts of the web site,
so thought I'd just add an extra table that looks like this:

CREATE TABLE "DataExchange" (
    "DataExchangeID" serial NOT NULL PRIMARY KEY,
    "Name" text NOT NULL UNIQUE,
    "Value" integer NOT NULL,
    "UpdateTime" timestamp without time zone
);

This temp table will probably contain up to 10000 records, each of which
could be changing every second (data is coming from a real-time monitoring
system). On top of this, I've then got the ASP.NET app reading the updated
data values every second or so (the operators want to see the data as soon
as it changes). I was going to do some performance testing to see how well
it would work, but thought I'd ask the question here first: I know that the
number of records isn't a problem, but how about the frequency of
updates/reads? Is 10000 updates/reads a second considered a lot in the
PostgreSQL world, or will it do it easily?

Regards,

Andrew
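[For concreteness, the per-row traffic being described might look roughly
like the following. Only the table definition comes from the post above;
the exact statements the C++ app and the ASP.NET service would issue, and
the names/values used, are illustrative assumptions.]

-- C++ app pushes the latest reading for one monitored point:
UPDATE "DataExchange"
   SET "Value" = 42, "UpdateTime" = now()
 WHERE "Name" = 'pump_1_pressure';

-- ASP.NET service polls for anything that changed recently:
SELECT "Name", "Value", "UpdateTime"
  FROM "DataExchange"
 WHERE "UpdateTime" > now() - interval '2 seconds';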
On Tue, Jun 23, 2009 at 10:12 PM, Andrew Smith<laconical@gmail.com> wrote:
> This temp table will probably contain up to 10000 records, each of
> which could be changing every second (data is coming from a real-time
> monitoring system). On top of this, I've then got the ASP.NET app
> reading the updated data values every second or so (the operators want
> to see the data as soon as it changes). I was going to do some
> performance testing to see how well it would work, but thought I'd ask
> the question here first: I know that the number of records isn't a
> problem, but how about the frequency of updates/reads? Is 10000
> updates/reads a second considered a lot in the PostgreSQL world, or
> will it do it easily?

Maybe. Rows that are updated often are NOT generally pgsql's strong suit,
but IF you're running 8.3 or above, and IF you have a low enough fill
factor that there's empty space for the updates, and IF the fields you are
updating are not indexed, and IF you have aggressive enough vacuuming, and
IF you restrict your updates to JUST real updates (i.e. update ... set a=1
where a<>1), and IF your IO subsystem has enough raw horsepower, you can
make this work. But only benchmarking will tell you if you can do it with
your current hardware and setup.
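[As a sketch of what that advice looks like in practice, using the
DataExchange table from the original post; the fillfactor figure and the
literal values are only illustrative.]

-- Leave free space in each page so updated row versions can stay on the
-- same page (HOT updates on 8.3+); 50% here is just a starting guess.
ALTER TABLE "DataExchange" SET (fillfactor = 50);

-- Only write rows whose value has actually changed; unchanged rows then
-- generate no dead tuples at all. The updated columns are not indexed,
-- which keeps the update HOT-eligible.
UPDATE "DataExchange"
   SET "Value" = 42, "UpdateTime" = now()
 WHERE "Name" = 'pump_1_pressure'
   AND "Value" <> 42;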
On Wed, Jun 24, 2009 at 12:48 AM, Scott Marlowe<scott.marlowe@gmail.com> wrote:
> Maybe. Rows that are updated often are NOT generally pgsql's strong
> suit, but IF you're running 8.3 or above, and IF you have a low enough
> fill factor that there's empty space for the updates and IF the fields
> you are updating are not indexed and IF you have aggressive enough
> vacuuming and IF you restrict your updates to JUST real updates (i.e.
> update ... set a=1 where a<>1) and IF your IO subsystem has enough raw
> horsepower, you can make this work. But only benchmarking will tell
> you if you can do it with your current hardware and setup.

One option, though not nearly as nice as a simple additional table, is to
initdb on a ramdisk and run a separate server instance on it. It doesn't
sound like a) you've got a lot of data in there, or b) you really care
about the data sticking around after a crash (since it's in the proprietary
system already). You'd get the benefit of easy SQL to do the storage and
retrieval, though you'd need a separate connection to the separate server
instance on the web page, which would be annoying, and you'd probably want
to script the ramdisk and db setup since you'll have to recreate it after a
crash.

(Alternately, create the cluster on disk somewhere, then put a copy on the
ramdisk and start a postgres instance pointed at it. After a restart you
then just need to copy it over from disk and start up the postgres
instance.)

--
- David T. Wilson
david.t.wilson@gmail.com
Andrew Smith wrote:

> - C++ app reads data from proprietary system and writes it into temp
> table in PostgreSQL
> - ASP.NET web service reads data from temp table in PostgreSQL and
> generates HTML

[snip]

> This temp table will probably contain up to 10000 records, each of
> which could be changing every second (data is coming from a real-time
> monitoring system). On top of this, I've then got the ASP.NET app
> reading the updated data values every second or so (the operators want
> to see the data as soon as it changes).

PostgreSQL - or, in fact, any relational database - isn't really a great
choice for this particular role. You don't care about retaining the data,
you're not that bothered by long-term data integrity, and you're really
just using the DB as a communications tool. (If you *DO* care about the
history, then that's different, but you're also talking serious hardware.)

Personally, I think that for your real-time monitoring you might be much
better off using something like memcached as your intermediary. Or, for
that matter, a text file. It does depend on the complexity of the queries
you want to run on the data, though.

If you do want to use SQL, but you don't care about the history and only
want it as a communication intermediary, I'd actually suggest MySQL with
MyISAM tables for this one isolated role. While horrifyingly unsafe, MyISAM
tables are also blazingly fast. Do NOT store anything you care about
keeping that way, though.

If you're using Pg anyway for other things, and you do also intend to store
the history of the changes (say, as a change log in another table on safe
storage), then you might want to look at tables in a RAM-disk based
tablespace.

Do some testing and see how you go.

--
Craig Ringer
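[A sketch of the RAM-disk tablespace idea, assuming a RAM-backed filesystem
such as tmpfs is already mounted and writable by the postgres user; the
mount point, tablespace name, and table name below are purely illustrative,
and anything placed in such a tablespace is gone after a reboot.]

-- Tablespace backed by the RAM disk:
CREATE TABLESPACE ram_exchange LOCATION '/mnt/pg_ramdisk';

-- Keep only the fast-changing exchange table there; any history/changelog
-- tables stay in the default tablespace on safe storage.
CREATE TABLE "DataExchangeRam" (
    "DataExchangeID" serial NOT NULL PRIMARY KEY,
    "Name" text NOT NULL UNIQUE,
    "Value" integer NOT NULL,
    "UpdateTime" timestamp without time zone
) TABLESPACE ram_exchange;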