Re: Multiple Postmasters on Beowulf cluster - Mailing list pgsql-admin
From | Jan Hartmann |
---|---|
Subject | Re: Multiple Postmasters on Beowulf cluster |
Date | |
Msg-id | DIEALLGCLLCNIHBDCMAEOEHFCDAA.jhart@frw.uva.nl |
In response to | Re: Multiple Postmasters on Beowulf cluster (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-admin |
Thanks a lot for the reactions. I experimented a bit further with the answers in mind and got the following result:

(Tom Lane)
> In that case, make 45 copies of the database ...

Without expecting much, I created 3 data directories and made symbolic links to everything in the original data directory except postmaster.pid. Next I started PostgreSQL on 3 nodes, each with PGDATA set to a different directory and PGPORT to a different port. Surprisingly, it worked! The first startup gave a message on each node about not having had a proper shutdown, but afterwards everything ran OK from all servers, even after restarting PostgreSQL. MapServer had no problem at all producing a map from layers requested from different nodes, although without any time gain (see next point). Probably this gives PostgreSQL wizards the creeps, and I wouldn't advise it to anyone, but just for curiosity's sake: what dangers am I running, given that only read access is needed?

(Paul Ramsey)
> It is worth noting that layers are not really all that independent from a display
> point of view. (...) The process of creating the final visual product is the
> result of sequential application of layers.

Yes, I didn't realise that MapServer waits until a layer has been returned from PostgreSQL before starting on the next, essentially doing nothing in the meantime. I thought it worked like a web browser retrieving images, which is done asynchronously. It should work, however, when asking for complete maps from different browser frames, or when retrieving a map in one frame and getting statistics for it in another, using separate PHP scripts targeted at different nodes. This would already help me enormously.

(Bob Meyer)
> Since disk is typically the slowest part of
> any system, I would imagine that 45 nodes, all beating on one network
> file system (or a multiport filesystem for that matter) would tend to
> slow things down dramatically.
> I would think that it would be better to
> make 45 separate copies of the database and then if there are updates,
> make some kind of process to pass all of the transactions to each
> instantiation of the DB. Granted, the disk space would increase to 45X
> the original estimate. How much updating/changing goes on in the Db?

I am trying this out for population statistics in Dutch municipalities within specified distances (1, 2, 5, 10, 25 km etc.) from the railway network. Number of railway lines: 419 (each having numerous line segments); number of municipalities: 633; size of the municipality map: about 10 MB. It takes about 30 seconds of wall time to produce a map (good compared to desktop GIS systems; I have no experience with Oracle Spatial).

The next step would be using the road network (much larger of course, but still in the range of tens of megabytes, perhaps a hundred) and data from very diverse sources and years, including raster bitmaps, all not excessively large. Lots of different buffers have to be put around all kinds of selections (road types, geographical selections) and compared with each other.

The last step is animating the results in Flash movies: MapServer will be supporting Flash in the very near future, and I already got some preliminary results. This will require even more computation of intermediate results to get flowing movies.

So the problem is not data access; it is computing power and the administration of a very disparate bunch of data. I certainly have enough computing power, and probably also enough disk space for a 45-fold duplication of the data, but I know from experience how error-prone this is, even with duplicating scripts.

Even so, unless I am very much mistaken, the MapServer-PostgreSQL-Beowulf combination should offer some very interesting prospects in GIS.

Thanks for the answers

Jan

Jan Hartmann
Department of Geography
University of Amsterdam
jhart@frw.uva.nl
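
For readers who want to see the symlink experiment from the message spelled out, here is a minimal Python sketch, not the author's actual procedure. It builds one extra data directory whose entries are symbolic links to everything in the original data directory except postmaster.pid, and prepares a per-node environment with PGDATA and PGPORT. The function names, the example paths and the port number are hypothetical; the message does not say how the links were created or how the servers were started. The message itself does not recommend this setup, so treat it as an illustration only.

```python
import os
from pathlib import Path


def make_linked_data_dir(original: Path, clone: Path, skip=("postmaster.pid",)) -> list:
    """Create `clone` and symlink every top-level entry of `original` into it.

    Entries whose names are in `skip` (by default postmaster.pid, as in the
    message) are left out, so each server writes its own copy of that file.
    Hypothetical helper, not part of PostgreSQL.
    """
    clone.mkdir(parents=True, exist_ok=False)
    linked = []
    for entry in sorted(original.iterdir()):
        if entry.name in skip:
            continue
        # Link to the resolved absolute path so the link works from any node
        # that mounts the original directory at the same location (assumption).
        (clone / entry.name).symlink_to(entry.resolve(), target_is_directory=entry.is_dir())
        linked.append(entry.name)
    return linked


def node_environment(clone: Path, port: int) -> dict:
    """Return a copy of the current environment with PGDATA and PGPORT set."""
    env = dict(os.environ)
    env["PGDATA"] = str(clone)
    env["PGPORT"] = str(port)
    return env


if __name__ == "__main__":
    # Hypothetical paths and port, for illustration only.
    original = Path("/var/lib/pgsql/data")
    clone = Path("/var/lib/pgsql/data_node1")
    if original.is_dir() and not clone.exists():
        print(make_linked_data_dir(original, clone))
        print(node_environment(clone, 5433)["PGPORT"])
```

The returned environment could then be passed to whatever command starts the server on each node; that step is left out here because the message does not describe it.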