Thread: partitioning os swap data log tempdb
Hi, we want to migrate from MS SQL Server (windows2000) to PostgreSQL (Linux) :-)) and we want to use the old MSSQL Hardware. Dual Pentium III 800 1 GB RAM 2 IDE 10 GB 2 RAID Controller (RAID 0,1 aviable) with 2 9GB SCSI HDD 1 RAID Controller (RAID 0,1,5 aviable) with 3 18GB SCSI HDD The configuration for MS-SQL was this: OS on the 2 IDE Harddisks with Software-RAID1 SQL-Data on RAID-Controller with RAID-5 (3 x 18GB SCSI Harddisks) SQL-TempDB on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) SQL-TransactionLog on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) Can i make a similar configuration with PostgreSQL? Or what is the prefered fragmentation for operatingsystem, swap-partition, data, indexes, tempdb and transactionlog? What is pg_xlog and how important is it? What ist the prefered filesystem (ext2, ext3 or raiserfs)? We want to use about 20 databases with varios size from 5 MB to 500MB per database and more selects than inserts (insert/select ratio about 1/10) for fast webaccess. Thank you for your hints! Bye, Mario
On 24 Feb 2003 at 10:52, Schaefer, Mario wrote: > Hi, > > we want to migrate from MS SQL Server (windows2000) > to PostgreSQL (Linux) :-)) > and we want to use the old MSSQL Hardware. > > Dual Pentium III 800 > 1 GB RAM > 2 IDE 10 GB > 2 RAID Controller (RAID 0,1 aviable) with 2 9GB SCSI HDD > 1 RAID Controller (RAID 0,1,5 aviable) with 3 18GB SCSI HDD > > The configuration for MS-SQL was this: > OS on the 2 IDE Harddisks with Software-RAID1 > SQL-Data on RAID-Controller with RAID-5 (3 x 18GB SCSI Harddisks) > SQL-TempDB on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > SQL-TransactionLog on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > > Can i make a similar configuration with PostgreSQL? > Or what is the prefered fragmentation for > operatingsystem, swap-partition, data, indexes, tempdb and transactionlog? Hmm.. You can put your OS on IDE and databases on 3x18GB SCSI. Postgresql can not split data/indexes/tempdb etc. So they will be on one drive. You don't have much of a choice here. > What is pg_xlog and how important is it? It is transaction log. It is hit every now and then for insert/update/deletes. Symlinking it to a separate drive would be a great performance boost. Put it on the other SCSI disk. AFAIK, it is a single file. I suggest you put WAL and xlog on other 2x9GB SCSI drive. You need to shut down postgresql after schema creation and symlink the necessary files by hand. If postgresql ever recreates these files/directories by hand, it will drop the symlinks and recreate the files. In that case you need to redo the exercise of symlinkg. > What ist the prefered filesystem (ext2, ext3 or raiserfs)? reiserfs or XFS. > We want to use about 20 databases with varios size from 5 MB to 500MB per > database > and more selects than inserts (insert/select ratio about 1/10) for fast > webaccess. shouldn't be a problem. Tune shared_buffers around 150-250MB. Beef up FSM entries, sort mem and vacuum regularly. HTH Bye Shridhar -- Slous' Contention: If you do a job too well, you'll get stuck with it.
On Mon, Feb 24, 2003 at 10:52:55AM +0100, Schaefer, Mario wrote: > Can i make a similar configuration with PostgreSQL? Sort of. PostgreSQL currently does not have a convenient way to specify where this or that part of the database lives. As a result, your best bet is to use RAID 1+0 for the data area, and get the speed that way. If you must do it without more hardware, however, you can manually move some files to other drives and symlink them. You _must_ do this while offline, and if the file grows above 1G, the advantage will be lost. It is nevertheless a good idea to put your OS, data directory, and write head log (WAL) on separate disks. Also, you should make sure your PostgreSQL logs don't interfere with the database I/O (the OS disk is probably a good place for them, but make sure you use some sort of log rotator. Syslog is helpful here). > What is pg_xlog and how important is it? It's the write ahead log. Put it on a separate RAID. > What ist the prefered filesystem (ext2, ext3 or raiserfs)? Certainly ext2 is not crash-safe. There've been some reports of corruption under reiserfs, but I've never had it happen. There have been complaints about performance with ext3. You might want to investigate XFS, as it was designed for this sort of task. > We want to use about 20 databases with varios size from 5 MB to 500MB per > database > and more selects than inserts (insert/select ratio about 1/10) for fast > webaccess. The WAL is less critical in this case, because it is only extended when you change the data, not when you select. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Mon, 24 Feb 2003, Schaefer, Mario wrote: > Hi, > > we want to migrate from MS SQL Server (windows2000) > to PostgreSQL (Linux) :-)) > and we want to use the old MSSQL Hardware. > > Dual Pentium III 800 > 1 GB RAM > 2 IDE 10 GB > 2 RAID Controller (RAID 0,1 aviable) with 2 9GB SCSI HDD > 1 RAID Controller (RAID 0,1,5 aviable) with 3 18GB SCSI HDD > > The configuration for MS-SQL was this: > OS on the 2 IDE Harddisks with Software-RAID1 > SQL-Data on RAID-Controller with RAID-5 (3 x 18GB SCSI Harddisks) > SQL-TempDB on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > SQL-TransactionLog on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > > Can i make a similar configuration with PostgreSQL? > Or what is the prefered fragmentation for > operatingsystem, swap-partition, data, indexes, tempdb and transactionlog? > > What is pg_xlog and how important is it? > > What ist the prefered filesystem (ext2, ext3 or raiserfs)? > > We want to use about 20 databases with varios size from 5 MB to 500MB per > database > and more selects than inserts (insert/select ratio about 1/10) for fast > webaccess. With that ratio of writers to readers, you may find a big RAID5 works as well as anything. Also, you don't mention what RAID controllers you're using. If they're real low end stuff like adaptec 133s, then you're better off just using them as straight scsi cards under linux and letting the kernel do the work. Can you create RAID arrays across multiple RAID cards on that setup? if so, a big RAID-5 with 4 9 gigs and 3 more 9 gigs from the other 3 drives might be your fastest storage. That's 36 gigs of storage across 7 spindles, which means good parallel read access. How many simo users are you expecting?
Mario, > we want to migrate from MS SQL Server (windows2000) > to PostgreSQL (Linux) :-)) > and we want to use the old MSSQL Hardware. I don't blame you. I just finished an MSSQL project. BLEAH! > The configuration for MS-SQL was this: > OS on the 2 IDE Harddisks with Software-RAID1 > SQL-Data on RAID-Controller with RAID-5 (3 x 18GB SCSI Harddisks) > SQL-TempDB on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > SQL-TransactionLog on RAID-Controller with RAID-1 (2 x 9GB SCSI Harddisk) > > Can i make a similar configuration with PostgreSQL? Yes. Many of the concerns are the same. However, 3-disk RAID 5 performs poorly for UPDATES for PostgreSQL. That is, while reads are often better than for a single SCSI disk, UPDATES happen at half or less of the speed than they would do on a SCSI disk alone. There is no TempDB in PostgreSQL. This gives you a whole free RAID array to play with. > Or what is the prefered fragmentation for > operatingsystem, swap-partition, data, indexes, tempdb and transactionlog? > > What is pg_xlog and how important is it? It is analogous to the SQL Transaction Log, although it does not need to be backed up to truncate it. Deal with it the way you would deal with an MSSQL transaction log; i.e. on its own disk, if possible. However, you gain little by putting it on RAID other than failover safety; in fact, you might find that the xlog peforms better on a lone SCSI disk since even the best RAID 1 will slow down data writes by up to 15%. Swap is not such a concern for PostgreSQL on Linux or BSD. With proper database tuning and a GB of memory, you will never use the swap. Or to put it another way, if you're seeing regular hits on swap, you need to re-tune your database. Finally, you want to make absolutely sure that either the write-through cache on each RAID controller is disabled in the BIOS, or that you have a battery back-up which you trust 100%. Otherwise, the caching done by the RAID controllers will cancel completely the benefit of the Xlog for database recovery. > What ist the prefered filesystem (ext2, ext3 or raiserfs)? That's a matter of open debate. I like Reiser. Ext3 has its proponents, as does XFS. Ext2 is probably faster than all of the above ... provided that your machine never has an unexpected shutdown. Then Ext2 is very bad ar recovering from power-outs ... > We want to use about 20 databases with varios size from 5 MB to 500MB per > database > and more selects than inserts (insert/select ratio about 1/10) for fast > webaccess. Keep in mind that unlike SQL Server, you cannot easily query between databases on PostgreSQL. So if those databases are all related, you probably want to put them in the same PostgreSQL 7.3.2 database and use schema instead. If the databases are for different users and completely seperate, you'll want to read up heavily on Postgres' security model, which has been significantly improved for 7.3. -- Josh Berkus Aglio Database Solutions San Francisco