Thread: Summaries on SSD usage?
Hi,

I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
My use case is mainly a "read-only" database.
Are there any around?

Yours, Stefan
On Aug 30, 2011, at 12:23 PM, Stefan Keller wrote:
> I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
> My use case is mainly a "read-only" database.
> Are there any around?

I'm not sure, but for read-only why not just put more memory in the server? It'll be a lot cheaper than SSDs.
--
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)   http://jim.nasby.net
You mean something like "Unlogged Tables" in PostgreSQL 9.1 (= in-memory database) or simply a large ramdisk?

Yours, Stefan

2011/9/1 Jim Nasby <jim@nasby.net>:
> On Aug 30, 2011, at 12:23 PM, Stefan Keller wrote:
>> I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
>> My use case is mainly a "read-only" database.
>> Are there any around?
>
> I'm not sure, but for read-only why not just put more memory in the server? It'll be a lot cheaper than SSDs.
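For reference, a sketch of the 9.1 feature Stefan mentions (table and columns hypothetical). Note that unlogged tables are not an in-memory database: they still live on disk and compete for shared_buffers like any other table. They merely skip WAL, are truncated after a crash, and are not replicated:

```sql
-- hstore lives in an extension (CREATE EXTENSION hstore; on 9.1+)
-- No WAL is written for this table, so writes are faster,
-- but its contents are truncated after a crash.
CREATE UNLOGGED TABLE osm_point_scratch (
    id    bigint PRIMARY KEY,
    name  text,
    tags  hstore
);
```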
On 2011-09-01 23:28, Jim Nasby wrote:
> On Aug 30, 2011, at 12:23 PM, Stefan Keller wrote:
>> I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
>> My use case is mainly a "read-only" database.
>> Are there any around?
> I'm not sure, but for read-only why not just put more memory in the server? It'll be a lot cheaper than SSDs.

It is "really expensive" to go over 512GB of memory, and the performance regression for just hitting disk in a system where you assume everything is in memory is really huge. SSD makes that "edge" a bit smoother than rotating drives do.

Jesper
--
Jesper
On Tue, Aug 30, 2011 at 11:23 AM, Stefan Keller <sfkeller@gmail.com> wrote:
> Hi,
>
> I'm looking for summaries (or best practices) on SSD usage with PostgreSQL.
> My use case is mainly a "read-only" database.
> Are there any around?

How big is your DB?
What kind of reads are most common, random access or sequential?
How big a dataset do you pull out at once with a query?

SSDs are usually not a big winner for read-only databases.
If the dataset is small (a dozen or so gigs), get more RAM to fit it in.
If it's big and sequentially accessed, then build a giant RAID-10 or RAID-6.
If it's big and randomly accessed, then buy a bunch of SSDs and RAID them.
On 09/01/2011 11:14 PM, Jesper Krogh wrote:
> It is "really expensive" to go over 512GB memory and the performance
> regression for just hitting disk in a system where you assume
> everything is in memory is really huge. SSD makes the "edge" be a bit
> smoother than rotating drives do.

Ironically, this is actually the topic of my presentation at Postgres Open. We transitioned to NVRAM PCI cards for exactly this reason. Having a giant database in cache is great, until a few reads come from your slow backing disks, or heaven forbid, you have to restart your database during a high transactional period.

Lemme tell ya... no RAID-10 in the world can supply 12k TPS with little to no warning. A good set of SSDs or PCI cards can.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@peak6.com

See http://www.peak6.com/email_disclaimer.php for terms and conditions related to this email
2011/9/2 Scott Marlowe <scott.marlowe@gmail.com>:
> How big is your DB?
> What kind of reads are most common, random access or sequential?
> How big a dataset do you pull out at once with a query?
>
> SSDs are usually not a big winner for read only databases.
> If the dataset is small (dozen or so gigs) get more RAM to fit it in
> If it's big and sequentially accessed, then build a giant RAID-10 or RAID-6
> If it's big and randomly accessed then buy a bunch of SSDs and RAID them

My dataset is a mirror of OpenStreetMap, updated daily. For Switzerland it's about 10 GB total disk space used (half for tables, half for indexes) based on 2 GB of raw XML input. Europe's raw input would be about 70 times larger (130 GB), and the world has 250 GB of raw input.

It's both randomly (= index scan?) and sequentially (= seq scan?) accessed, with queries like:

  SELECT * FROM osm_point WHERE tags @> hstore('tourism','zoo') AND name ILIKE 'Zoo%';

You can try it yourself online, e.g. http://labs.geometa.info/postgisterminal/?xapi=node[tourism=zoo]

So I'm still unsure what's better: SSD, NVRAM (PCI card) or plain RAM? And I'm eager to understand if unlogged tables could help anyway.

Yours, Stefan
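As an aside, whether that example query random-accesses via an index or falls back to a sequential scan is directly observable with EXPLAIN. A sketch, assuming a GIN index on the hstore column (the index itself is an assumption, not something Stefan states he has):

```sql
-- hstore containment (@>) is indexable with GiST or GIN
CREATE INDEX osm_point_tags_idx ON osm_point USING gin (tags);

EXPLAIN ANALYZE
SELECT *
FROM osm_point
WHERE tags @> hstore('tourism', 'zoo')
  AND name ILIKE 'Zoo%';
```

If the plan shows a bitmap index scan on osm_point_tags_idx, access is random; a plain Seq Scan means the whole table is read sequentially.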
On 2011-09-03 00:04, Stefan Keller wrote:
> My dataset is a mirror of OpenStreetMap updated daily. For Switzerland
> it's about 10 GB total disk space used (half for tables, half for
> indexes) based on 2 GB raw XML input. Europe would be about 70 times
> larger (130 GB) and world has 250 GB raw input.
>
> So I'm still unsure what's better: SSD, NVRAM (PCI card) or plain RAM?
> And I'm eager to understand if unlogged tables could help anyway

It's not that hard to figure out. Take some of your "typical" queries, say the one above, and change the search term to something you'd expect a user to enter in a minute, but that hasn't been run yet (could be "museum" instead of "zoo"). Then run it with \timing, and twice. If the two runs are "close" to each other in timing, then you only hit memory anyway, and neither SSD, NVRAM nor more RAM will buy you anything; faster memory and faster CPU cores will. If you get a significant speedup on the second run, then more RAM, NVRAM or SSD is a good fix.

Typically I have slow-query logging turned on, permanently set to around 250ms. If I find queries in the log that I didn't expect to take above 250ms, then I'd start to investigate whether the query plans are correct, and so on.

The above numbers are "raw-data" size and not how PG uses them, or? And you haven't told us anything about the size of your current system.

Jesper
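Jesper's cold-vs-warm test can be sketched as a psql session (query borrowed from upthread; "museum" stands in for a term that hasn't been run yet):

```sql
\timing on
-- First run: possibly cold, may have to hit disk
SELECT * FROM osm_point WHERE tags @> hstore('tourism', 'museum');
-- Second run: served from cache; a large gap vs. run 1 means you were I/O-bound
SELECT * FROM osm_point WHERE tags @> hstore('tourism', 'museum');

-- Permanent slow-query logging at the ~250 ms threshold mentioned above,
-- set in postgresql.conf:
--   log_min_duration_statement = 250
```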
2011/9/3 Jesper Krogh <jesper@krogh.cc>:
> It's not that hard to figure out.. take some of your "typical" queries.
> say the one above.. Change the search-term to something "you'd expect
> the user to enter in a minute, but hasn't been run" (could be "museum"
> instead of "zoo"). Then you run it with \timing and twice.. if the two
> queries are "close" to each other in timing, then you only hit memory
> anyway and neither SSD, NVRAM or more RAM will buy you anything.
>
> The above numbers are "raw-data" size and not how PG uses them.. or?
> And you haven't told anything about the size of your current system.

It's definitely the case that the second query run is much faster (first runs go up to 30 seconds and more...).

PG uses the raw data for Switzerland like this: 10 GB total disk space based on 2 GB of raw XML input. Table osm_point is one of the four big tables and uses 984 MB for the table and 1321 MB for its indexes (of which the hstore index is the biggest, ahead of id, name and geometry).

Stefan
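The size figures Stefan quotes can be reproduced with PostgreSQL's built-in size functions; a sketch:

```sql
-- Heap vs. index/TOAST footprint for one table
SELECT pg_size_pretty(pg_relation_size('osm_point'))       AS table_size,
       pg_size_pretty(pg_total_relation_size('osm_point')
                      - pg_relation_size('osm_point'))     AS indexes_and_toast;

-- Per-index breakdown (id, name, geometry and hstore indexes)
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'osm_point'
ORDER BY pg_relation_size(indexrelid) DESC;
```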
Shaun,

2011/9/2 Shaun Thomas <sthomas@peak6.com>:
> Ironically, this is actually the topic of my presentation at Postgres Open.

Do you think my problem would now be solved with an NVRAM PCI card?

Stefan
On 09/06/2011 08:45 AM, Stefan Keller wrote:
> Do you think my problem would now be solved with NVRAM PCI card?

That's a tough call. Part of the reason I'm doing the presentation is that there are a lot of other high-OLTP databases out there which have (or will have) reached critical mass, where cache can't fulfill generic database requests anymore. As an example, we were around 11k database transactions per second on 250GB of data with 32GB of RAM. The first thing we tried was bumping it up to 64GB, and that kinda worked. But what you'll find is that an autovacuum, or a nightly vacuum, will occasionally hit a large table and flush all of that handy cached data down the tubes, and then your database starts choking trying to keep up with the requests.

Even a large, well-equipped RAID can only really offer 2500-ish TPS before you start getting into the larger and more expensive SANs, so you either have to pre-load your memory with dd or pgfincore, or, if your random access patterns actually exceed your RAM, you need a bigger disk pool or tiered storage. And by tiered storage, I mean tablespaces, with critical high-TPS tables located on a PCIe card or a pool of modern (capacitor-backed, firmware-GC) SSDs.

Your case looks more like you have just a couple of big-ass queries/tables that occasionally give you trouble. If optimizing the queries, index tweaks, and other sundry tools can't help anymore, you may have to start dragging out the bigger guns. But if you can afford it, having some NVRAM storage around as a top-tier tablespace for critical-need data is probably good practice these days.

They're expensive, though. Even the cheap ones start around $5k. Just remember you're paying for performance in this case, and not storage capacity. Some vendors have demo hardware they'll let you use to determine if it applies to your case, so you might want to contact FusionIO, RAMSAN, Virident, or maybe OCZ.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@peak6.com
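A rough sketch of the tiered-storage approach Shaun describes — a tablespace on fast storage for the hot relations. The mount point, tablespace name and index name below are all hypothetical:

```sql
-- Tablespace backed by an SSD / PCIe-flash mount
CREATE TABLESPACE fast_flash LOCATION '/mnt/fusionio/pgdata';

-- Move the critical high-TPS table and its hottest index onto it
ALTER TABLE osm_point SET TABLESPACE fast_flash;
ALTER INDEX osm_point_pkey SET TABLESPACE fast_flash;

-- Pre-warming alternative mentioned above, if the pgfincore
-- extension is installed:
-- SELECT * FROM pgfadvise_willneed('osm_point');
```

Note that ALTER TABLE ... SET TABLESPACE physically rewrites the relation and holds an exclusive lock for the duration, so it's best done in a maintenance window.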