Re: Summaries on SSD usage? - Mailing list pgsql-performance

From Jesper Krogh
Subject Re: Summaries on SSD usage?
Date
Msg-id 4E61CDF7.30505@krogh.cc
In response to Re: Summaries on SSD usage?  (Stefan Keller <sfkeller@gmail.com>)
List pgsql-performance
On 2011-09-03 00:04, Stefan Keller wrote:
> 2011/9/2 Scott Marlowe<scott.marlowe@gmail.com>:
>> On Tue, Aug 30, 2011 at 11:23 AM, Stefan Keller<sfkeller@gmail.com>  wrote:
>> How big is your DB?
>> What kind of reads are most common, random access or sequential?
>> How big of a dataset do you pull out at once with a query.
>>
>> SSDs are usually not a big winner for read only databases.
>> If the dataset is small (dozen or so gigs)  get more RAM to fit it in
>> If it's big and sequentially accessed, then build a giant RAID-10 or RAID-6
>> If it's big and randomly accessed then buy a bunch of SSDs and RAID them
> My dataset is a mirror of OpenStreetMap updated daily. For Switzerland
> it's about 10 GB total disk space used (half for tables, half for
> indexes) based on 2 GB raw XML input. Europe would be about 70 times
> larger (130 GB) and world has 250 GB raw input.
>
> It's both randomly (= index scan?) and sequentially (= seq scan?)
> accessed with queries like: " SELECT * FROM osm_point WHERE tags @>
> hstore('tourism','zoo') AND name ILIKE 'Zoo%' ".  You can try it
> yourself online, e.g.
> http://labs.geometa.info/postgisterminal/?xapi=node[tourism=zoo]
>
> So I'm still unsure what's better: SSD, NVRAM (PCI card) or plain RAM?
> And I'm eager to understand if unlogged tables could help anyway

It's not that hard to figure out. Take some of your "typical" queries,
say the one above, and change the search term to something you'd expect
a user to enter but that hasn't been run recently (could be "museum"
instead of "zoo"). Then run it twice with \timing. If the two runs are
close to each other in timing, you're only hitting memory anyway, and
neither SSD, NVRAM, nor more RAM will buy you anything; faster memory
and faster CPU cores will. If you see a significant speedup on the
second run, then more RAM, NVRAM, or an SSD is a good fix.
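As a sketch in psql (the "museum" search term reuses the query above; the timings you'll see depend entirely on your own data and cache state):

```sql
-- In psql: enable per-query timing, then run the same query twice.
\timing on

-- First run: may have to pull index and heap pages from disk.
SELECT * FROM osm_point
WHERE tags @> hstore('tourism', 'museum')
  AND name ILIKE 'Museum%';

-- Second run: if this is much faster than the first, the first run was
-- disk-bound and more RAM / NVRAM / SSD would help. If both runs take
-- about the same time, you were serving it from memory all along.
SELECT * FROM osm_point
WHERE tags @> hstore('tourism', 'museum')
  AND name ILIKE 'Museum%';
```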

Typically I have slow-query logging turned on permanently, set to around
250 ms. If I find queries in the log that I didn't expect to take more
than 250 ms, I start to investigate whether the query plans are correct,
and so on.
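That's a one-line setting (a minimal postgresql.conf fragment; the 250 ms value matches the threshold mentioned above):

```
# postgresql.conf: log any statement that takes longer than 250 ms
log_min_duration_statement = 250ms
```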

The numbers above are "raw-data" sizes, not how much space PG actually
needs for them, right? And you haven't told us anything about the size
of your current system.

Jesper
