Re: TB-sized databases - Mailing list pgsql-performance
From | Russell Smith |
---|---|
Subject | Re: TB-sized databases |
Date | |
Msg-id | 474FB0B1.6070900@pws.com.au |
In response to | Re: TB-sized databases (Simon Riggs <simon@2ndquadrant.com>) |
Responses | Re: TB-sized databases (Simon Riggs <simon@2ndquadrant.com>) |
List | pgsql-performance |
Simon Riggs wrote:
> On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote:
>> Simon Riggs wrote:
>>> All of those responses have cooked up quite a few topics into one. Large
>>> databases might mean text warehouses, XML message stores, relational
>>> archives and fact-based business data warehouses.
>>>
>>> The main thing is that TB-sized databases are performance critical. So
>>> it all depends upon your workload really as to how well PostgreSQL, or
>>> another other RDBMS vendor can handle them.
>>>
>>> Anyway, my reason for replying to this thread is that I'm planning
>>> changes for PostgreSQL 8.4+ that will make allow us to get bigger and
>>> faster databases. If anybody has specific concerns then I'd like to hear
>>> them so I can consider those things in the planning stages
>>
>> it would be nice to do something with selects so we can recover a rowset
>> on huge tables using a criteria with indexes without fall running a full
>> scan.
>>
>> In my opinion, by definition, a huge database sooner or later will have
>> tables far bigger than RAM available (same for their indexes). I think
>> the queries need to be solved using indexes enough smart to be fast on disk.
>
> OK, I agree with this one.
>
> I'd thought that index-only plans were only for OLTP, but now I see they
> can also make a big difference with DW queries. So I'm very interested
> in this area now.

If that's true, then you want to get behind the work Gokulakannan Somasundaram (http://archives.postgresql.org/pgsql-hackers/2007-10/msg00220.php) has done with relation to thick indexes. I would have thought that concept particularly useful in DW. Only having to scan indexes on a number of join tables would be a huge win for some of these types of queries.

My tiny point of view would say that is a much better investment than setting up the proposed parameter. I can see the use of the parameter though.
Most of the complaints about indexes carrying visibility information are about update/delete contention. I would expect that in a DW those things aren't in the critical path like they are in many other applications. Especially with partitioning, and previous partitions not getting many updates, I would think there could be great benefit. I would think that many of Pablo's requests up-thread would get a significant performance benefit from this type of index.

But as I mentioned at the start, that's my tiny point of view, and I certainly don't have the resources to direct what gets looked at for PostgreSQL.

Regards

Russell Smith